Module scanner

Scanner module for directory traversal and file hashing.

This module provides functionality for:

  • Parallel directory walking using jwalk
  • Content hashing with BLAKE3
  • Hardlink detection
  • Unicode path normalization
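
As an illustration of the hardlink-detection idea, here is a minimal, Unix-only sketch using only the standard library. It treats two paths as the same file when they share a (device, inode) pair. The `SeenInodes` type and its `is_duplicate_link` method are hypothetical names invented for this sketch; they are not the crate's `HardlinkTracker`, whose actual API may differ.

```rust
use std::collections::HashSet;
use std::fs;
use std::io;
use std::os::unix::fs::MetadataExt; // Unix-only: provides dev() and ino()
use std::path::Path;

/// Hypothetical stand-in for a hardlink tracker (not the crate's type).
#[derive(Default)]
pub struct SeenInodes {
    // Each file is identified by its (device id, inode number) pair.
    seen: HashSet<(u64, u64)>,
}

impl SeenInodes {
    /// Returns `Ok(true)` if another path to the same underlying file
    /// has already been recorded, and `Ok(false)` on first sight.
    pub fn is_duplicate_link(&mut self, path: &Path) -> io::Result<bool> {
        let meta = fs::metadata(path)?;
        // `HashSet::insert` returns false when the key was already present.
        Ok(!self.seen.insert((meta.dev(), meta.ino())))
    }
}
```

With a check like this, a second hardlink to an already-seen file can be skipped before hashing, so that two names for one file are not reported as duplicates of each other.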

Architecture

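The scanner is divided into submodules:

  • walker: Directory traversal and file discovery
  • hasher: BLAKE3 file hashing (streaming)
  • hardlink: Hardlink detection, to avoid false duplicate identification
  • path_utils: Unicode path normalization utilities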

Example

use rustdupe::scanner::{Walker, WalkerConfig};
use std::path::Path;

// Configure the walker
let config = WalkerConfig {
    min_size: Some(1024),  // Skip files smaller than 1 KiB (1024 bytes)
    skip_hidden: true,     // Skip hidden files
    ..Default::default()
};

// Walk the directory
let walker = Walker::new(Path::new("."), config);
for entry in walker.walk() {
    match entry {
        Ok(file) => println!("{}: {} bytes", file.path.display(), file.size),
        Err(e) => eprintln!("Warning: {}", e),
    }
}

Re-exports

pub use hardlink::HardlinkTracker;
pub use hasher::hash_to_hex;
pub use hasher::hex_to_hash;
pub use hasher::Hash;
pub use hasher::Hasher;
pub use hasher::PREHASH_SIZE;
pub use path_utils::is_nfc;
pub use path_utils::normalize_path_str;
pub use path_utils::normalize_path_str_cow;
pub use path_utils::normalize_pathbuf;
pub use path_utils::path_key;
pub use path_utils::paths_equal;
pub use path_utils::paths_equal_normalized;
pub use walker::Walker;
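
The names `hash_to_hex` and `hex_to_hash` suggest a round trip between a raw digest and its hexadecimal text form. The following is a minimal sketch of such a round trip, assuming the digest is 32 raw bytes (BLAKE3's default output length). The `Digest` alias and the `to_hex_sketch` / `from_hex_sketch` functions are hypothetical; the crate's own `Hash` type and function signatures may differ.

```rust
/// Assumed digest representation: 32 raw bytes.
pub type Digest = [u8; 32];

/// Encodes a digest as 64 lowercase hexadecimal characters.
pub fn to_hex_sketch(hash: &Digest) -> String {
    let mut out = String::with_capacity(64);
    for byte in hash {
        out.push_str(&format!("{:02x}", byte));
    }
    out
}

/// Decodes 64 hexadecimal characters back into a digest.
/// Returns `None` on wrong length or any non-hex character.
pub fn from_hex_sketch(hex: &str) -> Option<Digest> {
    let bytes = hex.as_bytes();
    if bytes.len() != 64 {
        return None;
    }
    let mut out = [0u8; 32];
    for (i, pair) in bytes.chunks_exact(2).enumerate() {
        // `to_digit(16)` accepts 0-9, a-f and A-F.
        let hi = (pair[0] as char).to_digit(16)?;
        let lo = (pair[1] as char).to_digit(16)?;
        out[i] = (hi * 16 + lo) as u8;
    }
    Some(out)
}
```

Returning `Option` keeps the sketch short; a real API would more likely return a `Result` with a descriptive error.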

Modules

  • hardlink: Hardlink detection for avoiding false duplicate identification.
  • hasher: BLAKE3 file hasher with streaming support.
  • path_utils: Unicode path normalization utilities.
  • walker: Directory walker implementation using jwalk for parallel traversal.

Structs

  • FileEntry: Metadata for a discovered file.
  • WalkerConfig: Configuration for directory walking.

Enums

  • FileCategory: File categories for filtering.
  • HashError: Errors that can occur during file hashing.
  • ScanError: Errors that can occur during directory scanning.