Crate rclean

A high-performance disk cleanup library with parallel processing.

This library scans directories, finds duplicate files by MD5 hash, detects storage outliers, and generates detailed reports using Polars DataFrames.
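
The hash-based duplicate detection described above can be sketched roughly as follows. This is a minimal illustration and not the crate's actual implementation: `content_digest` and `group_duplicates` are hypothetical names, and the standard library's `DefaultHasher` stands in for MD5 so that the sketch needs no external dependencies.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// Hypothetical stand-in for an MD5 digest: a 64-bit hash of the file bytes.
/// (Assumption: the real crate hashes with MD5; DefaultHasher is used here
/// only to keep the sketch dependency-free.)
fn content_digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    hasher.finish()
}

/// Group `(path, contents)` pairs by digest and return only the groups that
/// contain more than one path, i.e. the candidate duplicate sets.
fn group_duplicates(files: &[(&str, &[u8])]) -> Vec<Vec<String>> {
    let mut by_digest: HashMap<u64, Vec<String>> = HashMap::new();
    for (path, bytes) in files {
        by_digest
            .entry(content_digest(bytes))
            .or_default()
            .push(path.to_string());
    }

    // Keep only digests shared by at least two files.
    let mut groups: Vec<Vec<String>> = by_digest
        .into_values()
        .filter(|group| group.len() > 1)
        .collect();

    // Sort for deterministic output, since HashMap iteration order is arbitrary.
    for group in &mut groups {
        group.sort();
    }
    groups.sort();
    groups
}
```

Grouping by digest first keeps the comparison cost roughly linear in the number of files. A byte-for-byte check of each group would still be needed to rule out hash collisions.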

Re-exports

pub use comfy_table;
pub use globset;
pub use regex;

Modules

clustering
Clustering module for detecting groups of similar files.
mcp_server
models
outliers
Outlier detection module for finding files that consume disproportionate disk space.
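
To illustrate what "disproportionate disk space" might mean in practice, here is a minimal sketch that flags files whose size exceeds the mean by more than `k` standard deviations. The `SizedFile` type, the `size_outliers` function, and the choice of a mean-plus-deviation threshold are all assumptions made for this example; the `outliers` module may use a different method.

```rust
/// Hypothetical record for this sketch: a path and its size in bytes.
#[derive(Debug, Clone, PartialEq)]
struct SizedFile {
    path: String,
    size: u64,
}

/// Return the files whose size is more than `k` population standard
/// deviations above the mean size. (Assumed technique, not the crate's API.)
fn size_outliers(files: &[SizedFile], k: f64) -> Vec<SizedFile> {
    if files.is_empty() {
        return Vec::new();
    }

    let n = files.len() as f64;
    let mean = files.iter().map(|f| f.size as f64).sum::<f64>() / n;

    // Population variance: average squared deviation from the mean.
    let variance = files
        .iter()
        .map(|f| ((f.size as f64) - mean).powi(2))
        .sum::<f64>()
        / n;

    let threshold = mean + k * variance.sqrt();

    files
        .iter()
        .filter(|f| (f.size as f64) > threshold)
        .cloned()
        .collect()
}
```

With very small samples a single large file inflates the standard deviation, so a modest `k` (around 2) is a reasonable starting point for this kind of threshold.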

Structs

FileInfo
Information about a file including metadata and hash.
Glob
Glob represents a successfully parsed shell glob pattern.
GlobSet
GlobSet represents a group of globs that can be matched together in a single pass.
GlobSetBuilder
GlobSetBuilder builds a group of patterns that can be used to simultaneously match a file path.
Regex
A compiled regular expression for searching Unicode haystacks.
WalkOptions
Options for directory walking.

Enums

PatternType
Pattern matching type for file filtering.

Functions

calculate_similarity
Calculate similarity between two fuzzy hashes.
checksum
Compute checksums for files in parallel.
collect_file_info
Collect detailed file information in parallel.
create_dataframe
display_thread_info
Display threading information including CPU cores and thread pool size.
find
Find files matching a pattern.
find_advanced
Find files matching an advanced pattern.
find_duplicates
find_similar_files
Find similar files based on fuzzy hashing and a similarity threshold. Returns a vector of groups of similar files with their similarity scores.
generate_csv_report
generate_statistics
Generate file statistics summary.
run
Run the deduplication process.
run_with_advanced_options
Run deduplication with DataFrame support using advanced pattern matching.
run_with_dataframe
Run deduplication with DataFrame support and optional CSV output.
run_with_similarity
Run deduplication with similarity detection using fuzzy hashing.
validate_duplicates
walk
walk_with_options
Walk a directory recursively with gitignore support and return all file paths.

Type Aliases

SimilarFileGroup
A group of similar files with their similarity scores, as found by fuzzy hashing against a similarity threshold.