# RustDupe
[](https://github.com/MasuRii/RustDupe/actions/workflows/ci.yml)
[](https://crates.io/crates/rustdupe)
[](https://crates.io/crates/rustdupe)
[](https://opensource.org/licenses/MIT)
[](https://www.rust-lang.org)
**Smart Duplicate File Finder** — A high-performance, cross-platform duplicate file finder built in Rust with an interactive TUI.

---
## Table of Contents
- [Features](#features)
- [Installation](#installation)
- [Usage](#usage)
- [CLI Reference](#cli-reference)
- [Performance](#performance)
- [Contributing](#contributing)
- [License](#license)
## Features
- **High Performance**: Parallel directory walking and BLAKE3 hashing for maximum speed.
- **Interactive TUI**: Review duplicate groups, preview files, and select copies for deletion in a navigable interface.
- **Multi-Phase Optimization**:
1. Group by file size (instant filtering).
2. Compare 4KB pre-hashes (fast rejection).
3. Full content hash for final confirmation.
4. Optional byte-by-byte verification (paranoid mode).
- **Safe Deletion**: Moves files to system trash by default (cross-platform support).
- **Hardlink Aware**: Automatically detects and skips hardlinks (same inode) to prevent false positives.
- **Unicode Support**: Handles macOS NFD vs. Windows/Linux NFC normalization issues.
- **Machine Readable**: Export results to JSON or CSV for scripting and automation.
## Installation
### From crates.io (Recommended)
```bash
cargo install rustdupe
```
> **Requires Rust 1.85 or later.** Install Rust via [rustup](https://rustup.rs/).
### Pre-built Binaries
Download the latest release for your platform from the [GitHub Releases](https://github.com/MasuRii/RustDupe/releases) page.
| Linux | x86_64 | `rustdupe-*-x86_64-unknown-linux-gnu` |
| Linux (musl) | x86_64 | `rustdupe-*-x86_64-unknown-linux-musl` |
| macOS | x86_64 | `rustdupe-*-x86_64-apple-darwin` |
| macOS | Apple Silicon | `rustdupe-*-aarch64-apple-darwin` |
| Windows | x86_64 | `rustdupe-*-x86_64-pc-windows-msvc.exe` |
### From Source
```bash
git clone https://github.com/MasuRii/RustDupe.git
cd rustdupe
cargo build --release
```
The binary will be available at `target/release/rustdupe`.
## Usage
### Basic Scan (Interactive TUI)
```bash
rustdupe scan ~/Downloads
```
### Non-Interactive Modes (Automation)
```bash
# Export to JSON
rustdupe scan ~/Documents --output json > duplicates.json
# Export to CSV
rustdupe scan /path/to/media --output csv > duplicates.csv
```
### Advanced Options
```bash
# Filter by size
rustdupe scan . --min-size 1MB --max-size 1GB
# Ignore specific patterns
rustdupe scan . --ignore "*.tmp" --ignore "node_modules"
# Enable paranoid byte-by-byte verification
rustdupe scan . --paranoid
# Custom I/O threads (default: 4)
rustdupe scan . --io-threads 8
```
## CLI Reference
```text
Usage: rustdupe [OPTIONS] <COMMAND>
Commands:
scan Scan a directory for duplicate files
help Print this message or the help of the given subcommand(s)
Arguments:
<PATH> Directory path to scan for duplicates
Options:
-v, --verbose... Increase verbosity level (-v for debug, -vv for trace)
-q, --quiet Suppress all output except errors
--no-color Disable colored output
-h, --help Print help
-V, --version Print version
Scan Subcommand Options:
-o, --output <OUTPUT> Output format (tui, json, csv) [default: tui]
--min-size <SIZE> Minimum file size to consider (e.g., 1KB, 1MB)
--max-size <SIZE> Maximum file size to consider (e.g., 1KB, 1MB)
-i, --ignore <PATTERN> Glob patterns to ignore
--follow-symlinks Follow symbolic links
--skip-hidden Skip hidden files and directories
--io-threads <N> Number of I/O threads for hashing [default: 4]
--paranoid Enable byte-by-byte verification
--permanent Use permanent deletion instead of trash
-y, --yes Skip confirmation prompts
```
## Performance
RustDupe is optimized for speed through several techniques:
| **BLAKE3 hashing** | 2.8-10x faster than SHA-256, with multi-threaded scaling |
| **Parallel directory walking** | Uses `jwalk` for 4x faster traversal than sequential walking |
| **Multi-phase deduplication** | Early rejection via size grouping and 4KB pre-hashes |
| **Work-stealing thread pool** | Near-linear scaling with CPU cores via Rayon |
### Benchmarks
On a typical workstation (8-core CPU, NVMe SSD):
| Home directory | ~50,000 | 100 GB | ~15s |
| Photo library | ~20,000 | 200 GB | ~25s |
| Source code | ~100,000 | 10 GB | ~5s |
> **Note**: Actual performance varies based on disk speed, file sizes, and duplicate ratio.
## Contributing
Contributions are welcome! Please read our [Contributing Guidelines](CONTRIBUTING.md) before submitting a Pull Request.
### Quick Start
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'feat: add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
See [CONTRIBUTING.md](CONTRIBUTING.md) for detailed guidelines on:
- Development setup
- Code style and linting
- Testing requirements
- Commit message conventions
## Security
For security vulnerabilities, please see our [Security Policy](SECURITY.md).
## License
Distributed under the MIT License. See [LICENSE](LICENSE) for more information.
---
<p align="center">
Made with ❤️ in Rust
</p>