find_duplicate_files 0.9.6

find duplicate files according to their size and hashing algorithm
find_duplicate_files

Find duplicate files according to their size and hashing algorithm.

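"A hash function is a mathematical algorithm that takes an input (in this case, a file) and produces a fixed-size string of characters, known as a hash value or checksum. This hash value is unique to the input data, meaning even a slight change in the input will result in a completely different hash value."

Strictly speaking, the hash value is not guaranteed to be unique: two different files can produce the same value (a collision). This is extremely unlikely with blake3, sha256 or sha512, and more likely with the non-cryptographic fxhash.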
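As a minimal sketch of this property, the function below hashes a byte slice with the Rust standard library's DefaultHasher. This hasher is only a stand-in chosen to keep the example dependency-free; it is not one of the algorithms offered by find_duplicate_files, and the name hash_bytes is hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hash a byte slice into a 64-bit value.
/// Assumption: std's DefaultHasher stands in for blake3, fxhash, sha256 or sha512.
pub fn hash_bytes(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(data);
    hasher.finish()
}
```

Calling hash_bytes twice on the same bytes returns the same value, while changing a single byte of the input should give a different one.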

Hash algorithm options are:

  1. blake3 (default)

  2. fxhash (possibly the fastest algorithm, but non-cryptographic)

  3. sha256

  4. sha512
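One common way to combine the two criteria (size and hash) is to group files by size first and hash only the files that share a size with at least one other file. The sketch below illustrates that idea on in-memory data. It is not taken from the project's source: FileEntry, stand_in_hash and find_duplicates are hypothetical names, and std's DefaultHasher replaces the real algorithms.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::Hasher;

/// Hypothetical in-memory stand-in for a file: a name plus its contents.
pub struct FileEntry {
    pub name: String,
    pub contents: Vec<u8>,
}

/// Stand-in hash (std's DefaultHasher), not one of the tool's algorithms.
fn stand_in_hash(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(data);
    hasher.finish()
}

/// Return groups of names whose entries share both size and hash.
pub fn find_duplicates(files: &[FileEntry]) -> Vec<Vec<String>> {
    // Stage 1: group by size, which is cheap to compare.
    let mut by_size: HashMap<usize, Vec<&FileEntry>> = HashMap::new();
    for file in files {
        by_size.entry(file.contents.len()).or_default().push(file);
    }

    let mut groups = Vec::new();
    for same_size in by_size.values().filter(|g| g.len() > 1) {
        // Stage 2: hash only entries whose size is shared with another entry.
        let mut by_hash: HashMap<u64, Vec<String>> = HashMap::new();
        for file in same_size {
            by_hash
                .entry(stand_in_hash(&file.contents))
                .or_default()
                .push(file.name.clone());
        }
        groups.extend(by_hash.into_values().filter(|g| g.len() > 1));
    }

    // Sort names and groups so the result is deterministic.
    for group in &mut groups {
        group.sort();
    }
    groups.sort();
    groups
}
```

The size check is a cheap pre-filter: a file with a unique size cannot have a duplicate, so it never needs to be hashed.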

To find duplicate files in a directory, run the command:

find_duplicate_files

As another example, to find duplicate files using the fxhash algorithm, sort the result by file size, show the total execution time, and print the result in YAML format:

find_duplicate_files -tsa fxhash -r yaml

Help

Run find_duplicate_files -h in the terminal to see the help message and all available options:

find duplicate files according to their size and hashing algorithm

Usage: find_duplicate_files [OPTIONS]

Options:
  -a, --algorithm <ALGORITHM>
          Choose the hash algorithm [default: blake3] [possible values: blake3, fxhash, sha256, sha512]
  -f, --full_path
          Prints full path of duplicate files, otherwise relative path
  -g, --generate <GENERATOR>
          If provided, outputs the completion file for given shell [possible values: bash, elvish, fish, powershell, zsh]
  -m, --max_depth <MAX_DEPTH>
          Set the maximum depth to search for duplicate files
  -o, --omit_hidden
          Omit hidden files (starts with '.'), otherwise search all files
  -p, --path <PATH>
          Set the path where to look for duplicate files, otherwise use the current directory
  -r, --result_format <RESULT_FORMAT>
          Print the result in the chosen format [default: personal] [possible values: json, yaml, personal]
  -s, --sort
          Sort result by file size, otherwise sort by number of duplicate files
  -t, --time
          Show total execution time
  -h, --help
          Print help (see more with '--help')
  -V, --version
          Print version
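The two orderings described for the -s/--sort option can be sketched as follows. DuplicateGroup and sort_groups are hypothetical names introduced for illustration, and ascending order is an assumption; the help text does not state the direction the tool uses.

```rust
/// Hypothetical summary of one group of duplicate files.
#[derive(Debug, Clone, PartialEq)]
pub struct DuplicateGroup {
    /// Size in bytes of each file in the group.
    pub size: u64,
    /// Paths of the identical files.
    pub paths: Vec<String>,
}

/// Order groups by file size when `by_size` is true,
/// otherwise by the number of duplicate files in each group.
/// Assumption: ascending order in both cases.
pub fn sort_groups(groups: &mut [DuplicateGroup], by_size: bool) {
    if by_size {
        groups.sort_by_key(|g| g.size);
    } else {
        groups.sort_by_key(|g| g.paths.len());
    }
}
```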

Building

To build and install from source, run the following command:

cargo install find_duplicate_files

Alternatively, clone the project from GitHub, then compile and install the executable:

git clone https://github.com/claudiofsr/find_duplicate_files.git

cd find_duplicate_files

cargo b -r && cargo install --path=.

Mutually exclusive features

Directories are walked recursively with either jwalk or walkdir; only one of the two can be enabled at a time.

In general, jwalk (default) is faster than walkdir.

To build with walkdir instead:

cargo b -r && cargo install --path=. --features walkdir
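For readers unfamiliar with what a recursive walk involves, here is a minimal sketch using only std::fs. It is not how jwalk or walkdir are implemented and not code from this project. The names collect_files and walk are hypothetical, the max_depth and omit_hidden parameters only mimic the --max_depth and --omit_hidden options, and counting files directly under the root as depth 1 is an assumption.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect regular files under `root`.
/// Assumption: files directly inside `root` are at depth 1.
pub fn collect_files(
    root: &Path,
    max_depth: Option<usize>,
    omit_hidden: bool,
) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    walk(root, 1, max_depth, omit_hidden, &mut files)?;
    files.sort(); // deterministic order
    Ok(files)
}

fn walk(
    dir: &Path,
    depth: usize,
    max_depth: Option<usize>,
    omit_hidden: bool,
    files: &mut Vec<PathBuf>,
) -> io::Result<()> {
    // Stop descending once the depth limit is exceeded.
    if max_depth.map_or(false, |max| depth > max) {
        return Ok(());
    }
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // Hidden entries are those whose name starts with '.'.
        if omit_hidden && entry.file_name().to_string_lossy().starts_with('.') {
            continue;
        }
        // file_type() does not follow symlinks, so symlinks are skipped.
        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            walk(&entry.path(), depth + 1, max_depth, omit_hidden, files)?;
        } else if file_type.is_file() {
            files.push(entry.path());
        }
    }
    Ok(())
}
```

This version visits directories one at a time on a single thread, which keeps the sketch short.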