get_dir_hash
Deterministic directory hashing with glob ignores and optional metadata, powered by BLAKE3.
Tiny, fast, and predictable. Great for cache keys, change detection, CI, and reproducible builds.
Features
- Deterministic: stable walk order and path framing, so identical trees produce identical digests
- Fast: streams file contents; BLAKE3 under the hood
- Ignores: simple .gitignore-like glob rules (via globset)
- Metadata (opt-in): include file mode (Unix) and mtime (secs/nanos)
- Symlinks: optionally follow symlinks during traversal
- Tiny: zero heavy deps (just blake3, globset, walkdir, and a tiny CLI parser)
Install
# CLI
# Library
CLI usage
# hash current directory
# pick a dir
# ignore patterns (can be repeated)
# load patterns from a file
# follow symlinks and include basic metadata (mode + mtime)
# disable auto-loading of .get_dir_hash_ignore in root
get_dir_hash also auto-loads .get_dir_hash_ignore from the root directory unless --no-dotfile is passed.
Example .get_dir_hash_ignore:
# ignore build artifacts and logs
target/**
**/*.log
*.tmp
Output format:
<hex-digest> <path>
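For scripts that consume this output, a line can be split at the first space. The following is a minimal sketch, not part of the crate: parse_output_line is a hypothetical helper, and it assumes the single-space layout shown above and a digest made only of hex characters.

```rust
/// Hypothetical helper (not part of get_dir_hash): split one output line
/// into (digest, path), assuming the "<hex-digest> <path>" layout.
fn parse_output_line(line: &str) -> Option<(&str, &str)> {
    // Drop the trailing newline, then split at the first space only,
    // so paths that contain spaces survive intact.
    let (digest, path) = line.trim_end_matches(['\r', '\n']).split_once(' ')?;
    // Tolerate extra padding between the two fields.
    let path = path.trim_start();
    let is_hex = !digest.is_empty() && digest.bytes().all(|b| b.is_ascii_hexdigit());
    if is_hex && !path.is_empty() {
        Some((digest, path))
    } else {
        None
    }
}

fn main() {
    let line = "00ff12ab ./my project\n";
    match parse_output_line(line) {
        Some((digest, path)) => println!("digest={digest} path={path}"),
        None => eprintln!("unrecognized line"),
    }
}
```

Splitting only once keeps the helper usable as a cache-key extractor even when the hashed directory name contains spaces.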
Library usage
use ;
use std::path::Path;
What exactly is hashed?
For every regular file (after ignore rules):
- Framing: we feed the outer BLAKE3 hasher with a domain tag b"get_dir_hash-v1\0" and, per file, a record: b"F\0" + <normalized-relative-path> + b"\0" + <BLAKE3(content)>
- Optional metadata (--include-metadata / Options::include_metadata):
  - Unix: file mode is included.
  - All platforms: mtime as (secs, nanos) is included.
Relative paths are normalized to Unix-style separators (/).
Ordering is stable (sorted by normalized path). You can also opt into case-insensitive path ordering via Options if needed for Windows-like behavior in caches.
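The framing, normalization, and ordering described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the crate's code: fnv1a64 is a non-cryptographic stand-in for BLAKE3 (so the content digest here is 8 bytes rather than a BLAKE3 digest), and normalize and framed_stream are hypothetical names. Metadata and case-insensitive ordering are left out.

```rust
/// Stand-in for BLAKE3 so the sketch needs no external crates.
/// FNV-1a is NOT cryptographic; the real tool uses BLAKE3.
fn fnv1a64(data: &[u8]) -> [u8; 8] {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h.to_be_bytes()
}

/// Normalize a relative path to Unix-style separators.
fn normalize(rel: &str) -> String {
    rel.replace('\\', "/")
}

/// Build the byte stream that would be fed to the outer hasher:
/// the domain tag, then one record per file in sorted path order.
fn framed_stream(files: &[(&str, &[u8])]) -> Vec<u8> {
    let mut entries: Vec<(String, &[u8])> =
        files.iter().map(|(p, c)| (normalize(p), *c)).collect();
    // Sort by normalized path so discovery order does not matter.
    entries.sort_by(|a, b| a.0.cmp(&b.0));

    let mut out = b"get_dir_hash-v1\0".to_vec();
    for (path, content) in &entries {
        out.extend_from_slice(b"F\0");
        out.extend_from_slice(path.as_bytes());
        out.push(0);
        out.extend_from_slice(&fnv1a64(content));
    }
    out
}

fn main() {
    // Same tree, listed in a different order and with different separators.
    let a = [
        ("src\\main.rs", "fn main() {}".as_bytes()),
        ("Cargo.toml", "[package]".as_bytes()),
    ];
    let b = [
        ("Cargo.toml", "[package]".as_bytes()),
        ("src/main.rs", "fn main() {}".as_bytes()),
    ];
    assert_eq!(framed_stream(&a), framed_stream(&b));
    println!("stream length: {} bytes", framed_stream(&a).len());
}
```

Because both inputs normalize and sort to the same sequence of records, any hasher fed this stream sees identical bytes for identical trees.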
Ignore rules
- Syntax provided by globset: supports **, *, ?, etc.
- Patterns are evaluated relative to the root.
- Not supported: !-negations.
- Sources of patterns:
  - Inline via --ignore / Options::ignore_patterns
  - Files via --ignore-file / Options::ignore_files
  - Auto-loaded .get_dir_hash_ignore in root (unless --no-dotfile)
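As a rough illustration of how pattern sources might be merged, the sketch below reads ignore-file text and appends it to inline patterns. It rests on assumptions rather than the crate's actual behavior: parse_ignore_text is a hypothetical helper, blank lines and lines starting with # are skipped (as the example .get_dir_hash_ignore suggests), and a leading ! is reported as an error because negations are listed as unsupported. Glob matching itself is left to globset and is not reproduced here.

```rust
/// Hypothetical helper: collect glob patterns from ignore-file text.
/// Assumes blank lines and '#' comment lines are skipped, and treats
/// '!' negations as an error since they are not supported.
fn parse_ignore_text(text: &str) -> Result<Vec<String>, String> {
    let mut patterns = Vec::new();
    for (idx, raw) in text.lines().enumerate() {
        let line = raw.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        if line.starts_with('!') {
            return Err(format!(
                "line {}: negation is not supported: {}",
                idx + 1,
                line
            ));
        }
        patterns.push(line.to_string());
    }
    Ok(patterns)
}

fn main() {
    let file = "# ignore build artifacts and logs\ntarget/**\n**/*.log\n*.tmp\n";
    // Inline patterns (as from --ignore) first, then patterns from the file.
    let mut all = vec!["node_modules/**".to_string()];
    all.extend(parse_ignore_text(file).expect("valid ignore file"));
    println!("{all:?}");
}
```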
Why BLAKE3?
- Cryptographically strong and very fast
- Designed for parallelism and modern CPUs
- Widely used in the Rust ecosystem (blake3 crate)
Determinism
- Path normalization and sorted relative paths ensure stable input order.
- Hash framing with domain tags and zero byte separators removes ambiguity.
- Ignores and metadata flags must be identical across runs for equal outputs.
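To make the ambiguity point concrete, the sketch below compares naive concatenation with a framed record. It is a simplified illustration: naive and framed are hypothetical helpers, and raw content stands in for the BLAKE3 content digest so that no external crate is needed.

```rust
/// Naive concatenation: fields run together with no boundaries.
fn naive(path: &str, content: &[u8]) -> Vec<u8> {
    let mut out = path.as_bytes().to_vec();
    out.extend_from_slice(content);
    out
}

/// Framed record: record tag, path, zero byte, payload.
/// The real tool stores a BLAKE3 digest of the content as the payload;
/// raw content is used here only to keep the sketch dependency-free.
fn framed(path: &str, content: &[u8]) -> Vec<u8> {
    let mut out = b"F\0".to_vec();
    out.extend_from_slice(path.as_bytes());
    out.push(0); // file names cannot contain a zero byte, so this is unambiguous
    out.extend_from_slice(content);
    out
}

fn main() {
    // Two different trees: file "ab" containing "c" vs file "a" containing "bc".
    assert_eq!(naive("ab", b"c"), naive("a", b"bc")); // same bytes: a collision
    assert_ne!(framed("ab", b"c"), framed("a", b"bc")); // distinct byte streams
    println!("framing keeps the two inputs distinct");
}
```

Without the separator, moving a byte from the file name into the content leaves the hashed stream unchanged; with it, the boundary is part of the input.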
Notes & caveats
- Only regular files are hashed. Directories and device nodes are skipped.
- Symlinks are not followed by default (Options::follow_symlinks = false).
- Metadata inclusion is optional. If enabled, the digest can change even when contents stay the same (e.g., mtime updates).
- Paths are normalized to use / as a separator in the digest framing.
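To see what the optional metadata amounts to, the sketch below reads the same fields using only the standard library. It is an assumption-laden illustration: mtime_parts and file_mode are hypothetical names, and how the tool encodes these values into the digest is not specified here.

```rust
use std::fs::{self, Metadata};
use std::io;
use std::time::UNIX_EPOCH;

/// Extract mtime as (secs, nanos) since the Unix epoch.
fn mtime_parts(meta: &Metadata) -> io::Result<(u64, u32)> {
    let modified = meta.modified()?;
    let since_epoch = modified
        .duration_since(UNIX_EPOCH)
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e))?;
    Ok((since_epoch.as_secs(), since_epoch.subsec_nanos()))
}

/// Unix file mode bits; None on other platforms.
#[cfg(unix)]
fn file_mode(meta: &Metadata) -> Option<u32> {
    use std::os::unix::fs::PermissionsExt;
    Some(meta.permissions().mode())
}

#[cfg(not(unix))]
fn file_mode(_meta: &Metadata) -> Option<u32> {
    None
}

fn main() -> io::Result<()> {
    let path = std::env::temp_dir().join("get_dir_hash_mtime_demo.txt");
    fs::write(&path, b"same contents")?;
    // symlink_metadata does not follow symlinks, so a symlink is never
    // mistaken for the regular file it points to.
    let meta = fs::symlink_metadata(&path)?;
    if meta.file_type().is_file() {
        let (secs, nanos) = mtime_parts(&meta)?;
        println!("mtime = ({secs}, {nanos}), mode = {:?}", file_mode(&meta));
    }
    fs::remove_file(&path)?;
    Ok(())
}
```

Rewriting the file with identical bytes updates its mtime, which is why a metadata-inclusive digest can differ between runs even though the contents match.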
Rust Version
Tested with Rust v1.88
CI & Releases
- CI runs on Linux/macOS/Windows (build, test, clippy, fmt).
- GitHub Releases attach prebuilt binaries for common targets when pushing a tag like v0.1.0.
License
Licensed under MIT.
Contributing
Issues and PRs are welcome! Please keep changes minimal and deterministic, and avoid heavy dependencies. Cheers!