Hash a file by path. Uses mmap for large files (zero-copy, no read() syscalls),
single-read + single-shot hash for small files, and streaming read as fallback.
Hash a file without fstat — just open, read until EOF, hash.
For many-file workloads (100+ tiny files), skipping fstat saves ~5µs/file.
Uses a two-tier buffer strategy: a small stack buffer (4KB) for the initial read,
then a fallback to a larger stack buffer (64KB) or a streaming hash for bigger files.
For the benchmark's 55-byte files: a single read() into the 4KB buffer captures the whole file, which is hashed immediately.
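
A minimal sketch of the open/read-until-EOF/hash flow described above, not the actual implementation. `Fnv1a`, `read_full`, and `hash_file_nostat` are hypothetical names, and 64-bit FNV-1a stands in for the real digest so that the example needs only the standard library. For simplicity this sketch streams everything beyond the first 4KB through the 64KB buffer, and it issues one extra read() to confirm EOF rather than trusting a short read.

```rust
use std::fs::File;
use std::io::{self, Read};
use std::path::Path;

/// Stand-in streaming hasher (64-bit FNV-1a). Hypothetical placeholder
/// for the real SHA-256/MD5 state; chosen only to avoid external crates.
struct Fnv1a(u64);

impl Fnv1a {
    fn new() -> Self {
        Fnv1a(0xcbf29ce484222325)
    }
    fn update(&mut self, data: &[u8]) {
        for &b in data {
            self.0 ^= u64::from(b);
            self.0 = self.0.wrapping_mul(0x100000001b3);
        }
    }
    fn finish(&self) -> u64 {
        self.0
    }
}

/// Read until `buf` is full or EOF is reached; returns the bytes read.
fn read_full(r: &mut impl Read, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match r.read(&mut buf[filled..]) {
            Ok(0) => break, // EOF
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Hash a file without calling fstat: the size is never queried up front.
fn hash_file_nostat(path: &Path) -> io::Result<u64> {
    let mut file = File::open(path)?;
    let mut hasher = Fnv1a::new();

    // Tier 1: small stack buffer. Tiny files end here.
    let mut small = [0u8; 4096];
    let n = read_full(&mut file, &mut small)?;
    hasher.update(&small[..n]);
    if n < small.len() {
        return Ok(hasher.finish());
    }

    // Tier 2: larger stack buffer, fed to the hasher chunk by chunk.
    let mut large = [0u8; 64 * 1024];
    loop {
        let n = read_full(&mut file, &mut large)?;
        hasher.update(&large[..n]);
        if n < large.len() {
            break;
        }
    }
    Ok(hasher.finish())
}
```

Both buffers live on the stack, so the tiny-file path performs no heap allocation for file data.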
Batch-hash multiple files with SHA-256/MD5 using std::thread::scope.
Uses lightweight OS threads instead of rayon’s work-stealing pool.
For single-invocation tools, this saves ~300µs of rayon thread pool
initialization (spawning N-1 threads + setting up work-stealing deques).
Returns results in input order.
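
The scoped-thread approach might look roughly like the sketch below. It is an illustration under stated assumptions: `hash_files_scoped` and `digest` are hypothetical names, `digest` is a 64-bit FNV-1a stand-in for SHA-256/MD5, and splitting the input into one contiguous chunk per worker is only one possible way to preserve input order.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::thread;

/// Hypothetical stand-in digest (64-bit FNV-1a), used so the sketch
/// depends only on the standard library.
fn digest(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= u64::from(b);
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Hash every file on plain OS threads and return results in input order.
fn hash_files_scoped(paths: &[PathBuf]) -> Vec<io::Result<u64>> {
    if paths.is_empty() {
        return Vec::new();
    }
    // At most one worker per core, and never more workers than files.
    let workers = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1)
        .min(paths.len());
    let chunk = (paths.len() + workers - 1) / workers;

    thread::scope(|s| {
        // Spawn all workers first; each borrows a contiguous slice of `paths`.
        let handles: Vec<_> = paths
            .chunks(chunk)
            .map(|group| {
                s.spawn(move || {
                    group
                        .iter()
                        .map(|p| fs::read(p).map(|bytes| digest(&bytes)))
                        .collect::<Vec<_>>()
                })
            })
            .collect();
        // Joining in spawn order and concatenating keeps the input order.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("hash worker panicked"))
            .collect()
    })
}
```

Because the scope guarantees that all workers finish before it returns, the threads can borrow `paths` directly without `Arc` or cloning.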
Parse a BSD-style tag line: “ALGO (filename) = hash”
Returns (expected_hash, filename, optional_bits).
optional_bits is the hash length in bits, parsed from the algo name (e.g., BLAKE2b-256 -> Some(256)).
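
As a rough illustration of the parsing described above, the following sketch splits a tag line into its parts. The name `parse_bsd_tag_line` and the exact rules are assumptions: the bit length is taken to be a numeric suffix after the last hyphen in the algorithm name, and the filename is allowed to contain parentheses by splitting on the last `) = `.

```rust
/// Parse "ALGO (filename) = hash" into (expected_hash, filename, optional_bits).
/// Hypothetical sketch; returns None if the line is not in tag format.
fn parse_bsd_tag_line(line: &str) -> Option<(&str, &str, Option<usize>)> {
    let line = line.trim_end_matches(|c: char| c == '\r' || c == '\n');

    // The algorithm name ends at the first " (".
    let (algo, rest) = line.split_once(" (")?;
    // The filename may itself contain ") = ", so split on the last one.
    let (filename, hash) = rest.rsplit_once(") = ")?;
    if algo.is_empty() || hash.is_empty() {
        return None;
    }

    // Assumption: a numeric suffix after the last '-' is the length in bits.
    let bits = algo
        .rsplit_once('-')
        .and_then(|(_, suffix)| suffix.parse::<usize>().ok());

    Some((hash, filename, bits))
}
```

The returned string slices borrow from the input line, so parsing a line performs no allocation.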
Issue readahead hints for a list of file paths to warm the page cache.
Uses POSIX_FADV_WILLNEED, which is non-blocking and batches efficiently.
Only issues hints for files >= 1MB; small files are read fast enough
that the fadvise syscall overhead isn’t worth it.
Check if parallel hashing is worthwhile for the given file paths.
Always parallelize with 2+ files — rayon’s thread pool is lazily initialized
once and reused, so per-file work-stealing overhead is negligible (~1µs).
Removing the stat()-based size check eliminates N extra syscalls for N files.
Build and write the standard GNU hash output line in a single write() call.
Format: “hash  filename\n” (text mode, two spaces) or “hash *filename\n” (binary mode).
For escaped filenames: “\hash  escaped_filename\n” (the line is prefixed with a backslash).
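
The line format can be sketched as follows. This is a simplified illustration, not the actual implementation: `write_hash_line` is a hypothetical name, filenames are taken as `&str` (real paths may not be valid UTF-8), and only backslash and newline are escaped. The whole line is assembled in one buffer and handed to a single `write_all`, which on an unbuffered descriptor normally results in one write() call, although `write_all` may retry after a short write.

```rust
use std::io::{self, Write};

/// Build the GNU-style output line in one buffer and write it in one call.
/// Hypothetical sketch: `binary` selects the '*' marker before the filename.
fn write_hash_line(
    out: &mut impl Write,
    hash_hex: &str,
    filename: &str,
    binary: bool,
) -> io::Result<()> {
    // Assumption: only backslash and newline trigger the escaped form.
    let needs_escape = filename.contains(|c: char| c == '\\' || c == '\n');

    let mut line = String::with_capacity(hash_hex.len() + filename.len() + 4);
    if needs_escape {
        line.push('\\'); // leading backslash marks an escaped filename
    }
    line.push_str(hash_hex);
    line.push(' ');
    line.push(if binary { '*' } else { ' ' });
    if needs_escape {
        for c in filename.chars() {
            match c {
                '\\' => line.push_str("\\\\"),
                '\n' => line.push_str("\\n"),
                other => line.push(other),
            }
        }
    } else {
        line.push_str(filename);
    }
    line.push('\n');

    out.write_all(line.as_bytes())
}
```

Assembling the line before writing avoids interleaved partial lines when several threads share the same output.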