Module hash

Structs

CheckOptions
Options for check mode.
CheckResult
Result of check mode verification.

Enums

HashAlgorithm
Supported hash algorithms.

Functions

blake2b_hash_data
Hash raw data with BLAKE2b variable output length. output_bytes is the output size in bytes (e.g., 32 for 256-bit).
blake2b_hash_file
Hash a file with BLAKE2b variable output length. Uses mmap for large files (zero-copy), single-read for small files, and streaming read as fallback.
blake2b_hash_files_many
Batch-hash multiple files with BLAKE2b using multi-buffer SIMD.
blake2b_hash_reader
Hash a reader with BLAKE2b variable output length. Uses thread-local buffer for cache-friendly streaming.
blake2b_hash_stdin
Hash stdin with BLAKE2b variable output length. Tries fadvise if stdin is a regular file (shell redirect), then streams.
check_file
Verify checksums from a check file. Each line should be “hash filename” or “hash *filename” or “ALGO (filename) = hash”.
hash_bytes
Compute hash of a byte slice directly (zero-copy fast path).
hash_file
Hash a file by path. Uses mmap for large files (zero-copy, no read() syscalls), single-read + single-shot hash for small files, and streaming read as fallback.
hash_file_nostat
Hash a file without fstat: just open, read until EOF, and hash. For many-file workloads (100+ tiny files), skipping fstat saves ~5µs/file. Uses a two-tier buffer strategy: a small stack buffer (4KB) for the initial read, then a larger stack buffer (64KB) or a streaming hash for bigger files. For the benchmark’s 55-byte files, a single read() into the 4KB buffer captures the whole file, which is then hashed immediately.
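The two-tier read described for hash_file_nostat can be sketched roughly as follows. This is an illustration under stated assumptions, not the crate's implementation: `digest_nostat_sketch` and `fnv1a_update` are hypothetical names, FNV-1a stands in for the real hash so the example needs only the standard library, and the 64KB tier and the streaming tier are merged into one loop for brevity.

```rust
use std::io::{self, Read};

const FNV_OFFSET: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// Stand-in streaming digest (FNV-1a 64-bit), NOT a cryptographic hash.
fn fnv1a_update(mut state: u64, data: &[u8]) -> u64 {
    for &b in data {
        state ^= b as u64;
        state = state.wrapping_mul(FNV_PRIME);
    }
    state
}

/// Hypothetical helper: digest a source of unknown size without a stat call.
/// Returns the digest and whether the small-buffer fast path was taken.
fn digest_nostat_sketch<R: Read>(mut src: R) -> io::Result<(u64, bool)> {
    // Tier 1: try to capture the whole input in a 4KB stack buffer.
    let mut small = [0u8; 4096];
    let mut filled = 0;
    while filled < small.len() {
        let n = src.read(&mut small[filled..])?;
        if n == 0 {
            // EOF before the buffer filled: hash in a single shot.
            return Ok((fnv1a_update(FNV_OFFSET, &small[..filled]), true));
        }
        filled += n;
    }
    // Tier 2: the input is at least 4KB, so stream the rest through 64KB.
    let mut state = fnv1a_update(FNV_OFFSET, &small);
    let mut large = [0u8; 65536];
    loop {
        let n = src.read(&mut large)?;
        if n == 0 {
            return Ok((state, false));
        }
        state = fnv1a_update(state, &large[..n]);
    }
}
```

Any `Read` implementor can be passed in, for example an opened `std::fs::File` or a byte slice. The point of the sketch is that no size lookup happens up front; tiny inputs finish after the first tier.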
hash_files_parallel
Batch-hash multiple files with SHA-256/MD5 using std::thread::scope. Uses lightweight OS threads instead of rayon’s work-stealing pool. For single-invocation tools, this saves ~300µs of rayon thread pool initialization (spawning N-1 threads + setting up work-stealing deques). Returns results in input order.
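A minimal sketch of the scoped-thread pattern that hash_files_parallel describes, assuming in-memory buffers instead of files. `hash_all_in_order` and `toy_digest` are hypothetical names, and std's `DefaultHasher` is a stand-in for SHA-256/MD5 so the example has no dependencies. Each thread writes into its own disjoint slice of the output, which is one way to keep results in input order without sorting afterwards.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::thread;

/// Stand-in digest (NOT SHA-256 or MD5), used to keep the sketch std-only.
fn toy_digest(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(data);
    h.finish()
}

/// Hypothetical helper: digest every input on scoped OS threads and
/// return the results in input order.
fn hash_all_in_order(inputs: &[Vec<u8>], threads: usize) -> Vec<u64> {
    let mut results = vec![0u64; inputs.len()];
    if inputs.is_empty() {
        return results;
    }
    let threads = threads.clamp(1, inputs.len());
    // Ceiling division: each thread gets one contiguous chunk.
    let chunk = (inputs.len() + threads - 1) / threads;
    thread::scope(|s| {
        // Pair each input chunk with the matching output chunk.
        for (ins, outs) in inputs.chunks(chunk).zip(results.chunks_mut(chunk)) {
            s.spawn(move || {
                for (input, out) in ins.iter().zip(outs.iter_mut()) {
                    *out = toy_digest(input);
                }
            });
        }
        // All spawned threads are joined when the scope ends.
    });
    results
}
```

Because `std::thread::scope` joins its threads before returning, the closures can borrow `inputs` and `results` directly, with no `Arc` or channel needed.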
hash_reader
Compute hash of data from a reader, returning hex string.
hash_stdin
Hash stdin. Uses fadvise for file redirects, streaming for pipes.
parse_check_line
Parse a checksum line in any supported format.
parse_check_line_tag
Parse a BSD-style tag line: “ALGO (filename) = hash”. Returns (expected_hash, filename, optional_bits), where optional_bits is the hash length in bits parsed from the algorithm name (e.g., BLAKE2b-256 -> Some(256)).
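The tag-line format can be parsed along these lines. This is a sketch only: `parse_tag_line_sketch` is a hypothetical name and its signature is an assumption, not the crate's actual API. It splits on the first " (" and the last ") = " so that filenames containing parentheses survive, and it reads the bit length only from a "BLAKE2b-" prefix.

```rust
/// Hypothetical helper: parse "ALGO (filename) = hash".
/// Returns (expected_hash, filename, optional_bits).
fn parse_tag_line_sketch(line: &str) -> Option<(&str, &str, Option<usize>)> {
    // The algorithm name ends at the first " (".
    let open = line.find(" (")?;
    let algo = &line[..open];
    let rest = &line[open + 2..];
    // Filenames may contain ")", so split on the LAST ") = ".
    let close = rest.rfind(") = ")?;
    let filename = &rest[..close];
    let hash = rest[close + 4..].trim_end();
    let is_hex = !hash.is_empty() && hash.bytes().all(|b| b.is_ascii_hexdigit());
    if filename.is_empty() || !is_hex {
        return None;
    }
    // "BLAKE2b-256" -> Some(256); names without a length suffix -> None.
    let bits = algo
        .strip_prefix("BLAKE2b-")
        .and_then(|n| n.parse::<usize>().ok());
    Some((hash, filename, bits))
}
```

A line in GNU format has no " (" marker, so this function returns None for it and a caller could fall through to a GNU-format parser.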
print_hash
Print hash result in GNU format: “hash filename\n”. Uses raw byte writes to avoid std::fmt overhead.
print_hash_tag
Print hash result in BSD tag format: “ALGO (filename) = hash\n”
print_hash_tag_b2sum
Print hash in BSD tag format with BLAKE2b length info: “BLAKE2b (filename) = hash” for 512-bit output, or with the bit length appended to the algorithm name for other lengths (e.g., “BLAKE2b-256 (filename) = hash”).
print_hash_tag_b2sum_zero
Print hash in BSD tag format with BLAKE2b length info and NUL terminator.
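The naming rule shared by the two BLAKE2b tag printers above can be illustrated with a small formatter. `b2sum_tag_line` is a hypothetical name that returns a String for clarity; the crate's printers write to output instead, and their signatures are not shown here.

```rust
/// Hypothetical helper: build a BLAKE2b tag line.
/// `bits` is the digest length in bits; `zero` selects a NUL terminator
/// instead of a newline.
fn b2sum_tag_line(filename: &str, hash_hex: &str, bits: usize, zero: bool) -> String {
    let terminator = if zero { '\0' } else { '\n' };
    if bits == 512 {
        // Full-length output uses the bare algorithm name.
        format!("BLAKE2b ({}) = {}{}", filename, hash_hex, terminator)
    } else {
        // Any other length is appended to the name, e.g. "BLAKE2b-256".
        format!("BLAKE2b-{} ({}) = {}{}", bits, filename, hash_hex, terminator)
    }
}
```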
print_hash_tag_zero
Print hash in BSD tag format with NUL terminator.
print_hash_zero
Print hash in GNU format with NUL terminator instead of newline.
readahead_files
Issue readahead hints for a list of file paths to warm the page cache. Uses POSIX_FADV_WILLNEED which is non-blocking and batches efficiently. Only issues hints for files >= 1MB; small files are read fast enough that the fadvise syscall overhead isn’t worth it.
readahead_files_all
Issue readahead hints for ALL file paths (no size threshold). For multi-file benchmarks, even small files benefit from batched readahead.
should_use_parallel
Check if parallel hashing is worthwhile for the given file paths. Always parallelize with 2+ files — rayon’s thread pool is lazily initialized once and reused, so per-file work-stealing overhead is negligible (~1µs). Removing the stat()-based size check eliminates N extra syscalls for N files.
write_hash_line
Build and write the standard GNU hash output line in a single write() call. Format: “hash filename\n” or “hash *filename\n” (binary mode). For escaped filenames: “\hash escaped_filename\n”.
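A rough sketch of assembling the GNU-style line in one buffer before writing it. The names `build_hash_line` and `write_hash_line_sketch` are hypothetical. Two details are assumptions taken from GNU coreutils conventions rather than from this crate: the separator is a space followed by a mode character (space for text, `*` for binary), and escaping covers only backslash and newline in the filename.

```rust
use std::io::{self, Write};

/// Hypothetical helper: assemble the whole output line in memory.
fn build_hash_line(hash: &str, filename: &str, binary: bool) -> Vec<u8> {
    let needs_escape = filename.contains('\\') || filename.contains('\n');
    let mut line = Vec::with_capacity(hash.len() + filename.len() + 4);
    if needs_escape {
        // A leading backslash marks a line with an escaped filename.
        line.push(b'\\');
    }
    line.extend_from_slice(hash.as_bytes());
    line.push(b' ');
    line.push(if binary { b'*' } else { b' ' });
    for &b in filename.as_bytes() {
        match b {
            b'\\' => line.extend_from_slice(b"\\\\"),
            b'\n' => line.extend_from_slice(b"\\n"),
            _ => line.push(b),
        }
    }
    line.push(b'\n');
    line
}

/// Hypothetical helper: emit the assembled line with one write_all call,
/// which on an unbuffered writer typically maps to a single write().
fn write_hash_line_sketch<W: Write>(
    out: &mut W,
    hash: &str,
    filename: &str,
    binary: bool,
) -> io::Result<()> {
    out.write_all(&build_hash_line(hash, filename, binary))
}
```

Building the line first avoids several small writes per file, which matters most when output goes to an unbuffered descriptor.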
write_hash_tag_line
Build and write BSD tag format output in a single write() call. Format: “ALGO (filename) = hash\n”