Crate hibp_verifier

High-performance library for checking passwords against the Have I Been Pwned breach database using binary search on a compact 6-byte (sha1t48) format.

This library provides microsecond-scale password breach checking (about 1.4 us for an isolated lookup) by reading pre-processed HIBP dataset files and performing binary search on sorted records. The hot path is zero-allocation for maximum performance.

§Quick Start

use hibp_verifier::BreachChecker;
use std::path::Path;

let checker = BreachChecker::new(Path::new("/path/to/hibp-data"));

match checker.is_breached("password123") {
    Ok(true) => println!("Password found in breach database"),
    Ok(false) => println!("Password not found"),
    Err(e) => eprintln!("Error: {}", e),
}

§Dataset Setup

This library requires a pre-downloaded dataset in sha1t48 binary format. Use hibp-bin-fetch to download and convert the data:

cargo install hibp-bin-fetch
hibp-bin-fetch --output /path/to/hibp-data

§Binary Format

The library expects a directory containing 1,048,576 files named 00000.bin through FFFFF.bin. Each file contains sorted 6-byte records (bytes 2-7 of SHA1 hashes) for the corresponding prefix.
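As a rough illustration of this layout, the sketch below maps a 20-byte SHA1 digest to a file name and a 6-byte record. It is not the crate's internal code: the helper names `prefix_file_name` and `record_bytes` are hypothetical, the constants are redeclared locally, and "bytes 2-7" is assumed to mean zero-indexed bytes 2 through 7 inclusive.

```rust
// Local stand-ins for the crate constants of the same names.
const PREFIX_LEN: usize = 5; // hex characters used for the file name
const RECORD_SIZE: usize = 6; // bytes stored per record

/// Hypothetical helper: builds the file name from the first 5 uppercase
/// hex characters of the digest, e.g. "5BAA6.bin".
fn prefix_file_name(digest: &[u8; 20]) -> String {
    // Three bytes give six hex characters; only the first five are kept.
    let hex = format!("{:02X}{:02X}{:02X}", digest[0], digest[1], digest[2]);
    format!("{}.bin", &hex[..PREFIX_LEN])
}

/// Hypothetical helper: extracts the 6-byte record, assuming zero-indexed
/// bytes 2..=7 of the digest.
fn record_bytes(digest: &[u8; 20]) -> [u8; RECORD_SIZE] {
    let mut record = [0u8; RECORD_SIZE];
    record.copy_from_slice(&digest[2..2 + RECORD_SIZE]);
    record
}
```

For the SHA1 digest of "password" (5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8), this sketch yields the file name 5BAA6.bin and the record 61 E4 C9 B9 3F 3F.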

This format reduces storage from 77 GB (original text) to 13 GB while enabling O(log n) binary search with direct indexing and no parsing overhead.
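The direct-indexing search can be sketched as follows. This is a minimal illustration rather than the crate's implementation: `contains_record` is a hypothetical name, and the file contents are assumed to be already loaded into a byte slice whose length is a multiple of 6.

```rust
use std::cmp::Ordering;

// Local stand-in for the crate constant of the same name.
const RECORD_SIZE: usize = 6;

/// Hypothetical helper: binary search over a buffer of sorted, fixed-width
/// records. Record `i` starts at byte offset `i * RECORD_SIZE`, so no
/// parsing or allocation is needed.
fn contains_record(data: &[u8], needle: &[u8; RECORD_SIZE]) -> bool {
    let mut lo = 0usize;
    let mut hi = data.len() / RECORD_SIZE; // number of whole records
    while lo < hi {
        let mid = lo + (hi - lo) / 2;
        let start = mid * RECORD_SIZE;
        // Byte slices compare lexicographically, matching the sort order.
        match data[start..start + RECORD_SIZE].cmp(&needle[..]) {
            Ordering::Equal => return true,
            Ordering::Less => lo = mid + 1,
            Ordering::Greater => hi = mid,
        }
    }
    false
}
```

Because every record has the same width, the search works on record indices and needs at most about log2(n) comparisons for a file of n records.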

§Performance

High-concurrency benchmark (10k concurrent lookups, 24 worker threads):

API                               Per check
is_breached_async (tokio)         ~3.1 us
is_breached_compio (io-uring)     ~4.6 us
is_breached (sync threads)        ~19.8 us

The sync API is fastest for isolated serial lookups (~1.4 us) but performs poorly under concurrency due to OS thread creation overhead. For concurrent workloads, use is_breached_async, which leverages tokio’s blocking thread pool with work-stealing for optimal throughput.

§Async Support

Enable the tokio feature for async support:

[dependencies]
hibp-verifier = { version = "0.1", features = ["tokio"] }

use hibp_verifier::BreachChecker;
use std::path::Path;

#[tokio::main]
async fn main() -> std::io::Result<()> {
    let checker = BreachChecker::new(Path::new("/path/to/hibp-data"));

    if checker.is_breached_async("password123").await? {
        println!("Password found in breach database!");
    }

    Ok(())
}

The async API performs SHA1 hashing and path construction on the async thread, then uses spawn_blocking only for file I/O. This is faster than tokio::fs::File, which dispatches each file operation to the blocking pool separately, because the whole lookup runs in a single blocking call.

§Compio Support (io-uring)

Enable the compio feature for native io-uring async support:

[dependencies]
hibp-verifier = { version = "0.1", features = ["compio"] }

This uses compio’s native io-uring file I/O. Note that benchmarks show this is ~1.5x slower than the tokio spawn_blocking approach, because io-uring’s thread-local buffers require a runtime model without work-stealing.

Structs§

BreachChecker
Checks if a password has been found in known data breaches.

Constants§

HEX_CHARS
Hex lookup table for prefix conversion.
HIBP_DATA_DIR_ENV
Environment variable name for specifying the HIBP dataset directory.
PREFIX_LEN
The length of a SHA1 hash prefix used for file naming (5 hex characters).
RECORD_SIZE
The length of a sha1t64 record in bytes (truncated 64-bit hash).

Functions§

dataset_path_from_env
Returns the dataset path from the HIBP_DATA_DIR environment variable, or falls back to the default location (pwnedpasswords-bin sibling directory).
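
The lookup order described for dataset_path_from_env can be sketched as below. This is an assumption-laden illustration, not the crate's code: `resolve_dataset_path` and `dataset_path_or` are hypothetical names, the fallback is passed in by the caller because the exact default location is not spelled out here, and treating an empty variable as unset is an added assumption.

```rust
use std::env;
use std::ffi::OsString;
use std::path::{Path, PathBuf};

// Local stand-in for the crate constant; the variable name comes from the
// function description above.
const HIBP_DATA_DIR_ENV: &str = "HIBP_DATA_DIR";

/// Hypothetical pure helper: prefers the environment value and falls back
/// otherwise. An empty value is treated as unset (an assumption).
fn resolve_dataset_path(env_value: Option<OsString>, fallback: &Path) -> PathBuf {
    match env_value {
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        _ => fallback.to_path_buf(),
    }
}

/// Hypothetical wrapper that reads the real process environment.
fn dataset_path_or(fallback: &Path) -> PathBuf {
    resolve_dataset_path(env::var_os(HIBP_DATA_DIR_ENV), fallback)
}
```

Keeping the decision in a pure function makes the fallback rule testable without mutating process-wide environment state.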