Crate bloom_lib

§bloom-lib

Probabilistic data structures for Rust.

This crate provides space-efficient structures that answer set-membership, cardinality, frequency, and similarity questions with bounded, tunable error in a fraction of the memory an exact structure would require. They are built for streaming workloads: insertions are allocation-free, state is serializable, and compatible structures can be merged.
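As a rough illustration of what "compatible" means for merging, the sketch below unions two Bloom-style bit arrays with a bitwise OR. This is the textbook technique, not necessarily how this crate implements it. The `Bits` type and its methods are hypothetical stand-ins and are not part of bloom_lib. The sketch assumes two filters are compatible when they have the same bit length and were filled using the same hash functions.

```rust
/// Hypothetical stand-in for a filter's backing storage (not a bloom_lib type).
#[derive(Clone, Debug, PartialEq)]
struct Bits {
    words: Vec<u64>,
}

impl Bits {
    /// Allocates enough 64-bit words to hold `bit_len` bits, all zero.
    fn new(bit_len: usize) -> Self {
        Bits { words: vec![0; (bit_len + 63) / 64] }
    }

    /// Sets bit `i`.
    fn set(&mut self, i: usize) {
        self.words[i / 64] |= 1u64 << (i % 64);
    }

    /// Reads bit `i`.
    fn get(&self, i: usize) -> bool {
        self.words[i / 64] & (1u64 << (i % 64)) != 0
    }

    /// Unions `other` into `self`. Only meaningful when both sides have the
    /// same length and were filled with the same hash functions (assumption).
    fn merge(&mut self, other: &Bits) -> Result<(), &'static str> {
        if self.words.len() != other.words.len() {
            return Err("incompatible sizes");
        }
        for (a, b) in self.words.iter_mut().zip(&other.words) {
            *a |= *b;
        }
        Ok(())
    }
}

fn main() {
    let mut left = Bits::new(128);
    let mut right = Bits::new(128);
    left.set(3);
    right.set(100);

    // After the merge, `left` reports every bit set on either side.
    left.merge(&right).unwrap();
    assert!(left.get(3) && left.get(100));

    // A differently sized bit array is rejected rather than silently merged.
    assert!(left.merge(&Bits::new(256)).is_err());
}
```

Because the union only ever sets bits, a merged filter still reports every item inserted into either input; the price is a higher false-positive rate than either input had alone.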

§Available structures

  • BloomFilter — probabilistic set membership with a tunable false-positive rate.
  • CuckooFilter — approximate membership that also supports deletion.
  • CountMinSketch — approximate frequency estimation for a stream.
  • HyperLogLog — distinct-count (cardinality) estimation in tiny memory.
  • MinHash — Jaccard similarity estimation between sets.
  • TopK — the most frequent items (heavy hitters) in a stream.

§Example

use bloom_lib::BloomFilter;

// A filter sized for 100,000 items at a 0.1% false-positive rate.
let mut filter = BloomFilter::new(100_000, 0.001).unwrap();

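filter.insert("session-token");
// An inserted item is always reported as present: there are no false negatives.
assert!(filter.contains("session-token"));
// An absent item is reported as absent, except with the configured
// false-positive probability.
assert!(!filter.contains("never-seen"));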
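To give a sense of what the two constructor arguments imply, the sketch below applies the standard Bloom filter sizing formulas, m = -n ln(p) / (ln 2)^2 bits and k = (m / n) ln 2 hash functions. The function `optimal_params` is a hypothetical helper and not part of bloom_lib; the crate may round or size its filters differently.

```rust
/// Textbook Bloom filter sizing (hypothetical helper, not a bloom_lib API).
/// Returns `(bits, hash_count)` for `n` expected items and a target
/// false-positive rate `p`.
fn optimal_params(n: u64, p: f64) -> (u64, u32) {
    assert!(n > 0 && p > 0.0 && p < 1.0, "n must be > 0 and p must be in (0, 1)");
    let ln2 = std::f64::consts::LN_2;
    // m = -n * ln(p) / (ln 2)^2, rounded up to a whole bit.
    let bits = (-(n as f64) * p.ln() / (ln2 * ln2)).ceil();
    // k = (m / n) * ln 2, rounded to the nearest integer and at least 1.
    let hashes = ((bits / n as f64) * ln2).round().max(1.0);
    (bits as u64, hashes as u32)
}

fn main() {
    // Same parameters as the example above: 100,000 items at 0.1%.
    let (bits, hashes) = optimal_params(100_000, 0.001);
    println!("{bits} bits, {hashes} hash functions");
}
```

Under these formulas the example filter needs roughly 1.44 million bits and 10 hash functions, or about 14.4 bits per item regardless of how large each item is.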

§Hashing

Every structure is generic over core::hash::BuildHasher and defaults to the deterministic hash::DefaultHashBuilder. Determinism makes filters reproducible, mergeable, and stable across serialization. When inputs may be adversarial, supply a randomly seeded hasher instead, since a fixed, publicly known hash function lets an attacker craft colliding keys. See the hash module for details.
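The difference between a deterministic and a randomly seeded BuildHasher can be seen with standard-library types alone. The sketch below deliberately avoids bloom_lib, because this page does not show how a custom hasher is passed to a structure. `fingerprint` is a hypothetical helper, and `BuildHasherDefault<DefaultHasher>` stands in for a deterministic builder such as hash::DefaultHashBuilder (an assumption about its behavior, not its implementation).

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::hash::{BuildHasher, BuildHasherDefault};

/// Hashes `item` with whichever builder the caller supplies
/// (hypothetical helper for illustration).
fn fingerprint<S: BuildHasher>(builder: &S, item: &str) -> u64 {
    builder.hash_one(item)
}

fn main() {
    // Deterministic: every instance is keyed identically, so separately
    // constructed builders agree on every hash. Note that std does not
    // promise DefaultHasher's output stays the same across Rust releases.
    let a = BuildHasherDefault::<DefaultHasher>::default();
    let b = BuildHasherDefault::<DefaultHasher>::default();
    assert_eq!(fingerprint(&a, "session-token"), fingerprint(&b, "session-token"));

    // Randomly seeded: each RandomState carries its own keys. One instance is
    // self-consistent, but its hashes are unpredictable to an outside party.
    let r = RandomState::new();
    assert_eq!(fingerprint(&r, "session-token"), fingerprint(&r, "session-token"));
    println!("deterministic: {:#x}", fingerprint(&a, "session-token"));
    println!("random seed:   {:#x}", fingerprint(&r, "session-token"));
}
```

The trade-off follows from the keys: two structures built with independently seeded builders hash the same item to different positions, so the reproducibility that the deterministic default provides no longer holds between them.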

§Feature flags

  • std (default) — enables every structure and the std::error::Error implementation for Error.
  • alloc — enables every structure without requiring std, for heap-capable no_std targets. Implied by std.
  • serde — derives Serialize/Deserialize for every structure. Implies alloc.

With none of these features the crate exposes only VERSION and Error.

§License

Dual-licensed under Apache-2.0 OR MIT.

Modules§

hash
Hashing infrastructure shared by every structure.
prelude
Convenient re-exports for typical usage.

Structs§

BloomFilter (requires feature alloc)
A space-efficient probabilistic set membership test.
CountMinSketch (requires feature alloc)
A sublinear-space frequency estimator.
CuckooFilter (requires feature alloc)
A probabilistic set that supports deletion.
HyperLogLog (requires feature alloc)
Estimates the number of distinct items in a stream in fixed, tiny memory.
MinHash (requires feature alloc)
A fixed-size signature that estimates the Jaccard similarity of the set it summarises against another sketch.
TopK (requires feature alloc)
Tracks the k most frequent items in a stream using bounded memory.

Enums§

Error
Errors returned by fallible operations across the crate.

Constants§

VERSION
Crate version string, populated by Cargo at build time.