Crate b100m_filter

source ·
Expand description

The fastest bloom filter in Rust. No accuracy compromises. Use any hasher.

§Background

Bloom filters are space efficient approximate membership set data structure. False positives from contains are possible, but false negatives are not, i.e. contains for all items in the set is guaranteed to return true, while contains for all items not in the set probably return false.

Blocked bloom filters are supported by an underlying bit vector, chunked into 512, 256, 128, or 64 bit “blocks”, to track item membership. To insert, a number of bits, based on the item’s hash, are set in the underlying bit vector. To check membership, a number of bits, based on the item’s hash, are checked in the underlying bit vector.

Once constructed, neither the bloom filter’s underlying memory usage nor number of bits per item change.

§Implementation

b100m-filter is blazingly fast because it uses L1 cache friendly blocks and efficiently derives many index bits from only one hash per value. Compared to traditional implementations, b100m-filter is 2-5 times faster for small sets of items, and hundreds of times faster for larger item sets. In all cases, b100m-filter maintains competitive false positive rates.

§Examples

Basic usage:

use b100m_filter::BloomFilter;

let num_blocks = 4; // by default, each block is 512 bits

let filter = BloomFilter::builder(num_blocks).items(["42", "🦀"].iter());
assert!(filter.contains("42"));
assert!(filter.contains("🦀"));

Use any hasher:

use b100m_filter::BloomFilter;
use ahash::RandomState;

let num_blocks = 4; // by default, each block is 512 bits

let filter = BloomFilter::builder(num_blocks)
    .hasher(RandomState::default())
    .items(["42", "🦀"].iter());

§References

Structs§

  • A space efficient approximate membership set data structure. False positives from contains are possible, but false negatives are not, i.e. contains for all items in the set is guaranteed to return true, while contains for all items not in the set probably return false.
  • A bloom filter builder.