pub struct BloomFilter<T: ?Sized, S = DefaultHashBuilder> { /* private fields */ }alloc only.Expand description
A space-efficient probabilistic set membership test.
A Bloom filter answers “have I seen this item?” using a fraction of the
memory a real set would need. The trade-off is one-sided error:
contains never reports a false negative (an inserted
item always tests positive) but may report a false positive (an item that
was never inserted may test positive). The probability of a false positive
is tunable at construction time.
Items are not stored, so a Bloom filter cannot enumerate or remove its
contents. When deletion is required, reach for
CuckooFilter instead.
The filter is generic over the item type T and a
BuildHasher S, which defaults to the
deterministic DefaultHashBuilder.
§Examples
use bloom_lib::BloomFilter;
// Size for 10,000 items at a 1% false-positive rate.
let mut filter = BloomFilter::new(10_000, 0.01).unwrap();
filter.insert("alice");
filter.insert("bob");
assert!(filter.contains("alice"));
assert!(filter.contains("bob"));
assert!(!filter.contains("carol")); // very likely absentImplementations§
Source§impl<T: ?Sized> BloomFilter<T, DefaultHashBuilder>
impl<T: ?Sized> BloomFilter<T, DefaultHashBuilder>
Sourcepub fn new(capacity: usize, rate: f64) -> Result<Self, Error>
pub fn new(capacity: usize, rate: f64) -> Result<Self, Error>
Creates a filter sized for capacity items at the target false-positive
rate, using the default hasher.
capacity is the number of distinct items you expect to insert; the
false-positive rate is honoured at that fill level and degrades
gracefully beyond it. The bit count and hash count are derived from the
standard Bloom filter formulas.
§Parameters
capacity: expected number of distinct insertions. Must be non-zero.rate: desired false-positive probability. Must lie in(0.0, 1.0).
§Errors
Returns Error::InvalidParameter if capacity is zero or rate is
not a finite value strictly between 0.0 and 1.0.
§Examples
use bloom_lib::BloomFilter;
let filter = BloomFilter::<&str>::new(1_000, 0.001).unwrap();
assert!(filter.num_bits() >= 1_000);
assert!(filter.num_hashes() >= 1);Sourcepub fn with_dimensions(num_bits: u64, num_hashes: u32) -> Result<Self, Error>
pub fn with_dimensions(num_bits: u64, num_hashes: u32) -> Result<Self, Error>
Creates a filter with an explicit bit count and hash count, using the default hasher.
Use this when you want direct control over the filter’s geometry rather than deriving it from a capacity and rate.
§Parameters
num_bits: size of the bit array. Must be non-zero.num_hashes: number of hash positions probed per item. Must be non-zero.
§Errors
Returns Error::InvalidParameter if either argument is zero.
§Examples
use bloom_lib::BloomFilter;
let filter = BloomFilter::<u64>::with_dimensions(8_192, 5).unwrap();
assert_eq!(filter.num_bits(), 8_192);
assert_eq!(filter.num_hashes(), 5);Source§impl<T: ?Sized, S: BuildHasher> BloomFilter<T, S>
impl<T: ?Sized, S: BuildHasher> BloomFilter<T, S>
Sourcepub fn with_hasher(capacity: usize, rate: f64, hasher: S) -> Result<Self, Error>
pub fn with_hasher(capacity: usize, rate: f64, hasher: S) -> Result<Self, Error>
Creates a filter sized for capacity items at the target false-positive
rate, using a caller-supplied hasher.
Identical to new but lets you plug in any
BuildHasher — for example a randomly-seeded
one for resistance against adversarial inputs.
§Errors
Returns Error::InvalidParameter if capacity is zero or rate is
not a finite value strictly between 0.0 and 1.0.
§Examples
use std::collections::hash_map::RandomState;
use bloom_lib::BloomFilter;
let filter: BloomFilter<&str, RandomState> =
BloomFilter::with_hasher(1_000, 0.01, RandomState::new()).unwrap();Sourcepub fn with_dimensions_and_hasher(
num_bits: u64,
num_hashes: u32,
hasher: S,
) -> Result<Self, Error>
pub fn with_dimensions_and_hasher( num_bits: u64, num_hashes: u32, hasher: S, ) -> Result<Self, Error>
Creates a filter with an explicit geometry and a caller-supplied hasher.
§Errors
Returns Error::InvalidParameter if num_bits or num_hashes is
zero.
§Examples
use bloom_lib::{hash::DefaultHashBuilder, BloomFilter};
let filter = BloomFilter::<u64>::with_dimensions_and_hasher(
4_096,
4,
DefaultHashBuilder,
)
.unwrap();
assert_eq!(filter.num_hashes(), 4);Sourcepub fn insert(&mut self, item: &T) -> boolwhere
T: Hash,
pub fn insert(&mut self, item: &T) -> boolwhere
T: Hash,
Inserts item, returning true if it was probably not present before.
A return of true means at least one of the item’s bits was previously
unset, so the item had definitely not been inserted. A return of false
means every bit was already set, so the item was probably already
present (subject to the filter’s false-positive rate). This makes
insert a convenient deduplicating primitive for streaming input.
§Examples
use bloom_lib::BloomFilter;
let mut filter = BloomFilter::new(100, 0.01).unwrap();
assert!(filter.insert("first")); // newly added
assert!(!filter.insert("first")); // already presentSourcepub fn contains(&self, item: &T) -> boolwhere
T: Hash,
pub fn contains(&self, item: &T) -> boolwhere
T: Hash,
Tests whether item is in the filter.
Returns false only if the item was definitely never inserted. A return
of true means the item is probably present, with the filter’s
configured false-positive probability.
§Examples
use bloom_lib::BloomFilter;
let mut filter = BloomFilter::new(100, 0.01).unwrap();
filter.insert(&7u32);
assert!(filter.contains(&7u32));
assert!(!filter.contains(&99u32));Sourcepub fn clear(&mut self)
pub fn clear(&mut self)
Removes every item, leaving an empty filter with the same geometry.
The bit allocation is retained, so reusing a cleared filter avoids a fresh allocation.
§Examples
use bloom_lib::BloomFilter;
let mut filter = BloomFilter::new(100, 0.01).unwrap();
filter.insert("x");
filter.clear();
assert!(filter.is_empty());
assert!(!filter.contains("x"));Sourcepub fn num_hashes(&self) -> u32
pub fn num_hashes(&self) -> u32
The number of hash positions probed per item.
Sourcepub fn count_ones(&self) -> u64
pub fn count_ones(&self) -> u64
The number of set bits in the filter.
This is the raw population count of the bit array, useful for monitoring
fill level. See estimated_len for an estimate of
the number of distinct items inserted.
Sourcepub fn is_empty(&self) -> bool
pub fn is_empty(&self) -> bool
Returns true if no bits are set, i.e. nothing has been inserted.
§Examples
use bloom_lib::BloomFilter;
let mut filter = BloomFilter::new(100, 0.01).unwrap();
assert!(filter.is_empty());
filter.insert("x");
assert!(!filter.is_empty());Sourcepub fn estimated_len(&self) -> u64
pub fn estimated_len(&self) -> u64
Estimates the number of distinct items inserted so far.
Derived from the fraction of set bits using the standard estimator
n ≈ -(m / k) · ln(1 − X / m), where m is the bit count, k the hash
count, and X the number of set bits. The estimate is approximate and
loses accuracy as the filter saturates.
§Examples
use bloom_lib::BloomFilter;
let mut filter = BloomFilter::new(10_000, 0.01).unwrap();
for i in 0..1_000u32 {
filter.insert(&i);
}
let estimate = filter.estimated_len();
// Within a few percent of the true count of 1,000.
assert!((900..=1_100).contains(&estimate));Sourcepub fn estimated_false_positive_rate(&self) -> f64
pub fn estimated_false_positive_rate(&self) -> f64
Estimates the current false-positive probability at the present fill level.
Computed as (X / m)^k, where X is the number of set bits, m the bit
count, and k the hash count. This reflects the actual fill, so it
rises above the configured target rate once the filter holds more than
its design capacity.
§Examples
use bloom_lib::BloomFilter;
let filter = BloomFilter::<u32>::new(1_000, 0.01).unwrap();
// An empty filter cannot produce a false positive.
assert_eq!(filter.estimated_false_positive_rate(), 0.0);Sourcepub fn merge(&mut self, other: &Self) -> Result<(), Error>
pub fn merge(&mut self, other: &Self) -> Result<(), Error>
Merges other into self by unioning their bit arrays.
After a successful merge, the filter reports positive for every item that was in either operand. Both filters must have been built with identical dimensions (bit count and hash count); the hasher is assumed to match, which holds automatically for the default hasher.
§Errors
Returns Error::IncompatibleParameters if the two filters differ in
bit count or hash count.
§Examples
use bloom_lib::BloomFilter;
let mut a = BloomFilter::new(1_000, 0.01).unwrap();
let mut b = BloomFilter::new(1_000, 0.01).unwrap();
a.insert("from-a");
b.insert("from-b");
a.merge(&b).unwrap();
assert!(a.contains("from-a"));
assert!(a.contains("from-b"));Trait Implementations§
Source§impl<T: Clone + ?Sized, S: Clone> Clone for BloomFilter<T, S>
impl<T: Clone + ?Sized, S: Clone> Clone for BloomFilter<T, S>
Source§fn clone(&self) -> BloomFilter<T, S>
fn clone(&self) -> BloomFilter<T, S>
1.0.0 (const: unstable) · Source§fn clone_from(&mut self, source: &Self)
fn clone_from(&mut self, source: &Self)
source. Read more