Skip to main content

BloomFilter

Struct BloomFilter 

Source
pub struct BloomFilter<T: ?Sized, S = DefaultHashBuilder> { /* private fields */ }
Available on crate feature alloc only.
Expand description

A space-efficient probabilistic set membership test.

A Bloom filter answers “have I seen this item?” using a fraction of the memory a real set would need. The trade-off is one-sided error: contains never reports a false negative (an inserted item always tests positive) but may report a false positive (an item that was never inserted may test positive). The probability of a false positive is tunable at construction time.

Items are not stored, so a Bloom filter cannot enumerate or remove its contents. When deletion is required, reach for CuckooFilter instead.

The filter is generic over the item type T and a BuildHasher S, which defaults to the deterministic DefaultHashBuilder.

§Examples

use bloom_lib::BloomFilter;

// Size for 10,000 items at a 1% false-positive rate.
let mut filter = BloomFilter::new(10_000, 0.01).unwrap();

filter.insert("alice");
filter.insert("bob");

assert!(filter.contains("alice"));
assert!(filter.contains("bob"));
assert!(!filter.contains("carol")); // very likely absent

Implementations§

Source§

impl<T: ?Sized> BloomFilter<T, DefaultHashBuilder>

Source

pub fn new(capacity: usize, rate: f64) -> Result<Self, Error>

Creates a filter sized for capacity items at the target false-positive rate, using the default hasher.

capacity is the number of distinct items you expect to insert; the false-positive rate is honoured at that fill level and degrades gracefully beyond it. The bit count and hash count are derived from the standard Bloom filter formulas.

§Parameters
  • capacity: expected number of distinct insertions. Must be non-zero.
  • rate: desired false-positive probability. Must lie in (0.0, 1.0).
§Errors

Returns Error::InvalidParameter if capacity is zero or rate is not a finite value strictly between 0.0 and 1.0.

§Examples
use bloom_lib::BloomFilter;

let filter = BloomFilter::<&str>::new(1_000, 0.001).unwrap();
assert!(filter.num_bits() >= 1_000);
assert!(filter.num_hashes() >= 1);
Source

pub fn with_dimensions(num_bits: u64, num_hashes: u32) -> Result<Self, Error>

Creates a filter with an explicit bit count and hash count, using the default hasher.

Use this when you want direct control over the filter’s geometry rather than deriving it from a capacity and rate.

§Parameters
  • num_bits: size of the bit array. Must be non-zero.
  • num_hashes: number of hash positions probed per item. Must be non-zero.
§Errors

Returns Error::InvalidParameter if either argument is zero.

§Examples
use bloom_lib::BloomFilter;

let filter = BloomFilter::<u64>::with_dimensions(8_192, 5).unwrap();
assert_eq!(filter.num_bits(), 8_192);
assert_eq!(filter.num_hashes(), 5);
Source§

impl<T: ?Sized, S: BuildHasher> BloomFilter<T, S>

Source

pub fn with_hasher(capacity: usize, rate: f64, hasher: S) -> Result<Self, Error>

Creates a filter sized for capacity items at the target false-positive rate, using a caller-supplied hasher.

Identical to new but lets you plug in any BuildHasher — for example a randomly-seeded one for resistance against adversarial inputs.

§Errors

Returns Error::InvalidParameter if capacity is zero or rate is not a finite value strictly between 0.0 and 1.0.

§Examples
use std::collections::hash_map::RandomState;
use bloom_lib::BloomFilter;

let filter: BloomFilter<&str, RandomState> =
    BloomFilter::with_hasher(1_000, 0.01, RandomState::new()).unwrap();
Source

pub fn with_dimensions_and_hasher( num_bits: u64, num_hashes: u32, hasher: S, ) -> Result<Self, Error>

Creates a filter with an explicit geometry and a caller-supplied hasher.

§Errors

Returns Error::InvalidParameter if num_bits or num_hashes is zero.

§Examples
use bloom_lib::{hash::DefaultHashBuilder, BloomFilter};

let filter = BloomFilter::<u64>::with_dimensions_and_hasher(
    4_096,
    4,
    DefaultHashBuilder,
)
.unwrap();
assert_eq!(filter.num_hashes(), 4);
Source

pub fn insert(&mut self, item: &T) -> bool
where T: Hash,

Inserts item, returning true if it was probably not present before.

A return of true means at least one of the item’s bits was previously unset, so the item had definitely not been inserted. A return of false means every bit was already set, so the item was probably already present (subject to the filter’s false-positive rate). This makes insert a convenient deduplicating primitive for streaming input.

§Examples
use bloom_lib::BloomFilter;

let mut filter = BloomFilter::new(100, 0.01).unwrap();
assert!(filter.insert("first")); // newly added
assert!(!filter.insert("first")); // already present
Source

pub fn contains(&self, item: &T) -> bool
where T: Hash,

Tests whether item is in the filter.

Returns false only if the item was definitely never inserted. A return of true means the item is probably present, with the filter’s configured false-positive probability.

§Examples
use bloom_lib::BloomFilter;

let mut filter = BloomFilter::new(100, 0.01).unwrap();
filter.insert(&7u32);
assert!(filter.contains(&7u32));
assert!(!filter.contains(&99u32));
Source

pub fn clear(&mut self)

Removes every item, leaving an empty filter with the same geometry.

The bit allocation is retained, so reusing a cleared filter avoids a fresh allocation.

§Examples
use bloom_lib::BloomFilter;

let mut filter = BloomFilter::new(100, 0.01).unwrap();
filter.insert("x");
filter.clear();
assert!(filter.is_empty());
assert!(!filter.contains("x"));
Source

pub fn num_bits(&self) -> u64

The size of the underlying bit array.

Source

pub fn num_hashes(&self) -> u32

The number of hash positions probed per item.

Source

pub fn count_ones(&self) -> u64

The number of set bits in the filter.

This is the raw population count of the bit array, useful for monitoring fill level. See estimated_len for an estimate of the number of distinct items inserted.

Source

pub fn is_empty(&self) -> bool

Returns true if no bits are set, i.e. nothing has been inserted.

§Examples
use bloom_lib::BloomFilter;

let mut filter = BloomFilter::new(100, 0.01).unwrap();
assert!(filter.is_empty());
filter.insert("x");
assert!(!filter.is_empty());
Source

pub fn estimated_len(&self) -> u64

Estimates the number of distinct items inserted so far.

Derived from the fraction of set bits using the standard estimator n ≈ -(m / k) · ln(1 − X / m), where m is the bit count, k the hash count, and X the number of set bits. The estimate is approximate and loses accuracy as the filter saturates.

§Examples
use bloom_lib::BloomFilter;

let mut filter = BloomFilter::new(10_000, 0.01).unwrap();
for i in 0..1_000u32 {
    filter.insert(&i);
}
let estimate = filter.estimated_len();
// Within a few percent of the true count of 1,000.
assert!((900..=1_100).contains(&estimate));
Source

pub fn estimated_false_positive_rate(&self) -> f64

Estimates the current false-positive probability at the present fill level.

Computed as (X / m)^k, where X is the number of set bits, m the bit count, and k the hash count. This reflects the actual fill, so it rises above the configured target rate once the filter holds more than its design capacity.

§Examples
use bloom_lib::BloomFilter;

let filter = BloomFilter::<u32>::new(1_000, 0.01).unwrap();
// An empty filter cannot produce a false positive.
assert_eq!(filter.estimated_false_positive_rate(), 0.0);
Source

pub fn merge(&mut self, other: &Self) -> Result<(), Error>

Merges other into self by unioning their bit arrays.

After a successful merge, the filter reports positive for every item that was in either operand. Both filters must have been built with identical dimensions (bit count and hash count); the hasher is assumed to match, which holds automatically for the default hasher.

§Errors

Returns Error::IncompatibleParameters if the two filters differ in bit count or hash count.

§Examples
use bloom_lib::BloomFilter;

let mut a = BloomFilter::new(1_000, 0.01).unwrap();
let mut b = BloomFilter::new(1_000, 0.01).unwrap();
a.insert("from-a");
b.insert("from-b");

a.merge(&b).unwrap();
assert!(a.contains("from-a"));
assert!(a.contains("from-b"));

Trait Implementations§

Source§

impl<T: Clone + ?Sized, S: Clone> Clone for BloomFilter<T, S>

Source§

fn clone(&self) -> BloomFilter<T, S>

Returns a duplicate of the value. Read more
1.0.0 (const: unstable) · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl<T: Debug + ?Sized, S: Debug> Debug for BloomFilter<T, S>

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl<'de, T: ?Sized, S> Deserialize<'de> for BloomFilter<T, S>
where S: Default,

Source§

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer. Read more
Source§

impl<T: ?Sized, S> Serialize for BloomFilter<T, S>

Source§

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer. Read more

Auto Trait Implementations§

§

impl<T, S> Freeze for BloomFilter<T, S>
where S: Freeze, T: ?Sized,

§

impl<T, S> RefUnwindSafe for BloomFilter<T, S>
where S: RefUnwindSafe, T: ?Sized,

§

impl<T, S> Send for BloomFilter<T, S>
where S: Send, T: ?Sized,

§

impl<T, S> Sync for BloomFilter<T, S>
where S: Sync, T: ?Sized,

§

impl<T, S> Unpin for BloomFilter<T, S>
where S: Unpin, T: ?Sized,

§

impl<T, S> UnsafeUnpin for BloomFilter<T, S>
where S: UnsafeUnpin, T: ?Sized,

§

impl<T, S> UnwindSafe for BloomFilter<T, S>
where S: UnwindSafe, T: ?Sized,

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,