Skip to main content

BloomFilter

Struct BloomFilter 

Source
pub struct BloomFilter<H: Hasher = Sha256> { /* private fields */ }
Expand description

A Bloom Filter.

This implementation uses the Kirsch-Mitzenmacher optimization to derive k hash functions from two hash values, which are in turn derived from a single hash digest. This provides efficient hashing for BloomFilter::insert and BloomFilter::contains operations.

§Hasher Selection

The H type parameter specifies the hash function to use. It defaults to Sha256. The hasher’s digest must be at least 16 bytes (128 bits) long, this is enforced at compile time.

When choosing a hasher, consider:

  • Security: If the bloom filter accepts untrusted input, use a cryptographically secure hash function to prevent attackers from crafting inputs that cause excessive collisions (degrading the filter to always return true).

  • Determinism: If the bloom filter must produce consistent results across runs or machines (e.g. for serialization or consensus-critical applications), avoid keyed or randomized hash functions. Both Sha256 and Blake3 are deterministic.

  • Performance: Hash function performance varies with the size of items inserted and queried. Sha256 is faster for smaller items (up to ~2KB), while Blake3 is faster for larger items (4KB+).

Implementations§

Source§

impl<H: Hasher> BloomFilter<H>

Source

pub fn new(hashers: NonZeroU8, bits: NonZeroUsize) -> Self

Creates a new BloomFilter with hashers hash functions and bits bits.

The number of bits will be rounded up to the next power of 2. If that would overflow, the maximum power of 2 for the platform (2^63 on 64-bit) is used.

Source

pub fn with_rate(expected_items: NonZeroUsize, fp_rate: BigRational) -> Self

Creates a new BloomFilter with optimal parameters for the expected number of items and desired false positive rate.

Uses exact rational arithmetic for full determinism across all platforms.

§Arguments
  • expected_items - Number of items expected to be inserted
  • fp_rate - False positive rate as a rational (e.g., BigRational::from_frac_u64(1, 100) for 1%)
§Panics

Panics if fp_rate is not in (0, 1).

Source

pub const fn hashers(&self) -> NonZeroU8

Returns the number of hashers used by the filter.

Source

pub const fn bits(&self) -> NonZeroUsize

Returns the number of bits used by the filter.

Source

pub fn insert(&mut self, item: &[u8])

Inserts an item into the BloomFilter.

Source

pub fn contains(&self, item: &[u8]) -> bool

Checks if an item is possibly in the BloomFilter.

Returns true if the item is probably in the set, and false if it is definitely not.

Source

pub fn estimated_false_positive_rate(&self) -> BigRational

Estimates the current false positive probability.

This approximates the false positive rate as f^k where f is the fill ratio (proportion of bits set to 1) and k is the number of hash functions.

Returns a BigRational for exact representation and cross-platform determinism.

Source

pub fn estimated_count(&self) -> BigRational

Estimates the number of items that have been inserted.

Uses the formula n = -(m/k) * ln(1 - x/m) where m is the number of bits, k is the number of hash functions, and x is the number of bits set to 1.

Returns a BigRational using log2_floor for the logarithm computation.

Source

pub fn optimal_hashers(expected_items: usize, bits: usize) -> u8

Calculates the optimal number of hash functions for a given capacity and bit count.

Uses BigRational for determinism. The result is clamped to [1, 16] since beyond ~10-12 hashes provides negligible improvement while increasing CPU cost.

Source

pub fn optimal_bits(expected_items: usize, fp_rate: &BigRational) -> usize

Calculates the optimal number of bits for a given capacity and false positive rate.

Uses exact rational arithmetic for full determinism across all platforms. The result is rounded up to the next power of 2. If that would overflow, the maximum power of 2 for the platform (2^63 on 64-bit) is used.

Formula: m = -n * log2(p) / ln(2)

§Panics

Panics if fp_rate is not in (0, 1).

Trait Implementations§

Source§

impl<H: Clone + Hasher> Clone for BloomFilter<H>

Source§

fn clone(&self) -> BloomFilter<H>

Returns a duplicate of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl<H: Debug + Hasher> Debug for BloomFilter<H>

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl<H: Hasher> EncodeSize for BloomFilter<H>

Source§

fn encode_size(&self) -> usize

Returns the encoded size of this value (in bytes).
Source§

impl<H: Hasher> PartialEq for BloomFilter<H>

Source§

fn eq(&self, other: &Self) -> bool

Tests for self and other values to be equal, and is used by ==.
1.0.0 · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.
Source§

impl<H: Hasher> Read for BloomFilter<H>

Source§

type Cfg = (NonZero<u8>, NonZero<u64>)

The Cfg type parameter allows passing configuration during the read process. This is crucial for safely decoding untrusted data, for example, by providing size limits for collections or strings. Read more
Source§

fn read_cfg( buf: &mut impl Buf, (hashers_cfg, bits_cfg): &Self::Cfg, ) -> Result<Self, CodecError>

Reads a value from the buffer using the provided configuration cfg. Read more
Source§

impl<H: Hasher> Write for BloomFilter<H>

Source§

fn write(&self, buf: &mut impl BufMut)

Writes the binary representation of self to the provided buffer buf. Read more
Source§

impl<H: Hasher> Eq for BloomFilter<H>

Auto Trait Implementations§

§

impl<H> Freeze for BloomFilter<H>

§

impl<H> RefUnwindSafe for BloomFilter<H>
where H: RefUnwindSafe,

§

impl<H> Send for BloomFilter<H>

§

impl<H> Sync for BloomFilter<H>

§

impl<H> Unpin for BloomFilter<H>
where H: Unpin,

§

impl<H> UnwindSafe for BloomFilter<H>
where H: UnwindSafe,

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more
Source§

impl<T> Decode for T
where T: Read,

Source§

fn decode_cfg(buf: impl Buf, cfg: &Self::Cfg) -> Result<Self, Error>

Decodes a value from buf using cfg, ensuring the entire buffer is consumed. Read more
Source§

impl<T> Encode for T
where T: Write + EncodeSize,

Source§

fn encode(&self) -> Bytes

Encodes self into a new Bytes buffer. Read more
Source§

fn encode_mut(&self) -> BytesMut

Encodes self into a new BytesMut buffer. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

impl<T> Pointable for T

Source§

const ALIGN: usize

The alignment of pointer.
Source§

type Init = T

The type for initializers.
Source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer. Read more
Source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer. Read more
Source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer. Read more
Source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer. Read more
Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V

Source§

impl<T> Codec for T
where T: Encode + Decode,

Source§

impl<T> CodecShared for T
where T: Codec + Send + Sync,

Source§

impl<T> EncodeShared for T
where T: Encode + Send + Sync,