Struct Filter

Source
pub struct Filter { /* private fields */ }
Expand description

Approximate Membership Query Filter (AMQ-Filter) based on the Rank Select Quotient Filter (RSQF).

This data structure is similar to a hash table that stores fingerprints in a very compact way. Fingerprints are similar to a hash values, but are possibly truncated. The reason for false positives is that multiple items can map to the same fingerprint. For more information see the quotient filter Wikipedia page that describes a similar but less optimized version of the data structure. The actual implementation is based on the Rank Select Quotient Filter (RSQF).

The public API also exposes a fingerprint API, which can be used to succinctly store u64 hash values.

Implementations§

Source§

impl Filter

Source

pub fn new(capacity: u64, fp_rate: f64) -> Result<Self, Error>

Creates a new filter that can hold at least capacity items and with a desired error rate of fp_rate (clamped to (0, 0.5]).

Errors if capacity is >= `u64::MAX / 20`` or if the specified filter isn’t achievable using 64 bit hashes.

Source

pub fn new_resizeable( initial_capacity: u64, max_capacity: u64, fp_rate: f64, ) -> Result<Self, Error>

Creates a new filter that can hold at least initial_capacity items initially and can resize to hold at least max_capacity when fully grown. The desired error rate fp_rate (clamped to (0, 0.5]) applies to the fully grown filter.

This works by storing fingerprints large enough to satisfy the maximum requirements, so smaller filters will actually have lower error rates, which will increase (up to fp_rate) as the filter grows. In practice every time the filter doubles in capacity its error rate also doubles.

Errors if capacity is >= `u64::MAX / 20`` or if the specified filter isn’t achievable using 64 bit hashes.

Source

pub fn with_fingerprint_size( initial_capacity: u64, fingerprint_bits: u8, ) -> Result<Filter, Error>

Creates a new resizeable filter that can hold at least initial_capacity items initially while utilizing a fingerprint bit size of fingerprint_bits (7..=64). Normally this function is only useful if the filter is being used to manually store fingerprints.

Source

pub fn fingerprint_size(&self) -> u8

The internal fingerprint size in bits.

Source

pub fn is_empty(&self) -> bool

Whether the filter is empty.

Source

pub fn len(&self) -> u64

Current number of fingerprints admitted to the filter.

Source

pub fn clear(&mut self)

Resets/Clears the filter.

Source

pub fn capacity_resizeable(&self) -> u64

Maximum filter capacity.

Source

pub fn capacity(&self) -> u64

Current filter capacity.

Source

pub fn max_error_ratio_resizeable(&self) -> f64

Max error ratio when at the resizeable capacity (len == resizeable_capacity).

Source

pub fn max_error_ratio(&self) -> f64

Max error ratio when at full capacity (len == capacity).

Source

pub fn current_error_ratio(&self) -> f64

Current error ratio at the current occupancy.

Source

pub fn contains<T: Hash>(&self, item: T) -> bool

Returns whether item is present (probabilistically) in the filter.

Source

pub fn contains_fingerprint(&self, hash: u64) -> bool

Returns whether the fingerprint is present (probabilistically) in the filter.

Source

pub fn count<T: Hash>(&mut self, item: T) -> u64

Returns the number of times the item appears (probabilistically) in the filter.

Source

pub fn count_fingerprint(&mut self, hash: u64) -> u64

Returns the amount of times the fingerprint appears (probabilistically) in the filter.

Source

pub fn remove<T: Hash>(&mut self, item: T) -> bool

Removes item from the filter. Returns whether item was actually found and removed.

Note that removing an item who wasn’t previously added to the filter may introduce false negatives. This is because it could be removing fingerprints from a colliding item!

Source

pub fn remove_fingerprint(&mut self, hash: u64) -> bool

Removes the fingerprint specified by hash was from the filter. Returns whether a fingerprint was actually found and removed.

Note that removing a fingerprint that wasn’t previously added to the filter may introduce false negatives. This is because it could be removing fingerprints from a colliding hash!

Source

pub fn insert_duplicated<T: Hash>(&mut self, item: T) -> Result<(), Error>

Inserts item in the filter, even if already appears to be in the filter. This works by inserting a possibly duplicated fingerprint in the filter.

This function should be used when the filter is also subject to removals and the item is known to not have been added to the filter before (or was removed).

Returns Err(Error::CapacityExceeded) if the filter cannot admit the new item.

Source

pub fn insert<T: Hash>(&mut self, item: T) -> Result<bool, Error>

Inserts item in the filter.

Returns Ok(true) if the item was successfully added to the filter. Returns Ok(false) if the item is already contained (probabilistically) in the filter. Returns Err(Error::CapacityExceeded) if the filter cannot admit the new item.

Source

pub fn insert_fingerprint( &mut self, duplicate: bool, hash: u64, ) -> Result<bool, Error>

Inserts the fingerprint specified by hash in the filter. duplicate specifies if the fingerprint should be added even if it’s already in the filter.

Returns Ok(true) if the item was successfully added to the filter. Returns Ok(false) if the item is already contained (probabilistically) in the filter. Returns Err(Error::CapacityExceeded) if the filter cannot admit the new item.

Source

pub fn fingerprints(&self) -> FingerprintIter<'_>

Returns an iterator over the fingerprints stored in the filter.

Fingerprints will be returned in ascending order.

Source

pub fn shrink_to_fit(&mut self)

Shrinks the capacity of the filter as much as possible while preserving the false positive ratios and fingerprint size.

Source

pub fn merge( &mut self, keep_duplicates: bool, other: &Self, ) -> Result<(), Error>

Merges other filter into self.

keep_duplicates specifies whether duplicated fingerprints should be store, this is normally only useful is the filter is being used for counting.

Note that the other filter must have a fingerprint >= self fingerprint size, otherwise the function will fail with Err(Error::IncompatibleFingerprintSize). This is the case for filters created with the same parameters or if the other filter has a lower target false positive ratio.

Returns Err(Error::CapacityExceeded) if the filter cannot merge all items. Note that in this case items could have already been added and the filter is left full but in an otherwise valid state.

Trait Implementations§

Source§

impl Clone for Filter

Source§

fn clone(&self) -> Filter

Returns a copy of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl Debug for Filter

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more

Auto Trait Implementations§

§

impl Freeze for Filter

§

impl RefUnwindSafe for Filter

§

impl Send for Filter

§

impl Sync for Filter

§

impl Unpin for Filter

§

impl UnwindSafe for Filter

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dst: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dst. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.