pub struct Filter { /* private fields */ }Expand description
Approximate Membership Query Filter (AMQ-Filter) based on the Rank Select Quotient Filter (RSQF).
This data structure is similar to a hash table that stores fingerprints in a very compact way. Fingerprints are similar to a hash values, but are possibly truncated. The reason for false positives is that multiple items can map to the same fingerprint. For more information see the quotient filter Wikipedia page that describes a similar but less optimized version of the data structure. The actual implementation is based on the Rank Select Quotient Filter (RSQF).
The public API also exposes a fingerprint API, which can be used to succinctly store u64 hash values.
Implementations§
Source§impl Filter
impl Filter
Sourcepub fn new(capacity: u64, fp_rate: f64) -> Result<Self, Error>
pub fn new(capacity: u64, fp_rate: f64) -> Result<Self, Error>
Creates a new filter that can hold at least capacity items
and with a desired error rate of fp_rate (clamped to (0, 0.5]).
Errors if capacity is >= `u64::MAX / 20`` or if the specified filter isn’t achievable using 64 bit hashes.
Sourcepub fn new_resizeable(
initial_capacity: u64,
max_capacity: u64,
fp_rate: f64,
) -> Result<Self, Error>
pub fn new_resizeable( initial_capacity: u64, max_capacity: u64, fp_rate: f64, ) -> Result<Self, Error>
Creates a new filter that can hold at least initial_capacity items initially
and can resize to hold at least max_capacity when fully grown.
The desired error rate fp_rate (clamped to (0, 0.5]) applies to the fully grown filter.
This works by storing fingerprints large enough to satisfy the maximum requirements,
so smaller filters will actually have lower error rates, which will increase
(up to fp_rate) as the filter grows. In practice every time the filter doubles in
capacity its error rate also doubles.
Errors if capacity is >= `u64::MAX / 20`` or if the specified filter isn’t achievable using 64 bit hashes.
Sourcepub fn with_fingerprint_size(
initial_capacity: u64,
fingerprint_bits: u8,
) -> Result<Filter, Error>
pub fn with_fingerprint_size( initial_capacity: u64, fingerprint_bits: u8, ) -> Result<Filter, Error>
Creates a new resizeable filter that can hold at least initial_capacity items initially while
utilizing a fingerprint bit size of fingerprint_bits (7..=64). Normally this function is only
useful if the filter is being used to manually store fingerprints.
Sourcepub fn fingerprint_size(&self) -> u8
pub fn fingerprint_size(&self) -> u8
The internal fingerprint size in bits.
Sourcepub fn capacity_resizeable(&self) -> u64
pub fn capacity_resizeable(&self) -> u64
Maximum filter capacity.
Sourcepub fn max_error_ratio_resizeable(&self) -> f64
pub fn max_error_ratio_resizeable(&self) -> f64
Max error ratio when at the resizeable capacity (len == resizeable_capacity).
Sourcepub fn max_error_ratio(&self) -> f64
pub fn max_error_ratio(&self) -> f64
Max error ratio when at full capacity (len == capacity).
Sourcepub fn current_error_ratio(&self) -> f64
pub fn current_error_ratio(&self) -> f64
Current error ratio at the current occupancy.
Sourcepub fn contains<T: Hash>(&self, item: T) -> bool
pub fn contains<T: Hash>(&self, item: T) -> bool
Returns whether item is present (probabilistically) in the filter.
Sourcepub fn contains_fingerprint(&self, hash: u64) -> bool
pub fn contains_fingerprint(&self, hash: u64) -> bool
Returns whether the fingerprint is present (probabilistically) in the filter.
Sourcepub fn count<T: Hash>(&mut self, item: T) -> u64
pub fn count<T: Hash>(&mut self, item: T) -> u64
Returns the number of times the item appears (probabilistically) in the filter.
Sourcepub fn count_fingerprint(&mut self, hash: u64) -> u64
pub fn count_fingerprint(&mut self, hash: u64) -> u64
Returns the amount of times the fingerprint appears (probabilistically) in the filter.
Sourcepub fn remove<T: Hash>(&mut self, item: T) -> bool
pub fn remove<T: Hash>(&mut self, item: T) -> bool
Removes item from the filter.
Returns whether item was actually found and removed.
Note that removing an item who wasn’t previously added to the filter may introduce false negatives. This is because it could be removing fingerprints from a colliding item!
Sourcepub fn remove_fingerprint(&mut self, hash: u64) -> bool
pub fn remove_fingerprint(&mut self, hash: u64) -> bool
Removes the fingerprint specified by hash was from the filter.
Returns whether a fingerprint was actually found and removed.
Note that removing a fingerprint that wasn’t previously added to the filter may introduce false negatives. This is because it could be removing fingerprints from a colliding hash!
Sourcepub fn insert_duplicated<T: Hash>(&mut self, item: T) -> Result<(), Error>
pub fn insert_duplicated<T: Hash>(&mut self, item: T) -> Result<(), Error>
Inserts item in the filter, even if already appears to be in the filter.
This works by inserting a possibly duplicated fingerprint in the filter.
This function should be used when the filter is also subject to removals and the item is known to not have been added to the filter before (or was removed).
Returns Err(Error::CapacityExceeded) if the filter cannot admit the new item.
Sourcepub fn insert<T: Hash>(&mut self, item: T) -> Result<bool, Error>
pub fn insert<T: Hash>(&mut self, item: T) -> Result<bool, Error>
Inserts item in the filter.
Returns Ok(true) if the item was successfully added to the filter.
Returns Ok(false) if the item is already contained (probabilistically) in the filter.
Returns Err(Error::CapacityExceeded) if the filter cannot admit the new item.
Sourcepub fn insert_fingerprint(
&mut self,
duplicate: bool,
hash: u64,
) -> Result<bool, Error>
pub fn insert_fingerprint( &mut self, duplicate: bool, hash: u64, ) -> Result<bool, Error>
Inserts the fingerprint specified by hash in the filter.
duplicate specifies if the fingerprint should be added even if it’s already in the filter.
Returns Ok(true) if the item was successfully added to the filter.
Returns Ok(false) if the item is already contained (probabilistically) in the filter.
Returns Err(Error::CapacityExceeded) if the filter cannot admit the new item.
Sourcepub fn fingerprints(&self) -> FingerprintIter<'_> ⓘ
pub fn fingerprints(&self) -> FingerprintIter<'_> ⓘ
Returns an iterator over the fingerprints stored in the filter.
Fingerprints will be returned in ascending order.
Sourcepub fn shrink_to_fit(&mut self)
pub fn shrink_to_fit(&mut self)
Shrinks the capacity of the filter as much as possible while preserving the false positive ratios and fingerprint size.
Sourcepub fn merge(
&mut self,
keep_duplicates: bool,
other: &Self,
) -> Result<(), Error>
pub fn merge( &mut self, keep_duplicates: bool, other: &Self, ) -> Result<(), Error>
Merges other filter into self.
keep_duplicates specifies whether duplicated fingerprints should be store,
this is normally only useful is the filter is being used for counting.
Note that the other filter must have a fingerprint >= self fingerprint size,
otherwise the function will fail with Err(Error::IncompatibleFingerprintSize).
This is the case for filters created with the same parameters or if the other
filter has a lower target false positive ratio.
Returns Err(Error::CapacityExceeded) if the filter cannot merge all items.
Note that in this case items could have already been added and the filter is left
full but in an otherwise valid state.