Struct ofilter::Stream

source ·
pub struct Stream<T> { /* private fields */ }
Expand description

Streaming Bloom filter.

This is based on a pair of Bloom filters, and is designed to serve as a “streaming” filter. What does this mean?

In a regular Bloom filter, after “some time” which corresponds to nb_items different items set info the filter, the filter stops working well.

In this implementation, we use a pair of Bloom filters, once they are “wasted” because they have too many items, we drop one of them, and only keep the other. This way, we keep a trace of what was in the filter, but avoid growing infinitely and reach a state where any item would pass the filter.

In practice it is much more imprecise than a regular Bloom filter and the false positive rate is quite high. However it never reaches the point where it is totally unusable, it always refuses some items, and keeps the set within a reasonable limit.

§Examples

use ofilter::Stream;
use std::hash::Hash;

#[derive(Hash)]
struct Obj {
    some: usize,
    thing: usize,
}

let mut filter: Stream<Obj> = Stream::new(100);
let obj = Obj{ some: 1, thing: 2};

assert!(!filter.check(&obj)); // object is not in filter
filter.set(&obj);             // put object in filter
assert!(filter.check(&obj));  // object is in filter

Implementations§

source§

impl<T> Stream<T>

source

pub fn new(capacity: usize) -> Self

Create a new streaming Bloom filter, with given capacity.

All other parameters are set to defaults, or aligned to match capacity.

This is different from a classic Bloom filter, it maintains a set of items, in this set you can be sure that the last capacity items are always kept in the filter. In other words, it works like a Bloom filter for the most recent items. Ancient items are progressively removed from the set.

§Examples
use ofilter::Stream;

let filter: Stream<usize> = Stream::new(100);
assert_eq!(100, filter.capacity());
source

pub fn new_with_params(params: Params) -> Self

Create a new streaming Bloom filter, with specific parameters.

§Examples
use ofilter::{Stream, Params};

let filter: Stream<usize> = Stream::new_with_params(Params::with_nb_items_and_fp_rate(100, 0.1));
assert_eq!(100, filter.capacity());
source

pub fn params(&self) -> Params

Get filter params.

This can be useful because when creating the filter, the .adjust() func is called, and may decide to fine-tune some parameters. With this, one can know exactly what is used by the filter.

§Examples
use ofilter::Stream;

let filter: Stream<usize> = Stream::new(100);
println!("{}", filter.params());
source

pub fn capacity(&self) -> usize

Get filter capacity.

Returns the value of params.nb_items, that is the number of items the filter is designed for.

use ofilter::{Stream, Params};

let filter: Stream<usize> = Stream::new_with_params(Params::with_bit_len(1_000_000));
assert_eq!(52681, filter.capacity());
source

pub fn clear(&mut self)

Clear the filter.

Clears the bit vector, but keeps parameters.

use ofilter::Stream;

let mut filter: Stream<usize> = Stream::new(1_000);
filter.set(&10);
assert!(filter.check(&10));
filter.clear();
assert!(!filter.check(&10));
source

pub fn is_empty(&self) -> bool

Returns true if filter is empty.

use ofilter::Stream;

let mut filter: Stream<usize> = Stream::new(1_000);
assert!(filter.is_empty());
filter.set(&10);
assert!(!filter.is_empty());
source

pub fn max_fp_rate(&self) -> f64

Returns the current max false positive rate.

In theory the false positive rate fp_rate is known at filter creation. But that, is the theoretical fp_rate that the filter reaches when it is “wasted” because it has too many entries. Until then, it performs better than that, statistically.

As this filter uses two filters under the hood, there are technically 2 values for the actual false positive rate.

This function returns the max value, which is usually the most significant, as it’s reflecting the current behavior of the filter.

use ofilter::Stream;

let mut filter: Stream<usize> = Stream::new(1_000);
assert_eq!(0.0, filter.max_fp_rate());
filter.set(&10);
assert!(filter.max_fp_rate() > 0.0); // will be params.fp_rate when filter is full
source

pub fn fp_rates(&self) -> (f64, f64)

Returns the current false positive rates.

In theory the false positive rate fp_rate is known at filter creation. But that, is the theoretical fp_rate that the filter reaches when it is “wasted” because it has too many entries. Until then, it performs better than that, statistically.

As this filter uses two filters under the hood, there are technically 2 values for the actual false positive rate.

This function returns both values, starting with the highest one, which is usually the most significant as it’s reflecting the current behavior of the filter.

use ofilter::Stream;

let mut filter: Stream<usize> = Stream::new(1_000);
assert_eq!((0.0, 0.0), filter.fp_rates());
filter.set(&10);
let fp_rates = filter.fp_rates();
assert!(fp_rates.0 > 0.0); // will be greater than params.fp_rate when filter is full
assert!(fp_rates.1 > 0.0); // will be params.fp_rate when filter is full
source

pub fn min_level(&self) -> f64

Returns the current ratios between real min fp_rate, and theoretical fp_rate..

This is a helper to quickly compare the real min fp_rate with the theoretical fp_rate. It compares the min value and not the max, as it is when the min value reaches 1.0 that the filters are swapped after one is cleared.

When this value greater than 1.0, the filter should soon “age” and clear the most filled underlying Bloom filter.

use ofilter::Stream;

let mut filter: Stream<usize> = Stream::new(1_000);
assert_eq!((0.0, 0.0), filter.levels());
filter.set(&10);
let levels = filter.levels();
assert!(levels.0 > 0.0); // will be greater than 1.0 when filter is full
assert!(levels.1 > 0.0); // will be 1.0 when filter is full
source

pub fn levels(&self) -> (f64, f64)

Returns the current ratios between real fp_rates, and theoretical fp_rates.

This is a helper to quickly compare the real fp_rates with the theoretical fp_rates.

When both of these values are greater than 1.0, the filter should soon “age” and clear the most filled underlying Bloom filter.

use ofilter::Stream;

let mut filter: Stream<usize> = Stream::new(1_000);
assert_eq!((0.0, 0.0), filter.levels());
filter.set(&10);
let levels = filter.levels();
assert!(levels.0 > 0.0); // will be greater than 1.0 when filter is full
assert!(levels.1 > 0.0); // will be 1.0 when filter is full
source

pub fn age(&self) -> usize

Returns the age of the filter.

The age is the number of times the underlying Bloom filters have been cleared and swapped.

In the beginning this is 0, then when filters are full, one of them is cleared, and age becomes one. Everytime this happens again, age increases.

As a consequence, the age gives a raw estimation of how many different items have been through the cache. With a given capacity, the number of different items the cache has seen is roughly age * capacity. This is just an approximation.

use ofilter::{Stream, Params};

let params = Params{
    nb_hash: 2,
    bit_len: 0,
    nb_items: 1_000,
    fp_rate: 0.1,
    predict: true,
};
let mut filter: Stream<usize> = Stream::new_with_params(params);
assert_eq!(0, filter.age());
for i in 0..100_000 {
    filter.set(&i);
}
println!("{}", filter.age());
assert!(filter.age() >= 90);
assert!(filter.age() <= 100);
source

pub fn resize(&mut self, capacity: usize)

Resize the filter.

As the underlying Bloom filters are recycled in a streaming filter, it is possible to resize it.

What it means is that next time the filter ages, the new created Bloom filter will have the new size. It will keep the same number of hash nb_hash and false positive rate fp_rate. Only the number of bits bit_len and number of items nb_items change.

use ofilter::Stream;

let mut filter: Stream<usize> = Stream::new(1_000);
filter.set(&1);
filter.resize(100);
assert_eq!(100, filter.capacity());
assert!(filter.check(&1));
source§

impl<T> Stream<T>
where T: Hash,

source

pub fn set(&mut self, item: &T)

Record an item in the set.

Once this has been called, any call to check() will return true, as there are no false negatives. However some other items may test positive as a consequence of recording this one.

However, after some time, the item will ultimately disappear from the set and be replaced by new entries. Only the most recent entries are guaranteed to be here.

use ofilter::Stream;

let mut filter: Stream<usize> = Stream::new(1_000);
filter.set(&42);
source

pub fn check(&self, item: &T) -> bool

Guess whether an item is likely to be in the set.

If set() has been called before with value, then this returns true, as there are no false negatives. However it may respond true even if the item has never been recorded in the set.

However, if set() has been called too long ago and too many items have been added since, it will test negative anyway.

use ofilter::Stream;

let mut filter: Stream<usize> = Stream::new(1_000);
filter.set(&42);
assert!(filter.check(&42));
source

pub fn check_and_set(&mut self, item: &T) -> bool

Record an item in the set and returns its previous value.

Equivalent to calling get() then set() but performs hash lookup only once so it’s a bit more efficient.

use ofilter::Stream;

let mut filter: Stream<usize> = Stream::new(1_000);
assert!(!filter.check_and_set(&42));
assert!(filter.check(&42));

Trait Implementations§

source§

impl<T: Clone> Clone for Stream<T>

source§

fn clone(&self) -> Stream<T>

Returns a copy of the value. Read more
1.0.0 · source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
source§

impl<T: Debug> Debug for Stream<T>

source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
source§

impl<'de, T> Deserialize<'de> for Stream<T>
where T: Deserialize<'de>,

source§

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer. Read more
source§

impl<T> Display for Stream<T>

Pretty-print the filter.

§Examples

use ofilter::Stream;

let filter: Stream<usize> = Stream::new(100);

assert_eq!("{ age: 0, fp_rates: (0.000000, 0.000000), params: { nb_hash: 2, bit_len: 1899, nb_items: 100, fp_rate: 0.009992, predict: false } }" , format!("{}", filter));
source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
source§

impl<T> Serialize for Stream<T>
where T: Serialize,

source§

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer. Read more

Auto Trait Implementations§

§

impl<T> Freeze for Stream<T>

§

impl<T> RefUnwindSafe for Stream<T>
where T: RefUnwindSafe,

§

impl<T> Send for Stream<T>
where T: Send,

§

impl<T> Sync for Stream<T>
where T: Sync,

§

impl<T> Unpin for Stream<T>
where T: Unpin,

§

impl<T> UnwindSafe for Stream<T>
where T: UnwindSafe,

Blanket Implementations§

source§

impl<T> Any for T
where T: 'static + ?Sized,

source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
source§

impl<T> Borrow<T> for T
where T: ?Sized,

source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
source§

impl<T> Conv for T

source§

fn conv<T>(self) -> T
where Self: Into<T>,

Converts self into T using Into<T>. Read more
source§

impl<T> FmtForward for T

source§

fn fmt_binary(self) -> FmtBinary<Self>
where Self: Binary,

Causes self to use its Binary implementation when Debug-formatted.
source§

fn fmt_display(self) -> FmtDisplay<Self>
where Self: Display,

Causes self to use its Display implementation when Debug-formatted.
source§

fn fmt_lower_exp(self) -> FmtLowerExp<Self>
where Self: LowerExp,

Causes self to use its LowerExp implementation when Debug-formatted.
source§

fn fmt_lower_hex(self) -> FmtLowerHex<Self>
where Self: LowerHex,

Causes self to use its LowerHex implementation when Debug-formatted.
source§

fn fmt_octal(self) -> FmtOctal<Self>
where Self: Octal,

Causes self to use its Octal implementation when Debug-formatted.
source§

fn fmt_pointer(self) -> FmtPointer<Self>
where Self: Pointer,

Causes self to use its Pointer implementation when Debug-formatted.
source§

fn fmt_upper_exp(self) -> FmtUpperExp<Self>
where Self: UpperExp,

Causes self to use its UpperExp implementation when Debug-formatted.
source§

fn fmt_upper_hex(self) -> FmtUpperHex<Self>
where Self: UpperHex,

Causes self to use its UpperHex implementation when Debug-formatted.
source§

fn fmt_list(self) -> FmtList<Self>
where &'a Self: for<'a> IntoIterator,

Formats each item in a sequence. Read more
source§

impl<T> From<T> for T

source§

fn from(t: T) -> T

Returns the argument unchanged.

source§

impl<T, U> Into<U> for T
where U: From<T>,

source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

source§

impl<T> Pipe for T
where T: ?Sized,

source§

fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> R
where Self: Sized,

Pipes by value. This is generally the method you want to use. Read more
source§

fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> R
where R: 'a,

Borrows self and passes that borrow into the pipe function. Read more
source§

fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> R
where R: 'a,

Mutably borrows self and passes that borrow into the pipe function. Read more
source§

fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R
where Self: Borrow<B>, B: 'a + ?Sized, R: 'a,

Borrows self, then passes self.borrow() into the pipe function. Read more
source§

fn pipe_borrow_mut<'a, B, R>( &'a mut self, func: impl FnOnce(&'a mut B) -> R ) -> R
where Self: BorrowMut<B>, B: 'a + ?Sized, R: 'a,

Mutably borrows self, then passes self.borrow_mut() into the pipe function. Read more
source§

fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R
where Self: AsRef<U>, U: 'a + ?Sized, R: 'a,

Borrows self, then passes self.as_ref() into the pipe function.
source§

fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R
where Self: AsMut<U>, U: 'a + ?Sized, R: 'a,

Mutably borrows self, then passes self.as_mut() into the pipe function.
source§

fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R
where Self: Deref<Target = T>, T: 'a + ?Sized, R: 'a,

Borrows self, then passes self.deref() into the pipe function.
source§

fn pipe_deref_mut<'a, T, R>( &'a mut self, func: impl FnOnce(&'a mut T) -> R ) -> R
where Self: DerefMut<Target = T> + Deref, T: 'a + ?Sized, R: 'a,

Mutably borrows self, then passes self.deref_mut() into the pipe function.
source§

impl<T> Tap for T

source§

fn tap(self, func: impl FnOnce(&Self)) -> Self

Immutable access to a value. Read more
source§

fn tap_mut(self, func: impl FnOnce(&mut Self)) -> Self

Mutable access to a value. Read more
source§

fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
where Self: Borrow<B>, B: ?Sized,

Immutable access to the Borrow<B> of a value. Read more
source§

fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
where Self: BorrowMut<B>, B: ?Sized,

Mutable access to the BorrowMut<B> of a value. Read more
source§

fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
where Self: AsRef<R>, R: ?Sized,

Immutable access to the AsRef<R> view of a value. Read more
source§

fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
where Self: AsMut<R>, R: ?Sized,

Mutable access to the AsMut<R> view of a value. Read more
source§

fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
where Self: Deref<Target = T>, T: ?Sized,

Immutable access to the Deref::Target of a value. Read more
source§

fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
where Self: DerefMut<Target = T> + Deref, T: ?Sized,

Mutable access to the Deref::Target of a value. Read more
source§

fn tap_dbg(self, func: impl FnOnce(&Self)) -> Self

Calls .tap() only in debug builds, and is erased in release builds.
source§

fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self

Calls .tap_mut() only in debug builds, and is erased in release builds.
source§

fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
where Self: Borrow<B>, B: ?Sized,

Calls .tap_borrow() only in debug builds, and is erased in release builds.
source§

fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
where Self: BorrowMut<B>, B: ?Sized,

Calls .tap_borrow_mut() only in debug builds, and is erased in release builds.
source§

fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
where Self: AsRef<R>, R: ?Sized,

Calls .tap_ref() only in debug builds, and is erased in release builds.
source§

fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
where Self: AsMut<R>, R: ?Sized,

Calls .tap_ref_mut() only in debug builds, and is erased in release builds.
source§

fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
where Self: Deref<Target = T>, T: ?Sized,

Calls .tap_deref() only in debug builds, and is erased in release builds.
source§

fn tap_deref_mut_dbg<T>(self, func: impl FnOnce(&mut T)) -> Self
where Self: DerefMut<Target = T> + Deref, T: ?Sized,

Calls .tap_deref_mut() only in debug builds, and is erased in release builds.
source§

impl<T> ToOwned for T
where T: Clone,

§

type Owned = T

The resulting type after obtaining ownership.
source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
source§

impl<T> ToString for T
where T: Display + ?Sized,

source§

default fn to_string(&self) -> String

Converts the given value to a String. Read more
source§

impl<T> TryConv for T

source§

fn try_conv<T>(self) -> Result<T, Self::Error>
where Self: TryInto<T>,

Attempts to convert self into T using TryInto<T>. Read more
source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

§

type Error = Infallible

The type returned in the event of a conversion error.
source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
source§

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,