pub struct Stream<T> { /* private fields */ }
Expand description
Streaming Bloom filter.
This is based on a pair of Bloom filters, and is designed to serve as a “streaming” filter. What does this mean?
In a regular Bloom filter, after “some time” which corresponds
to nb_items
different items set info the filter, the filter
stops working well.
In this implementation, we use a pair of Bloom filters, once they are “wasted” because they have too many items, we drop one of them, and only keep the other. This way, we keep a trace of what was in the filter, but avoid growing infinitely and reach a state where any item would pass the filter.
In practice it is much more imprecise than a regular Bloom filter and the false positive rate is quite high. However it never reaches the point where it is totally unusable, it always refuses some items, and keeps the set within a reasonable limit.
§Examples
use ofilter::Stream;
use std::hash::Hash;
#[derive(Hash)]
struct Obj {
some: usize,
thing: usize,
}
let mut filter: Stream<Obj> = Stream::new(100);
let obj = Obj{ some: 1, thing: 2};
assert!(!filter.check(&obj)); // object is not in filter
filter.set(&obj); // put object in filter
assert!(filter.check(&obj)); // object is in filter
Implementations§
source§impl<T> Stream<T>
impl<T> Stream<T>
sourcepub fn new(capacity: usize) -> Self
pub fn new(capacity: usize) -> Self
Create a new streaming Bloom filter, with given capacity.
All other parameters are set to defaults, or aligned to match capacity.
This is different from a classic Bloom filter, it maintains a set
of items, in this set you can be sure that the last capacity
items
are always kept in the filter. In other words, it works like a Bloom
filter for the most recent items. Ancient items are progressively
removed from the set.
§Examples
use ofilter::Stream;
let filter: Stream<usize> = Stream::new(100);
assert_eq!(100, filter.capacity());
sourcepub fn new_with_params(params: Params) -> Self
pub fn new_with_params(params: Params) -> Self
Create a new streaming Bloom filter, with specific parameters.
§Examples
use ofilter::{Stream, Params};
let filter: Stream<usize> = Stream::new_with_params(Params::with_nb_items_and_fp_rate(100, 0.1));
assert_eq!(100, filter.capacity());
sourcepub fn params(&self) -> Params
pub fn params(&self) -> Params
Get filter params.
This can be useful because when creating the filter, the .adjust()
func is called, and may decide to fine-tune some parameters. With this,
one can know exactly what is used by the filter.
§Examples
use ofilter::Stream;
let filter: Stream<usize> = Stream::new(100);
println!("{}", filter.params());
sourcepub fn capacity(&self) -> usize
pub fn capacity(&self) -> usize
Get filter capacity.
Returns the value of params.nb_items
, that is the number of
items the filter is designed for.
use ofilter::{Stream, Params};
let filter: Stream<usize> = Stream::new_with_params(Params::with_bit_len(1_000_000));
assert_eq!(52681, filter.capacity());
sourcepub fn clear(&mut self)
pub fn clear(&mut self)
Clear the filter.
Clears the bit vector, but keeps parameters.
use ofilter::Stream;
let mut filter: Stream<usize> = Stream::new(1_000);
filter.set(&10);
assert!(filter.check(&10));
filter.clear();
assert!(!filter.check(&10));
sourcepub fn is_empty(&self) -> bool
pub fn is_empty(&self) -> bool
Returns true if filter is empty.
use ofilter::Stream;
let mut filter: Stream<usize> = Stream::new(1_000);
assert!(filter.is_empty());
filter.set(&10);
assert!(!filter.is_empty());
sourcepub fn max_fp_rate(&self) -> f64
pub fn max_fp_rate(&self) -> f64
Returns the current max false positive rate.
In theory the false positive rate fp_rate
is known at
filter creation. But that, is the theoretical fp_rate
that the filter reaches when it is “wasted” because it
has too many entries. Until then, it performs better
than that, statistically.
As this filter uses two filters under the hood, there are technically 2 values for the actual false positive rate.
This function returns the max value, which is usually the most significant, as it’s reflecting the current behavior of the filter.
use ofilter::Stream;
let mut filter: Stream<usize> = Stream::new(1_000);
assert_eq!(0.0, filter.max_fp_rate());
filter.set(&10);
assert!(filter.max_fp_rate() > 0.0); // will be params.fp_rate when filter is full
sourcepub fn fp_rates(&self) -> (f64, f64)
pub fn fp_rates(&self) -> (f64, f64)
Returns the current false positive rates.
In theory the false positive rate fp_rate
is known at
filter creation. But that, is the theoretical fp_rate
that the filter reaches when it is “wasted” because it
has too many entries. Until then, it performs better
than that, statistically.
As this filter uses two filters under the hood, there are technically 2 values for the actual false positive rate.
This function returns both values, starting with the highest one, which is usually the most significant as it’s reflecting the current behavior of the filter.
use ofilter::Stream;
let mut filter: Stream<usize> = Stream::new(1_000);
assert_eq!((0.0, 0.0), filter.fp_rates());
filter.set(&10);
let fp_rates = filter.fp_rates();
assert!(fp_rates.0 > 0.0); // will be greater than params.fp_rate when filter is full
assert!(fp_rates.1 > 0.0); // will be params.fp_rate when filter is full
sourcepub fn min_level(&self) -> f64
pub fn min_level(&self) -> f64
Returns the current ratios between real min fp_rate, and theoretical fp_rate..
This is a helper to quickly compare the real min fp_rate
with
the theoretical fp_rate
. It compares the min value and not the max,
as it is when the min value reaches 1.0 that the filters are swapped
after one is cleared.
When this value greater than 1.0, the filter should soon “age” and clear the most filled underlying Bloom filter.
use ofilter::Stream;
let mut filter: Stream<usize> = Stream::new(1_000);
assert_eq!((0.0, 0.0), filter.levels());
filter.set(&10);
let levels = filter.levels();
assert!(levels.0 > 0.0); // will be greater than 1.0 when filter is full
assert!(levels.1 > 0.0); // will be 1.0 when filter is full
sourcepub fn levels(&self) -> (f64, f64)
pub fn levels(&self) -> (f64, f64)
Returns the current ratios between real fp_rates, and theoretical fp_rates.
This is a helper to quickly compare the real fp_rates
with
the theoretical fp_rates
.
When both of these values are greater than 1.0, the filter should soon “age” and clear the most filled underlying Bloom filter.
use ofilter::Stream;
let mut filter: Stream<usize> = Stream::new(1_000);
assert_eq!((0.0, 0.0), filter.levels());
filter.set(&10);
let levels = filter.levels();
assert!(levels.0 > 0.0); // will be greater than 1.0 when filter is full
assert!(levels.1 > 0.0); // will be 1.0 when filter is full
sourcepub fn age(&self) -> usize
pub fn age(&self) -> usize
Returns the age of the filter.
The age is the number of times the underlying Bloom filters have been cleared and swapped.
In the beginning this is 0, then when filters are full, one of them is cleared, and age becomes one. Everytime this happens again, age increases.
As a consequence, the age gives a raw estimation of how
many different items have been through the cache. With a
given capacity
, the number of different items the cache
has seen is roughly age * capacity
. This is just an
approximation.
use ofilter::{Stream, Params};
let params = Params{
nb_hash: 2,
bit_len: 0,
nb_items: 1_000,
fp_rate: 0.1,
predict: true,
};
let mut filter: Stream<usize> = Stream::new_with_params(params);
assert_eq!(0, filter.age());
for i in 0..100_000 {
filter.set(&i);
}
println!("{}", filter.age());
assert!(filter.age() >= 90);
assert!(filter.age() <= 100);
sourcepub fn resize(&mut self, capacity: usize)
pub fn resize(&mut self, capacity: usize)
Resize the filter.
As the underlying Bloom filters are recycled in a streaming filter, it is possible to resize it.
What it means is that next time the filter ages,
the new created Bloom filter will have the new
size. It will keep the same number of hash nb_hash
and
false positive rate fp_rate
. Only the number
of bits bit_len
and number of items nb_items
change.
use ofilter::Stream;
let mut filter: Stream<usize> = Stream::new(1_000);
filter.set(&1);
filter.resize(100);
assert_eq!(100, filter.capacity());
assert!(filter.check(&1));
source§impl<T> Stream<T>where
T: Hash,
impl<T> Stream<T>where
T: Hash,
sourcepub fn set(&mut self, item: &T)
pub fn set(&mut self, item: &T)
Record an item in the set.
Once this has been called, any call to check()
will return
true, as there are no false negatives. However some other items
may test positive as a consequence of recording this one.
However, after some time, the item will ultimately disappear from the set and be replaced by new entries. Only the most recent entries are guaranteed to be here.
use ofilter::Stream;
let mut filter: Stream<usize> = Stream::new(1_000);
filter.set(&42);
sourcepub fn check(&self, item: &T) -> bool
pub fn check(&self, item: &T) -> bool
Guess whether an item is likely to be in the set.
If set()
has been called before with value, then this returns
true, as there are no false negatives. However it may respond
true even if the item has never been recorded in the set.
However, if set()
has been called too long ago and too many
items have been added since, it will test negative anyway.
use ofilter::Stream;
let mut filter: Stream<usize> = Stream::new(1_000);
filter.set(&42);
assert!(filter.check(&42));
sourcepub fn check_and_set(&mut self, item: &T) -> bool
pub fn check_and_set(&mut self, item: &T) -> bool
Record an item in the set and returns its previous value.
Equivalent to calling get()
then set()
but performs
hash lookup only once so it’s a bit more efficient.
use ofilter::Stream;
let mut filter: Stream<usize> = Stream::new(1_000);
assert!(!filter.check_and_set(&42));
assert!(filter.check(&42));
Trait Implementations§
source§impl<'de, T> Deserialize<'de> for Stream<T>where
T: Deserialize<'de>,
impl<'de, T> Deserialize<'de> for Stream<T>where
T: Deserialize<'de>,
source§fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>where
__D: Deserializer<'de>,
fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>where
__D: Deserializer<'de>,
source§impl<T> Display for Stream<T>
impl<T> Display for Stream<T>
Pretty-print the filter.
§Examples
use ofilter::Stream;
let filter: Stream<usize> = Stream::new(100);
assert_eq!("{ age: 0, fp_rates: (0.000000, 0.000000), params: { nb_hash: 2, bit_len: 1899, nb_items: 100, fp_rate: 0.009992, predict: false } }" , format!("{}", filter));
Auto Trait Implementations§
impl<T> Freeze for Stream<T>
impl<T> RefUnwindSafe for Stream<T>where
T: RefUnwindSafe,
impl<T> Send for Stream<T>where
T: Send,
impl<T> Sync for Stream<T>where
T: Sync,
impl<T> Unpin for Stream<T>where
T: Unpin,
impl<T> UnwindSafe for Stream<T>where
T: UnwindSafe,
Blanket Implementations§
source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
source§impl<T> FmtForward for T
impl<T> FmtForward for T
source§fn fmt_binary(self) -> FmtBinary<Self>where
Self: Binary,
fn fmt_binary(self) -> FmtBinary<Self>where
Self: Binary,
self
to use its Binary
implementation when Debug
-formatted.source§fn fmt_display(self) -> FmtDisplay<Self>where
Self: Display,
fn fmt_display(self) -> FmtDisplay<Self>where
Self: Display,
self
to use its Display
implementation when
Debug
-formatted.source§fn fmt_lower_exp(self) -> FmtLowerExp<Self>where
Self: LowerExp,
fn fmt_lower_exp(self) -> FmtLowerExp<Self>where
Self: LowerExp,
self
to use its LowerExp
implementation when
Debug
-formatted.source§fn fmt_lower_hex(self) -> FmtLowerHex<Self>where
Self: LowerHex,
fn fmt_lower_hex(self) -> FmtLowerHex<Self>where
Self: LowerHex,
self
to use its LowerHex
implementation when
Debug
-formatted.source§fn fmt_octal(self) -> FmtOctal<Self>where
Self: Octal,
fn fmt_octal(self) -> FmtOctal<Self>where
Self: Octal,
self
to use its Octal
implementation when Debug
-formatted.source§fn fmt_pointer(self) -> FmtPointer<Self>where
Self: Pointer,
fn fmt_pointer(self) -> FmtPointer<Self>where
Self: Pointer,
self
to use its Pointer
implementation when
Debug
-formatted.source§fn fmt_upper_exp(self) -> FmtUpperExp<Self>where
Self: UpperExp,
fn fmt_upper_exp(self) -> FmtUpperExp<Self>where
Self: UpperExp,
self
to use its UpperExp
implementation when
Debug
-formatted.source§fn fmt_upper_hex(self) -> FmtUpperHex<Self>where
Self: UpperHex,
fn fmt_upper_hex(self) -> FmtUpperHex<Self>where
Self: UpperHex,
self
to use its UpperHex
implementation when
Debug
-formatted.source§impl<T> Pipe for Twhere
T: ?Sized,
impl<T> Pipe for Twhere
T: ?Sized,
source§fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> Rwhere
Self: Sized,
fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> Rwhere
Self: Sized,
source§fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> Rwhere
R: 'a,
fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> Rwhere
R: 'a,
self
and passes that borrow into the pipe function. Read moresource§fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> Rwhere
R: 'a,
fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> Rwhere
R: 'a,
self
and passes that borrow into the pipe function. Read moresource§fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R
fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R
source§fn pipe_borrow_mut<'a, B, R>(
&'a mut self,
func: impl FnOnce(&'a mut B) -> R
) -> R
fn pipe_borrow_mut<'a, B, R>( &'a mut self, func: impl FnOnce(&'a mut B) -> R ) -> R
source§fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R
fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R
self
, then passes self.as_ref()
into the pipe function.source§fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R
fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R
self
, then passes self.as_mut()
into the pipe
function.source§fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R
fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R
self
, then passes self.deref()
into the pipe function.source§impl<T> Tap for T
impl<T> Tap for T
source§fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
Borrow<B>
of a value. Read moresource§fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
BorrowMut<B>
of a value. Read moresource§fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
AsRef<R>
view of a value. Read moresource§fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
AsMut<R>
view of a value. Read moresource§fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
Deref::Target
of a value. Read moresource§fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
Deref::Target
of a value. Read moresource§fn tap_dbg(self, func: impl FnOnce(&Self)) -> Self
fn tap_dbg(self, func: impl FnOnce(&Self)) -> Self
.tap()
only in debug builds, and is erased in release builds.source§fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self
fn tap_mut_dbg(self, func: impl FnOnce(&mut Self)) -> Self
.tap_mut()
only in debug builds, and is erased in release
builds.source§fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
.tap_borrow()
only in debug builds, and is erased in release
builds.source§fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
.tap_borrow_mut()
only in debug builds, and is erased in release
builds.source§fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
.tap_ref()
only in debug builds, and is erased in release
builds.source§fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
.tap_ref_mut()
only in debug builds, and is erased in release
builds.source§fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
.tap_deref()
only in debug builds, and is erased in release
builds.