Struct Params

pub struct Params {
    pub nb_hash: usize,
    pub bit_len: usize,
    pub nb_items: usize,
    pub fp_rate: f64,
    pub predict: bool,
}

Parameters used to build bloom filters.

This is a utility to set up various bloom filter parameters. There are typically 4 parameters:

n (nb_items): number of items in the filter
p (fp_rate): probability of false positives, fraction between 0 and 1
m (bit_len): number of bits in the filter
k (nb_hash): number of hash functions

All of them are linked: if you change the size of the filter, for example, you also change the number of items it can hold at a given false positive rate.

A few things to keep in mind:

For a pure Bloom filter, the only technical parameters that matter are nb_hash and bit_len. They fully define the filter.
From a functional point of view, most of the time you are interested in a given nb_items and fp_rate. The other two are derived from those.

You may fill this struct partially: any field set to zero is inferred, either by deducing it from the other defined params or by replacing it with a default value. This is done by calling .adjust().

There is also a predict field, which can be used to turn the filter into a predictable filter. This is convenient for testing and may be of interest in edge cases, but most of the time a truly random version is preferred. Randomness avoids bias and can also protect against some DDoS attacks based on hash collisions.
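To illustrate the difference between a predictable and a randomly seeded hash, here is a minimal sketch that uses only the standard library. How this crate implements predict internally is not described on this page, so the helper names predictable_hash and seeded_hash are hypothetical and are not part of the ofilter API.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::hash::{BuildHasher, Hash, Hasher};

/// Hypothetical helper: hash with fixed keys.
/// Every `DefaultHasher::new()` starts from the same state, so the same
/// input gives the same output for a given Rust toolchain.
fn predictable_hash<T: Hash>(item: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    item.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical helper: hash with the keys held by `state`.
/// A `RandomState` is randomly seeded when it is created, so two different
/// states are expected to hash the same input to different values.
fn seeded_hash<T: Hash>(state: &RandomState, item: &T) -> u64 {
    let mut hasher = state.build_hasher();
    item.hash(&mut hasher);
    hasher.finish()
}

fn main() {
    let state = RandomState::new();
    println!("predictable: {:x}", predictable_hash(&"item"));
    println!("seeded:      {:x}", seeded_hash(&state, &"item"));
}
```

With fixed keys, an attacker who knows the hash function can precompute colliding inputs. With per-instance random keys, that precomputation is not possible without knowing the seed.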

A few helpers are defined as well, for common use-cases.

§A bit of theory

This interactive Bloom filter params calculator proved very useful while developing this. It also recaps the usual formulas, with:

n -> number of items in the filter
p -> probability of false positives, fraction between 0 and 1
m -> number of bits in the filter
k -> number of hash functions

We have, with log denoting the natural logarithm:

n = ceil(m / (-k / log(1 - exp(log(p) / k))))
p = pow(1 - exp(-k / (m / n)), k)
m = ceil((n * log(p)) / log(1 / pow(2, log(2))))
k = round((m / n) * log(2))
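As a reading aid, the four formulas above can be transcribed into standalone Rust. This is only a sketch: the function names items_for, fp_rate_for, optimal_bits and optimal_hashes are made up for illustration, and nothing here claims to be the code behind the crate's own estimate_* and optimal_* helpers.

```rust
// Standalone transcription of the four formulas above.
// Illustrative names only; this is not the crate's implementation.

/// n = ceil(m / (-k / ln(1 - exp(ln(p) / k))))
fn items_for(k: usize, m: usize, p: f64) -> usize {
    let k = k as f64;
    let bits_per_item = -k / (1.0 - (p.ln() / k).exp()).ln();
    (m as f64 / bits_per_item).ceil() as usize
}

/// p = (1 - exp(-k / (m / n)))^k
fn fp_rate_for(k: usize, m: usize, n: usize) -> f64 {
    let k = k as f64;
    (1.0 - (-k / (m as f64 / n as f64)).exp()).powf(k)
}

/// m = ceil((n * ln(p)) / ln(1 / 2^ln(2)))
fn optimal_bits(n: usize, p: f64) -> usize {
    let denom = (1.0 / 2.0_f64.powf(2.0_f64.ln())).ln();
    ((n as f64 * p.ln()) / denom).ceil() as usize
}

/// k = round((m / n) * ln(2))
fn optimal_hashes(m: usize, n: usize) -> usize {
    ((m as f64 / n as f64) * 2.0_f64.ln()).round() as usize
}

fn main() {
    let (n, p) = (10_000, 0.01);
    let m = optimal_bits(n, p);
    let k = optimal_hashes(m, n);
    println!("m = {m}, k = {k}");
    println!("p = {:.6}", fp_rate_for(k, m, n));
    println!("n = {}", items_for(k, m, p));
}
```

For n = 10,000 and p = 0.01, this gives m = 95851 and k = 7. Plugging those back into the other two formulas returns a false positive rate and an item count close to the starting values, with small differences due to rounding m and k to integers.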

A command line equivalent would be this practical bloom-filter-calculator.rb gist.

Note that there are corner cases. For example, the formula that gives bit_len (m) from nb_items (n) and fp_rate (p) corresponds to the optimal case, in which nb_hash (k) has been chosen optimally. If it has not, the formula has to be revisited and adapted to the actual value of nb_hash (k). In practice, unless you set it yourself, this package enforces a nb_hash (k) of 2, which is generally optimal if you consider CPU time rather than memory usage.
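One way to adapt the bit length to a fixed number of hash functions is to invert the false positive formula p = pow(1 - exp(-k / (m / n)), k) for m while holding k constant. The sketch below does that. The function name bits_for_fixed_k is hypothetical, and whether the crate computes bit_len exactly this way is an assumption, although the results agree with the bit_len values shown in the examples on this page.

```rust
/// Hypothetical helper: invert p = (1 - exp(-k * n / m))^k for m, k fixed:
/// m = ceil(-k * n / ln(1 - p^(1/k)))
fn bits_for_fixed_k(k: usize, n: usize, p: f64) -> usize {
    let k = k as f64;
    (-k * n as f64 / (1.0 - p.powf(1.0 / k)).ln()).ceil() as usize
}

/// Optimal-case formula, m = ceil(-n * ln(p) / ln(2)^2), for comparison.
fn optimal_bits(n: usize, p: f64) -> usize {
    let ln2 = 2.0_f64.ln();
    (-(n as f64) * p.ln() / (ln2 * ln2)).ceil() as usize
}

fn main() {
    // Same inputs as the two documented examples.
    println!("k = 2: {}", bits_for_fixed_k(2, 10_000, 0.01));
    println!("k = 3: {}", bits_for_fixed_k(3, 100_000, 0.1));
    // Optimal k for the first case, for comparison.
    println!("optimal: {}", optimal_bits(10_000, 0.01));
}
```

For 10,000 items at a 1% false positive rate, fixing k at 2 needs 189825 bits where the optimal case needs 95851. That is roughly twice the memory, in exchange for computing 2 hashes per operation instead of 7.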

§Links

§Examples

Getting a filter for a given number of items, everything else default:

use ofilter::Params;

let params = Params::with_nb_items(10_000);

assert_eq!("{ nb_hash: 2, bit_len: 189825, nb_items: 10000, fp_rate: 0.010000, predict: false }", format!("{}", &params));

Getting a filter for a given number of items, false positive rate, and enforcing number of hash:

use ofilter::Params;

let params = Params{
    nb_hash: 3,
    bit_len: 0,
    nb_items: 100_000,
    fp_rate: 0.1,
    predict: false,
}.adjust();

assert_eq!("{ nb_hash: 3, bit_len: 480833, nb_items: 100000, fp_rate: 0.100000, predict: false }", format!("{}", &params));

Fields§

§nb_hash: usize

Number of hash functions used by the filter.

Also referred to as k in most Bloom filter papers.

§bit_len: usize

Length of the bit vector used by the filter.

Also referred to as m in most Bloom filter papers.

§nb_items: usize

Number of items the Bloom filter is designed for.

Also referred to as n in most Bloom filter papers.

§fp_rate: f64

False positive rate of the Bloom filter.

Also referred to as p in most Bloom filter papers.

§predict: bool

If set to true, the Bloom filter is predictable.

Use this for testing, when you want something that is 100% predictable and want to avoid flaky behavior. In production, it is safer to rely on the random, statistical default behavior.

One reason is security: for example, using a predictable hash for a cache may expose you to some sort of DDoS attack.

If in doubt, leave it set to false.

Implementations§

impl Params

pub fn with_nb_items(nb_items: usize) -> Params

§Examples:

pub fn with_nb_items_and_fp_rate(nb_items: usize, fp_rate: f64) -> Params

§Examples:

pub fn with_bit_len(bit_len: usize) -> Params

§Examples:

pub fn with_bit_len_and_nb_items(bit_len: usize, nb_items: usize) -> Params

§Examples:

pub fn estimate_nb_items(nb_hash: usize, bit_len: usize, fp_rate: f64) -> usize

§Examples:

pub fn estimate_fp_rate(nb_hash: usize, bit_len: usize, nb_items: usize) -> f64

§Examples:

pub fn optimal_bit_len(nb_items: usize, fp_rate: f64) -> usize

§Examples:

pub fn optimal_nb_hash(bit_len: usize, nb_items: usize) -> usize

§Examples:

pub fn guess_bit_len_for_fp_rate( nb_hash: usize, nb_items: usize, fp_rate: f64, ) -> usize

§Examples:

pub fn guess_nb_hash_for_fp_rate( bit_len: usize, nb_items: usize, fp_rate: f64, ) -> usize

§Examples:

pub fn adjust(self) -> Self

§Examples:

Trait Implementations§

impl Clone for Params

fn clone(&self) -> Params

fn clone_from(&mut self, source: &Self)

impl Debug for Params

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl Default for Params

fn default() -> Self

impl<'de> Deserialize<'de> for Params

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

impl Display for Params

§Examples

fn fmt(&self, f: &mut Formatter<'_>) -> Result

impl PartialEq for Params

§Examples

fn eq(&self, other: &Self) -> bool

fn ne(&self, other: &Rhs) -> bool

impl Serialize for Params

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

impl Eq for Params

Auto Trait Implementations§

impl Freeze for Params

impl RefUnwindSafe for Params

impl Send for Params

impl Sync for Params

impl Unpin for Params

impl UnwindSafe for Params

Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

impl<T> Conv for T

fn conv<T>(self) -> T
where Self: Into<T>,

impl<T> FmtForward for T

fn fmt_binary(self) -> FmtBinary<Self>
where Self: Binary,

fn fmt_display(self) -> FmtDisplay<Self>
where Self: Display,

fn fmt_lower_exp(self) -> FmtLowerExp<Self>
where Self: LowerExp,

fn fmt_lower_hex(self) -> FmtLowerHex<Self>
where Self: LowerHex,

fn fmt_octal(self) -> FmtOctal<Self>
where Self: Octal,

fn fmt_pointer(self) -> FmtPointer<Self>
where Self: Pointer,

fn fmt_upper_exp(self) -> FmtUpperExp<Self>
where Self: UpperExp,

fn fmt_upper_hex(self) -> FmtUpperHex<Self>
where Self: UpperHex,

fn fmt_list(self) -> FmtList<Self>
where &'a Self: for<'a> IntoIterator,

impl<T> From<T> for T

fn from(t: T) -> T

impl<T, U> Into<U> for T
where U: From<T>,

impl<T> Pipe for T
where T: ?Sized,

fn pipe<R>(self, func: impl FnOnce(Self) -> R) -> R
where Self: Sized,

fn pipe_ref<'a, R>(&'a self, func: impl FnOnce(&'a Self) -> R) -> R
where R: 'a,

fn pipe_ref_mut<'a, R>(&'a mut self, func: impl FnOnce(&'a mut Self) -> R) -> R
where R: 'a,

fn pipe_borrow<'a, B, R>(&'a self, func: impl FnOnce(&'a B) -> R) -> R
where Self: Borrow<B>, B: 'a + ?Sized, R: 'a,

fn pipe_borrow_mut<'a, B, R>( &'a mut self, func: impl FnOnce(&'a mut B) -> R, ) -> R
where Self: BorrowMut<B>, B: 'a + ?Sized, R: 'a,

fn pipe_as_ref<'a, U, R>(&'a self, func: impl FnOnce(&'a U) -> R) -> R
where Self: AsRef<U>, U: 'a + ?Sized, R: 'a,

fn pipe_as_mut<'a, U, R>(&'a mut self, func: impl FnOnce(&'a mut U) -> R) -> R
where Self: AsMut<U>, U: 'a + ?Sized, R: 'a,

fn pipe_deref<'a, T, R>(&'a self, func: impl FnOnce(&'a T) -> R) -> R
where Self: Deref<Target = T>, T: 'a + ?Sized, R: 'a,

fn pipe_deref_mut<'a, T, R>( &'a mut self, func: impl FnOnce(&'a mut T) -> R, ) -> R
where Self: DerefMut<Target = T> + Deref, T: 'a + ?Sized, R: 'a,

fn tap_borrow<B>(self, func: impl FnOnce(&B)) -> Self
where Self: Borrow<B>, B: ?Sized,

fn tap_borrow_mut<B>(self, func: impl FnOnce(&mut B)) -> Self
where Self: BorrowMut<B>, B: ?Sized,

fn tap_ref<R>(self, func: impl FnOnce(&R)) -> Self
where Self: AsRef<R>, R: ?Sized,

fn tap_ref_mut<R>(self, func: impl FnOnce(&mut R)) -> Self
where Self: AsMut<R>, R: ?Sized,

fn tap_deref<T>(self, func: impl FnOnce(&T)) -> Self
where Self: Deref<Target = T>, T: ?Sized,

fn tap_deref_mut<T>(self, func: impl FnOnce(&mut T)) -> Self
where Self: DerefMut<Target = T> + Deref, T: ?Sized,

fn tap_borrow_dbg<B>(self, func: impl FnOnce(&B)) -> Self
where Self: Borrow<B>, B: ?Sized,

fn tap_borrow_mut_dbg<B>(self, func: impl FnOnce(&mut B)) -> Self
where Self: BorrowMut<B>, B: ?Sized,

fn tap_ref_dbg<R>(self, func: impl FnOnce(&R)) -> Self
where Self: AsRef<R>, R: ?Sized,

fn tap_ref_mut_dbg<R>(self, func: impl FnOnce(&mut R)) -> Self
where Self: AsMut<R>, R: ?Sized,

fn tap_deref_dbg<T>(self, func: impl FnOnce(&T)) -> Self
where Self: Deref<Target = T>, T: ?Sized,

fn tap_deref_mut_dbg<T>(self, func: impl FnOnce(&mut T)) -> Self
where Self: DerefMut<Target = T> + Deref, T: ?Sized,

impl<T> ToOwned for T
where T: Clone,

impl<T> ToString for T
where T: Display + ?Sized,

fn try_conv<T>(self) -> Result<T, Self::Error>
where Self: TryInto<T>,

impl<T, U> TryFrom<U> for T
where U: Into<T>,

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,