[−][src]Enum bloom2::FilterSize

pub enum FilterSize {
    KeyBytes1,
    KeyBytes2,
    KeyBytes3,
    KeyBytes4,
    KeyBytes5,
}

FilterSize bounds the allocated size and false-positive rate of a Bloom2 instance.

The false positive probability for a bloom filter increases as the number of entries increases. This relationship is demonstrated using 64bit hashes as keys for each possible filter configuration below - you should choose a filter size for your expected load level and hash size.

The value of FilterSize controls the k property of the filter: k = input_length_bytes / FilterSize.

Variants

KeyBytes1

1 byte / 8 bits per key results in a bloom filter with a minimum memory usage of ~4 bytes and a maximum memory usage of 36 bytes.

The false positive probability using k=1 (a single byte key per entry) grows proportionally to the number of entries in the filter:

          +--+------------+------------+-----------+------------+------------+-----+
        1 +                                                *                   *   +
          |                                                                        |
          |                                   *                                    |
    P 0.8 +                                                                        +
    r     |                                                                        |
    o     |                                                                        |
    b 0.6 +                         *                                              +
    a     |                                                                        |
    b     |                                                                        |
    i 0.4 +                                                                        +
    l     |                  *                                                     |
    i     |                                                                        |
    t 0.2 +                                                                        +
    y     |                                                                        |
          |              *                                                         |
        0 +  ***** * *                                                             +
          +--+------------+------------+-----------+------------+------------+-----+
             0           50           100         150          200          250     
                                      Number of Entries

The probability of false positives reaches 1-in-2 after 80 entries.

An empty sparse bloom filter would require 1x64 bit block map entries (8 bytes) to map 4 64 bit blocks, containing a total of 256 bits (memory saving: 75%)

KeyBytes2

2 bytes / 16 bits per key results in a bloom filter with a minimum memory usage of ~1024 bytes and a maximum memory usage of ~8KB when fully populated.

When using a 64bit hash (4x2 byte keys, k=4) the probability of a false positive is:

          +--+------------------------+-----------------------+--------------------+
        1 +                                                *                  *    +
          |                                  *                                     |
          |                                                                        |
    P 0.8 +                         *                                              +
    r     |                                                                        |
    o     |                                                                        |
    b 0.6 +                                                                        +
    a     |                  *                                                     |
    b     |                                                                        |
    i 0.4 +                                                                        +
    l     |              *                                                         |
    i     |                                                                        |
    t 0.2 +                                                                        +
    y     |          *                                                             |
          |        *                                                               |
        0 +  *****                                                                 +
          +--+------------------------+-----------------------+--------------------+
             0                      50000                   1e+05                   
                                      Number of Entries

The probability of false positives reaches 1-in-2 after 30118 entries.

An empty sparse bloom filter would require 16x64 bit block map entries (128 bytes) to map 1024 64 bit blocks, containing a total of 65536 bits (memory saving: 98.4375%)

KeyBytes3

3 bytes / 24 bits per key results in a bloom filter with a minimum memory usage of ~262KB bytes and a maximum memory usage of ~2MB when fully populated.

When using a 64bit hash (2x3 byte keys, k=2) the probability of a false positive is:

        1 +--+------------------+-------------------+------------------+-----------+
          |                                                                   *    |
          |                                                *                       |
      0.8 +                                                                        +
    P     |                                  *                                     |
    r     |                                                                        |
    o     |                                                                        |
    b 0.6 +                         *                                              +
    a     |                                                                        |
    b     |                                                                        |
    i 0.4 +                  *                                                     +
    l     |                                                                        |
    i     |              *                                                         |
    t 0.2 +                                                                        +
    y     |          *                                                             |
          |      * *                                                               |
        0 +  ****                                                                  +
          +--+------------------+-------------------+------------------+-----------+
             0                1e+07               2e+07              3e+07          
                                      Number of Entries

The probability of false positives reaches 1-in-2 after 10300768 entries.

An empty sparse bloom filter would require 4096x64 bit block map entries (32768 bytes) to map 262144 64 bit blocks, containing a total of 16777216 bits (memory saving: 98.4375%)

KeyBytes4

4 bytes / 32 bits per key results in a bloom filter with a minimum memory usage of ~67MB and a maximum memory usage of ~603MB when fully populated.

When using a 64bit hash (2x4 byte keys, k=2) the probability of a false positive is:

        1 +--+--------------+--------------+--------------+--------------+---------+
          |                                                                   *    |
          |                                                *                       |
      0.8 +                                                                        +
    P     |                                  *                                     |
    r     |                                                                        |
    o     |                                                                        |
    b 0.6 +                         *                                              +
    a     |                                                                        |
    b     |                                                                        |
    i 0.4 +                  *                                                     +
    l     |                                                                        |
    i     |              *                                                         |
    t 0.2 +                                                                        +
    y     |          *                                                             |
          |      * *                                                               |
        0 +  ****                                                                  +
          +--+--------------+--------------+--------------+--------------+---------+
             0            2e+09          4e+09          6e+09          8e+09        
                                      Number of Entries

The probability of false positives reaches 1-in-2 after 2636996484 entries.

An empty sparse bloom filter would require 1048576x64 bit block map entries (8388608 bytes) to map 67108864 64 bit blocks, containing a total of 4294967296 bits (memory saving: 98.4375%)

KeyBytes5

5 bytes / 32 bits per key results in a bloom filter with a minimum memory usage of ~17GB and a maximum memory usage of ~1117GB when fully populated.

If you actually need this get in touch - I have some ideas for reducing the memory footprint even further.

When using a 64bit hash (1x5 byte keys, k=1) the probability of a false positive is:

          +--+----------+---------+---------+----------+---------+---------+-------+
        1 +                                                *                  *    +
          |                                  *                                     |
          |                         *                                              |
    P 0.8 +                                                                        +
    r     |                  *                                                     |
    o     |              *                                                         |
    b 0.6 +                                                                        +
    a     |          *                                                             |
    b     |                                                                        |
    i 0.4 +        *                                                               +
    l     |                                                                        |
    i     |      *                                                                 |
    t 0.2 +     *                                                                  +
    y     |    *                                                                   |
          |   *                                                                    |
        0 +  **                                                                    +
          +--+----------+---------+---------+----------+---------+---------+-------+
             0        1e+12     2e+12     3e+12      4e+12     5e+12     6e+12      
                                      Number of Entries

The probability of false positives reaches 1-in-2 after 762123384786 entries.

An empty sparse bloom filter would require 268435456x64 bit block map entries (2147483648 bytes) to map 17179869184 64 bit blocks, containing a total of 1099511627776 bits (memory saving: 98.4375%)

Trait Implementations

`impl Clone for FilterSize`[src]

`pub fn clone(&self) -> FilterSize`[src]

`pub fn clone_from(&mut self, source: &Self)`1.0.0[src]

`impl Copy for FilterSize`[src]

`impl Debug for FilterSize`[src]

`pub fn fmt(&self, f: &mut Formatter<'_>) -> Result`[src]

`impl Eq for FilterSize`[src]

`impl PartialEq<FilterSize> for FilterSize`[src]

`pub fn eq(&self, other: &FilterSize) -> bool`[src]

`#[must_use]pub fn ne(&self, other: &Rhs) -> bool`1.0.0[src]

`impl StructuralEq for FilterSize`[src]

`impl StructuralPartialEq for FilterSize`[src]

Auto Trait Implementations

`impl RefUnwindSafe for FilterSize`[src]

`impl Send for FilterSize`[src]

`impl Sync for FilterSize`[src]

`impl Unpin for FilterSize`[src]

`impl UnwindSafe for FilterSize`[src]

Blanket Implementations

`impl<T> Any for T where T: 'static + ?Sized,` [src]

`pub fn type_id(&self) -> TypeId`[src]

`impl<T> Borrow<T> for T where T: ?Sized,` [src]

`pub fn borrow(&self) -> &T`[src]

`impl<T> BorrowMut<T> for T where T: ?Sized,` [src]

`pub fn borrow_mut(&mut self) -> &mut T`[src]

`impl<T> From<T> for T`[src]

`pub fn from(t: T) -> T`[src]

`impl<T, U> Into for T where U: From<T>,` [src]

`pub fn into(self) -> U`[src]

`impl<T> ToOwned for T where T: Clone,` [src]

`type Owned = T`

The resulting type after obtaining ownership.

`pub fn to_owned(&self) -> T`[src]

`pub fn clone_into(&self, target: &mut T)`[src]

`impl<T, U> TryFrom for T where U: Into<T>,` [src]

`type Error = Infallible`

The type returned in the event of a conversion error.

`pub fn try_from(value: U) -> Result<T, <T as TryFrom>::Error>`[src]

`impl<T, U> TryInto for T where U: TryFrom<T>,` [src]

`type Error = >::Error`

The type returned in the event of a conversion error.

[−][src]Enum bloom2::FilterSize

Variants

Trait Implementations

impl Clone for FilterSize[src]

pub fn clone(&self) -> FilterSize[src]

pub fn clone_from(&mut self, source: &Self)1.0.0[src]

impl Copy for FilterSize[src]

impl Debug for FilterSize[src]

pub fn fmt(&self, f: &mut Formatter<'_>) -> Result[src]

impl Eq for FilterSize[src]

impl PartialEq<FilterSize> for FilterSize[src]

pub fn eq(&self, other: &FilterSize) -> bool[src]

#[must_use]pub fn ne(&self, other: &Rhs) -> bool1.0.0[src]

impl StructuralEq for FilterSize[src]

impl StructuralPartialEq for FilterSize[src]

Auto Trait Implementations

impl RefUnwindSafe for FilterSize[src]

impl Send for FilterSize[src]

impl Sync for FilterSize[src]

impl Unpin for FilterSize[src]

impl UnwindSafe for FilterSize[src]

Blanket Implementations

impl<T> Any for T where T: 'static + ?Sized, [src]

pub fn type_id(&self) -> TypeId[src]

impl<T> Borrow<T> for T where T: ?Sized, [src]

pub fn borrow(&self) -> &T[src]

impl<T> BorrowMut<T> for T where T: ?Sized, [src]

pub fn borrow_mut(&mut self) -> &mut T[src]

impl<T> From<T> for T[src]

pub fn from(t: T) -> T[src]

impl<T, U> Into<U> for T where U: From<T>, [src]

pub fn into(self) -> U[src]

impl<T> ToOwned for T where T: Clone, [src]

type Owned = T

pub fn to_owned(&self) -> T[src]

pub fn clone_into(&self, target: &mut T)[src]

impl<T, U> TryFrom<U> for T where U: Into<T>, [src]

type Error = Infallible

pub fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>[src]

impl<T, U> TryInto<U> for T where U: TryFrom<T>, [src]

type Error = <U as TryFrom<T>>::Error

pub fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>[src]