[][src]Enum bloom2::FilterSize

pub enum FilterSize {
    KeyBytes1,
    KeyBytes2,
    KeyBytes3,
    KeyBytes4,
    KeyBytes5,
}

FilterSize bounds the allocated size and false-positive rate of a Bloom2 instance.

The false positive probability for a bloom filter increases as the number of entries increases. This relationship is demonstrated using 64bit hashes as keys for each possible filter configuration below - you should choose a filter size for your expected load level and hash size.

The value of FilterSize controls the k property of the filter: k = input_length_bytes / FilterSize.

Variants

KeyBytes1

1 byte / 8 bits per key results in a bloom filter with a minimum memory usage of ~4 bytes and a maximum memory usage of 36 bytes.

The false positive probability using k=1 (a single byte key per entry) grows proportionally to the number of entries in the filter:

          +--+------------+------------+-----------+------------+------------+-----+
        1 +                                                *                   *   +
          |                                                                        |
          |                                   *                                    |
    P 0.8 +                                                                        +
    r     |                                                                        |
    o     |                                                                        |
    b 0.6 +                         *                                              +
    a     |                                                                        |
    b     |                                                                        |
    i 0.4 +                                                                        +
    l     |                  *                                                     |
    i     |                                                                        |
    t 0.2 +                                                                        +
    y     |                                                                        |
          |              *                                                         |
        0 +  ***** * *                                                             +
          +--+------------+------------+-----------+------------+------------+-----+
             0           50           100         150          200          250     
                                      Number of Entries                             

The probability of false positives reaches 1-in-2 after 80 entries.

An empty sparse bloom filter would require 1x64 bit block map entries (8 bytes) to map 4 64 bit blocks, containing a total of 256 bits (memory saving: 75%)

KeyBytes2

2 bytes / 16 bits per key results in a bloom filter with a minimum memory usage of ~1024 bytes and a maximum memory usage of ~8KB when fully populated.

When using a 64bit hash (4x2 byte keys, k=4) the probability of a false positive is:

          +--+------------------------+-----------------------+--------------------+
        1 +                                                *                  *    +
          |                                  *                                     |
          |                                                                        |
    P 0.8 +                         *                                              +
    r     |                                                                        |
    o     |                                                                        |
    b 0.6 +                                                                        +
    a     |                  *                                                     |
    b     |                                                                        |
    i 0.4 +                                                                        +
    l     |              *                                                         |
    i     |                                                                        |
    t 0.2 +                                                                        +
    y     |          *                                                             |
          |        *                                                               |
        0 +  *****                                                                 +
          +--+------------------------+-----------------------+--------------------+
             0                      50000                   1e+05                   
                                      Number of Entries                             

The probability of false positives reaches 1-in-2 after 30118 entries.

An empty sparse bloom filter would require 16x64 bit block map entries (128 bytes) to map 1024 64 bit blocks, containing a total of 65536 bits (memory saving: 98.4375%)

KeyBytes3

3 bytes / 24 bits per key results in a bloom filter with a minimum memory usage of ~262KB bytes and a maximum memory usage of ~2MB when fully populated.

When using a 64bit hash (2x3 byte keys, k=2) the probability of a false positive is:

        1 +--+------------------+-------------------+------------------+-----------+
          |                                                                   *    |
          |                                                *                       |
      0.8 +                                                                        +
    P     |                                  *                                     |
    r     |                                                                        |
    o     |                                                                        |
    b 0.6 +                         *                                              +
    a     |                                                                        |
    b     |                                                                        |
    i 0.4 +                  *                                                     +
    l     |                                                                        |
    i     |              *                                                         |
    t 0.2 +                                                                        +
    y     |          *                                                             |
          |      * *                                                               |
        0 +  ****                                                                  +
          +--+------------------+-------------------+------------------+-----------+
             0                1e+07               2e+07              3e+07          
                                      Number of Entries                             

The probability of false positives reaches 1-in-2 after 10300768 entries.

An empty sparse bloom filter would require 4096x64 bit block map entries (32768 bytes) to map 262144 64 bit blocks, containing a total of 16777216 bits (memory saving: 98.4375%)

KeyBytes4

4 bytes / 32 bits per key results in a bloom filter with a minimum memory usage of ~67MB and a maximum memory usage of ~603MB when fully populated.

When using a 64bit hash (2x4 byte keys, k=2) the probability of a false positive is:

        1 +--+--------------+--------------+--------------+--------------+---------+
          |                                                                   *    |
          |                                                *                       |
      0.8 +                                                                        +
    P     |                                  *                                     |
    r     |                                                                        |
    o     |                                                                        |
    b 0.6 +                         *                                              +
    a     |                                                                        |
    b     |                                                                        |
    i 0.4 +                  *                                                     +
    l     |                                                                        |
    i     |              *                                                         |
    t 0.2 +                                                                        +
    y     |          *                                                             |
          |      * *                                                               |
        0 +  ****                                                                  +
          +--+--------------+--------------+--------------+--------------+---------+
             0            2e+09          4e+09          6e+09          8e+09        
                                      Number of Entries                             

The probability of false positives reaches 1-in-2 after 2636996484 entries.

An empty sparse bloom filter would require 1048576x64 bit block map entries (8388608 bytes) to map 67108864 64 bit blocks, containing a total of 4294967296 bits (memory saving: 98.4375%)

KeyBytes5

5 bytes / 32 bits per key results in a bloom filter with a minimum memory usage of ~17GB and a maximum memory usage of ~1117GB when fully populated.

If you actually need this get in touch - I have some ideas for reducing the memory footprint even further.

When using a 64bit hash (1x5 byte keys, k=1) the probability of a false positive is:

          +--+----------+---------+---------+----------+---------+---------+-------+
        1 +                                                *                  *    +
          |                                  *                                     |
          |                         *                                              |
    P 0.8 +                                                                        +
    r     |                  *                                                     |
    o     |              *                                                         |
    b 0.6 +                                                                        +
    a     |          *                                                             |
    b     |                                                                        |
    i 0.4 +        *                                                               +
    l     |                                                                        |
    i     |      *                                                                 |
    t 0.2 +     *                                                                  +
    y     |    *                                                                   |
          |   *                                                                    |
        0 +  **                                                                    +
          +--+----------+---------+---------+----------+---------+---------+-------+
             0        1e+12     2e+12     3e+12      4e+12     5e+12     6e+12      
                                      Number of Entries                             

The probability of false positives reaches 1-in-2 after 762123384786 entries.

An empty sparse bloom filter would require 268435456x64 bit block map entries (2147483648 bytes) to map 17179869184 64 bit blocks, containing a total of 1099511627776 bits (memory saving: 98.4375%)

Trait Implementations

impl Clone for FilterSize[src]

impl Copy for FilterSize[src]

impl Debug for FilterSize[src]

impl Eq for FilterSize[src]

impl PartialEq<FilterSize> for FilterSize[src]

impl StructuralEq for FilterSize[src]

impl StructuralPartialEq for FilterSize[src]

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized
[src]

impl<T> Borrow<T> for T where
    T: ?Sized
[src]

impl<T> BorrowMut<T> for T where
    T: ?Sized
[src]

impl<T> From<T> for T[src]

impl<T, U> Into<U> for T where
    U: From<T>, 
[src]

impl<T> ToOwned for T where
    T: Clone
[src]

type Owned = T

The resulting type after obtaining ownership.

impl<T, U> TryFrom<U> for T where
    U: Into<T>, 
[src]

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>, 
[src]

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.