[][src]Enum bloom2::FilterSize

pub enum FilterSize {
    KeyBytes1,
    KeyBytes2,
    KeyBytes3,
    KeyBytes4,
    KeyBytes5,
}

FilterSize bounds the allocated size of a CompressedBitmap.

The false positive probability for a bloom filter increases as the number of entries increases. This relationship is demonstrated using sha256 hashes as keys for each possible filter configuration below - you should choose a filter size for your expected load level and hash size.

The value of FilterSize controls the k property of the filter: k = input_length_bytes / FilterSize.

Variants

KeyBytes1

1 byte / 8 bits per key results in a bloom filter with a minimum memory usage of ~4 bytes and a maximum memory usage of 36 bytes.

The false positive probability using k=1 (a single byte key per entry) grows proportionally to the number of entries in the filter:

          +--+------------+------------+------------+-----------+------------+-----+
        1 +                                                *                   *   +
          |                                                                        |
          |                                   *                                    |
    P 0.8 +                         *                                              +
    r     |                                                                        |
    o     |                   *                                                    |
    b 0.6 +                                                                        +
    a     |              *                                                         |
    b     |                                                                        |
    i 0.4 +          *                                                             +
    l     |                                                                        |
    i     |        *                                                               |
    t 0.2 +      *                                                                 +
    y     |    **                                                                  |
          |   **                                                                   |
        0 +  **                                                                    +
          +--+------------+------------+------------+-----------+------------+-----+
             0           200          400          600         800         1000     
                                      Number of Entries    

            The probability of false positives reaches 1-in-2 after 178 entries.
KeyBytes2

2 bytes / 16 bits per key results in a bloom filter with a minimum memory usage of ~1024 bytes and a maximum memory usage of ~8KB when fully populated.

When using a sha256 hash (256 bits, or 16x2 byte keys, k=16) the probability of a false positive is:

          +--+-------------------+-------------------+------------------+----------+
        1 +                                                                   *    +
          |                                                *                       |
          |                                                                        |
    P 0.8 +                                                                        +
    r     |                                  *                                     |
    o     |                                                                        |
    b 0.6 +                                                                        +
    a     |                                                                        |
    b     |                                                                        |
    i 0.4 +                         *                                              +
    l     |                                                                        |
    i     |                                                                        |
    t 0.2 +                                                                        +
    y     |                                                                        |
          |                  *                                                     |
        0 +  ***** * *   *                                                         +
          +--+-------------------+-------------------+------------------+----------+
             0                 10000               20000              30000         
                                      Number of Entries                             

           The probability of false positives reaches 1-in-2 after 12,947 entries.
KeyBytes3

3 bytes / 24 bits per key results in a bloom filter with a minimum memory usage of ~262KB bytes and a maximum memory usage of ~2MB when fully populated.

When using a sha256 hash (256 bits, or ~11x3 byte keys, k=~11) the probability of a false positive is:

        1 +--+---------------+--------------+--------------+---------------+-------+
          |                                                                   *    |
          |                                                                        |
      0.8 +                                                *                       +
    P     |                                                                        |
    r     |                                                                        |
    o     |                                                                        |
    b 0.6 +                                                                        +
    a     |                                  *                                     |
    b     |                                                                        |
    i 0.4 +                                                                        +
    l     |                                                                        |
    i     |                                                                        |
    t 0.2 +                         *                                              +
    y     |                                                                        |
          |                                                                        |
        0 +  ***** * *   *   *                                                     +
          +--+---------------+--------------+--------------+---------------+-------+
             0             2e+06          4e+06          6e+06           8e+06      
                                      Number of Entries                             

          The probability of false positives reaches 1-in-2 after 4,264,082 entries.
KeyBytes4

4 bytes / 32 bits per key results in a bloom filter with a minimum memory usage of ~67MB and a maximum memory usage of ~603MB when fully populated.

When using a sha256 hash (256 bits, or 8x3 byte keys, k=8) the probability of a false positive is:

        1 +--+----------+---------+----------+----------+---------+----------+-----+
          |                                                                   *    |
          |                                                                        |
          |                                                *                       |
    P 0.8 +                                                                        +
    r     |                                                                        |
    o     |                                  *                                     |
    b 0.6 +                                                                        +
    a     |                                                                        |
    b     |                                                                        |
    i 0.4 +                                                                        +
    l     |                         *                                              |
    i     |                                                                        |
    t 0.2 +                                                                        +
    y     |                                                                        |
          |                  *                                                     |
        0 +  ***** * *   *                                                         +
          +--+----------+---------+----------+----------+---------+----------+-----+
             0        5e+08     1e+09     1.5e+09     2e+09    2.5e+09     3e+09    
                                      Number of Entries                             

        The probability of false positives reaches 1-in-2 after 1,336,252,043 entries.
KeyBytes5

5 bytes / 32 bits per key results in a bloom filter with a minimum memory usage of ~17GB and a maximum memory usage of ~1117GB when fully populated.

If you actually need this get in touch - I have some ideas for reducing the memory footprint even further.

When using a sha256 hash (256 bits, or ~7x3 byte keys, k=~7) the probability of a false positive is:

        1 +--+----------------+---------------+----------------+---------------+---+
          |                                                                   *    |
          |                                                                        |
      0.8 +                                                *                       +
    P     |                                                                        |
    r     |                                                                        |
    o     |                                                                        |
    b 0.6 +                                  *                                     +
    a     |                                                                        |
    b     |                                                                        |
    i 0.4 +                                                                        +
    l     |                                                                        |
    i     |                         *                                              |
    t 0.2 +                                                                        +
    y     |                                                                        |
          |                  *                                                     |
        0 +  ***** * *   *                                                         +
          +--+----------------+---------------+----------------+---------------+---+
             0              2e+11           4e+11            6e+11           8e+11  
                                      Number of Entries                             

       The probability of false positives reaches 1-in-2 after 370,932,038,704 entries.

Trait Implementations

impl Clone for FilterSize[src]

impl Copy for FilterSize[src]

impl Debug for FilterSize[src]

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T where
    T: 'static + ?Sized
[src]

impl<T> Borrow<T> for T where
    T: ?Sized
[src]

impl<T> BorrowMut<T> for T where
    T: ?Sized
[src]

impl<T> From<T> for T[src]

impl<T, U> Into<U> for T where
    U: From<T>, 
[src]

impl<T> ToOwned for T where
    T: Clone
[src]

type Owned = T

The resulting type after obtaining ownership.

impl<T, U> TryFrom<U> for T where
    U: Into<T>, 
[src]

type Error = Infallible

The type returned in the event of a conversion error.

impl<T, U> TryInto<U> for T where
    U: TryFrom<T>, 
[src]

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.