[−][src]Enum bloom2::FilterSize
FilterSize bounds the allocated size and false-positive rate of a
Bloom2
instance.
The false positive probability for a bloom filter increases as the number of entries increases. This relationship is demonstrated using 64bit hashes as keys for each possible filter configuration below - you should choose a filter size for your expected load level and hash size.
The value of FilterSize controls the k
property of the filter: k = input_length_bytes / FilterSize
.
Variants
1 byte / 8 bits per key results in a bloom filter with a minimum memory usage of ~4 bytes and a maximum memory usage of 36 bytes.
The false positive probability using k=1
(a single byte key per entry)
grows proportionally to the number of entries in the filter:
+--+------------+------------+-----------+------------+------------+-----+
1 + * * +
| |
| * |
P 0.8 + +
r | |
o | |
b 0.6 + * +
a | |
b | |
i 0.4 + +
l | * |
i | |
t 0.2 + +
y | |
| * |
0 + ***** * * +
+--+------------+------------+-----------+------------+------------+-----+
0 50 100 150 200 250
Number of Entries
The probability of false positives reaches 1-in-2 after 80 entries.
An empty sparse bloom filter would require 1x64 bit block map entries (8 bytes) to map 4 64 bit blocks, containing a total of 256 bits (memory saving: 75%)
2 bytes / 16 bits per key results in a bloom filter with a minimum memory usage of ~1024 bytes and a maximum memory usage of ~8KB when fully populated.
When using a 64bit hash (4x2 byte keys, k=4
) the probability of a
false positive is:
+--+------------------------+-----------------------+--------------------+
1 + * * +
| * |
| |
P 0.8 + * +
r | |
o | |
b 0.6 + +
a | * |
b | |
i 0.4 + +
l | * |
i | |
t 0.2 + +
y | * |
| * |
0 + ***** +
+--+------------------------+-----------------------+--------------------+
0 50000 1e+05
Number of Entries
The probability of false positives reaches 1-in-2 after 30118 entries.
An empty sparse bloom filter would require 16x64 bit block map entries (128 bytes) to map 1024 64 bit blocks, containing a total of 65536 bits (memory saving: 98.4375%)
3 bytes / 24 bits per key results in a bloom filter with a minimum memory usage of ~262KB bytes and a maximum memory usage of ~2MB when fully populated.
When using a 64bit hash (2x3 byte keys, k=2
) the probability of a
false positive is:
1 +--+------------------+-------------------+------------------+-----------+
| * |
| * |
0.8 + +
P | * |
r | |
o | |
b 0.6 + * +
a | |
b | |
i 0.4 + * +
l | |
i | * |
t 0.2 + +
y | * |
| * * |
0 + **** +
+--+------------------+-------------------+------------------+-----------+
0 1e+07 2e+07 3e+07
Number of Entries
The probability of false positives reaches 1-in-2 after 10300768 entries.
An empty sparse bloom filter would require 4096x64 bit block map entries (32768 bytes) to map 262144 64 bit blocks, containing a total of 16777216 bits (memory saving: 98.4375%)
4 bytes / 32 bits per key results in a bloom filter with a minimum memory usage of ~67MB and a maximum memory usage of ~603MB when fully populated.
When using a 64bit hash (2x4 byte keys, k=2
) the probability of a
false positive is:
1 +--+--------------+--------------+--------------+--------------+---------+
| * |
| * |
0.8 + +
P | * |
r | |
o | |
b 0.6 + * +
a | |
b | |
i 0.4 + * +
l | |
i | * |
t 0.2 + +
y | * |
| * * |
0 + **** +
+--+--------------+--------------+--------------+--------------+---------+
0 2e+09 4e+09 6e+09 8e+09
Number of Entries
The probability of false positives reaches 1-in-2 after 2636996484 entries.
An empty sparse bloom filter would require 1048576x64 bit block map entries (8388608 bytes) to map 67108864 64 bit blocks, containing a total of 4294967296 bits (memory saving: 98.4375%)
5 bytes / 32 bits per key results in a bloom filter with a minimum memory usage of ~17GB and a maximum memory usage of ~1117GB when fully populated.
If you actually need this get in touch - I have some ideas for reducing the memory footprint even further.
When using a 64bit hash (1x5 byte keys, k=1
) the probability of a
false positive is:
+--+----------+---------+---------+----------+---------+---------+-------+
1 + * * +
| * |
| * |
P 0.8 + +
r | * |
o | * |
b 0.6 + +
a | * |
b | |
i 0.4 + * +
l | |
i | * |
t 0.2 + * +
y | * |
| * |
0 + ** +
+--+----------+---------+---------+----------+---------+---------+-------+
0 1e+12 2e+12 3e+12 4e+12 5e+12 6e+12
Number of Entries
The probability of false positives reaches 1-in-2 after 762123384786 entries.
An empty sparse bloom filter would require 268435456x64 bit block map entries (2147483648 bytes) to map 17179869184 64 bit blocks, containing a total of 1099511627776 bits (memory saving: 98.4375%)
Trait Implementations
impl Clone for FilterSize
[src]
pub fn clone(&self) -> FilterSize
[src]
pub fn clone_from(&mut self, source: &Self)
1.0.0[src]
impl Copy for FilterSize
[src]
impl Debug for FilterSize
[src]
impl Eq for FilterSize
[src]
impl PartialEq<FilterSize> for FilterSize
[src]
pub fn eq(&self, other: &FilterSize) -> bool
[src]
#[must_use]pub fn ne(&self, other: &Rhs) -> bool
1.0.0[src]
impl StructuralEq for FilterSize
[src]
impl StructuralPartialEq for FilterSize
[src]
Auto Trait Implementations
impl RefUnwindSafe for FilterSize
[src]
impl Send for FilterSize
[src]
impl Sync for FilterSize
[src]
impl Unpin for FilterSize
[src]
impl UnwindSafe for FilterSize
[src]
Blanket Implementations
impl<T> Any for T where
T: 'static + ?Sized,
[src]
T: 'static + ?Sized,
impl<T> Borrow<T> for T where
T: ?Sized,
[src]
T: ?Sized,
impl<T> BorrowMut<T> for T where
T: ?Sized,
[src]
T: ?Sized,
pub fn borrow_mut(&mut self) -> &mut T
[src]
impl<T> From<T> for T
[src]
impl<T, U> Into<U> for T where
U: From<T>,
[src]
U: From<T>,
impl<T> ToOwned for T where
T: Clone,
[src]
T: Clone,
type Owned = T
The resulting type after obtaining ownership.
pub fn to_owned(&self) -> T
[src]
pub fn clone_into(&self, target: &mut T)
[src]
impl<T, U> TryFrom<U> for T where
U: Into<T>,
[src]
U: Into<T>,
type Error = Infallible
The type returned in the event of a conversion error.
pub fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>
[src]
impl<T, U> TryInto<U> for T where
U: TryFrom<T>,
[src]
U: TryFrom<T>,