pub struct Fuse16 {
pub seed: u64,
pub segment_length: usize,
pub fingerprints: Box<[u16]>,
}
BinaryFuse16
Expand description
Xor filter using 16-bit fingerprints in a fuse graph. Requires less space than an Xor16
.
A Fuse16
filter uses <18.202 bits per entry of the set is it constructed from, and has a false
positive rate of <0.002%. As with other probabilistic filters, a higher number of entries decreases
the bits per entry but increases the false positive rate.
A Fuse16
filter uses less space and is faster to construct than an Xor16
filter, but
requires a large number of keys to be constructed. Experimentally, this number is somewhere
>100_000. For smaller key sets, prefer the Xor16
filter. A Fuse16
filter may fail to be
constructed.
A Fuse16
is constructed from a set of 64-bit unsigned integers and is immutable.
use xorf::{Filter, Fuse16};
use core::convert::TryFrom;
const SAMPLE_SIZE: usize = 1_000_000;
let keys: Vec<u64> = (0..SAMPLE_SIZE).map(|_| rng.gen()).collect();
let filter = Fuse16::try_from(&keys).unwrap();
// no false negatives
for key in keys {
assert!(filter.contains(&key));
}
// bits per entry
let bpe = (filter.len() as f64) * 16.0 / (SAMPLE_SIZE as f64);
assert!(bpe < 18.202, "Bits per entry is {}", bpe);
// false positive rate
let false_positives: usize = (0..SAMPLE_SIZE)
.map(|_| rng.gen())
.filter(|n| filter.contains(n))
.count();
let fp_rate: f64 = (false_positives * 100) as f64 / SAMPLE_SIZE as f64;
assert!(fp_rate < 0.0025, "False positive rate is {}", fp_rate);
Serializing and deserializing Fuse16
filters can be enabled with the serde
feature (or [bincode
] for bincode).
Fields§
§seed: u64
BinaryFuse16
The seed for the filter
segment_length: usize
BinaryFuse16
The number of blocks in the filter
fingerprints: Box<[u16]>
BinaryFuse16
The fingerprints for the filter
Implementations§
source§impl Fuse16
impl Fuse16
sourcepub fn try_from_iterator<T>(keys: T) -> Result<Self, &'static str>
pub fn try_from_iterator<T>(keys: T) -> Result<Self, &'static str>
Try to construct the filter from a key iterator. Can be used directly if you don’t have a contiguous array of u64 keys.
Note: the iterator will be iterated over multiple times while building the filter. If using a hash function to map the key, it may be cheaper just to create a scratch array of hashed keys that you pass in.