pub struct Fuse8 {
pub seed: u64,
pub segment_length: usize,
pub fingerprints: Box<[u8]>,
}
BinaryFuse8
Expand description
Xor filter using 8-bit fingerprints in a fuse graph. Requires less space than an Xor8
.
A Fuse8
filter uses <9.101 bits per entry of the set is it constructed from, and has a false
positive rate of <0.4%. As with other probabilistic filters, a higher number of entries decreases
the bits per entry but increases the false positive rate.
A Fuse8
filter uses less space and is faster to construct than an Xor8
filter, but
requires a large number of keys to be constructed. Experimentally, this number is somewhere
>100_000. For smaller key sets, prefer the Xor8
filter. A Fuse8
filter may fail to be
constructed.
A Fuse8
is constructed from a set of 64-bit unsigned integers and is immutable.
use xorf::{Filter, Fuse8};
use core::convert::TryFrom;
const SAMPLE_SIZE: usize = 1_000_000;
let keys: Vec<u64> = (0..SAMPLE_SIZE).map(|_| rng.gen()).collect();
let filter = Fuse8::try_from(&keys).unwrap();
// no false negatives
for key in keys {
assert!(filter.contains(&key));
}
// bits per entry
let bpe = (filter.len() as f64) * 8.0 / (SAMPLE_SIZE as f64);
assert!(bpe < 9.101, "Bits per entry is {}", bpe);
// false positive rate
let false_positives: usize = (0..SAMPLE_SIZE)
.map(|_| rng.gen())
.filter(|n| filter.contains(n))
.count();
let fp_rate: f64 = (false_positives * 100) as f64 / SAMPLE_SIZE as f64;
assert!(fp_rate < 0.406, "False positive rate is {}", fp_rate);
Serializing and deserializing Fuse8
filters can be enabled with the serde
feature (or [bincode
] for bincode).
Fields§
§seed: u64
BinaryFuse8
The seed for the filter
segment_length: usize
BinaryFuse8
The number of blocks in the filter
fingerprints: Box<[u8]>
BinaryFuse8
The fingerprints for the filter
Implementations§
source§impl Fuse8
impl Fuse8
sourcepub fn try_from_iterator<T>(keys: T) -> Result<Self, &'static str>
pub fn try_from_iterator<T>(keys: T) -> Result<Self, &'static str>
Try to construct the filter from a key iterator. Can be used directly if you don’t have a contiguous array of u64 keys.
Note: the iterator will be iterated over multiple times while building the filter. If using a hash function to map the key, it may be cheaper just to create a scratch array of hashed keys that you pass in.