Expand description
A probabilistic counting data structure that never undercounts items before it hits counter’s capacity. It is a table structure with the depth being the number of hashes and the width being the number of unique items. When a key is inserted, each row’s hash function is used to generate the index for that row. Then the element’s count at that index is incremented. As a result, one key being inserted will increment different indices in each row. Querying the count returns the minimum values of these elements since some hashes might collide.
Users are supposed to synchronize concurrent accesses to the data structure.
E.g. insert(1) hash1(1) = 2 -> increment row 1, index 2 hash2(1) = 5 -> increment row 2, index 5 hash3(1) = 3 -> increment row 3, index 3 etc.
§Usage
use cmsketch::CMSketchU32;
const ERROR: f64 = 0.01;
const CONFIDENCE: f64 = 0.95;
fn main() {
let mut cms = CMSketchU32::new(ERROR, CONFIDENCE);
for i in 0..10 {
for _ in 0..i {
cms.inc(i);
}
}
for i in 0..10 {
assert!(cms.estimate(i) >= i as u32);
}
cms.halve();
for i in 0..10 {
assert!(cms.estimate(i) >= (i as f64 * 0.5) as u32);
}
}Structs§
- CMSketch
Atomic U8 - Count-Min Sketch that stores
u8counters using atomics for concurrent updates. - CMSketch
Atomic U16 - Count-Min Sketch that stores
u16counters using atomics for concurrent updates. - CMSketch
Atomic U32 - Count-Min Sketch that stores
u32counters using atomics for concurrent updates. - CMSketch
Atomic U64 - Count-Min Sketch that stores
u64counters using atomics for concurrent updates. - CMSketch
Atomic Usize - Count-Min Sketch that stores
usizecounters using atomics for concurrent updates. - CMSketch
U8 - Count-Min Sketch storing non-atomic
u8counters. - CMSketch
U16 - Count-Min Sketch storing non-atomic
u16counters. - CMSketch
U32 - Count-Min Sketch storing non-atomic
u32counters. - CMSketch
U64 - Count-Min Sketch storing non-atomic
u64counters. - CMSketch
Usize - Count-Min Sketch storing non-atomic
usizecounters.