Crate cmsketch

Crate cmsketch 

Source
Expand description

A probabilistic counting data structure that never undercounts items before it hits counter’s capacity. It is a table structure with the depth being the number of hashes and the width being the number of unique items. When a key is inserted, each row’s hash function is used to generate the index for that row. Then the element’s count at that index is incremented. As a result, one key being inserted will increment different indices in each row. Querying the count returns the minimum values of these elements since some hashes might collide.

Users are supposed to synchronize concurrent accesses to the data structure.

E.g. insert(1) hash1(1) = 2 -> increment row 1, index 2 hash2(1) = 5 -> increment row 2, index 5 hash3(1) = 3 -> increment row 3, index 3 etc.

§Usage

use cmsketch::CMSketchU32;

const ERROR: f64 = 0.01;
const CONFIDENCE: f64 = 0.95;

fn main() {
    let mut cms = CMSketchU32::new(ERROR, CONFIDENCE);
    for i in 0..10 {
        for _ in 0..i {
            cms.inc(i);
        }
    }
     
    for i in 0..10 {
        assert!(cms.estimate(i) >= i as u32);
    }
     
    cms.halve();
     
    for i in 0..10 {
        assert!(cms.estimate(i) >= (i as f64 * 0.5) as u32);
    }
}

Structs§

CMSketchAtomicU8
Count-Min Sketch that stores u8 counters using atomics for concurrent updates.
CMSketchAtomicU16
Count-Min Sketch that stores u16 counters using atomics for concurrent updates.
CMSketchAtomicU32
Count-Min Sketch that stores u32 counters using atomics for concurrent updates.
CMSketchAtomicU64
Count-Min Sketch that stores u64 counters using atomics for concurrent updates.
CMSketchAtomicUsize
Count-Min Sketch that stores usize counters using atomics for concurrent updates.
CMSketchU8
Count-Min Sketch storing non-atomic u8 counters.
CMSketchU16
Count-Min Sketch storing non-atomic u16 counters.
CMSketchU32
Count-Min Sketch storing non-atomic u32 counters.
CMSketchU64
Count-Min Sketch storing non-atomic u64 counters.
CMSketchUsize
Count-Min Sketch storing non-atomic usize counters.