pub struct CountMinSketch<T: ?Sized, S = DefaultHashBuilder> { /* private fields */ }alloc only.Expand description
A sublinear-space frequency estimator.
A Count-Min Sketch counts how many times it has seen each item using a fixed
grid of counters, far smaller than a real HashMap<T, u64> would be. Each
item is hashed into one counter per row and those counters are incremented;
the estimate for an item is the minimum across its rows. The estimate never
undercounts and overcounts by a bounded amount with high probability — the
width controls the error magnitude epsilon, the depth controls the
confidence delta.
The sketch is generic over the item type T and a
BuildHasher S, defaulting to the deterministic
DefaultHashBuilder.
§Examples
use bloom_lib::CountMinSketch;
// ~0.1% error with 99.9% confidence.
let mut sketch = CountMinSketch::new(0.001, 0.001).unwrap();
sketch.increment("apple");
sketch.add("apple", 4);
sketch.increment("banana");
assert!(sketch.estimate("apple") >= 5); // never undercounts
assert_eq!(sketch.estimate("cherry"), 0);Implementations§
Source§impl<T: ?Sized> CountMinSketch<T, DefaultHashBuilder>
impl<T: ?Sized> CountMinSketch<T, DefaultHashBuilder>
Sourcepub fn new(epsilon: f64, delta: f64) -> Result<Self, Error>
pub fn new(epsilon: f64, delta: f64) -> Result<Self, Error>
Creates a sketch sized for an error factor epsilon at confidence
1 - delta, using the default hasher.
The estimate for any item never undercounts and, with probability at
least 1 - delta, overcounts by at most epsilon * N, where N is the
total of all counts added. Smaller epsilon widens the grid; smaller
delta deepens it.
§Parameters
epsilon: the relative error factor. Must be in(0.0, 1.0). The width isceil(e / epsilon).delta: the failure probability. Must be in(0.0, 1.0). The depth isceil(ln(1 / delta)).
§Errors
Returns Error::InvalidParameter if either argument is not a finite
value in the open interval (0.0, 1.0).
§Examples
use bloom_lib::CountMinSketch;
let sketch = CountMinSketch::<&str>::new(0.01, 0.01).unwrap();
assert!(sketch.width() >= 271);
assert!(sketch.depth() >= 4);Sourcepub fn with_dimensions(width: usize, depth: usize) -> Result<Self, Error>
pub fn with_dimensions(width: usize, depth: usize) -> Result<Self, Error>
Creates a sketch with an explicit width and depth, using the default
hasher.
§Parameters
width: counters per row. Must be non-zero.depth: number of rows (independent hashes). Must be non-zero.
§Errors
Returns Error::InvalidParameter if either argument is zero.
§Examples
use bloom_lib::CountMinSketch;
let sketch = CountMinSketch::<u32>::with_dimensions(2_048, 5).unwrap();
assert_eq!(sketch.width(), 2_048);
assert_eq!(sketch.depth(), 5);Source§impl<T: ?Sized, S: BuildHasher> CountMinSketch<T, S>
impl<T: ?Sized, S: BuildHasher> CountMinSketch<T, S>
Sourcepub fn with_hasher(epsilon: f64, delta: f64, hasher: S) -> Result<Self, Error>
pub fn with_hasher(epsilon: f64, delta: f64, hasher: S) -> Result<Self, Error>
Creates a sketch from epsilon/delta with a caller-supplied hasher.
§Errors
Returns Error::InvalidParameter if either argument is not a finite
value in (0.0, 1.0).
§Examples
use std::collections::hash_map::RandomState;
use bloom_lib::CountMinSketch;
let sketch: CountMinSketch<&str, RandomState> =
CountMinSketch::with_hasher(0.01, 0.01, RandomState::new()).unwrap();Sourcepub fn with_dimensions_and_hasher(
width: usize,
depth: usize,
hasher: S,
) -> Result<Self, Error>
pub fn with_dimensions_and_hasher( width: usize, depth: usize, hasher: S, ) -> Result<Self, Error>
Creates a sketch with an explicit geometry and a caller-supplied hasher.
§Errors
Returns Error::InvalidParameter if width or depth is zero.
Sourcepub fn add(&mut self, item: &T, count: u64)where
T: Hash,
pub fn add(&mut self, item: &T, count: u64)where
T: Hash,
Records count additional occurrences of item.
Counters saturate at u64::MAX rather than overflowing, so an
adversarial or runaway stream degrades gracefully instead of panicking or
wrapping.
§Examples
use bloom_lib::CountMinSketch;
let mut sketch = CountMinSketch::new(0.01, 0.01).unwrap();
sketch.add("page-view", 250);
assert!(sketch.estimate("page-view") >= 250);Sourcepub fn increment(&mut self, item: &T)where
T: Hash,
pub fn increment(&mut self, item: &T)where
T: Hash,
Records a single occurrence of item. Shorthand for add(item, 1).
§Examples
use bloom_lib::CountMinSketch;
let mut sketch = CountMinSketch::new(0.01, 0.01).unwrap();
sketch.increment("hit");
sketch.increment("hit");
assert!(sketch.estimate("hit") >= 2);Sourcepub fn estimate(&self, item: &T) -> u64where
T: Hash,
pub fn estimate(&self, item: &T) -> u64where
T: Hash,
Estimates the number of times item has been added.
The estimate is the minimum counter across all rows. It never undercounts the true total and overcounts only by the sketch’s error bound.
§Examples
use bloom_lib::CountMinSketch;
let mut sketch = CountMinSketch::new(0.001, 0.001).unwrap();
for _ in 0..100 {
sketch.increment("frequent");
}
let estimate = sketch.estimate("frequent");
assert!((100..=110).contains(&estimate));Sourcepub fn total_count(&self) -> u64
pub fn total_count(&self) -> u64
The sum of every count added (saturating).
Unlike per-item estimates, this is exact up to saturation, because every
add contributes to it directly.
§Examples
use bloom_lib::CountMinSketch;
let mut sketch = CountMinSketch::new(0.01, 0.01).unwrap();
sketch.add("a", 3);
sketch.add("b", 7);
assert_eq!(sketch.total_count(), 10);Sourcepub fn clear(&mut self)
pub fn clear(&mut self)
Resets every counter to zero, retaining the allocation.
§Examples
use bloom_lib::CountMinSketch;
let mut sketch = CountMinSketch::new(0.01, 0.01).unwrap();
sketch.increment("x");
sketch.clear();
assert_eq!(sketch.estimate("x"), 0);
assert_eq!(sketch.total_count(), 0);Sourcepub fn merge(&mut self, other: &Self) -> Result<(), Error>
pub fn merge(&mut self, other: &Self) -> Result<(), Error>
Merges other into self by summing counters cell by cell (saturating).
After the merge, the sketch estimates frequencies as if every item from both sketches had been added to one. Both sketches must share their geometry.
§Errors
Returns Error::IncompatibleParameters if the two sketches differ in
width or depth.
§Examples
use bloom_lib::CountMinSketch;
let mut a = CountMinSketch::with_dimensions(512, 4).unwrap();
let mut b = CountMinSketch::with_dimensions(512, 4).unwrap();
a.add("shared", 2);
b.add("shared", 3);
a.merge(&b).unwrap();
assert!(a.estimate("shared") >= 5);Trait Implementations§
Source§impl<T: Clone + ?Sized, S: Clone> Clone for CountMinSketch<T, S>
impl<T: Clone + ?Sized, S: Clone> Clone for CountMinSketch<T, S>
Source§fn clone(&self) -> CountMinSketch<T, S>
fn clone(&self) -> CountMinSketch<T, S>
1.0.0 (const: unstable) · Source§fn clone_from(&mut self, source: &Self)
fn clone_from(&mut self, source: &Self)
source. Read more