pub struct TopK<T, S = DefaultHashBuilder> { /* private fields */ }alloc only.Expand description
Tracks the k most frequent items in a stream using bounded memory.
A Top-K tracker combines a CountMinSketch — which
estimates how often any item has been seen — with a small list of the k
highest-frequency items observed so far. Every insertion updates the sketch
and, when the item’s estimated count is high enough, promotes it into the
monitored list, evicting the current minimum. Memory is bounded by the sketch
size plus k stored keys, regardless of how many distinct items flow
through.
The estimate of which items are in the top set is approximate: under heavy churn a true heavy hitter can be missed if it never accumulates a high enough estimate while resident, but frequent, stable items are reported reliably.
Unlike the other structures, TopK stores the keys themselves, so the item
type must be Eq + Hash and sized. A key is moved into the monitored list
only when it is promoted, so non-promoted insertions store nothing.
§Examples
use bloom_lib::TopK;
let mut top = TopK::new(3, 0.001, 0.001).unwrap();
for _ in 0..100 { top.insert("apple"); }
for _ in 0..50 { top.insert("banana"); }
for _ in 0..10 { top.insert("cherry"); }
for _ in 0..1 { top.insert("date"); }
let ranked = top.top();
assert_eq!(ranked[0].0, &"apple");
assert_eq!(ranked[1].0, &"banana");
assert_eq!(ranked.len(), 3); // only k items are keptImplementations§
Source§impl<T> TopK<T, DefaultHashBuilder>
impl<T> TopK<T, DefaultHashBuilder>
Sourcepub fn new(k: usize, epsilon: f64, delta: f64) -> Result<Self, Error>
pub fn new(k: usize, epsilon: f64, delta: f64) -> Result<Self, Error>
Creates a tracker for the top k items, with an underlying sketch sized
for error epsilon at confidence 1 - delta, using the default hasher.
§Parameters
k: how many top items to keep. Must be non-zero.epsilon: the sketch error factor, in(0.0, 1.0). Smaller is more accurate and uses more memory.delta: the sketch failure probability, in(0.0, 1.0).
§Errors
Returns Error::InvalidParameter if k is zero, or if epsilon or
delta is not a finite value in (0.0, 1.0).
§Examples
use bloom_lib::TopK;
let top = TopK::<&str>::new(10, 0.001, 0.001).unwrap();
assert_eq!(top.k(), 10);
assert!(top.is_empty());Source§impl<T, S> TopK<T, S>
impl<T, S> TopK<T, S>
Sourcepub fn with_hasher(
k: usize,
epsilon: f64,
delta: f64,
hasher: S,
) -> Result<Self, Error>
pub fn with_hasher( k: usize, epsilon: f64, delta: f64, hasher: S, ) -> Result<Self, Error>
Creates a tracker with a caller-supplied hasher.
§Errors
Returns Error::InvalidParameter if k is zero, or if epsilon or
delta is not a finite value in (0.0, 1.0).
§Examples
use std::collections::hash_map::RandomState;
use bloom_lib::TopK;
let top: TopK<&str, RandomState> =
TopK::with_hasher(10, 0.001, 0.001, RandomState::new()).unwrap();Sourcepub fn insert(&mut self, item: T)
pub fn insert(&mut self, item: T)
Records one occurrence of item, updating the sketch and the top set.
The key is moved into the monitored list only if it is promoted into the
top k; a non-promoted insertion stores nothing beyond the sketch
update.
§Examples
use bloom_lib::TopK;
let mut top = TopK::new(2, 0.01, 0.01).unwrap();
top.insert("x");
top.insert("x");
assert_eq!(top.estimate(&"x"), 2);Sourcepub fn estimate(&self, item: &T) -> u64
pub fn estimate(&self, item: &T) -> u64
Returns the estimated frequency of item from the underlying sketch.
This works for any item, whether or not it is in the top set.
§Examples
use bloom_lib::TopK;
let mut top = TopK::new(5, 0.001, 0.001).unwrap();
for _ in 0..7 { top.insert("seven"); }
assert!(top.estimate(&"seven") >= 7);
assert_eq!(top.estimate(&"absent"), 0);Sourcepub fn top(&self) -> Vec<(&T, u64)>
pub fn top(&self) -> Vec<(&T, u64)>
Returns the monitored items paired with their estimated counts, ordered from most to least frequent.
The returned vector borrows the keys, so no key cloning occurs. Its
length is at most k.
§Examples
use bloom_lib::TopK;
let mut top = TopK::new(2, 0.01, 0.01).unwrap();
for _ in 0..5 { top.insert("a"); }
for _ in 0..3 { top.insert("b"); }
let ranked = top.top();
assert_eq!(ranked[0].0, &"a");
assert_eq!(ranked[1].0, &"b");