Struct streaming_algorithms::Top

pub struct Top<A: Hash + Eq + Clone, C: Ord + New + for<'a> UnionAssign<&'a C> + Intersect> { /* fields omitted */ }

This probabilistic data structure tracks the top n keys in a stream of (key, value) tuples, ordered by the sum of the values for each key (the "aggregated value"). It uses only O(n) space.

Its implementation has two parts:

  • a doubly linked hashmap, mapping the top n keys to their aggregated values and ordered by those values. This tracks the aggregated values of the top n keys more precisely and reduces collisions in the count-min sketch.
  • a count-min sketch to track all of the keys outside the top n. This data structure is closely related to a counting Bloom filter. It uses conservative updating for increased accuracy.
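As a rough illustration of the second part, here is a minimal count-min sketch with conservative updating. It is a sketch under stated assumptions, not the crate's implementation: the names `CountMin`, `depth`, `width`, `slot`, `estimate` and `add` are hypothetical, and values are fixed to `u64` rather than the generic `C` used by `Top`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Illustrative count-min sketch; all names here are assumptions for this example.
struct CountMin {
    depth: usize,       // number of rows (independent hash functions)
    width: usize,       // counters per row
    counters: Vec<u64>, // depth * width counters, stored row by row
}

impl CountMin {
    fn new(depth: usize, width: usize) -> Self {
        assert!(depth > 0 && width > 0);
        CountMin { depth, width, counters: vec![0; depth * width] }
    }

    /// Index of the counter for `key` in `row`; the row number seeds the hash.
    fn slot<K: Hash>(&self, row: usize, key: &K) -> usize {
        let mut hasher = DefaultHasher::new();
        row.hash(&mut hasher);
        key.hash(&mut hasher);
        row * self.width + (hasher.finish() as usize) % self.width
    }

    /// Estimate for `key`: the minimum of its counters, never below the true total.
    fn estimate<K: Hash>(&self, key: &K) -> u64 {
        (0..self.depth)
            .map(|row| self.counters[self.slot(row, key)])
            .min()
            .unwrap() // depth > 0, so there is always at least one row
    }

    /// Conservative update: raise each counter only as far as `estimate + value`,
    /// instead of adding `value` to every counter unconditionally.
    fn add<K: Hash>(&mut self, key: &K, value: u64) -> u64 {
        let target = self.estimate(key) + value;
        for row in 0..self.depth {
            let i = self.slot(row, key);
            if self.counters[i] < target {
                self.counters[i] = target;
            }
        }
        target
    }
}
```

Counters that already exceed the new estimate are left untouched, so collisions inflate other keys' estimates less than they would with plain addition.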

The algorithm is as follows, where H is the hashmap and C is the count-min sketch:

while a key and value from the input stream arrive:
    if H[key] exists
        add value to the aggregated value H[key]
    elsif number of items in H < n
        put key into H with its value
    else
        add value to C[key] in the count-min sketch
        if the aggregated value C[key] is > the lowest aggregated value in H
            move the lowest key and its aggregated value from H into C
            move key and its aggregated value from C into H
endwhile
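The loop above can be sketched in Rust roughly as follows. This is an illustration under simplifying assumptions, not the crate's code: `TopSketch` and its fields are hypothetical names, a plain `HashMap` with a linear scan stands in for the ordered linked hashmap, a single row of counters stands in for the count-min sketch, and keys and values are fixed to `String` and `u64`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Illustrative only: hypothetical names, not the crate's types.
struct TopSketch {
    n: usize,
    top: HashMap<String, u64>, // H: totals for the keys currently in the top n
    rest: Vec<u64>,            // C: one-row stand-in for the count-min sketch
}

impl TopSketch {
    fn new(n: usize, width: usize) -> Self {
        assert!(n > 0 && width > 0);
        TopSketch { n, top: HashMap::new(), rest: vec![0; width] }
    }

    /// Counter index for `key` in the stand-in sketch.
    fn slot(&self, key: &str) -> usize {
        let mut hasher = DefaultHasher::new();
        key.hash(&mut hasher);
        (hasher.finish() as usize) % self.rest.len()
    }

    fn push(&mut self, key: &str, value: u64) {
        if let Some(total) = self.top.get_mut(key) {
            *total += value; // key is already tracked in H
            return;
        }
        if self.top.len() < self.n {
            self.top.insert(key.to_string(), value); // H still has room
            return;
        }
        // Otherwise add the value to C and read back the estimate.
        let i = self.slot(key);
        self.rest[i] += value;
        let estimate = self.rest[i];
        // Linear scan for the lowest entry in H (an ordered map would avoid this).
        let (low_key, low_value) = self
            .top
            .iter()
            .min_by_key(|(_, v)| **v)
            .map(|(k, v)| (k.clone(), *v))
            .unwrap(); // H holds n > 0 entries at this point
        if estimate > low_value {
            // Demote the lowest key into C, then promote the new key into H.
            self.top.remove(&low_key);
            let j = self.slot(&low_key);
            self.rest[j] = self.rest[j].max(low_value); // assumption: raise, do not add
            self.top.insert(key.to_string(), estimate);
        }
    }
}
```

In this simplified version a promoted key's counter stays in `rest`, since a shared counter cannot be cleared without affecting other keys. How the crate handles that step is not shown here.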

See "An Improved Data Stream Summary: The Count-Min Sketch and its Applications" and "New Directions in Traffic Measurement and Accounting" for background on the count-min sketch with conservative updating.

Methods

impl<A: Hash + Eq + Clone, C: Ord + New + for<'a> UnionAssign<&'a C> + Intersect> Top<A, C>

Create an empty Top data structure with the specified capacity n.

The n most frequent elements we have capacity to track.

"Visit" an element.

Clears the Top data structure, as if it was new.

Important traits for TopIter<'a, A, C>

An iterator visiting all elements and their counts in descending order of frequency. The iterator element type is (&'a A, usize).

Trait Implementations

impl<A: Clone + Hash + Eq + Clone, C: Clone + Ord + New + for<'a> UnionAssign<&'a C> + Intersect> Clone for Top<A, C>

Returns a copy of the value.

Performs copy-assignment from source.

impl<A: Hash + Eq + Clone + Debug, C: Ord + New + Clone + for<'a> UnionAssign<&'a C> + Intersect + Debug> Debug for Top<A, C>

Formats the value using the given formatter.

impl<A: Hash + Eq + Clone, C: Ord + New + Clone + for<'a> AddAssign<&'a C> + for<'a> UnionAssign<&'a C> + Intersect> Sum for Top<A, C>

Method which takes an iterator and generates Self from the elements by "summing up" the items.

impl<A: Hash + Eq + Clone, C: Ord + New + Clone + for<'a> AddAssign<&'a C> + for<'a> UnionAssign<&'a C> + Intersect> Add for Top<A, C>

The resulting type after applying the + operator.

Performs the + operation.

impl<A: Hash + Eq + Clone, C: Ord + New + Clone + for<'a> AddAssign<&'a C> + for<'a> UnionAssign<&'a C> + Intersect> AddAssign for Top<A, C>

Performs the += operation.

Auto Trait Implementations

impl<A, C> Send for Top<A, C> where
    A: Send,
    C: Send,
    <C as New>::Config: Send

impl<A, C> Sync for Top<A, C> where
    A: Sync,
    C: Sync,
    <C as New>::Config: Sync