Struct Top 

pub struct Top<A, C: New> { /* private fields */ }

This probabilistic data structure tracks the top n keys in a stream of (key, value) tuples, ordered by the sum of the values seen for each key (the “aggregated value”). It uses only O(n) space.

Its implementation has two parts:

  • a doubly linked hash map from the top n keys to their aggregated values, ordered by aggregated value. It tracks the aggregated values of the top n keys more precisely and reduces collisions in the count-min sketch.
  • a count-min sketch that tracks all of the keys outside the top n. This data structure is closely related to the counting Bloom filter. It uses conservative updating for increased accuracy.
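
To make the second part concrete, here is a minimal, self-contained sketch of a count-min sketch with conservative updating. It is an illustration only and not this crate's implementation: the type `CountMinSketch`, its `depth`/`width` parameters, the use of `DefaultHasher` seeded with the row index, and the `u64` counters are all assumptions made for the example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A toy count-min sketch (hypothetical type): `depth` rows of `width` counters.
struct CountMinSketch {
    width: usize,
    rows: Vec<Vec<u64>>,
}

impl CountMinSketch {
    fn new(depth: usize, width: usize) -> Self {
        assert!(depth > 0 && width > 0, "depth and width must be non-zero");
        CountMinSketch { width, rows: vec![vec![0; width]; depth] }
    }

    /// Column of `key` in row `row`; the row index acts as the hash seed.
    fn column<K: Hash>(&self, row: usize, key: &K) -> usize {
        let mut hasher = DefaultHasher::new();
        row.hash(&mut hasher);
        key.hash(&mut hasher);
        (hasher.finish() % self.width as u64) as usize
    }

    /// The estimate is the minimum counter over all rows.
    /// Collisions can inflate it, but it never underestimates.
    fn estimate<K: Hash>(&self, key: &K) -> u64 {
        (0..self.rows.len())
            .map(|row| self.rows[row][self.column(row, key)])
            .min()
            .unwrap_or(0)
    }

    /// Conservative update: raise each counter only as far as the new
    /// estimate, instead of adding `value` to every counter.
    fn add<K: Hash>(&mut self, key: &K, value: u64) -> u64 {
        let target = self.estimate(key) + value;
        for row in 0..self.rows.len() {
            let col = self.column(row, key);
            let cell = &mut self.rows[row][col];
            *cell = (*cell).max(target);
        }
        target
    }
}

fn main() {
    let mut sketch = CountMinSketch::new(4, 64);
    sketch.add(&"apple", 3);
    sketch.add(&"apple", 2);
    sketch.add(&"pear", 1);
    // Estimates are upper bounds on the true aggregated values.
    println!("apple ~ {}", sketch.estimate(&"apple"));
    println!("pear  ~ {}", sketch.estimate(&"pear"));
}
```

Because counters that already exceed the new estimate are left untouched, a colliding key is inflated less than it would be under the plain "add to every row" update.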

The algorithm is as follows:

while a key and value arrive from the input stream:
    if H[key] exists
        add value to the aggregated value H[key]
    else if number of items in H < n
        put key into H with its value
    else
        add value to C[key] in the count-min sketch
        if the aggregated value C[key] is > the lowest aggregated value in H
            move the lowest key and its value from H into C
            move key and its value from C into H
endwhile

See “An Improved Data Stream Summary: The Count-Min Sketch and its Applications” and “New Directions in Traffic Measurement and Accounting” for background on the count-min sketch with conservative updating.
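
The update loop described above can be sketched in plain Rust. This is a simplified illustration, not the crate's code: the hypothetical `TopN` type uses an ordinary `HashMap` with a linear scan in place of the ordered doubly linked hash map, and an exact `HashMap` named `tail` stands in for the count-min sketch C, so it does not have the O(n) space bound of the real structure.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `Top`, following the update loop literally.
struct TopN {
    n: usize,
    top: HashMap<String, u64>,  // H: the tracked top keys
    tail: HashMap<String, u64>, // stand-in for C (exact, so not O(n) space)
}

impl TopN {
    fn new(n: usize) -> Self {
        TopN { n, top: HashMap::new(), tail: HashMap::new() }
    }

    fn push(&mut self, key: &str, value: u64) {
        // Case 1: the key is already tracked in H.
        if let Some(total) = self.top.get_mut(key) {
            *total += value;
            return;
        }
        // Case 2: H still has room.
        if self.top.len() < self.n {
            self.top.insert(key.to_owned(), value);
            return;
        }
        // Case 3: aggregate the key outside the top n.
        let candidate = {
            let total = self.tail.entry(key.to_owned()).or_insert(0);
            *total += value;
            *total
        };
        // Lowest entry in H (ties broken by key to keep the sketch deterministic).
        let lowest = self
            .top
            .iter()
            .min_by(|a, b| a.1.cmp(b.1).then_with(|| a.0.cmp(b.0)))
            .map(|(k, v)| (k.clone(), *v));
        if let Some((low_key, low_value)) = lowest {
            if candidate > low_value {
                // Swap: demote the lowest key into C, promote this key into H.
                self.top.remove(&low_key);
                self.tail.insert(low_key, low_value);
                self.tail.remove(key);
                self.top.insert(key.to_owned(), candidate);
            }
        }
    }

    /// Tracked keys in descending order of aggregated value.
    fn iter_desc(&self) -> Vec<(&str, u64)> {
        let mut entries: Vec<(&str, u64)> =
            self.top.iter().map(|(k, v)| (k.as_str(), *v)).collect();
        entries.sort_by(|a, b| b.1.cmp(&a.1).then(a.0.cmp(b.0)));
        entries
    }
}

fn main() {
    let mut top = TopN::new(2);
    for (key, value) in [("a", 5), ("b", 3), ("c", 2), ("c", 2), ("a", 1)] {
        top.push(key, value);
    }
    // "c" reaches 4 on its second visit and displaces "b" (3).
    assert_eq!(top.iter_desc(), vec![("a", 6), ("c", 4)]);
    println!("{:?}", top.iter_desc());
}
```

In this sketch a demoted key keeps its aggregated value in `tail`, so it can be promoted again later if it overtakes the new lowest entry.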

Implementations

impl<A: Hash + Eq + Clone, C: Ord + New + for<'a> UnionAssign<&'a C> + Intersect> Top<A, C>

pub fn new(n: usize, probability: f64, tolerance: f64, config: <C as New>::Config) -> Self

Creates an empty Top data structure with capacity to track the top n keys.

pub fn capacity(&self) -> usize

Returns n, the number of most frequent elements this structure has capacity to track.

pub fn push<V: ?Sized>(&mut self, item: A, value: &V)

“Visits” an element: adds value to the aggregated value of item.

pub fn clear(&mut self)

Clears the Top data structure, as if it were new.

pub fn iter(&self) -> TopIter<'_, A, C>

An iterator visiting all elements and their counts in descending order of frequency. The iterator element type is (&'a A, usize).

Trait Implementations

impl<A: Hash + Eq + Clone, C: Ord + New + Clone + for<'a> AddAssign<&'a C> + for<'a> UnionAssign<&'a C> + Intersect + IntersectPlusUnionIsPlus> Add for Top<A, C>

type Output = Top<A, C>

The resulting type after applying the + operator.

fn add(self, other: Self) -> Self

Performs the + operation.

impl<A: Hash + Eq + Clone, C: Ord + New + Clone + for<'a> AddAssign<&'a C> + for<'a> UnionAssign<&'a C> + Intersect + IntersectPlusUnionIsPlus> AddAssign for Top<A, C>

fn add_assign(&mut self, other: Self)

Performs the += operation.

impl<A: Clone, C: Clone + New> Clone for Top<A, C>

fn clone(&self) -> Top<A, C>

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl<A: Hash + Eq + Clone + Debug, C: Ord + New + Clone + for<'a> UnionAssign<&'a C> + Intersect + Debug> Debug for Top<A, C>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<'de, A, C> Deserialize<'de> for Top<A, C>
where A: Hash + Eq + Deserialize<'de>, C: Deserialize<'de> + New, <C as New>::Config: Deserialize<'de>,

fn deserialize<__D>(__deserializer: __D) -> Result<Self, __D::Error>
where __D: Deserializer<'de>,

Deserialize this value from the given Serde deserializer.

impl<A, C> Serialize for Top<A, C>
where A: Hash + Eq + Serialize, C: Serialize + New, <C as New>::Config: Serialize,

fn serialize<__S>(&self, __serializer: __S) -> Result<__S::Ok, __S::Error>
where __S: Serializer,

Serialize this value into the given Serde serializer.

impl<A: Hash + Eq + Clone, C: Ord + New + Clone + for<'a> AddAssign<&'a C> + for<'a> UnionAssign<&'a C> + Intersect + IntersectPlusUnionIsPlus> Sum<Top<A, C>> for Option<Top<A, C>>

fn sum<I>(iter: I) -> Self
where I: Iterator<Item = Top<A, C>>,

Takes an iterator and generates Self from the elements by “summing up” the items.
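
The shape of this impl, summing into an `Option`, can be illustrated with a small stand-in type. The sketch below is not the crate's code: `Counts` is a hypothetical exact counter map used in place of `Top<A, C>`, and the `counts` helper exists only for the example. One plausible reason for the `Option` (an inference, not something the source states) is that an empty iterator provides no n or `Config` from which to build a result, so `None` is returned instead.

```rust
use std::collections::HashMap;
use std::iter::Sum;
use std::ops::{Add, AddAssign};

/// Hypothetical stand-in for a mergeable summary such as `Top<A, C>`.
#[derive(Clone, Debug, PartialEq)]
struct Counts(HashMap<String, u64>);

impl AddAssign for Counts {
    /// Merge `other` into `self` by adding values key by key.
    fn add_assign(&mut self, other: Self) {
        for (key, value) in other.0 {
            *self.0.entry(key).or_insert(0) += value;
        }
    }
}

impl Add for Counts {
    type Output = Counts;
    fn add(mut self, other: Self) -> Self {
        self += other;
        self
    }
}

impl Sum<Counts> for Option<Counts> {
    /// `None` for an empty iterator, otherwise the merged summary.
    fn sum<I: Iterator<Item = Counts>>(iter: I) -> Self {
        iter.reduce(|acc, next| acc + next)
    }
}

/// Example-only helper that builds a `Counts` from literal pairs.
fn counts(pairs: &[(&str, u64)]) -> Counts {
    Counts(pairs.iter().map(|(k, v)| (k.to_string(), *v)).collect())
}

fn main() {
    // For example, one summary per shard of the input stream.
    let shards = vec![counts(&[("a", 2), ("b", 1)]), counts(&[("a", 3)])];
    let total: Option<Counts> = shards.into_iter().sum();
    assert_eq!(total, Some(counts(&[("a", 5), ("b", 1)])));

    let empty: Option<Counts> = Vec::<Counts>::new().into_iter().sum();
    assert_eq!(empty, None);
}
```

With this pattern, per-shard summaries can be combined with `Iterator::sum`, and the caller decides how to handle the case of no input.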

Auto Trait Implementations

impl<A, C> Freeze for Top<A, C>
where <C as New>::Config: Freeze,

impl<A, C> RefUnwindSafe for Top<A, C>

impl<A, C> Send for Top<A, C>
where <C as New>::Config: Send, A: Send, C: Send,

impl<A, C> Sync for Top<A, C>
where <C as New>::Config: Sync, A: Sync, C: Sync,

impl<A, C> Unpin for Top<A, C>
where <C as New>::Config: Unpin, A: Unpin, C: Unpin,

impl<A, C> UnwindSafe for Top<A, C>
where <C as New>::Config: UnwindSafe, A: UnwindSafe, C: UnwindSafe,

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

fn vzip(self) -> V

impl<T> DeserializeOwned for T
where T: for<'de> Deserialize<'de>,