Crate dsrs

dsrs contains bindings for a subset of Apache DataSketches.

Modules

counters
Stateful reducers that maintain distinct-count and heavy-hitter sketches, intended for the dsrs command-line tool, which deduplicates byte lines of input.
stream_reducer
A small abstraction for reducing over byte lines from a stream, used by the dsrs command-line tool.
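
To make the idea of "reducing over byte lines from a stream" concrete, here is a minimal standard-library sketch. It is not the dsrs API: the trait `LineReducer`, the struct `ExactDistinct`, and the function `reduce_lines` are hypothetical names invented for this illustration, and the reducer keeps an exact `HashSet` where the real crate would presumably keep a sketch.

```rust
use std::collections::HashSet;
use std::io::{self, BufRead};

/// Hypothetical trait (not part of dsrs): something that consumes
/// one line of raw bytes at a time and keeps running state.
trait LineReducer {
    fn read_line(&mut self, line: &[u8]);
}

/// Exact distinct-line counter, standing in for a sketch-backed reducer.
#[derive(Default)]
struct ExactDistinct {
    seen: HashSet<Vec<u8>>,
}

impl LineReducer for ExactDistinct {
    fn read_line(&mut self, line: &[u8]) {
        // Only allocate a copy for lines we have not seen before.
        if !self.seen.contains(line) {
            self.seen.insert(line.to_vec());
        }
    }
}

/// Feed every newline-delimited byte line of `reader` to `reducer`.
/// Lines are treated as bytes, so invalid UTF-8 is not an error.
fn reduce_lines<R: BufRead, T: LineReducer>(reader: R, mut reducer: T) -> io::Result<T> {
    for line in reader.split(b'\n') {
        reducer.read_line(&line?);
    }
    Ok(reducer)
}
```

Calling `reduce_lines(std::io::stdin().lock(), ExactDistinct::default())` would count distinct lines on standard input. Swapping the `HashSet` for a bounded-size sketch trades exactness for constant memory, which is the trade the sketches in this crate are built around.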

Structs

CpcSketch
The Compressed Probability Counting (CPC) sketch is a dynamically resizing (but still bounded-size) distinct count sketch. Some differences between CPC and the more typical HLL++ are:
CpcUnion
HhSketch
The Heavy Hitter (HH) sketch computes an approximate set of the heavy hitters, the items in a data stream which appear most often. Along with each proposed approximate heavy hitter, the sketch can provide an estimate of the number of its appearances.
StaticThetaSketch
ThetaIntersection
ThetaSketch
The Theta sketch is, essentially, an adaptive random sample of a stream. As a result, it can be used to estimate distinct counts, and sketches can be combined to estimate the distinct counts of unions, intersections, and differences of streams.
ThetaUnion
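
The "adaptive random sample" idea behind the Theta sketch can be illustrated with a toy K-minimum-values sketch written against the standard library only. This is a simplified stand-in under stated assumptions, not the DataSketches algorithm and not the dsrs API: `ToyTheta` and all of its methods are hypothetical names, and `DefaultHasher` is used purely for convenience.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeSet;
use std::hash::{Hash, Hasher};

/// Toy KMV-style sketch (hypothetical; not dsrs's ThetaSketch).
/// It retains at most `k` hash values, all strictly below `theta`.
#[derive(Clone)]
struct ToyTheta {
    k: usize,
    theta: u64,
    hashes: BTreeSet<u64>,
}

impl ToyTheta {
    fn new(k: usize) -> Self {
        // theta starts at the top of the hash range: every item is sampled.
        ToyTheta { k, theta: u64::MAX, hashes: BTreeSet::new() }
    }

    fn update(&mut self, item: &[u8]) {
        let mut hasher = DefaultHasher::new();
        item.hash(&mut hasher);
        self.insert_hash(hasher.finish());
    }

    fn insert_hash(&mut self, x: u64) {
        if x >= self.theta {
            return; // outside the current sample
        }
        self.hashes.insert(x);
        if self.hashes.len() > self.k {
            // Over capacity: evict the largest hash and lower theta to it,
            // which shrinks the sampling rate (the "adaptive" part).
            let largest = *self.hashes.iter().next_back().unwrap();
            self.hashes.remove(&largest);
            self.theta = largest;
        }
    }

    /// Retained count divided by the sampled fraction of the hash space.
    fn estimate(&self) -> f64 {
        let fraction = self.theta as f64 / u64::MAX as f64;
        self.hashes.len() as f64 / fraction
    }

    /// Union: take the smaller theta and keep hashes from both sides below it.
    fn union(&self, other: &ToyTheta) -> ToyTheta {
        let mut out = ToyTheta::new(self.k.min(other.k));
        out.theta = self.theta.min(other.theta);
        for &x in self.hashes.iter().chain(other.hashes.iter()) {
            out.insert_hash(x);
        }
        out
    }
}
```

While fewer than `k` distinct items have been seen, `theta` stays at its maximum and the estimate is exact. Past that point the sketch keeps a uniform sample of the hash space, so the estimate becomes approximate, with a relative error that shrinks roughly as one over the square root of `k`. Intersections and differences follow the same pattern as `union`: apply the set operation to the retained hashes below the shared `theta`.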