[][src]Crate c2_histograms

This library contains tools for simulating space efficient histograms. For the purposes of this library, a histogram is a map from labels to frequencies, both of which must be numerical.

We provide is a standard histogram implementation (StandardHistogram) which stores label-frequency pairs at face value using a HashMap.

Furthermore, we provide three additional histogram implementations that use either strictly less memory than StandardHistogram or use a fixed amount of memory. The optimizations are done across two dimensions, the label storage (referred to as "compaction") and the frequency storage (referred to as "compression"). Hence the name "C-Squared Histograms". The three implementations are as follows:

  • CompressedHistogram - This implementation uses strictly less space than a StandardHistogram by approximating the frequencies
  • CompactHistogram - This implementation consumes a fixed amount of space depending on a few precision parameters. It saves space by approximating labels.
  • C2Histogram - This implementation also utilizes a fixed amount of space, usually much less than a CompactHistogram. It approximates both labels and frequencies.

Each of these implementations is parameterized such that the trade-off between precision and memory performance is directly controlled by the user.

To construct a StandardHistogram, use the function create_standard_histogram. Then, the other histograms can be derived from that standard histogram using the conversion functions to_compact, to_compressed, or to_c2.



Space saving histogram implementations


General trait definitions and StandardHistogram implementation