cumulfreqtable 0.1.0

Binary Indexed Tree frequency table
Documentation

A cumulative frequency table stores and/or compute the absolute frequency and the cumulative frequency for ever position in the table.

Formal definition from Wikipedia:

In statistics, the frequency (or absolute frequency) of an event i is the number nᵢ of times the observation has occurred/recorded in an experiment or study. The cumulative frequency is the total of the absolute frequencies of all events at or below a certain point in an ordered list of events.

Example

use cumulfreqtable::CumulFreqTable; // Import the trait in scope.

let mut table = cumulfreqtable::BinaryIndexedTree::new(16);
table.inc(0);
table.inc(3);
table.add(5, 3);

assert_eq!(table.freq(0), 1);
assert_eq!(table.freq(3), 1);
assert_eq!(table.freq(5), 3);
assert_eq!(table.freq(6), 0);

assert_eq!(table.sum(0), 1);
assert_eq!(table.sum(3), 2);
assert_eq!(table.sum(5), 5);
assert_eq!(table.sum(6), 5);

assert_eq!(table.total(), 5);

Implementations

This crate offers two implementations of the [CumulFreqTable] trait:

  • [FreqTable]: stores the absolute frequency of every positions in an array O(1) but computes the cumulative frequency in O(len). Best for small tables.
  • [BinaryIndexedTree]: stores the cumulative frequency of every positions in a binary indexed tree. The runtime complexity is O(㏒₂ len) for all operations.

There is also [cumulfreq_array::CumulFreqTable] that computes and stores the cumulative frequency of every positions on update. It's only purpose is to validate the benchmark results.

Benchmarks

For small tables, [FreqTable] is slightly faster than [BinaryIndexedTree], presumably because it fits within the CPU cache. The cross-over point depends on the computer.

You can run the benchmarks yourself with cargo criterion.

Notice that the graphs below have logarithmic scales.

Intel i7-4790K (2014)

On a 2014 Intel i7-4790K with dual-channel DDR3 1600 Mhz, the binary_indexed_tree becomes faster somewhere above 256×64 bits elements.

AMD Ryzen 7 PRO 5850U (2021)

On a 2021 AMD Ryzen 7 PRO 5850U with dual-channel DDR4 3200 Mhz, the binary_indexed_tree becomes faster somewhere above 512×64 bits elements.