Crate flowstats

§Flowstats

Production-grade streaming algorithms for Rust.

Flowstats provides high-performance implementations of probabilistic data structures and streaming algorithms, designed for real-time analytics and large-scale data processing.

§Features

  • Cardinality Estimation: Count distinct elements with HyperLogLog
  • Frequency Estimation: Track item frequencies with Count-Min Sketch
  • Heavy Hitters: Find top-K elements with Space-Saving
  • Quantile Estimation: Compute percentiles with t-digest
  • Full Mergeability: All sketches support distributed merge operations
  • Error Bounds: Formal guarantees on approximation accuracy
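To make the "Error Bounds" bullet concrete, the sketch below computes the textbook guarantees for two of the listed algorithms: the relative standard error of HyperLogLog (about 1.04 / sqrt(m) for m registers) and the Count-Min Sketch dimensions that keep overestimates within epsilon * N with probability at least 1 - delta. The function names are hypothetical and are not part of flowstats; whether the crate uses exactly these constants is an assumption.

```rust
/// Relative standard error of a standard HyperLogLog with 2^precision registers.
/// Hypothetical helper for illustration; not a flowstats function.
fn hll_standard_error(precision: u32) -> f64 {
    let registers = (1u64 << precision) as f64;
    1.04 / registers.sqrt()
}

/// Textbook Count-Min Sketch sizing: width = ceil(e / epsilon), depth = ceil(ln(1 / delta)).
/// With these dimensions an estimate exceeds the true count by more than
/// epsilon * N (N = total items added) with probability at most delta.
/// Hypothetical helper for illustration; not a flowstats function.
fn count_min_dimensions(epsilon: f64, delta: f64) -> (usize, usize) {
    assert!(epsilon > 0.0 && delta > 0.0 && delta < 1.0);
    let width = (std::f64::consts::E / epsilon).ceil() as usize;
    let depth = (1.0 / delta).ln().ceil() as usize;
    (width, depth)
}
```

For example, `hll_standard_error(14)` is roughly 0.8%, and `count_min_dimensions(0.001, 0.01)` gives 2719 counters per row across 5 rows.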

§Quick Start

use flowstats::prelude::*;

// Count distinct users
let mut hll = HyperLogLog::new(14);
for user_id in ["alice", "bob", "charlie", "alice"] {
    hll.insert(user_id);
}
println!("Distinct users: ~{}", hll.estimate());

// Track request latencies
let mut digest = TDigest::new(100.0);
for latency in [12.5, 45.2, 23.1, 67.8, 15.3] {
    digest.add(latency);
}
println!("p99 latency: {:?}", digest.quantile(0.99));
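When trying the Quick Start on small inputs, it can help to compare the sketch output against exact answers. The helpers below are a minimal baseline using only the standard library; the names `exact_distinct` and `exact_quantile` are hypothetical and not part of flowstats. The quantile uses the nearest-rank definition, so a t-digest, which typically interpolates, may legitimately return a slightly different value.

```rust
use std::collections::HashSet;

/// Exact distinct count: stores every item, so memory grows with cardinality.
fn exact_distinct(items: &[&str]) -> usize {
    items.iter().copied().collect::<HashSet<&str>>().len()
}

/// Exact nearest-rank quantile: sorts a copy of the data.
/// Returns None for empty input or q outside [0, 1].
fn exact_quantile(values: &[f64], q: f64) -> Option<f64> {
    if values.is_empty() || !(0.0..=1.0).contains(&q) {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    // Nearest rank is ceil(q * n), 1-based; q = 0 maps to the minimum.
    let rank = (q * sorted.len() as f64).ceil() as usize;
    Some(sorted[rank.max(1) - 1])
}
```

On the Quick Start data, the exact distinct count is 3 and the nearest-rank p99 of the five latencies is the maximum, 67.8.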
  

§Distributed Computing

All sketches implement the Sketch trait, which includes a merge operation, so sketches built independently on separate workers can be combined into a single result. merge returns a Result, which the example below unwraps:

use flowstats::cardinality::HyperLogLog;
use flowstats::traits::Sketch;

let mut worker1 = HyperLogLog::new(14);
let mut worker2 = HyperLogLog::new(14);

// Each worker processes its partition
worker1.insert("user_a");
worker2.insert("user_b");

// Merge results
worker1.merge(&worker2).unwrap();
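To illustrate why merging is lossless for HyperLogLog, here is a minimal, self-contained sketch of the register array only (no estimator). Merging takes the register-wise maximum, which yields the same registers as a single sketch that saw both partitions. `MiniHll` is a hypothetical type written for this example and is not the flowstats implementation; in particular, returning an error on a precision mismatch is an assumption about how such a merge would reasonably behave.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Illustrative HyperLogLog register array; not the flowstats type.
struct MiniHll {
    precision: u8,
    registers: Vec<u8>,
}

impl MiniHll {
    fn new(precision: u8) -> Self {
        assert!((4..=16).contains(&precision));
        Self { precision, registers: vec![0; 1usize << precision] }
    }

    fn insert<T: Hash + ?Sized>(&mut self, item: &T) {
        // DefaultHasher::new() uses fixed keys, so every sketch hashes identically.
        let mut hasher = DefaultHasher::new();
        item.hash(&mut hasher);
        let hash = hasher.finish();
        // The top `precision` bits select a register.
        let index = (hash >> (64 - self.precision)) as usize;
        // Rank is the 1-based position of the first 1-bit in the remaining bits.
        let rest = hash << self.precision;
        let rank = (rest.leading_zeros().min(64 - self.precision as u32) + 1) as u8;
        if rank > self.registers[index] {
            self.registers[index] = rank;
        }
    }

    /// Register-wise maximum; both sketches must use the same precision.
    fn merge(&mut self, other: &MiniHll) -> Result<(), String> {
        if self.precision != other.precision {
            return Err(format!(
                "precision mismatch: {} vs {}",
                self.precision, other.precision
            ));
        }
        for (mine, theirs) in self.registers.iter_mut().zip(&other.registers) {
            *mine = (*mine).max(*theirs);
        }
        Ok(())
    }
}
```

Because the maximum is commutative, associative, and idempotent, workers can merge in any order, and overlapping partitions do not double count.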

§Feature Flags

Algorithm families (pick what you need):

  • cardinality (default): HyperLogLog for distinct counting
  • frequency (default): Count-Min Sketch, Space-Saving (tbd)
  • quantiles (default): t-digest for percentiles
  • membership (default): Bloom filter
  • sampling: Reservoir and weighted sampling (tbd)
  • sets: Theta sketch for set operations (tbd)
  • statistics: Running moments, entropy (tbd)
  • full: Enable all algorithm families

Platform features:

  • std (default): Standard library support
  • serde: Enable serialization
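As a sketch of how these flags might be selected, the Cargo.toml fragment below disables the defaults and opts into a subset. The feature names come from the lists above; the version requirement is a placeholder, and whether the crate builds with this exact combination is an assumption.

```toml
[dependencies]
# "*" is a placeholder; pin the version you actually depend on.
# default-features = false drops the default set (including std),
# so std is re-enabled explicitly here.
flowstats = { version = "*", default-features = false, features = ["std", "cardinality", "quantiles", "serde"] }
```

Enabling `full` instead would pull in every algorithm family.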

§Re-exports

pub use cardinality::HyperLogLog; (feature: cardinality)
pub use quantiles::TDigest; (feature: quantiles)
pub use frequency::CountMinSketch; (feature: frequency)
pub use frequency::SpaceSaving; (features: frequency and std)
pub use membership::BloomFilter; (feature: membership)
pub use sampling::ReservoirSampler; (feature: sampling)
pub use statistics::RunningStats; (feature: statistics)

§Modules

cardinality (feature: cardinality)
Cardinality (distinct count) estimation algorithms
frequency (feature: frequency)
Frequency estimation algorithms
membership (feature: membership)
Membership testing data structures
prelude
quantiles (feature: quantiles)
Quantile estimation algorithms
sampling (feature: sampling)
Stream sampling algorithms
statistics (feature: statistics)
Statistical summaries for streaming data
traits
Core traits for streaming algorithms