Skip to main content

compression_estimate

Function compression_estimate 

Source
pub fn compression_estimate(data: &[u8]) -> f64
Expand description

Compression estimator (NIST-inspired diagnostic).

Uses Maurer’s universal statistic to estimate entropy via compression. Maurer’s f_n converges to the Shannon entropy rate, NOT min-entropy.

To convert to a min-entropy bound, we use the relationship: H∞ <= H_Shannon and apply a conservative correction. For IID data with alphabet size k=256: H∞ = -log2(p_max), H_Shannon = -sum(p_i * log2(p_i)) The gap between them grows with distribution skew. We use: H∞_est ≈ f_lower * (f_lower / log2(k)) which maps f_lower=log2(256)=8.0 → 8.0 (uniform) and compresses lower values quadratically, reflecting that low Shannon entropy implies even lower min-entropy.

Returns estimated min-entropy bits per sample.