Struct constriction::stream::model::LeakyQuantizer
pub struct LeakyQuantizer<F, Symbol, Probability, const PRECISION: usize> { /* private fields */ }
Quantizes probability distributions and represents them in fixed-point precision.
You will usually want to use this type through one of its type aliases,
DefaultLeakyQuantizer or SmallLeakyQuantizer, see discussion of
presets.
Examples
Quantizing Continuous Distributions
use constriction::{
    stream::{model::DefaultLeakyQuantizer, stack::DefaultAnsCoder, Encode, Decode},
    UnwrapInfallible,
};
// Create a quantizer that supports integer symbols from -5 to 20 (inclusively),
// using the "default" preset for `Probability` and `PRECISION`.
let quantizer = DefaultLeakyQuantizer::new(-5..=20);
// Quantize a normal distribution with mean 8.3 and standard deviation 4.1.
let continuous_distribution1 = probability::distribution::Gaussian::new(8.3, 4.1);
let entropy_model1 = quantizer.quantize(continuous_distribution1);
// You can reuse the same quantizer for more than one distribution, and the distributions don't
// even have to be of the same type (e.g., one can be a `Gaussian` and another a `Laplace`).
let continuous_distribution2 = probability::distribution::Laplace::new(-1.4, 2.7);
let entropy_model2 = quantizer.quantize(continuous_distribution2);
// Use the entropy models with an entropy coder.
let mut ans_coder = DefaultAnsCoder::new();
ans_coder.encode_symbol(4, entropy_model1).unwrap();
ans_coder.encode_symbol(-3, entropy_model2).unwrap();
// Decode symbols (in reverse order, since the `AnsCoder` is a stack) and verify correctness.
assert_eq!(ans_coder.decode_symbol(entropy_model2).unwrap_infallible(), -3);
assert_eq!(ans_coder.decode_symbol(entropy_model1).unwrap_infallible(), 4);
assert!(ans_coder.is_empty());
Quantizing a Discrete Distribution (That Has an Analytic Expression)
If you pass a discrete probability distribution to the method quantize then it no
longer needs to perform any quantization in the data space, but it will still perform
steps 2 and 3 in the list below, i.e., it will still convert to a “leaky” fixed-point
approximation that can be used by any of constriction’s stream codes. In the
following example, we’ll quantize a Binomial distribution (as discussed below, you
should not quantize a Categorical distribution since there are more efficient
specialized types for this use case).
use constriction::stream::{
    model::DefaultLeakyQuantizer, queue::DefaultRangeEncoder, Encode, Decode,
};
let distribution = probability::distribution::Binomial::new(1000, 0.1); // arguments: `n, p`
let quantizer = DefaultLeakyQuantizer::new(0..=1000); // natural support is `0..=n`
let entropy_model = quantizer.quantize(distribution);
// Let's use a Range Coder this time, just for fun (we could as well use an ANS Coder again).
let mut range_encoder = DefaultRangeEncoder::new();
// Encode a "typical" symbol from the distribution (i.e., one with non-negligible probability).
range_encoder.encode_symbol(107, entropy_model).unwrap();
// Due to the "leakiness" of the quantizer, the following still works despite the fact that
// the symbol `1000` has a ridiculously low probability under the binomial distribution.
range_encoder.encode_symbol(1000, entropy_model).unwrap();
// Decode symbols (in forward order, since range coding operates as a queue) and verify.
let mut range_decoder = range_encoder.into_decoder().unwrap();
assert_eq!(range_decoder.decode_symbol(entropy_model).unwrap(), 107);
assert_eq!(range_decoder.decode_symbol(entropy_model).unwrap(), 1000);
assert!(range_decoder.maybe_exhausted());
Detailed Description
A LeakyQuantizer is a builder of LeakilyQuantizedDistributions. It takes an
arbitrary probability distribution that implements the Distribution trait from the
crate probability and turns it into a LeakilyQuantizedDistribution by performing
the following three steps:
- quantization: lossless entropy coding can only be performed over discrete data. Any continuous (real-valued) data has to be approximated by some discrete set of points. If you provide a continuous distribution (i.e., a probability density function) to this builder, then it will quantize the data space by rounding values to the nearest integer. This step is optional, see below.
- approximation with fixed-point arithmetic: an entropy model that is used for compressing and decompressing has to be exactly invertible, so that its EncoderModel implementation is compatible with its DecoderModel implementation. The LeakilyQuantizedDistributions that are built by this builder represent probabilities and quantiles in fixed-point arithmetic with PRECISION bits. This allows them to avoid rounding errors when inverting the model, so that they can implement both EncoderModel and DecoderModel in such a way that one is the exact inverse of the other.
- introducing leakiness: naively approximating a probability distribution with fixed-point arithmetic could lead to problems: it could round some very small probabilities to zero. This would have the undesirable effect that the corresponding symbol then could no longer be encoded. This builder ensures that the LeakilyQuantizedDistributions that it creates assign a nonzero probability to all symbols within a user-defined range, so that these symbols can always be encoded, even if their probabilities under the original probability distribution are very low (or even zero); the sketch below demonstrates this guarantee.
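The following sketch demonstrates the guarantees from steps 2 and 3. It assumes that the EncoderModel trait’s method left_cumulative_and_probability returns Some fixed-point probability for symbols inside the support and None for symbols with zero probability:
use constriction::stream::model::{DefaultLeakyQuantizer, EncoderModel};
let quantizer = DefaultLeakyQuantizer::new(-5..=20);
let model = quantizer.quantize(probability::distribution::Gaussian::new(8.3, 4.1));
// "Leakiness": even the lowest symbol of the support, which lies more than three
// standard deviations below the mean, still has a nonzero probability ...
assert!(model.left_cumulative_and_probability(-5).is_some());
// ... while symbols outside of the support have exactly zero probability.
assert!(model.left_cumulative_and_probability(21).is_none());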
Continuous vs. Discrete Probability Distributions
The method quantize accepts both continuous probability distributions (i.e.,
probability density functions, such as Gaussian) and discrete distributions that are
defined only on (some) integers (i.e., probability mass functions, such as
Binomial). The resulting LeakilyQuantizedDistribution will always be a discrete
probability distribution. If the original probability distribution is continuous, then
the quantizer implicitly creates bins of size one by rounding to the nearest integer
(i.e., the bins range from i - 0.5 to i + 0.5 for each integer i). If the original
probability distribution is discrete then no rounding in the symbol space occurs, but
the quantizer still performs steps 2 and 3 above, i.e., it still rounds probabilities
and quantiles to fixed-point arithmetic in a way that ensures that all probabilities
within a user-defined range are nonzero.
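For example, the bin for the integer symbol 4 under a continuous model carries the probability mass between 3.5 and 4.5. A minimal sketch using only the probability crate (the actual LeakilyQuantizedDistribution additionally applies the fixed-point rounding and leakiness adjustments described above):
use probability::distribution::{Distribution, Gaussian};
let gaussian = Gaussian::new(8.3, 4.1);
// CDF difference across the bin from 3.5 to 4.5, i.e., the (approximate)
// probability mass that the quantized model assigns to the symbol `4`:
let mass_of_bin_4 = gaussian.distribution(4.5) - gaussian.distribution(3.5);
assert!(mass_of_bin_4 > 0.0 && mass_of_bin_4 < 1.0);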
Don’t Quantize Categorical Distributions, Though.
Although you can use a LeakyQuantizer for discrete probability distributions, you
should not use it for probability distributions of the type
probability::distribution::Categorical. While this will technically work, it will
lead to poor computational performance (and also to slightly suboptimal compression
efficiency). If you’re dealing with categorical distributions, use one of the dedicated
types ContiguousCategoricalEntropyModel, NonContiguousCategoricalEncoderModel,
NonContiguousCategoricalDecoderModel, or LookupDecoderModel instead.
By contrast, do use a LeakyQuantizer if the underlying probability Distribution
can be described by some analytic function (e.g., the function f(x) ∝ e^{-(x-μ)²/(2σ²)}
describing the bell curve of a Gaussian distribution, or the function
f_n(k) = (n choose k) · p^k · (1-p)^{n-k} describing the probability mass function of a binomial
distribution). For such parameterized distributions, both the cumulative distribution
function and its inverse can often be expressed as, or at least approximated by, some
analytic expression that can be evaluated in constant time, independent of the number of
possible symbols.
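For comparison, here is a sketch of the dedicated categorical route. It assumes a constructor named from_floating_point_probabilities on the DefaultContiguousCategoricalEntropyModel type alias; the exact constructor name may differ between constriction versions:
use constriction::stream::model::DefaultContiguousCategoricalEntropyModel;
// A categorical distribution is given by an explicit probability table; there is
// no analytic CDF (let alone an analytic inverse CDF) to evaluate in constant time.
let probabilities = [0.1, 0.4, 0.2, 0.3];
let model =
    DefaultContiguousCategoricalEntropyModel::from_floating_point_probabilities(&probabilities)
        .unwrap();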
Computational Efficiency
Two things should be noted about computational efficiency:
- quantization is lazy: both the constructor of a LeakyQuantizer and the method quantize perform only a small constant amount of work, independent of the PRECISION and the number of symbols on which the resulting entropy model will be defined. The actual quantization is done once the resulting LeakilyQuantizedDistribution is used for encoding and/or decoding, and it is only done for the involved symbols.
- quantization for decoding is more expensive than for encoding: using a LeakilyQuantizedDistribution as an EncoderModel only requires evaluating the cumulative distribution function (CDF) of the underlying continuous probability distribution a constant number of times (twice, to be precise). By contrast, using it as a DecoderModel requires numerical inversion of the cumulative distribution function. This numerical inversion starts by calling Inverse::inverse from the crate probability on the underlying continuous probability distribution. But the result of this method call then has to be refined by repeatedly probing the CDF in order to deal with inevitable rounding errors in the implementation of Inverse::inverse. The number of required iterations will depend on how accurate the implementation of Inverse::inverse is.
The laziness means that it is relatively cheap to use a different
LeakilyQuantizedDistribution for each symbol of the message, which is a common
thing to do in machine-learning based compression methods. By contrast, if you want to
use the same entropy model for many symbols then a LeakilyQuantizedDistribution can
become unnecessarily expensive, especially for decoding, because you might end up
calculating the inverse CDF in the same region over and over again. If this is the case,
consider tabularizing the LeakilyQuantizedDistribution that you obtain from the method
quantize by calling to_generic_encoder_model or to_generic_decoder_model on
it (or, if you use a low PRECISION, you may even consider calling
to_generic_lookup_decoder_model). You’ll have to bring the trait
IterableEntropyModel into scope to call these conversion methods (use constriction::stream::model::IterableEntropyModel).
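A sketch of this conversion (model size and distribution parameters are arbitrary):
use constriction::stream::model::{DefaultLeakyQuantizer, IterableEntropyModel};
let quantizer = DefaultLeakyQuantizer::new(-100..=100);
let model = quantizer.quantize(probability::distribution::Gaussian::new(0.0, 10.0));
// Pay the quantization cost once, up front, for all 201 symbols of the support ...
let tabularized_encoder_model = model.to_generic_encoder_model();
let tabularized_decoder_model = model.to_generic_decoder_model();
// ... and then reuse the precalculated tables for many encode/decode operations.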
Requirements for Correctness
The original distribution that you pass to the method quantize can only be an
approximation of a true (normalized) probability distribution because it represents
probabilities with finite (floating point) precision. Despite the possibility of
rounding errors in the underlying (floating point) distribution, a LeakyQuantizer is
guaranteed to generate a valid entropy model with exactly compatible implementations of
EncoderModel and DecoderModel as long as both of the following requirements are
met:
- The cumulative distribution function (CDF) Distribution::distribution is defined on all mid points between integers that lie within the range that is provided as argument support to the new method; it is monotonically nondecreasing, and its values do not exceed the closed interval [0.0, 1.0]. It is OK if the CDF does not cover the entire interval from 0.0 to 1.0 (e.g., due to rounding errors or clipping); any remaining probability mass on the tails is added to the probability of the symbols at the respective ends of the support.
- The quantile function or inverse CDF Inverse::inverse evaluates to a finite non-NaN value everywhere on the open interval (0.0, 1.0), and it is monotonically nondecreasing on this interval. It does not have to be defined at the boundaries 0.0 or 1.0 (more precisely, it only has to be defined on the closed interval [epsilon, 1.0 - epsilon] where epsilon := 2.0^{-(PRECISION+1)} and ^ denotes mathematical exponentiation). Further, the implementation of Inverse::inverse does not actually have to be the inverse of Distribution::distribution because it is only used as an initial hint where to start a search for the true inverse. It is OK if Inverse::inverse is just some approximation of the true inverse CDF. Any deviations between Inverse::inverse and the true inverse CDF will negatively impact runtime performance but will otherwise have no observable effect.
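To make these requirements concrete, the following sketch implements both traits for a hypothetical custom distribution (a logistic distribution, which happens to have an analytic CDF and an analytic inverse CDF; any monotonic approximation of the inverse would satisfy the second requirement just as well):
use probability::distribution::{Distribution, Inverse};

struct Logistic { mean: f64, scale: f64 }

impl Distribution for Logistic {
    type Value = f64;

    // Requirement 1: monotonically nondecreasing, with values within [0.0, 1.0].
    fn distribution(&self, x: f64) -> f64 {
        1.0 / (1.0 + (-(x - self.mean) / self.scale).exp())
    }
}

impl Inverse for Logistic {
    // Requirement 2: finite, non-NaN, and monotonically nondecreasing on the open
    // interval (0.0, 1.0). It only serves as an initial hint for the search, so an
    // approximation of the true inverse CDF would also be acceptable here.
    fn inverse(&self, p: f64) -> f64 {
        self.mean + self.scale * (p / (1.0 - p)).ln()
    }
}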
Implementations
impl<F, Symbol, Probability, const PRECISION: usize> LeakyQuantizer<F, Symbol, Probability, PRECISION>
where
    Probability: BitArray + Into<F>,
    Symbol: PrimInt + AsPrimitive<Probability> + WrappingSub + WrappingAdd,
    F: FloatCore,
pub fn new(support: RangeInclusive<Symbol>) -> Self
Constructs a LeakyQuantizer with a finite support.
The support is an inclusive range (which can be expressed with the ..= notation,
as in -100..=100). All LeakilyQuantizedDistributions generated by this
LeakyQuantizer are then guaranteed to assign a nonzero probability to all symbols
within the support, and a zero probability to all symbols outside of the
support. Having a known support is often a useful property of entropy models
because it ensures that all symbols within the support can indeed be encoded, even
if their probability under the underlying probability distribution is extremely
small.
This method takes support as a RangeInclusive because we want to support, e.g.,
probability distributions over the Symbol type u8 with full support 0..=255.
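For illustration, here is a sketch with all type parameters spelled out (assuming that the “default” preset uses Probability = u32 and PRECISION = 24):
use constriction::stream::model::LeakyQuantizer;
// The exclusive range `0..256` could not even be written down here because the
// upper bound `256` overflows `u8`; the inclusive range below can.
let quantizer = LeakyQuantizer::<f64, u8, u32, 24>::new(0..=255);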
Panics
Panics if either of the following conditions is met:
- support is empty; or
- support contains only a single value (we do not support degenerate probability distributions that put all probability mass on a single symbol); or
- support is larger than 1 << PRECISION (because in this case, assigning any representable nonzero probability to all elements of support would exceed our probability budget).
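For example (the commented-out lines would each trigger one of the panics above; this sketch assumes PRECISION = 24 for the “default” preset):
use constriction::stream::model::DefaultLeakyQuantizer;
let _valid = DefaultLeakyQuantizer::new(-5..=20); // OK: more than one symbol.
// let _empty = DefaultLeakyQuantizer::new(20..=-5); // would panic: empty `support`
// let _degenerate = DefaultLeakyQuantizer::new(5..=5); // would panic: single-valued `support`
// let _too_large = DefaultLeakyQuantizer::new(0..=(1 << 25)); // would panic: more than `1 << 24` symbols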
pub fn quantize<D: Distribution>(
    self,
    distribution: D,
) -> LeakilyQuantizedDistribution<F, Symbol, Probability, D, PRECISION>
Quantizes the given probability distribution and returns an EntropyModel.
See struct documentation for details and code examples.
Note that you can reuse the same quantizer to quantize arbitrarily many
distributions; although this method takes self by value, a LeakyQuantizer is
cheap to copy (see the first example above, which quantizes both a Gaussian
and a Laplace with a single quantizer).
pub fn support(&self) -> RangeInclusive<Symbol>
Returns the exact range of symbols that have nonzero probability.
The returned inclusive range is the same as the one that was passed in to the
constructor new. All entropy models created by the method
quantize will assign a nonzero probability to all elements in
the support, and they will assign a zero probability to all elements outside of
the support. The support contains at least two and at most 1 << PRECISION
elements.
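For example:
use constriction::stream::model::DefaultLeakyQuantizer;
let quantizer = DefaultLeakyQuantizer::new(-5..=20);
assert_eq!(quantizer.support(), -5..=20);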
Trait Implementations
impl<F: Clone, Symbol: Clone, Probability: Clone, const PRECISION: usize> Clone for LeakyQuantizer<F, Symbol, Probability, PRECISION>
    fn clone(&self) -> LeakyQuantizer<F, Symbol, Probability, PRECISION>
    fn clone_from(&mut self, source: &Self)