pub struct ContiguousCategoricalEntropyModel<Probability, Table, const PRECISION: usize> { /* private fields */ }

An entropy model for a categorical probability distribution over a contiguous range of integers starting at zero.

You will usually want to use this type through one of its type aliases, DefaultContiguousCategoricalEntropyModel or SmallContiguousCategoricalEntropyModel; see the discussion of presets.

This entropy model implements both EncoderModel and DecoderModel, which means that it can be used for both encoding and decoding with any of the stream coders provided by the constriction crate.

Example

use constriction::{
    stream::{stack::DefaultAnsCoder, model::DefaultContiguousCategoricalEntropyModel, Decode},
    UnwrapInfallible,
};

// Create a `ContiguousCategoricalEntropyModel` that approximates floating point probabilities.
let probabilities = [0.3, 0.0, 0.4, 0.1, 0.2]; // Note that `probabilities[1] == 0.0`.
let model = DefaultContiguousCategoricalEntropyModel::from_floating_point_probabilities(
    &probabilities
).unwrap();
assert_eq!(model.support_size(), 5); // `model` supports the symbols `0..5usize`.

// Use `model` for entropy coding.
let message = vec![2, 0, 3, 1, 2, 4, 3, 2, 0];
let mut ans_coder = DefaultAnsCoder::new();

// We could pass `model` by reference but passing `model.as_view()` is slightly more efficient.
ans_coder.encode_iid_symbols_reverse(message.iter().cloned(), model.as_view()).unwrap();
// Note that `message` contains the symbol `1`, and that `probabilities[1] == 0.0`. However, we
// can still encode the symbol because the `ContiguousCategoricalEntropyModel` is "leaky", i.e.,
// it assigns a nonzero probability to all symbols in the range `0..model.support_size()`.

// Decode the encoded message and verify correctness.
let decoded = ans_coder
    .decode_iid_symbols(9, model.as_view())
    .collect::<Result<Vec<_>, _>>()
    .unwrap_infallible();
assert_eq!(decoded, message);
assert!(ans_coder.is_empty());

// The `model` assigns zero probability to any symbols that are not in the support
// `0..model.support_size()`, so trying to encode a message that contains such a symbol fails.
assert!(ans_coder.encode_iid_symbols_reverse(&[2, 0, 5, 1], model.as_view()).is_err());
// ERROR: symbol `5` is not in the support of `model`.

When Should I Use This Type of Entropy Model?

Use a ContiguousCategoricalEntropyModel for probabilistic models that can only be represented as an explicit probability table, and not by some more compact analytic expression. If you have a probability model that can be expressed by some analytical expression (e.g., a Binomial distribution), then use LeakyQuantizer instead (unless you want to encode lots of symbols with the same entropy model, in which case the explicitly tabulated representation of a categorical entropy model could improve runtime performance).

Further, a ContiguousCategoricalEntropyModel can only represent probability distributions whose support (i.e., the set of symbols to which the model assigns a nonzero probability) is a contiguous range of integers starting at zero. If the support of your probability distribution has a more complicated structure (or if the Symbol type is not an integer type), then you can use a NonContiguousCategoricalEncoderModel or a NonContiguousCategoricalDecoderModel, which are strictly more general than a ContiguousCategoricalEntropyModel but have a larger memory footprint and slightly worse runtime performance.

If you want to decode lots of symbols with the same entropy model, and if reducing the PRECISION to a moderate value is acceptable to you, then you may want to consider using a LookupDecoderModel instead for even better runtime performance (at the cost of a larger memory footprint and worse compression efficiency due to lower PRECISION).

Computational Efficiency

For a probability distribution with a support of N symbols, a ContiguousCategoricalEntropyModel has the following asymptotic costs:

  • creation (from fixed point probabilities): runtime cost Θ(N); memory footprint Θ(N);
  • encoding a symbol (calling EncoderModel::left_cumulative_and_probability): runtime cost Θ(1), since it amounts to a direct table lookup; no heap allocations;
  • decoding a symbol (calling DecoderModel::quantile_function): runtime cost Θ(log N), since it performs a binary search over the cumulative distribution table; no heap allocations.

Implementations

impl<Probability: BitArray, const PRECISION: usize> ContiguousCategoricalEntropyModel<Probability, Vec<Probability>, PRECISION>

pub fn from_floating_point_probabilities<F>(probabilities: &[F]) -> Result<Self, ()>
where
    F: FloatCore + Sum<F> + Into<f64>,
    Probability: Into<f64> + AsPrimitive<usize>,
    f64: AsPrimitive<Probability>,
    usize: AsPrimitive<Probability>,

Constructs a leaky distribution whose PMF approximates given probabilities.

The returned distribution will be defined for symbols of type usize from the range 0..probabilities.len().

The argument probabilities is a slice of floating point values (F is typically f64 or f32). All entries must be nonnegative and at least one entry has to be nonzero. The entries do not necessarily need to add up to one (the resulting distribution will automatically get normalized and an overall scaling of all entries of probabilities does not affect the result, up to effects due to rounding errors).

The probability mass function of the returned distribution will approximate the provided probabilities as well as possible, subject to the following constraints:

  • probabilities are represented in fixed point arithmetic, where the const generic parameter PRECISION controls the number of bits of precision. This typically introduces rounding errors;
  • despite the possibility of rounding errors, the returned probability distribution will be exactly normalized; and
  • each symbol in the support 0..probabilities.len() gets assigned a strictly nonzero probability, even if the provided probability for the symbol is zero or below the threshold that can be resolved in fixed point arithmetic with PRECISION bits. We refer to this property as the resulting distribution being “leaky”. The leakiness guarantees that all symbols within the support can be encoded when this distribution is used as an entropy model.

More precisely, the resulting probability distribution minimizes the cross entropy from the provided (floating point) to the resulting (fixed point) probabilities subject to the above three constraints.

Error Handling

Returns an error if the provided probability distribution cannot be normalized, either because probabilities is of length zero, or because one of its entries is negative with a nonzero magnitude, or because the sum of its elements is zero, infinite, or NaN.

Also returns an error if the probability distribution is degenerate, i.e., if probabilities has only a single element, because degenerate probability distributions currently cannot be represented.

TODO: should also return an error if support is too large to support leaky distribution

pub fn from_nonzero_fixed_point_probabilities<I>(probabilities: I, infer_last_probability: bool) -> Result<Self, ()>
where
    I: IntoIterator,
    I::Item: Borrow<Probability>,

Constructs a distribution with a PMF given in fixed point arithmetic.

This is a low level method that allows, e.g., reconstructing a probability distribution previously exported with symbol_table. The more common way to construct a ContiguousCategoricalEntropyModel is via from_floating_point_probabilities.

The items of probabilities have to be nonzero and smaller than 1 << PRECISION, where PRECISION is a const generic parameter on the ContiguousCategoricalEntropyModel.

If infer_last_probability is false then the items yielded by probabilities have to (logically) sum up to 1 << PRECISION. If infer_last_probability is true then they must sum up to a value strictly smaller than 1 << PRECISION, and the method will add an additional symbol at the end that takes the remaining probability mass.

Examples

If infer_last_probability is false, the provided probabilities have to sum up to 1 << PRECISION:

use constriction::stream::model::{
    DefaultContiguousCategoricalEntropyModel, IterableEntropyModel
};

let probabilities = vec![1u32 << 21, 1 << 22, 1 << 22, 1 << 22, 1 << 21];
// `probabilities` sums up to `1 << PRECISION` as required:
assert_eq!(probabilities.iter().sum::<u32>(), 1 << 24);

let model = DefaultContiguousCategoricalEntropyModel
    ::from_nonzero_fixed_point_probabilities(&probabilities, false).unwrap();
let symbol_table = model.floating_point_symbol_table::<f64>().collect::<Vec<_>>();
assert_eq!(
    symbol_table,
    vec![
        (0, 0.0, 0.125),
        (1, 0.125, 0.25),
        (2, 0.375, 0.25),
        (3, 0.625, 0.25),
        (4, 0.875, 0.125),
    ]
);

If PRECISION is set to the maximum value supported by the type Probability, then the provided probabilities still have to logically sum up to 1 << PRECISION (i.e., the summation has to wrap around exactly once):

use constriction::stream::model::{
    ContiguousCategoricalEntropyModel, IterableEntropyModel
};

let probabilities = vec![1u32 << 29, 1 << 30, 1 << 30, 1 << 30, 1 << 29];
// `probabilities` sums up to `1 << 32` (logically), i.e., it wraps around once.
assert_eq!(probabilities.iter().fold(0u32, |accum, &x| accum.wrapping_add(x)), 0);

let model = ContiguousCategoricalEntropyModel::<u32, Vec<u32>, 32>
    ::from_nonzero_fixed_point_probabilities(&probabilities, false).unwrap();
let symbol_table = model.floating_point_symbol_table::<f64>().collect::<Vec<_>>();
assert_eq!(
    symbol_table,
    vec![
        (0, 0.0, 0.125),
        (1, 0.125, 0.25),
        (2, 0.375, 0.25),
        (3, 0.625, 0.25),
        (4, 0.875, 0.125)
    ]
);

Wrapping around twice fails:

use constriction::stream::model::ContiguousCategoricalEntropyModel;
let probabilities = vec![1u32 << 30, 1 << 31, 1 << 31, 1 << 31, 1 << 30];
// `probabilities` sums up to `1 << 33` (logically), i.e., it would wrap around twice.
assert!(
    ContiguousCategoricalEntropyModel::<u32, Vec<u32>, 32>
        ::from_nonzero_fixed_point_probabilities(&probabilities, false).is_err()
);

So does providing probabilities that just don’t sum up to 1 << PRECISION:

use constriction::stream::model::ContiguousCategoricalEntropyModel;
let probabilities = vec![1u32 << 21, 5 << 8, 1 << 22, 1 << 21];
// `probabilities` sums up to `(1 << 23) + (5 << 8)`, which is not (logically) equal to `1 << 32`.
assert!(
    ContiguousCategoricalEntropyModel::<u32, Vec<u32>, 32>
        ::from_nonzero_fixed_point_probabilities(&probabilities, false).is_err()
);
impl<Probability, Table, const PRECISION: usize> ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>
where
    Probability: BitArray,
    Table: AsRef<[Probability]>,

pub fn support_size(&self) -> usize

Returns the number of symbols supported by the model.

The distribution is defined on the contiguous range of symbols from zero (inclusively) to support_size() (exclusively). All symbols within this range are guaranteed to have a nonzero probability, while all symbols outside of this range have a zero probability.

pub fn as_view(&self) -> ContiguousCategoricalEntropyModel<Probability, &[Probability], PRECISION>

Makes a very cheap shallow copy of the model that can be used much like a shared reference.

The returned ContiguousCategoricalEntropyModel implements Copy, which is a requirement for some methods, such as Encode::encode_iid_symbols or Decode::decode_iid_symbols. These methods could also accept a shared reference to a ContiguousCategoricalEntropyModel (since all references to entropy models are also entropy models, and all shared references implement Copy), but passing a view instead may be slightly more efficient because it avoids one level of dereferencing.

pub fn to_lookup_decoder_model(&self) -> LookupDecoderModel<Probability, Probability, ContiguousSymbolTable<Vec<Probability>>, Box<[Probability]>, PRECISION>
where
    Probability: Into<usize>,
    usize: AsPrimitive<Probability>,

Creates a LookupDecoderModel for efficient decoding of i.i.d. data.

While a ContiguousCategoricalEntropyModel can already be used for decoding (since it implements DecoderModel), you may prefer converting it to a LookupDecoderModel first for improved efficiency. Logically, the two will be equivalent.

Warning

You should only call this method if both of the following conditions are satisfied:

  • PRECISION is relatively small (typically PRECISION == 12, as in the “Small” preset) because the memory footprint of a LookupDecoderModel grows exponentially in PRECISION; and
  • you’re about to decode a relatively large number of symbols with the resulting model; the conversion to a LookupDecoderModel bears a significant runtime and memory overhead, so if you’re going to use the resulting model only for a single or a handful of symbols then you’ll end up paying more than you gain.

Trait Implementations

impl<Probability: Clone, Table: Clone, const PRECISION: usize> Clone for ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>

fn clone(&self) -> ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>

Returns a copy of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
impl<Probability: Debug, Table: Debug, const PRECISION: usize> Debug for ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl<Probability, Table, const PRECISION: usize> DecoderModel<PRECISION> for ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>
where
    Probability: BitArray,
    Table: AsRef<[Probability]>,

fn quantile_function(&self, quantile: Self::Probability) -> (usize, Probability, Probability::NonZero)

Looks up the symbol for a given quantile.
impl<Probability, Table, const PRECISION: usize> EncoderModel<PRECISION> for ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>
where
    Probability: BitArray,
    Table: AsRef<[Probability]>,

EncoderModel is only implemented for contiguous generic categorical models. To encode symbols from a non-contiguous support, use a NonContiguousCategoricalEncoderModel.

fn left_cumulative_and_probability(&self, symbol: impl Borrow<usize>) -> Option<(Probability, Probability::NonZero)>

Looks up a symbol in the entropy model.

fn floating_point_probability<F>(&self, symbol: Self::Symbol) -> F
where
    F: FloatCore,
    Self::Probability: Into<F>,

Returns the probability of the given symbol in floating point representation.
impl<Probability, Table, const PRECISION: usize> EntropyModel<PRECISION> for ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>
where
    Probability: BitArray,

type Symbol = usize

The type of data over which the entropy model is defined.

type Probability = Probability

The type used to represent probabilities, cumulatives, and quantiles.
impl<'m, Probability, Table, const PRECISION: usize> From<&'m ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>> for LookupDecoderModel<Probability, Probability, ContiguousSymbolTable<Vec<Probability>>, Box<[Probability]>, PRECISION>
where
    Probability: BitArray + Into<usize>,
    usize: AsPrimitive<Probability>,
    Table: AsRef<[Probability]>,

fn from(model: &'m ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>) -> Self

Converts to this type from the input type.
impl<'m, Probability, Table, const PRECISION: usize> IterableEntropyModel<'m, PRECISION> for ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>
where
    Probability: BitArray,
    Table: AsRef<[Probability]>,

type Iter = SymbolTableIter<usize, Probability, ContiguousSymbolTable<&'m [Probability]>>

The type of the iterator returned by symbol_table.

fn symbol_table(&'m self) -> Self::Iter

Iterates over all symbols in the unique order that is consistent with the cumulative distribution.

fn floating_point_symbol_table<F>(&'m self) -> FloatingPointSymbolTable<F, Self::Iter, PRECISION>
where
    F: From<Self::Probability>,

Similar to symbol_table, but yields both cumulatives and probabilities in floating point representation.

fn entropy_base2<F>(&'m self) -> F
where
    F: Float + Sum,
    Self::Probability: Into<F>,

Returns the entropy in units of bits (i.e., base 2).
impl<Probability: Copy, Table: Copy, const PRECISION: usize> Copy for ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>

Auto Trait Implementations

impl<Probability, Table, const PRECISION: usize> RefUnwindSafe for ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>
where Probability: RefUnwindSafe, Table: RefUnwindSafe,

impl<Probability, Table, const PRECISION: usize> Send for ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>
where Probability: Send, Table: Send,

impl<Probability, Table, const PRECISION: usize> Sync for ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>
where Probability: Sync, Table: Sync,

impl<Probability, Table, const PRECISION: usize> Unpin for ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>
where Probability: Unpin, Table: Unpin,

impl<Probability, Table, const PRECISION: usize> UnwindSafe for ContiguousCategoricalEntropyModel<Probability, Table, PRECISION>
where Probability: UnwindSafe, Table: UnwindSafe,
