Struct NonContiguousCategoricalDecoderModel

pub struct NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, const PRECISION: usize> { /* private fields */ }

An entropy model for a categorical probability distribution over arbitrary symbols, for decoding only.

You will usually want to use this type through one of its type aliases, DefaultNonContiguousCategoricalDecoderModel or SmallNonContiguousCategoricalDecoderModel, see discussion of presets.

This type implements the trait DecoderModel but not the trait EncoderModel. Thus, you can use a NonContiguousCategoricalDecoderModel for decoding with any of the stream decoders provided by the constriction crate, but not for encoding. If you want to encode data, use a NonContiguousCategoricalEncoderModel instead. You can convert a NonContiguousCategoricalDecoderModel to a NonContiguousCategoricalEncoderModel by calling to_generic_encoder_model on it (you’ll have to bring the trait IterableEntropyModel into scope to do so: use constriction::stream::model::IterableEntropyModel).

§Example

See example for NonContiguousCategoricalEncoderModel.

§When Should I Use This Type of Entropy Model?

Use a NonContiguousCategoricalDecoderModel for probabilistic models that can only be represented as an explicit probability table, and not by some more compact analytic expression.

  • If you have a probability model that can be expressed by some analytical expression (e.g., a Binomial distribution), then use LeakyQuantizer instead (unless you want to encode lots of symbols with the same entropy model, in which case the explicitly tabulated representation of a categorical entropy model could improve runtime performance).
  • If the support of your probabilistic model (i.e., the set of symbols to which the model assigns a non-zero probability) is a contiguous range of integers starting at zero, then it is better to use a ContiguousCategoricalEntropyModel. It has better computational efficiency and it is easier to use since it supports both encoding and decoding with a single type.
  • If you want to decode only a few symbols with a given probability model, then use a LazyContiguousCategoricalEntropyModel, which will be faster (use an array to map the decoded symbols from the contiguous range 0..N to whatever noncontiguous alphabet you have). This use case occurs, e.g., in autoregressive models, where each individual model is often used for only exactly one symbol.
  • If you want to decode lots of symbols with the same entropy model, and if reducing the PRECISION to a moderate value is acceptable to you, then you may want to consider using a NonContiguousLookupDecoderModel instead for even better runtime performance (at the cost of a larger memory footprint and worse compression efficiency due to lower PRECISION).

§Computational Efficiency

For a probability distribution with a support of N symbols, a NonContiguousCategoricalDecoderModel has the following asymptotic costs: the memory footprint and the cost of construction are linear in N, and each decoding operation (quantile_function) runs in O(log N) time because it performs a binary search over the tabulated cumulative distribution.
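Decoding with this kind of model amounts to a binary search through a tabulated cumulative distribution stored as (left-sided cumulative, symbol) pairs. The following self-contained sketch illustrates the idea; the names, the plain-`u32` probability type, and the data layout are simplified illustrations, not constriction's actual internals:

```rust
const PRECISION: u32 = 24;

/// Simplified stand-in for a decoder model's quantile function: given a
/// quantile in `0..(1 << PRECISION)`, find the symbol whose quantile range
/// contains it. `cdf` holds (left-sided cumulative, symbol) pairs, sorted
/// by cumulative, with the first cumulative equal to 0.
/// Returns (symbol, left_cumulative, probability).
fn quantile_function(cdf: &[(u32, char)], quantile: u32) -> (char, u32, u32) {
    // Binary search: index of the last entry whose left cumulative is <= quantile.
    let index = cdf.partition_point(|&(c, _)| c <= quantile) - 1;
    let (left_cumulative, symbol) = cdf[index];
    // The right boundary is the next entry's cumulative, or 1 << PRECISION
    // for the last symbol.
    let right_cumulative = cdf
        .get(index + 1)
        .map(|&(c, _)| c)
        .unwrap_or(1 << PRECISION);
    (symbol, left_cumulative, right_cumulative - left_cumulative)
}

fn main() {
    // 'a' has probability 1/4, 'x' has 1/2, 'q' has 1/4 (fixed point, PRECISION = 24).
    let cdf = [(0u32, 'a'), (1 << 22, 'x'), (3 << 22, 'q')];
    assert_eq!(quantile_function(&cdf, 0), ('a', 0, 1 << 22));
    assert_eq!(quantile_function(&cdf, 1 << 22), ('x', 1 << 22, 1 << 23));
    assert_eq!(quantile_function(&cdf, (1 << 24) - 1), ('q', 3 << 22, 1 << 22));
}
```

The real model returns the probability as a `NonZero` type and is generic over the probability and symbol types, but the O(log N) search structure is the same.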

Implementations§

impl<Symbol, Probability: BitArray, const PRECISION: usize> NonContiguousCategoricalDecoderModel<Symbol, Probability, Vec<(Probability, Symbol)>, PRECISION>
where Symbol: Clone,

pub fn from_symbols_and_floating_point_probabilities_fast<F>(
    symbols: impl IntoIterator<Item = Symbol>,
    probabilities: &[F],
    normalization: Option<F>,
) -> Result<Self, ()>
where
    F: FloatCore + Sum<F> + AsPrimitive<Probability>,
    Probability: AsPrimitive<usize>,
    usize: AsPrimitive<Probability> + AsPrimitive<F>,

Constructs a leaky distribution (for decoding) over the provided symbols whose PMF approximates given probabilities.

Semantics are analogous to ContiguousCategoricalEntropyModel::from_floating_point_probabilities_fast, except that this constructor has an additional symbols argument to provide an iterator over the symbols in the alphabet (which has to yield exactly probabilities.len() symbols).
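The core job of such a constructor, namely turning floating point probabilities into nonzero ("leaky") fixed point weights that sum exactly to 1 << PRECISION, can be sketched in plain Rust. This is a hedged illustration: the function name and the strategy of dumping all rounding error on the largest weight are simplifications, not constriction's actual algorithm (which distributes rounding error more carefully):

```rust
const PRECISION: u32 = 12;

/// Quantize floating point probabilities to nonzero fixed point weights
/// that sum exactly to `1 << PRECISION`. Illustrative only.
fn quantize(probabilities: &[f64]) -> Vec<u32> {
    let total: f64 = probabilities.iter().sum();
    let scale = (1u32 << PRECISION) as f64 / total;

    // Round each scaled probability, but clamp to >= 1 so the model stays "leaky"
    // (every symbol keeps a nonzero probability and thus remains decodable).
    let mut weights: Vec<u32> = probabilities
        .iter()
        .map(|&p| ((p * scale).round() as u32).max(1))
        .collect();

    // Absorb the accumulated rounding error into the largest weight so that
    // the weights sum exactly to 1 << PRECISION.
    let sum: u32 = weights.iter().sum();
    let largest = weights
        .iter()
        .enumerate()
        .max_by_key(|&(_, &w)| w)
        .map(|(i, _)| i)
        .unwrap();
    weights[largest] = (weights[largest] as i64 + (1i64 << PRECISION) - sum as i64) as u32;
    weights
}

fn main() {
    let weights = quantize(&[0.5, 0.25, 0.25]);
    assert_eq!(weights.iter().sum::<u32>(), 1 << PRECISION);
    assert!(weights.iter().all(|&w| w > 0));
}
```

Exactness of the sum matters because encoder and decoder must agree bit-for-bit on the fixed point model; a sum that is off by even one would corrupt the compressed stream.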

§See also

  • from_symbols_and_floating_point_probabilities_perfect

pub fn from_symbols_and_floating_point_probabilities_perfect<F>(
    symbols: impl IntoIterator<Item = Symbol>,
    probabilities: &[F],
) -> Result<Self, ()>
where
    F: FloatCore + Sum<F> + Into<f64>,
    Probability: Into<f64> + AsPrimitive<usize>,
    f64: AsPrimitive<Probability>,
    usize: AsPrimitive<Probability>,

Slower variant of from_symbols_and_floating_point_probabilities_fast.

Similar to from_symbols_and_floating_point_probabilities_fast, but the resulting (fixed-point precision) model typically approximates the provided floating point probabilities very slightly better. Only recommended if compression performance is much more important to you than runtime as this constructor can be significantly slower.

See ContiguousCategoricalEntropyModel::from_floating_point_probabilities_perfect for a detailed comparison between ..._fast and ..._perfect constructors of categorical entropy models.

pub fn from_symbols_and_floating_point_probabilities<F>(
    symbols: &[Symbol],
    probabilities: &[F],
) -> Result<Self, ()>
where
    F: FloatCore + Sum<F> + Into<f64>,
    Probability: Into<f64> + AsPrimitive<usize>,
    f64: AsPrimitive<Probability>,
    usize: AsPrimitive<Probability>,

👎Deprecated since 0.4.0: Please use from_symbols_and_floating_point_probabilities_fast or from_symbols_and_floating_point_probabilities_perfect instead. See documentation for detailed upgrade instructions.

Deprecated constructor.

This constructor has been deprecated in constriction version 0.4.0, and it will be removed in constriction version 0.5.0.

§Upgrade Instructions

Most new use cases should call from_symbols_and_floating_point_probabilities_fast instead. Using that constructor (abbreviated as ..._fast in the following) may lead to very slightly larger bit rates, but it runs considerably faster.

However, note that the ..._fast constructor breaks binary compatibility with constriction version <= 0.3.5. If you need to be able to exchange binary compressed data with a program that uses a categorical entropy model from constriction version <= 0.3.5, then call from_symbols_and_floating_point_probabilities_perfect instead (..._perfect for short). Another reason for using the ..._perfect constructor could be if compression performance is much more important to you than runtime performance. See documentation of from_symbols_and_floating_point_probabilities_perfect for more information.

§Compatibility Table

(In the following table, “encoding” refers to NonContiguousCategoricalEncoderModel)

                                constructor used for encoding →
↓ constructor used for decoding ↓   legacy            ..._perfect       ..._fast
legacy (this one)                   ✅ compatible     ✅ compatible     ❌ incompatible
..._perfect                         ✅ compatible     ✅ compatible     ❌ incompatible
..._fast                            ❌ incompatible   ❌ incompatible   ✅ compatible
pub fn from_symbols_and_nonzero_fixed_point_probabilities<S, P>(
    symbols: S,
    probabilities: P,
    infer_last_probability: bool,
) -> Result<Self, ()>
where
    S: IntoIterator<Item = Symbol>,
    P: IntoIterator,
    P::Item: Borrow<Probability>,

Constructs a distribution with a PMF given in fixed point arithmetic.

This is a low-level method that allows, e.g., reconstructing a probability distribution previously exported with symbol_table. The more common way to construct a NonContiguousCategoricalDecoderModel is via from_symbols_and_floating_point_probabilities_fast.

The items of probabilities have to be nonzero and smaller than 1 << PRECISION, where PRECISION is a const generic parameter on the NonContiguousCategoricalDecoderModel.

If infer_last_probability is false then probabilities must yield the same number of items as symbols does, and the items yielded by probabilities have to (logically) sum up to 1 << PRECISION. If infer_last_probability is true then probabilities must yield one fewer item than symbols, the items must sum up to a value strictly smaller than 1 << PRECISION, and the method will assign the (nonzero) remaining probability to the last symbol.

§Example

Creating a NonContiguousCategoricalDecoderModel with inferred probability of the last symbol:

use constriction::stream::model::{
    DefaultNonContiguousCategoricalDecoderModel, IterableEntropyModel
};

let partial_probabilities = vec![1u32 << 21, 1 << 22, 1 << 22, 1 << 22];
// `partial_probabilities` sums up to strictly less than `1 << PRECISION` as required:
assert!(partial_probabilities.iter().sum::<u32>() < 1 << 24);

let symbols = "abcde"; // Has one more character than `partial_probabilities` has entries.

let model = DefaultNonContiguousCategoricalDecoderModel
    ::from_symbols_and_nonzero_fixed_point_probabilities(
        symbols.chars(), &partial_probabilities, true).unwrap();
let symbol_table = model.floating_point_symbol_table::<f64>().collect::<Vec<_>>();
assert_eq!(
    symbol_table,
    vec![
        ('a', 0.0, 0.125),
        ('b', 0.125, 0.25),
        ('c', 0.375, 0.25),
        ('d', 0.625, 0.25),
        ('e', 0.875, 0.125), // Inferred last probability.
    ]
);

For more related examples, see ContiguousCategoricalEntropyModel::from_nonzero_fixed_point_probabilities.

pub fn from_iterable_entropy_model<'m, M>(model: &'m M) -> Self
where M: IterableEntropyModel<'m, PRECISION, Symbol = Symbol, Probability = Probability> + ?Sized,

Creates a NonContiguousCategoricalDecoderModel from any entropy model that implements IterableEntropyModel.

Calling NonContiguousCategoricalDecoderModel::from_iterable_entropy_model(&model) is equivalent to calling model.to_generic_decoder_model(), where the latter requires bringing IterableEntropyModel into scope.

impl<Symbol, Probability, Cdf, const PRECISION: usize> NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, PRECISION>
where Symbol: Clone, Probability: BitArray, Cdf: AsRef<[(Probability, Symbol)]>,

pub fn support_size(&self) -> usize

Returns the number of symbols supported by the model, i.e., the number of symbols to which the model assigns a nonzero probability.

pub fn as_view(&self) -> NonContiguousCategoricalDecoderModel<Symbol, Probability, &[(Probability, Symbol)], PRECISION>

Makes a very cheap shallow copy of the model that can be used much like a shared reference.

The returned NonContiguousCategoricalDecoderModel implements Copy, which is a requirement for some methods, such as Decode::decode_iid_symbols. These methods could also accept a shared reference to a NonContiguousCategoricalDecoderModel (since all references to entropy models are also entropy models, and all shared references implement Copy), but passing a view instead may be slightly more efficient because it avoids one level of dereferencing.

pub fn to_lookup_decoder_model(
    &self,
) -> NonContiguousLookupDecoderModel<Symbol, Probability, Vec<(Probability, Symbol)>, Box<[Probability]>, PRECISION>
where Probability: Into<usize>, usize: AsPrimitive<Probability>,

Creates a NonContiguousLookupDecoderModel for efficient decoding of i.i.d. data.

While a NonContiguousCategoricalDecoderModel can already be used for decoding (since it implements DecoderModel), you may prefer converting it to a LookupDecoderModel first for improved efficiency. Logically, the two will be equivalent.

§Warning

You should only call this method if both of the following conditions are satisfied:

  • PRECISION is relatively small (typically PRECISION == 12, as in the “Small” preset) because the memory footprint of a LookupDecoderModel grows exponentially in PRECISION; and
  • you’re about to decode a relatively large number of symbols with the resulting model; the conversion to a LookupDecoderModel bears a significant runtime and memory overhead, so if you’re going to use the resulting model only for a single or a handful of symbols then you’ll end up paying more than you gain.
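The trade-off described above can be illustrated with a self-contained sketch of the lookup-table idea: instead of an O(log N) binary search per decoded symbol, precompute, for every possible quantile, the index of the symbol whose range contains it. The table has 1 << PRECISION entries, which is why the memory footprint grows exponentially in PRECISION. Names and layout here are simplified illustrations, not constriction's actual implementation:

```rust
const PRECISION: u32 = 12; // Small precision, as in the "Small" preset.

/// Build a table mapping each quantile in `0..(1 << PRECISION)` to the index
/// of the symbol it decodes to. `cdf` holds (left-sided cumulative, symbol)
/// pairs sorted by cumulative, with the first cumulative equal to 0.
fn build_lookup_table(cdf: &[(u16, char)]) -> Box<[u16]> {
    let mut table = vec![0u16; 1 << PRECISION];
    // Each symbol owns the quantile range from its left cumulative (inclusive)
    // to the next symbol's left cumulative (exclusive).
    for (index, window) in cdf.windows(2).enumerate() {
        let (left, right) = (window[0].0 as usize, window[1].0 as usize);
        table[left..right].fill(index as u16);
    }
    // The last symbol covers everything up to 1 << PRECISION.
    let (last_left, _) = cdf[cdf.len() - 1];
    table[last_left as usize..].fill((cdf.len() - 1) as u16);
    table.into_boxed_slice()
}

fn main() {
    // 'a': 1024/4096, 'x': 2048/4096, 'q': 1024/4096.
    let cdf = [(0u16, 'a'), (1024, 'x'), (3072, 'q')];
    let table = build_lookup_table(&cdf);
    // Decoding a quantile is now a single array access instead of a search.
    assert_eq!(cdf[table[0] as usize].1, 'a');
    assert_eq!(cdf[table[1024] as usize].1, 'x');
    assert_eq!(cdf[table[4095] as usize].1, 'q');
}
```

Building the table touches all 1 << PRECISION entries once, which is the conversion overhead the warning refers to: it only pays off when amortized over many decoded symbols.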

Trait Implementations§

impl<Symbol: Clone, Probability: Clone, Cdf: Clone, const PRECISION: usize> Clone for NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, PRECISION>

fn clone(&self) -> NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, PRECISION>

Returns a copy of the value. Read more

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more

impl<Symbol: Debug, Probability: Debug, Cdf: Debug, const PRECISION: usize> Debug for NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, PRECISION>

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more

impl<Symbol, Probability, Cdf, const PRECISION: usize> DecoderModel<PRECISION> for NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, PRECISION>
where Symbol: Clone, Probability: BitArray, Cdf: AsRef<[(Probability, Symbol)]>,

fn quantile_function(&self, quantile: Self::Probability) -> (Symbol, Probability, Probability::NonZero)

Looks up the symbol for a given quantile. Read more

impl<Symbol, Probability, Cdf, const PRECISION: usize> EntropyModel<PRECISION> for NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, PRECISION>
where Probability: BitArray,

type Symbol = Symbol

The type of data over which the entropy model is defined. Read more

type Probability = Probability

The type used to represent probabilities, cumulatives, and quantiles. Read more

impl<'m, Symbol, Probability, M, const PRECISION: usize> From<&'m M> for NonContiguousCategoricalDecoderModel<Symbol, Probability, Vec<(Probability, Symbol)>, PRECISION>
where Symbol: Clone, Probability: BitArray, M: IterableEntropyModel<'m, PRECISION, Symbol = Symbol, Probability = Probability> + ?Sized,

fn from(model: &'m M) -> Self

Converts to this type from the input type.

impl<'m, Symbol, Probability, Cdf, const PRECISION: usize> IterableEntropyModel<'m, PRECISION> for NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, PRECISION>
where Symbol: Clone + 'm, Probability: BitArray, Cdf: AsRef<[(Probability, Symbol)]>,

fn symbol_table(&'m self) -> impl Iterator<Item = (Self::Symbol, Self::Probability, <Self::Probability as BitArray>::NonZero)>

Iterates over all symbols in the unique order that is consistent with the cumulative distribution. Read more

fn floating_point_symbol_table<F>(&'m self) -> impl Iterator<Item = (Self::Symbol, F, F)>
where F: FloatCore + From<Self::Probability> + 'm, Self::Probability: Into<F>,

Similar to symbol_table, but yields both cumulatives and probabilities in floating point representation. Read more

fn entropy_base2<F>(&'m self) -> F
where F: Float + Sum, Self::Probability: Into<F>,

Returns the entropy in units of bits (i.e., base 2). Read more
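As an illustration of the quantity this method reports, here is a hedged, self-contained sketch that computes the Shannon entropy of a model's fixed point probabilities; the function name and plain-slice input are simplifications, not the trait's actual implementation. Note that it measures the entropy of the quantized model, not of the original floating point distribution:

```rust
const PRECISION: u32 = 12;

/// Shannon entropy, in bits, of a PMF given as fixed point weights that
/// sum to `1 << PRECISION`: H = -sum_i p_i * log2(p_i). Illustrative only.
fn entropy_base2(fixed_point_probabilities: &[u32]) -> f64 {
    let scale = 1.0 / (1u64 << PRECISION) as f64;
    fixed_point_probabilities
        .iter()
        .map(|&w| {
            let p = w as f64 * scale; // Weights are nonzero, so log2 is finite.
            -p * p.log2()
        })
        .sum()
}

fn main() {
    // Uniform distribution over 4 symbols: entropy is exactly 2 bits.
    let uniform = [1u32 << 10; 4]; // each weight = 4096 / 4
    assert!((entropy_base2(&uniform) - 2.0).abs() < 1e-12);
}
```

The entropy is a lower bound on the expected bit rate per symbol that any entropy coder using this model can achieve.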
fn to_generic_encoder_model(&'m self) -> NonContiguousCategoricalEncoderModel<Self::Symbol, Self::Probability, PRECISION>
where Self::Symbol: Hash + Eq,

Creates an EncoderModel from this EntropyModel. Read more

fn to_generic_decoder_model(&'m self) -> NonContiguousCategoricalDecoderModel<Self::Symbol, Self::Probability, Vec<(Self::Probability, Self::Symbol)>, PRECISION>
where Self::Symbol: Clone,

Creates a DecoderModel from this EntropyModel. Read more

fn to_generic_lookup_decoder_model(&'m self) -> NonContiguousLookupDecoderModel<Self::Symbol, Self::Probability, Vec<(Self::Probability, Self::Symbol)>, Box<[Self::Probability]>, PRECISION>

Creates a DecoderModel from this EntropyModel. Read more

fn cross_entropy_base2<F>(&'m self, p: impl IntoIterator<Item = F>) -> F
where F: Float + Sum, Self::Probability: Into<F>,

Returns the cross entropy between argument p and this model in units of bits (i.e., base 2). Read more

fn reverse_cross_entropy_base2<F>(&'m self, p: impl IntoIterator<Item = F>) -> F
where F: Float + Sum, Self::Probability: Into<F>,

Returns the cross entropy between this model and argument p in units of bits (i.e., base 2). Read more

fn kl_divergence_base2<F>(&'m self, p: impl IntoIterator<Item = F>) -> F
where F: Float + Sum, Self::Probability: Into<F>,

Returns the Kullback-Leibler divergence D_KL(p || self). Read more

fn reverse_kl_divergence_base2<F>(&'m self, p: impl IntoIterator<Item = F>) -> F
where F: Float + Sum, Self::Probability: Into<F>,

Returns the reverse Kullback-Leibler divergence, i.e., D_KL(self || p). Read more

impl<Symbol: Copy, Probability: Copy, Cdf: Copy, const PRECISION: usize> Copy for NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, PRECISION>

Auto Trait Implementations§

impl<Symbol, Probability, Cdf, const PRECISION: usize> Freeze for NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, PRECISION>
where Cdf: Freeze,

impl<Symbol, Probability, Cdf, const PRECISION: usize> RefUnwindSafe for NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, PRECISION>
where Cdf: RefUnwindSafe, Symbol: RefUnwindSafe, Probability: RefUnwindSafe,

impl<Symbol, Probability, Cdf, const PRECISION: usize> Send for NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, PRECISION>
where Cdf: Send, Symbol: Send, Probability: Send,

impl<Symbol, Probability, Cdf, const PRECISION: usize> Sync for NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, PRECISION>
where Cdf: Sync, Symbol: Sync, Probability: Sync,

impl<Symbol, Probability, Cdf, const PRECISION: usize> Unpin for NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, PRECISION>
where Cdf: Unpin, Symbol: Unpin, Probability: Unpin,

impl<Symbol, Probability, Cdf, const PRECISION: usize> UnwindSafe for NonContiguousCategoricalDecoderModel<Symbol, Probability, Cdf, PRECISION>
where Cdf: UnwindSafe, Symbol: UnwindSafe, Probability: UnwindSafe,
Blanket Implementations§

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self). That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.