pub struct NonContiguousCategoricalDecoderModel<Symbol, Probability, Table, const PRECISION: usize> { /* private fields */ }
An entropy model for a categorical probability distribution over arbitrary symbols, for decoding only.
You will usually want to use this type through one of its type aliases, DefaultNonContiguousCategoricalDecoderModel or SmallNonContiguousCategoricalDecoderModel; see the discussion of presets.

This type implements the trait DecoderModel but not the trait EncoderModel. Thus, you can use a NonContiguousCategoricalDecoderModel for decoding with any of the stream decoders provided by the constriction crate, but not for encoding. If you want to encode data, use a NonContiguousCategoricalEncoderModel instead. You can convert a NonContiguousCategoricalDecoderModel to a NonContiguousCategoricalEncoderModel by calling to_generic_encoder_model on it (you'll have to bring the trait IterableEntropyModel into scope to do so: use constriction::stream::model::IterableEntropyModel).
Example
See the example for NonContiguousCategoricalEncoderModel.
When Should I Use This Type of Entropy Model?
Use a NonContiguousCategoricalDecoderModel for probabilistic models that can only be represented as an explicit probability table, and not by some more compact analytic expression. If you have a probability model that can be expressed by some analytic expression (e.g., a Binomial distribution), then use a LeakyQuantizer instead (unless you want to encode lots of symbols with the same entropy model, in which case the explicitly tabulated representation of a categorical entropy model could improve runtime performance).

Further, if the support of your probabilistic model (i.e., the set of symbols to which the model assigns a nonzero probability) is a contiguous range of integers starting at zero, then it is better to use a ContiguousCategoricalEntropyModel. It has better computational efficiency and is easier to use since it supports both encoding and decoding with a single type.

If you want to decode lots of symbols with the same entropy model, and if reducing the PRECISION to a moderate value is acceptable to you, then you may want to consider using a LookupDecoderModel instead for even better runtime performance (at the cost of a larger memory footprint and worse compression efficiency due to the lower PRECISION).
Computational Efficiency
For a probability distribution with a support of N symbols, a NonContiguousCategoricalDecoderModel has the following asymptotic costs:

- creation:
  - runtime cost: Θ(N) when creating from fixed point probabilities, Θ(N log(N)) when creating from floating point probabilities;
  - memory footprint: Θ(N);
  - both are more expensive by a constant factor than for a ContiguousCategoricalEntropyModel.
- encoding a symbol: not supported; use a NonContiguousCategoricalEncoderModel.
- decoding a symbol (calling DecoderModel::quantile_function):
  - runtime cost: Θ(log(N)) (both expected and worst-case);
  - memory footprint: no heap allocations, constant stack space.
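The Θ(log(N)) decoding cost comes from a binary search over the table of cumulative probabilities. The following is a minimal sketch of that idea in plain Rust, not the crate's actual implementation: it assumes a hypothetical table of (left_cumulative, symbol) entries sorted by left_cumulative, and mirrors the crate's convention that quantile_function returns the symbol together with its left-sided cumulative and its probability.

```rust
// Sketch only (hypothetical table layout, not the crate's internal types):
// decode a symbol from a quantile by binary search over a sorted table of
// (left_cumulative, symbol) entries. `total` is 1 << PRECISION.
fn quantile_function(table: &[(u32, char)], total: u32, quantile: u32) -> (char, u32, u32) {
    // Number of entries whose left_cumulative <= quantile, minus one,
    // is the index of the symbol that owns this quantile.
    let index = table.partition_point(|&(left, _)| left <= quantile) - 1;
    let (left, symbol) = table[index];
    // The symbol's probability is the gap to the next entry (or to `total`).
    let right = table.get(index + 1).map_or(total, |&(l, _)| l);
    (symbol, left, right - left)
}

fn main() {
    // Hypothetical model with PRECISION = 4 (total weight 16):
    // 'a' has probability 6/16, 'b' 5/16, 'c' 5/16.
    let table = [(0u32, 'a'), (6, 'b'), (11, 'c')];
    assert_eq!(quantile_function(&table, 16, 0), ('a', 0, 6));
    assert_eq!(quantile_function(&table, 16, 7), ('b', 6, 5));
    assert_eq!(quantile_function(&table, 16, 15), ('c', 11, 5));
    println!("ok");
}
```

slice::partition_point performs the binary search, which is why the cost is logarithmic in the support size and requires no heap allocation.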
Implementations

impl<Symbol, Probability: BitArray, const PRECISION: usize> NonContiguousCategoricalDecoderModel<Symbol, Probability, Vec<(Probability, Symbol)>, PRECISION>
where
    Symbol: Clone,
pub fn from_symbols_and_floating_point_probabilities<F>(
    symbols: &[Symbol],
    probabilities: &[F]
) -> Result<Self, ()>
where
    F: FloatCore + Sum<F> + Into<f64>,
    Probability: Into<f64> + AsPrimitive<usize>,
    f64: AsPrimitive<Probability>,
    usize: AsPrimitive<Probability>,
Constructs a leaky distribution over the provided symbols whose PMF approximates the given probabilities.

The argument probabilities is a slice of floating point values (F is typically f64 or f32). All entries must be nonnegative and at least one entry has to be nonzero. The entries do not necessarily need to add up to one (the resulting distribution will automatically be normalized, and an overall scaling of all entries of probabilities does not affect the result, up to effects due to rounding errors).
The probability mass function of the returned distribution will approximate the provided probabilities as well as possible, subject to the following constraints:

- probabilities are represented in fixed point arithmetic, where the const generic parameter PRECISION controls the number of bits of precision; this typically introduces rounding errors;
- despite the possibility of rounding errors, the returned probability distribution will be exactly normalized; and
- each symbol gets assigned a strictly nonzero probability, even if the provided probability for the symbol is zero or below the threshold that can be resolved in fixed point arithmetic with PRECISION bits. We refer to this property as the resulting distribution being "leaky". The leakiness guarantees that a decoder can in principle decode any of the provided symbols (if given appropriate compressed data).

More precisely, the resulting probability distribution minimizes the cross entropy from the provided (floating point) to the resulting (fixed point) probabilities, subject to the above three constraints.
Error Handling
Returns an error if symbols.len() != probabilities.len().

Also returns an error if the provided probability distribution cannot be normalized, either because probabilities is of length zero, or because one of its entries is negative with a nonzero magnitude, or because the sum of its elements is zero, infinite, or NaN.

Also returns an error if the probability distribution is degenerate, i.e., if probabilities has only a single element, because degenerate probability distributions currently cannot be represented.

TODO: should also return an error if the support is too large to support a leaky distribution.
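To make the "leaky" constraint concrete, here is an illustrative sketch of quantizing floating point probabilities to fixed point weights that sum exactly to 1 << PRECISION while keeping every weight nonzero. This is a deliberately crude stand-in (it patches up the sum on the largest weight), not the crate's actual algorithm, which instead minimizes cross entropy subject to exact normalization.

```rust
// Illustrative sketch only (not the crate's algorithm): quantize floating
// point probabilities to "leaky" fixed point weights summing to 1 << precision.
fn leaky_quantize(probabilities: &[f64], precision: u32) -> Vec<u32> {
    let total = 1u32 << precision;
    let sum: f64 = probabilities.iter().sum();
    // Round each scaled probability, but give every symbol at least weight 1
    // (this is the "leakiness": zero-probability symbols stay decodable).
    let mut weights: Vec<u32> = probabilities
        .iter()
        .map(|&p| ((p / sum) * total as f64).round().max(1.0) as u32)
        .collect();
    // Crudely restore exact normalization by adjusting the largest weight.
    let current: u32 = weights.iter().sum();
    let largest = (0..weights.len()).max_by_key(|&i| weights[i]).unwrap();
    weights[largest] = weights[largest] + total - current;
    weights
}

fn main() {
    let weights = leaky_quantize(&[0.5, 0.25, 0.25, 0.0], 12);
    assert_eq!(weights.iter().sum::<u32>(), 1 << 12); // exactly normalized
    assert!(weights.iter().all(|&w| w > 0)); // leaky: no zero weights
    println!("{:?}", weights);
}
```

Note how the last symbol, whose provided probability is 0.0, still receives a nonzero fixed point weight; this is the property that guarantees every listed symbol remains decodable.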
pub fn from_symbols_and_nonzero_fixed_point_probabilities<S, P>(
    symbols: S,
    probabilities: P,
    infer_last_probability: bool
) -> Result<Self, ()>
Constructs a distribution with a PMF given in fixed point arithmetic.

This is a low level method that allows, e.g., reconstructing a probability distribution previously exported with symbol_table. The more common way to construct a NonContiguousCategoricalDecoderModel is via from_symbols_and_floating_point_probabilities.

The items of probabilities have to be nonzero and smaller than 1 << PRECISION, where PRECISION is a const generic parameter on the NonContiguousCategoricalDecoderModel.
If infer_last_probability is false, then probabilities must yield the same number of items as symbols does, and the items yielded by probabilities have to (logically) sum up to 1 << PRECISION. If infer_last_probability is true, then probabilities must yield one fewer item than symbols, the items must sum up to a value strictly smaller than 1 << PRECISION, and the method will assign the (nonzero) remaining probability to the last symbol.
Example
Creating a NonContiguousCategoricalDecoderModel with inferred probability of the last symbol:

use constriction::stream::model::{
    DefaultNonContiguousCategoricalDecoderModel, IterableEntropyModel
};

let partial_probabilities = vec![1u32 << 21, 1 << 22, 1 << 22, 1 << 22];
// `partial_probabilities` sums up to strictly less than `1 << PRECISION` as required:
assert!(partial_probabilities.iter().sum::<u32>() < 1 << 24);

let symbols = "abcde"; // Has one more entry than `partial_probabilities`.
let model = DefaultNonContiguousCategoricalDecoderModel
    ::from_symbols_and_nonzero_fixed_point_probabilities(
        symbols.chars(), &partial_probabilities, true).unwrap();
let symbol_table = model.floating_point_symbol_table::<f64>().collect::<Vec<_>>();
assert_eq!(
    symbol_table,
    vec![
        ('a', 0.0, 0.125),
        ('b', 0.125, 0.25),
        ('c', 0.375, 0.25),
        ('d', 0.625, 0.25),
        ('e', 0.875, 0.125), // Inferred last probability.
    ]
);
For more related examples, see ContiguousCategoricalEntropyModel::from_nonzero_fixed_point_probabilities.
pub fn from_iterable_entropy_model<'m, M>(model: &'m M) -> Self
where
    M: IterableEntropyModel<'m, PRECISION, Symbol = Symbol, Probability = Probability> + ?Sized,
Creates a NonContiguousCategoricalDecoderModel from any entropy model that implements IterableEntropyModel.

Calling NonContiguousCategoricalDecoderModel::from_iterable_entropy_model(&model) is equivalent to calling model.to_generic_decoder_model(), where the latter requires bringing IterableEntropyModel into scope.

TODO: test
impl<Symbol, Probability, Table, const PRECISION: usize> NonContiguousCategoricalDecoderModel<Symbol, Probability, Table, PRECISION>
pub fn support_size(&self) -> usize
Returns the number of symbols supported by the model, i.e., the number of symbols to which the model assigns a nonzero probability.
pub fn as_view(
    &self
) -> NonContiguousCategoricalDecoderModel<Symbol, Probability, &[(Probability, Symbol)], PRECISION>
Makes a very cheap shallow copy of the model that can be used much like a shared reference.
The returned NonContiguousCategoricalDecoderModel implements Copy, which is a requirement for some methods, such as Decode::decode_iid_symbols. These methods could also accept a shared reference to a NonContiguousCategoricalDecoderModel (since all references to entropy models are also entropy models, and all shared references implement Copy), but passing a view instead may be slightly more efficient because it avoids one level of dereferencing.
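The view-versus-reference distinction can be illustrated with a small sketch using hypothetical types (not the crate's actual model types): the owned model stores its table in a Vec, while the view is a Copy struct that borrows the table as a slice directly, so using the view needs one fewer pointer indirection than going through a &OwnedModel.

```rust
// Hypothetical types for illustration only (not the crate's actual types).

// Owned model: the table lives in a Vec on the heap.
struct OwnedModel {
    table: Vec<(u32, char)>,
}

// View: Copy-able, borrows the table as a slice directly.
#[derive(Clone, Copy)]
struct ModelView<'a> {
    table: &'a [(u32, char)],
}

impl OwnedModel {
    fn as_view(&self) -> ModelView<'_> {
        ModelView { table: &self.table }
    }
}

// Takes the view by value, as APIs requiring `Copy` models do.
fn first_symbol(view: ModelView<'_>) -> char {
    view.table[0].1
}

fn main() {
    let model = OwnedModel { table: vec![(0, 'x'), (7, 'y')] };
    let view = model.as_view();
    let copy = view; // cheap bitwise copy; `view` remains usable afterwards
    assert_eq!(first_symbol(view), 'x');
    assert_eq!(first_symbol(copy), 'x');
    println!("ok");
}
```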
Trait Implementations
impl<Symbol: Clone, Probability: Clone, Table: Clone, const PRECISION: usize> Clone for NonContiguousCategoricalDecoderModel<Symbol, Probability, Table, PRECISION>
fn clone(
    &self
) -> NonContiguousCategoricalDecoderModel<Symbol, Probability, Table, PRECISION>
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.

impl<Symbol: Debug, Probability: Debug, Table: Debug, const PRECISION: usize> Debug for NonContiguousCategoricalDecoderModel<Symbol, Probability, Table, PRECISION>
impl<Symbol, Probability, Table, const PRECISION: usize> DecoderModel<PRECISION> for NonContiguousCategoricalDecoderModel<Symbol, Probability, Table, PRECISION>
fn quantile_function(
    &self,
    quantile: Self::Probability
) -> (Symbol, Probability, Probability::NonZero)
impl<Symbol, Probability, Table, const PRECISION: usize> EntropyModel<PRECISION> for NonContiguousCategoricalDecoderModel<Symbol, Probability, Table, PRECISION>
where
    Probability: BitArray,
impl<'m, Symbol, Probability, M, const PRECISION: usize> From<&'m M> for NonContiguousCategoricalDecoderModel<Symbol, Probability, Vec<(Probability, Symbol)>, PRECISION>
where
    Symbol: Clone,
    Probability: BitArray,
    M: IterableEntropyModel<'m, PRECISION, Symbol = Symbol, Probability = Probability> + ?Sized,
impl<'m, Symbol, Probability, Table, const PRECISION: usize> IterableEntropyModel<'m, PRECISION> for NonContiguousCategoricalDecoderModel<Symbol, Probability, Table, PRECISION>
type Iter = SymbolTableIter<Symbol, Probability, NonContiguousSymbolTable<&'m [(Probability, Symbol)]>>

fn symbol_table(&'m self) -> Self::Iter
fn floating_point_symbol_table<F>(
    &'m self
) -> FloatingPointSymbolTable<F, Self::Iter, PRECISION>
where
    F: From<Self::Probability>,
Similar to symbol_table, but yields both cumulatives and probabilities in floating point representation.