Module codes

Source
Expand description

Traits for reading and writing instantaneous codes.

This modules contains code for reading and writing instantaneous codes. Codewords are uniformly indexed from 0 for all codes. For example, the first few words of unary, γ, and δ codes are:

Argunaryγδ
0111
1010100100
20010110101
300010010001100
4000010010101101
50000010011001110
600000010011101111
700000001000100000100000

If you need to encode signed integers, please use the ToInt and ToNat traits, which provide a bijection between signed integers and natural numbers.

Each code is implemented as a pair of traits for reading and writing (e.g., GammaReadParam and GammaWriteParam). The traits for reading depend on BitRead, whereas the traits for writing depend on BitWrite. Note that most codes cannot write the number u64::MAX because of overflow issues, which could be avoided with tests, but at the price of a significant performance drop.

The traits ending with Param make it possible to specify parameters—for example, whether to use decoding tables. Usually, one would instead pull in scope non-parametric traits such as GammaRead and GammaWrite, for which defaults are provided using the mechanism described in the params module.

Note that if you are using decoding tables, you must ensure that the peek_bits method of your BitRead implementation returns a sufficient number of bits: if it does not, an assertion will be triggered in test mode, but behavior will be unpredictable otherwise. This is unfortunately difficult to check statically. To stay on the safe side, we recommend to use a an implementation that is able to peek at least at 16 bits.

§Big-endian vs. little-endian

As discussed in the traits module, in general reversing the bits of a big-endian bit stream will not yield a little-endian bit stream containing the same sequence of fixed-width integers. The same is true for codes, albeit the situation is more complex.

The only code that can be safely reversed is the unary code. All other codes contain some value, and that value is written without reversing its bits. Thus, reversing the bits of a big-endian bit stream containing a sequence of instantaneous codes will not yield a little-endian bit stream containing the same sequence of codes (again, with the exception of unary codes). Technically, the codes written for the little-endian case are different than those written for the big-endian case.

For example, the γ code of 4 is 00101 in big-endian order, but it is 01100 in little-endian order, so that upon reading the unary code for 2 we can read the 01 part without a bit reversal.

The case of minimal binary codes is even more convoluted: for example, the code with upper bound 7 has codewords 00, 010, 011, 100, 101, 110, and 111. To decode such a code without peeking at more bits than necessary, one first reads two bits, and then decides, based on their value, whether to read a further bit and add it on the right. But this means that we have to encode 2 as 011 in the big-endian case, and as 101 in the little-endian case, because we need to read the first two bits to decide whether to read the third one.

In some cases, we resort to completely ad hoc solutions: for example, in the case of the ω code, for the little-endian case instead of reversing the bits written at each recursive call (which in principle would be necessary), we simply rotate them to the left by one position, exposing the most significant bit as first bit. This is sufficient to make the decoding possible, and the rotation is a much faster operation than bit reversal.

§Dispatch

The basic method for accessing code is through traits like GammaRead and GammaWrite. This approach, however, forces a choice of code in the source. To pass a choice of code dynamically, please have a look at the dispatch module.

Re-exports§

pub use gamma::len_gamma;
pub use gamma::len_gamma_param;
pub use gamma::GammaRead;
pub use gamma::GammaReadParam;
pub use gamma::GammaWrite;
pub use gamma::GammaWriteParam;
pub use delta::len_delta;
pub use delta::len_delta_param;
pub use delta::DeltaRead;
pub use delta::DeltaReadParam;
pub use delta::DeltaWrite;
pub use delta::DeltaWriteParam;
pub use omega::len_omega;
pub use omega::OmegaRead;
pub use omega::OmegaWrite;
pub use minimal_binary::len_minimal_binary;
pub use minimal_binary::MinimalBinaryRead;
pub use minimal_binary::MinimalBinaryWrite;
pub use zeta::len_zeta;
pub use zeta::len_zeta_param;
pub use zeta::ZetaRead;
pub use zeta::ZetaReadParam;
pub use zeta::ZetaWrite;
pub use zeta::ZetaWriteParam;
pub use pi::len_pi;
pub use pi::PiRead;
pub use pi::PiWrite;
pub use golomb::len_golomb;
pub use golomb::GolombRead;
pub use golomb::GolombWrite;
pub use rice::len_rice;
pub use rice::RiceRead;
pub use rice::RiceWrite;
pub use exp_golomb::len_exp_golomb;
pub use exp_golomb::ExpGolombRead;
pub use exp_golomb::ExpGolombWrite;
pub use vbyte::*;

Modules§

delta
Elias δ code.
exp_golomb
Exponential Golomb codes.
gamma
Elias γ code.
golomb
Golomb codes.
minimal_binary
Minimal binary codes.
omega
Elias ω code.
params
Mechanisms for selecting parameters.
pi
Streamlined Apostolico−Drovandi π codes
rice
Rice codes.
vbyte
Variable-length byte codes.
zeta
Boldi−Vigna ζ codes.

Traits§

ToInt
Extension trait mapping natural numbers bijectively to integers.
ToNat
Extension trait mapping signed integers bijectively to natural numbers.