Expand description
Traits for reading and writing instantaneous codes.
This module contains code for reading and writing instantaneous codes. Codewords are uniformly indexed from 0 for all codes. For example, the first few words of unary, γ, and δ codes are:
| Arg | unary | γ | δ |
|---|---|---|---|
| 0 | 1 | 1 | 1 |
| 1 | 01 | 010 | 0100 |
| 2 | 001 | 011 | 0101 |
| 3 | 0001 | 00100 | 01100 |
| 4 | 00001 | 00101 | 01101 |
| 5 | 000001 | 00110 | 01110 |
| 6 | 0000001 | 00111 | 01111 |
| 7 | 00000001 | 0001000 | 00100000 |
If you need to encode signed integers, please use the ToInt and
ToNat traits, which provide a bijection between signed integers and
natural numbers.
Each code is implemented as a pair of traits for reading and writing (e.g.,
GammaReadParam and
GammaWriteParam). The traits for reading depend
on BitRead, whereas the traits for writing
depend on BitWrite. Note that most codes cannot
write the number u64::MAX because of overflow issues, which could be
avoided with tests, but at the price of a significant performance drop.
The traits ending with Param make it possible to specify parameters—for
example, whether to use decoding tables. Usually, one would instead pull
into scope non-parametric traits such as GammaRead and GammaWrite,
for which defaults are provided using the mechanism described in the
params module.
§Big-endian vs. little-endian
As discussed in the traits module, in general reversing the bits of a big-endian bit stream will not yield a little-endian bit stream containing the same sequence of fixed-width integers. The same is true for codes, albeit the situation is more complex.
The only code that can be safely reversed is the unary code. All other codes contain some value, and that value is written without reversing its bits. Thus, reversing the bits of a big-endian bit stream containing a sequence of instantaneous codes will not yield a little-endian bit stream containing the same sequence of codes (again, with the exception of unary codes). Technically, the codes written for the little-endian case are different from those written for the big-endian case.
For example, the γ code of 4 is 00101 in big-endian order, but it
is 01100 in little-endian order, so that upon reading the unary code for 2
we can read the 01 part without a bit reversal.
The case of minimal binary codes is even more convoluted:
for example, the code with upper bound 7 has codewords 00, 010, 011,
100, 101, 110, and 111. To decode such a code without peeking at
more bits than necessary, one first reads two bits, and then decides, based
on their value, whether to read a further bit and add it on the right. But
this means that we have to encode 2 as 011 in the big-endian case, and as
101 in the little-endian case, because we need to read the first two bits
to decide whether to read the third one.
In some cases, we resort to completely ad hoc solutions: for example, in the case of the ω code, for the little-endian case instead of reversing the bits written at each recursive call (which in principle would be necessary), we simply rotate them to the left by one position, exposing the most significant bit as first bit. This is sufficient to make the decoding possible, and the rotation is a much faster operation than bit reversal.
§Dispatch
The basic method for accessing codes is through traits like
GammaRead and GammaWrite. This approach, however, forces a choice of code in the
source. To pass a choice of code dynamically, please have a look at the
dispatch module.
Re-exports§
pub use gamma::GammaRead;pub use gamma::GammaWrite;pub use gamma::len_gamma;pub use delta::DeltaRead;pub use delta::DeltaWrite;pub use delta::len_delta;pub use omega::OmegaRead;pub use omega::OmegaWrite;pub use omega::len_omega;pub use minimal_binary::MinimalBinaryRead;pub use minimal_binary::MinimalBinaryWrite;pub use minimal_binary::len_minimal_binary;pub use zeta::ZetaRead;pub use zeta::ZetaWrite;pub use zeta::len_zeta;pub use pi::PiRead;pub use pi::PiWrite;pub use pi::len_pi;pub use golomb::GolombRead;pub use golomb::GolombWrite;pub use golomb::len_golomb;pub use rice::RiceRead;pub use rice::RiceWrite;pub use rice::len_rice;pub use exp_golomb::ExpGolombRead;pub use exp_golomb::ExpGolombWrite;pub use exp_golomb::len_exp_golomb;pub use vbyte::VByteBeRead;pub use vbyte::VByteBeWrite;pub use vbyte::VByteLeRead;pub use vbyte::VByteLeWrite;pub use vbyte::bit_len_vbyte;pub use vbyte::byte_len_vbyte;pub use vbyte::vbyte_read;pub use vbyte::vbyte_read_be;pub use vbyte::vbyte_read_le;pub use vbyte::vbyte_write;pub use vbyte::vbyte_write_be;pub use vbyte::vbyte_write_le;
Modules§
- delta
- Elias δ code.
- exp_
golomb - Exponential Golomb codes.
- gamma
- Elias γ code.
- golomb
- Golomb codes.
- minimal_
binary - Minimal binary codes.
- omega
- Elias ω code.
- params
- Mechanisms for selecting parameters.
- pi
- Streamlined Apostolico–Drovandi π codes.
- rice
- Rice codes.
- vbyte
- Variable-length byte codes.
- zeta
- Boldi–Vigna ζ codes.