Struct FseTable

pub struct FseTable { /* private fields */ }

FSE decoding table.

The table size is always a power of 2, determined by the accuracy log: table_size = 1 << accuracy_log.
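As a minimal sketch of that relationship (the helper name `table_size` is hypothetical and not part of this crate), the size can be computed with a shift. The upper bound of 15 is taken from the limit documented for `build`.

```rust
/// Hypothetical helper: number of entries implied by an accuracy log.
/// Returns None above 15, the maximum accuracy log documented for `build`.
fn table_size(accuracy_log: u8) -> Option<usize> {
    if accuracy_log > 15 {
        return None;
    }
    // 2^accuracy_log entries
    Some(1usize << accuracy_log)
}
```

For example, an accuracy log of 5 gives 32 entries and an accuracy log of 9 gives 512.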

Implementations

impl FseTable

pub fn build( normalized_freqs: &[i16], accuracy_log: u8, max_symbol: u8, ) -> Result<Self>

Build an FSE decoding table from a normalized frequency distribution.

Arguments

normalized_freqs - Normalized frequency of each symbol (the values must sum to table_size, i.e. 1 << accuracy_log)
accuracy_log - Log2 of the table size (max 15)
max_symbol - Maximum symbol value

Returns

A built FSE decoding table.
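The requirement that the normalized frequencies sum to the table size can be checked before calling `build`. The sketch below is an illustration only: the function name is hypothetical, and it assumes that an entry of -1 marks a "less than 1" probability that still occupies one table slot, as in the zstd format. Whether `build` applies exactly this rule is not stated here.

```rust
/// Hypothetical pre-check: does the distribution fill 1 << accuracy_log slots?
/// Assumption: -1 counts as one slot (zstd "less than 1" probability);
/// any other negative value is rejected.
fn sums_to_table_size(normalized_freqs: &[i16], accuracy_log: u8) -> bool {
    if accuracy_log > 15 {
        return false;
    }
    let table_size = 1usize << accuracy_log;
    let mut total = 0usize;
    for &f in normalized_freqs {
        match f {
            -1 => total += 1,
            f if f >= 0 => total += f as usize,
            _ => return false,
        }
    }
    total == table_size
}
```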

pub fn from_predefined(distribution: &[i16], accuracy_log: u8) -> Result<Self>

Build a table using predefined distributions.

IMPORTANT: This uses the exact hardcoded predefined tables from zstd for bit-exact compatibility. The distribution parameter is used only to select which predefined table to use.

pub fn from_hardcoded_of() -> Result<Self>

Build the exact predefined Offset FSE table from zstd’s hardcoded values.

pub fn from_hardcoded_ll() -> Result<Self>

Build the exact predefined Literal Length FSE table from zstd’s hardcoded values.

pub fn from_hardcoded_ml() -> Result<Self>

Build the exact predefined Match Length FSE table from zstd’s hardcoded values.

Uses ML_PREDEFINED_TABLE with zstd’s exact (symbol, nbBits, baseline) values. Also populates seq_base and seq_extra_bits from ML_BASELINE_TABLE for direct sequence decoding.

This ensures compatibility with reference zstd decompression.

pub fn parse(data: &[u8], max_symbol: u8) -> Result<(Self, usize)>

Parse an FSE table from compressed data.

Returns the parsed table and number of bytes consumed.

Format (RFC 8878 Section 4.1.1)

First 4 bits: accuracy_log - 5 (so accuracy_log = stored value + 5)
Remaining bits: variable-length encoded symbol probabilities

Each probability is read with a variable number of bits that depends on how much probability remains to be assigned.
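The header step of this format can be sketched as follows. The function name is hypothetical and this is not the crate's parser; it only shows how the accuracy log is recovered from the low 4 bits of the first byte, as described in RFC 8878. Decoding the variable-length probabilities is left out.

```rust
/// Hypothetical helper: recover the accuracy log from the first header byte.
/// The low 4 bits store accuracy_log - 5.
fn read_accuracy_log(data: &[u8]) -> Option<u8> {
    let first = *data.first()?;
    Some((first & 0x0F) + 5)
}
```

A first byte of 0x34 has low bits 0x4, so the accuracy log is 9 and the table has 512 entries.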

pub fn size(&self) -> usize

Get the table size.

pub fn accuracy_log(&self) -> u8

Get the accuracy log.

pub fn decode(&self, state: usize) -> &FseTableEntry

Decode a symbol from the current state.
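To illustrate how a decoded entry is typically used, here is a sketch of one FSE decode step. `Entry` is a hypothetical stand-in for `FseTableEntry` (its real fields are not shown on this page); the field names follow the (symbol, nbBits, baseline) triple mentioned for the predefined tables. The caller is assumed to have already read `nb_bits` bits from the bitstream.

```rust
/// Hypothetical stand-in for `FseTableEntry`.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Entry {
    symbol: u8,
    nb_bits: u8,
    baseline: u16,
}

/// One decode step: look up the entry for `state`, emit its symbol, and
/// compute the next state as baseline + the bits read from the stream.
fn decode_step(table: &[Entry], state: usize, extra_bits: u16) -> (u8, usize) {
    let entry = table[state];
    // The value read must fit in nb_bits bits.
    debug_assert!(u32::from(extra_bits) < (1u32 << entry.nb_bits));
    (entry.symbol, usize::from(entry.baseline) + usize::from(extra_bits))
}
```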

pub fn state_mask(&self) -> usize

Get the initial state mask for decoding.

pub fn is_valid(&self) -> bool

Check if the table is valid.

A valid table has:

Non-empty entries
Valid accuracy log (1-15)
All symbols in valid range
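The three conditions above can be sketched as a single predicate. `TableView` and `looks_valid` are hypothetical names used only for illustration; the real table's fields are private, and "valid range" is assumed here to mean no symbol exceeds `max_symbol`.

```rust
/// Hypothetical view of the data the validity check needs.
struct TableView<'a> {
    symbols: &'a [u8], // symbol decoded by each table entry
    accuracy_log: u8,
    max_symbol: u8,
}

/// Non-empty entries, accuracy log in 1..=15, every symbol <= max_symbol.
fn looks_valid(t: &TableView) -> bool {
    !t.symbols.is_empty()
        && (1..=15).contains(&t.accuracy_log)
        && t.symbols.iter().all(|&s| s <= t.max_symbol)
}
```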

pub fn max_symbol(&self) -> u8

Get the maximum symbol value in this table.

pub fn is_rle_mode(&self) -> bool

Check if this table encodes RLE mode (single symbol only).

RLE mode is detected when all table entries decode to the same symbol. This is common for highly skewed distributions where one symbol dominates.
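A sketch of that detection rule, assuming the per-entry symbols are available as a slice (the function name is hypothetical, and treating an empty table as non-RLE is a choice made for this example):

```rust
/// Hypothetical check: true when every entry decodes to the same symbol.
/// An empty slice is treated as not RLE.
fn all_same_symbol(symbols: &[u8]) -> bool {
    match symbols.split_first() {
        Some((first, rest)) => rest.iter().all(|s| s == first),
        None => false,
    }
}
```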

pub fn from_frequencies( frequencies: &[u32], min_accuracy_log: u8, ) -> Result<(Self, Vec<i16>)>

Build an FSE table from symbol frequencies, automatically computing accuracy_log.

This normalizes frequencies to sum to a power of 2 (table_size).
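One simple way to normalize counts to a power-of-2 total is sketched below. This is not necessarily the algorithm used by `from_frequencies`: the function name is hypothetical, the accuracy log is passed in rather than computed automatically, and the rounding strategy (proportional share rounded down, at least one slot per present symbol, then a correction pass) is an assumption chosen for clarity.

```rust
/// Hypothetical normalization: scale counts so they sum to 1 << accuracy_log.
/// Every symbol with a nonzero count keeps at least one slot.
/// Limited to accuracy_log 1..=14 so every result fits in an i16.
fn normalize(frequencies: &[u32], accuracy_log: u8) -> Option<Vec<i16>> {
    if accuracy_log == 0 || accuracy_log > 14 {
        return None;
    }
    let table_size = 1u64 << accuracy_log;
    let total: u64 = frequencies.iter().map(|&f| u64::from(f)).sum();
    let used = frequencies.iter().filter(|&&f| f > 0).count() as u64;
    if total == 0 || used > table_size {
        return None;
    }

    // Proportional share, rounded down, but never below 1 for a present symbol.
    let mut norm: Vec<u64> = frequencies
        .iter()
        .map(|&f| if f == 0 { 0 } else { (u64::from(f) * table_size / total).max(1) })
        .collect();
    let mut sum: u64 = norm.iter().sum();

    // Too small: give the leftover slots to the most frequent symbol.
    if sum < table_size {
        let i = (0..norm.len()).max_by_key(|&i| frequencies[i])?;
        norm[i] += table_size - sum;
        sum = table_size;
    }
    // Too large (from the minimum-of-1 bumps): shrink the largest entries.
    while sum > table_size {
        let i = (0..norm.len()).max_by_key(|&i| norm[i])?;
        norm[i] -= 1;
        sum -= 1;
    }
    Some(norm.into_iter().map(|n| n as i16).collect())
}
```

Because the number of present symbols never exceeds the table size, the shrinking loop always finds an entry greater than 1 to reduce, so no present symbol drops to zero.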

pub fn from_frequencies_serializable( frequencies: &[u32], min_accuracy_log: u8, ) -> Result<(Self, Vec<i16>)>

Build an FSE table from symbol frequencies with serialization-safe normalization.

This variant ensures the normalized distribution can be serialized by padding with synthetic -1 symbols to avoid the “100% remaining” encoding limitation.

The key insight: FSE variable-length encoding cannot represent a probability equal to 100% of the remaining probability. Adding trailing -1 symbols ensures that remaining > last_probability at each step.

The synthetic symbols are never used during sequence encoding; they exist only to satisfy the serialization constraint.
