Module codec_spec

§Codec Specification and Strategy Selection

This module defines the mechanisms for selecting and configuring the compression strategy for an IntVec. The choice of codec is critical, as its effectiveness is highly dependent on the statistical properties of the data being compressed.

§Encoding Strategies

The library supports two fundamental encoding families:

  1. Variable-Length Instantaneous Codes: Sourced from the dsi-bitstream crate, these codes (e.g., Gamma, Delta, Zeta) assign shorter codewords to smaller integers. They compress well when small values are also the most frequent ones, which makes them a good fit for skewed data distributions.

  2. Fixed-Width Integer Encoding: This strategy uses the same number of bits for every integer. It suits data that is roughly uniformly distributed within a known range, and it makes random access very cheap, because the position of the i-th element follows directly from i and the bit width.
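
To make the trade-off between the two families concrete, the following sketch compares the total number of bits each would need for a slice of values. It is an illustration only and does not use this crate or dsi-bitstream: the helper names (gamma_len, fixed_width, total_gamma_bits, total_fixed_bits) are hypothetical, and the gamma length formula assumes the common convention of coding n + 1 so that zero is representable.

```rust
/// Bits needed by an Elias gamma codeword for the natural number `n`.
/// Assumption: the code is applied to `n + 1`, so `n = 0` costs one bit.
fn gamma_len(n: u64) -> u32 {
    let v = n.checked_add(1).expect("n must be below u64::MAX");
    // floor(log2(v)) for v >= 1
    let log2 = 63 - v.leading_zeros();
    // `log2` bits of unary prefix, one stop bit, `log2` bits of payload
    2 * log2 + 1
}

/// Smallest fixed width (at least 1 bit) that can hold every value.
fn fixed_width(values: &[u64]) -> u32 {
    let max = values.iter().copied().max().unwrap_or(0);
    (64 - max.leading_zeros()).max(1)
}

/// Total bits if every value is gamma-coded.
fn total_gamma_bits(values: &[u64]) -> u64 {
    values.iter().map(|&n| u64::from(gamma_len(n))).sum()
}

/// Total bits if every value uses the same minimal width.
fn total_fixed_bits(values: &[u64]) -> u64 {
    u64::from(fixed_width(values)) * values.len() as u64
}
```

On a skewed sample such as [0, 1, 0, 2, 1, 0, 0, 1000], the gamma total is well below the fixed-width total, because a single large value forces ten bits onto every element in the fixed-width case. On values clustered near the top of a range, such as [900, 1000, 950, 1023], the ordering reverses.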

§The CodecSpec Enum

The primary user-facing API for this module is the CodecSpec enum. It provides a high-level interface for specifying the desired compression strategy, allowing for:

  • Direct selection of a parameter-free codec (e.g., Gamma).
  • Explicit parameterization of tunable codecs (e.g., Zeta { k: Some(3) }).
  • Automatic parameter selection, where the library analyzes the data to find the optimal configuration (e.g., Auto or FixedLength { num_bits: None }).

The resolve_codec function translates a user’s CodecSpec into a concrete Encoding variant that the IntVec can use for its internal operations.
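
The resolution step can be pictured with a small stand-in model. The sketch below does not reproduce the crate's actual types or the signature of resolve_codec: SpecSketch, EncodingSketch, and resolve_sketch are hypothetical names, the field types are assumed, the fallback k = 3 for an unspecified Zeta parameter is an arbitrary placeholder, and the Auto rule (pick the cheaper of gamma and fixed width) is only one plausible heuristic.

```rust
/// Hypothetical stand-in for a user-facing specification.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SpecSketch {
    Gamma,
    Zeta { k: Option<u64> },
    FixedLength { num_bits: Option<u64> },
    Auto,
}

/// Hypothetical stand-in for a fully resolved encoding (no `Option`s left).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EncodingSketch {
    Gamma,
    Zeta { k: u64 },
    FixedLength { num_bits: u64 },
}

/// Smallest width (at least 1 bit) that can hold every value in `data`.
fn min_width(data: &[u64]) -> u64 {
    let max = data.iter().copied().max().unwrap_or(0);
    u64::from(64 - max.leading_zeros()).max(1)
}

/// Total gamma-coded size, assuming each value `n` is coded as `n + 1`.
fn gamma_bits(data: &[u64]) -> u64 {
    data.iter()
        .map(|&n| {
            let v = n.checked_add(1).expect("n must be below u64::MAX");
            u64::from(2 * (63 - v.leading_zeros()) + 1)
        })
        .sum()
}

/// Turns a specification into a concrete encoding by filling in any
/// parameter the caller left as `None`, using the data where needed.
fn resolve_sketch(spec: SpecSketch, data: &[u64]) -> EncodingSketch {
    match spec {
        SpecSketch::Gamma => EncodingSketch::Gamma,
        // Placeholder default; a real implementation would estimate k from the data.
        SpecSketch::Zeta { k } => EncodingSketch::Zeta { k: k.unwrap_or(3) },
        SpecSketch::FixedLength { num_bits } => EncodingSketch::FixedLength {
            num_bits: num_bits.unwrap_or_else(|| min_width(data)),
        },
        SpecSketch::Auto => {
            let width = min_width(data);
            let fixed_bits = width * data.len() as u64;
            if gamma_bits(data) < fixed_bits {
                EncodingSketch::Gamma
            } else {
                EncodingSketch::FixedLength { num_bits: width }
            }
        }
    }
}
```

In this model, resolve_sketch(SpecSketch::FixedLength { num_bits: None }, &[3, 7, 200]) yields a fixed width of 8 bits, since 200 fits in one byte, while an explicit parameter such as Zeta { k: Some(3) } passes through unchanged. Keeping the resolved type free of Option values means downstream code never has to handle an unresolved parameter.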

Enums§

CodecSpec
Specifies the compression codec and its parameters for an IntVec.
Encoding
Represents the chosen encoding strategy for an IntVec.

Functions§

resolve_codec
Resolves a user-provided CodecSpec into a concrete Encoding variant.