§Compressed IntVec Module
This module provides a compressed vector of integers that leverages bit-level encoding to efficiently store a sequence of unsigned 64-bit integers.
§Overview
The core data structure, IntVec, maintains a compressed bitstream along with sampling offsets, which enable fast random access to individual elements without the need to decode the entire stream.
The module supports two variants based on endianness:
- Big-Endian (BEIntVec)
- Little-Endian (LEIntVec)
Both variants work with codecs that implement the Codec trait, allowing flexible and configurable encoding/decoding strategies. Codecs may optionally accept extra runtime parameters to tune the compression.
§Key Features
- Efficient Storage: Compresses integer sequences into a compact bitstream.
- Random Access: Uses periodic sampling (every k-th element) to jump-start decompression.
- Generic Codec Support: Works with any codec implementing the Codec trait.
- Endian Flexibility: Supports both big-endian and little-endian representations.
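The random-access feature can be illustrated with a minimal, self-contained sketch. It is not the crate's implementation: SampledVec, gamma_encode, and gamma_decode are hypothetical names, the bitstream is a Vec<bool> rather than packed 64-bit words, and Elias gamma coding (applied to value + 1 so that zero is encodable) stands in for whichever codec is actually used. The point is only to show how storing a bit offset for every k-th element bounds the work done by a lookup to at most k decodes.

```rust
/// Append the Elias gamma code of `n` (n >= 1) to `bits`:
/// `len - 1` zeros followed by the `len`-bit binary form of `n`.
fn gamma_encode(n: u64, bits: &mut Vec<bool>) {
    assert!(n >= 1, "gamma codes are defined for n >= 1");
    let len = 64 - n.leading_zeros(); // number of significant bits
    for _ in 1..len {
        bits.push(false);
    }
    for i in (0..len).rev() {
        bits.push((n >> i) & 1 == 1);
    }
}

/// Decode one gamma code starting at `*pos` and advance `*pos` past it.
/// Assumes the stream is well formed (it was produced by `gamma_encode`).
fn gamma_decode(bits: &[bool], pos: &mut usize) -> u64 {
    let mut zeros = 0u32;
    while !bits[*pos] {
        zeros += 1;
        *pos += 1;
    }
    let mut n = 0u64;
    for _ in 0..=zeros {
        n = (n << 1) | bits[*pos] as u64;
        *pos += 1;
    }
    n
}

/// Hypothetical stand-in for a sampled compressed vector.
struct SampledVec {
    bits: Vec<bool>,     // compressed stream, one bool per bit for clarity
    samples: Vec<usize>, // bit offset of elements 0, k, 2k, ...
    k: usize,            // sampling period
    len: usize,          // number of stored values
}

impl SampledVec {
    /// Assumes every value is smaller than `u64::MAX` (values are shifted by one).
    fn new(values: &[u64], k: usize) -> Self {
        assert!(k > 0, "sampling period must be positive");
        let mut bits = Vec::new();
        let mut samples = Vec::new();
        for (i, &v) in values.iter().enumerate() {
            if i % k == 0 {
                samples.push(bits.len()); // remember where this element starts
            }
            gamma_encode(v + 1, &mut bits);
        }
        SampledVec { bits, samples, k, len: values.len() }
    }

    /// Jump to the nearest preceding sample, then decode forward.
    fn get(&self, index: usize) -> Option<u64> {
        if index >= self.len {
            return None;
        }
        let mut pos = self.samples[index / self.k];
        let mut value = 0;
        for _ in 0..=(index % self.k) {
            value = gamma_decode(&self.bits, &mut pos) - 1;
        }
        Some(value)
    }
}
```

In this sketch a smaller k means more stored offsets and fewer decodes per lookup, while a larger k saves space at the cost of longer forward scans.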
§Components
- IntVec: The main structure containing compressed data, sample offsets, codec parameters, and metadata. You don’t need to interact with this directly.
- BEIntVec / LEIntVec: Type aliases for endianness-specific versions of IntVec.
- Iterators: BEIntVecIter and LEIntVecIter decode values on the fly when iterated.
§Usage Examples
§Creating a Big-Endian Compressed Vector
```rust
use compressed_intvec::intvec::BEIntVec;
use compressed_intvec::codecs::ExpGolombCodec;

// Define a vector of unsigned 64-bit integers.
let input = vec![1, 5, 3, 1991, 42];

// Create a Big-Endian compressed vector using ExpGolombCodec with a parameter (e.g., 3)
// and sample every 2 elements.
let intvec = BEIntVec::<ExpGolombCodec>::from_with_param(&input, 2, 3);

// Retrieve a specific element by its index.
let value = intvec.get(3);
assert_eq!(value, Some(1991));

// Decode the entire compressed vector back to its original form.
let decoded = intvec.into_vec();
assert_eq!(decoded, input);
```
§Creating a Little-Endian Compressed Vector
```rust
use compressed_intvec::intvec::LEIntVec;
use compressed_intvec::codecs::GammaCodec;

// Define a vector of unsigned 64-bit integers.
let input = vec![10, 20, 30, 40, 50];

// Create a Little-Endian compressed vector using GammaCodec without extra codec parameters,
// sampling every 2 elements.
let intvec = LEIntVec::<GammaCodec>::from(&input, 2);
assert_eq!(intvec.get(2), Some(30));
```
§Design Details
- Bitstream Storage: The compressed data is stored as a vector of 64-bit words (Vec<u64>).
- Sampling Strategy: To support fast random access, sample offsets (in bits) are stored for every k-th integer.
- Codec Abstraction: The module is codec-agnostic; any codec conforming to the Codec trait can be used.
- Endian Handling: The endianness of the encoding/decoding process is managed through phantom types, enabling both big-endian and little-endian variants.
§Module Structure and Extensibility
The module’s API provides constructors (from_with_param and from), element access (get), full decoding (into_vec), and iteration (iter). It can be extended with new codecs by implementing the Codec trait for additional compression methods or parameters.
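To make the extension point more concrete, here is a sketch of what a codec abstraction with an optional runtime parameter could look like. The trait SketchCodec and its method signatures are assumptions made for this example and are not the crate's actual Codec trait, whose definition should be checked in its own documentation. Two toy codecs are shown: Unary, which takes no parameter, and FixedWidth, whose parameter is the bit width (values that do not fit in that width are silently truncated in this sketch).

```rust
/// Hypothetical codec interface; the crate's real `Codec` trait may differ.
trait SketchCodec {
    /// Extra runtime parameter (use `()` when none is needed).
    type Params: Copy;
    fn encode(value: u64, params: Self::Params, out: &mut Vec<bool>);
    fn decode(bits: &[bool], pos: &mut usize, params: Self::Params) -> Option<u64>;
}

/// Unary code: `value` zeros followed by a terminating one.
struct Unary;

impl SketchCodec for Unary {
    type Params = ();

    fn encode(value: u64, _: (), out: &mut Vec<bool>) {
        for _ in 0..value {
            out.push(false);
        }
        out.push(true);
    }

    fn decode(bits: &[bool], pos: &mut usize, _: ()) -> Option<u64> {
        let mut n = 0;
        loop {
            let bit = *bits.get(*pos)?; // None if the stream ends early
            *pos += 1;
            if bit {
                return Some(n);
            }
            n += 1;
        }
    }
}

/// Fixed-width binary code; the parameter is the width in bits (1..=64).
struct FixedWidth;

impl SketchCodec for FixedWidth {
    type Params = u32;

    fn encode(value: u64, width: u32, out: &mut Vec<bool>) {
        for i in (0..width).rev() {
            out.push((value >> i) & 1 == 1);
        }
    }

    fn decode(bits: &[bool], pos: &mut usize, width: u32) -> Option<u64> {
        let mut n = 0u64;
        for _ in 0..width {
            let bit = *bits.get(*pos)?;
            n = (n << 1) | bit as u64;
            *pos += 1;
        }
        Some(n)
    }
}

/// Encode all values with codec `C`, then decode the stream back.
fn round_trip<C: SketchCodec>(values: &[u64], params: C::Params) -> Vec<u64> {
    let mut bits = Vec::new();
    for &v in values {
        C::encode(v, params, &mut bits);
    }
    let mut pos = 0;
    let mut out = Vec::new();
    while pos < bits.len() {
        match C::decode(&bits, &mut pos, params) {
            Some(v) => out.push(v),
            None => break,
        }
    }
    out
}
```

Code that is generic over the trait, such as round_trip here, works unchanged with any new codec, which is the property the module relies on for extensibility.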
§Error Handling
The current implementation assumes that errors in encoding/decoding are exceptional and uses .unwrap() in places where failure is unexpected. For production code, you might consider propagating errors instead of panicking.
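For readers who prefer propagation over panicking, the following sketch shows the general pattern with a toy unary decoder. DecodeError, read_bit, read_unary, and decode_all are hypothetical names invented for this example and do not exist in the crate; the sketch only demonstrates returning Result and using the ? operator where .unwrap() would otherwise appear.

```rust
use std::fmt;

/// Hypothetical error type for a truncated or malformed bitstream.
#[derive(Debug, PartialEq)]
enum DecodeError {
    UnexpectedEnd { bit_pos: usize },
}

impl fmt::Display for DecodeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DecodeError::UnexpectedEnd { bit_pos } => {
                write!(f, "bitstream ended unexpectedly at bit {}", bit_pos)
            }
        }
    }
}

impl std::error::Error for DecodeError {}

/// Read one bit, reporting an error instead of panicking on overrun.
fn read_bit(bits: &[bool], pos: &mut usize) -> Result<bool, DecodeError> {
    let bit = *bits
        .get(*pos)
        .ok_or(DecodeError::UnexpectedEnd { bit_pos: *pos })?;
    *pos += 1;
    Ok(bit)
}

/// Decode a unary code (zeros terminated by a one); errors bubble up via `?`.
fn read_unary(bits: &[bool], pos: &mut usize) -> Result<u64, DecodeError> {
    let mut n = 0;
    while !read_bit(bits, pos)? {
        n += 1;
    }
    Ok(n)
}

/// Decode `count` values, stopping at the first error.
fn decode_all(bits: &[bool], count: usize) -> Result<Vec<u64>, DecodeError> {
    let mut pos = 0;
    (0..count).map(|_| read_unary(bits, &mut pos)).collect()
}
```

The caller can then decide whether a truncated stream is fatal, recoverable, or worth reporting with the failing bit position.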
§Getting Started
- Choose or implement a codec that satisfies the Codec trait requirements.
- Use the provided constructors to compress a vector of integers.
- Leverage the efficient sampling mechanism for fast random access, or decode the full content when needed.
For more details, refer to the documentation of the Codec trait and the respective codec implementations.
Structs§
- BEIntVecIter: Iterator over the values stored in a BEIntVec. The iterator decodes values on the fly.
- IntVec: A compressed vector of integers.
- LEIntVecIter: Iterator over the values stored in a LEIntVec.