§A compressed, randomly accessible vector of u64 integers.
This module provides the core implementation of IntVec, a data structure designed for space-efficient storage and fast random access of u64 integer sequences. It achieves compression by leveraging a variety of instantaneous codes from the dsi-bitstream crate, which encode integers into a variable-length bitstream.
§Core Functionality
- Compression: Employs codecs like Gamma (γ), Delta (δ), and Zeta (ζ) for skewed data, and a highly efficient FixedLength encoding for uniform data with a small range.
- Random Access: For variable-length codes, it uses a sampling mechanism to provide fast random access. The sampling rate, k, determines the trade-off between access speed and memory overhead. For FixedLength encoding, access is a true O(1) operation.
- Flexible Construction: Provides a builder API that can construct an IntVec from a slice (with automatic codec selection) or an iterator (for large datasets, requiring manual parameter specification).
- High-Performance Lookups: Offers optimized methods for various access patterns, including a reusable IntVecReader for dynamic lookups, and efficient batch methods like get_many and par_get_many.
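The sampling idea behind random access over variable-length codes can be sketched in a few lines. The sketch below is not the crate's implementation: it swaps the bit-level γ/δ/ζ codes for a byte-oriented LEB128 varint to stay short, and the type `SampledVarintVec` and its methods are hypothetical names introduced only for illustration. It assumes that a sample is the stream offset of every k-th element, so a lookup jumps to the nearest preceding sample and decodes at most k values.

```rust
/// Illustrative stand-in for a sampled variable-length vector.
/// Hypothetical type: it is NOT part of compressed_intvec.
struct SampledVarintVec {
    bytes: Vec<u8>,      // LEB128-encoded values, stored back to back
    samples: Vec<usize>, // byte offset of elements 0, k, 2k, ...
    k: usize,            // sampling rate
    len: usize,          // number of stored values
}

impl SampledVarintVec {
    /// Encodes `data`, recording the byte offset of every k-th element.
    fn new(data: &[u64], k: usize) -> Self {
        assert!(k > 0, "sampling rate must be positive");
        let mut bytes = Vec::new();
        let mut samples = Vec::new();
        for (i, &value) in data.iter().enumerate() {
            if i % k == 0 {
                samples.push(bytes.len());
            }
            // LEB128: 7 payload bits per byte, high bit set on all but the last byte.
            let mut v = value;
            loop {
                let low = (v & 0x7f) as u8;
                v >>= 7;
                if v == 0 {
                    bytes.push(low);
                    break;
                }
                bytes.push(low | 0x80);
            }
        }
        Self { bytes, samples, k, len: data.len() }
    }

    /// Decodes one value starting at `pos`; returns it with the next position.
    fn decode_at(&self, mut pos: usize) -> (u64, usize) {
        let mut value = 0u64;
        let mut shift = 0u32;
        loop {
            let byte = self.bytes[pos];
            pos += 1;
            value |= u64::from(byte & 0x7f) << shift;
            if byte & 0x80 == 0 {
                return (value, pos);
            }
            shift += 7;
        }
    }

    /// Jumps to the nearest sample, skips `index % k` values, decodes the target.
    fn get(&self, index: usize) -> Option<u64> {
        if index >= self.len {
            return None;
        }
        let mut pos = self.samples[index / self.k];
        for _ in 0..(index % self.k) {
            pos = self.decode_at(pos).1;
        }
        Some(self.decode_at(pos).0)
    }
}
```

In this sketch a smaller k stores more offsets (one per k elements) but shortens the sequential decode per lookup, which is the speed versus memory trade-off described above.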
The main struct, IntVec, is generic over Endianness, allowing users to choose between Little-Endian (LEIntVec) and Big-Endian (BEIntVec) representations to optimize for specific hardware architectures.
§Example
use compressed_intvec::prelude::*;
// A small vector of integers to be compressed.
let data: &[u64] = &[40, 200, 0, 50, 13, 90, 1023];
// Use the builder to create an IntVec.
// `CodecSpec::Auto` will analyze the data and select the best codec.
let intvec = LEIntVec::builder(data)
    .k(2) // Use a small sampling rate for this vector.
    .codec(CodecSpec::Auto)
    .build()
    .unwrap();
// Verify the length and access some elements.
assert_eq!(intvec.len(), data.len());
assert_eq!(intvec.get(1), Some(200));
assert_eq!(intvec.get(6), Some(1023));
Alternatively, we can use a fixed-length encoding:
use compressed_intvec::prelude::*;
// A small vector of integers to be compressed.
let data: &[u64] = &[40, 200, 0, 50, 13, 90, 1023];
// Use the builder to create an IntVec with fixed-length encoding.
// Using `None` for `num_bits` will automatically select the best bit width (in this case, 10 bits).
let intvec = LEIntVec::builder(data)
    .codec(CodecSpec::FixedLength { num_bits: None })
    .build()
    .unwrap();
// Verify the length and access some elements.
assert_eq!(intvec.len(), data.len());
assert_eq!(intvec.get(1), Some(200));
assert_eq!(intvec.get(6), Some(1023));
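To see why 10 bits suffice for this data and why fixed-length access is constant time, here is a minimal standalone sketch. It does not use the crate and may differ from its internals: the functions `min_bits`, `pack`, and `get_packed` are hypothetical names, and the choice to use at least 1 bit for all-zero input is an assumption of this sketch. Each value occupies exactly `num_bits` bits, so element `i` starts at bit `i * num_bits` and can be read without decoding anything before it.

```rust
/// Smallest bit width that can hold every value in `data`.
/// Assumption of this sketch: at least 1 bit is used, even for all-zero input.
fn min_bits(data: &[u64]) -> u32 {
    let max = data.iter().copied().max().unwrap_or(0);
    (u64::BITS - max.leading_zeros()).max(1)
}

/// Packs values back to back, `num_bits` bits each, into 64-bit words.
/// Every value is assumed to fit in `num_bits` bits.
fn pack(data: &[u64], num_bits: u32) -> Vec<u64> {
    let total_bits = data.len() * num_bits as usize;
    let mut words = vec![0u64; (total_bits + 63) / 64];
    for (i, &value) in data.iter().enumerate() {
        let bit = i * num_bits as usize;
        let (word, offset) = (bit / 64, (bit % 64) as u32);
        words[word] |= value << offset;
        // Spill the high bits into the next word when the value straddles a boundary.
        if offset + num_bits > 64 {
            words[word + 1] |= value >> (64 - offset);
        }
    }
    words
}

/// Reads element `index` with a fixed amount of work.
/// The caller must ensure `index` is within the packed length.
fn get_packed(words: &[u64], num_bits: u32, index: usize) -> u64 {
    let bit = index * num_bits as usize;
    let (word, offset) = (bit / 64, (bit % 64) as u32);
    let mut value = words[word] >> offset;
    if offset + num_bits > 64 {
        value |= words[word + 1] << (64 - offset);
    }
    if num_bits < 64 {
        value & ((1u64 << num_bits) - 1)
    } else {
        value
    }
}
```

For the data above, the largest value is 1023, which needs 10 bits, so the seven values occupy 70 bits in total (two 64-bit words in this sketch), and the last element straddles the word boundary.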
Structs§
- IntVec: A compressed, randomly accessible vector of u64 integers.
- IntVecBuilder: A builder for creating an IntVec from a slice (&[u64]).
- IntVecFromIterBuilder: A builder for creating an IntVec from an iterator.
- IntVecIter: An iterator over the decompressed u64 values of an IntVec.
- IntVecReader: A stateful reader for an IntVec that provides fast random access.
Enums§
- IntVecError: Defines the set of errors that can occur in IntVec operations.