Expand description
A generic, compressed, and randomly accessible vector with fixed-width encoding.
This module provides FixedVec
and its thread-safe counterpart, AtomicFixedVec
,
two data structures for storing sequences of integers where each element is encoded
using the same number of bits. This strategy, known as fixed-width encoding,
is suitable for data where values are bounded within a known range, as it
allows for O(1) random access by directly calculating the memory location of any element.
§Core Concepts
The design is generic over four parameters: FixedVec<T, W, E, B>
.
T
: The element type as seen by the user (e.g.,u32
,i16
), constrained by theStorable
trait.W
: The underlying unsigned integer type used for storage (e.g.,u64
,usize
), constrained by theWord
trait.E
: TheEndianness
(e.g.,LE
orBE
) for bit-level operations.B
: The backing buffer, which abstracts over ownership. This allowsFixedVec
to be an owned container (e.g.,Vec<W>
) or a zero-copy, borrowed view (e.g.,&[W]
).
§Immutability and Zero-Copy Views
Immutable access is performed in O(1) time by calculating an element’s bit-level offset.
Structures like FixedVecSlice
provide zero-copy views into a vector, representing
a sub-region of the data without requiring data duplication or new allocations.
§Mutability via Proxy Objects
Direct mutable references (&mut T
) to individual bit-packed elements are not possible, as
elements do not align with byte boundaries and may not even exist as discrete entities in memory.
Instead, mutability is managed through the proxy object pattern. Methods like at_mut
return a temporary proxy (MutProxy
) that holds a decoded copy of the value.
Modifications are applied to this copy, and the value is automatically encoded and written
back into the vector’s bitstream when the proxy object goes out of scope (i.e., is dropped).
§Core Data Structures
FixedVec
: The primary implementation for single-threaded contexts.AtomicFixedVec
: A thread-safe variant for concurrent applications. It provides an API analogous to Rust’s standard atomic types (load
,store
,fetch_add
, etc.). It uses lock-free atomic instructions for elements contained within a single machine word and a fine-grained locking strategy (lock striping) for elements that span word boundaries. This hybrid approach ensures thread safety for anybit-width
configuration.
§Main Components
FixedVec
andAtomicFixedVec
BitWidth
: An enum to control the bit-width selection strategy. Options include:BitWidth::Minimal
: Selects the minimal bit-width required.BitWidth::PowerOfTwo
: Rounds up to the nearest power of two.BitWidth::Explicit
: Allows specifying a fixed bit-width.
- Builders:
FixedVecBuilder
andFixedVecFromIterBuilder
- Slices:
FixedVecSlice
for creating immutable or mutable views.
§Examples
Create a FixedVec
from a slice of data. The builder will
automatically determine the minimal number of bits required.
use compressed_intvec::fixed::{FixedVec, UFixedVec};
// The numbers 0-7 can all be represented in 3 bits.
let data: Vec<u32> = (0..8).collect();
// The builder infers that `bit_width` should be 3.
let vec: UFixedVec<u32> = FixedVec::builder()
.build(&data)
.unwrap();
assert_eq!(vec.len(), 8);
assert_eq!(vec.bit_width(), 3);
assert_eq!(vec.get(5), Some(5));
§Storing Signed Integers
FixedVec
can store signed integers. The underlying storage uses zig-zag encoding,
which maps small negative and positive numbers to small unsigned integers.
use compressed_intvec::fixed::{FixedVec, SFixedVec};
// The values range from -2 to 1. Zig-zag encoding maps these to
// unsigned values, so the maximum value is 3, which
// requires 2 bits.
let data: &[i16] = &[-2, -1, 0, 1];
let vec: SFixedVec<i16> = FixedVec::builder().build(data).unwrap();
assert_eq!(vec.bit_width(), 2);
assert_eq!(vec.get(0), Some(-2));
assert_eq!(vec.get(3), Some(1));
§Implementation Notes
To ensure safe and efficient memory access, FixedVec
adds one padding
word at the end of its storage buffer. This padding prevents out-of-bounds
reads in methods like get_unchecked
and is a
prerequisite for unaligned access with get_unaligned_unchecked
.
The builders handle this padding automatically. When creating a FixedVec
from raw parts,
it is the caller’s responsibility to ensure this padding is present.
§Common Type Aliases
To simplify usage, this crate provides several type aliases for common FixedVec
configurations. In most cases, you should prefer using these aliases over the
fully generic FixedVec<T, W, E, B>
struct.
§General-Purpose Aliases
These aliases use usize
for the storage word, which is often the most
efficient choice for the target architecture.
UFixedVec<T>
: For unsigned integers (e.g.,u8
,u16
,u32
).SFixedVec<T>
: For signed integers (e.g.,i8
,i16
,i32
).
§Concrete Aliases for u64
/i64
These aliases are specialized for u64
/i64
elements stored in u64
words:
LEFixedVec
:u64
elements, Little-Endian.BEFixedVec
:u64
elements, Big-Endian.LESFixedVec
:i64
elements, Little-Endian.BESFixedVec
:i64
elements, Big-Endian.
The atomic
module provides a similar set of aliases like UAtomicFixedVec
and SAtomicFixedVec
.
Re-exports§
pub use atomic::AtomicFixedVec;
pub use atomic::SAtomicFixedVec;
pub use atomic::UAtomicFixedVec;
Modules§
- atomic
- A thread-safe, compressed vector of integers with fixed-width encoding.
- builder
FixedVec
Builders- iter
FixedVec
Iterators- iter_
mut - Mutable Iterators
- macros
- Macros for
FixedVec
- parallel
FixedVec
Parallel Operations- proxy
- Mutable Access Proxy
- slice
- Zero-Copy Slices
- traits
- Core Traits for
FixedVec
Structs§
- Fixed
Vec - A compressed vector of integers with fixed-width encoding.
Enums§
- BitWidth
- Specifies the strategy for determining the number of bits for each integer.
- Error
- Defines errors that can occur during
FixedVec
operations.
Type Aliases§
- BEFixed
Vec - A
FixedVec
foru64
elements with au64
backend and Big-Endian layout. - BESFixed
Vec - A
FixedVec
fori64
elements with au64
backend and Big-Endian layout. - LEFixed
Vec - A
FixedVec
foru64
elements with au64
backend and Little-Endian layout. - LESFixed
Vec - A
FixedVec
fori64
elements with au64
backend and Little-Endian layout. - SFixed
Vec - A
FixedVec
for signed integers with ausize
word and Little-Endian layout. - UFixed
Vec - A
FixedVec
for unsigned integers with ausize
word and Little-Endian layout.