pub struct IntVec<T: Storable, E: Endianness, B: AsRef<[u64]> = Vec<u64>> { /* private fields */ }
Expand description
A compressed, randomly accessible vector of integers using variable-length encoding.
IntVec
achieves compression by using instantaneous codes and enables fast,
amortized O(1) random access via a sampling mechanism. See the
module-level documentation for a detailed explanation.
§Type Parameters
T
: The integer type for the elements (e.g.,u32
,i16
). It must implement theStorable
trait.E
: TheEndianness
of the underlying bitstream (e.g.,LE
orBE
).B
: The backend storage buffer, such asVec<u64>
for an owned vector or&[u64]
for a borrowed, zero-copy view.
Implementations§
Source§impl<T, E, B> IntVec<T, E, B>where
T: Storable + Send + Sync,
E: Endianness + Send + Sync,
B: AsRef<[u64]> + Send + Sync,
for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible> + Send,
impl<T, E, B> IntVec<T, E, B>where
T: Storable + Send + Sync,
E: Endianness + Send + Sync,
B: AsRef<[u64]> + Send + Sync,
for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible> + Send,
Sourcepub fn par_iter(&self) -> impl ParallelIterator<Item = T> + '_
pub fn par_iter(&self) -> impl ParallelIterator<Item = T> + '_
Returns a parallel iterator over the decompressed values.
This method uses Rayon to decompress the entire vector in parallel. It can provide a significant speedup on multi-core systems, especially when using a computationally intensive compression codec.
§Performance
For the specific task of full decompression, this parallel version is not
always faster than the sequential iter
. If the
decoding operation is very fast (e.g., with VByte
encoding), the
operation can be limited by memory bandwidth. In such cases, the
sequential iterator’s better use of CPU caches may outperform this
parallel version.
§Examples
use compressed_intvec::variable::{IntVec, UIntVec};
use rayon::prelude::*;
let data: Vec<u32> = (0..1000).collect();
let vec: UIntVec<u32> = IntVec::from_slice(&data).unwrap();
// Use the parallel iterator to compute the sum in parallel
let sum: u32 = vec.par_iter().sum();
assert_eq!(sum, (0..1000).sum());
Sourcepub fn par_get_many(&self, indices: &[usize]) -> Result<Vec<T>, IntVecError>
pub fn par_get_many(&self, indices: &[usize]) -> Result<Vec<T>, IntVecError>
Retrieves multiple elements from a slice of indices in parallel.
This method uses Rayon to parallelize random access. It works by creating
a separate IntVecReader
for each thread and
distributing the lookup work among them.
§Errors
Returns IntVecError::IndexOutOfBounds
if any index is out of bounds.
§Examples
use compressed_intvec::variable::{IntVec, SIntVec};
let data: Vec<i64> = (0..1000).map(|x| x * -1).collect();
let vec: SIntVec<i64> = IntVec::from_slice(&data).unwrap();
let indices = [500, 10, 999, 0, 250];
let values = vec.par_get_many(&indices).unwrap();
assert_eq!(values, vec![-500, -10, -999, 0, -250]);
Sourcepub unsafe fn par_get_many_unchecked(&self, indices: &[usize]) -> Vec<T>
pub unsafe fn par_get_many_unchecked(&self, indices: &[usize]) -> Vec<T>
Retrieves multiple elements in parallel without bounds checking.
§Safety
Calling this method with any out-of-bounds index in the indices
slice
is undefined behavior. In debug builds, an assertion will panic.
Source§impl<T: Storable + 'static, E: Endianness> IntVec<T, E, Vec<u64>>
impl<T: Storable + 'static, E: Endianness> IntVec<T, E, Vec<u64>>
Sourcepub fn builder() -> IntVecBuilder<T, E>
pub fn builder() -> IntVecBuilder<T, E>
Creates a builder for constructing an owned IntVec
from a slice of data.
This is the most flexible way to create an IntVec
, allowing customization
of the compression codec and sampling rate.
§Examples
use compressed_intvec::variable::{IntVec, UIntVec, VariableCodecSpec};
let data: &[u32] = &[5, 8, 13, 21, 34];
let vec: UIntVec<u32> = IntVec::builder()
.k(2) // Sample every 2nd element
.codec(VariableCodecSpec::Delta)
.build(data)
.unwrap();
assert_eq!(vec.get(3), Some(21));
Sourcepub fn from_iter_builder<I>(iter: I) -> IntVecFromIterBuilder<T, E, I>where
I: IntoIterator<Item = T> + Clone,
pub fn from_iter_builder<I>(iter: I) -> IntVecFromIterBuilder<T, E, I>where
I: IntoIterator<Item = T> + Clone,
Creates a builder for constructing an owned IntVec
from an iterator.
This is useful for large datasets that are generated on the fly.
Sourcepub fn into_vec(self) -> Vec<T>where
for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,
pub fn into_vec(self) -> Vec<T>where
for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,
Consumes the IntVec
and returns its decoded values as a standard Vec<T>
.
§Examples
use compressed_intvec::variable::{IntVec, SIntVec};
let data: &[i32] = &[-10, 0, 10];
let vec: SIntVec<i32> = IntVec::from_slice(data).unwrap();
let decoded_data = vec.into_vec();
assert_eq!(decoded_data, &[-10, 0, 10]);
Sourcepub fn from_slice(slice: &[T]) -> Result<Self, IntVecError>where
for<'a> BufBitWriter<E, MemWordWriterVec<u64, Vec<u64>>>: BitWrite<E, Error = Infallible> + CodesWrite<E>,
pub fn from_slice(slice: &[T]) -> Result<Self, IntVecError>where
for<'a> BufBitWriter<E, MemWordWriterVec<u64, Vec<u64>>>: BitWrite<E, Error = Infallible> + CodesWrite<E>,
Creates an owned IntVec
from a slice of data using default settings.
This method uses VariableCodecSpec::Auto
to select a codec and a
default sampling rate of k=16
.
Source§impl<T: Storable, E: Endianness, B: AsRef<[u64]>> IntVec<T, E, B>
impl<T: Storable, E: Endianness, B: AsRef<[u64]>> IntVec<T, E, B>
Sourcepub fn from_parts(
data: B,
samples_data: B,
samples_len: usize,
samples_num_bits: usize,
k: usize,
len: usize,
encoding: Codes,
) -> Result<Self, IntVecError>
pub fn from_parts( data: B, samples_data: B, samples_len: usize, samples_num_bits: usize, k: usize, len: usize, encoding: Codes, ) -> Result<Self, IntVecError>
Creates a new IntVec
from its raw components, enabling zero-copy views.
This constructor is intended for advanced use cases, such as memory-mapping
a pre-built IntVec
from disk without copying the data.
§Errors
Returns an IntVecError::InvalidParameters
if k
is zero or if the
number of samples is inconsistent with len
and k
.
Sourcepub fn slice(
&self,
start: usize,
len: usize,
) -> Option<IntVecSlice<'_, T, E, B>>
pub fn slice( &self, start: usize, len: usize, ) -> Option<IntVecSlice<'_, T, E, B>>
Creates a zero-copy, immutable view (a slice) of this vector.
Returns None
if the specified range is out of bounds.
§Examples
use compressed_intvec::variable::{IntVec, UIntVec};
let data: Vec<u32> = (0..20).collect();
let vec: UIntVec<u32> = IntVec::from_slice(&data).unwrap();
let slice = vec.slice(5, 10).unwrap();
assert_eq!(slice.len(), 10);
assert_eq!(slice.get(0), Some(5)); // Corresponds to index 5 of the original vec
Sourcepub fn split_at(
&self,
mid: usize,
) -> Option<(IntVecSlice<'_, T, E, B>, IntVecSlice<'_, T, E, B>)>
pub fn split_at( &self, mid: usize, ) -> Option<(IntVecSlice<'_, T, E, B>, IntVecSlice<'_, T, E, B>)>
Splits the vector into two immutable slices at a given index.
Returns None
if mid
is out of bounds.
Sourcepub fn get_sampling_rate(&self) -> usize
pub fn get_sampling_rate(&self) -> usize
Returns the sampling rate k
used during encoding.
Sourcepub fn get_num_samples(&self) -> usize
pub fn get_num_samples(&self) -> usize
Returns the number of sample points stored in the vector.
Sourcepub fn samples_ref(&self) -> &FixedVec<u64, u64, LE, B>
pub fn samples_ref(&self) -> &FixedVec<u64, u64, LE, B>
Returns a reference to the inner FixedVec
of samples.
Sourcepub fn as_limbs(&self) -> &[u64]
pub fn as_limbs(&self) -> &[u64]
Returns a read-only slice of the underlying compressed data words (&[u64]
).
Sourcepub fn encoding(&self) -> Codes
pub fn encoding(&self) -> Codes
Returns the concrete Codes
variant that was used for compression.
Sourcepub fn iter(&self) -> impl Iterator<Item = T> + '_where
for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,
pub fn iter(&self) -> impl Iterator<Item = T> + '_where
for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,
Returns an iterator over the decompressed values.
Source§impl<T: Storable, E: Endianness, B: AsRef<[u64]>> IntVec<T, E, B>where
for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,
impl<T: Storable, E: Endianness, B: AsRef<[u64]>> IntVec<T, E, B>where
for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,
Sourcepub fn reader(&self) -> IntVecReader<'_, T, E, B>
pub fn reader(&self) -> IntVecReader<'_, T, E, B>
Creates a reusable, stateless reader for efficient random access.
This method returns an IntVecReader
, a struct that maintains a persistent,
reusable bitstream reader. This amortizes the setup cost across multiple get
operations, making it more efficient than calling get
repeatedly in a loop.
This reader is stateless: it performs a full seek from the nearest sample point for each call, independently of any previous access.
§When to use it
Use IntVecReader
for true random access patterns where lookup indices are sparse,
unordered, or not known in advance (e.g., graph traversals, pointer chasing).
For accessing a known set of indices, get_many
is generally superior.
§Examples
use compressed_intvec::prelude::*;
let data: Vec<u32> = (0..100).rev().collect(); // Data is not sequential
let vec: UIntVec<u32> = IntVec::from_slice(&data).unwrap();
// Create a reusable reader for multiple random lookups
let mut reader = vec.reader();
assert_eq!(reader.get(99).unwrap(), Some(0));
assert_eq!(reader.get(0).unwrap(), Some(99));
assert_eq!(reader.get(50).unwrap(), Some(49));
Sourcepub fn seq_reader(&self) -> IntVecSeqReader<'_, T, E, B>
pub fn seq_reader(&self) -> IntVecSeqReader<'_, T, E, B>
Creates a stateful, reusable reader optimized for sequential access.
This method returns an IntVecSeqReader
, which is specifically designed
to take advantage of the vector’s internal state, tracking the current decoding position (cursor).
This statefulness enables a key optimization:
- Fast Path: If a requested index is at or after the cursor and within the same sample block, the reader decodes forward from its last known position. This avoids a costly seek operation.
- Fallback Path: If the requested index is before the cursor (requiring a backward move) or in a different sample block, the reader falls back to the standard behavior of seeking to the nearest sample point.
§When to use it
Use IntVecSeqReader
when your access pattern has high locality, meaning
indices are primarily increasing and often clustered together. It is ideal
for iterating through a sorted list of indices or for stream-like processing.
§Examples
use compressed_intvec::prelude::*;
let data: Vec<u32> = (0..100).collect();
let vec: UIntVec<u32> = IntVec::from_slice(&data).unwrap();
// Create a reader optimized for sequential access
let mut seq_reader = vec.seq_reader();
// Accessing indices in increasing order is efficient
assert_eq!(seq_reader.get(10).unwrap(), Some(10));
// This next call is fast, as it decodes forward from index 10
assert_eq!(seq_reader.get(15).unwrap(), Some(15));
// A large jump will trigger a seek to a new sample block
assert_eq!(seq_reader.get(90).unwrap(), Some(90));
// A backward jump will also trigger a seek
assert_eq!(seq_reader.get(5).unwrap(), Some(5));
Sourcepub fn get(&self, index: usize) -> Option<T>
pub fn get(&self, index: usize) -> Option<T>
Returns the element at the specified index, or None
if the index is
out of bounds.
This operation is amortized O(1).
§Examples
use compressed_intvec::variable::{IntVec, UIntVec};
let data: Vec<u32> = (0..100).collect();
let vec: UIntVec<u32> = IntVec::from_slice(&data).unwrap();
assert_eq!(vec.get(50), Some(50));
assert_eq!(vec.get(100), None);
Sourcepub unsafe fn get_unchecked(&self, index: usize) -> T
pub unsafe fn get_unchecked(&self, index: usize) -> T
Returns the element at the specified index without bounds checking.
§Safety
Calling this method with an out-of-bounds index
is undefined behavior.
The index
must be less than the vector’s len
.
Sourcepub fn get_many(&self, indices: &[usize]) -> Result<Vec<T>, IntVecError>
pub fn get_many(&self, indices: &[usize]) -> Result<Vec<T>, IntVecError>
Retrieves multiple elements from the vector at the specified indices.
This method is generally more efficient than calling get
in a loop, as
it sorts the indices and scans through the compressed data stream once.
§Errors
Returns IntVecError::IndexOutOfBounds
if any index is out of bounds.
§Examples
use compressed_intvec::variable::{IntVec, UIntVec};
let data: Vec<u32> = (0..100).collect();
let vec: UIntVec<u32> = IntVec::from_slice(&data).unwrap();
let indices = [99, 0, 50];
let values = vec.get_many(&indices).unwrap();
assert_eq!(values, vec![99, 0, 50]);
Sourcepub unsafe fn get_many_unchecked(&self, indices: &[usize]) -> Vec<T>
pub unsafe fn get_many_unchecked(&self, indices: &[usize]) -> Vec<T>
Retrieves multiple elements without bounds checking.
§Safety
Calling this method with any out-of-bounds index is undefined behavior.
Sourcepub fn get_many_from_iter<I>(&self, indices: I) -> Result<Vec<T>, IntVecError>where
I: IntoIterator<Item = usize>,
pub fn get_many_from_iter<I>(&self, indices: I) -> Result<Vec<T>, IntVecError>where
I: IntoIterator<Item = usize>,
Retrieves multiple elements from an iterator of indices.
This is a convenient alternative to get_many
when the indices are not
already in a slice. It may be less performant as it cannot pre-sort the
indices for optimal access.
Source§impl<T: Storable + Ord, E: Endianness, B: AsRef<[u64]>> IntVec<T, E, B>where
for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,
impl<T: Storable + Ord, E: Endianness, B: AsRef<[u64]>> IntVec<T, E, B>where
for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,
Sourcepub fn binary_search(&self, value: &T) -> Result<usize, usize>
pub fn binary_search(&self, value: &T) -> Result<usize, usize>
Binary searches this vector for a given element.
If the value is found, returns Ok(usize)
with the index of the
matching element. If the value is not found, returns Err(usize)
with
the index where the value could be inserted to maintain order.
§Complexity
The time complexity of this operation is O(k * log n), where n
is the
number of elements in the vector and k
is the sampling rate. This is
because each of the O(log n) probes during the search requires an
element access, which has a cost proportional to k
in the worst case.
§Examples
use compressed_intvec::variable::{IntVec, SIntVec};
let data: &[i32] = &[-10, 0, 10, 20, 30];
let vec: SIntVec<i32> = IntVec::from_slice(data).unwrap();
assert_eq!(vec.binary_search(&10), Ok(2));
assert_eq!(vec.binary_search(&15), Err(3));
Sourcepub fn binary_search_by<F>(&self, f: F) -> Result<usize, usize>
pub fn binary_search_by<F>(&self, f: F) -> Result<usize, usize>
Binary searches this vector with a custom comparison function.
§Complexity
The time complexity of this operation is O(k * log n), where n
is the
number of elements in the vector and k
is the sampling rate.
Sourcepub fn binary_search_by_key<K: Ord, F>(
&self,
b: &K,
f: F,
) -> Result<usize, usize>where
F: FnMut(T) -> K,
pub fn binary_search_by_key<K: Ord, F>(
&self,
b: &K,
f: F,
) -> Result<usize, usize>where
F: FnMut(T) -> K,
Binary searches this vector with a key extraction function.
§Complexity
The time complexity of this operation is O(k * log n), where n
is the
number of elements in the vector and k
is the sampling rate.
Trait Implementations§
Source§impl<T: Clone + Storable, E: Clone + Endianness, B: Clone + AsRef<[u64]>> Clone for IntVec<T, E, B>
impl<T: Clone + Storable, E: Clone + Endianness, B: Clone + AsRef<[u64]>> Clone for IntVec<T, E, B>
Source§impl<T: Debug + Storable, E: Debug + Endianness, B: Debug + AsRef<[u64]>> Debug for IntVec<T, E, B>
impl<T: Debug + Storable, E: Debug + Endianness, B: Debug + AsRef<[u64]>> Debug for IntVec<T, E, B>
Source§impl<T: Storable + 'static, E: Endianness + 'static> IntoIterator for IntVec<T, E, Vec<u64>>where
for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,
impl<T: Storable + 'static, E: Endianness + 'static> IntoIterator for IntVec<T, E, Vec<u64>>where
for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,
Source§impl<T: Storable, E: Endianness, B: AsRef<[u64]> + MemDbgImpl> MemDbgImpl for IntVec<T, E, B>
impl<T: Storable, E: Endianness, B: AsRef<[u64]> + MemDbgImpl> MemDbgImpl for IntVec<T, E, B>
fn _mem_dbg_rec_on( &self, writer: &mut impl Write, total_size: usize, max_depth: usize, prefix: &mut String, _is_last: bool, flags: DbgFlags, ) -> Result
fn _mem_dbg_depth_on( &self, writer: &mut impl Write, total_size: usize, max_depth: usize, prefix: &mut String, field_name: Option<&str>, is_last: bool, padded_size: usize, flags: DbgFlags, ) -> Result<(), Error>
Source§impl<T, E, B, O> PartialEq<O> for IntVec<T, E, B>where
T: Storable + PartialEq,
E: Endianness,
B: AsRef<[u64]>,
O: AsRef<[T]>,
for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,
impl<T, E, B, O> PartialEq<O> for IntVec<T, E, B>where
T: Storable + PartialEq,
E: Endianness,
B: AsRef<[u64]>,
O: AsRef<[T]>,
for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,
Auto Trait Implementations§
impl<T, E, B> Freeze for IntVec<T, E, B>where
B: Freeze,
impl<T, E, B> RefUnwindSafe for IntVec<T, E, B>
impl<T, E, B> Send for IntVec<T, E, B>
impl<T, E, B> Sync for IntVec<T, E, B>
impl<T, E, B> Unpin for IntVec<T, E, B>
impl<T, E, B> UnwindSafe for IntVec<T, E, B>
Blanket Implementations§
Source§impl<T> BorrowMut<T> for Twhere
T: ?Sized,
impl<T> BorrowMut<T> for Twhere
T: ?Sized,
Source§fn borrow_mut(&mut self) -> &mut T
fn borrow_mut(&mut self) -> &mut T
Source§impl<T, U> CastableInto<U> for Twhere
U: CastableFrom<T>,
impl<T, U> CastableInto<U> for Twhere
U: CastableFrom<T>,
Source§impl<T> CloneToUninit for Twhere
T: Clone,
impl<T> CloneToUninit for Twhere
T: Clone,
Source§impl<T> DowncastableFrom<T> for T
impl<T> DowncastableFrom<T> for T
Source§fn downcast_from(value: T) -> T
fn downcast_from(value: T) -> T
Source§impl<T, U> DowncastableInto<U> for Twhere
U: DowncastableFrom<T>,
impl<T, U> DowncastableInto<U> for Twhere
U: DowncastableFrom<T>,
Source§impl<T> IntoEither for T
impl<T> IntoEither for T
Source§fn into_either(self, into_left: bool) -> Either<Self, Self>
fn into_either(self, into_left: bool) -> Either<Self, Self>
self
into a Left
variant of Either<Self, Self>
if into_left
is true
.
Converts self
into a Right
variant of Either<Self, Self>
otherwise. Read moreSource§fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
self
into a Left
variant of Either<Self, Self>
if into_left(&self)
returns true
.
Converts self
into a Right
variant of Either<Self, Self>
otherwise. Read moreSource§impl<T> MemDbg for Twhere
T: MemDbgImpl,
impl<T> MemDbg for Twhere
T: MemDbgImpl,
Source§fn mem_dbg(&self, flags: DbgFlags) -> Result<(), Error>
fn mem_dbg(&self, flags: DbgFlags) -> Result<(), Error>
Source§fn mem_dbg_on(
&self,
writer: &mut impl Write,
flags: DbgFlags,
) -> Result<(), Error>
fn mem_dbg_on( &self, writer: &mut impl Write, flags: DbgFlags, ) -> Result<(), Error>
core::fmt::Write
debug infos about the structure memory
usage, expanding all levels of nested structures.Source§fn mem_dbg_depth(&self, max_depth: usize, flags: DbgFlags) -> Result<(), Error>
fn mem_dbg_depth(&self, max_depth: usize, flags: DbgFlags) -> Result<(), Error>
mem_dbg
, but expanding only up to max_depth
levels of nested structures.Source§fn mem_dbg_depth_on(
&self,
writer: &mut impl Write,
max_depth: usize,
flags: DbgFlags,
) -> Result<(), Error>
fn mem_dbg_depth_on( &self, writer: &mut impl Write, max_depth: usize, flags: DbgFlags, ) -> Result<(), Error>
core::fmt::Write
debug infos about the structure memory
usage as mem_dbg_on
, but expanding only up to
max_depth
levels of nested structures.