Struct IntVec

Source
pub struct IntVec<T: Storable, E: Endianness, B: AsRef<[u64]> = Vec<u64>> { /* private fields */ }
Expand description

A compressed, randomly accessible vector of integers using variable-length encoding.

IntVec achieves compression by using instantaneous codes and enables fast, amortized O(1) random access via a sampling mechanism. See the module-level documentation for a detailed explanation.

§Type Parameters

  • T: The integer type for the elements (e.g., u32, i16). It must implement the Storable trait.
  • E: The Endianness of the underlying bitstream (e.g., LE or BE).
  • B: The backend storage buffer, such as Vec<u64> for an owned vector or &[u64] for a borrowed, zero-copy view.

Implementations§

Source§

impl<T, E, B> IntVec<T, E, B>
where T: Storable + Send + Sync, E: Endianness + Send + Sync, B: AsRef<[u64]> + Send + Sync, for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible> + Send,

Source

pub fn par_iter(&self) -> impl ParallelIterator<Item = T> + '_

Returns a parallel iterator over the decompressed values.

This method uses Rayon to decompress the entire vector in parallel. It can provide a significant speedup on multi-core systems, especially when using a computationally intensive compression codec.

§Performance

For the specific task of full decompression, this parallel version is not always faster than the sequential iter. If the decoding operation is very fast (e.g., with VByte encoding), the operation can be limited by memory bandwidth. In such cases, the sequential iterator’s better use of CPU caches may outperform this parallel version.

§Examples
use compressed_intvec::variable::{IntVec, UIntVec};
use rayon::prelude::*;

let data: Vec<u32> = (0..1000).collect();
let vec: UIntVec<u32> = IntVec::from_slice(&data).unwrap();

// Use the parallel iterator to compute the sum in parallel
let sum: u32 = vec.par_iter().sum();

assert_eq!(sum, (0..1000).sum());
Source

pub fn par_get_many(&self, indices: &[usize]) -> Result<Vec<T>, IntVecError>

Retrieves multiple elements from a slice of indices in parallel.

This method uses Rayon to parallelize random access. It works by creating a separate IntVecReader for each thread and distributing the lookup work among them.

§Errors

Returns IntVecError::IndexOutOfBounds if any index is out of bounds.

§Examples
use compressed_intvec::variable::{IntVec, SIntVec};

let data: Vec<i64> = (0..1000).map(|x| x * -1).collect();
let vec: SIntVec<i64> = IntVec::from_slice(&data).unwrap();

let indices = [500, 10, 999, 0, 250];
let values = vec.par_get_many(&indices).unwrap();

assert_eq!(values, vec![-500, -10, -999, 0, -250]);
Source

pub unsafe fn par_get_many_unchecked(&self, indices: &[usize]) -> Vec<T>

Retrieves multiple elements in parallel without bounds checking.

§Safety

Calling this method with any out-of-bounds index in the indices slice is undefined behavior. In debug builds, an assertion will panic.

Source§

impl<T: Storable + 'static, E: Endianness> IntVec<T, E, Vec<u64>>

Source

pub fn builder() -> IntVecBuilder<T, E>

Creates a builder for constructing an owned IntVec from a slice of data.

This is the most flexible way to create an IntVec, allowing customization of the compression codec and sampling rate.

§Examples
use compressed_intvec::variable::{IntVec, UIntVec, VariableCodecSpec};

let data: &[u32] = &[5, 8, 13, 21, 34];
let vec: UIntVec<u32> = IntVec::builder()
    .k(2) // Sample every 2nd element
    .codec(VariableCodecSpec::Delta)
    .build(data)
    .unwrap();

assert_eq!(vec.get(3), Some(21));
Source

pub fn from_iter_builder<I>(iter: I) -> IntVecFromIterBuilder<T, E, I>
where I: IntoIterator<Item = T> + Clone,

Creates a builder for constructing an owned IntVec from an iterator.

This is useful for large datasets that are generated on the fly.

Source

pub fn into_vec(self) -> Vec<T>
where for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,

Consumes the IntVec and returns its decoded values as a standard Vec<T>.

§Examples
use compressed_intvec::variable::{IntVec, SIntVec};

let data: &[i32] = &[-10, 0, 10];
let vec: SIntVec<i32> = IntVec::from_slice(data).unwrap();
let decoded_data = vec.into_vec();

assert_eq!(decoded_data, &[-10, 0, 10]);
Source

pub fn from_slice(slice: &[T]) -> Result<Self, IntVecError>
where for<'a> BufBitWriter<E, MemWordWriterVec<u64, Vec<u64>>>: BitWrite<E, Error = Infallible> + CodesWrite<E>,

Creates an owned IntVec from a slice of data using default settings.

This method uses VariableCodecSpec::Auto to select a codec and a default sampling rate of k=16.

Source§

impl<T: Storable, E: Endianness, B: AsRef<[u64]>> IntVec<T, E, B>

Source

pub fn from_parts( data: B, samples_data: B, samples_len: usize, samples_num_bits: usize, k: usize, len: usize, encoding: Codes, ) -> Result<Self, IntVecError>

Creates a new IntVec from its raw components, enabling zero-copy views.

This constructor is intended for advanced use cases, such as memory-mapping a pre-built IntVec from disk without copying the data.

§Errors

Returns an IntVecError::InvalidParameters if k is zero or if the number of samples is inconsistent with len and k.

Source

pub fn slice( &self, start: usize, len: usize, ) -> Option<IntVecSlice<'_, T, E, B>>

Creates a zero-copy, immutable view (a slice) of this vector.

Returns None if the specified range is out of bounds.

§Examples
use compressed_intvec::variable::{IntVec, UIntVec};

let data: Vec<u32> = (0..20).collect();
let vec: UIntVec<u32> = IntVec::from_slice(&data).unwrap();
let slice = vec.slice(5, 10).unwrap();

assert_eq!(slice.len(), 10);
assert_eq!(slice.get(0), Some(5)); // Corresponds to index 5 of the original vec
Source

pub fn split_at( &self, mid: usize, ) -> Option<(IntVecSlice<'_, T, E, B>, IntVecSlice<'_, T, E, B>)>

Splits the vector into two immutable slices at a given index.

Returns None if mid is out of bounds.

Source

pub fn len(&self) -> usize

Returns the number of integers in the vector.

Source

pub fn is_empty(&self) -> bool

Returns true if the vector contains no elements.

Source

pub fn get_sampling_rate(&self) -> usize

Returns the sampling rate k used during encoding.

Source

pub fn get_num_samples(&self) -> usize

Returns the number of sample points stored in the vector.

Source

pub fn samples_ref(&self) -> &FixedVec<u64, u64, LE, B>

Returns a reference to the inner FixedVec of samples.

Source

pub fn as_limbs(&self) -> &[u64]

Returns a read-only slice of the underlying compressed data words (&[u64]).

Source

pub fn encoding(&self) -> Codes

Returns the concrete Codes variant that was used for compression.

Source

pub fn limbs(&self) -> Vec<u64>

Returns a clone of the underlying storage as a Vec<u64>.

Source

pub fn iter(&self) -> impl Iterator<Item = T> + '_
where for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,

Returns an iterator over the decompressed values.

Source§

impl<T: Storable, E: Endianness, B: AsRef<[u64]>> IntVec<T, E, B>
where for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,

Source

pub fn reader(&self) -> IntVecReader<'_, T, E, B>

Creates a reusable, stateless reader for efficient random access.

This method returns an IntVecReader, a struct that maintains a persistent, reusable bitstream reader. This amortizes the setup cost across multiple get operations, making it more efficient than calling get repeatedly in a loop.

This reader is stateless: it performs a full seek from the nearest sample point for each call, independently of any previous access.

§When to use it

Use IntVecReader for true random access patterns where lookup indices are sparse, unordered, or not known in advance (e.g., graph traversals, pointer chasing). For accessing a known set of indices, get_many is generally superior.

§Examples
use compressed_intvec::prelude::*;

let data: Vec<u32> = (0..100).rev().collect(); // Data is not sequential
let vec: UIntVec<u32> = IntVec::from_slice(&data).unwrap();

// Create a reusable reader for multiple random lookups
let mut reader = vec.reader();

assert_eq!(reader.get(99).unwrap(), Some(0));
assert_eq!(reader.get(0).unwrap(), Some(99));
assert_eq!(reader.get(50).unwrap(), Some(49));
Source

pub fn seq_reader(&self) -> IntVecSeqReader<'_, T, E, B>

Creates a stateful, reusable reader optimized for sequential access.

This method returns an IntVecSeqReader, which is specifically designed to take advantage of the vector’s internal state, tracking the current decoding position (cursor).

This statefulness enables a key optimization:

  • Fast Path: If a requested index is at or after the cursor and within the same sample block, the reader decodes forward from its last known position. This avoids a costly seek operation.
  • Fallback Path: If the requested index is before the cursor (requiring a backward move) or in a different sample block, the reader falls back to the standard behavior of seeking to the nearest sample point.
§When to use it

Use IntVecSeqReader when your access pattern has high locality, meaning indices are primarily increasing and often clustered together. It is ideal for iterating through a sorted list of indices or for stream-like processing.

§Examples
use compressed_intvec::prelude::*;

let data: Vec<u32> = (0..100).collect();
let vec: UIntVec<u32> = IntVec::from_slice(&data).unwrap();

// Create a reader optimized for sequential access
let mut seq_reader = vec.seq_reader();

// Accessing indices in increasing order is efficient
assert_eq!(seq_reader.get(10).unwrap(), Some(10));
// This next call is fast, as it decodes forward from index 10
assert_eq!(seq_reader.get(15).unwrap(), Some(15));

// A large jump will trigger a seek to a new sample block
assert_eq!(seq_reader.get(90).unwrap(), Some(90));

// A backward jump will also trigger a seek
assert_eq!(seq_reader.get(5).unwrap(), Some(5));
Source

pub fn get(&self, index: usize) -> Option<T>

Returns the element at the specified index, or None if the index is out of bounds.

This operation is amortized O(1).

§Examples
use compressed_intvec::variable::{IntVec, UIntVec};

let data: Vec<u32> = (0..100).collect();
let vec: UIntVec<u32> = IntVec::from_slice(&data).unwrap();

assert_eq!(vec.get(50), Some(50));
assert_eq!(vec.get(100), None);
Source

pub unsafe fn get_unchecked(&self, index: usize) -> T

Returns the element at the specified index without bounds checking.

§Safety

Calling this method with an out-of-bounds index is undefined behavior. The index must be less than the vector’s len.

Source

pub fn get_many(&self, indices: &[usize]) -> Result<Vec<T>, IntVecError>

Retrieves multiple elements from the vector at the specified indices.

This method is generally more efficient than calling get in a loop, as it sorts the indices and scans through the compressed data stream once.

§Errors

Returns IntVecError::IndexOutOfBounds if any index is out of bounds.

§Examples
use compressed_intvec::variable::{IntVec, UIntVec};

let data: Vec<u32> = (0..100).collect();
let vec: UIntVec<u32> = IntVec::from_slice(&data).unwrap();

let indices = [99, 0, 50];
let values = vec.get_many(&indices).unwrap();
assert_eq!(values, vec![99, 0, 50]);
Source

pub unsafe fn get_many_unchecked(&self, indices: &[usize]) -> Vec<T>

Retrieves multiple elements without bounds checking.

§Safety

Calling this method with any out-of-bounds index is undefined behavior.

Source

pub fn get_many_from_iter<I>(&self, indices: I) -> Result<Vec<T>, IntVecError>
where I: IntoIterator<Item = usize>,

Retrieves multiple elements from an iterator of indices.

This is a convenient alternative to get_many when the indices are not already in a slice. It may be less performant as it cannot pre-sort the indices for optimal access.

Source§

impl<T: Storable + Ord, E: Endianness, B: AsRef<[u64]>> IntVec<T, E, B>
where for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,

Binary searches this vector for a given element.

If the value is found, returns Ok(usize) with the index of the matching element. If the value is not found, returns Err(usize) with the index where the value could be inserted to maintain order.

§Complexity

The time complexity of this operation is O(k * log n), where n is the number of elements in the vector and k is the sampling rate. This is because each of the O(log n) probes during the search requires an element access, which has a cost proportional to k in the worst case.

§Examples
use compressed_intvec::variable::{IntVec, SIntVec};

let data: &[i32] = &[-10, 0, 10, 20, 30];
let vec: SIntVec<i32> = IntVec::from_slice(data).unwrap();

assert_eq!(vec.binary_search(&10), Ok(2));
assert_eq!(vec.binary_search(&15), Err(3));
Source

pub fn binary_search_by<F>(&self, f: F) -> Result<usize, usize>
where F: FnMut(T) -> Ordering,

Binary searches this vector with a custom comparison function.

§Complexity

The time complexity of this operation is O(k * log n), where n is the number of elements in the vector and k is the sampling rate.

Source

pub fn binary_search_by_key<K: Ord, F>( &self, b: &K, f: F, ) -> Result<usize, usize>
where F: FnMut(T) -> K,

Binary searches this vector with a key extraction function.

§Complexity

The time complexity of this operation is O(k * log n), where n is the number of elements in the vector and k is the sampling rate.

Trait Implementations§

Source§

impl<T: Clone + Storable, E: Clone + Endianness, B: Clone + AsRef<[u64]>> Clone for IntVec<T, E, B>

Source§

fn clone(&self) -> IntVec<T, E, B>

Returns a duplicate of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl<T: Debug + Storable, E: Debug + Endianness, B: Debug + AsRef<[u64]>> Debug for IntVec<T, E, B>

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter. Read more
Source§

impl<T: Storable + 'static, E: Endianness + 'static> IntoIterator for IntVec<T, E, Vec<u64>>
where for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,

Source§

type Item = T

The type of the elements being iterated over.
Source§

type IntoIter = IntVecIntoIter<T, E>

Which kind of iterator are we turning this into?
Source§

fn into_iter(self) -> Self::IntoIter

Creates an iterator from a value. Read more
Source§

impl<T: Storable, E: Endianness, B: AsRef<[u64]> + MemDbgImpl> MemDbgImpl for IntVec<T, E, B>

Source§

fn _mem_dbg_rec_on( &self, writer: &mut impl Write, total_size: usize, max_depth: usize, prefix: &mut String, _is_last: bool, flags: DbgFlags, ) -> Result

Source§

fn _mem_dbg_depth_on( &self, writer: &mut impl Write, total_size: usize, max_depth: usize, prefix: &mut String, field_name: Option<&str>, is_last: bool, padded_size: usize, flags: DbgFlags, ) -> Result<(), Error>

Source§

impl<T: Storable, E: Endianness, B: AsRef<[u64]> + MemSize> MemSize for IntVec<T, E, B>

Source§

fn mem_size(&self, flags: SizeFlags) -> usize

Returns the (recursively computed) overall memory size of the structure in bytes.
Source§

impl<T, E, B, O> PartialEq<O> for IntVec<T, E, B>
where T: Storable + PartialEq, E: Endianness, B: AsRef<[u64]>, O: AsRef<[T]>, for<'a> BufBitReader<E, MemWordReader<u64, &'a [u64]>, DefaultReadParams>: BitRead<E, Error = Infallible> + CodesRead<E> + BitSeek<Error = Infallible>,

Source§

fn eq(&self, other: &O) -> bool

Checks for equality between an IntVec and a standard slice.

The comparison is done by iterating over both and comparing elements one by one. The overall comparison is not a single atomic operation.

1.0.0 · Source§

fn ne(&self, other: &Rhs) -> bool

Tests for !=. The default implementation is almost always sufficient, and should not be overridden without very good reason.

Auto Trait Implementations§

§

impl<T, E, B> Freeze for IntVec<T, E, B>
where B: Freeze,

§

impl<T, E, B> RefUnwindSafe for IntVec<T, E, B>

§

impl<T, E, B> Send for IntVec<T, E, B>
where B: Send, T: Send,

§

impl<T, E, B> Sync for IntVec<T, E, B>
where B: Sync, T: Sync,

§

impl<T, E, B> Unpin for IntVec<T, E, B>
where B: Unpin, T: Unpin, E: Unpin,

§

impl<T, E, B> UnwindSafe for IntVec<T, E, B>
where B: UnwindSafe, T: UnwindSafe, E: UnwindSafe,

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CastableFrom<T> for T

Source§

fn cast_from(value: T) -> T

Call Self as W
Source§

impl<T, U> CastableInto<U> for T
where U: CastableFrom<T>,

Source§

fn cast(self) -> U

Call W::cast_from(self)
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more
Source§

impl<T> DowncastableFrom<T> for T

Source§

fn downcast_from(value: T) -> T

Truncate the current UnsignedInt to a possibly smaller size
Source§

impl<T, U> DowncastableInto<U> for T
where U: DowncastableFrom<T>,

Source§

fn downcast(self) -> U

Call W::downcast_from(self)
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

impl<T> MemDbg for T
where T: MemDbgImpl,

Source§

fn mem_dbg(&self, flags: DbgFlags) -> Result<(), Error>

Writes to stderr debug infos about the structure memory usage, expanding all levels of nested structures.
Source§

fn mem_dbg_on( &self, writer: &mut impl Write, flags: DbgFlags, ) -> Result<(), Error>

Writes to a core::fmt::Write debug infos about the structure memory usage, expanding all levels of nested structures.
Source§

fn mem_dbg_depth(&self, max_depth: usize, flags: DbgFlags) -> Result<(), Error>

Writes to stderr debug infos about the structure memory usage as mem_dbg, but expanding only up to max_depth levels of nested structures.
Source§

fn mem_dbg_depth_on( &self, writer: &mut impl Write, max_depth: usize, flags: DbgFlags, ) -> Result<(), Error>

Writes to a core::fmt::Write debug infos about the structure memory usage as mem_dbg_on, but expanding only up to max_depth levels of nested structures.
Source§

impl<T> Pointable for T

Source§

const ALIGN: usize

The alignment of pointer.
Source§

type Init = T

The type for initializers.
Source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer. Read more
Source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer. Read more
Source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer. Read more
Source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer. Read more
Source§

impl<T> Splat<T> for T

Source§

fn splat(value: T) -> T

Source§

impl<T> To<T> for T

Source§

fn to(self) -> T

Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> UpcastableFrom<T> for T

Source§

fn upcast_from(value: T) -> T

Extend the current UnsignedInt to a possibly bigger size.
Source§

impl<T, U> UpcastableInto<U> for T
where U: UpcastableFrom<T>,

Source§

fn upcast(self) -> U

Call W::upcast_from(self)
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V