pub struct ColumnData<C: ColumnCursor> {
pub len: usize,
pub slabs: SpanTree<Slab, C::SlabIndex>,
/* private fields */
}
A compressed, mutable column of optional typed values.
ColumnData<C> stores a sequence of Option<C::Item> values using the encoding
determined by cursor type C. Data is held internally in a SpanTree of Slabs;
modifications replace individual slabs, leaving the rest untouched.
§Common cursor types
UIntCursor, IntCursor,
StrCursor, ByteCursor,
BooleanCursor, DeltaCursor,
RawCursor.
§Example
use hexane::{ColumnData, UIntCursor};
use std::borrow::Cow;
let mut col: ColumnData<UIntCursor> = ColumnData::new();
col.splice(0, 0, [1u64, 2, 3]);
assert_eq!(col.get(1), Some(Some(Cow::Owned(2))));
assert_eq!(col.to_vec(), vec![Some(1), Some(2), Some(3)]);
Fields§
§len: usize
§slabs: SpanTree<Slab, C::SlabIndex>
Implementations§
impl<C: ColumnCursor> ColumnData<C>
pub fn byte_len(&self) -> usize
Total number of bytes used by all slabs (encoded, compressed size).
pub fn get(&self, index: usize) -> Option<Option<Cow<'_, C::Item>>>
Returns the value at index, or None if the index is out of bounds.
The inner Option is None for null entries and Some(value) otherwise.
This is O(log n + B) where B is the number of encoded runs in the target slab.
For multiple sequential reads prefer ColumnData::iter or ColumnData::iter_range.
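The nested Option in the return type is easy to misread, so here is a minimal sketch of the three cases a caller has to handle. It uses a plain slice as a stand-in for the column; lookup and describe are hypothetical helper names, not part of hexane.

```rust
use std::borrow::Cow;

/// Hypothetical stand-in for `ColumnData::get`, backed by a plain slice.
/// Outer `None`: the index is out of bounds. Inner `None`: a null entry.
fn lookup(items: &[Option<u64>], index: usize) -> Option<Option<Cow<'_, u64>>> {
    items
        .get(index)
        .map(|slot| slot.as_ref().map(Cow::Borrowed))
}

/// Shows how a caller might tell the three cases apart.
fn describe(items: &[Option<u64>], index: usize) -> String {
    match lookup(items, index) {
        None => "out of bounds".to_string(),
        Some(None) => "null".to_string(),
        Some(Some(v)) => format!("value {}", v),
    }
}
```

Matching on the full nested shape, rather than calling flatten() early, keeps "out of bounds" and "null" distinguishable.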
pub fn get_acc_delta(&self, index1: usize, index2: usize) -> (Acc, Option<Cow<'_, C::Item>>)
Returns the change in accumulator between index1 and index2, together with
the item at index2.
Panics if index1 > index2.
pub fn get_acc(&self, index: usize) -> Acc
Returns the cumulative Acc for all items before index
(i.e. the sum of agg(item) for items 0..index).
pub fn get_with_acc(&self, index: usize) -> Option<ColGroupItem<'_, <C as ColumnCursor>::Item>>
Returns the item at index together with the Acc value immediately before it,
or None if the index is out of bounds.
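The three accumulator methods (get_acc_delta, get_acc, get_with_acc) relate to each other as prefix sums. The sketch below illustrates that relationship over a plain slice. It assumes, purely for illustration, that Acc is a u64 and that agg(item) is the value itself for a non-null item and 0 for a null; the real aggregate depends on the cursor type. All function names here are hypothetical.

```rust
/// Assumed aggregate: a value contributes itself, a null contributes 0.
fn agg(item: &Option<u64>) -> u64 {
    item.unwrap_or(0)
}

/// Sum of `agg(item)` over items `0..index`, i.e. strictly before `index`.
fn acc_before(items: &[Option<u64>], index: usize) -> u64 {
    items.iter().take(index).map(agg).sum()
}

/// Change in the accumulator between two indices, plus the item at `index2`.
/// Panics if `index1 > index2`, mirroring the documented contract.
fn acc_delta(items: &[Option<u64>], index1: usize, index2: usize) -> (u64, Option<u64>) {
    assert!(index1 <= index2, "index1 must not exceed index2");
    let delta = acc_before(items, index2) - acc_before(items, index1);
    (delta, items.get(index2).copied().flatten())
}

/// The accumulator immediately before `index` together with the item there,
/// or `None` when `index` is out of bounds.
fn with_acc(items: &[Option<u64>], index: usize) -> Option<(u64, Option<u64>)> {
    items.get(index).map(|item| (acc_before(items, index), *item))
}
```

This flat version recomputes the prefix sum on every call and is only meant to pin down the semantics.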
pub fn is_empty(&self) -> bool
Returns true if every item in the column is null (None) or, for
BooleanCursor, if every value is false.
A zero-length column (len() == 0) also counts as empty, so is_empty() can return true even when len() > 0.
pub fn dump(&self)
pub fn and_remap<F>(self, f: F) -> Self
Returns a new column with every item transformed by f.
Equivalent to consuming self and re-encoding all items through f.
For an in-place version see ColumnData::remap.
pub fn remap<F>(&mut self, f: F)
Replaces the column with a re-encoded version where every item has been
transformed by f. For a consuming version see ColumnData::and_remap.
pub fn save_to(&self, out: &mut Vec<u8>) -> Range<usize>
Serializes the column by appending encoded bytes to out.
Returns the byte range written (out[range] is the serialized column data).
The output is compatible with ColumnData::load. If the column is empty (zero items),
nothing is written and an empty range is returned.
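The append-and-return-range pattern described for save_to lets several sections share one output buffer. A minimal sketch of that pattern follows; append_section is a hypothetical helper, and encoded stands in for the column's serialized bytes. Returning len..len for an empty input is a choice made in this sketch.

```rust
use std::ops::Range;

/// Appends `encoded` to `out` and reports the byte range it now occupies.
fn append_section(encoded: &[u8], out: &mut Vec<u8>) -> Range<usize> {
    let start = out.len();
    out.extend_from_slice(encoded);
    // When `encoded` is empty nothing is written and the range is empty.
    start..out.len()
}
```

A caller can record the returned ranges and later slice out[range] to recover each section without re-scanning the buffer.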
pub fn raw_reader(&self, advance: usize) -> RawReader<'_, C::SlabIndex>
impl<C: ColumnCursor> ColumnData<C>
pub fn run_iter(&self) -> impl Iterator<Item = Run<'_, C::Item>>
Iterates over the raw Runs in the column.
Each Run has a count and an optional value. This gives lower-level access to the
RLE structure than iter(), which is useful for re-encoding or bulk inspection.
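To make the count-plus-optional-value shape concrete, here is a minimal run-length sketch over a plain slice. SketchRun, to_runs, and total_len are hypothetical names, and hexane's actual encoding may group items differently; this is basic RLE only.

```rust
/// Hypothetical run: `count` repetitions of `value` (`None` is a null run).
#[derive(Debug, PartialEq)]
struct SketchRun {
    count: usize,
    value: Option<u64>,
}

/// Collapses consecutive equal items into runs.
fn to_runs(items: &[Option<u64>]) -> Vec<SketchRun> {
    let mut runs: Vec<SketchRun> = Vec::new();
    for &value in items {
        // Extend the current run when the value repeats.
        if let Some(last) = runs.last_mut() {
            if last.value == value {
                last.count += 1;
                continue;
            }
        }
        // Otherwise start a new run.
        runs.push(SketchRun { count: 1, value });
    }
    runs
}

/// Number of items the runs decode to.
fn total_len(runs: &[SketchRun]) -> usize {
    runs.iter().map(|r| r.count).sum()
}
```

Working at the run level means a long stretch of identical values or nulls is handled in one step instead of once per item.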
pub fn to_vec(&self) -> Vec<C::Export>
Decodes all items into a Vec. Primarily useful for testing and debugging.
pub fn iter(&self) -> ColumnDataIter<'_, C>
Returns a forward iterator over all items in the column.
The iterator decodes one slab at a time, carrying state across items within each slab
for amortized O(1) per-item cost after an O(log n) initial seek.
For a sub-range use iter_range.
pub fn scope_to_value<B, R>(&self, value: Option<B>, range: R) -> Range<usize>
Returns the contiguous index range where value appears within range.
Requires that the values in range are sorted. Uses B-tree binary search over slabs
followed by a linear scan within the target slab.
Returns an empty range at the found position if value is not present.
For repeated lookups on the same iterator use ColumnDataIter::seek_to_value.
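The contract above (sorted input, a contiguous result, an empty range at the insertion point on a miss) can be sketched with binary search on a flat slice. This swaps the slab-level search for slice::partition_point and is not hexane's implementation. scope_sorted is a hypothetical name, and two details are assumptions of the sketch: nulls sort before values (Rust's ordering for Option), and the range is clamped to the slice.

```rust
use std::ops::Range;

/// Finds the contiguous index range holding `value` inside `range`,
/// assuming `items[range]` is sorted.
fn scope_sorted(items: &[Option<u64>], value: Option<u64>, range: Range<usize>) -> Range<usize> {
    // Clamp the requested range to the slice (a choice made in this sketch).
    let end = range.end.min(items.len());
    let start = range.start.min(end);
    let window = &items[start..end];
    // Count of items strictly below `value`, then of items at or below it.
    let lo = window.partition_point(|item| *item < value);
    let hi = window.partition_point(|item| *item <= value);
    (start + lo)..(start + hi)
}
```

When the value is absent, lo and hi coincide, so the result is an empty range positioned where the value would be inserted.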
pub fn iter_range(&self, range: Range<usize>) -> ColumnDataIter<'_, C>
Returns an iterator over items in range, clamped to the column’s length.
pub fn extend<'b, M, I>(&mut self, values: I) -> Acc
Appends multiple values to the end of the column.
Returns the total Acc contributed by the appended values.
pub fn splice<'b, M, I>(&mut self, index: usize, del: usize, values: I) -> Acc
Removes del items starting at index and inserts values in their place.
This is the primary mutation method. It finds the slab containing index in O(log n),
re-encodes the affected slab with the deletion/insertion applied, then replaces it in
the B-tree. Unaffected slabs are not touched.
Returns the accumulated Acc of the inserted values.
Panics if index > self.len().
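The observable semantics of splice (delete, insert, return the inserted values' accumulator) can be modelled on a plain Vec. This sketch does not reproduce the slab-level re-encoding. splice_items is a hypothetical name, the accumulator is assumed to be a u64 sum with nulls counting as 0, and clamping a deletion that runs past the end is an assumption of the sketch.

```rust
/// Removes `del` items at `index`, inserts `values` there, and returns the
/// assumed accumulator (sum of inserted values, nulls counting as 0).
fn splice_items(
    items: &mut Vec<Option<u64>>,
    index: usize,
    del: usize,
    values: &[Option<u64>],
) -> u64 {
    // Mirrors the documented panic condition.
    assert!(index <= items.len(), "index out of bounds");
    // Assumption: a deletion running past the end is clamped.
    let end = index.saturating_add(del).min(items.len());
    // Dropping the returned iterator completes the replacement.
    drop(items.splice(index..end, values.iter().copied()));
    values.iter().map(|v| v.unwrap_or(0)).sum()
}
```

Passing del = 0 gives a pure insert, and passing an empty values slice gives a pure delete.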
pub fn fill_if_empty(&mut self, len: usize) -> bool
If the column is currently empty, fills it with len null values and returns true.
If the column already has items, returns false without modifying it.
pub fn init_empty(len: usize) -> Self
Creates a column of len null values.
pub fn load_unless_empty(data: &[u8], len: usize) -> Result<Self, PackError>
Deserializes data, or returns a column of len nulls if data is empty.
Returns PackError::InvalidLength if the decoded column has a different length
than len.
pub fn load_with_unless_empty<F>(data: &[u8], len: usize, test: &F) -> Result<Self, PackError>
Like load_unless_empty but also validates each value
with test. If test returns Some(msg), decoding fails with
PackError::InvalidValue.
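The control flow of the two load_*_unless_empty functions (empty input yields nulls, otherwise decode, validate each value, then check the length) can be sketched as follows. Everything here is hypothetical: SketchError stands in for PackError, the one-byte-per-item encoding is a toy and not hexane's format, and the closure signature for test is an assumption, since the bounds on F are not shown above.

```rust
/// Stand-in for the two `PackError` variants mentioned above.
#[derive(Debug, PartialEq)]
enum SketchError {
    InvalidLength,
    InvalidValue(String),
}

/// Decodes `data`, or returns `len` nulls when `data` is empty.
fn load_checked<F>(data: &[u8], len: usize, test: &F) -> Result<Vec<Option<u64>>, SketchError>
where
    F: Fn(Option<u64>) -> Option<String>, // assumed shape of `test`
{
    if data.is_empty() {
        return Ok(vec![None; len]);
    }
    // Toy encoding (an assumption): one byte per item, 0 meaning null.
    let mut items = Vec::with_capacity(data.len());
    for &byte in data {
        let item = if byte == 0 { None } else { Some(u64::from(byte)) };
        // A `Some(msg)` from the validator aborts decoding.
        if let Some(msg) = test(item) {
            return Err(SketchError::InvalidValue(msg));
        }
        items.push(item);
    }
    if items.len() != len {
        return Err(SketchError::InvalidLength);
    }
    Ok(items)
}
```

A validator that always returns None reduces this to the behaviour described for load_unless_empty.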
impl<C: ColumnCursor> ColumnData<C>
pub fn find_by_range(&self, range: Range<usize>) -> impl Iterator<Item = usize> + '_
Returns an iterator over the indices of items whose value falls within range.
Uses slab-level min/max metadata to skip slabs that cannot contain matching values,
making this efficient for sparse matches. Requires that the cursor type supports
min/max tracking (HasMinMax).
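The slab-skipping idea can be sketched with a flat list of slabs, each carrying its own min and max. SketchSlab and find_in_range are hypothetical names. The sketch uses u64 values with a Range<u64> and collects into a Vec, whereas the method above takes a Range<usize> and returns a lazy iterator.

```rust
use std::ops::Range;

/// Hypothetical slab with cached min/max over its non-null values.
struct SketchSlab {
    items: Vec<Option<u64>>,
    min: Option<u64>,
    max: Option<u64>,
}

impl SketchSlab {
    fn new(items: Vec<Option<u64>>) -> Self {
        let min = items.iter().flatten().copied().min();
        let max = items.iter().flatten().copied().max();
        SketchSlab { items, min, max }
    }

    /// False when no value in this slab can fall inside `range`.
    fn may_contain(&self, range: &Range<u64>) -> bool {
        match (self.min, self.max) {
            (Some(min), Some(max)) => min < range.end && max >= range.start,
            _ => false, // all-null slab
        }
    }
}

/// Indices of items whose value lies in `range`, skipping slabs by metadata.
fn find_in_range(slabs: &[SketchSlab], range: Range<u64>) -> Vec<usize> {
    let mut found = Vec::new();
    let mut offset = 0; // index of the current slab's first item
    for slab in slabs {
        if slab.may_contain(&range) {
            for (i, item) in slab.items.iter().enumerate() {
                if matches!(item, Some(v) if range.contains(v)) {
                    found.push(offset + i);
                }
            }
        }
        offset += slab.items.len();
    }
    found
}
```

Only slabs whose [min, max] interval overlaps the query are scanned, which is where the benefit for sparse matches comes from.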
Trait Implementations§
impl<C: Clone + ColumnCursor> Clone for ColumnData<C>
fn clone(&self) -> ColumnData<C>
fn clone_from(&mut self, source: &Self)
Performs copy-assignment from source.