Struct DataChunk 

pub struct DataChunk { /* private fields */ }

A batch of rows stored column-wise for vectorized processing.

Instead of storing rows like [(a1,b1), (a2,b2), ...], we store columns like [a1,a2,...], [b1,b2,...]. This is cache-friendly for analytical queries that touch few columns but many rows.
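
The difference between the two layouts can be sketched with plain vectors. The types below (`RowTable`, `ColumnTable`) are hypothetical illustrations and are not part of grafeo_core; they only show why a scan over one column reads a single contiguous vector.

```rust
// Hypothetical illustration types; not part of grafeo_core.

/// Row-wise layout: each element holds every field of one row.
pub struct RowTable {
    pub rows: Vec<(i64, f64)>,
}

/// Column-wise layout: each vector holds one field for all rows.
pub struct ColumnTable {
    pub ids: Vec<i64>,
    pub scores: Vec<f64>,
}

impl RowTable {
    /// Sums the first field. Every row tuple is visited even though
    /// only one field is needed.
    pub fn sum_ids(&self) -> i64 {
        self.rows.iter().map(|(id, _)| *id).sum()
    }

    /// Converts to the column-wise layout.
    pub fn to_columns(&self) -> ColumnTable {
        let (ids, scores): (Vec<i64>, Vec<f64>) = self.rows.iter().copied().unzip();
        ColumnTable { ids, scores }
    }
}

impl ColumnTable {
    /// Sums the `ids` column by scanning one contiguous vector.
    pub fn sum_ids(&self) -> i64 {
        self.ids.iter().sum()
    }
}
```

Both `sum_ids` methods return the same result; the column-wise version never touches `scores`.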

The optional SelectionVector lets you filter rows without copying data - just mark which row indices are “selected”.
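
A minimal sketch of the selection idea, assuming a single i64 column and a selection stored as a list of row indices. The `Chunk` type and its `select` method are hypothetical and do not reflect how SelectionVector is actually implemented.

```rust
// Hypothetical sketch; `Chunk` is not grafeo_core's DataChunk.

/// A single-column chunk with an optional selection of row indices.
pub struct Chunk {
    pub values: Vec<i64>,
    /// `None` means every row is selected.
    pub selection: Option<Vec<usize>>,
}

impl Chunk {
    /// Returns the currently selected row indices.
    pub fn selected_indices(&self) -> Vec<usize> {
        match &self.selection {
            Some(sel) => sel.clone(),
            None => (0..self.values.len()).collect(),
        }
    }

    /// Keeps only the selected rows matching `pred` by recording their
    /// indices. The `values` vector itself is left untouched.
    pub fn select(&mut self, pred: impl Fn(i64) -> bool) {
        let kept: Vec<usize> = self
            .selected_indices()
            .into_iter()
            .filter(|&i| pred(self.values[i]))
            .collect();
        self.selection = Some(kept);
    }

    /// Number of rows visible through the selection.
    pub fn row_count(&self) -> usize {
        self.selection.as_ref().map_or(self.values.len(), |s| s.len())
    }
}
```

Applying two filters in a row narrows the index list each time while the underlying data stays in place.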

Example

use grafeo_core::execution::DataChunk;
use grafeo_core::execution::ValueVector;
use grafeo_common::types::Value;

// Create columns
let names = ValueVector::from_values(&[Value::from("Alix"), Value::from("Gus")]);
let ages = ValueVector::from_values(&[Value::from(30i64), Value::from(25i64)]);

// Bundle into a chunk
let chunk = DataChunk::new(vec![names, ages]);
assert_eq!(chunk.len(), 2);

Implementations

impl DataChunk

pub fn empty() -> Self

Creates an empty data chunk with no columns.

pub fn new(columns: Vec<ValueVector>) -> Self

Creates a new data chunk from existing vectors.

pub fn with_schema(column_types: &[LogicalType]) -> Self

Creates a new empty data chunk with the given schema.

pub fn with_capacity(column_types: &[LogicalType], capacity: usize) -> Self

Creates a new data chunk with the given schema and capacity.

pub fn column_count(&self) -> usize

Returns the number of columns.

pub fn row_count(&self) -> usize

Returns the number of rows, counting only the selected rows when a selection vector is set.

pub fn len(&self) -> usize

Alias for row_count().

pub fn columns(&self) -> &[ValueVector]

Returns all columns.

pub fn total_row_count(&self) -> usize

Returns the total number of rows (ignoring selection).

pub fn is_empty(&self) -> bool

Returns true if the chunk is empty.

pub fn capacity(&self) -> usize

Returns the capacity of this chunk.

pub fn is_full(&self) -> bool

Returns true if the chunk is full.

pub fn column(&self, index: usize) -> Option<&ValueVector>

Gets a column by index, or None if there is no column at that index.

pub fn column_mut(&mut self, index: usize) -> Option<&mut ValueVector>

Gets a mutable column by index, or None if there is no column at that index.

pub fn selection(&self) -> Option<&SelectionVector>

Returns the selection vector, or None if no selection is set.

pub fn set_selection(&mut self, selection: SelectionVector)

Sets the selection vector.

pub fn clear_selection(&mut self)

Clears the selection vector (selects all rows).

pub fn set_zone_hints(&mut self, hints: ChunkZoneHints)

Sets zone map hints for this chunk.

Zone map hints enable the filter operator to skip entire chunks when predicates can’t possibly match based on min/max statistics.
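
The pruning decision can be sketched as a pure function over min/max statistics. `ZoneHint`, `Predicate`, and `can_skip` are hypothetical names chosen for illustration; the actual layout of ChunkZoneHints and the predicates the filter operator supports are not shown on this page.

```rust
// Hypothetical sketch; these types do not mirror ChunkZoneHints.

/// Min/max statistics for one integer column of a chunk.
#[derive(Debug, Clone, Copy)]
pub struct ZoneHint {
    pub min: i64,
    pub max: i64,
}

/// A comparison of the column against a constant.
#[derive(Debug, Clone, Copy)]
pub enum Predicate {
    Eq(i64),
    Lt(i64),
    Gt(i64),
}

/// Returns true when no value in `[hint.min, hint.max]` can satisfy
/// `pred`, so the whole chunk can be skipped without reading any row.
pub fn can_skip(hint: ZoneHint, pred: Predicate) -> bool {
    match pred {
        // Equality cannot match if the constant lies outside the range.
        Predicate::Eq(v) => v < hint.min || v > hint.max,
        // `x < v` cannot match if even the smallest value is >= v.
        Predicate::Lt(v) => hint.min >= v,
        // `x > v` cannot match if even the largest value is <= v.
        Predicate::Gt(v) => hint.max <= v,
    }
}
```

The check is conservative: returning false only means the chunk might contain matches, so its rows still have to be evaluated.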

pub fn zone_hints(&self) -> Option<&ChunkZoneHints>

Returns zone map hints if available.

Used by the filter operator for chunk-level predicate pruning.

pub fn clear_zone_hints(&mut self)

Clears zone map hints.

pub fn set_count(&mut self, count: usize)

Sets the row count.

pub fn reset(&mut self)

Resets the chunk for reuse.

pub fn flatten(&mut self)

Flattens the selection by copying only selected rows.

After this operation, selection is None and count equals the previously selected row count.
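
A rough sketch of what flattening involves, assuming i64 columns and a selection stored as row indices. The `Chunk` type here is hypothetical, and its `count` field is assumed to hold the physical row count before flattening.

```rust
// Hypothetical sketch; `Chunk` is not grafeo_core's DataChunk.

pub struct Chunk {
    pub columns: Vec<Vec<i64>>,
    /// `None` means every row is selected.
    pub selection: Option<Vec<usize>>,
    pub count: usize,
}

impl Chunk {
    /// Copies the selected rows into dense columns and drops the selection.
    pub fn flatten(&mut self) {
        // `take` leaves `None` behind, so the selection is cleared here.
        if let Some(sel) = self.selection.take() {
            for col in &mut self.columns {
                let compacted: Vec<i64> = sel.iter().map(|&i| col[i]).collect();
                *col = compacted;
            }
            self.count = sel.len();
        }
    }
}
```

Flattening trades one copy for dense columns, which suits consumers that do not understand selection vectors.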

pub fn sort_by_column(&self, col_idx: usize) -> DataChunk

Returns a new chunk with rows sorted by the values in the given column.

For NodeId columns, sorts by the raw node ID. For other types, sorts by the Value’s natural ordering. Rows with null values in the sort column are placed last.

This is used for locality optimization: sorting by source node ID before an expand operator groups nearby nodes together, improving cache locality during adjacency index lookups.
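
One way to get the nulls-last ordering is to sort row indices by a composite key and then gather every column in that order. The helpers below are hypothetical and model the sort column as `Option<u64>` raw IDs; they are a sketch of the technique, not the method's actual implementation.

```rust
// Hypothetical helpers; not part of grafeo_core.

/// Returns row indices ordered by `keys`, with `None` (null) keys last.
pub fn sorted_indices_nulls_last(keys: &[Option<u64>]) -> Vec<usize> {
    let mut order: Vec<usize> = (0..keys.len()).collect();
    // The key is (is_null, value). `false < true`, so null rows sort
    // after all non-null rows. The sort is stable, so ties keep their
    // original relative order.
    order.sort_by_key(|&i| (keys[i].is_none(), keys[i]));
    order
}

/// Builds a reordered copy of one column according to `order`.
pub fn gather<T: Clone>(column: &[T], order: &[usize]) -> Vec<T> {
    order.iter().map(|&i| column[i].clone()).collect()
}
```

Computing the permutation once and applying it to each column keeps rows aligned across columns.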

pub fn selected_indices(&self) -> Box<dyn Iterator<Item = usize> + '_>

Returns an iterator over selected row indices.
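
The boxed return type is a natural fit when the two cases produce different iterator types. A minimal sketch, assuming a hypothetical `Chunk` whose selection is a list of indices:

```rust
// Hypothetical sketch; `Chunk` is not grafeo_core's DataChunk.

pub struct Chunk {
    /// Physical row count.
    pub count: usize,
    /// `None` means every row is selected.
    pub selection: Option<Vec<usize>>,
}

impl Chunk {
    /// Yields the selected row indices. With no selection this is the
    /// range `0..count`; otherwise it walks the stored indices. The two
    /// branches have different concrete types, hence the `Box<dyn ...>`.
    pub fn selected_indices(&self) -> Box<dyn Iterator<Item = usize> + '_> {
        match &self.selection {
            Some(sel) => Box::new(sel.iter().copied()),
            None => Box::new(0..self.count),
        }
    }
}
```

Callers can iterate the same way in both cases, for example `for i in chunk.selected_indices() { ... }`.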

pub fn concat(chunks: &[DataChunk]) -> DataChunk

Concatenates multiple chunks into a single chunk.

All chunks must have the same schema (same number and types of columns).
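
The schema requirement can be illustrated with a column-by-column append. This sketch uses a hypothetical `Chunk` with i64 columns, so the schema check reduces to comparing column counts, and it reports a mismatch through `Result`, which is an assumption of the sketch: the method documented here returns a DataChunk directly, and its behavior on mismatched schemas is not described on this page.

```rust
// Hypothetical sketch; `Chunk` is not grafeo_core's DataChunk.

#[derive(Debug, Clone, PartialEq)]
pub struct Chunk {
    pub columns: Vec<Vec<i64>>,
}

/// Appends the columns of every chunk, in order, into one chunk.
/// Returns an error if a chunk has a different number of columns.
pub fn concat(chunks: &[Chunk]) -> Result<Chunk, String> {
    let width = match chunks.first() {
        Some(first) => first.columns.len(),
        // No input chunks: return a chunk with no columns.
        None => return Ok(Chunk { columns: Vec::new() }),
    };
    let mut columns: Vec<Vec<i64>> = vec![Vec::new(); width];
    for (n, chunk) in chunks.iter().enumerate() {
        if chunk.columns.len() != width {
            return Err(format!(
                "chunk {} has {} columns, expected {}",
                n,
                chunk.columns.len(),
                width
            ));
        }
        for (dst, src) in columns.iter_mut().zip(&chunk.columns) {
            dst.extend_from_slice(src);
        }
    }
    Ok(Chunk { columns })
}
```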

pub fn filter(&self, predicate: &SelectionVector) -> DataChunk

Applies a filter predicate and returns a new chunk with selected rows.

pub fn slice(&self, offset: usize, count: usize) -> DataChunk

Returns a slice of this chunk.

Returns a new DataChunk containing rows [offset, offset + count).
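
The half-open range means `offset` is included and `offset + count` is excluded. A small sketch over plain i64 columns; the `slice_rows` helper is hypothetical, and clamping an out-of-range request to the available rows is an assumption of this sketch, since the page does not say how the method handles that case.

```rust
// Hypothetical helper; not part of grafeo_core.

/// Copies rows `[offset, offset + count)` from every column.
/// Requests that run past the end are clamped (an assumption).
pub fn slice_rows(columns: &[Vec<i64>], offset: usize, count: usize) -> Vec<Vec<i64>> {
    columns
        .iter()
        .map(|col| {
            let start = offset.min(col.len());
            let end = offset.saturating_add(count).min(col.len());
            col[start..end].to_vec()
        })
        .collect()
}
```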

pub fn num_columns(&self) -> usize

Returns the number of columns.

Trait Implementations

impl Clone for DataChunk

fn clone(&self) -> Self

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.

impl Debug for DataChunk

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

impl From<DataChunk> for ChunkVariant

fn from(chunk: DataChunk) -> Self

Converts to this type from the input type.

Auto Trait Implementations

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CloneToUninit for T
where T: Clone,

unsafe fn clone_to_uninit(&self, dest: *mut u8)

This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> ToOwned for T
where T: Clone,

type Owned = T

The resulting type after obtaining ownership.

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.