ColumnStore

Struct ColumnStore 

Source
pub struct ColumnStore<P: Pager> { /* private fields */ }
Expand description

Columnar storage engine for managing Arrow-based data.

ColumnStore provides the primary interface for persisting and retrieving columnar data using Apache Arrow RecordBatches. It manages:

  • Column descriptors and metadata (chunk locations, row counts, min/max values)
  • Data type caching for efficient schema queries
  • Index management (presence indexes, value indexes)
  • Integration with the Pager for persistent storage

§Namespaces

Columns are identified by LogicalFieldId, which combines a namespace, table ID, and field ID. This prevents collisions between user data, row IDs, and MVCC metadata:

  • UserData: Regular table columns
  • RowIdShadow: Internal row ID tracking
  • TxnCreatedBy: MVCC transaction creation timestamps
  • TxnDeletedBy: MVCC transaction deletion timestamps

§Thread Safety

ColumnStore is Send + Sync and can be safely shared across threads via Arc. Internal state (catalog, caches) uses RwLock for concurrent access.

§Test Harness Integration

  • SQLite sqllogictest: Every upstream case exercises the column store, providing a compatibility baseline but not full parity with SQLite yet.
  • DuckDB suites: Early dialect-specific tests stress MVCC and typed casts, informing future work rather than proving comprehensive DuckDB coverage.
  • Hardening mandate: Failures uncovered by the suites result in storage fixes, not filtered tests, to preserve confidence in OLAP scenarios built atop this crate.

Implementations§

Source§

impl<P> ColumnStore<P>
where P: Pager<Blob = EntryHandle> + Send + Sync,

Source

pub fn open(pager: Arc<P>) -> Result<Self>

Opens or creates a ColumnStore using the provided pager.

Loads the column catalog from the pager’s root catalog key, or initializes an empty catalog if none exists. The catalog maps LogicalFieldId to the physical keys of column descriptors.

§Errors

Returns an error if the pager fails to load the catalog or if deserialization fails.

Source

pub fn write_hints(&self) -> ColumnStoreWriteHints

Return heuristics that guide upstream writers when sizing batches.

Source

pub fn register_index( &self, field_id: LogicalFieldId, kind: IndexKind, ) -> Result<()>

Creates and persists an index for a column.

Builds the specified index type for all existing data in the column and persists it atomically. The index will be maintained automatically on subsequent appends and updates.

§Errors

Returns an error if the column doesn’t exist or if index creation fails.

Source

pub fn has_field(&self, field_id: LogicalFieldId) -> bool

Checks if a logical field is registered in the catalog.

Source

pub fn unregister_index( &self, field_id: LogicalFieldId, kind: IndexKind, ) -> Result<()>

Removes a persisted index from a column.

Atomically removes the index and frees associated storage. The column data itself is not affected.

§Errors

Returns an error if the column or index doesn’t exist.

Source

pub fn data_type(&self, field_id: LogicalFieldId) -> Result<DataType>

Returns the Arrow data type of a column.

Returns the data type from cache if available, otherwise loads it from the column descriptor and caches it for future queries.

§Errors

Returns an error if the column doesn’t exist or if the descriptor is corrupted.

Source

pub fn update_data_type( &self, field_id: LogicalFieldId, new_data_type: &DataType, ) -> Result<()>

Updates the data type of an existing column.

This updates the descriptor’s data type fingerprint and cache. Note that this does NOT migrate existing data - the caller must ensure data compatibility.

§Errors

Returns an error if the column doesn’t exist or if the descriptor update fails.

Source

pub fn ensure_column_registered( &self, field_id: LogicalFieldId, data_type: &DataType, ) -> Result<()>

Ensures that catalog entries and descriptors exist for a logical column.

Primarily used when creating empty tables so that subsequent operations (like CREATE INDEX) can resolve column metadata before any data has been appended.

Source

pub fn filter_row_ids<T>( &self, field_id: LogicalFieldId, predicate: &Predicate<T::Value>, ) -> Result<Vec<u64>>
where T: FilterDispatch,

Find all row IDs where a column satisfies a predicate.

This evaluates the predicate against the column’s data and returns a vector of matching row IDs. Uses indexes and chunk metadata (min/max values) to skip irrelevant data when possible.

§Errors

Returns an error if the column doesn’t exist or if chunk data is corrupted.

Source

pub fn filter_matches<T, F>( &self, field_id: LogicalFieldId, predicate: F, ) -> Result<FilterResult>
where T: FilterPrimitive, F: FnMut(T::Native) -> bool,

Evaluate a predicate against a column and return match metadata.

This variant drives a primitive predicate over the column and returns a FilterResult describing contiguous match regions. Callers can use the result to build paginated scans or gather row identifiers.

§Arguments
  • field_id: Logical column to filter.
  • predicate: Callable invoked on each value; should be cheap and free of side effects.
§Errors

Returns an error if the column metadata cannot be loaded or if decoding a chunk fails.

Source

pub fn list_persisted_indexes( &self, field_id: LogicalFieldId, ) -> Result<Vec<IndexKind>>

List all indexes registered for a column.

Returns the types of indexes (e.g., presence, value) that are currently persisted for the specified column.

§Errors

Returns an error if the column doesn’t exist or if the descriptor is corrupted.

Source

pub fn total_rows_for_field(&self, field_id: LogicalFieldId) -> Result<u64>

Get the total number of rows in a column.

Returns the persisted row count from the column’s descriptor. This value is updated by append and delete operations.

§Errors

Returns an error if the column doesn’t exist or if the descriptor is corrupted.

Source

pub fn total_rows_for_table(&self, table_id: TableId) -> Result<u64>

Get the total number of rows in a table.

This returns the maximum row count across all user-data columns in the table. If the table has no persisted columns, returns 0.

§Errors

Returns an error if column descriptors cannot be loaded.

Source

pub fn user_field_ids_for_table(&self, table_id: TableId) -> Vec<LogicalFieldId>

Get all user-data column IDs for a table.

This returns the LogicalFieldIds of all persisted user columns (namespace UserData) belonging to the specified table. MVCC and row ID columns are not included.

Source

pub fn remove_column(&self, field_id: LogicalFieldId) -> Result<()>

Remove a column from the column store catalog.

This removes the column descriptor entry from the in-memory catalog and persists the updated catalog. The actual column data pages remain in the pager but become unreachable and will be garbage collected on compaction.

§Arguments
  • field_id - The logical field ID of the column to remove.
§Errors

Returns an error if the column doesn’t exist or if catalog persistence fails.

Source

pub fn has_row_id( &self, field_id: LogicalFieldId, row_id: RowId, ) -> Result<bool>

Check whether a specific row ID exists in a column.

This uses presence indexes and binary search when available for fast lookups. If no presence index exists, it scans chunks and uses min/max metadata to prune irrelevant data.

§Errors

Returns an error if the column doesn’t exist or if chunk data is corrupted.

Source

pub fn append(&self, batch: &RecordBatch) -> Result<()>

Append a RecordBatch to the store.

The batch must include a rowid column (type UInt64) that uniquely identifies each row. Each other column must have field_id metadata mapping it to a LogicalFieldId.

§Last-Write-Wins Updates

If any row IDs in the batch already exist, they are updated in-place (overwritten) rather than creating duplicates. This happens in a separate transaction before appending new rows.

§Row ID Ordering

The batch is automatically sorted by rowid if not already sorted. This ensures efficient metadata updates and naturally sorted shadow columns.

§Table Separation

Each batch should contain columns from only one table. To append to multiple tables, call append separately for each table’s batch (may be concurrent).

§Errors

Returns an error if:

  • The batch is missing the rowid column
  • Column metadata is missing or invalid
  • Storage operations fail
Source

pub fn delete_rows( &self, fields: &[LogicalFieldId], rows_to_delete: &[RowId], ) -> Result<()>

Delete row positions for one or more logical fields in a single atomic batch.

The same set of global row positions is applied to every field in fields. All staged metadata and chunk updates are committed in a single pager batch.

Source

pub fn verify_integrity(&self) -> Result<()>

Verifies the integrity of the column store’s metadata.

This check is useful for tests and debugging. It verifies:

  1. The catalog can be read.
  2. All descriptor chains are walkable.
  3. The row and chunk counts in the descriptors match the sum of the chunk metadata.

Returns Ok(()) if consistent, otherwise returns an Error.

Source

pub fn get_layout_stats(&self) -> Result<Vec<ColumnLayoutStats>>

Gathers detailed statistics about the storage layout.

This method is designed for low-level analysis and debugging, allowing you to check for under- or over-utilization of descriptor pages.

Source§

impl<P> ColumnStore<P>
where P: Pager<Blob = EntryHandle>,

Source

pub fn scan( &self, field_id: LogicalFieldId, opts: ScanOptions, visitor: &mut dyn PrimitiveFullVisitor, ) -> Result<()>

Unified scan entrypoint configured by ScanOptions. Requires V to implement both unsorted and sorted visitor traits; methods are no-ops by default.

Source§

impl<P> ColumnStore<P>
where P: Pager<Blob = EntryHandle> + Send + Sync,

Source

pub fn gather_rows( &self, field_ids: &[LogicalFieldId], row_ids: &[u64], policy: GatherNullPolicy, ) -> Result<RecordBatch>

Gathers multiple columns using a configurable null-handling policy. When GatherNullPolicy::DropNulls is selected, rows where all projected columns are null or missing are removed from the resulting batch.

Source

pub fn gather_rows_with_schema( &self, field_ids: &[LogicalFieldId], row_ids: &[u64], policy: GatherNullPolicy, expected_schema: Option<Arc<Schema>>, ) -> Result<RecordBatch>

Gather rows with an optional expected schema for empty result sets.

When expected_schema is provided and row_ids is empty, the returned RecordBatch will use that schema instead of synthesizing one with all nullable fields. This ensures that non-nullable columns (e.g., PRIMARY KEYs) are correctly represented.

Source

pub fn prepare_gather_context( &self, field_ids: &[LogicalFieldId], ) -> Result<MultiGatherContext>

Source

pub fn gather_rows_with_reusable_context( &self, ctx: &mut MultiGatherContext, row_ids: &[u64], policy: GatherNullPolicy, ) -> Result<RecordBatch>

Gathers rows while reusing chunk caches and scratch buffers stored in the context.

This path amortizes chunk fetch and decode costs across multiple calls by retaining Arrow arrays and scratch state inside the provided context.

Trait Implementations§

Source§

impl<P> Clone for ColumnStore<P>
where P: Pager<Blob = EntryHandle> + Send + Sync,

Source§

fn clone(&self) -> Self

Returns a duplicate of the value. Read more
1.0.0 · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source. Read more
Source§

impl<P: Pager> ColumnStoreDebug for ColumnStore<P>

Source§

fn render_storage_as_formatted_string(&self) -> String

Renders the entire physical layout of the store into a formatted ASCII table string.
Source§

fn render_storage_as_dot( &self, batch_colors: &HashMap<PhysicalKey, usize>, ) -> String

Renders the physical layout of the store into a Graphviz .dot file string. The caller provides a map to color nodes based on an external category, like batch number.

Auto Trait Implementations§

§

impl<P> !Freeze for ColumnStore<P>

§

impl<P> RefUnwindSafe for ColumnStore<P>
where P: RefUnwindSafe,

§

impl<P> Send for ColumnStore<P>

§

impl<P> Sync for ColumnStore<P>

§

impl<P> Unpin for ColumnStore<P>

§

impl<P> UnwindSafe for ColumnStore<P>
where P: RefUnwindSafe,

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise. Read more
Source§

impl<T> Pointable for T

Source§

const ALIGN: usize

The alignment of pointer.
Source§

type Init = T

The type for initializers.
Source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer. Read more
Source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer. Read more
Source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer. Read more
Source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer. Read more
Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning. Read more
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning. Read more
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,