Module columnar


Columnar execution for high-performance aggregation queries

This module implements column-oriented query execution that avoids materializing full Row objects during table scans, which provides an 8-10x speedup on aggregation-heavy workloads.

§Architecture

Instead of:

TableScan → Row{Vec<SqlValue>} → Filter(Row) → Aggregate(Row) → Vec<Row>

We use:

TableScan → ColumnRefs → Filter(native types) → Aggregate → Row
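The difference between the two pipelines can be illustrated with a minimal sketch. It is not this module's implementation: the `SqlValue` enum below is a simplified, hypothetical stand-in for the engine's value type, and the function names and the two-column layout (`qty`, `price`) are assumptions made for the example. Both functions compute the equivalent of `SUM(price) WHERE qty < limit`.

```rust
// Hypothetical, simplified stand-in for the engine's value enum.
#[derive(Clone, Debug, PartialEq)]
enum SqlValue {
    Int(i64),
    Float(f64),
    Null,
}

// Row-oriented path: each row is a Vec<SqlValue>, so every access
// has to match on the enum before it can touch the native value.
fn sum_price_row_oriented(rows: &[Vec<SqlValue>], qty_limit: i64) -> f64 {
    let mut total = 0.0;
    for row in rows {
        // Assumed layout: column 0 is qty, column 1 is price.
        let keep = matches!(&row[0], SqlValue::Int(q) if *q < qty_limit);
        if keep {
            if let SqlValue::Float(p) = &row[1] {
                total += *p;
            }
        }
    }
    total
}

// Column-oriented path: each column is a contiguous, natively typed
// slice, so the filter and the aggregate run without enum matching
// and without building any intermediate rows.
fn sum_price_columnar(qty: &[i64], price: &[f64], qty_limit: i64) -> f64 {
    qty.iter()
        .zip(price.iter())
        .filter(|(q, _)| **q < qty_limit)
        .map(|(_, p)| *p)
        .sum()
}
```

Given the same data, both functions return the same total; the columnar version only ever reads the two slices it needs.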

§Benefits

  • Zero-copy: Work with &SqlValue references instead of cloning
  • Cache-friendly: Access contiguous column data instead of scattered row data
  • Type-specialized: Skip SqlValue enum matching overhead for filters/aggregates
  • Minimal allocations: Only allocate result rows, not intermediate data

§Usage

This path is automatically selected for simple aggregate queries that:

  • Have a single table scan (no JOINs)
  • Use simple WHERE predicates
  • Compute aggregates (SUM, COUNT, AVG, MIN, MAX)
  • Don’t use window functions or complex subqueries
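The selection criteria above can be read as a single boolean check. The sketch below is illustrative only: `QueryShape` and `is_columnar_candidate` are hypothetical names, not items exported by this module, and the real planner presumably inspects the query AST rather than a pre-computed summary.

```rust
// Hypothetical summary of a parsed query; not a type from this module.
struct QueryShape {
    table_count: usize,
    has_joins: bool,
    where_is_simple: bool,
    aggregate_count: usize,
    has_window_functions: bool,
    has_complex_subqueries: bool,
}

// Returns true only when every criterion from the list above holds.
fn is_columnar_candidate(q: &QueryShape) -> bool {
    q.table_count == 1
        && !q.has_joins
        && q.where_is_simple
        && q.aggregate_count > 0
        && !q.has_window_functions
        && !q.has_complex_subqueries
}
```

A query such as `SELECT SUM(price) FROM orders WHERE qty < 10` would satisfy every condition in this sketch, while adding a JOIN or a window function would make the check fail.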

Re-exports§

pub use batch::ColumnArray;
pub use batch::ColumnarBatch;
pub use filter::apply_columnar_filter;
pub use filter::apply_columnar_filter_simd_streaming;
pub use filter::create_filter_bitmap;
pub use filter::create_filter_bitmap_tree;
pub use filter::evaluate_predicate_tree;
pub use filter::extract_column_predicates;
pub use filter::extract_predicate_tree;
pub use filter::ColumnPredicate;
pub use filter::PredicateTree;
pub use simd_filter::simd_create_filter_mask;
pub use simd_filter::simd_create_filter_mask_auto;
pub use simd_filter::simd_create_filter_mask_packed;
pub use simd_filter::simd_filter_batch;
pub use simd_filter::simd_filter_to_indices;
pub use simd_filter::simd_create_filter_mask_parallel;
pub use simd_filter::simd_filter_batch_parallel;
pub use simd_ops::PackedMask;

Modules§

batch
Columnar batch structure for high-performance query execution
filter
Columnar filtering - efficient predicate evaluation on column data
simd_filter
Auto-vectorized filtering for columnar batches
simd_ops
Auto-vectorized SIMD operations for columnar data processing.
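As a rough illustration of mask-based columnar filtering, the sketch below evaluates a predicate over a typed column into a boolean mask, combines two masks, and converts a mask into row indices. All function names here are hypothetical and the one-`bool`-per-row mask is an assumption; the actual `filter`, `simd_filter`, and `simd_ops` modules may use different signatures and a packed bit representation.

```rust
// Evaluate `value < threshold` for every row of a contiguous i64 column.
// A simple loop like this is the kind of code a compiler can auto-vectorize.
fn less_than_mask(column: &[i64], threshold: i64) -> Vec<bool> {
    column.iter().map(|&v| v < threshold).collect()
}

// Combine two masks with logical AND (e.g. for `a < 10 AND b < 5`).
// The result is as long as the shorter input.
fn and_masks(a: &[bool], b: &[bool]) -> Vec<bool> {
    a.iter().zip(b).map(|(&x, &y)| x && y).collect()
}

// Convert a mask into the indices of the rows that passed the filter.
fn mask_to_indices(mask: &[bool]) -> Vec<usize> {
    mask.iter()
        .enumerate()
        .filter(|(_, keep)| **keep)
        .map(|(i, _)| i)
        .collect()
}
```

Keeping the predicate result as a mask means several predicates can be combined before any row is selected, and downstream aggregates can consume either the mask or the index list.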

Structs§

AggregateSpec
A complete aggregate specification
ColumnarScan
Columnar scan over a slice of rows

Enums§

AggregateOp
Aggregate operation type
AggregateSource
Source of data for an aggregate - either a simple column or an expression

Functions§

can_use_simd_for_column
Determine if a column can use SIMD path based on its data type
columnar_group_by
Compute aggregates with GROUP BY using columnar execution
columnar_group_by_batch
Compute aggregates with GROUP BY using SIMD-accelerated columnar execution
columnar_hash_join_inner
Columnar hash join with SIMD-accelerated key comparison (INNER JOIN)
compute_aggregates_from_batch
Compute aggregates directly from a ColumnarBatch (no row conversion)
compute_multiple_aggregates
Compute multiple aggregates in a single pass over the data
evaluate_expression_to_column
Evaluate an expression on a ColumnarBatch and return the result as a ColumnArray
evaluate_expression_with_cached_column
Evaluate an expression using a cached column as a base value
execute_columnar
Execute a query using columnar processing (AST-based interface)
execute_columnar_aggregate
Execute a columnar aggregate query with filtering
execute_columnar_batch
Execute a columnar query end-to-end on a ColumnarBatch
extract_aggregates
Extract aggregate operations from AST expressions
fast_aggregate_on_rows
Fast single-pass aggregate on rows - avoids batch conversion overhead
simd_aggregate_f64
Compute SIMD aggregate for Float64 columns using streaming batches
simd_aggregate_i64
Compute aggregate for Int64 columns using streaming batches
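Several of the functions above compute more than one aggregate in a single pass over the data. The idea can be sketched as follows for an `i64` column; `Int64Summary` and `summarize_i64` are hypothetical names chosen for this example, not items from this module, and the sketch ignores NULL handling and overflow policy, which a real engine would need to define.

```rust
// Hypothetical accumulator holding COUNT, SUM, MIN and MAX together.
#[derive(Debug, PartialEq)]
struct Int64Summary {
    count: u64,
    sum: i64,
    min: Option<i64>,
    max: Option<i64>,
}

impl Int64Summary {
    // AVG is derived from SUM and COUNT; it is None for an empty input.
    fn avg(&self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.sum as f64 / self.count as f64)
        }
    }
}

// Visit each value exactly once and update every aggregate as we go.
fn summarize_i64(values: &[i64]) -> Int64Summary {
    let mut s = Int64Summary { count: 0, sum: 0, min: None, max: None };
    for &v in values {
        s.count += 1;
        s.sum += v; // plain addition: no overflow handling in this sketch
        s.min = Some(s.min.map_or(v, |m| m.min(v)));
        s.max = Some(s.max.map_or(v, |m| m.max(v)));
    }
    s
}
```

Compared with running one scan per aggregate, a single pass reads the column once, which matters most when the column is larger than the CPU cache.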