Columnar execution for high-performance aggregation queries
This module implements column-oriented query execution that avoids materializing full Row objects during table scans, providing 8-10x speedup for aggregation-heavy workloads.
§Architecture
Instead of:
TableScan → Row{Vec<SqlValue>} → Filter(Row) → Aggregate(Row) → Vec<Row>
We use:
TableScan → ColumnRefs → Filter(native types) → Aggregate → Row
§Benefits
- Zero-copy: Work with &SqlValue references instead of cloning
- Cache-friendly: Access contiguous column data instead of scattered row data
- Type-specialized: Skip SqlValue enum matching overhead for filters/aggregates
- Minimal allocations: Only allocate result rows, not intermediate data
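The difference between the two pipelines can be illustrated with a minimal sketch. The `SqlValue` enum below is a simplified, hypothetical stand-in for the crate's own type, and `sum_rows` / `sum_columns` are illustrative names, not functions exported by this module. The row path matches on the enum tag for every value; the column path works on natively typed slices.

```rust
// Hypothetical stand-in for the crate's SqlValue; the real enum is assumed
// to have more variants.
#[derive(Clone, Debug, PartialEq)]
enum SqlValue {
    Integer(i64),
    Double(f64),
    Null,
}

// Row-oriented path: each row is a Vec<SqlValue>, so the hot loop has to
// inspect the enum tag of every value it touches.
fn sum_rows(rows: &[Vec<SqlValue>], filter_col: usize, threshold: i64, sum_col: usize) -> f64 {
    let mut total = 0.0;
    for row in rows {
        // Non-integer values (including Null) never pass the filter.
        let keep = matches!(&row[filter_col], SqlValue::Integer(v) if *v > threshold);
        if keep {
            if let SqlValue::Double(x) = &row[sum_col] {
                total += *x;
            }
        }
    }
    total
}

// Column-oriented path: the same filter + SUM over contiguous, natively
// typed slices, with no per-value enum matching and no row allocation.
fn sum_columns(filter_col: &[i64], sum_col: &[f64], threshold: i64) -> f64 {
    filter_col
        .iter()
        .zip(sum_col)
        .filter(|(k, _)| **k > threshold)
        .map(|(_, x)| *x)
        .sum()
}
```

Both functions compute the same result for the same data; only the memory layout and the amount of per-value dispatch differ.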
§Usage
This path is automatically selected for simple aggregate queries that:
- Have a single table scan (no JOINs)
- Use simple WHERE predicates
- Compute aggregates (SUM, COUNT, AVG, MIN, MAX)
- Don’t use window functions or complex subqueries
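The selection criteria above could be expressed as a predicate along the following lines. This is only a sketch: `QueryShape` and `is_columnar_candidate` are hypothetical names introduced for illustration, and the module's actual selection logic (which operates on the query AST) may differ.

```rust
// Hypothetical summary of a parsed query; not the crate's planner types.
struct QueryShape {
    table_count: usize,
    has_joins: bool,
    simple_where: bool,
    has_window_functions: bool,
    has_complex_subqueries: bool,
    aggregates: Vec<String>,
}

// Returns true when the query matches all four criteria listed above.
fn is_columnar_candidate(q: &QueryShape) -> bool {
    const SUPPORTED: [&str; 5] = ["SUM", "COUNT", "AVG", "MIN", "MAX"];
    q.table_count == 1
        && !q.has_joins
        && q.simple_where
        && !q.has_window_functions
        && !q.has_complex_subqueries
        && !q.aggregates.is_empty()
        && q
            .aggregates
            .iter()
            .all(|a| SUPPORTED.iter().any(|s| s.eq_ignore_ascii_case(a)))
}
```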
Re-exports§
pub use batch::ColumnArray;
pub use batch::ColumnarBatch;
pub use filter::apply_columnar_filter;
pub use filter::apply_columnar_filter_simd_streaming;
pub use filter::create_filter_bitmap;
pub use filter::create_filter_bitmap_tree;
pub use filter::evaluate_predicate_tree;
pub use filter::extract_column_predicates;
pub use filter::extract_predicate_tree;
pub use filter::ColumnPredicate;
pub use filter::PredicateTree;
pub use simd_filter::simd_create_filter_mask;
pub use simd_filter::simd_create_filter_mask_auto;
pub use simd_filter::simd_create_filter_mask_packed;
pub use simd_filter::simd_filter_batch;
pub use simd_filter::simd_filter_to_indices;
pub use simd_filter::simd_create_filter_mask_parallel;
pub use simd_filter::simd_filter_batch_parallel;
pub use simd_ops::PackedMask;
Modules§
- batch: Columnar batch structure for high-performance query execution
- filter: Columnar filtering - efficient predicate evaluation on column data
- simd_filter: Auto-vectorized filtering for columnar batches
- simd_ops: Auto-vectorized SIMD operations for columnar data processing
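As a rough illustration of what a filter mask over a column looks like, the sketch below builds a one-bool-per-row mask with a branch-free loop (the kind of loop compilers can often auto-vectorize) and then packs it into 64-bit words. The function names are hypothetical and do not reflect the signatures of `simd_create_filter_mask` or the layout of `PackedMask`, neither of which is documented on this page.

```rust
// Hypothetical helper: one bool per row, true where value > threshold.
// The loop body has no branches, which leaves room for auto-vectorization.
fn filter_mask_gt(values: &[i64], threshold: i64) -> Vec<bool> {
    values.iter().map(|&v| v > threshold).collect()
}

// Pack the mask into 64-bit words, one bit per row (bit i % 64 of word i / 64).
// Assumed layout for illustration only.
fn pack_mask(mask: &[bool]) -> Vec<u64> {
    let mut words = vec![0u64; (mask.len() + 63) / 64];
    for (i, &keep) in mask.iter().enumerate() {
        words[i / 64] |= (keep as u64) << (i % 64);
    }
    words
}

// Count selected rows directly from the packed representation.
fn count_selected(words: &[u64]) -> u32 {
    words.iter().map(|w| w.count_ones()).sum()
}
```

A packed mask uses one bit per row instead of one byte, and counting matches reduces to a population count per word.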
Structs§
- AggregateSpec: A complete aggregate specification
- ColumnarScan: Columnar scan over a slice of rows
Enums§
- AggregateOp: Aggregate operation type
- AggregateSource: Source of data for an aggregate - either a simple column or an expression
Functions§
- can_use_simd_for_column: Determine if a column can use SIMD path based on its data type
- columnar_group_by: Compute aggregates with GROUP BY using columnar execution
- columnar_group_by_batch: Compute aggregates with GROUP BY using SIMD-accelerated columnar execution
- columnar_hash_join_inner: Columnar hash join with SIMD-accelerated key comparison (INNER JOIN)
- compute_aggregates_from_batch: Compute aggregates directly from a ColumnarBatch (no row conversion)
- compute_multiple_aggregates: Compute multiple aggregates in a single pass over the data
- evaluate_expression_to_column: Evaluate an expression on a ColumnarBatch and return the result as a ColumnArray
- evaluate_expression_with_cached_column: Evaluate an expression using a cached column as a base value
- execute_columnar: Execute a query using columnar processing (AST-based interface)
- execute_columnar_aggregate: Execute a columnar aggregate query with filtering
- execute_columnar_batch: Execute a columnar query end-to-end on a ColumnarBatch
- extract_aggregates: Extract aggregate operations from AST expressions
- fast_aggregate_on_rows: Fast single-pass aggregate on rows - avoids batch conversion overhead
- simd_aggregate_f64: Compute SIMD aggregate for Float64 columns using streaming batches
- simd_aggregate_i64: Compute aggregate for Int64 columns using streaming batches
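Several of the functions above compute more than one aggregate in a single pass over the data. A minimal sketch of that idea, assuming a plain `&[f64]` column and a hypothetical `Stats` result type (neither is the module's actual signature), might look like this:

```rust
// Hypothetical result type holding all aggregates gathered in one pass.
#[derive(Debug, PartialEq)]
struct Stats {
    count: u64,
    sum: f64,
    min: f64,
    max: f64,
}

impl Stats {
    // AVG is derived from SUM and COUNT rather than tracked separately.
    fn avg(&self) -> f64 {
        self.sum / self.count as f64
    }
}

// Visit each value exactly once and update every accumulator.
// Returns None for an empty column, where MIN/MAX/AVG are undefined.
fn aggregate_single_pass(values: &[f64]) -> Option<Stats> {
    let (&first, rest) = values.split_first()?;
    let mut s = Stats { count: 1, sum: first, min: first, max: first };
    for &v in rest {
        s.count += 1;
        s.sum += v;
        s.min = s.min.min(v);
        s.max = s.max.max(v);
    }
    Some(s)
}
```

Reading the column once for all five aggregates, instead of once per aggregate, keeps memory traffic proportional to the column size rather than to the number of aggregates requested.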