Struct Executor

pub struct Executor { /* private fields */ }

SQL Query Executor

The executor is the main entry point for executing SQL statements. It coordinates between the parser, storage engine, and function registry.

Implementations

impl Executor

pub fn try_storage_aggregation( &self, table: &dyn Table, stmt: &SelectStatement, all_columns: &[String], classification: &QueryClassification, ) -> Option<Box<dyn QueryResult>>

Try to use storage-level aggregation for GROUP BY queries.

This optimization bypasses row materialization by computing aggregates directly from arena storage using Arc::clone for group keys.

Returns None if the optimization cannot be applied.

Currently only applies to simple queries with:

GROUP BY columns that match SELECT identifiers exactly (same order)
Simple aggregates (COUNT, SUM, AVG, MIN, MAX) on column references
No WHERE, HAVING, ROLLUP, CUBE, or GROUPING SETS
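The idea of aggregating without materializing rows can be sketched roughly as follows. This is not the crate's implementation: `StoredRow` and `group_count_sum` are hypothetical names, and a plain slice stands in for arena storage. The point is only that `Arc::clone` on a group key bumps a reference count instead of copying the key.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for a row held in arena storage.
struct StoredRow {
    group_key: Arc<str>,
    amount: i64,
}

/// Computes (COUNT(*), SUM(amount)) per group in a single pass,
/// without building intermediate output rows.
fn group_count_sum(rows: &[StoredRow]) -> HashMap<Arc<str>, (u64, i64)> {
    let mut groups: HashMap<Arc<str>, (u64, i64)> = HashMap::new();
    for row in rows {
        // Arc::clone only increments a reference count; the key bytes are shared.
        let entry = groups.entry(Arc::clone(&row.group_key)).or_insert((0, 0));
        entry.0 += 1;
        entry.1 += row.amount;
    }
    groups
}
```

A query such as `SELECT k, COUNT(*), SUM(amount) FROM t GROUP BY k` maps onto this shape; anything with WHERE or HAVING would fall outside the sketch, as it does for the optimization described above.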

impl Executor

pub fn try_extract_semi_join_info( exists: &ExistsExpression, is_negated: bool, outer_tables: &[String], ) -> Option<SemiJoinInfo>

Try to extract semi-join information from a correlated EXISTS subquery.

For semi-join optimization, we need:

A simple table source (no joins in subquery)
A WHERE clause with inner.col = outer.col equality
Optional additional non-correlated predicates

Returns None if the subquery cannot be optimized as a semi-join.
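A minimal sketch of the extraction rule, assuming a toy expression tree. `Expr`, `Correlation`, and `extract_correlation` are hypothetical and much simpler than the crate's `ExistsExpression` and `SemiJoinInfo`. It accepts exactly one `inner.col = outer.col` equality, keeps the remaining non-correlated predicates, and gives up otherwise.

```rust
/// Hypothetical, heavily simplified expression tree.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column { table: String, name: String },
    Literal(i64),
    Eq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Hypothetical result of a successful extraction.
#[derive(Debug, PartialEq)]
struct Correlation {
    inner_column: String,
    outer_column: String,
    residual: Vec<Expr>, // non-correlated predicates
}

/// Splits a tree of ANDs into its individual conjuncts.
fn flatten_and(expr: &Expr, out: &mut Vec<Expr>) {
    match expr {
        Expr::And(l, r) => {
            flatten_and(l, out);
            flatten_and(r, out);
        }
        other => out.push(other.clone()),
    }
}

fn references_outer(expr: &Expr, outer_tables: &[String]) -> bool {
    match expr {
        Expr::Column { table, .. } => outer_tables.contains(table),
        Expr::Literal(_) => false,
        Expr::Eq(l, r) | Expr::And(l, r) => {
            references_outer(l, outer_tables) || references_outer(r, outer_tables)
        }
    }
}

fn extract_correlation(where_clause: &Expr, outer_tables: &[String]) -> Option<Correlation> {
    let mut conjuncts = Vec::new();
    flatten_and(where_clause, &mut conjuncts);
    let mut found: Option<(String, String)> = None;
    let mut residual = Vec::new();
    for c in conjuncts {
        if let Expr::Eq(l, r) = &c {
            if let (
                Expr::Column { table: lt, name: ln },
                Expr::Column { table: rt, name: rn },
            ) = (&**l, &**r)
            {
                let l_outer = outer_tables.contains(lt);
                let r_outer = outer_tables.contains(rt);
                if l_outer != r_outer {
                    if found.is_some() {
                        return None; // more than one correlation: not handled here
                    }
                    found = Some(if r_outer {
                        (ln.clone(), rn.clone())
                    } else {
                        (rn.clone(), ln.clone())
                    });
                    continue;
                }
            }
        }
        if references_outer(&c, outer_tables) {
            return None; // correlated in some other way
        }
        residual.push(c);
    }
    let (inner_column, outer_column) = found?;
    Some(Correlation { inner_column, outer_column, residual })
}
```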

pub fn should_use_index_nested_loop_for_anti_join( &self, _info: &SemiJoinInfo, outer_limit: Option<i64>, ) -> bool

Check if index-nested-loop should be preferred over anti-join for NOT EXISTS.

For NOT EXISTS, anti-join using HashJoinOperator is almost always more efficient than both index-nested-loop and InHashSet because:

HashJoinOperator does bulk hash table build/probe (cache-efficient)
No per-row expression evaluation overhead
Even with LIMIT, the bulk operation is faster than per-row checking

The only case where we might prefer index-nested-loop is for VERY small LIMIT (e.g., LIMIT 10) with a highly selective index, but benchmarks show hash join is still faster in most cases.

pub fn execute_semi_join_optimization( &self, info: &SemiJoinInfo, ctx: &ExecutionContext, ) -> Result<CompactArc<ValueSet>>

Execute the semi-join optimization for an EXISTS subquery.

Instead of executing the subquery for each outer row, we:

Execute the inner query once with non-correlated predicates
Collect all distinct values of the inner correlation column
Return an FxHashSet for fast O(1) lookups

Results are cached to avoid re-execution for the same query within a single top-level query execution.
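The three steps and the caching can be sketched as below. This assumes integer correlation values and uses the standard `HashSet` in place of `FxHashSet`; `SemiJoinCache`, `get_or_build`, and `semi_join` are hypothetical names, not part of this API.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Arc;

/// Hypothetical per-query cache: subquery key -> distinct inner values.
#[derive(Default)]
struct SemiJoinCache {
    sets: HashMap<String, Arc<HashSet<i64>>>,
    inner_scans: usize,
}

impl SemiJoinCache {
    /// Builds the value set once per key; later calls reuse the shared set.
    fn get_or_build(&mut self, key: &str, inner_values: &[i64]) -> Arc<HashSet<i64>> {
        if let Some(set) = self.sets.get(key) {
            return Arc::clone(set);
        }
        self.inner_scans += 1;
        // Collecting into a set deduplicates the inner correlation column.
        let set: Arc<HashSet<i64>> = Arc::new(inner_values.iter().copied().collect());
        self.sets.insert(key.to_string(), Arc::clone(&set));
        set
    }
}

/// Keeps the outer keys that have at least one match on the inner side.
fn semi_join(outer_keys: &[i64], inner_set: &HashSet<i64>) -> Vec<i64> {
    outer_keys
        .iter()
        .copied()
        .filter(|k| inner_set.contains(k))
        .collect()
}
```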

pub fn execute_anti_join( &self, info: &SemiJoinInfo, outer_rows: CompactArc<Vec<Row>>, outer_columns: &[String], _ctx: &ExecutionContext, ) -> Result<RowVec>

Execute NOT EXISTS as a true anti-join using HashJoinOperator.

This is more efficient than the InHashSet approach because:

HashJoinOperator builds hash table once and probes in bulk
No per-row expression evaluation overhead
Better cache efficiency due to batch processing
Direct table access without going through full query pipeline

Arguments

info - SemiJoinInfo extracted from the NOT EXISTS subquery
outer_rows - Pre-materialized outer table rows
outer_columns - Column names for the outer table
_ctx - Execution context (not used but kept for API consistency)

Returns

Rows from the outer table that have no match in the inner table (the anti-join result)
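For illustration, a hash anti-join reduces to one build pass and one probe pass. The sketch below assumes rows are `(key, payload)` tuples and uses a plain `HashSet`; `hash_anti_join` is a hypothetical helper, not the crate's `HashJoinOperator`.

```rust
use std::collections::HashSet;

/// Keeps the outer rows whose key has no match among the inner keys.
fn hash_anti_join<'a>(
    outer_rows: &[(i64, &'a str)],
    inner_keys: &[i64],
) -> Vec<(i64, &'a str)> {
    // Build phase: a single pass over the inner side.
    let built: HashSet<i64> = inner_keys.iter().copied().collect();
    // Probe phase: a single pass over the outer side, one lookup per row.
    outer_rows
        .iter()
        .filter(|(key, _)| !built.contains(key))
        .copied()
        .collect()
}
```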

pub fn try_extract_not_exists_info( expr: &Expression, outer_tables: &[String], ) -> Option<SemiJoinInfo>

Try to extract SemiJoinInfo from a NOT EXISTS expression. Returns None if the expression is not a valid NOT EXISTS pattern.

pub fn transform_exists_to_in_list( info: &SemiJoinInfo, hash_set: CompactArc<ValueSet>, ) -> Expression

Transform a WHERE clause with EXISTS into one using a pre-computed hash set.

Replaces: EXISTS (SELECT …) with: outer_col IN (hash_set_values)

pub fn try_optimize_exists_to_semi_join( &self, expr: &Expression, ctx: &ExecutionContext, outer_tables: &[String], outer_limit: Option<i64>, ) -> Result<Option<Expression>>

Try to optimize correlated EXISTS subqueries to semi-join. Returns Some(optimized_expression) if successful, None if not applicable.

Note: This function first checks whether index-nested-loop would be more efficient. If so, it skips the semi-join transformation so that per-row index probing can be used instead.

The outer_limit parameter helps decide between strategies:

With small LIMIT + index: prefer index-nested-loop (per-row probing with early termination)
Without LIMIT: prefer semi-join (scan inner once, hash lookup per outer row)
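The decision described above can be pictured as a small function. The cutoff value and the `inner_has_index` flag below are assumptions made for this sketch; the documentation does not state what the executor counts as a small LIMIT.

```rust
#[derive(Debug, PartialEq)]
enum ExistsStrategy {
    IndexNestedLoop,
    SemiJoin,
}

/// Assumed cutoff for a "small" LIMIT; the real threshold is not documented here.
const SMALL_LIMIT_THRESHOLD: i64 = 100;

fn choose_exists_strategy(outer_limit: Option<i64>, inner_has_index: bool) -> ExistsStrategy {
    match outer_limit {
        // Few outer rows needed and an index to probe: stop early, row by row.
        Some(limit) if inner_has_index && limit <= SMALL_LIMIT_THRESHOLD => {
            ExistsStrategy::IndexNestedLoop
        }
        // Otherwise scan the inner side once and do a hash lookup per outer row.
        _ => ExistsStrategy::SemiJoin,
    }
}
```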

pub fn try_optimize_in_to_semi_join( &self, expr: &Expression, ctx: &ExecutionContext, outer_tables: &[String], ) -> Result<Option<Expression>>

Try to optimize IN subqueries to semi-join (execute once, hash lookup per row).

This transforms:

WHERE outer.col IN (SELECT inner_col FROM t WHERE non_correlated_pred)

Into:

WHERE outer.col IN (hash_set_of_inner_col_values)

Optimization Criteria

IN right side must be a scalar subquery
Subquery must SELECT exactly one column
Subquery must have a simple table source (no joins)
Subquery WHERE clause must NOT reference outer tables (non-correlated)

Performance Impact

Before: O(N×M) - executes subquery for each outer row
After: O(N+M) - executes subquery once, O(1) hash lookup per row
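To make the cost difference concrete, the sketch below counts work units for both plans over integer columns. `filter_naive` and `filter_hashed` are hypothetical helpers written for this comparison; they return the surviving outer values together with the number of steps taken.

```rust
use std::collections::HashSet;

/// Naive plan: re-scan the subquery result for every outer row (N x M comparisons).
fn filter_naive(outer: &[i64], inner: &[i64]) -> (Vec<i64>, usize) {
    let mut comparisons = 0;
    let mut kept = Vec::new();
    for &o in outer {
        let mut hit = false;
        for &i in inner {
            comparisons += 1;
            if i == o {
                hit = true;
            }
        }
        if hit {
            kept.push(o);
        }
    }
    (kept, comparisons)
}

/// Semi-join plan: build the set once (M inserts), then one lookup per outer row (N).
fn filter_hashed(outer: &[i64], inner: &[i64]) -> (Vec<i64>, usize) {
    let set: HashSet<i64> = inner.iter().copied().collect();
    let mut steps = inner.len();
    let mut kept = Vec::new();
    for &o in outer {
        steps += 1;
        if set.contains(&o) {
            kept.push(o);
        }
    }
    (kept, steps)
}
```

With 4 outer rows and 3 inner rows the naive plan performs 12 comparisons while the hashed plan performs 7 steps, and the gap widens as both sides grow.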

pub fn collect_outer_table_names( table_expr: &Option<Box<Expression>>, ) -> Vec<String>

Get outer table names from a table expression (for semi-join optimization).

impl Executor

Lazy partition fetching for window functions with LIMIT pushdown. Fetches partitions one at a time from the index and stops when the LIMIT is reached. This is the key optimization for PARTITION BY + LIMIT queries.
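A rough sketch of the technique, assuming a `BTreeMap` stands in for the partition index and `ROW_NUMBER()` is the window function. `row_number_with_limit` is a hypothetical name; it returns the numbered rows plus the count of partitions it actually fetched.

```rust
use std::collections::BTreeMap;

/// Walks partitions in index order and stops once `limit` rows are produced.
fn row_number_with_limit(
    index: &BTreeMap<String, Vec<i64>>,
    limit: usize,
) -> (Vec<(String, i64, usize)>, usize) {
    let mut out: Vec<(String, i64, usize)> = Vec::new();
    let mut partitions_fetched = 0;
    for (partition, values) in index {
        if out.len() >= limit {
            break; // LIMIT satisfied: remaining partitions are never touched
        }
        partitions_fetched += 1;
        let remaining = limit - out.len();
        // ROW_NUMBER() restarts at 1 inside every partition.
        let numbered = values
            .iter()
            .enumerate()
            .map(|(i, v)| (partition.clone(), *v, i + 1));
        out.extend(numbered.take(remaining));
    }
    (out, partitions_fetched)
}
```

With three partitions and `limit = 3`, only the first two partitions are read in this sketch.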

impl Executor

pub fn new(engine: Arc<MVCCEngine>) -> Self

Create a new executor with the given storage engine

pub fn with_function_registry( engine: Arc<MVCCEngine>, function_registry: Arc<FunctionRegistry>, ) -> Self

Create a new executor with a custom function registry

pub fn with_cache_size(engine: Arc<MVCCEngine>, cache_size: usize) -> Self

Create a new executor with a custom cache size

pub fn has_active_transaction(&self) -> bool

Check if there is an active explicit transaction

pub fn set_default_isolation_level(&mut self, level: IsolationLevel)

Set the default isolation level for new transactions

pub fn engine(&self) -> &Arc<MVCCEngine>

Get the storage engine

pub fn function_registry(&self) -> &Arc<FunctionRegistry>

Get the function registry

pub fn execute(&self, sql: &str) -> Result<Box<dyn QueryResult>>

Execute a SQL query string

This is the main entry point for executing SQL statements. It parses the query and executes each statement in order. Uses the query cache to avoid re-parsing identical queries.

pub fn execute_with_params( &self, sql: &str, params: ParamVec, ) -> Result<Box<dyn QueryResult>>

Execute a SQL query with positional parameters

Parameters are substituted for the $1, $2, etc. placeholders in the query. Uses the query cache for efficient re-execution of parameterized queries. Note: Callers should try try_fast_path_with_params() before calling this.
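How a 1-based `$n` placeholder maps onto a parameter list can be sketched as follows. The `Value` enum here is a two-variant stand-in defined for the example, and `resolve_placeholder` is a hypothetical helper; neither reflects how the executor binds parameters internally.

```rust
/// Stand-in value type for this sketch only.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Integer(i64),
    Text(String),
}

/// Resolves a `$n` placeholder (1-based) against the parameter list.
/// Returns None for malformed placeholders, `$0`, or out-of-range indexes.
fn resolve_placeholder<'a>(placeholder: &str, params: &'a [Value]) -> Option<&'a Value> {
    let index: usize = placeholder.strip_prefix('$')?.parse().ok()?;
    params.get(index.checked_sub(1)?)
}
```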

pub fn try_fast_path_with_params( &self, sql: &str, params: &[Value], ) -> Option<Result<Box<dyn QueryResult>>>

Try fast path execution with a borrowed params slice. Returns None if the fast path doesn’t apply, Some(result) otherwise.
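The `Option<Result<..>>` return shape separates "not applicable" from "applied and failed". The toy below illustrates the calling pattern only: `try_fast_path`, `slow_path`, and `execute` are hypothetical functions over a made-up `SELECT a / b` fast path, not the executor's logic.

```rust
/// Toy fast path: only understands `SELECT <int> / <int>`.
/// None means "does not apply"; Some(Err(..)) means it applied and failed.
fn try_fast_path(sql: &str) -> Option<Result<i64, String>> {
    let expr = sql.strip_prefix("SELECT ")?;
    let (lhs, rhs) = expr.split_once('/')?;
    let lhs: i64 = lhs.trim().parse().ok()?;
    let rhs: i64 = rhs.trim().parse().ok()?;
    Some(
        lhs.checked_div(rhs)
            .ok_or_else(|| "division by zero or overflow".to_string()),
    )
}

/// Placeholder for the general pipeline, which this sketch does not model.
fn slow_path(sql: &str) -> Result<i64, String> {
    Err(format!("general pipeline not modelled in this sketch: {sql}"))
}

fn execute(sql: &str) -> Result<i64, String> {
    match try_fast_path(sql) {
        Some(result) => result, // fast path ran; its outcome is final
        None => slow_path(sql), // fall back to the full pipeline
    }
}
```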

pub fn execute_with_named_params( &self, sql: &str, params: FxHashMap<String, Value>, ) -> Result<Box<dyn QueryResult>>

Execute a SQL query with named parameters

Parameters are substituted for :name placeholders in the query. Uses the query cache for efficient re-execution of parameterized queries.

pub fn execute_with_context( &self, sql: &str, ctx: &ExecutionContext, ) -> Result<Box<dyn QueryResult>>

Execute a SQL query with a full execution context. Uses the query cache for efficient re-execution.

pub fn query_cache(&self) -> &QueryCache

Get the query cache

pub fn cache_stats(&self) -> CacheStats

Get query cache statistics

pub fn clear_cache(&self)

Clear the query cache

pub fn semantic_cache(&self) -> &SemanticCache

Get the semantic cache

pub fn semantic_cache_stats(&self) -> SemanticCacheStatsSnapshot

Get semantic cache statistics

pub fn clear_semantic_cache(&self)

Clear the semantic cache

pub fn invalidate_semantic_cache(&self, table_name: &str)

Invalidate semantic cache for a specific table

Call this after INSERT, UPDATE, DELETE, or TRUNCATE on a table.
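Why invalidation is keyed by table can be seen in a small model. `TableKeyedCache` below is a hypothetical structure that stores cached results under the table they were read from, so a write to one table drops only that table's entries. It says nothing about how `SemanticCache` is organized internally.

```rust
use std::collections::HashMap;

/// Hypothetical cache layout: table name -> (query text -> cached rows).
#[derive(Default)]
struct TableKeyedCache {
    by_table: HashMap<String, HashMap<String, Vec<i64>>>,
}

impl TableKeyedCache {
    fn insert(&mut self, table: &str, query: &str, rows: Vec<i64>) {
        self.by_table
            .entry(table.to_string())
            .or_default()
            .insert(query.to_string(), rows);
    }

    fn get(&self, table: &str, query: &str) -> Option<&Vec<i64>> {
        self.by_table.get(table)?.get(query)
    }

    /// Drops every cached result that was read from `table`.
    fn invalidate(&mut self, table: &str) {
        self.by_table.remove(table);
    }
}
```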

pub fn execute_program(&self, program: &Program) -> Result<Box<dyn QueryResult>>

Execute a parsed program

pub fn execute_program_with_context( &self, program: &Program, ctx: &ExecutionContext, ) -> Result<Box<dyn QueryResult>>

Execute a parsed program with context

pub fn execute_statement( &self, statement: &Statement, ctx: &ExecutionContext, ) -> Result<Box<dyn QueryResult>>

Execute a single statement

pub fn install_transaction(&self, tx: Box<dyn Transaction>)

Install an external storage transaction as the active transaction.

Used by the programmatic Transaction API to delegate SELECT queries to the full executor pipeline (aggregates, JOINs, window functions, etc.) while keeping the transaction’s uncommitted changes visible.

pub fn take_transaction(&self) -> Option<Box<dyn Transaction>>

Take back the storage transaction from the active transaction slot.

Returns the transaction so the caller can continue using it for further DML operations after the SELECT delegation completes.
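The install/take pair behaves like lending a value through a shared slot. The sketch below models that with a `Mutex<Option<..>>`, which is an assumption about the mechanism chosen because both methods take `&self`. The `Transaction` trait, `DemoTx`, and `TxSlot` are minimal stand-ins defined for the example.

```rust
use std::sync::Mutex;

/// Minimal stand-in for a storage transaction.
trait Transaction {
    fn id(&self) -> u64;
}

struct DemoTx(u64);

impl Transaction for DemoTx {
    fn id(&self) -> u64 {
        self.0
    }
}

/// Hypothetical "active transaction" slot with interior mutability.
#[derive(Default)]
struct TxSlot {
    active: Mutex<Option<Box<dyn Transaction>>>,
}

impl TxSlot {
    /// Lends the transaction to the slot so queries can see its changes.
    fn install(&self, tx: Box<dyn Transaction>) {
        *self.active.lock().unwrap() = Some(tx);
    }

    /// Hands the transaction back to the caller, leaving the slot empty.
    fn take(&self) -> Option<Box<dyn Transaction>> {
        self.active.lock().unwrap().take()
    }

    fn has_active(&self) -> bool {
        self.active.lock().unwrap().is_some()
    }
}
```

A caller would install the transaction, run the SELECT, then take it back before issuing further DML.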

pub fn begin_transaction(&self) -> Result<Box<dyn Transaction>>

Begin a new transaction

pub fn begin_transaction_with_isolation( &self, isolation: IsolationLevel, ) -> Result<Box<dyn Transaction>>

Begin a new transaction with a specific isolation level

pub fn get_or_create_plan(&self, sql: &str) -> Result<CachedPlanRef>

Get or create a cached plan for a SQL statement.

Parses the SQL and caches the plan if not already cached. Returns a lightweight CachedPlanRef that can be stored and reused for repeated execution without re-parsing or cache lookup overhead.

pub fn execute_with_cached_plan( &self, plan: &CachedPlanRef, ctx: &ExecutionContext, ) -> Result<Box<dyn QueryResult>>

Execute a pre-cached plan directly, skipping cache lookup.

This is the fast path for prepared statements: the caller holds a CachedPlanRef obtained from get_or_create_plan() and passes it here on every execution, avoiding normalize + hash + RwLock read per call.
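The saving from holding a plan handle can be modelled as below. This is a sketch under assumptions: whitespace collapsing stands in for normalization, `Plan`, `PlanRef`, and `PlanCache` are hypothetical types, and `execute_plan` is a placeholder for real execution. Only `get_or_create` pays for normalize + lock + lookup; executing through the handle does not touch the cache.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};

struct Plan {
    normalized_sql: String,
}

/// Cheap, clonable handle to a cached plan (hypothetical).
type PlanRef = Arc<Plan>;

#[derive(Default)]
struct PlanCache {
    plans: RwLock<HashMap<String, PlanRef>>,
    lookups: AtomicUsize,
}

impl PlanCache {
    fn get_or_create(&self, sql: &str) -> PlanRef {
        self.lookups.fetch_add(1, Ordering::Relaxed);
        // Crude normalization for the sketch: collapse runs of whitespace.
        let key = sql.split_whitespace().collect::<Vec<_>>().join(" ");
        if let Some(plan) = self.plans.read().unwrap().get(&key) {
            return Arc::clone(plan);
        }
        let mut plans = self.plans.write().unwrap();
        let plan = plans
            .entry(key.clone())
            .or_insert_with(|| Arc::new(Plan { normalized_sql: key }));
        Arc::clone(&*plan)
    }
}

/// Placeholder for execution: needs only the plan, never the cache.
fn execute_plan(plan: &Plan) -> usize {
    plan.normalized_sql.split(' ').count()
}
```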

Blanket Implementations

impl<T> Any for T
where T: 'static + ?Sized,

fn type_id(&self) -> TypeId

Gets the TypeId of self.

impl<T> Borrow<T> for T
where T: ?Sized,

fn borrow(&self) -> &T

Immutably borrows from an owned value.

impl<T> BorrowMut<T> for T
where T: ?Sized,

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.

impl<T> CompactArcDrop for T

unsafe fn drop_and_dealloc(ptr: *mut u8)

Drop the contained data and deallocate the header+data allocation.

impl<T> From<T> for T

fn from(t: T) -> T

Returns the argument unchanged.

impl<T, U> Into<U> for T
where U: From<T>,

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

impl<T> IntoEither for T

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.

impl<T> Pointable for T

const ALIGN: usize

The alignment of pointer.

type Init = T

The type for initializers.

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.

impl<T> Same for T

type Output = T

Should always be Self

impl<T, U> TryFrom<U> for T
where U: Into<T>,

type Error = Infallible

The type returned in the event of a conversion error.

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

impl Executor

pub fn execute_select_with_window_functions_lazy_partition( &self, stmt: &SelectStatement, ctx: &ExecutionContext, table: &dyn Table, base_columns: &[String], partition_col: &str, limit: usize, ) -> Result<Box<dyn QueryResult>>

Auto Trait Implementations

impl !Freeze for Executor

impl !RefUnwindSafe for Executor

impl Send for Executor

impl Sync for Executor

impl Unpin for Executor

impl UnsafeUnpin for Executor

impl !UnwindSafe for Executor
