pub struct Executor { /* private fields */ }
SQL Query Executor
The executor is the main entry point for executing SQL statements. It coordinates between the parser, storage engine, and function registry.
Implementations§
impl Executor
pub fn try_storage_aggregation(
    &self,
    table: &dyn Table,
    stmt: &SelectStatement,
    all_columns: &[String],
    classification: &QueryClassification,
) -> Option<Box<dyn QueryResult>>
Try to use storage-level aggregation for GROUP BY queries.
This optimization bypasses row materialization by computing aggregates directly from arena storage using Arc::clone for group keys.
Returns None if the optimization cannot be applied.
Currently only applies to simple queries with:
- GROUP BY columns that match SELECT identifiers exactly (same order)
- Simple aggregates (COUNT, SUM, AVG, MIN, MAX) on column references
- No WHERE, HAVING, ROLLUP, CUBE, or GROUPING SETS
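The idea of aggregating without materializing rows can be sketched with the standard library alone. This is a minimal illustration, not the executor's implementation: the columnar layout, the GroupAgg struct, and aggregate_by_key are all hypothetical names, and a plain HashMap stands in for whatever the arena storage actually uses.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Running state for COUNT, SUM, MIN, and MAX over one group (AVG = sum / count).
#[derive(Debug, Clone, PartialEq)]
struct GroupAgg {
    count: u64,
    sum: i64,
    min: i64,
    max: i64,
}

/// Hypothetical columnar input: one shared group key and one value per row.
/// Aggregates are folded straight from the two columns; no row struct is built.
fn aggregate_by_key(keys: &[Arc<str>], values: &[i64]) -> HashMap<Arc<str>, GroupAgg> {
    let mut groups: HashMap<Arc<str>, GroupAgg> = HashMap::new();
    for (key, &v) in keys.iter().zip(values) {
        groups
            .entry(Arc::clone(key)) // bumps a refcount instead of copying the string
            .and_modify(|g| {
                g.count += 1;
                g.sum += v;
                g.min = g.min.min(v);
                g.max = g.max.max(v);
            })
            .or_insert(GroupAgg { count: 1, sum: v, min: v, max: v });
    }
    groups
}
```

Because the group key is an Arc, each row costs a reference-count increment rather than a string allocation, which is the property the description above attributes to Arc::clone.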
impl Executor
pub fn try_extract_semi_join_info(
    exists: &ExistsExpression,
    is_negated: bool,
    outer_tables: &[String],
) -> Option<SemiJoinInfo>
Try to extract semi-join information from a correlated EXISTS subquery.
For semi-join optimization, we need:
- A simple table source (no joins in subquery)
- A WHERE clause with an inner.col = outer.col equality
- Optional additional non-correlated predicates
Returns None if the subquery cannot be optimized as a semi-join.
pub fn should_use_index_nested_loop_for_anti_join(
    &self,
    _info: &SemiJoinInfo,
    outer_limit: Option<i64>,
) -> bool
Check if index-nested-loop should be preferred over anti-join for NOT EXISTS.
For NOT EXISTS, anti-join using HashJoinOperator is almost always more efficient than both index-nested-loop and InHashSet because:
- HashJoinOperator does bulk hash table build/probe (cache-efficient)
- No per-row expression evaluation overhead
- Even with LIMIT, the bulk operation is faster than per-row checking
The only case where we might prefer index-nested-loop is for VERY small LIMIT (e.g., LIMIT 10) with a highly selective index, but benchmarks show hash join is still faster in most cases.
pub fn execute_semi_join_optimization(
    &self,
    info: &SemiJoinInfo,
    ctx: &ExecutionContext,
) -> Result<CompactArc<ValueSet>>
Execute the semi-join optimization for an EXISTS subquery.
Instead of executing the subquery for each outer row, we:
- Execute the inner query once with non-correlated predicates
- Collect all distinct values of the inner correlation column
- Return an FxHashSet for fast O(1) lookups
Results are cached to avoid re-execution for the same query within a single top-level query execution.
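The three steps above can be sketched as follows. This is an illustration under stated assumptions: the Order row type, the min_amount predicate, and both function names are hypothetical, and std's HashSet stands in for the FxHashSet mentioned above.

```rust
use std::collections::HashSet;

/// Hypothetical inner-table row: a correlation key plus one filterable column.
struct Order {
    customer_id: i64,
    amount: i64,
}

/// Steps 1 and 2: scan the inner table once, apply the non-correlated
/// predicate, and collect the distinct correlation values.
fn build_semi_join_set(inner: &[Order], min_amount: i64) -> HashSet<i64> {
    inner
        .iter()
        .filter(|o| o.amount >= min_amount)
        .map(|o| o.customer_id)
        .collect()
}

/// Step 3: each outer row is decided by a single hash lookup.
fn semi_join(outer_ids: &[i64], set: &HashSet<i64>) -> Vec<i64> {
    outer_ids.iter().copied().filter(|id| set.contains(id)).collect()
}
```

The set is built once and can then be shared across every outer row, which is what makes caching it for the duration of a top-level query worthwhile.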
pub fn execute_anti_join(
    &self,
    info: &SemiJoinInfo,
    outer_rows: CompactArc<Vec<Row>>,
    outer_columns: &[String],
    _ctx: &ExecutionContext,
) -> Result<RowVec>
Execute NOT EXISTS as a true anti-join using HashJoinOperator.
This is more efficient than the InHashSet approach because:
- HashJoinOperator builds hash table once and probes in bulk
- No per-row expression evaluation overhead
- Better cache efficiency due to batch processing
- Direct table access without going through full query pipeline
§Arguments
- info - SemiJoinInfo extracted from the NOT EXISTS subquery
- outer_rows - Pre-materialized outer table rows
- outer_columns - Column names for outer table
- _ctx - Execution context (not used but kept for API consistency)
§Returns
Rows from outer table that have NO match in inner table (anti-join result)
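A hash anti-join reduces to a build phase and a probe phase. The sketch below shows that shape with plain tuples; it does not use HashJoinOperator, and the (key, payload) row layout and the anti_join name are assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Anti-join: keep the outer rows whose key has no match among the inner keys.
/// `outer` holds (key, payload) pairs; this layout is an assumption of the sketch.
fn anti_join<'a>(outer: &[(i64, &'a str)], inner_keys: &[i64]) -> Vec<(i64, &'a str)> {
    // Build phase: hash the inner keys once.
    let built: HashSet<i64> = inner_keys.iter().copied().collect();
    // Probe phase: one lookup per outer row, with no per-row subquery evaluation.
    outer
        .iter()
        .copied()
        .filter(|(k, _)| !built.contains(k))
        .collect()
}
```

Duplicate inner keys collapse in the build phase, so the probe cost depends only on the number of outer rows.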
pub fn try_extract_not_exists_info(
    expr: &Expression,
    outer_tables: &[String],
) -> Option<SemiJoinInfo>
Try to extract SemiJoinInfo from a NOT EXISTS expression. Returns None if the expression is not a valid NOT EXISTS pattern.
pub fn transform_exists_to_in_list(
    info: &SemiJoinInfo,
    hash_set: CompactArc<ValueSet>,
) -> Expression
Transform a WHERE clause with EXISTS into one using a pre-computed hash set.
Replaces: EXISTS (SELECT …) with: outer_col IN (hash_set_values)
pub fn try_optimize_exists_to_semi_join(
    &self,
    expr: &Expression,
    ctx: &ExecutionContext,
    outer_tables: &[String],
    outer_limit: Option<i64>,
) -> Result<Option<Expression>>
Try to optimize correlated EXISTS subqueries to semi-join. Returns Some(optimized_expression) if successful, None if not applicable.
Note: This function now checks if index-nested-loop would be more efficient and skips the semi-join transformation in that case, allowing per-row index probing.
The outer_limit parameter helps decide between strategies:
- With small LIMIT + index: prefer index-nested-loop (per-row probing with early termination)
- Without LIMIT: prefer semi-join (scan inner once, hash lookup per outer row)
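The strategy choice described above might look like the sketch below. Everything here is hypothetical: the ExistsStrategy enum, the inner_has_index flag, and in particular the SMALL_LIMIT cutoff, since the documentation does not state the threshold the executor actually uses.

```rust
#[derive(Debug, PartialEq)]
enum ExistsStrategy {
    /// Probe the inner index once per outer row and stop early at LIMIT.
    IndexNestedLoop,
    /// Scan the inner table once, then do a hash lookup per outer row.
    SemiJoin,
}

/// Hypothetical cutoff; the real threshold is not given in the documentation.
const SMALL_LIMIT: i64 = 100;

fn choose_exists_strategy(outer_limit: Option<i64>, inner_has_index: bool) -> ExistsStrategy {
    match outer_limit {
        Some(n) if inner_has_index && n <= SMALL_LIMIT => ExistsStrategy::IndexNestedLoop,
        _ => ExistsStrategy::SemiJoin,
    }
}
```

Without an index, per-row probing would degrade to a scan per outer row, so the sketch falls back to the semi-join in that case regardless of LIMIT.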
pub fn try_optimize_in_to_semi_join(
    &self,
    expr: &Expression,
    ctx: &ExecutionContext,
    outer_tables: &[String],
) -> Result<Option<Expression>>
Try to optimize IN subqueries to semi-join (execute once, hash lookup per row).
This transforms:
WHERE outer.col IN (SELECT inner_col FROM t WHERE non_correlated_pred)
Into:
WHERE outer.col IN (hash_set_of_inner_col_values)
§Optimization Criteria
- IN right side must be a scalar subquery
- Subquery must SELECT exactly one column
- Subquery must have a simple table source (no joins)
- Subquery WHERE clause must NOT reference outer tables (non-correlated)
§Performance Impact
- Before: O(N×M) - executes subquery for each outer row
- After: O(N+M) - executes subquery once, O(1) hash lookup per row
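The complexity difference can be seen by writing both evaluation orders side by side. This is a simplified sketch over integer columns: filter_naive and filter_hashed are hypothetical names, and the O(1) lookup is the expected average cost of std's HashSet.

```rust
use std::collections::HashSet;

/// Naive evaluation: rescan the subquery result for every outer row, O(N×M).
fn filter_naive(outer: &[i64], inner: &[i64]) -> Vec<i64> {
    outer
        .iter()
        .copied()
        .filter(|o| inner.iter().any(|i| i == o))
        .collect()
}

/// Rewritten evaluation: materialize the subquery once, then probe, O(N+M).
fn filter_hashed(outer: &[i64], inner: &[i64]) -> Vec<i64> {
    let set: HashSet<i64> = inner.iter().copied().collect();
    outer.iter().copied().filter(|o| set.contains(o)).collect()
}
```

Both functions return the same rows; the rewrite is only valid because the subquery does not reference the outer row, so its result is identical on every iteration.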
pub fn collect_outer_table_names(
    table_expr: &Option<Box<Expression>>,
) -> Vec<String>
Get outer table names from a table expression (for semi-join optimization).
impl Executor
pub fn execute_select_with_window_functions_lazy_partition(
    &self,
    stmt: &SelectStatement,
    ctx: &ExecutionContext,
    table: &dyn Table,
    base_columns: &[String],
    partition_col: &str,
    limit: usize,
) -> Result<Box<dyn QueryResult>>
Lazy partition fetching for window functions with LIMIT pushdown.
Fetches partitions one at a time from the index and stops when LIMIT is reached. This is the key optimization for PARTITION BY + LIMIT queries.
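A rough sketch of the early-termination idea, assuming the output follows the partition order of the index and the window function is ROW_NUMBER(). The BTreeMap standing in for the index and the row_number_with_limit name are assumptions, not part of the executor's API.

```rust
use std::collections::BTreeMap;

/// Emits (partition, value, row_number) with ROW_NUMBER() restarted per partition.
/// Returns the rows plus the number of partitions actually fetched.
fn row_number_with_limit(
    index: &BTreeMap<String, Vec<i64>>,
    limit: usize,
) -> (Vec<(String, i64, usize)>, usize) {
    let mut out = Vec::new();
    let mut fetched = 0;
    for (part, rows) in index {
        if out.len() >= limit {
            break; // LIMIT satisfied: later partitions are never read
        }
        fetched += 1;
        let mut sorted = rows.clone();
        sorted.sort(); // window ORDER BY within the partition
        for (i, v) in sorted.into_iter().enumerate() {
            out.push((part.clone(), v, i + 1));
        }
    }
    out.truncate(limit);
    (out, fetched)
}
```

A whole partition must be processed before its rows can be numbered, so the saving comes from skipping the partitions that follow the one in which the limit is reached.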
impl Executor
pub fn new(engine: Arc<MVCCEngine>) -> Self
Create a new executor with the given storage engine
pub fn with_function_registry(
    engine: Arc<MVCCEngine>,
    function_registry: Arc<FunctionRegistry>,
) -> Self
Create a new executor with a custom function registry
pub fn with_cache_size(engine: Arc<MVCCEngine>, cache_size: usize) -> Self
Create a new executor with a custom cache size
pub fn has_active_transaction(&self) -> bool
Check if there is an active explicit transaction
pub fn set_default_isolation_level(&mut self, level: IsolationLevel)
Set the default isolation level for new transactions
pub fn engine(&self) -> &Arc<MVCCEngine>
Get the storage engine
pub fn function_registry(&self) -> &Arc<FunctionRegistry>
Get the function registry
pub fn execute(&self, sql: &str) -> Result<Box<dyn QueryResult>>
Execute a SQL query string
This is the main entry point for executing SQL statements. It parses the query and executes each statement in order. Uses the query cache to avoid re-parsing identical queries.
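The effect of a parse cache keyed by SQL text can be sketched as below. This is not the QueryCache type: ParsedQuery, ParseCache, and get_or_parse are hypothetical, and whitespace tokenization stands in for real parsing.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Stand-in for a parsed statement list; here just whitespace-split tokens.
#[derive(Debug)]
struct ParsedQuery {
    tokens: Vec<String>,
}

#[derive(Default)]
struct ParseCache {
    plans: RwLock<HashMap<String, Arc<ParsedQuery>>>,
}

impl ParseCache {
    /// Returns the cached parse for `sql`, parsing only on a miss.
    fn get_or_parse(&self, sql: &str) -> Arc<ParsedQuery> {
        // Fast path: shared read lock, released at the end of this statement.
        let cached = self.plans.read().unwrap().get(sql).cloned();
        if let Some(hit) = cached {
            return hit;
        }
        let parsed = Arc::new(ParsedQuery {
            tokens: sql.split_whitespace().map(str::to_string).collect(),
        });
        let mut plans = self.plans.write().unwrap();
        // Another thread may have inserted meanwhile; keep whichever is stored.
        let stored = plans.entry(sql.to_string()).or_insert(parsed);
        Arc::clone(stored)
    }
}
```

Identical query strings share one parsed value, so a repeated statement pays only for a hash lookup under a read lock.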
pub fn execute_with_params(
    &self,
    sql: &str,
    params: ParamVec,
) -> Result<Box<dyn QueryResult>>
Execute a SQL query with positional parameters
Parameters are substituted for $1, $2, etc. placeholders in the query. Uses the query cache for efficient re-execution of parameterized queries. Note: Callers should try try_fast_path_with_params() first before calling this.
pub fn try_fast_path_with_params(
    &self,
    sql: &str,
    params: &[Value],
) -> Option<Result<Box<dyn QueryResult>>>
Try fast path execution with a borrowed params slice.
Returns None if the fast path doesn’t apply, Some(result) otherwise.
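The Option<Result<..>> return type separates "not applicable" from "applicable but failed". A caller-side sketch of that pattern follows; try_fast_path, execute_general, and run are hypothetical stand-ins with simplified integer parameters, not the executor's methods.

```rust
/// Hypothetical fast path: handles only `SELECT $1`, echoing the bound value.
/// `None` means "not applicable"; `Some(..)` carries the real outcome.
fn try_fast_path(sql: &str, params: &[i64]) -> Option<Result<i64, String>> {
    if sql.trim() != "SELECT $1" {
        return None;
    }
    Some(
        params
            .first()
            .copied()
            .ok_or_else(|| "missing parameter $1".to_string()),
    )
}

/// Stand-in for the general pipeline; here it only sums the parameters.
fn execute_general(_sql: &str, params: Vec<i64>) -> Result<i64, String> {
    Ok(params.iter().sum())
}

/// Caller-side pattern: lend the params to the fast path, and hand over
/// ownership to the general path only when the fast path declines.
fn run(sql: &str, params: Vec<i64>) -> Result<i64, String> {
    match try_fast_path(sql, &params) {
        Some(result) => result,
        None => execute_general(sql, params),
    }
}
```

An error from the fast path is returned as-is rather than retried on the general path, since Some(Err(..)) means the fast path did apply.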
pub fn execute_with_named_params(
    &self,
    sql: &str,
    params: FxHashMap<String, Value>,
) -> Result<Box<dyn QueryResult>>
Execute a SQL query with named parameters
Parameters are substituted for :name placeholders in the query. Uses the query cache for efficient re-execution of parameterized queries.
pub fn execute_with_context(
    &self,
    sql: &str,
    ctx: &ExecutionContext,
) -> Result<Box<dyn QueryResult>>
Execute a SQL query with a full execution context.
Uses the query cache for efficient re-execution.
pub fn query_cache(&self) -> &QueryCache
Get the query cache
pub fn cache_stats(&self) -> CacheStats
Get query cache statistics
pub fn clear_cache(&self)
Clear the query cache
pub fn semantic_cache(&self) -> &SemanticCache
Get the semantic cache
pub fn semantic_cache_stats(&self) -> SemanticCacheStatsSnapshot
Get semantic cache statistics
pub fn clear_semantic_cache(&self)
Clear the semantic cache
pub fn invalidate_semantic_cache(&self, table_name: &str)
Invalidate semantic cache for a specific table
Call this after INSERT, UPDATE, DELETE, or TRUNCATE on a table.
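Why invalidation is per table can be illustrated with a toy result cache. This is a deliberate simplification and not the SemanticCache type: TableResultCache and its methods are hypothetical, and results are keyed by table name and SQL text only.

```rust
use std::collections::HashMap;

/// Cached result rows, grouped by the table they were read from (a simplification).
#[derive(Default)]
struct TableResultCache {
    by_table: HashMap<String, HashMap<String, Vec<i64>>>,
}

impl TableResultCache {
    fn put(&mut self, table: &str, sql: &str, rows: Vec<i64>) {
        self.by_table
            .entry(table.to_string())
            .or_default()
            .insert(sql.to_string(), rows);
    }

    fn get(&self, table: &str, sql: &str) -> Option<&Vec<i64>> {
        self.by_table.get(table)?.get(sql)
    }

    /// Drops every cached result for `table`; other tables are untouched.
    fn invalidate(&mut self, table: &str) {
        self.by_table.remove(table);
    }
}
```

Any write to a table can change the answer of every cached query that read from it, so all of that table's entries are dropped while entries for unrelated tables stay valid.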
pub fn execute_program(&self, program: &Program) -> Result<Box<dyn QueryResult>>
Execute a parsed program
pub fn execute_program_with_context(
    &self,
    program: &Program,
    ctx: &ExecutionContext,
) -> Result<Box<dyn QueryResult>>
Execute a parsed program with context
pub fn execute_statement(
    &self,
    statement: &Statement,
    ctx: &ExecutionContext,
) -> Result<Box<dyn QueryResult>>
Execute a single statement
pub fn install_transaction(&self, tx: Box<dyn Transaction>)
Install an external storage transaction as the active transaction.
Used by the programmatic Transaction API to delegate SELECT queries to the full executor pipeline (aggregates, JOINs, window functions, etc.) while keeping the transaction’s uncommitted changes visible.
pub fn take_transaction(&self) -> Option<Box<dyn Transaction>>
Take back the storage transaction from the active transaction slot.
Returns the transaction so the caller can continue using it for further DML operations after the SELECT delegation completes.
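The install/take pair behaves like lending a value to a shared slot and reclaiming it afterwards. A minimal sketch of such a slot, assuming interior mutability through a Mutex; the Transaction trait shown here is a one-method stand-in, and DemoTx and TxSlot are hypothetical names.

```rust
use std::sync::Mutex;

/// One-method stand-in for a storage transaction trait.
trait Transaction {
    fn id(&self) -> u64;
}

struct DemoTx(u64);

impl Transaction for DemoTx {
    fn id(&self) -> u64 {
        self.0
    }
}

/// Holds at most one active transaction behind `&self`.
#[derive(Default)]
struct TxSlot {
    active: Mutex<Option<Box<dyn Transaction>>>,
}

impl TxSlot {
    /// Moves the caller's transaction into the slot.
    fn install(&self, tx: Box<dyn Transaction>) {
        *self.active.lock().unwrap() = Some(tx);
    }

    fn has_active(&self) -> bool {
        self.active.lock().unwrap().is_some()
    }

    /// Moves the transaction back out, leaving the slot empty.
    fn take(&self) -> Option<Box<dyn Transaction>> {
        self.active.lock().unwrap().take()
    }
}
```

A caller would install its transaction, run the SELECT through the full pipeline, and then take the transaction back before issuing further DML on it.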
pub fn begin_transaction(&self) -> Result<Box<dyn Transaction>>
Begin a new transaction
pub fn begin_transaction_with_isolation(
    &self,
    isolation: IsolationLevel,
) -> Result<Box<dyn Transaction>>
Begin a new transaction with a specific isolation level
pub fn get_or_create_plan(&self, sql: &str) -> Result<CachedPlanRef>
Get or create a cached plan for a SQL statement.
Parses the SQL and caches the plan if not already cached. Returns a lightweight CachedPlanRef that can be stored and reused for repeated execution without re-parsing or cache lookup overhead.
pub fn execute_with_cached_plan(
    &self,
    plan: &CachedPlanRef,
    ctx: &ExecutionContext,
) -> Result<Box<dyn QueryResult>>
Execute a pre-cached plan directly, skipping cache lookup.
This is the fast path for prepared statements: the caller holds a
CachedPlanRef obtained from get_or_create_plan() and passes it
here on every execution, avoiding normalize + hash + RwLock read
per call.
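The saving from holding a plan reference can be made visible by counting cache lookups. In this sketch Plan, PlanCache, and their methods are hypothetical, whitespace collapsing stands in for normalization, and execute_plan merely formats a string instead of running a query.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};

struct Plan {
    normalized_sql: String,
}

#[derive(Default)]
struct PlanCache {
    plans: RwLock<HashMap<String, Arc<Plan>>>,
    lookups: AtomicUsize, // counts how often the normalize + hash + lock path runs
}

impl PlanCache {
    /// Normalizes the SQL, then returns the shared plan, creating it on a miss.
    fn get_or_create(&self, sql: &str) -> Arc<Plan> {
        self.lookups.fetch_add(1, Ordering::Relaxed);
        let key = sql.split_whitespace().collect::<Vec<_>>().join(" ");
        let mut plans = self.plans.write().unwrap();
        let plan = plans
            .entry(key.clone())
            .or_insert_with(|| Arc::new(Plan { normalized_sql: key }));
        Arc::clone(plan)
    }

    /// Executes from a held reference: no normalization, hashing, or locking.
    fn execute_plan(&self, plan: &Plan, param: i64) -> String {
        format!("{} [{}]", plan.normalized_sql, param)
    }
}
```

A prepared statement would call get_or_create once and then pass the returned handle to every execution, so the lookup counter stays at one however many times the plan runs.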
Auto Trait Implementations§
impl !Freeze for Executor
impl !RefUnwindSafe for Executor
impl Send for Executor
impl Sync for Executor
impl Unpin for Executor
impl UnsafeUnpin for Executor
impl !UnwindSafe for Executor
Blanket Implementations§
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CompactArcDrop for T
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.