Skip to main content

Module cost_optimizer

Module cost_optimizer 

Source
Expand description

Cost-Based Query Optimizer with Cardinality Estimation (Task 6)

Provides cost-based query optimization for SOCH-QL with:

  • Cardinality estimation using sketches (HyperLogLog, CountMin)
  • Index selection: compare cost(table_scan) vs cost(index_seek)
  • Column projection pushdown to LSCS layer
  • Token-budget-aware planning

§Cost Model

cost(plan) = I/O_cost + CPU_cost + memory_cost

I/O_cost = blocks_read × C_seq + seeks × C_random Where: C_seq = 0.1 ms/block (sequential read) C_random = 5 ms/seek (random seek)

CPU_cost = rows_processed × C_filter + sorts × N × log(N) × C_compare

§Selectivity Estimation

Uses CountMinSketch for predicate selectivity and HyperLogLog for distinct counts.

§Token Budget Planning

Given max_tokens, estimates result size and injects LIMIT clause: max_rows = (max_tokens - header_tokens) / tokens_per_row

Structs§

AggregateExpr
Aggregate expression
ColumnStats
Column statistics
CostBasedOptimizer
Cost-based query optimizer
CostModelConfig
Cost model configuration with empirically-derived constants
Histogram
Histogram for range selectivity estimation
IndexStats
Index statistics
JoinOrderOptimizer
Join order optimizer using dynamic programming
KeyRange
Key range for index seeks
TableStats
Table statistics for cost estimation

Enums§

AggregateFunction
Aggregate functions
IndexType
Index types
JoinType
Join type
PhysicalPlan
Physical query plan node
Predicate
Query predicate for cost estimation
SortDirection
Sort direction