Expand description
Cost-Based Query Optimizer with Cardinality Estimation (Task 6)
Provides cost-based query optimization for SOCH-QL with:
- Cardinality estimation using sketches (HyperLogLog, CountMin)
- Index selection: compare cost(table_scan) vs cost(index_seek)
- Column projection pushdown to LSCS layer
- Token-budget-aware planning
§Cost Model
cost(plan) = I/O_cost + CPU_cost + memory_cost
I/O_cost = blocks_read × C_seq + seeks × C_random Where: C_seq = 0.1 ms/block (sequential read) C_random = 5 ms/seek (random seek)
CPU_cost = rows_processed × C_filter + sorts × N × log(N) × C_compare
§Selectivity Estimation
Uses CountMinSketch for predicate selectivity and HyperLogLog for distinct counts.
§Token Budget Planning
Given max_tokens, estimates result size and injects LIMIT clause: max_rows = (max_tokens - header_tokens) / tokens_per_row
Structs§
- Aggregate
Expr - Aggregate expression
- Column
Stats - Column statistics
- Cost
Based Optimizer - Cost-based query optimizer
- Cost
Model Config - Cost model configuration with empirically-derived constants
- Histogram
- Histogram for range selectivity estimation
- Index
Stats - Index statistics
- Join
Order Optimizer - Join order optimizer using dynamic programming
- KeyRange
- Key range for index seeks
- Table
Stats - Table statistics for cost estimation
Enums§
- Aggregate
Function - Aggregate functions
- Index
Type - Index types
- Join
Type - Join type
- Physical
Plan - Physical query plan node
- Predicate
- Query predicate for cost estimation
- Sort
Direction - Sort direction