CEB adapter (Cardinality Estimation Benchmark, Negi et al.).
Real subset CSV columns (after scripts/fetch_ceb.sh exports the
pickle to CSV):
- query_id, subplan_id
- true_rows (ground truth from PostgreSQL EXPLAIN ANALYZE)
- est_rows (PostgreSQL optimiser estimate)
- optional: query_class, template_id
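
As an illustration of the column layout above, here is a minimal sketch of a row type and a line parser. The names `CebRow` and `parse_ceb_line` are hypothetical and not taken from the adapter. The sketch assumes the columns appear in the order listed, that fields contain no quoted commas, and that the identifiers can be kept as strings.

```rust
/// One CSV row of the CEB subset (hypothetical type, for illustration only).
#[derive(Debug, Clone, PartialEq)]
pub struct CebRow {
    pub query_id: String,
    pub subplan_id: String,
    /// Ground-truth row count from PostgreSQL EXPLAIN ANALYZE.
    pub true_rows: f64,
    /// PostgreSQL optimiser estimate.
    pub est_rows: f64,
    /// Optional columns; `None` when absent or empty.
    pub query_class: Option<String>,
    pub template_id: Option<String>,
}

/// Parses one comma-separated line. Assumes the column order
/// query_id, subplan_id, true_rows, est_rows[, query_class[, template_id]]
/// and no quoted fields. Returns `None` for short or non-numeric lines,
/// which also skips a header line.
pub fn parse_ceb_line(line: &str) -> Option<CebRow> {
    let fields: Vec<&str> = line.trim().split(',').map(str::trim).collect();
    if fields.len() < 4 {
        return None;
    }
    // Treat a missing or empty trailing column as absent.
    let optional = |i: usize| {
        fields
            .get(i)
            .filter(|s| !s.is_empty())
            .map(|s| s.to_string())
    };
    Some(CebRow {
        query_id: fields[0].to_string(),
        subplan_id: fields[1].to_string(),
        true_rows: fields[2].parse().ok()?,
        est_rows: fields[3].parse().ok()?,
        query_class: optional(4),
        template_id: optional(5),
    })
}
```

A real adapter would more likely use a CSV library that handles quoting; plain `split(',')` is used here only to keep the sketch dependency-free.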
What we extract:
- Cardinality: log10(true_rows / est_rows) per (query_id, subplan_id). This is the only public dataset in our stack with ground-truth cardinalities, which is why we treat its results as the cardinality-mismatch motif's primary empirical evidence.
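
The extracted quantity can be sketched as a small function. The name `log10_ratio` is hypothetical. Returning `None` for zero, negative, or non-finite row counts is an assumption made here to keep the logarithm defined; the adapter may treat those cases differently.

```rust
/// Signed log10 ratio of true to estimated rows for one subplan.
/// Positive values mean the optimiser underestimated, negative values
/// mean it overestimated, and 0.0 means the estimate was exact.
///
/// Assumption: rows with a non-positive or non-finite count are
/// rejected (`None`) rather than clamped.
pub fn log10_ratio(true_rows: f64, est_rows: f64) -> Option<f64> {
    let valid = |x: f64| x.is_finite() && x > 0.0;
    if valid(true_rows) && valid(est_rows) {
        Some((true_rows / est_rows).log10())
    } else {
        None
    }
}
```

For example, a subplan with 1000 true rows and an estimate of 10 yields a value of about 2, a hundredfold underestimate.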
What we cannot extract:
- Latency, plan changes over time, contention, cache I/O. CEB is a
batch benchmark, not a temporal trace; we synthesise time as
t = subplan_index_within_query + query_index * 1.0 so the trace is well-defined.
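
The synthetic timestamp can be written directly from the formula above. The function name `synthetic_time` and the use of `usize` indices are assumptions for illustration.

```rust
/// Synthetic timestamp for a subplan, following the formula
/// t = subplan_index_within_query + query_index * 1.0.
///
/// Assumption: both indices are zero-based positions in the CSV.
pub fn synthetic_time(query_index: usize, subplan_index_within_query: usize) -> f64 {
    subplan_index_within_query as f64 + query_index as f64 * 1.0
}
```

Taken literally, the formula assigns a time to every row but does not guarantee unique timestamps: subplan 1 of query 0 and subplan 0 of query 1 both map to t = 1.0.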