SQLShare adapter (Jain et al., SIGMOD 2016).
Real subset CSV columns (as released by UW eScience):
query_id,user_id,runtime_seconds,submitted_at,query_text
What we extract:
PlanRegression: runtime − rolling_baseline(runtime) per (user_id, normalised_query_skeleton). Normalisation = strip literals, collapse whitespace, lowercase. This stands in for a query digest because SQLShare predates digest IDs.
WorkloadPhase: JS divergence over the per-user query-skeleton histogram in 1-day buckets.
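The two extracted signals can be sketched roughly as follows. This is an illustration under stated assumptions, not the adapter's actual code: the function names `normalise_skeleton`, `regression_deltas`, and `js_divergence` are hypothetical, literals are replaced with a `?` placeholder rather than removed outright, the rolling baseline is assumed to be the mean of the previous `window` runtimes for the same key, and the divergence uses base-2 logarithms so it falls in [0, 1].

```rust
use std::collections::HashMap;

/// Hypothetical normaliser: replaces string and numeric literals with `?`,
/// collapses runs of whitespace to one space, and lowercases the rest.
fn normalise_skeleton(sql: &str) -> String {
    let mut out = String::new();
    let mut chars = sql.chars().peekable();
    let mut prev_ident = false; // was the previous char part of an identifier?
    while let Some(c) = chars.next() {
        if c == '\'' {
            // Consume a single-quoted string literal; '' is an escaped quote.
            while let Some(d) = chars.next() {
                if d == '\'' {
                    if chars.peek() == Some(&'\'') {
                        chars.next();
                    } else {
                        break;
                    }
                }
            }
            out.push('?');
            prev_ident = false;
        } else if c.is_ascii_digit() && !prev_ident {
            // Consume a numeric literal (digits and decimal points).
            while let Some(&d) = chars.peek() {
                if d.is_ascii_digit() || d == '.' {
                    chars.next();
                } else {
                    break;
                }
            }
            out.push('?');
            prev_ident = false;
        } else if c.is_whitespace() {
            if !out.is_empty() && !out.ends_with(' ') {
                out.push(' ');
            }
            prev_ident = false;
        } else {
            out.extend(c.to_lowercase());
            prev_ident = c.is_alphanumeric() || c == '_';
        }
    }
    out.trim_end().to_string()
}

/// Assumed baseline: mean of up to `window` earlier runtimes for one
/// (user_id, skeleton) key. Entries with no history yield `None`.
fn regression_deltas(runtimes: &[f64], window: usize) -> Vec<Option<f64>> {
    runtimes
        .iter()
        .enumerate()
        .map(|(i, &r)| {
            let prev = &runtimes[i.saturating_sub(window)..i];
            if prev.is_empty() {
                None
            } else {
                Some(r - prev.iter().sum::<f64>() / prev.len() as f64)
            }
        })
        .collect()
}

/// Jensen-Shannon divergence between two skeleton-count histograms
/// (for example, one user's consecutive 1-day buckets).
fn js_divergence(p: &HashMap<String, u64>, q: &HashMap<String, u64>) -> f64 {
    let total_p: u64 = p.values().sum();
    let total_q: u64 = q.values().sum();
    if total_p == 0 || total_q == 0 {
        return 0.0; // assumption: an empty bucket carries no phase signal
    }
    let mut keys: Vec<&String> = p.keys().chain(q.keys()).collect();
    keys.sort();
    keys.dedup();
    let mut js = 0.0;
    for k in keys {
        let pi = *p.get(k).unwrap_or(&0) as f64 / total_p as f64;
        let qi = *q.get(k).unwrap_or(&0) as f64 / total_q as f64;
        let mi = 0.5 * (pi + qi); // mixture distribution
        if pi > 0.0 {
            js += 0.5 * pi * (pi / mi).log2();
        }
        if qi > 0.0 {
            js += 0.5 * qi * (qi / mi).log2();
        }
    }
    js
}
```

With these definitions, two queries that differ only in their literals map to the same skeleton, identical histograms give a divergence of 0, and histograms with no skeleton in common give 1.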
What we cannot extract:
Cardinality, Contention, CacheIo: none of these are in the released metadata. The paper’s table on “what each dataset supplies” marks these as N/A for SQLShare.