Perturbation harness.
Real SQL workloads lack ground-truth labels for “what went wrong when.” The honest replacement (per Strategy A in the panel discussion) is controlled perturbation injection: take a clean trace, deterministically inject a known fault inside a known time window, run DSFB-Database, and check whether the emitted episodes overlap that window.
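The final step, checking whether emitted episodes overlap the injected window, can be sketched as below. This is a minimal illustration, not the crate's implementation: the `Window` type, the `detected` helper, and the use of `f64` timestamps are all assumptions made for the example.

```rust
/// Closed time window [start, end]. Hypothetical type: the crate's real
/// window and episode types are not shown in this documentation.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Window {
    pub start: f64,
    pub end: f64,
}

impl Window {
    /// True when the two closed windows share at least one instant.
    pub fn overlaps(&self, other: &Window) -> bool {
        self.start <= other.end && other.start <= self.end
    }
}

/// True if at least one emitted episode overlaps the ground-truth window.
/// Assumption: any overlap counts as a detection.
pub fn detected(ground_truth: &Window, episodes: &[Window]) -> bool {
    episodes.iter().any(|e| e.overlaps(ground_truth))
}
```

Under this rule an episode that merely touches the window boundary counts as a hit; a stricter evaluation could require a minimum overlap fraction instead.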
Each perturbation:
- has a name + class
- is restricted to a `[t_start, t_end]` window (the ground-truth window)
- is deterministic given a seed
- is documented in `spec/perturbations.yaml`
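The properties listed above can be sketched as a small data type. Everything here is hypothetical and for illustration only: the `Perturbation` struct, its field names, the additive fault model, and the hand-rolled SplitMix64 generator are assumptions, not the crate's actual types or RNG.

```rust
/// SplitMix64, used only to keep this sketch free of external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }

    /// Uniform draw in [0, 1) built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical perturbation record: a name, a class, a ground-truth
/// window, and a seed.
pub struct Perturbation {
    pub name: &'static str,
    pub class: &'static str,
    pub t_start: f64,
    pub t_end: f64,
    pub seed: u64,
    pub magnitude: f64,
}

impl Perturbation {
    /// Adds a seeded offset to every `(timestamp, value)` sample whose
    /// timestamp lies in [t_start, t_end]; samples outside are untouched.
    pub fn apply(&self, samples: &mut [(f64, f64)]) {
        let mut rng = SplitMix64(self.seed);
        for (t, value) in samples.iter_mut() {
            if *t >= self.t_start && *t <= self.t_end {
                // Offset lies in [0.5 * magnitude, 1.5 * magnitude).
                *value += self.magnitude * (0.5 + rng.next_f64());
            }
        }
    }
}
```

Because the generator is re-created from `seed` on every call, applying the same perturbation to two copies of the same clean trace yields identical output, which is the determinism property the list asks for.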
The five perturbations cover the five motif classes one-to-one so the evaluation cleanly maps motif → injection → window → F1.
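For reference, the last step of that mapping can be written out as a per-motif F1 computed from confusion counts. The `MotifCounts` type is hypothetical, and how the harness actually counts true positives, false positives, and false negatives against a window is not specified here; only the F1 formula itself is standard.

```rust
/// Hypothetical per-motif tally of detection outcomes.
#[derive(Debug, Clone, Copy, Default)]
pub struct MotifCounts {
    pub true_pos: u32,
    pub false_pos: u32,
    pub false_neg: u32,
}

impl MotifCounts {
    /// F1 = 2*TP / (2*TP + FP + FN), defined as 0.0 when there is
    /// nothing to score.
    pub fn f1(&self) -> f64 {
        let denom = 2 * self.true_pos + self.false_pos + self.false_neg;
        if denom == 0 {
            0.0
        } else {
            2.0 * self.true_pos as f64 / denom as f64
        }
    }
}
```

With one tally per motif class, the evaluation yields five F1 values, one for each injected perturbation.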
Structs
Enums
Functions
- `tpcds_with_perturbations` - Build a TPC-DS-shaped trace with all five perturbations injected at disjoint, documented windows. The returned (stream, ground-truth windows) pair is the empirical evidence for §8 of the paper.
- `tpcds_with_perturbations_scaled` - Same harness, but with each perturbation’s magnitude multiplied by `scale`. `scale = 1.0` reproduces the canonical pinned-fingerprint stream exactly (same RNG draw sequence, same byte output). Lower scales produce subthreshold perturbations, where the residual is still present but barely above noise. The stress sweep (`stress-sweep` subcommand) reports per-motif F1 across a range of scales so we can see where each motif breaks down, not just that it works at the published baseline.
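
One way to get the `scale = 1.0` reproducibility property is to draw from the RNG first and apply `scale` afterwards, so the draw sequence never depends on the scale. The sketch below illustrates that ordering only; `inject_scaled`, the additive fault model, and the small LCG are assumptions for the example, not the crate's code.

```rust
/// Tiny 64-bit LCG (Knuth's MMIX constants), used only to keep the
/// sketch dependency-free.
struct Lcg(u64);

impl Lcg {
    /// Uniform draw in [0, 1) from the top 53 bits of the state.
    fn next_f64(&mut self) -> f64 {
        self.0 = self
            .0
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        (self.0 >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical helper: adds `magnitude * draw * scale` to each sample.
/// Exactly one RNG draw is consumed per sample regardless of `scale`,
/// and multiplying by 1.0 is exact in IEEE 754, so `scale = 1.0`
/// reproduces the unscaled output bit for bit.
pub fn inject_scaled(base: &[f64], seed: u64, magnitude: f64, scale: f64) -> Vec<f64> {
    let mut rng = Lcg(seed);
    base.iter()
        .map(|v| {
            let draw = rng.next_f64();
            v + (magnitude * draw) * scale
        })
        .collect()
}
```

A sweep would then call the helper with the same seed at several scales and score each resulting stream separately, so that differences in F1 come from the magnitude alone and not from a different noise realisation.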