Module perturbation

Perturbation harness.

Real SQL workloads lack ground-truth labels for “what went wrong when.” The honest replacement (per Strategy A in the panel discussion) is controlled perturbation injection: take a clean trace, deterministically inject a known fault inside a known time window, run DSFB-Database, and check whether the emitted episodes overlap that window.
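The overlap check described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: `Interval` and `window_detected` are hypothetical names, and closed intervals in arbitrary trace time units are an assumption.

```rust
/// Hypothetical closed interval [start, end] in trace time units.
/// Stands in for both a ground-truth window and an emitted episode.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Interval {
    pub start: f64,
    pub end: f64,
}

impl Interval {
    /// Two closed intervals overlap when each starts no later than the other ends.
    pub fn overlaps(&self, other: &Interval) -> bool {
        self.start <= other.end && other.start <= self.end
    }
}

/// Returns true if at least one emitted episode overlaps the ground-truth window.
pub fn window_detected(window: &Interval, episodes: &[Interval]) -> bool {
    episodes.iter().any(|ep| ep.overlaps(window))
}
```

With closed intervals, an episode that merely touches the window boundary counts as a hit; a stricter harness might require a minimum overlap duration instead.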

Each perturbation:

has a name and a class
is restricted to a [t_start, t_end] window (the ground-truth window)
is deterministic given a seed
is documented in spec/perturbations.yaml
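A minimal sketch of what "windowed and deterministic given a seed" can look like in practice. Everything here is an assumption for illustration: the `Sample` type, the `inject` function, the additive fault model, and the use of SplitMix64 (chosen only to keep the example dependency-free) are not taken from the crate.

```rust
/// SplitMix64: a small, well-known PRNG, used here only to avoid external crates.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next_u64(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        z ^ (z >> 31)
    }

    /// Uniform f64 in the unit interval, built from the top 53 bits.
    fn next_f64(&mut self) -> f64 {
        (self.next_u64() >> 11) as f64 / (1u64 << 53) as f64
    }
}

/// Hypothetical trace sample: a timestamp and a latency-like value.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Sample {
    pub t: f64,
    pub value: f64,
}

/// Adds a jittered offset to every sample with t in [t_start, t_end].
/// Samples outside the window are left untouched, so the window is
/// the only place where the perturbed trace differs from the clean one.
pub fn inject(trace: &mut [Sample], t_start: f64, t_end: f64, magnitude: f64, seed: u64) {
    let mut rng = SplitMix64(seed);
    for s in trace.iter_mut() {
        if s.t >= t_start && s.t <= t_end {
            // Offset between magnitude and 1.5 * magnitude, drawn from the seeded RNG.
            s.value += magnitude * (1.0 + 0.5 * rng.next_f64());
        }
    }
}
```

Because the RNG is constructed from the seed inside `inject` and advanced only for in-window samples, two calls with the same arguments on the same clean trace yield identical output.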

The five perturbations map one-to-one onto the five motif classes, so the evaluation maps cleanly from motif → injection → window → F1.
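One way the final window → F1 step could be scored is sketched below. The matching convention is an assumption, not the crate's documented definition: precision is taken as the share of emitted episodes that overlap some ground-truth window, and recall as the share of windows overlapped by some episode. `Span` and `interval_f1` are hypothetical names.

```rust
/// Closed interval (start, end); a stand-in for ground-truth windows and episodes.
type Span = (f64, f64);

fn overlaps(a: Span, b: Span) -> bool {
    a.0 <= b.1 && b.0 <= a.1
}

/// Interval-level F1 for one motif under the assumed convention:
/// precision = episodes overlapping any window / all episodes,
/// recall    = windows overlapped by any episode / all windows.
pub fn interval_f1(windows: &[Span], episodes: &[Span]) -> f64 {
    if windows.is_empty() || episodes.is_empty() {
        return 0.0;
    }
    let hit_episodes = episodes
        .iter()
        .filter(|&&e| windows.iter().any(|&w| overlaps(e, w)))
        .count();
    let hit_windows = windows
        .iter()
        .filter(|&&w| episodes.iter().any(|&e| overlaps(e, w)))
        .count();
    let precision = hit_episodes as f64 / episodes.len() as f64;
    let recall = hit_windows as f64 / windows.len() as f64;
    if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    }
}
```

For example, one window at (100, 200) with episodes at (150, 180) and (300, 320) gives precision 0.5 and recall 1.0, so F1 is 2/3.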

Structs§

PerturbationWindow

Enums§

PerturbationClass

Functions§

tpcds_with_perturbations: Build a TPC-DS-shaped trace with all five perturbations injected at disjoint, documented windows. The returned (stream, ground-truth windows) pair is the empirical evidence for §8 of the paper.
tpcds_with_perturbations_scaled: Same harness, but with each perturbation’s magnitude multiplied by scale. scale = 1.0 reproduces the canonical pinned-fingerprint stream exactly (same RNG draw sequence, same byte output). Lower scales produce subthreshold perturbations, where the residual is still present but barely above noise. The stress sweep (stress-sweep subcommand) reports per-motif F1 across a range of scales, showing where each motif breaks down rather than only that it works at the published baseline.
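The scale = 1.0 guarantee can be illustrated with a small sketch. It assumes a design in which the random draws do not depend on scale and the scale only multiplies the magnitude; since multiplying an f64 by 1.0 is exact, the scaled path then reproduces the unscaled one bit for bit. All names below (`inject`, `inject_scaled`, `peak_residual_by_scale`) and the index-based window are hypothetical and do not reflect the signatures of the functions listed above.

```rust
/// Hypothetical unscaled injection: adds `magnitude * draw` to each in-window value.
/// `window` is an inclusive index range and must lie within `values`;
/// `draws` holds pre-generated RNG outputs, one per in-window sample.
pub fn inject(values: &mut [f64], window: (usize, usize), draws: &[f64], magnitude: f64) {
    for (v, d) in values[window.0..=window.1].iter_mut().zip(draws) {
        *v += magnitude * d;
    }
}

/// Scaled variant: the draws are untouched, only the magnitude changes.
pub fn inject_scaled(
    values: &mut [f64],
    window: (usize, usize),
    draws: &[f64],
    magnitude: f64,
    scale: f64,
) {
    inject(values, window, draws, magnitude * scale);
}

/// Sweeps a list of scales and reports (scale, peak absolute residual) pairs,
/// where the residual is the perturbed trace minus the clean trace.
pub fn peak_residual_by_scale(
    clean: &[f64],
    window: (usize, usize),
    draws: &[f64],
    magnitude: f64,
    scales: &[f64],
) -> Vec<(f64, f64)> {
    scales
        .iter()
        .map(|&scale| {
            let mut perturbed = clean.to_vec();
            inject_scaled(&mut perturbed, window, draws, magnitude, scale);
            let peak = perturbed
                .iter()
                .zip(clean)
                .map(|(p, c)| (p - c).abs())
                .fold(0.0_f64, f64::max);
            (scale, peak)
        })
        .collect()
}
```

A real sweep would run the detector at each scale and record per-motif F1; the peak residual here is only a stand-in that shows the perturbation shrinking toward the noise floor as scale decreases.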