Published-baseline change-point detectors for the paper’s §7 bake-off.
The bake-off answers a reviewer’s mandatory question: on the same residual streams that dsfb-database consumes, how do standard change-point baselines score against the same ground-truth windows?
We implement three classics from the change-point literature:
- adwin: ADWIN (Bifet & Gavaldà, 2007, “Learning from time-changing data with adaptive windowing”). Online, streaming, no distributional assumption.
- bocpd: Bayesian Online Change-Point Detection (Adams & MacKay, 2007). Online, Gaussian Normal-Gamma conjugate predictive.
- pelt: Pruned Exact Linear Time (Killick, Fearnhead & Eckley, 2012). Offline, optimal under an L2 cost for changes in mean.
All three are faithful reference implementations rather than thin calls to external crates — we want every bake-off number to be reproducible from this file alone, with no hidden dependency drift.
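To make the "optimal under an L2 cost for changes in mean" wording concrete, here is a minimal sketch of the PELT recursion. It is not the code in the pelt module: the function name `pelt_mean_l2`, the penalty parameter `beta`, and the choice to return sample indices rather than timestamps are all assumptions made for illustration.

```rust
/// Illustrative PELT sketch (hypothetical name and signature, not the crate's API).
/// Returns the indices at which a new constant-mean segment starts.
/// `beta` is the per-change penalty and is assumed to be non-negative.
pub fn pelt_mean_l2(y: &[f64], beta: f64) -> Vec<usize> {
    let n = y.len();
    if n == 0 {
        return Vec::new();
    }
    // Prefix sums of y and y^2 give the L2 cost of any segment in O(1).
    let mut s1 = vec![0.0; n + 1];
    let mut s2 = vec![0.0; n + 1];
    for i in 0..n {
        s1[i + 1] = s1[i] + y[i];
        s2[i + 1] = s2[i] + y[i] * y[i];
    }
    // Sum of squared deviations from the segment mean over y[s..t], with s < t.
    let cost = |s: usize, t: usize| -> f64 {
        let len = (t - s) as f64;
        let sum = s1[t] - s1[s];
        (s2[t] - s2[s]) - sum * sum / len
    };
    // f[t] is the best penalised cost of segmenting y[0..t].
    let mut f = vec![0.0; n + 1];
    f[0] = -beta;
    // last[t] is the start index of the final segment in that best segmentation.
    let mut last = vec![0usize; n + 1];
    let mut candidates = vec![0usize];
    for t in 1..=n {
        let mut best = f64::INFINITY;
        let mut best_s = 0;
        for &s in &candidates {
            let v = f[s] + cost(s, t) + beta;
            if v < best {
                best = v;
                best_s = s;
            }
        }
        f[t] = best;
        last[t] = best_s;
        // Pruning step: drop candidates that can never be optimal again.
        candidates.retain(|&s| f[s] + cost(s, t) <= f[t]);
        candidates.push(t);
    }
    // Walk the back-pointers from the end to recover the change-points.
    let mut cps = Vec::new();
    let mut t = n;
    while last[t] > 0 {
        cps.push(last[t]);
        t = last[t];
    }
    cps.reverse();
    cps
}
```

On the series `[0, 0, 0, 0, 5, 5, 5, 5]` with `beta = 1.0`, this sketch reports a single change at index 4. A detector in the bake-off would additionally map that index back to the timestamp of the corresponding sample.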
Each detector returns a list of change-point timestamps. The
run_detector helper wraps those into crate::grammar::Episodes
on the matching motif so that crate::metrics::evaluate scores the
baseline with the exact same TP / FP / FN rules that grade the
dsfb-database motif grammar — apples-to-apples.
§Charitable conversion choice
Change-point detectors report points; motif episodes are intervals.
We convert a change-point at time t into an episode [t, t + dwell]
where dwell is the motif’s min_dwell_seconds. This is deliberately
generous to the baselines — any ground-truth window within
min_dwell of the reported change-point counts as a TP, which
maximises the baselines’ recall. An adversarial reviewer can
re-derive stricter numbers from the emitted CSVs by shrinking the
dwell; the raw change-point times are in the CSV alongside the
wrapped episodes.
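The point-to-interval conversion described above can be sketched as follows. The `Interval` type, the function names, and the closed-interval overlap rule are hypothetical stand-ins; the crate's own episode type and the matching rules in crate::metrics::evaluate may differ.

```rust
/// Hypothetical stand-in for an episode: a closed interval in seconds.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Interval {
    pub start: f64,
    pub end: f64,
}

/// Wrap each change-point time `t` into the interval `[t, t + dwell]`.
pub fn wrap_change_points(change_points: &[f64], dwell: f64) -> Vec<Interval> {
    change_points
        .iter()
        .map(|&t| Interval { start: t, end: t + dwell })
        .collect()
}

/// Closed-interval overlap test (an assumed matching rule for illustration).
pub fn overlaps(a: Interval, b: Interval) -> bool {
    a.start <= b.end && b.start <= a.end
}
```

With a change-point at t = 10 and a ground-truth window [12, 20], a dwell of 5 yields [10, 15], which overlaps the window; shrinking the dwell to 1 yields [10, 11], which does not. That is the sense in which a larger dwell favours the baselines' recall.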
Modules§
- adwin
- ADWIN (ADaptive WINdowing) change-point detector.
- bocpd
- BOCPD — Bayesian Online Change-Point Detection.
- pelt
- PELT — Pruned Exact Linear Time change-point detection.
Traits§
- ChangePointDetector
- A detector operates on an ordered univariate (t, value) series and reports change-point timestamps. Implementations must be deterministic for a given input: the whole bake-off is pinned to byte-identical CSVs, so any randomness kills the replay guarantee.
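A minimal sketch of what such a trait and a deterministic implementor could look like is given below. The method name `detect`, its signature, and the toy `MeanShiftThreshold` detector are assumptions for illustration only; the real trait definition is in the crate source.

```rust
/// Hypothetical shape of the trait: the method name and signature are assumed.
pub trait ChangePointDetector {
    /// `series` is an ordered list of `(t, value)` pairs; returns change-point times.
    fn detect(&self, series: &[(f64, f64)]) -> Vec<f64>;
}

/// Toy detector (not one of the three baselines): flags a sample whose value
/// deviates from the running mean of the current segment by more than `threshold`.
pub struct MeanShiftThreshold {
    pub threshold: f64,
}

impl ChangePointDetector for MeanShiftThreshold {
    fn detect(&self, series: &[(f64, f64)]) -> Vec<f64> {
        let mut out = Vec::new();
        let mut sum = 0.0;
        let mut n = 0usize;
        for &(t, v) in series {
            if n > 0 {
                let mean = sum / n as f64;
                if (v - mean).abs() > self.threshold {
                    out.push(t);
                    // Start a fresh segment at the flagged sample.
                    sum = 0.0;
                    n = 0;
                }
            }
            sum += v;
            n += 1;
        }
        out
    }
}
```

The toy detector uses no random state and no wall-clock input, so repeated calls on the same series return identical output, which is the property the replay guarantee depends on.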
Functions§
- run_detector
- Run a detector against one motif of a residual stream and emit Episodes that the standard metrics path can score.