1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
//! Published-baseline change-point detectors for the paper's §7 bake-off.
//!
//! The bake-off answers a reviewer's mandatory question: *on the same
//! residual streams that dsfb-database consumes, how do standard
//! change-point baselines score against the same ground-truth windows?*
//!
//! We implement three classics from the change-point literature:
//!
//! * [`adwin`] — ADWIN (Bifet & Gavaldà, 2007, *"Learning from
//! time-changing data with adaptive windowing"*). Online, streaming,
//! no distributional assumption.
//! * [`bocpd`] — Bayesian Online Change-Point Detection (Adams & MacKay,
//! 2007). Online, Gaussian Normal-Gamma conjugate predictive.
//! * [`pelt`] — Pruned Exact Linear Time (Killick, Fearnhead & Eckley,
//! 2012). Offline, optimal under an L2 cost for changes in mean.
//!
//! All three are faithful reference implementations rather than thin
//! calls to external crates — we want every bake-off number to be
//! reproducible from this file alone, with no hidden dependency drift.
//!
//! Each detector returns a list of *change-point timestamps*. The
//! [`run_detector`] helper wraps those into [`crate::grammar::Episode`]s
//! on the matching motif so that [`crate::metrics::evaluate`] scores the
//! baseline with the exact same TP / FP / FN rules that grade the
//! dsfb-database motif grammar — apples-to-apples.
//!
//! ### Charitable conversion choice
//!
//! Change-point detectors report points; motif episodes are intervals.
//! We convert a change-point at time `t` into an episode `[t, t + dwell]`
//! where `dwell` is the motif's `min_dwell_seconds`. This is deliberately
//! generous to the baselines — any ground-truth window within
//! `min_dwell` of the reported change-point counts as a TP, which
//! maximises the baselines' recall. An adversarial reviewer can
//! re-derive stricter numbers from the emitted CSVs by shrinking the
//! dwell; the raw change-point times are in the CSV alongside the
//! wrapped episodes.
use crate;
use crate;
use BTreeMap;
/// A detector operates on an ordered univariate `(t, value)` series and
/// reports change-point timestamps. Implementations must be deterministic
/// for a given input — the whole bake-off is pinned to byte-identical
/// CSVs, so any randomness kills the replay guarantee.
/// Run a detector against one motif of a residual stream and emit
/// [`Episode`]s that the standard metrics path can score.
///
/// The series is built per-channel: for each distinct channel string
/// under the motif's residual class we invoke `detector.detect`
/// independently, then union the resulting change-points. This mirrors
/// the channel granularity the dsfb motif state machine gets for free
/// — without it the baseline would be unfairly starved of the channel
/// signal that the motif grammar consumes.