DSFB-Debug: episode aggregation — Trace Event Collapse implementation.
§Trace Event Collapse — paper §7’s “primary developer-facing delta”
This module is the structural-aggregation core. It collapses per-(window, signal) anomaly events into a small number of typed structural episodes. The collapse is the operator-visible delta of DSFB-Debug versus flat alerting: 11 raw cell-level alerts on F-11 collapse into 3 typed episodes (RSCR 3.67×); 52 raw alerts on the AIOps Challenge KPI fixture collapse into 1 episode (RSCR 52×).
§Algorithm
aggregate_episodes is a deterministic run-length aggregator over
the per-(window, signal) PolicyState grid. An episode opens when
any signal transitions to Review or Escalate; an episode
closes when all signals return to Silent or Watch for
correlation_window consecutive windows. The closed episode
carries:
- episode_id: sequential id from 0
- start_window / end_window: inclusive range
- peak_grammar_state: max grammar state observed
- primary_reason_code: most-frequent reason code
- policy_state: peak policy state (Watch / Review / Escalate)
- contributing_signal_count: distinct signals in non-Admissible state during the episode
- structural_signature: (dominant_drift_direction, peak_slew_magnitude, duration_windows, signal_correlation)
- matched_motif: left as Unknown; populated by the bank’s match_episode_with_consensus after fusion.
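The open/close rule described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: `EpisodeSketch` and `aggregate_sketch` are hypothetical names, the grid is assumed to be indexed as `grid[window][signal]`, and the ordering `Silent < Watch < Review < Escalate` is an assumption. The sketch also assumes that `end_window` is the last alerting window (trailing quiet windows are excluded) and that an episode still open at the end of the grid is flushed.

```rust
/// Policy states named in the text; the ordering used here is an assumption.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum PolicyState {
    Silent,
    Watch,
    Review,
    Escalate,
}

/// Hypothetical, reduced episode record (a subset of the fields listed above).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct EpisodeSketch {
    pub episode_id: usize,
    pub start_window: usize,
    /// Inclusive; assumed to be the last window with a Review/Escalate signal.
    pub end_window: usize,
    pub policy_state: PolicyState,
}

/// Run-length aggregation over `grid[window][signal]`.
pub fn aggregate_sketch(
    grid: &[Vec<PolicyState>],
    correlation_window: usize,
) -> Vec<EpisodeSketch> {
    let mut episodes: Vec<EpisodeSketch> = Vec::new();
    // (start_window, last_alerting_window, peak_state) of the open episode.
    let mut open: Option<(usize, usize, PolicyState)> = None;
    let mut quiet_run = 0usize;

    for (w, row) in grid.iter().enumerate() {
        // Peak state across all signals; an empty row counts as Silent.
        let peak = row.iter().copied().max().unwrap_or(PolicyState::Silent);

        if peak >= PolicyState::Review {
            // Any Review/Escalate signal opens or extends an episode.
            quiet_run = 0;
            open = Some(match open {
                Some((start, _, prev_peak)) => (start, w, prev_peak.max(peak)),
                None => (w, w, peak),
            });
        } else if let Some((start, last, prev_peak)) = open {
            // All signals are Silent or Watch: count consecutive quiet windows.
            quiet_run += 1;
            if quiet_run >= correlation_window {
                let id = episodes.len();
                episodes.push(EpisodeSketch {
                    episode_id: id,
                    start_window: start,
                    end_window: last,
                    policy_state: prev_peak,
                });
                open = None;
                quiet_run = 0;
            }
        }
    }

    // Assumption: an episode still open at the end of the grid is flushed.
    if let Some((start, last, peak)) = open {
        let id = episodes.len();
        episodes.push(EpisodeSketch {
            episode_id: id,
            start_window: start,
            end_window: last,
            policy_state: peak,
        });
    }
    episodes
}
```

Because a single Watch-only window does not reach `correlation_window`, two bursts separated by a short quiet gap merge into one episode in this sketch, which is the behaviour that produces the collapse from many raw alerts to few episodes.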
§compute_metrics — RSCR + fault-recall + clean-window FP rate
Computes the paper §13 headline metrics from a closed episode list:
| Metric | Formula |
|---|---|
| RSCR | raw_alerts / max(1, dsfb_episode_count) |
| episode_precision | episodes_overlapping_labeled_fault / dsfb_episode_count |
| fault_recall | labeled_faults_captured_by_at_least_one_episode / total_labeled_faults |
| investigation_load_reduction_pct | (1 - dsfb_episode_count / raw_alerts) × 100 |
| clean_window_false_episode_rate | episodes_in_clean_window / clean_window_count |
All formulas are repeated inline here so a reader doesn’t need to cross-reference the paper.
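As a worked illustration of the table, the sketch below evaluates the five formulas from pre-counted inputs. `MetricCounts`, `MetricsSketch`, and `metrics_sketch` are hypothetical names, not the crate's API. Two points are assumptions: the load-reduction formula is read as episode count over raw alert count, and any ratio with a zero denominator is reported as 0.0 (only RSCR has an explicit `max(1, …)` guard in the table).

```rust
/// Hypothetical pre-counted inputs; field names follow the table's terms.
#[derive(Debug, Clone, Copy)]
pub struct MetricCounts {
    pub raw_alerts: usize,
    pub dsfb_episode_count: usize,
    pub episodes_overlapping_labeled_fault: usize,
    pub labeled_faults_captured: usize,
    pub total_labeled_faults: usize,
    pub episodes_in_clean_window: usize,
    pub clean_window_count: usize,
}

/// Hypothetical result record, one field per table row.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct MetricsSketch {
    pub rscr: f64,
    pub episode_precision: f64,
    pub fault_recall: f64,
    pub investigation_load_reduction_pct: f64,
    pub clean_window_false_episode_rate: f64,
}

/// Assumption: a ratio with a zero denominator is reported as 0.0.
fn ratio(num: usize, den: usize) -> f64 {
    if den == 0 {
        0.0
    } else {
        num as f64 / den as f64
    }
}

pub fn metrics_sketch(c: &MetricCounts) -> MetricsSketch {
    MetricsSketch {
        // raw_alerts / max(1, dsfb_episode_count)
        rscr: c.raw_alerts as f64 / c.dsfb_episode_count.max(1) as f64,
        episode_precision: ratio(c.episodes_overlapping_labeled_fault, c.dsfb_episode_count),
        fault_recall: ratio(c.labeled_faults_captured, c.total_labeled_faults),
        // (1 - episodes / raw alerts) × 100, with 0.0 when there are no raw alerts
        investigation_load_reduction_pct: if c.raw_alerts == 0 {
            0.0
        } else {
            (1.0 - ratio(c.dsfb_episode_count, c.raw_alerts)) * 100.0
        },
        clean_window_false_episode_rate: ratio(c.episodes_in_clean_window, c.clean_window_count),
    }
}
```

With the F-11 figures quoted earlier (11 raw alerts, 3 episodes), the RSCR row evaluates to 11 / 3 ≈ 3.67, and with the AIOps KPI figures (52 raw alerts, 1 episode) it evaluates to 52.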
§Determinism (Theorem 9)
The aggregator is a pure function of the policy-state grid. Iteration order is deterministic (row-major over windows then signals). Tie-breakers (most-frequent-reason-code resolves ties by lower enum index) preserve byte-identical output across replays.
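The tie-breaking rule can be illustrated with a small sketch. `ReasonCodeSketch` and its variants are placeholders (the real reason-code enum is not shown here), and `primary_reason_code_sketch` is a hypothetical helper. It counts codes in a `BTreeMap`, whose iteration order follows the enum's declaration order, and replaces the current best only on a strictly higher count, so ties resolve to the lower enum index regardless of input order.

```rust
use std::collections::BTreeMap;

/// Placeholder reason codes; derived `Ord` follows declaration order,
/// which stands in for the "enum index" mentioned in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum ReasonCodeSketch {
    CodeA,
    CodeB,
    CodeC,
}

/// Most-frequent code; ties go to the variant declared first.
pub fn primary_reason_code_sketch(codes: &[ReasonCodeSketch]) -> Option<ReasonCodeSketch> {
    let mut counts: BTreeMap<ReasonCodeSketch, usize> = BTreeMap::new();
    for &code in codes {
        *counts.entry(code).or_insert(0) += 1;
    }

    // BTreeMap yields keys in ascending order, so a strict `>` comparison
    // keeps the lowest-index code among equally frequent ones.
    let mut best: Option<(ReasonCodeSketch, usize)> = None;
    for (code, n) in counts {
        if best.map_or(true, |(_, best_n)| n > best_n) {
            best = Some((code, n));
        }
    }
    best.map(|(code, _)| code)
}
```

A `HashMap` would also count correctly, but its iteration order is not stable across runs, so an ordered map (or a fixed-size array indexed by enum discriminant) is the safer choice when replays must be byte-identical.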
Functions§
- aggregate_episodes - Aggregate per-window policy evaluations into episodes.
- compute_metrics - Compute benchmark metrics from episodes and fault labels. Paper §7.4