Module bocpd

Expand description

Bayesian Online Change-Point Detection (BOCPD) for Resize Regime Detection.

This module implements BOCPD to replace heuristic threshold-based regime detection in the resize coalescer. It provides a principled Bayesian approach to detecting transitions between Steady and Burst regimes.

§Mathematical Model

§Observation Model

We observe inter-arrival times x_t between consecutive resize events. The observation model conditions on the current regime:

Steady regime: x_t ~ Exponential(λ_steady)
               where λ_steady = 1/μ_steady, μ_steady ≈ 200ms (slow events)

Burst regime:  x_t ~ Exponential(λ_burst)
               where λ_burst = 1/μ_burst, μ_burst ≈ 20ms (rapid events)

The likelihood for observation x under each regime:

P(x | steady) = λ_steady × exp(-λ_steady × x)
P(x | burst)  = λ_burst × exp(-λ_burst × x)

§Run-Length Model

BOCPD maintains a distribution over run-lengths r_t, where r represents the number of observations since the last changepoint.

The run-length posterior is recursively updated:

P(r_t = 0 | x_1:t) ∝ Σᵣ P(r_{t-1} = r | x_1:t-1) × H(r) × P(x_t | r)
P(r_t = r+1 | x_1:t) ∝ P(r_{t-1} = r | x_1:t-1) × (1 - H(r)) × P(x_t | r)

where H(r) is the hazard function (probability of changepoint at run-length r).

§Hazard Function

We use a constant hazard model with geometric prior on run-length:

H(r) = 1/λ_hazard

where λ_hazard is the expected run-length between changepoints.
Default: λ_hazard = 50 (expect changepoint every ~50 observations)

This implies:

P(changepoint at r) = (1 - 1/λ)^r × (1/λ)
E[run-length] = λ_hazard

§Run-Length Truncation

To achieve O(K) complexity per update, we truncate the run-length distribution at maximum K:

K = 100 (default)

For r ≥ K: P(r_t = r | x_1:t) is merged into P(r_t = K | x_1:t)

This approximation is accurate when K >> λ_hazard, since most mass concentrates on recent run-lengths.

§Regime Posterior

We maintain a separate regime indicator:

Regime detection via likelihood ratio:

LR_t = P(x_1:t | burst) / P(x_1:t | steady)

P(burst | x_1:t) = LR_t × P(burst) / (LR_t × P(burst) + P(steady))

where P(burst) = 0.2 (prior probability of burst regime)

The regime posterior is integrated over all run-lengths:

P(burst | x_1:t) = Σᵣ P(burst | r, x_1:t) × P(r | x_1:t)

§Decision Rule

The coalescing delay is selected based on the burst posterior:

Let p_burst = P(burst | x_1:t)

If p_burst < 0.3:  delay = steady_delay_ms  (16ms, responsive)
If p_burst > 0.7:  delay = burst_delay_ms   (40ms, coalescing)
Otherwise:         delay = interpolate(16ms, 40ms, p_burst)

Always respect hard_deadline_ms (100ms) regardless of regime.

§Log-Bayes Factor for Explainability

We compute the log10 Bayes factor for each decision:

LBF = log10(P(x_t | burst) / P(x_t | steady))

Interpretation:
- LBF > 1:  Strong evidence for burst
- LBF > 2:  Decisive evidence for burst
- LBF < -1: Strong evidence for steady
- LBF < -2: Decisive evidence for steady

§Invariants

Normalized posterior: Σᵣ P(r_t = r) = 1 (up to numerical precision)
Deterministic: Same observation sequence → same posteriors
Bounded complexity: O(K) per observation update
Bounded memory: O(K) state vector
Monotonic regime confidence: p_burst increases with rapid events

§Failure Modes

Condition	Behavior	Rationale
x = 0 (instant event)	x = ε = 1ms	Avoid log(0) in likelihood
x > 10s (very slow)	Clamp to 10s	Numerical stability
All posterior mass at r=K	Normal operation	Truncation working
λ_hazard = 0	Use default λ=50	Avoid division by zero
K = 0	Use default K=100	Ensure valid state

§Configuration

BocpdConfig {
    // Observation model
    mu_steady_ms: 200.0,    // Expected inter-arrival in steady (ms)
    mu_burst_ms: 20.0,      // Expected inter-arrival in burst (ms)

    // Hazard function
    hazard_lambda: 50.0,    // Expected run-length between changepoints

    // Truncation
    max_run_length: 100,    // K for O(K) complexity

    // Decision thresholds
    steady_threshold: 0.3,  // p_burst below this → steady
    burst_threshold: 0.7,   // p_burst above this → burst

    // Priors
    burst_prior: 0.2,       // P(burst) a priori
}

§Performance

Time complexity: O(K) per observation
Space complexity: O(K) for run-length posterior
Default K=100: ~100 multiplications per resize event
Suitable for: Up to 1000 events/second without concern

§References

Adams & MacKay (2007): “Bayesian Online Changepoint Detection”
The run-length truncation follows standard BOCPD practice
Hazard function choice is geometric (constant hazard)

Structs§

BocpdConfig: Configuration for the BOCPD regime detector.
BocpdDetector: Bayesian Online Change-Point Detection for regime classification.
BocpdEvidence: Evidence from a BOCPD update step.

Enums§

BocpdRegime: Detected regime from BOCPD analysis.

Module bocpd

Module bocpd Copy item path

§Mathematical Model

§Observation Model

§Run-Length Model

§Hazard Function

§Run-Length Truncation

§Regime Posterior

§Decision Rule

§Log-Bayes Factor for Explainability

§Invariants

§Failure Modes

§Configuration

§Performance

§References

Structs§

Enums§

Module bocpd