Module diff_strategy

Expand description

Bayesian Diff Strategy Selection.

This module provides an adaptive strategy selector for buffer diffing, choosing between full diff, dirty-row diff, or full redraw based on expected cost using a Bayesian change-rate model.

§Cost Model

We model the cost of each strategy as:

Cost = c_scan × cells_scanned + c_emit × cells_emitted + c_overhead

Where:

c_scan = cost per cell comparison (memory load + compare)
c_emit = cost per changed cell emitted (ANSI escape + write)
c_overhead = fixed overhead per frame

§Strategy Costs

Let:

N = width × height (total cells)
D = number of dirty rows
W = width (cells per row)
p = change rate (fraction of cells changed)

§Full Diff (`compute`)

Scans all rows with row-skip fast path for unchanged rows:

Cost_full = c_row × H + c_scan × D × W + c_emit × (p × N)

where c_row is the cost of the row-equality fast path check.

§Dirty-Row Diff (`compute_dirty`)

Scans only rows marked dirty. When available, use a scan-cell estimate (e.g., dirty-span coverage) to refine the scan cost:

Cost_dirty = c_scan × ScanCells + c_emit × (p × N)

Where ScanCells defaults to D × W when no estimate is provided.

§Full Redraw

No diff computation; emit all cells:

Cost_redraw = c_emit × N

§Bayesian Change-Rate Posterior

We maintain a Beta prior/posterior over the change rate p:

p ~ Beta(α, β)

Prior: α₀ = 1, β₀ = 19  (E[p] = 0.05, expecting ~5% change rate)

Update per frame:
  α ← α + N_changed
  β ← β + (N_scanned - N_changed)

Posterior mean: E[p] = α / (α + β)
Posterior variance: Var[p] = αβ / ((α+β)² × (α+β+1))

§Decision Rule

Select strategy with minimum expected cost:

strategy = argmin { E[Cost_full], E[Cost_dirty], E[Cost_redraw] }

Using E[p] from the posterior to compute expected costs.

§Conservative Mode

For worst-case scenarios, use the upper 95th percentile of p:

p_95 = quantile(Beta(α, β), 0.95)

This provides a more conservative estimate when the posterior variance is high (early frames, unstable UI).

§Decay / Forgetting

To adapt to changing workloads, we apply exponential decay:

α ← α × decay + N_changed
β ← β × decay + (N_scanned - N_changed)

where decay ∈ (0, 1) (default 0.95). This weights recent frames more heavily, allowing the posterior to track non-stationary change patterns.

§Invariants

Deterministic: Same inputs → same strategy selection
O(1) update: Posterior update is constant time per frame
Bounded posterior: α, β ∈ [ε, MAX] to avoid numerical issues
Monotonic dirty tracking: Dirty rows are a superset of changed rows

§Failure Modes

Condition	Behavior	Rationale
α, β → 0	Clamp to ε = 1e-6	Avoid degenerate Beta
α, β → ∞	Cap at MAX = 1e6	Prevent overflow
D = 0 (no dirty)	Use dirty-row diff	O(height) check, optimal
D = H (all dirty)	Full diff if p low, redraw if p high	Cost-based decision
Dimension mismatch	Full redraw	Buffer resize scenario

Structs§

ChangeRateEstimator: Beta-Binomial estimator for change-rate p.
DiffStrategyConfig: Configuration for the diff strategy selector.
DiffStrategySelector: Bayesian diff strategy selector.
StrategyEvidence: Evidence supporting a strategy decision.

Enums§

DiffStrategy: The diff strategy to use for the current frame.