
Module loop_guard


Certified termination (#968): ONE exhaustion/stagnation policy for every damped inner loop.

§The bug genus this kills

Every hang in the tracker’s history (#874, #789, #683, #744, the survival-AFT cluster, #826’s 42-minute frozen-residual stall) traces to the same structural flaw: termination safety was a per-branch, hand-replicated convention. The #874 postmortem is the canonical specimen — the LM gain-reject branch lacked the exhaustion guard its sibling screening-reject branch in the SAME file already had. Guard drift between sibling branches is the control-flow twin of the objective↔gradient desync class, and the cure is the same: a single source of truth that branches consume and cannot locally re-derive.

§The policy pieces

madsen_can_retry / madsen_retry_exhausted own the damped-retry exhaustion question for Madsen-style Levenberg–Marquardt loops: a retry is alive while the damping is finite and below MADSEN_DAMPING_CAP, and dead once attempts run out or damping leaves that window. Both engines (reweight.rs Madsen-LM and the custom_family.rs spectral Newton) must answer this question through these functions — never through a local predicate.
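As a minimal sketch of the rule just stated, assuming a strict "below the cap" comparison and a caller-supplied attempt budget: the names `DAMPING_CAP_SKETCH`, `can_retry_sketch` and `retry_exhausted_sketch`, the cap value, and both signatures are hypothetical stand-ins, not the module's real API.

```rust
/// Hypothetical cap value for illustration only; the real
/// MADSEN_DAMPING_CAP value is not shown on this page.
const DAMPING_CAP_SKETCH: f64 = 1e12;

/// A retry is alive while the damping is finite and below the cap.
/// NaN and infinity both fail `is_finite`, so they count as dead.
fn can_retry_sketch(damping: f64) -> bool {
    damping.is_finite() && damping < DAMPING_CAP_SKETCH
}

/// The chain is dead once attempts run out or the damping leaves
/// the productive window. Defined through `can_retry_sketch` so the
/// two answers cannot disagree.
fn retry_exhausted_sketch(attempts: usize, max_attempts: usize, damping: f64) -> bool {
    attempts >= max_attempts || !can_retry_sketch(damping)
}
```

Deriving the exhaustion predicate from the liveness predicate, rather than writing two independent conditions, is one way to keep the pair from drifting apart.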

IterationBound and RejectEscalator are the two distinct safety mechanisms of an unbounded damped-retry loop, kept as two types on purpose. The bound owns the per-iteration hard count: it ticks once at the top of EVERY pass — including continue paths that neither accept a step nor reach a reject ritual (Fisher fallback, special cases) — and is the net that makes an unbounded loop {} safe. The escalator owns the geometric damping discipline applied on REJECTS only. A single type coupling “count++” to “reject” would either double-count iterations or silently assume every non-accepting pass reaches a reject ritual — the exact unbounded-loop hole the guard exists to close (see the #968 thread’s design note).
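The separation of the two mechanisms can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: `BoundSketch`, `EscalatorSketch`, `Pass`, `run_guarded`, and every method signature are hypothetical, and the loop is driven by a scripted list of outcomes in place of a real solver.

```rust
/// Hypothetical stand-in for IterationBound: counts every pass.
struct BoundSketch {
    ticks: usize,
}

impl BoundSketch {
    fn new() -> Self {
        Self { ticks: 0 }
    }
    fn tick(&mut self) {
        self.ticks += 1;
    }
    fn exhausted_at(&self, max_passes: usize) -> bool {
        self.ticks >= max_passes
    }
}

/// Hypothetical stand-in for RejectEscalator: factor and reject
/// count advance together, on rejects only.
struct EscalatorSketch {
    factor: f64,
    rejects: usize,
}

impl EscalatorSketch {
    fn new() -> Self {
        Self { factor: 2.0, rejects: 0 }
    }
    /// Bump the damping by the current factor, then double the factor.
    fn escalate(&mut self, damping: f64) -> f64 {
        let bumped = damping * self.factor;
        self.factor *= 2.0;
        self.rejects += 1;
        bumped
    }
}

/// Scripted outcome of one pass (illustrative only).
enum Pass {
    Accept,
    Reject,
    Fallback, // a `continue` path that neither accepts nor rejects
}

/// Runs the guarded loop over `script`; once the script runs out,
/// every pass is treated as a fallback, i.e. a branch that never
/// accepts. Returns (passes ticked, rejects, final damping).
fn run_guarded(script: &[Pass], max_passes: usize) -> (usize, usize, f64) {
    let mut bound = BoundSketch::new();
    let mut escalator = EscalatorSketch::new();
    let mut damping = 1.0_f64;
    let mut outcomes = script.iter();
    loop {
        if bound.exhausted_at(max_passes) {
            break;
        }
        bound.tick(); // once per pass, before any branch can `continue`
        match outcomes.next() {
            Some(Pass::Accept) => break,
            Some(Pass::Reject) => {
                damping = escalator.escalate(damping);
                continue;
            }
            Some(Pass::Fallback) | None => continue,
        }
    }
    (bound.ticks, escalator.rejects, damping)
}
```

In this sketch, `run_guarded(&[], 10)` never sees an accept or a reject, yet still stops after ten passes with zero rejects: the bound catches what the escalator alone would not.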

FlatStreak owns the consecutive-window discipline every stagnation detector shares: a streak that grows on “flat” readings, resets on recovery, and fires once it spans the window. Loops that own a scale-aware flatness predicate of their own (the custom_family joint-Newton objective-flat counter, the blockwise frozen-loglik divergence detector) consume it directly. These detectors catch the case attempt caps cannot see: a loop that still “makes progress” every iteration but whose MERIT is frozen. #744 ran to cycle 1199/1200 at a flat residual; #826 burned a CI timeout on a frozen joint residual. The caller feeds its descent quantity (penalized NLL, residual norm, |g|) through its own flatness predicate once per iteration; the streak reports a plateau once flat readings span a consecutive window, long before any iteration cap would fire.
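A minimal sketch of the grow/reset/fire discipline, with the flatness predicate kept on the caller's side. `FlatStreakSketch`, `observe`, `is_flat`, `first_plateau`, and the relative-tolerance rule are hypothetical names and choices made for illustration; the real FlatStreak API is not shown on this page.

```rust
/// Hypothetical stand-in for FlatStreak: owns only the window logic.
struct FlatStreakSketch {
    window: usize,
    streak: usize,
}

impl FlatStreakSketch {
    fn new(window: usize) -> Self {
        Self { window, streak: 0 }
    }

    /// Feed one reading. Grows on flat, resets on recovery, and
    /// returns true for as long as the streak spans the window.
    fn observe(&mut self, is_flat: bool) -> bool {
        if is_flat {
            self.streak = self.streak.saturating_add(1);
        } else {
            self.streak = 0;
        }
        self.streak >= self.window
    }
}

/// Caller-owned, scale-aware flatness predicate (illustrative rule:
/// change is small relative to the previous magnitude).
fn is_flat(prev: f64, curr: f64, rel_tol: f64) -> bool {
    (prev - curr).abs() <= rel_tol * prev.abs().max(1.0)
}

/// Index of the merit reading at which the plateau first fires,
/// or None if the sequence never stays flat for a full window.
fn first_plateau(merit: &[f64], window: usize, rel_tol: f64) -> Option<usize> {
    let mut streak = FlatStreakSketch::new(window);
    for (i, pair) in merit.windows(2).enumerate() {
        if streak.observe(is_flat(pair[0], pair[1], rel_tol)) {
            return Some(i + 1);
        }
    }
    None
}
```

Keeping `is_flat` outside the streak type mirrors the split described above: what counts as flat is loop-specific policy, while the window bookkeeping is the part that should exist exactly once.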

§Verdicts, not panics

Exhaustion is an escalation event: the consuming loop converts LoopVerdict::Plateaued / LoopVerdict::Exhausted into its honest terminal status (StalledAtValidMinimum, LmStepSearchExhausted, …) and unwinds. Never a hang, never a panic, never a silent wrong answer.
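One way a consuming loop might perform that conversion is sketched below. The enums are local mirrors introduced for illustration (`VerdictSketch`, `TerminalStatusSketch`), the variants are assumed to carry no data, and the specific pairing of each verdict with a status is an assumption rather than something this page confirms.

```rust
/// Local mirror of LoopVerdict, assuming unit variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VerdictSketch {
    Continue,
    Plateaued,
    Exhausted,
}

/// Local mirror of two terminal statuses named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TerminalStatusSketch {
    StalledAtValidMinimum,
    LmStepSearchExhausted,
}

/// Convert a verdict into a terminal status, or None to keep going.
/// The match has no wildcard arm, so adding a new verdict variant
/// becomes a compile error here instead of a silently swallowed case.
fn to_terminal(verdict: VerdictSketch) -> Option<TerminalStatusSketch> {
    match verdict {
        VerdictSketch::Continue => None,
        // Assumed pairing, for illustration only.
        VerdictSketch::Plateaued => Some(TerminalStatusSketch::StalledAtValidMinimum),
        VerdictSketch::Exhausted => Some(TerminalStatusSketch::LmStepSearchExhausted),
    }
}
```

A caller would unwind as soon as `to_terminal` returns `Some(status)` and report that status, which keeps both terminal verdicts visible to whoever consumes the result.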

§Migration map (each completed step deleted a hand-rolled guard)

  1. (done) reweight.rs lm_can_retry/lm_retry_exhausted local fns + the local LM_MAX_LAMBDA const deleted; call sites consume this module’s policy.
  2. (done) The 7 copies of the reweight.rs reject ritual (loop_lambda *= factor; factor *= 2.0; continue) collapsed onto RejectEscalator::escalate, and the per-iteration hard count moved into IterationBound, so neither discipline can drift per-branch.
  3. (done) custom_family.rs: the joint-Newton objective-flat counter and the blockwise frozen-loglik divergence streak both ride FlatStreak — the #826-class exit discipline now lives here, not in per-loop counters. The richer certificate machinery those loops layer on top (geometric-tail bound, clamped-step side condition) stays local: it is policy about what counts as flat, which the loops rightly own; the streak/window discipline is what must not fork.
  4. (dropped) Terminal-verdict reporting into heartbeat scopes: the [JN-EXIT]/[PIRLS] per-exit log lines already name why a loop ended; a parallel verdict channel in the process monitor would be redundant global state.

Structs§

FlatStreak
Consecutive-flatness streak: the window discipline shared by every stagnation detector in the tree. The caller owns the flatness predicate (scale-aware objective tolerance, frozen log-likelihood, sub-tolerance relative improvement, …); this type owns the part that historically forked per loop — grow on flat, reset on recovery, fire once the streak spans the window, and keep firing while it persists.
IterationBound
Per-iteration hard bound for a damped retry loop: the net that makes an unbounded loop {} safe. Tick it once at the top of EVERY pass — accepted, rejected, or any continue path that reaches neither — and ask IterationBound::exhausted_at wherever the loop’s exhaustion question is posed. Created fresh per outer iteration.
RejectEscalator
Geometric damping escalator for one reject chain (Madsen–Nielsen–Tingleff eq 3.16: the multiplier starts at 2 and doubles on every rejection, so successive bumps are ×2, ×4, ×8, …). Owns the factor and the reject count as one indivisible discipline — no branch can bump the damping without advancing the schedule, the drift mode behind #874. Deliberately does NOT own the per-iteration count; that is IterationBound’s job (see module docs for why the two must not be one type).
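Taking the RejectEscalator schedule at face value (multiplier starts at 2 and doubles on every rejection), the cumulative effect of a reject chain can be worked out directly. The symbols below are notation introduced here, not names from the module: \lambda_0 is the damping before the first rejection and \lambda_n the damping after n consecutive rejections.

```latex
\lambda_n \;=\; \lambda_0 \prod_{k=1}^{n} 2^{k} \;=\; \lambda_0 \, 2^{\,n(n+1)/2}
\qquad\text{e.g.}\qquad
\lambda_3 = 2^{6}\,\lambda_0 = 64\,\lambda_0,
\qquad
\lambda_{12} = 2^{78}\,\lambda_0 \approx 3.0 \times 10^{23}\,\lambda_0 .
```

The exponent grows quadratically in the number of rejections, which is why a chain that starts at moderate damping leaves any finite productive window after only a handful of rejects.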

Enums§

LoopVerdict
Terminal verdict of a guarded loop. Continue is the only non-terminal answer; the two terminal verdicts are ESCALATION events the consumer must convert into an honest status, never swallow.

Constants§

MADSEN_DAMPING_CAP
Damping ceiling for Madsen-style LM retries. Beyond this the proposed step is numerically a zero step — retrying cannot make progress, so the retry chain is declared dead. (Moved verbatim from reweight.rs, where it was a file-local convention; see module docs for why it must be shared.)
MADSEN_INITIAL_REJECT_FACTOR
Initial damping multiplier on the first rejection of an iteration. Doubles on every further rejection (geometric escalation), reaching MADSEN_DAMPING_CAP from λ = 1 in ~12 rejections — the established reweight.rs schedule, now owned here.
PLATEAU_DEFAULT_WINDOW
Default consecutive-window length for a FlatStreak stagnation detector: how many successive flat readings must accumulate before the loop is declared plateaued. The value is the established in-tree two-reading streak convention (reweight.rs soft-acceptance), since one noisy reading can fake a plateau but two consecutive readings cannot, plus one reading of headroom that a merit which is genuinely creeping (not frozen) needs to escape.

Functions§

inner_convergence_is_truthful
Convergence-truthfulness invariant for an inner-solve terminal verdict (gam#1040).
madsen_can_retry
Is a damped retry still alive at this damping level?
madsen_retry_exhausted
Has the retry chain exhausted its budget — by attempt count or by the damping leaving the productive window?
slow_geometric_rate_exceeds_projection_cap
Deterministic slow-geometric-rate stall predicate (gam#979 survival marginal-slope hang).