Module sensitivity


ONE sensitivity operator (#935): every “how does the fit move?” question is the same solve.

At a penalized optimum the stationarity condition g(β̂; t) = 0 makes every sensitivity of the fit one object, the inverse of the factored fitted curvature H = ∂g/∂β (evaluated at β̂) applied to a perturbation of the score:

  ∂β̂/∂t = −H⁻¹ · ∂g/∂t

for ANY perturbation channel t: smoothing parameters (the REML outer gradient), case weights (ALO / leave-one-out / Cook’s distance), responses (data attribution). One identity, read off in whichever direction a diagnostic needs it.
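The identity follows from differentiating g(β̂(t); t) = 0 in t, which gives H · ∂β̂/∂t + ∂g/∂t = 0. As a minimal sketch of the smoothing-parameter channel, assume a two-coefficient penalized least-squares fit with g(β; λ) = (XᵀX + λS)β − Xᵀy, so that H = XᵀX + λS and ∂g/∂λ = Sβ̂. The helper names (`solve2`, `fit`, `dbeta_dlambda`) and the numbers are hypothetical and are not part of this module; the sketch only compares the identity against a central finite difference.

```rust
// Illustrative only: a hand-rolled 2x2 problem, not this crate's API.
type V2 = [f64; 2];
type M2 = [[f64; 2]; 2];

/// Solve the 2x2 system h * x = b by Cramer's rule.
fn solve2(h: &M2, b: &V2) -> V2 {
    let det = h[0][0] * h[1][1] - h[0][1] * h[1][0];
    [
        (b[0] * h[1][1] - h[0][1] * b[1]) / det,
        (h[0][0] * b[1] - h[1][0] * b[0]) / det,
    ]
}

fn matvec(m: &M2, v: &V2) -> V2 {
    [
        m[0][0] * v[0] + m[0][1] * v[1],
        m[1][0] * v[0] + m[1][1] * v[1],
    ]
}

/// Fitted curvature H(λ) = XᵀX + λS.
fn hessian(xtx: &M2, s: &M2, lambda: f64) -> M2 {
    [
        [xtx[0][0] + lambda * s[0][0], xtx[0][1] + lambda * s[0][1]],
        [xtx[1][0] + lambda * s[1][0], xtx[1][1] + lambda * s[1][1]],
    ]
}

/// β̂(λ): the root of g(β; λ) = (XᵀX + λS)β − Xᵀy.
fn fit(xtx: &M2, xty: &V2, s: &M2, lambda: f64) -> V2 {
    solve2(&hessian(xtx, s, lambda), xty)
}

/// ∂β̂/∂λ = −H⁻¹ · ∂g/∂λ, with ∂g/∂λ = S β̂.
fn dbeta_dlambda(xtx: &M2, xty: &V2, s: &M2, lambda: f64) -> V2 {
    let beta = fit(xtx, xty, s, lambda);
    let dg = matvec(s, &beta);
    let sol = solve2(&hessian(xtx, s, lambda), &dg);
    [-sol[0], -sol[1]]
}

fn main() {
    let xtx: M2 = [[4.0, 1.0], [1.0, 3.0]];
    let xty: V2 = [1.0, 2.0];
    let s: M2 = [[1.0, 0.0], [0.0, 2.0]];
    let (lambda, eps) = (0.5, 1e-6);

    let analytic = dbeta_dlambda(&xtx, &xty, &s, lambda);
    // Central finite difference of the refitted coefficients.
    let hi = fit(&xtx, &xty, &s, lambda + eps);
    let lo = fit(&xtx, &xty, &s, lambda - eps);
    let fd = [(hi[0] - lo[0]) / (2.0 * eps), (hi[1] - lo[1]) / (2.0 * eps)];
    println!("analytic = {:?}, finite difference = {:?}", analytic, fd);
}
```

The same solve serves any other channel once ∂g/∂t is swapped for that channel's perturbation; only the right-hand side changes.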

Before this, the tree computed H⁻¹· in independent dialects with independent factorizations — AloFactoredHessian (runtime.rs), an ift_dbeta_drho_from_solver solve-closure and a separate coned variant (evidence.rs), and the projected pseudo-inverse of the rank-deficient LAML kernel (unified.rs) — so each site had to answer on its own the question that actually causes bugs: which inverse is “H⁻¹”? The large-scale fix 0dc469bd and the #901 layer-2 investigation are both incidents of two sites answering differently.

FitSensitivity is the single answer. It is built once at the optimum from whichever factored form the solver already has — a faer Cholesky factor, a raw lower-triangular (arrow-Schur) factor, or the projected pseudo-inverse U · M⁻¹ · Uᵀ (the #752/#901 intrinsic-quotient convention) — and every consumer asks it, never a factor directly. Consumers therefore cannot disagree about the inverse, and every batching/cone improvement made inside FitSensitivity::apply_multi is inherited by all of them at once.
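The "ask the operator, never a factor" design can be sketched as an enum whose one `apply` method is the only code that decides what H⁻¹ means. Everything below is a hypothetical stand-in: `SketchInverse`, its variants, and its dense `Vec<Vec<f64>>` storage are assumptions made for illustration and do not reflect the actual definitions of `FittedInverse` or `FitSensitivity`. In particular, the projected variant stores M⁻¹ explicitly for brevity, where a real implementation would more likely hold a factor.

```rust
/// Hypothetical stand-in for a fitted inverse (not the crate's type).
enum SketchInverse {
    /// Lower-triangular L with H = L Lᵀ, dense row-major.
    Cholesky(Vec<Vec<f64>>),
    /// Projected pseudo-inverse U · M⁻¹ · Uᵀ; `u` is p×r, `m_inv` is r×r.
    Projected { u: Vec<Vec<f64>>, m_inv: Vec<Vec<f64>> },
}

impl SketchInverse {
    /// The single place that applies "H⁻¹" to a right-hand side.
    fn apply(&self, b: &[f64]) -> Vec<f64> {
        match self {
            SketchInverse::Cholesky(l) => {
                let n = b.len();
                // Forward substitution: L z = b.
                let mut z = vec![0.0; n];
                for i in 0..n {
                    let s: f64 = (0..i).map(|k| l[i][k] * z[k]).sum();
                    z[i] = (b[i] - s) / l[i][i];
                }
                // Back substitution: Lᵀ x = z.
                let mut x = vec![0.0; n];
                for i in (0..n).rev() {
                    let s: f64 = (i + 1..n).map(|k| l[k][i] * x[k]).sum();
                    x[i] = (z[i] - s) / l[i][i];
                }
                x
            }
            SketchInverse::Projected { u, m_inv } => {
                let (p, r) = (u.len(), m_inv.len());
                // c = Uᵀ b
                let c: Vec<f64> = (0..r)
                    .map(|j| (0..p).map(|i| u[i][j] * b[i]).sum::<f64>())
                    .collect();
                // d = M⁻¹ c
                let d: Vec<f64> = (0..r)
                    .map(|j| (0..r).map(|k| m_inv[j][k] * c[k]).sum::<f64>())
                    .collect();
                // x = U d
                (0..p)
                    .map(|i| (0..r).map(|j| u[i][j] * d[j]).sum::<f64>())
                    .collect()
            }
        }
    }

    /// A consumer channel: −H⁻¹ · ∂g/∂t. It never touches a factor directly.
    fn mode_response(&self, dg_dt: &[f64]) -> Vec<f64> {
        self.apply(dg_dt).into_iter().map(|v| -v).collect()
    }
}

fn main() {
    // H = [[4, 2], [2, 3]] has Cholesky factor L = [[2, 0], [1, sqrt(2)]].
    let chol = SketchInverse::Cholesky(vec![vec![2.0, 0.0], vec![1.0, 2.0_f64.sqrt()]]);
    println!("Cholesky response: {:?}", chol.mode_response(&[2.0, 1.0]));

    // Rank-1 projected form: only the first coordinate direction is retained.
    let proj = SketchInverse::Projected {
        u: vec![vec![1.0], vec![0.0]],
        m_inv: vec![vec![0.5]],
    };
    println!("Projected response: {:?}", proj.mode_response(&[2.0, 1.0]));
}
```

Because `mode_response` is written once against `apply`, a caller holding either variant gets the inverse convention chosen at construction time and has no way to substitute its own.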

The channels, each a one-line restatement of the identity above:

mode_response — −H⁻¹ ∂g/∂t, the REML outer gradient’s ∂β̂/∂ρ (evidence ift_dbeta_drho).
mode_response_coned — the same response confined to its cone of influence (#779); the lazy/local form the smoothing-correction IFT uses.
leverage_block — H⁻¹Xᵀ, whose column i is at once ALO’s per-row solve and the case/response channel.
case_deletion — dfbetas + Cook’s distance, the leave-one-out channel, one scaled column of H⁻¹Xᵀ each.
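For the case-deletion channel in the Gaussian setting, the Sherman-Morrison identity gives the exact leave-one-out coefficients from one column of H⁻¹Xᵀ: β̂₍₋ᵢ₎ = β̂ − H⁻¹xᵢ · rᵢ / (1 − hᵢᵢ), with residual rᵢ = yᵢ − xᵢᵀβ̂ and leverage hᵢᵢ = xᵢᵀH⁻¹xᵢ. The sketch below checks that identity against an explicit refit on a small penalized regression. The functions (`normal_eqs`, `loo_beta`, `refit_without`) and the data are hypothetical and say nothing about how `FitSensitivity::case_deletion` is implemented; dfbetas scaling and Cook's distance are omitted.

```rust
// Illustrative only: two coefficients, dense arithmetic, made-up data.
type V2 = [f64; 2];
type M2 = [[f64; 2]; 2];

/// Solve the 2x2 system h * x = b by Cramer's rule.
fn solve2(h: &M2, b: &V2) -> V2 {
    let det = h[0][0] * h[1][1] - h[0][1] * h[1][0];
    [
        (b[0] * h[1][1] - h[0][1] * b[1]) / det,
        (h[0][0] * b[1] - h[1][0] * b[0]) / det,
    ]
}

/// Accumulate H = XᵀX + λS and Xᵀy, optionally skipping one row.
fn normal_eqs(x: &[V2], y: &[f64], s: &M2, lambda: f64, skip: Option<usize>) -> (M2, V2) {
    let mut h = [
        [lambda * s[0][0], lambda * s[0][1]],
        [lambda * s[1][0], lambda * s[1][1]],
    ];
    let mut xty = [0.0; 2];
    for (i, (xi, &yi)) in x.iter().zip(y).enumerate() {
        if Some(i) == skip {
            continue;
        }
        for a in 0..2 {
            xty[a] += xi[a] * yi;
            for b in 0..2 {
                h[a][b] += xi[a] * xi[b];
            }
        }
    }
    (h, xty)
}

/// Leave-one-out coefficients from one column of H⁻¹Xᵀ, without refitting.
fn loo_beta(x: &[V2], y: &[f64], s: &M2, lambda: f64, i: usize) -> V2 {
    let (h, xty) = normal_eqs(x, y, s, lambda, None);
    let beta = solve2(&h, &xty);
    let col = solve2(&h, &x[i]); // column i of H⁻¹Xᵀ
    let lev = x[i][0] * col[0] + x[i][1] * col[1]; // leverage h_ii
    let resid = y[i] - (x[i][0] * beta[0] + x[i][1] * beta[1]);
    let scale = resid / (1.0 - lev);
    [beta[0] - col[0] * scale, beta[1] - col[1] * scale]
}

/// Reference answer: drop row i and solve the normal equations again.
fn refit_without(x: &[V2], y: &[f64], s: &M2, lambda: f64, i: usize) -> V2 {
    let (h, xty) = normal_eqs(x, y, s, lambda, Some(i));
    solve2(&h, &xty)
}

fn main() {
    let x: Vec<V2> = vec![[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]];
    let y = vec![0.1, 1.2, 1.9, 3.3];
    let s: M2 = [[0.0, 0.0], [0.0, 1.0]]; // penalize the slope only
    let lambda = 0.3;
    for i in 0..x.len() {
        let fast = loo_beta(&x, &y, &s, lambda, i);
        let slow = refit_without(&x, &y, &s, lambda, i);
        let gap = (fast[0] - slow[0]).abs().max((fast[1] - slow[1]).abs());
        println!("row {i}: max |identity - refit| = {gap:e}");
    }
}
```

For a non-Gaussian GLM the same column yields only a one-step approximation, since the weights and therefore H change when a case is removed.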

What is deliberately NOT folded in: the matrix-free hop.solve_multi (PCG/GPU), the constrained kernel K_T = K_S − K_S Aᵀ(A K_S Aᵀ)⁻¹A K_S, and alo.rs’s zero-copy StableSolver loop. Those are distinct inverse representations, not duplicate spellings of the same factored inverse — routing them through here would regress performance and couple unrelated concerns rather than remove the bug class.
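To see why the constrained kernel is a different inverse representation and not another spelling of H⁻¹, it helps to evaluate the formula once. The sketch below assumes a single constraint row a, so that A K_S Aᵀ is a scalar, and takes K_S to be an arbitrary symmetric positive-definite matrix chosen for illustration. The function name `constrained_kernel` and the numbers are hypothetical. The result satisfies A K_T = 0: the constrained direction is projected out, which a plain factored inverse never does.

```rust
/// K_T = K_S − K_S aᵀ (a K_S aᵀ)⁻¹ a K_S for one constraint row `a`.
/// Illustrative dense version; not the crate's implementation.
fn constrained_kernel(k_s: &[Vec<f64>], a: &[f64]) -> Vec<Vec<f64>> {
    let p = a.len();
    // ka = K_S aᵀ (a column vector).
    let ka: Vec<f64> = (0..p)
        .map(|i| (0..p).map(|j| k_s[i][j] * a[j]).sum::<f64>())
        .collect();
    // ak = a K_S (a row vector); equal to ka when K_S is symmetric.
    let ak: Vec<f64> = (0..p)
        .map(|j| (0..p).map(|i| a[i] * k_s[i][j]).sum::<f64>())
        .collect();
    // The scalar a K_S aᵀ.
    let aka: f64 = (0..p).map(|i| a[i] * ka[i]).sum();
    (0..p)
        .map(|i| {
            (0..p)
                .map(|j| k_s[i][j] - ka[i] * ak[j] / aka)
                .collect::<Vec<f64>>()
        })
        .collect()
}

fn main() {
    let k_s = vec![
        vec![2.0, 0.5, 0.0],
        vec![0.5, 1.0, 0.2],
        vec![0.0, 0.2, 3.0],
    ];
    let a = [1.0, 1.0, 1.0]; // a sum-to-zero style constraint row
    let k_t = constrained_kernel(&k_s, &a);
    // Each entry of a · K_T should vanish up to rounding.
    for j in 0..3 {
        let v: f64 = (0..3).map(|i| a[i] * k_t[i][j]).sum();
        println!("(a K_T)[{j}] = {v:e}");
    }
}
```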

Structs

CaseDeletionInfluence: Exact (Gaussian) / one-step (GLM) case-deletion influence produced by FitSensitivity::case_deletion. See that method for the identities.
FitSensitivity: The one sensitivity operator built at the optimum. See module docs.

Enums

FittedInverse: The fitted curvature in whichever factored form the solver produced — the SINGLE place that knows how to invert it.