ONE sensitivity operator (#935): every “how does the fit move?” question is the same solve.
At a penalized optimum the stationarity condition g(β̂; t) = 0 makes
every sensitivity of the fit one object — the factored fitted curvature
applied to a perturbation of the score:

    ∂β̂/∂t = −H⁻¹ · ∂g/∂t

for ANY perturbation channel t: smoothing parameters (the REML outer
gradient), case weights (ALO / leave-one-out / Cook’s distance),
responses (data attribution). One identity, read off in whichever
direction a diagnostic needs it.
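
As an illustration of the identity above, here is a minimal sketch for a toy ridge fit, where the score is g(β; ρ) = Xᵀ(Xβ − y) + e^ρ β, so H = XᵀX + e^ρ I and ∂g/∂ρ = e^ρ β̂. The names `solve2`, `normal_equations`, `fit` and `dbeta_drho` are hypothetical helpers written for this example and are not part of the crate; the 2×2 Cramer solve stands in for whatever factorization the solver really holds.

```rust
/// Solve the 2x2 system H x = b by Cramer's rule (stand-in for a real factorization).
fn solve2(h: [[f64; 2]; 2], b: [f64; 2]) -> [f64; 2] {
    let det = h[0][0] * h[1][1] - h[0][1] * h[1][0];
    [
        (h[1][1] * b[0] - h[0][1] * b[1]) / det,
        (h[0][0] * b[1] - h[1][0] * b[0]) / det,
    ]
}

/// Build H = XᵀX + e^ρ I and the right-hand side Xᵀy for the toy ridge problem.
fn normal_equations(x: &[[f64; 2]], y: &[f64], rho: f64) -> ([[f64; 2]; 2], [f64; 2]) {
    let lam = rho.exp();
    let mut h = [[lam, 0.0], [0.0, lam]];
    let mut xty = [0.0; 2];
    for (row, &yi) in x.iter().zip(y) {
        for a in 0..2 {
            xty[a] += row[a] * yi;
            for b in 0..2 {
                h[a][b] += row[a] * row[b];
            }
        }
    }
    (h, xty)
}

/// Penalized optimum β̂(ρ): the β that solves g(β; ρ) = 0.
fn fit(x: &[[f64; 2]], y: &[f64], rho: f64) -> [f64; 2] {
    let (h, xty) = normal_equations(x, y, rho);
    solve2(h, xty)
}

/// ∂β̂/∂ρ = −H⁻¹ · ∂g/∂ρ, with ∂g/∂ρ = e^ρ β̂ for this penalty (assumed S = I).
fn dbeta_drho(x: &[[f64; 2]], y: &[f64], rho: f64) -> [f64; 2] {
    let (h, xty) = normal_equations(x, y, rho);
    let beta = solve2(h, xty);
    let lam = rho.exp();
    let dg = [lam * beta[0], lam * beta[1]];
    let s = solve2(h, dg);
    [-s[0], -s[1]]
}
```

A central finite difference of `fit` in ρ should agree with `dbeta_drho` to within the truncation error of the difference, which is a cheap way to check any one channel of the identity.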
Before this, the tree computed H⁻¹· in independent dialects with
independent factorizations — AloFactoredHessian (runtime.rs), an
ift_dbeta_drho_from_solver solve-closure and a separate coned variant
(evidence.rs), and the projected pseudo-inverse of the rank-deficient
LAML kernel (unified.rs) — so each site had to answer on its own the
question that actually causes bugs: which inverse is “H⁻¹”? The
large-scale fix 0dc469bd and the #901 layer-2 investigation are both
incidents of two sites answering differently.
FitSensitivity is the single answer. It is built once at the optimum
from whichever factored form the solver already has — a faer Cholesky
factor, a raw lower-triangular (arrow-Schur) factor, or the projected
pseudo-inverse U · M⁻¹ · Uᵀ (the #752/#901 intrinsic-quotient
convention) — and every consumer asks it, never a factor directly.
Consumers therefore cannot disagree about the inverse, and every
batching/cone improvement made inside FitSensitivity::apply_multi is
inherited by all of them at once.
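
The "one place that knows the inverse" design can be sketched as follows. This is an assumption-laden toy, not the crate's definition: the source names `FittedInverse` and `FitSensitivity` but does not show their fields, so the enum `ToyInverse`, its two variants, and the single-right-hand-side `apply` below are hypothetical (the real batching entry point is `apply_multi`, whose signature is not shown here).

```rust
/// Toy stand-in for the fitted curvature; variants and fields are assumed for illustration.
enum ToyInverse {
    /// Lower-triangular L with H = L Lᵀ.
    CholeskyLower(Vec<Vec<f64>>),
    /// Projected pseudo-inverse U · M⁻¹ · Uᵀ, with U stored p×r and M⁻¹ stored r×r.
    Projected { u: Vec<Vec<f64>>, m_inv: Vec<Vec<f64>> },
}

impl ToyInverse {
    /// Apply "H⁻¹" to one vector; callers never see which representation is used.
    fn apply(&self, b: &[f64]) -> Vec<f64> {
        match self {
            ToyInverse::CholeskyLower(l) => {
                let n = b.len();
                // Forward substitution: L z = b.
                let mut z = vec![0.0; n];
                for i in 0..n {
                    let s: f64 = (0..i).map(|j| l[i][j] * z[j]).sum();
                    z[i] = (b[i] - s) / l[i][i];
                }
                // Back substitution: Lᵀ x = z.
                let mut x = vec![0.0; n];
                for i in (0..n).rev() {
                    let s: f64 = (i + 1..n).map(|j| l[j][i] * x[j]).sum();
                    x[i] = (z[i] - s) / l[i][i];
                }
                x
            }
            ToyInverse::Projected { u, m_inv } => {
                let r = m_inv.len();
                // t = Uᵀ b
                let t: Vec<f64> = (0..r)
                    .map(|k| (0..b.len()).map(|i| u[i][k] * b[i]).sum::<f64>())
                    .collect();
                // w = M⁻¹ t
                let w: Vec<f64> = m_inv
                    .iter()
                    .map(|row| row.iter().zip(&t).map(|(a, c)| a * c).sum::<f64>())
                    .collect();
                // x = U w
                u.iter()
                    .map(|row| row.iter().zip(&w).map(|(a, c)| a * c).sum::<f64>())
                    .collect()
            }
        }
    }

    /// The mode response −H⁻¹ · ∂g/∂t for one perturbation direction.
    fn mode_response(&self, dg_dt: &[f64]) -> Vec<f64> {
        self.apply(dg_dt).into_iter().map(|v| -v).collect()
    }
}
```

Because every consumer goes through `apply`, a full-rank problem expressed either way (a Cholesky factor, or U = I with M⁻¹ = H⁻¹) returns the same response, which is the agreement property the paragraph above describes.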
The channels, each a one-line restatement of the identity above:
- mode_response: −H⁻¹ ∂g/∂t, the REML outer gradient’s ∂β̂/∂ρ (evidence ift_dbeta_drho).
- mode_response_coned: the same response confined to its cone of influence (#779); the lazy/local form the smoothing-correction IFT uses.
- leverage_block: H⁻¹Xᵀ, whose column i is at once ALO’s per-row solve and the case/response channel.
- case_deletion: dfbetas + Cook’s distance, the leave-one-out channel, one scaled column of H⁻¹Xᵀ each.
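
To make "one scaled column of H⁻¹Xᵀ" concrete, the sketch below uses the standard exact Gaussian leave-one-out identity β̂₍₋ᵢ₎ − β̂ = −H⁻¹xᵢ · rᵢ / (1 − hᵢᵢ), with rᵢ = yᵢ − xᵢᵀβ̂ and hᵢᵢ = xᵢᵀH⁻¹xᵢ, for a toy ridge fit with H = XᵀX + λI. All function names are hypothetical, and the standardization the crate applies to obtain dfbetas and Cook's distance is not shown in this section, so only the raw coefficient shift is computed.

```rust
type Mat2 = [[f64; 2]; 2];
type Vec2 = [f64; 2];

/// Solve the 2x2 system H x = b by Cramer's rule (stand-in for a real factorization).
fn solve2(h: Mat2, b: Vec2) -> Vec2 {
    let det = h[0][0] * h[1][1] - h[0][1] * h[1][0];
    [
        (h[1][1] * b[0] - h[0][1] * b[1]) / det,
        (h[0][0] * b[1] - h[1][0] * b[0]) / det,
    ]
}

/// Accumulate H = XᵀX + λI and Xᵀy, optionally leaving out one row (for refit checks).
fn normal_eq(x: &[Vec2], y: &[f64], lam: f64, skip: Option<usize>) -> (Mat2, Vec2) {
    let mut h = [[lam, 0.0], [0.0, lam]];
    let mut xty = [0.0; 2];
    for (k, (row, &yi)) in x.iter().zip(y).enumerate() {
        if Some(k) == skip {
            continue;
        }
        for a in 0..2 {
            xty[a] += row[a] * yi;
            for b in 0..2 {
                h[a][b] += row[a] * row[b];
            }
        }
    }
    (h, xty)
}

/// Coefficient shift from deleting case `i`, using only the full-data H.
fn loo_shift(x: &[Vec2], y: &[f64], lam: f64, i: usize) -> Vec2 {
    let (h, xty) = normal_eq(x, y, lam, None);
    let beta = solve2(h, xty);
    // Column i of H⁻¹Xᵀ: the same solve that ALO and leverage need.
    let col = solve2(h, x[i]);
    let lev = x[i][0] * col[0] + x[i][1] * col[1];
    let resid = y[i] - (x[i][0] * beta[0] + x[i][1] * beta[1]);
    let scale = resid / (1.0 - lev);
    [-col[0] * scale, -col[1] * scale]
}
```

In this Gaussian setting the shift from `loo_shift` should match an explicit refit without row `i` up to rounding; for a GLM the same formula is only a one-step approximation, as the struct description for the case-deletion result notes.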
What is deliberately NOT folded in: the matrix-free hop.solve_multi
(PCG/GPU), the constrained kernel K_T = K_S − K_S Aᵀ(A K_S Aᵀ)⁻¹A K_S,
and alo.rs’s zero-copy StableSolver loop. Those are distinct inverse
representations, not duplicate spellings of the same factored inverse —
routing them through here would regress performance and couple unrelated
concerns rather than remove the bug class.
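
For a sense of why the constrained kernel is a different object rather than another spelling of the factored inverse, here is a small sketch of K_T = K_S − K_S Aᵀ(A K_S Aᵀ)⁻¹A K_S for the special case of a single constraint row, so that A K_S Aᵀ is a scalar. The function name and the dense `Vec<Vec<f64>>` storage are assumptions for illustration; the source does not show how the crate represents K_S or A.

```rust
/// K_T = K_S − K_S Aᵀ (A K_S Aᵀ)⁻¹ A K_S for one constraint row `a` (A is 1×p).
fn constrained_kernel(k: &[Vec<f64>], a: &[f64]) -> Vec<Vec<f64>> {
    let p = a.len();
    // ka = K_S Aᵀ (a p-vector).
    let ka: Vec<f64> = k
        .iter()
        .map(|row| row.iter().zip(a).map(|(x, y)| x * y).sum::<f64>())
        .collect();
    // ak = A K_S (equals ka when K_S is symmetric; computed separately to avoid assuming it).
    let ak: Vec<f64> = (0..p)
        .map(|j| (0..p).map(|i| a[i] * k[i][j]).sum::<f64>())
        .collect();
    // s = A K_S Aᵀ, a scalar because there is a single constraint.
    let s: f64 = a.iter().zip(&ka).map(|(x, y)| x * y).sum();
    (0..p)
        .map(|i| (0..p).map(|j| k[i][j] - ka[i] * ak[j] / s).collect::<Vec<f64>>())
        .collect()
}
```

The result satisfies A K_T = 0 by construction, so K_T is singular along the constraint direction and cannot be recovered by applying a single factored H⁻¹.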
Structs
- CaseDeletionInfluence: Exact (Gaussian) / one-step (GLM) case-deletion influence produced by FitSensitivity::case_deletion. See that method for the identities.
- FitSensitivity: The one sensitivity operator built at the optimum. See module docs.
Enums
- FittedInverse: The fitted curvature in whichever factored form the solver produced; the SINGLE place that knows how to invert it.