Penalized multinomial-logit (softmax) GLM driver — fixed-λ inner solve.
This is the principled vector-response companion to the scalar PIRLS path:
the inner-loop Newton solver for a multi-class GAM at fixed smoothing
parameters λ, using the canonical multinomial-logit likelihood
(MultinomialLogitLikelihood) and the existing dense block-Fisher
assembly in gam_solve::pirls::dense_block_xtwx /
gam_solve::pirls::dense_block_xtwy.
§What this module does
Solve, for the reference-coded multinomial-logit GAM with K classes and
design matrix X ∈ ℝ^{N×P},

    β̂ = argmin_β { −log L(β) + ½ Σ_{a=0}^{K-2} λ_a · β_a^T S β_a }

where β = [β_0; β_1; …; β_{K-2}] is the stacked coefficient vector in
output-major order (β_a ∈ ℝ^P is the coefficient block for class a),
S ∈ ℝ^{P×P} is the smoothing penalty matrix (shared across classes,
replicated as I_{K-1} ⊗ S over the full parameter space), and λ_a is
a per-class smoothing parameter.
The likelihood uses class K - 1 as the reference (η_{K-1} ≡ 0), so the
softmax gauge is fixed at the η level and no additional sum-to-zero
projection is required.
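
As an illustration of the reference coding and the penalty term, here is a minimal sketch in plain Rust. The names `reference_softmax` and `penalty` are hypothetical and are not part of this module's API; the sketch assumes `S` is stored row-major and `β` is stacked in output-major order as described above.

```rust
/// Reference-coded softmax for one row. `eta` holds the K-1 active linear
/// predictors; the reference class K-1 has its predictor fixed at 0.
/// Returns all K class probabilities (reference class last).
fn reference_softmax(eta: &[f64]) -> Vec<f64> {
    // Shift by max(0, max eta) so that no exponential overflows.
    let m = eta.iter().cloned().fold(0.0_f64, f64::max);
    let mut p: Vec<f64> = eta.iter().map(|e| (e - m).exp()).collect();
    p.push((-m).exp()); // reference class: eta = 0
    let z: f64 = p.iter().sum();
    p.iter().map(|v| v / z).collect()
}

/// Quadratic penalty 0.5 * sum_a lambda_a * beta_a^T S beta_a.
/// `beta` is stacked output-major (block a occupies a*p .. (a+1)*p) and
/// `s` is a row-major p x p matrix shared by every class block.
fn penalty(beta: &[f64], s: &[f64], lambdas: &[f64], p: usize) -> f64 {
    let mut total = 0.0;
    for (a, lam) in lambdas.iter().enumerate() {
        let block = &beta[a * p..(a + 1) * p];
        let mut quad = 0.0;
        for i in 0..p {
            for j in 0..p {
                quad += block[i] * s[i * p + j] * block[j];
            }
        }
        total += 0.5 * lam * quad;
    }
    total
}
```

Because the reference predictor is pinned at zero, adding a constant to every active predictor changes the probabilities, which is why no further sum-to-zero constraint is needed.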
§Layering
- Fixed-λ inner solve: fit_penalized_multinomial is the canonical coefficient-space Newton solver at given smoothing parameters λ, built on the shared crate::penalized_vector_glm engine.
- REML / LAML smoothing-parameter selection: fit_penalized_multinomial_formula routes through crate::custom_family::fit_custom_family_with_rho_prior so the per-active-class λ_a are selected by the outer REML/LAML loop; the caller’s init_lambda is only a warm-start seed. The multinomial CustomFamily impl (crate::multinomial_reml::MultinomialFamily) calls the fixed-λ math above as its inner solve at each ρ trial and supplies the dense per-row Hessian block for the outer trace terms.
- Formula → design integration: build_formula_design_for_multinomial parses the Wilkinson formula and assembles X and the per-term S blocks; the fit_multinomial_formula_pyfunc FFI shim wires the Python gamfit.fit(..., family='multinomial') entry straight to this path.
§Convergence
The damped-Newton-with-backtracking scaffold is implemented once, in the shared
crate::penalized_vector_glm engine. At each iteration the assembled penalized
Hessian H + diag(λ_0, …, λ_{K-2}) ⊗ S (block-diagonal, with λ_a S in block a)
is factored via faer’s symmetric-PD-with-fallback path and the full Newton step
δ = −(H + diag(λ_0, …, λ_{K-2}) ⊗ S)^{-1} ∇F is computed, where F is the
penalized objective. If the objective fails to decrease, the step is halved
(up to a small backtracking budget). The convergence test is the relative
coefficient step norm ‖δ‖ / (1 + ‖β‖) ≤ tol, matching the existing pyffi
reference path. This module is the softmax adapter over that engine: it
supplies the dense (K-1)×(K-1) Fisher block, the residual, and the
log-likelihood through MultinomialLogitLikelihood, and owns the
class-count / simplex preconditions. The independent-binomial sibling
crate::binomial_multi is the same engine with a row-diagonal
Fisher block instead.
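
The iteration described above can be sketched as follows. This is an illustrative stand-alone version, not the engine's code: `damped_newton`, `solve_dense`, and `norm2` are hypothetical names, plain Gaussian elimination stands in for the faer factorization, and the convergence test is assumed to use the accepted (possibly halved) step measured against the pre-update coefficients.

```rust
/// Solve the row-major n x n system `a x = b` by Gaussian elimination with
/// partial pivoting (a stand-in for a proper symmetric factorization).
fn solve_dense(mut a: Vec<f64>, mut b: Vec<f64>) -> Vec<f64> {
    let n = b.len();
    for col in 0..n {
        // Choose the row with the largest pivot magnitude in this column.
        let mut piv = col;
        for r in col + 1..n {
            if a[r * n + col].abs() > a[piv * n + col].abs() {
                piv = r;
            }
        }
        if piv != col {
            for c in 0..n {
                a.swap(col * n + c, piv * n + c);
            }
            b.swap(col, piv);
        }
        for r in col + 1..n {
            let f = a[r * n + col] / a[col * n + col];
            for c in col..n {
                a[r * n + c] -= f * a[col * n + c];
            }
            b[r] -= f * b[col];
        }
    }
    // Back substitution on the upper-triangular system.
    let mut x = vec![0.0; n];
    for r in (0..n).rev() {
        let mut acc = b[r];
        for c in r + 1..n {
            acc -= a[r * n + c] * x[c];
        }
        x[r] = acc / a[r * n + r];
    }
    x
}

fn norm2(v: &[f64]) -> f64 {
    v.iter().map(|x| x * x).sum::<f64>().sqrt()
}

/// Damped Newton with step halving. `grad_hess` returns the gradient and the
/// row-major penalized Hessian at `beta`. Returns (beta, converged).
fn damped_newton<F, G>(
    objective: F,
    grad_hess: G,
    mut beta: Vec<f64>,
    tol: f64,
    max_iter: usize,
    max_halvings: usize,
) -> (Vec<f64>, bool)
where
    F: Fn(&[f64]) -> f64,
    G: Fn(&[f64]) -> (Vec<f64>, Vec<f64>),
{
    for _ in 0..max_iter {
        let (g, h) = grad_hess(beta.as_slice());
        // Full Newton step: delta = -H^{-1} grad.
        let neg_g: Vec<f64> = g.iter().map(|v| -v).collect();
        let delta = solve_dense(h, neg_g);
        let f0 = objective(beta.as_slice());
        let mut step = 1.0;
        let mut trial: Vec<f64> =
            beta.iter().zip(&delta).map(|(b, d)| b + step * d).collect();
        let mut halvings = 0;
        // Halve the step while the objective fails to decrease. Once the
        // budget is exhausted this sketch accepts the last trial anyway.
        while objective(trial.as_slice()) > f0 && halvings < max_halvings {
            step *= 0.5;
            halvings += 1;
            trial = beta.iter().zip(&delta).map(|(b, d)| b + step * d).collect();
        }
        // Relative coefficient step norm: ||step * delta|| / (1 + ||beta||).
        let rel = step * norm2(&delta) / (1.0 + norm2(&beta));
        beta = trial;
        if rel <= tol {
            return (beta, true);
        }
    }
    (beta, false)
}
```

The `1 +` in the denominator keeps the test meaningful when the coefficients are near zero, where a purely relative criterion would never be satisfied.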
Structs§
- MultinomialFitInputs: Inputs to fit_penalized_multinomial.
- MultinomialFitOutputs: Outputs of fit_penalized_multinomial.
- MultinomialSavedModel: Saved-model payload for a multinomial fit driven by a Wilkinson formula.
- MultinomialSmoothSignificance: One row of the multinomial smooth-significance table (#1101): the Wood rank-truncated Wald test for one (active class, smooth term) pair.
- MultinomialSmoothTermSpan: One smooth term’s coefficient span within a class block, plus its unpenalized nullspace dimension and a display label (#1101). The Wald smooth-significance test in summary() slices the joint covariance / influence at a·P + col_start .. a·P + col_end for active class a.
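
The span arithmetic used for slicing can be illustrated with a short sketch. `term_range` and `sub_block` are hypothetical helpers, not items of this module, and the joint covariance is assumed here to be a row-major square matrix over the stacked output-major coefficients.

```rust
use std::ops::Range;

/// Index range of one smooth term inside active class `a`'s block of the
/// stacked coefficient vector: a*P + col_start .. a*P + col_end.
fn term_range(a: usize, p: usize, col_start: usize, col_end: usize) -> Range<usize> {
    a * p + col_start..a * p + col_end
}

/// Copy the square sub-block at `range` out of a row-major dim x dim matrix.
fn sub_block(m: &[f64], dim: usize, range: Range<usize>) -> Vec<f64> {
    let mut out = Vec::with_capacity(range.len() * range.len());
    for i in range.clone() {
        for j in range.clone() {
            out.push(m[i * dim + j]);
        }
    }
    out
}
```

For example, with P = 5 a term spanning columns 2..4 in active class 1 occupies stacked indices 7..9.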
Functions§
- fit_penalized_multinomial: Fit a penalized multinomial-logit GAM at fixed λ.
- fit_penalized_multinomial_formula: Top-level formula-driven multinomial fit.
- predict_multinomial_formula: Replay the saved termspec to build the predict-time design on a fresh dataset, then evaluate softmax probabilities. The predict dataset must carry the same feature columns the training data did, matched by name; it need not reproduce the training column order, and in particular need not carry the response column (prediction is for label-free new data).
- predict_multinomial_formula_with_se: Predict class probabilities and delta-method per-class probability standard errors for a saved multinomial model on fresh data (#1101). Replays the saved termspec to build the predict design exactly as predict_multinomial_formula, then applies the softmax-Jacobian delta method against the stored joint posterior covariance. Returns (probs (N,K), prob_se (N,K) | None); prob_se is None for a legacy model fitted before covariance was surfaced.
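
The softmax-Jacobian delta method mentioned above can be sketched for a single row as follows. This is an assumption-laden illustration rather than the module's implementation: `softmax_prob_se` is a hypothetical name, and it takes the (K-1)×(K-1) covariance of the active linear predictors for that row as given (in practice that matrix would be derived from the joint coefficient covariance and the row of the design).

```rust
/// Delta-method standard errors for the K class probabilities of one row.
/// `probs` has length K (reference class last); `cov_eta` is the row-major
/// (K-1) x (K-1) covariance of the active linear predictors for that row.
fn softmax_prob_se(probs: &[f64], cov_eta: &[f64]) -> Vec<f64> {
    let k = probs.len();
    let m = k - 1; // number of active (non-reference) classes
    (0..k)
        .map(|c| {
            // Jacobian row: d p_c / d eta_a = p_c * (1[c == a] - p_a).
            let j: Vec<f64> = (0..m)
                .map(|a| probs[c] * ((if c == a { 1.0 } else { 0.0 }) - probs[a]))
                .collect();
            // Variance of p_c is the quadratic form j^T cov_eta j.
            let mut var = 0.0;
            for a in 0..m {
                for b in 0..m {
                    var += j[a] * cov_eta[a * m + b] * j[b];
                }
            }
            // Clamp tiny negative round-off before the square root.
            var.max(0.0).sqrt()
        })
        .collect()
}
```

The reference class has no predictor of its own, so its Jacobian row contains only the −p_c · p_a terms; its standard error is nonetheless nonzero because its probability moves whenever any active predictor does.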