Module anova_atom

Post-fit functional-ANOVA carve of a fitted product-manifold atom (#975).

§The carving problem

Two circular attributes in superposition (weekday θ₁, month θ₂) trace a torus in activation space. Is that ONE T² atom or TWO superposed S¹ atoms? Reconstruction cannot tell — same surface — so a learner without a principled criterion carves arbitrarily and “the dictionary” is an artifact of the carve. The GAM-native answer is functional ANOVA over the product manifold:

  g(θ₁, θ₂) = g₀ + f₁(θ₁) + f₂(θ₂) + f₁₂(θ₁, θ₂)

with sum-to-zero centering against the EMPIRICAL CODE MEASURE (the averaging measure is itself a gauge choice; we pin it to the code sample and say so). Then superposition = additivity (f₁₂ ≡ 0 ⇔ the torus IS two superposed circles, and fission along ANOVA lines is lossless) and binding = interaction (f₁₂ ≠ 0 is genuine joint structure; the atom is irreducible).
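The decomposition can be illustrated on the simplest possible measure. The sketch below is not this module's code path: it assumes the surface is sampled on a FULL grid of the two angles, so the empirical measure is a uniform product measure and the centering reduces to row and column means. The names `GridAnova`, `grid_anova` and `sample_torus` are hypothetical, introduced only for this illustration.

```rust
/// Classical two-way ANOVA split of a scalar surface sampled on a full grid.
/// Assumption: every (theta1_i, theta2_j) pair is present exactly once.
pub struct GridAnova {
    pub g0: f64,
    pub f1: Vec<f64>,
    pub f2: Vec<f64>,
    pub f12: Vec<Vec<f64>>,
    /// Share of centered surface energy carried by the interaction f12.
    pub interaction_fraction: f64,
}

pub fn grid_anova(g: &[Vec<f64>]) -> GridAnova {
    let n1 = g.len();
    let n2 = g[0].len();
    let total = (n1 * n2) as f64;
    // Grand mean, then row and column means measured relative to it.
    let g0 = g.iter().flatten().sum::<f64>() / total;
    let f1: Vec<f64> = g
        .iter()
        .map(|row| row.iter().sum::<f64>() / n2 as f64 - g0)
        .collect();
    let f2: Vec<f64> = (0..n2)
        .map(|j| g.iter().map(|row| row[j]).sum::<f64>() / n1 as f64 - g0)
        .collect();
    // Whatever the additive part does not explain is the interaction.
    let mut f12 = vec![vec![0.0; n2]; n1];
    let (mut e_int, mut e_tot) = (0.0, 0.0);
    for i in 0..n1 {
        for j in 0..n2 {
            let r = g[i][j] - g0 - f1[i] - f2[j];
            f12[i][j] = r;
            e_int += r * r;
            e_tot += (g[i][j] - g0).powi(2);
        }
    }
    let interaction_fraction = if e_tot > 0.0 { e_int / e_tot } else { 0.0 };
    GridAnova { g0, f1, f2, f12, interaction_fraction }
}

/// Sample a surface on an n x n grid of angles covering one full period each.
pub fn sample_torus(n: usize, surface: impl Fn(f64, f64) -> f64) -> Vec<Vec<f64>> {
    let step = 2.0 * std::f64::consts::PI / n as f64;
    (0..n)
        .map(|i| {
            (0..n)
                .map(|j| surface(i as f64 * step, j as f64 * step))
                .collect::<Vec<f64>>()
        })
        .collect()
}
```

On this grid an additive surface such as sin θ₁ + cos θ₂ leaves an interaction fraction at roundoff level, while sin(θ₁ + θ₂) has vanishing main effects and puts essentially all of its centered energy into f₁₂.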

§Why not just covariance in activations?

Covariance is a second-moment statistic of the POINT CLOUD; the carve question is about the FUNCTIONAL FACTORIZATION of the surface. A bound torus and two superposed circles can trace the same point set with the same second moments — covariance sees the embedding, not whether the decoder map factors additively through the two angles. Independence of the codes (θ₁ ⫫ θ₂) is a third, separate property: codes can be dependent while the decoder is perfectly additive, and vice versa. Only the ANOVA interaction block answers “one atom or two”.

§Two inequivalent binding notions (both first-class here)

Representational binding: non-additivity of the DECODER g — does the surface embed as two superposed atoms?
Computational binding: non-additivity of the pulled-back READOUT h(θ₁,θ₂) = F(g(θ₁,θ₂)) (logit jets through the forward map, #980) — does the model USE the two angles jointly?

All four quadrants occur. Independent steerability (“turn the weekday knob without dragging month behavior”) requires additivity in BOTH senses, so the carve decision distinguishes them explicitly (FissionDecision). The same machinery runs twice, once on the decoder coefficients and once on the readout-pulled-back coefficients, and a decision made with only the representational arm is reported as such, never silently.
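A minimal sketch of what a joint, three-valued adjudication could look like. This is a guess at the shape only: the enum `JointCall`, its variant names and the function `joint_call` are hypothetical and do not mirror the real FissionDecision variants, which this page does not list. The one property taken from the text is that a split decided without the computational arm is labeled as such.

```rust
/// Hypothetical stand-in for a joint decision; NOT the crate's FissionDecision.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum JointCall {
    /// Both arms ran and both found the atom additive.
    SplitBothArms,
    /// Only the representational arm ran; the split is flagged, not silent.
    SplitRepresentationalOnly,
    /// At least one arm did not find additivity: the atom stays whole.
    KeepWhole,
}

/// `computational_additive` is `None` when no readout-pulled-back
/// coefficients were available (an assumption of this sketch).
pub fn joint_call(
    representational_additive: bool,
    computational_additive: Option<bool>,
) -> JointCall {
    match (representational_additive, computational_additive) {
        (true, Some(true)) => JointCall::SplitBothArms,
        (true, None) => JointCall::SplitRepresentationalOnly,
        // The off-diagonal quadrants and the doubly-bound case all keep the atom.
        _ => JointCall::KeepWhole,
    }
}
```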

§Not everything is clean — the quantitative dial

A real model can be sort-of-bound: f₁₂ small but nonzero, or binding present in the readout but not the embedding. The carve therefore never emits a bare verdict: CarveReport::interaction_fraction is the fraction of (centered) surface energy carried by the interaction — a continuous “how bound” number — and the planted-partial-binding power curve lives on exactly this dial. The binding test rejects when the data PROVES f₁₂ ≠ 0; fission additionally demands the interaction be energetically negligible, because absence of evidence is not evidence of absence. Atoms failing both stay whole and CONTESTED — the demote-never-reject philosophy: the claim goes to the evidence ledger (structure_evidence::ClaimKind::BindingEdge, p-value calibrated via structure_evidence::log_e_from_p_calibrator) and earns a probe budget, instead of a silent carve either way.
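The two-gate rule described above can be sketched for a single binding notion as follows. The enum `Outcome`, the function `per_notion_outcome` and the significance level `alpha` are assumptions made for illustration; the 1e-4 bar is the value quoted for FISSION_MAX_INTERACTION_FRACTION on this page. The precedence given to the binding test when the interaction is small but significant is also an assumption of this sketch.

```rust
/// Value quoted for FISSION_MAX_INTERACTION_FRACTION on this page.
pub const MAX_INTERACTION_FRACTION: f64 = 1e-4;

/// Hypothetical per-notion outcome; not a type of this crate.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Outcome {
    /// The binding test rejected additivity: the data prove f12 != 0.
    Bound,
    /// No proof of binding AND the interaction is energetically negligible.
    Fissionable,
    /// Neither gate passed: the atom stays whole and the claim is logged.
    Contested,
}

/// `p_value` comes from the binding test, `interaction_fraction` from the
/// carve report; `alpha` is a hypothetical significance level.
pub fn per_notion_outcome(p_value: f64, interaction_fraction: f64, alpha: f64) -> Outcome {
    let proven_bound = p_value < alpha;
    let negligible = interaction_fraction <= MAX_INTERACTION_FRACTION;
    if proven_bound {
        // A small but significant interaction still counts as binding here.
        Outcome::Bound
    } else if negligible {
        Outcome::Fissionable
    } else {
        // Absence of evidence is not evidence of absence.
        Outcome::Contested
    }
}
```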

§Post-fit by design

This module is a PURE READ of a fitted tensor-product decoder: the caller supplies the factor bases evaluated on the code sample and the per-output-dim coefficient matrices (plus, optionally, their posterior covariance for the Wald test). It deliberately does NOT add an in-fit ANOVA basis kind: two independent circles are just two atoms summing — ordinary superposition, the default multi-atom model — so the product machinery is only ever needed at the moment a fitted pair shows dependent codes and the structure search must adjudicate merge-vs-keep. That adjudication consumes this carve.

§The gauge inside the test (load-bearing)

On a partition-of-unity factor basis (B-splines: Σ_j φ_j ≡ 1) the empirically centered basis functions φ̃_j = φ_j − mean_n φ_j(θ_n) carry one exact linear dependence per factor: Σ_j φ̃_j ≡ 0. With u_i the dependence vector of factor i, the coefficient directions u₁ vᵀ + w u₂ᵀ (v and w arbitrary) change NOTHING about f₁₂: they are pure gauge, their posterior values are penalty-set noise, and a Wald statistic that includes them is wrong. The binding test therefore projects the interaction block onto the gauge quotient (C ↦ P₁ C P₂, P_i = I − û_i û_iᵀ) before testing; the quotient dimension (M₁−1)(M₂−1) is the test’s honest rank.
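The projection C ↦ P₁ C P₂ can be written without forming the projectors. The sketch below assumes one dependence vector per factor (called `u1` and `u2` here; all-ones for a partition-of-unity basis) and uses plain nested vectors; `gauge_project` and `quotient_rank` are hypothetical names, not functions of this module.

```rust
/// Normalize a dependence vector to unit length.
fn unit(u: &[f64]) -> Vec<f64> {
    let norm = u.iter().map(|x| x * x).sum::<f64>().sqrt();
    u.iter().map(|x| x / norm).collect()
}

/// Compute P1 * C * P2 with P_i = I - h_i h_i^T, where h_i is the unit
/// dependence vector of factor i. Expanded form:
///   C - h1 (h1^T C) - (C h2) h2^T + h1 (h1^T C h2) h2^T
pub fn gauge_project(c: &[Vec<f64>], u1: &[f64], u2: &[f64]) -> Vec<Vec<f64>> {
    let (h1, h2) = (unit(u1), unit(u2));
    let (m1, m2) = (h1.len(), h2.len());
    // a = h1^T C  (length m2)
    let a: Vec<f64> = (0..m2)
        .map(|j| (0..m1).map(|i| h1[i] * c[i][j]).sum::<f64>())
        .collect();
    // b = C h2  (length m1)
    let b: Vec<f64> = (0..m1)
        .map(|i| (0..m2).map(|j| c[i][j] * h2[j]).sum::<f64>())
        .collect();
    // s = h1^T C h2  (scalar)
    let s: f64 = (0..m1).map(|i| h1[i] * b[i]).sum();
    (0..m1)
        .map(|i| {
            (0..m2)
                .map(|j| c[i][j] - h1[i] * a[j] - b[i] * h2[j] + h1[i] * s * h2[j])
                .collect::<Vec<f64>>()
        })
        .collect()
}

/// Dimension of the gauge quotient for an M1 x M2 interaction block.
pub fn quotient_rank(m1: usize, m2: usize) -> usize {
    (m1 - 1) * (m2 - 1)
}
```

Adding any multiple of a gauge direction to C leaves the projected block unchanged, which is exactly why the test statistic should be computed after the projection.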

Structs§

AnovaBlocks: The exact ANOVA reparameterization of one output dimension’s tensor coefficient matrix C (M₁ × M₂) under empirical-measure centering. With m_i the empirical mean of factor i’s basis over the code sample and φ̃ = φ − m, the surface decomposes EXACTLY (an identity, not an approximation):
CarveInput: Inputs for one notion’s carve over one fitted product atom.
CarveReport: What the carve concluded for one binding notion.
ChildDecoder: One child atom’s 1-D decoder for one output dimension, expressed on the CENTERED factor basis plus an explicit constant — basis-agnostic, no partition-of-unity assumption baked in. The child surface is constant + φ̃(θ)ᵀ·centered_coeffs.
FissionPlan: The lossless-on-the-additive-part split: child atoms inheriting the main-effect blocks. Gauge choice (documented, fixed): the grand mean g₀ rides with child A; child B is centered. The interaction energy the split discards is DECLARED in reconstruction_defect — by the fission rule it is ≤ FISSION_MAX_INTERACTION_FRACTION, but it is never silently zero.
FittedAtomCarveInput: The real-fit producer of a representational CarveInput from a fitted d = 2 product atom (#993).
PairSurfaceFit: A pair-component fit from RAW coordinates: the factor bases it was fit on (the grid engine’s per-axis uniform cubic B-splines, evaluated on the sample — exactly what CarveInput consumes, one measure end to end) plus the TensorSurfaceFit carve product and which backend produced it.
TensorSurfaceFit: A penalized tensor-surface fit over the code sample: the producer of CarveInputs for BOTH binding notions (#993 items 1–2).
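The exactness claimed for AnovaBlocks follows from substituting φ = φ̃ + m into φ¹ᵀ C φ²: the product expands into a constant m₁ᵀ C m₂, two main effects φ̃¹ᵀ (C m₂) and φ̃²ᵀ (Cᵀ m₁), and the interaction φ̃¹ᵀ C φ̃². The sketch below checks that algebra numerically. It is not the crate's implementation: `column_means`, `Blocks`, `split_blocks` and the other helpers are hypothetical names, and the actual field layout of AnovaBlocks may differ.

```rust
fn dot(a: &[f64], b: &[f64]) -> f64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// x^T C y for a row-major matrix C.
pub fn bilinear(x: &[f64], c: &[Vec<f64>], y: &[f64]) -> f64 {
    x.iter().zip(c).map(|(xi, row)| xi * dot(row, y)).sum()
}

/// Empirical mean of each basis column over the sample (rows = sample points).
pub fn column_means(phi: &[Vec<f64>]) -> Vec<f64> {
    let n = phi.len() as f64;
    (0..phi[0].len())
        .map(|j| phi.iter().map(|row| row[j]).sum::<f64>() / n)
        .collect()
}

/// Subtract the centering vector from one evaluated basis row.
pub fn centered(phi_row: &[f64], m: &[f64]) -> Vec<f64> {
    phi_row.iter().zip(m).map(|(p, q)| p - q).collect()
}

/// Constant and main-effect coefficients on the CENTERED bases.
pub struct Blocks {
    pub g0: f64,         // m1^T C m2
    pub main1: Vec<f64>, // C m2, pairs with the centered factor-1 basis
    pub main2: Vec<f64>, // C^T m1, pairs with the centered factor-2 basis
}

pub fn split_blocks(c: &[Vec<f64>], m1: &[f64], m2: &[f64]) -> Blocks {
    let main1: Vec<f64> = c.iter().map(|row| dot(row, m2)).collect();
    let main2: Vec<f64> = (0..m2.len())
        .map(|j| c.iter().zip(m1).map(|(row, a)| a * row[j]).sum::<f64>())
        .collect();
    let g0 = dot(m1, &main1);
    Blocks { g0, main1, main2 }
}

/// g0 + f1 + f2 + f12 at one point, given the centered basis rows t1, t2.
pub fn reassemble(b: &Blocks, c: &[Vec<f64>], t1: &[f64], t2: &[f64]) -> f64 {
    b.g0 + dot(t1, &b.main1) + dot(t2, &b.main2) + bilinear(t1, c, t2)
}
```

Because each centered basis has zero mean over the sample, both main effects average to zero over the code sample by construction, whatever the coefficient matrix is.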

Enums§

BindingNotion: Which binding notion a carve report speaks about (see module docs; the two are independent and a complete adjudication runs both).
FissionDecision: The joint adjudication over both notions — three-valued on purpose: the representational and computational carves differ exactly on the off-diagonal quadrants, so collapsing them silently is the one forbidden move.
PairSurfaceBackend: Which estimator produced a PairSurfaceFit.

Constants§

FISSION_MAX_INTERACTION_FRACTION: Interaction energy fraction at or below which the interaction block is energetically negligible and lossless fission is on the table. The bar is the finite-sample NOISE FLOOR of the interaction estimate, not exact algebraic zero. A planted, exactly-additive coefficient matrix carves to numerical zero (≈ f64 roundoff), but a real REML fit of a genuinely separable surface over noisy scattered codes cannot drive its penalized interaction block below the variance its own estimator injects: a 5%-noise pair fit lands at ~1e-4 of centered surface energy (a relative amplitude of 1e-2, ≈ √fraction). 1e-4 sits just above that estimator floor so a separable atom actually fissions end to end (the production fit_pair_surface → carve path, which the planted in-module tests do not exercise), while staying far below any genuine interaction — the bound panels carry fractions orders of magnitude larger, and the companion binding Wald test resolves small-but-real interactions besides. Auto-applied — no knob.

Functions§

anova_blocks: The exact reparameterization (see AnovaBlocks).
basis_means: Empirical mean of each basis column over the code sample — the centering vector m that pins the ANOVA gauge to the empirical code measure.
carve: The carve: exact ANOVA split, interaction energy, gauge-projected binding test, and the fission plan when this notion permits one.
carve_input_from_fitted_atom: Build the representational carve inputs for a fitted d = 2 product atom directly from its FUSED tensor basis and decoder (#993).
fission_decision: Joint adjudication across the two binding notions (see FissionDecision). representational must be a BindingNotion::Representational report; computational, when the #980 pulled-back coefficients were available, the matching BindingNotion::Computational one.
fit_pair_surface: THE pair-component estimator (#1031): fit the Layer-B ANOVA pair interaction surface from RAW coordinates, auto-routed with no knobs.
fit_tensor_surface: Fit the tensor-product surface y_d(θ₁,θ₂) ≈ φ¹(θ₁)ᵀ C_d φ²(θ₂) to sampled responses by ridge-penalized least squares with the ridge strength chosen by GAUSSIAN REML (profiled σ², exact 1-D criterion on the design’s eigenbasis — no GCV, per policy), returning coefficients AND their scale-included posterior covariance.