#976 — evidence-guarded dictionary structure search: atom birth / death /
fission / fusion as anytime-valid hypothesis tests, with a deterministic,
serializable SearchLedger as the honesty surface.
§What this is
The two documented SAE pathologies, restated statistically:
- Feature absorption (an A⇒B hierarchy makes sparsity fold B’s content into A’s direction): an absorbing atom’s code distribution carries substructure — detectable misspecification, found by a within-atom audit and corrected by a FISSION move.
- Feature shattering (one curved family smeared across many near-duplicate flat atoms): shattered atoms have dependent codes (gam_sae::atom_codes::CoactivationStats::dependence) and joint structure when refit together, corrected by a FUSION move.
This module owns the MOVE ENGINE: canonical deterministic proposal order,
structural-hash deduplication, e-process-gated acceptance, and the ledger.
It is generic over the fitter — the caller supplies the state type and four
closures (apply / evaluate / null-sup / refit), exactly the surface
run_atom_birth_gate already pins down. Warm structure inheritance is
enforced by construction: a candidate state is built FROM the parent state
(apply_move(&parent, &mv)), never from scratch — cold restarts after
structure moves are both slow and collapse-prone, so the API gives them no
entry point.
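The warm-inheritance contract above can be sketched as a round driver that is generic over the state type and only ever builds a candidate from its parent. This is a minimal illustration, not this module's actual signature: Move, run_round, the closure roles, and the single log-threshold comparison are all hypothetical stand-ins (the real gate is the e-process described in the next section).

```rust
/// Hypothetical move type; the real StructureMove carries more detail.
#[derive(Clone, Debug, PartialEq)]
pub enum Move {
    Birth,
    Death(usize),
}

/// One round over `moves`, generic over the caller's state type `S`.
/// Assumed closure roles (illustrative only):
/// - `apply`: build a candidate FROM the parent state (warm start);
/// - `evaluate`: log-likelihood of the candidate on held-out data;
/// - `null_sup`: supremum of the null (current-class) log-likelihood;
/// - `refit`: polish a candidate that has been accepted.
pub fn run_round<S, A, E, N, R>(
    mut state: S,
    moves: &[Move],
    log_threshold: f64,
    apply: A,
    evaluate: E,
    null_sup: N,
    refit: R,
) -> (S, Vec<bool>)
where
    A: Fn(&S, &Move) -> S,
    E: Fn(&S) -> f64,
    N: Fn(&S) -> f64,
    R: Fn(S) -> S,
{
    let mut accepted = Vec::with_capacity(moves.len());
    for mv in moves {
        // The only way to obtain a candidate is from the current parent:
        // there is no constructor that starts from scratch.
        let candidate = apply(&state, mv);
        let log_e = evaluate(&candidate) - null_sup(&state);
        if log_e >= log_threshold {
            state = refit(candidate);
            accepted.push(true);
        } else {
            // Rejected: the parent state is left untouched.
            accepted.push(false);
        }
    }
    (state, accepted)
}
```

Because the driver never sees a constructor for S, a cold restart is not expressible through this surface; the caller can only transform the state it was handed.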
§Acceptance is a hypothesis test, not a threshold (#984)
The original #976 design accepted a move when
Δ(neg log evidence) < −margin under the Laplace normalizer. That is the
K vs K+1 boundary / Davies-regime comparison where likelihood-ratio
thresholds are invalid (the null sits on the boundary of the alternative;
the new atom’s parameters vanish under the null). Acceptance here is
therefore routed through the universal-inference e-process gates of
gam_terms::inference::structure_evidence:
- Birth / fission / fusion each assert structure BEYOND what the current dictionary class expresses, so each runs an [AtomBirthGate] (the mechanics are claim-generic: predictable alternative, honest null sup, Ville threshold at the α fixed in MoveBudget). A move is applied only when its claim is Certified; otherwise the structure is unchanged and the claim stays Contested in the StructureLedger with its banked evidence, which is the input to the #984 probe-design loop.
- Death is never certifiable, by construction. The K−1 class is nested inside the current class, so the split-likelihood e-value satisfies E ≤ 1 pointwise (the null sup dominates any sub-model fit): no amount of data can prove an atom unnecessary; it can only fail to prove it necessary. The demote-never-reject philosophy is therefore not a policy choice here, it is what the math leaves: a death proposal DEMOTES an atom whose AtomExists claim has never certified (trigger: diverged ARD precision), and is VETOED for a certified atom (a Ville crossing is permanent; later evidence retreat cannot un-prove existence).
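The two rules above (certify on a Ville crossing at 1/α, and never certify a death) can be illustrated with a toy e-process kept in log space. This is a sketch under stated assumptions: EProcess, ClaimStatus, DeathVerdict, and death_verdict are hypothetical names chosen to mirror the prose, not the types in gam_terms::inference::structure_evidence, and the per-batch log e-values are taken as given rather than computed from a split likelihood.

```rust
/// Status of a structural claim (names mirror the prose; the type is hypothetical).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ClaimStatus {
    Contested,
    Certified,
}

/// Toy e-process: a running product of e-values, stored as a sum of logs.
#[derive(Clone, Debug)]
pub struct EProcess {
    log_e: f64,
    log_threshold: f64, // ln(1 / alpha), the Ville threshold
    certified: bool,
}

impl EProcess {
    pub fn new(alpha: f64) -> Self {
        assert!(alpha > 0.0 && alpha < 1.0, "alpha must lie in (0, 1)");
        Self { log_e: 0.0, log_threshold: -alpha.ln(), certified: false }
    }

    /// Bank one batch's log e-value and return the updated status.
    pub fn update(&mut self, log_e_batch: f64) -> ClaimStatus {
        self.log_e += log_e_batch;
        if self.log_e >= self.log_threshold {
            // A crossing is permanent: later retreat does not undo it.
            self.certified = true;
        }
        self.status()
    }

    pub fn status(&self) -> ClaimStatus {
        if self.certified { ClaimStatus::Certified } else { ClaimStatus::Contested }
    }

    /// Evidence banked so far (kept even while the claim stays Contested).
    pub fn banked_log_e(&self) -> f64 {
        self.log_e
    }
}

/// Outcome of a death proposal in this sketch.
#[derive(Debug, PartialEq)]
pub enum DeathVerdict {
    Demoted,
    Vetoed,
}

/// A death proposal never runs its own gate (its e-value cannot exceed 1);
/// it only consults the existence claim of the atom it targets.
pub fn death_verdict(atom_exists: &EProcess) -> DeathVerdict {
    match atom_exists.status() {
        ClaimStatus::Certified => DeathVerdict::Vetoed,
        ClaimStatus::Contested => DeathVerdict::Demoted,
    }
}
```

With α = 0.05 the threshold is ln 20 ≈ 3.0, so batches contributing 1.0, 1.0, and 1.5 certify on the third update, and a later batch of −5.0 lowers the banked evidence without revoking the certificate.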
§Determinism
No RNG, no clock. Proposals are sorted by the canonical order (deaths by
ARD precision descending, fissions by audit significance ascending, fusions
by code dependence descending, births last by proposal mass descending; ties
broken by structural hash), deduplicated by the caller-computed structural
hash (the TermCollectionSpec hash machinery, #869), and processed
sequentially. Identical inputs ⇒ identical serialized SearchLedger —
which is what keeps replicate-null comparisons (#910/#943) valid across
structure changes.
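The ordering rule above can be sketched with a three-level comparator followed by a first-wins dedup. Kind, Proposal, and canonical_order_sketch are hypothetical simplifications (the real MoveProposal also carries the move and its claim), and the trigger is assumed to be a single f64 per proposal.

```rust
use std::collections::HashSet;

/// Kind rank follows declaration order: deaths, fissions, fusions, births.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub enum Kind {
    Death,
    Fission,
    Fusion,
    Birth,
}

/// Simplified proposal: kind, kind-specific trigger statistic, and the
/// caller-computed structural hash of the post-move specification.
#[derive(Clone, Debug, PartialEq)]
pub struct Proposal {
    pub kind: Kind,
    pub trigger: f64,
    pub structural_hash: u64,
}

/// Sort into canonical order, then drop later proposals whose structural
/// hash was already seen. No RNG and no clock are involved.
pub fn canonical_order_sketch(proposals: &mut Vec<Proposal>) {
    proposals.sort_by(|a, b| {
        a.kind
            .cmp(&b.kind)
            .then_with(|| match a.kind {
                // Fissions: audit significance ascending.
                Kind::Fission => a.trigger.total_cmp(&b.trigger),
                // Deaths, fusions, births: trigger descending.
                _ => b.trigger.total_cmp(&a.trigger),
            })
            // Ties broken by structural hash.
            .then_with(|| a.structural_hash.cmp(&b.structural_hash))
    });
    // The set is used only for membership tests, never iterated, so its
    // internal ordering cannot leak into the result.
    let mut seen = HashSet::new();
    proposals.retain(|p| seen.insert(p.structural_hash));
}
```

Using f64::total_cmp gives a total order even in the presence of NaN triggers, which keeps the sort itself from becoming a source of nondeterminism.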
The ledger reports a certified local mode: the moves explored, the evidence for accepted ones, and the evidence gaps to rejected alternatives. No global-optimality theater.
Structs§
- CollapseEvent - An assignment-collapse event from the joint fit (#976 Layer-1 guard): an atom’s support fell below the active-mass floor and was either re-seeded (bounded budget) or recorded as terminally collapsed. It is an observable event, never a silent death and never a fit error. Terminal collapses are the natural death-proposal feed for the next search round.
- MoveBudget - The search round’s budget and error level.
- MoveProposal - One proposal: the move, its trigger statistic (the canonical-order key, kind-specific; see StructureMove docs), the caller-computed structural hash of the POST-move specification (dedup key), and the structural claim the move asserts (registered in the StructureLedger so the dictionary certificate covers it).
- MoveRecord - One ledger line: the proposal exactly as ranked, plus its verdict.
- SearchLedger - The serialized honesty surface of one search round: every proposal in canonical order with its verdict, plus any collapse events the joint fit recorded. Identical inputs produce a byte-identical serialization.
- SearchOutcome - Result of one search round: the (possibly restructured) state and the ledger.
Enums§
- CollapseAction - The guard’s response to an active-mass breach.
- MoveVerdict - The per-proposal outcome. Every proposal handed to search gets exactly one record (the no-silent-caps rule).
- StructureMove - One proposed structural move. Atom indices are STABLE IDENTIFIERS for the duration of one search round: the caller’s apply_move must not reindex surviving atoms (mark dead atoms inactive, append born atoms). The engine relies on this to detect conflicting proposals.
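The stable-identifier rule for StructureMove can be sketched with a toy dictionary that marks dead atoms inactive and appends born atoms instead of compacting indices. Dictionary, kill, birth, and conflicts are hypothetical names, and the overlap test is only an assumed stand-in for how the engine detects conflicting proposals.

```rust
/// Toy dictionary state: slot i is atom i for the whole round.
#[derive(Clone, Debug, PartialEq)]
pub struct Dictionary {
    pub active: Vec<bool>,
}

impl Dictionary {
    pub fn with_atoms(k: usize) -> Self {
        Self { active: vec![true; k] }
    }

    /// Death: mark the slot inactive; surviving atoms keep their indices.
    /// Builds the child FROM the parent, which is left unchanged.
    pub fn kill(&self, atom: usize) -> Self {
        let mut next = self.clone();
        if let Some(slot) = next.active.get_mut(atom) {
            *slot = false;
        }
        next
    }

    /// Birth: append a new slot and return its identifier.
    pub fn birth(&self) -> (Self, usize) {
        let mut next = self.clone();
        next.active.push(true);
        let id = next.active.len() - 1;
        (next, id)
    }

    pub fn active_ids(&self) -> Vec<usize> {
        (0..self.active.len()).filter(|&i| self.active[i]).collect()
    }
}

/// Assumed conflict rule: two moves conflict when they touch a common atom.
/// This is only meaningful because indices are never reassigned mid-round.
pub fn conflicts(a: &[usize], b: &[usize]) -> bool {
    a.iter().any(|i| b.contains(i))
}
```

If kill compacted the vector instead, atom 2 would silently become atom 1 after atom 1 died, and a later proposal naming atom 2 would refer to a different atom than the one that was ranked.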
Functions§
- canonical_order - Sort proposals into the canonical deterministic order: kind rank (deaths, fissions, fusions, births), then the kind’s trigger direction, then structural hash. Pure (no RNG, no clock), so the search path, and with it the ledger, is a function of the inputs alone.
- search - Run one evidence-guarded structure-search round.