
Module atom_codes


Per-point sparse atom codes for multi-manifold reconstruction.

This module owns the storage of per-observation soft assignments over a library of K candidate manifold-atoms (see crate::assignment::SaeAssignment for the surrounding selection/gate layer). The two key types are:

  • BitVec — a minimal dependency-free bitset used to record the active support S_n ⊆ {0, …, K−1} of each observation. We avoid pulling in the external bitvec crate to keep this module aligned with the rest of gam’s “no extra deps for new primitives” policy.

  • SparseAtomCode — the per-point pair (active_mask, weights) whose semantics are documented on the type. Reconstruction at point n is

    Ẑ_n  =  Σ_{k ∈ S_n}  w_{n,k}  ·  decoder_k(t_{n,k})

    so weights[k] is meaningful only when active_mask.get(k) == true. We store weights densely (Vec<f64> of length K) rather than sparsely; for the typical SAE workload K is small (tens to low hundreds), and the dense layout lets us reuse ndarray views and simple BLAS-shaped loops downstream. The mask carries the discrete active-set information; the weights carry the soft amplitudes.
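The reconstruction rule above can be sketched as follows. This is a minimal illustration, not the module's actual code: `MaskSketch`, `CodeSketch`, and `reconstruct` are hypothetical names, and the decoder outputs decoder_k(t_{n,k}) are assumed to be already evaluated and passed in as plain vectors of equal length.

```rust
/// Hypothetical stand-in for the module's bitset: one bit per atom,
/// packed into u64 words. Not the real `BitVec` API.
#[derive(Clone, Debug, PartialEq)]
pub struct MaskSketch {
    words: Vec<u64>,
    len: usize,
}

impl MaskSketch {
    /// All-inactive mask over `len` atoms.
    pub fn zeros(len: usize) -> Self {
        Self { words: vec![0; (len + 63) / 64], len }
    }

    /// Mark atom `k` as active (`on == true`) or inactive.
    pub fn set(&mut self, k: usize, on: bool) {
        assert!(k < self.len, "bit index out of range");
        let (word, bit) = (k / 64, k % 64);
        if on {
            self.words[word] |= 1u64 << bit;
        } else {
            self.words[word] &= !(1u64 << bit);
        }
    }

    /// True when atom `k` is in the active support; out-of-range reads as false.
    pub fn get(&self, k: usize) -> bool {
        k < self.len && (self.words[k / 64] >> (k % 64)) & 1 == 1
    }

    /// Size of the active support |S_n|.
    pub fn count_ones(&self) -> usize {
        self.words.iter().map(|w| w.count_ones() as usize).sum()
    }
}

/// Hypothetical per-point code: a mask plus dense weights of length K.
pub struct CodeSketch {
    pub active_mask: MaskSketch,
    pub weights: Vec<f64>,
}

/// Computes the weighted sum over active atoms only.
/// `decoded[k]` is assumed to hold decoder_k(t_{n,k}) for this point.
pub fn reconstruct(code: &CodeSketch, decoded: &[Vec<f64>]) -> Vec<f64> {
    let dim = decoded.first().map_or(0, |v| v.len());
    let mut z_hat = vec![0.0; dim];
    for (k, d_k) in decoded.iter().enumerate() {
        // Weights of inactive atoms are ignored, whatever value they hold.
        if !code.active_mask.get(k) {
            continue;
        }
        for (z, d) in z_hat.iter_mut().zip(d_k) {
            *z += code.weights[k] * d;
        }
    }
    z_hat
}
```

Skipping inactive atoms inside the loop, rather than zeroing their weights up front, mirrors the stated semantics: a dense `weights` vector may carry stale values at inactive positions without affecting the result.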

§Per-point block locality (arrow structure)

Each SparseAtomCode is the per-row ext-coordinate block for observation n restricted to the K atoms. Combined with the per-atom on-manifold coordinate t_{n,k} ∈ ℝ^{d_k} (held in the per-atom latent-coordinate blocks of crate::assignment::SaeAssignment), the row-local ext-coordinate vector is

  ext_n  =  ( a_{n,1..K}  ;  t_{n,1,·}  ;  …  ;  t_{n,K,·} )

whose interaction graph with the shared decoder coefficients B_1..B_K is exactly the arrow / bordered-Hessian pattern from latent_coord.md §2.2. The Schur complement that Piece 1 uses to eliminate β before the per-row solve generalises here with one change: the row-n block now couples to only the active subset S_n of decoder borders, not to all K of them. That is the structural fact this module records.
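To make the block layout concrete, here is a small sketch of how the offsets inside ext_n and the set of coupled decoder borders might be computed. It assumes the blocks of ext_n are stored contiguously in the order written above, which this page does not confirm; `ExtLayout`, `ext_layout`, and `active_borders` are hypothetical names, and the mask is a plain `&[bool]` for brevity.

```rust
use std::ops::Range;

/// Offsets of each block inside
/// ext_n = ( a_{n,1..K} ; t_{n,1,·} ; … ; t_{n,K,·} ),
/// assuming contiguous storage in exactly that order.
pub struct ExtLayout {
    /// Positions of the K amplitude entries a_{n,1..K}.
    pub amplitudes: Range<usize>,
    /// Positions of t_{n,k,·} for each atom k (length d_k).
    pub coords: Vec<Range<usize>>,
    /// Total length K + Σ_k d_k.
    pub len: usize,
}

/// Build the layout from the per-atom coordinate dimensions d_1..d_K.
pub fn ext_layout(dims: &[usize]) -> ExtLayout {
    let k = dims.len();
    let mut start = k; // coordinate blocks begin right after the K amplitudes
    let mut coords = Vec::with_capacity(k);
    for &d in dims {
        coords.push(start..start + d);
        start += d;
    }
    ExtLayout { amplitudes: 0..k, coords, len: start }
}

/// Indices k of the decoder borders B_k that row n couples to,
/// i.e. the active support S_n read off the mask.
pub fn active_borders(active: &[bool]) -> Vec<usize> {
    active
        .iter()
        .enumerate()
        .filter_map(|(k, &on)| on.then_some(k))
        .collect()
}
```

For example, with dimensions (2, 1, 3) and only atoms 0 and 2 active, the row has a 9-entry ext vector but couples to just two of the three decoder borders.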

Structs§

BitVec
Minimal bit-vector. Backing storage is Vec<u64> words.
CoactivationStats
Pairwise co-activation summary for two atoms (see SparseAtomCodes::coactivation). All probabilities are empirical popcount ratios over the active-support masks.
SparseAtomCode
Per-point sparse code over K candidate atoms.
SparseAtomCodes
Storage for the per-row codes of all N observations.
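
As an illustration of what "empirical popcount ratios" could look like, the sketch below estimates marginal and joint activation frequencies for two atoms. The fields of the real CoactivationStats and the signature of SparseAtomCodes::coactivation are not shown on this page, so `CoactSketch`, `pack_column`, and `coactivation` here are hypothetical. The sketch also assumes the per-row masks have been transposed into one bit column per atom (one bit per observation), so the joint count is a word-wise AND followed by a popcount.

```rust
/// Hypothetical summary for a pair of atoms (i, j).
#[derive(Debug, PartialEq)]
pub struct CoactSketch {
    /// Fraction of observations where atom i is active.
    pub p_i: f64,
    /// Fraction of observations where atom j is active.
    pub p_j: f64,
    /// Fraction of observations where both are active.
    pub p_both: f64,
}

/// Pack "atom is active at observation n" flags into u64 words,
/// one bit per observation.
pub fn pack_column(active: &[bool]) -> Vec<u64> {
    let mut words = vec![0u64; (active.len() + 63) / 64];
    for (n, &on) in active.iter().enumerate() {
        if on {
            words[n / 64] |= 1u64 << (n % 64);
        }
    }
    words
}

/// Number of set bits across all words.
fn popcount(words: &[u64]) -> usize {
    words.iter().map(|w| w.count_ones() as usize).sum()
}

/// Empirical activation frequencies from two packed columns over `n_obs` rows.
pub fn coactivation(col_i: &[u64], col_j: &[u64], n_obs: usize) -> CoactSketch {
    assert_eq!(col_i.len(), col_j.len(), "columns must cover the same rows");
    assert!(n_obs > 0, "need at least one observation");
    // Joint support: bits set in both columns.
    let both: usize = col_i
        .iter()
        .zip(col_j)
        .map(|(a, b)| (a & b).count_ones() as usize)
        .sum();
    let n = n_obs as f64;
    CoactSketch {
        p_i: popcount(col_i) as f64 / n,
        p_j: popcount(col_j) as f64 / n,
        p_both: both as f64 / n,
    }
}
```

With four observations where atom i is active at rows 0, 1, 3 and atom j at rows 1, 3, this yields p_i = 0.75, p_j = 0.5, and p_both = 0.5.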