Struct SaeReconstructionRowProgram 

Source
pub struct SaeReconstructionRowProgram {
    pub atoms: Vec<AtomRowBasisJet>,
    pub gate_value: Vec<f64>,
    pub logits: Vec<f64>,
    pub gate_scale: Vec<f64>,
    pub gate_shift: Vec<f64>,
    pub gate: RowGate,
    pub logit_slot: Vec<Option<usize>>,
    pub coord_slot: Vec<Vec<usize>>,
    pub n_primaries: usize,
}
One row of the SAE reconstruction as a jet program: the per-atom basis jets, the gate, the current gate-logit values, and the primary layout that maps (atom logit, atom latent axis) to a seeded tower variable slot.
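The primary layout described above can be pictured as a flat numbering of the tower variables. The following is only a sketch: `PrimaryLayout` and `build_layout` are hypothetical names, and the slot order (each atom's logit first, then its latent axes) is an assumption, since the actual ordering is not stated here.

```rust
/// Hypothetical stand-in for the layout fields of the row program
/// (`logit_slot`, `coord_slot`, `n_primaries`).
#[derive(Debug, PartialEq)]
pub struct PrimaryLayout {
    pub logit_slot: Vec<Option<usize>>,
    pub coord_slot: Vec<Vec<usize>>,
    pub n_primaries: usize,
}

/// Assign tower slots for one atom per entry of `latent_dims`.
/// ASSUMPTION: slots are handed out atom by atom, logit first, then the
/// latent axes. The real crate may order them differently.
pub fn build_layout(latent_dims: &[usize], free_logits: bool) -> PrimaryLayout {
    let mut next = 0usize;
    let mut logit_slot = Vec::with_capacity(latent_dims.len());
    let mut coord_slot = Vec::with_capacity(latent_dims.len());
    for &dim in latent_dims {
        // A gate logit only receives a slot when it is a free primary.
        if free_logits {
            logit_slot.push(Some(next));
            next += 1;
        } else {
            logit_slot.push(None);
        }
        // One slot per latent axis j of this atom: coord_slot[k][j].
        coord_slot.push((next..next + dim).collect::<Vec<usize>>());
        next += dim;
    }
    PrimaryLayout { logit_slot, coord_slot, n_primaries: next }
}
```

Under these assumptions, `build_layout(&[2, 1], true)` numbers five primaries in total, and `build_layout(&[3], false)` leaves the single logit without a slot.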

Fields§

§atoms: Vec<AtomRowBasisJet>

Per-atom basis jets at the current row.

§gate_value: Vec<f64>

Current gate activations ζ_k at the row (softmax/sigmoid values).

§logits: Vec<f64>

Current gate logits ℓ_k at the row.

§gate_scale: Vec<f64>

Per-atom multiplicative scale for independent logistic gates: the IBP stick-breaking prior π_k for IBP-MAP, 1 for an active JumpReLU gate, and 0 for JumpReLU when the row is at or below the hard threshold. Unused for softmax.

§gate_shift: Vec<f64>

Per-atom logistic shift (IBP offset / JumpReLU threshold); unused for softmax.

§gate: RowGate

The gate nonlinearity.

§logit_slot: Vec<Option<usize>>

Tower slot of atom k’s gate logit primary, or None if the gate logit is not a free primary for this atom (softmax with a single atom, K == 1, where the only gate is identically 1).

§coord_slot: Vec<Vec<usize>>

Tower slot of atom k’s latent axis j primary (coord_slot[k][j]).

§n_primaries: usize

Total number of seeded primaries (= K of the tower).
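To make the roles of gate_scale and gate_shift concrete, here is a minimal sketch of an independent logistic gate. The functional form ζ_k = gate_scale_k · σ(ℓ_k − gate_shift_k), including the sign of the shift, is an assumption inferred from the field descriptions above, and `logistic_gate_values` is a hypothetical helper rather than part of this API.

```rust
/// Numerically stable logistic sigmoid.
fn sigmoid(x: f64) -> f64 {
    if x >= 0.0 {
        1.0 / (1.0 + (-x).exp())
    } else {
        let e = x.exp();
        e / (1.0 + e)
    }
}

/// ASSUMED gate form: zeta_k = scale_k * sigmoid(logit_k - shift_k).
/// A scale of 0 switches the gate off regardless of the logit.
pub fn logistic_gate_values(logits: &[f64], scale: &[f64], shift: &[f64]) -> Vec<f64> {
    assert_eq!(logits.len(), scale.len());
    assert_eq!(logits.len(), shift.len());
    logits
        .iter()
        .zip(scale)
        .zip(shift)
        .map(|((&l, &s), &t)| s * sigmoid(l - t))
        .collect()
}
```

With this form, an atom whose scale is 0 contributes a gate value of exactly 0, which matches the description of JumpReLU at or below the threshold.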

Implementations§

Source§

impl SaeReconstructionRowProgram

Source

pub fn reconstruction_column_packed<const K: usize>( &self, out_col: usize, ) -> Order2<K>

The reconstruction output column c as the PACKED order-2 jet Order2<K>: value .value(), gradient .g()[a] = ∂ẑ_c/∂p_a, Hessian .h()[a][b] = ∂²ẑ_c/∂p_a∂p_b.

This is the production path (#932). The arrow-Schur logdet consumer reads ONLY the order-≤2 channels of the reconstruction, so this method builds the packed Order2<K> scalar (value, gradient, and Hessian only) instead of the dense Tower4<K>, which materialises the entire K⁴ t3/t4 tensor on every row only to discard it. For K up to 16 the dense tower’s tensor build is ~19× the instruction count of the order-2 channels alone; the packed path collapses the work to the channels actually read.

The packed (v, g, H) is BIT-IDENTICAL to the order-≤2 channels of Self::reconstruction_column, because the Order2 newtype delegates to the same Tower2 arithmetic that the dense tower’s order-≤2 channels use. The t3/t4 oracle pins the dense path.
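As an illustration of what an order-2 jet carries, the sketch below implements a value/gradient/Hessian triple with a Leibniz product truncated at second order. `Jet2` is a hypothetical type written for this example; it is not the crate's Order2<K>, whose internals are not shown here.

```rust
/// Hypothetical order-2 jet in K variables: value, gradient, Hessian.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Jet2<const K: usize> {
    pub v: f64,
    pub g: [f64; K],
    pub h: [[f64; K]; K],
}

impl<const K: usize> Jet2<K> {
    /// A constant: all derivatives vanish.
    pub fn constant(v: f64) -> Self {
        Self { v, g: [0.0; K], h: [[0.0; K]; K] }
    }

    /// Seed primary `slot` at value `v`: unit gradient in that slot.
    pub fn variable(v: f64, slot: usize) -> Self {
        let mut j = Self::constant(v);
        j.g[slot] = 1.0;
        j
    }

    /// Leibniz product truncated at order 2:
    /// (fg)'' = f''g + f'g' + g'f' + fg''.
    pub fn mul(&self, o: &Self) -> Self {
        let mut out = Self::constant(self.v * o.v);
        for a in 0..K {
            out.g[a] = self.g[a] * o.v + self.v * o.g[a];
            for b in 0..K {
                out.h[a][b] = self.h[a][b] * o.v
                    + self.g[a] * o.g[b]
                    + self.g[b] * o.g[a]
                    + self.v * o.h[a][b];
            }
        }
        out
    }
}
```

The jet stores 1 + K + K² numbers, whereas a dense fourth-order tower also has to hold the K³ and K⁴ tensors; that difference is the saving the packed path relies on.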

Source

pub fn reconstruction_all_columns_packed<const K: usize>( &self, ) -> Vec<Order2<K>>

All out_dim reconstruction columns as packed Order2<K> jets, with the per-row redundant sub-jets HOISTED out of the output-column loop (#932 perf). reconstruction_column_packed(c) rebuilds, for every output column c, both the per-atom softmax gate jet ζ_k (K exps + a recip + a K×K Hessian, the dominant cost) AND each per-atom basis jet Φ_{k,b}, yet neither depends on c: the gate is a function of the logits only, and the basis jet is the local Taylor model of Φ_b in the coords, the decoder coefficient B_{b,c} being the only c-dependent factor. The consumer (fill_reconstruction_channels_from_program) calls it once per c, so the gate and basis jets are recomputed out_dim× redundantly.

This builds each atom’s gate jet ONCE (K total) and each atom’s basis jets ONCE (n_basis per atom), then assembles every column by the cheap reductions decoded_{k,c} = Σ_b Φ_{k,b}·B_{b,c} and ẑ_c = Σ_k ζ_k·decoded_{k,c}. The result is bit-identical to calling Self::reconstruction_column_packed per column (same Leibniz products in the same order) — only the redundant recomputation is removed — measured ~9× faster at K=8, out_dim=16 on the per-row hot path.
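The two reductions above can be sketched on plain values. This simplification keeps only the value channel (no gradient or Hessian), assumes a per-atom decoder indexed as `dec[k][b][c]`, and uses hypothetical function names; it is meant to show why hoisting changes cost but not the result.

```rust
/// Softmax gate values; stands in for the per-row gate jets.
fn softmax(logits: &[f64]) -> Vec<f64> {
    let m = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logits.iter().map(|&l| (l - m).exp()).collect();
    let mut sum = 0.0;
    for &e in &exps {
        sum += e;
    }
    let recip = 1.0 / sum;
    exps.iter().map(|&e| e * recip).collect()
}

/// z_c = sum_k zeta_k * decoded_{k,c}, decoded_{k,c} = sum_b phi_{k,b} * B_{b,c}.
fn assemble(zeta: &[f64], phi: &[Vec<f64>], dec: &[Vec<Vec<f64>>], c: usize) -> f64 {
    let mut z = 0.0;
    for k in 0..zeta.len() {
        let mut decoded = 0.0;
        for b in 0..phi[k].len() {
            decoded += phi[k][b] * dec[k][b][c];
        }
        z += zeta[k] * decoded;
    }
    z
}

/// Per-column path: the gate is rebuilt on every call.
pub fn column_naive(logits: &[f64], phi: &[Vec<f64>], dec: &[Vec<Vec<f64>>], c: usize) -> f64 {
    let zeta = softmax(logits);
    assemble(&zeta, phi, dec, c)
}

/// Hoisted path: the gate is built once and reused for every column.
pub fn all_columns_hoisted(
    logits: &[f64],
    phi: &[Vec<f64>],
    dec: &[Vec<Vec<f64>>],
    out_dim: usize,
) -> Vec<f64> {
    let zeta = softmax(logits);
    (0..out_dim).map(|c| assemble(&zeta, phi, dec, c)).collect()
}
```

Because both paths run the same floating-point operations in the same order, the hoisted result matches the per-column result bit for bit; only the repeated softmax is removed.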

Source

pub fn reconstruction_column<const K: usize>(&self, out_col: usize) -> Tower4<K>

The reconstruction output column as the full dense Tower4<K> carrying every value/gradient/Hessian/t3/t4 channel. This is the #932 oracle ground truth: the production Self::reconstruction_column_packed order-2 path is pinned against its order-≤2 channels, and the FD-witness tests use its t3/t4. Not on the per-row hot path.

Source

pub fn beta_border_tower_packed<const K: usize>( &self, atom: usize, basis_col: usize, ) -> Order2<K>

The β border-channel local-variable sub-jet as the PACKED order-2 jet Order2<K>. The consumer reads only .value() (the beta channel) and .g()[a] (the beta_deriv / beta_l_deriv mixed channel): the reconstruction is linear in β, so the Hessian-in-β vanishes and only value and gradient are needed. Built from the SAME packed gate / basis primitives as Self::reconstruction_column_packed, so the dense t3/t4 tensor is never materialised on this per-row hot path (#932 Tower4→Order2 cutover).

Source

pub fn beta_border_tower<const K: usize>( &self, atom: usize, basis_col: usize, ) -> Tower4<K>

The β border-channel sub-jet as the full dense Tower4<K> — the #932 oracle ground truth the packed Self::beta_border_tower_packed is pinned against. Not on the per-row hot path.

Source

pub fn beta_border_towers_packed<const K: usize>( &self, channels: &[(usize, usize)], ) -> Vec<Order2<K>>

Packed β border-channel sub-jets for a batch of (atom, basis_col) channels, with the per-atom gate jets HOISTED and the softmax denominator SHARED across atoms (#932 perf): the gate jet ζ_k (the dominant K-exp / K×K-Hessian cost) is a function of the row’s logits only, not of basis_col, and every atom’s gate shares one softmax denominator / reciprocal. Self::all_gates builds all K gates once (K exps + 1 recip per row); each channel then just multiplies its atom’s cached gate by its basis jet. Each result is bit-identical to Self::beta_border_tower_packed for the same (atom, basis_col) (same gate.mul(basis) product), in the input order.
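A rough sketch of the shared-denominator idea follows. The `all_gates` function below is a hypothetical stand-in that returns softmax values plus their first derivatives with respect to the logits, ∂ζ_k/∂ℓ_j = ζ_k(δ_kj − ζ_j); it is not the crate's Self::all_gates, and the channels are reduced to plain values for brevity.

```rust
/// Build all K softmax gates once: K exps and a single reciprocal.
/// Returns the gate values and the Jacobian d zeta_a / d logit_j.
pub fn all_gates(logits: &[f64]) -> (Vec<f64>, Vec<Vec<f64>>) {
    let m = logits.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let exps: Vec<f64> = logits.iter().map(|&l| (l - m).exp()).collect();
    let mut sum = 0.0;
    for &e in &exps {
        sum += e;
    }
    let recip = 1.0 / sum; // shared by every atom
    let zeta: Vec<f64> = exps.iter().map(|&e| e * recip).collect();

    let k = zeta.len();
    let mut jac = vec![vec![0.0; k]; k];
    for a in 0..k {
        for j in 0..k {
            let delta = if a == j { 1.0 } else { 0.0 };
            jac[a][j] = zeta[a] * (delta - zeta[j]);
        }
    }
    (zeta, jac)
}

/// One value per (atom, basis_col) channel, in input order:
/// the cached gate of the atom times the basis value.
pub fn border_values(channels: &[(usize, usize)], zeta: &[f64], phi: &[Vec<f64>]) -> Vec<f64> {
    channels
        .iter()
        .map(|&(atom, basis_col)| zeta[atom] * phi[atom][basis_col])
        .collect()
}
```

Calling `all_gates` once and then `border_values` for the whole batch mirrors the structure described above: the expensive gate work is independent of basis_col, so it is paid once per row.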

Source

pub fn beta_border_order1_packed<const K: usize>( &self, channels: &[(usize, usize)], ) -> Vec<Order1<K>>

Packed β border-channel sub-jets for a batch of channels as the FIRST-order jet Order1<K> — value + gradient ONLY, no Hessian. The β-border consumer (fill_beta_border_channels_from_program) reads exactly .value() (the beta channel) and .g()[a] (the mixed beta_deriv / beta_l_deriv channel); the reconstruction is linear in β so the Hessian-in-β vanishes and the K×K Hessian that Self::beta_border_towers_packed’s Order2 builds is computed-and-discarded every call. This method drops that work: Order1’s value/gradient are BIT-IDENTICAL to Order2’s (the order-≤1 channels never read a Hessian), proven by the order1_* oracle, while the per-channel gate.mul(basis) skips the Hessian product.

Same hoisting as Self::beta_border_towers_packed: gate jets built once via Self::all_gates, each channel multiplies its atom’s gate by its basis jet.
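The claim that the order-≤1 channels never read a Hessian can be checked on a small model. `Jet1`, `Jet2`, and `mul2` below are hypothetical types written for this sketch, not the crate's Order1<K> or Order2<K>; the order-2 product is built on top of the first-order one, so its value and gradient are the first-order result by construction.

```rust
/// Hypothetical first-order jet: value and gradient only.
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Jet1<const K: usize> {
    pub v: f64,
    pub g: [f64; K],
}

impl<const K: usize> Jet1<K> {
    /// Product rule at first order: K + 1 outputs, no Hessian involved.
    pub fn mul(&self, o: &Self) -> Self {
        let mut g = [0.0; K];
        for a in 0..K {
            g[a] = self.g[a] * o.v + self.v * o.g[a];
        }
        Self { v: self.v * o.v, g }
    }
}

/// Hypothetical order-2 jet: the first-order part plus a Hessian.
pub type Jet2<const K: usize> = (Jet1<K>, [[f64; K]; K]);

/// Order-2 product. The low-order channels come from `Jet1::mul`;
/// the K*K Hessian loop is the extra work a first-order consumer can skip.
pub fn mul2<const K: usize>(a: &Jet2<K>, b: &Jet2<K>) -> Jet2<K> {
    let low = a.0.mul(&b.0);
    let mut h = [[0.0; K]; K];
    for i in 0..K {
        for j in 0..K {
            h[i][j] = a.1[i][j] * b.0.v
                + a.0.g[i] * b.0.g[j]
                + a.0.g[j] * b.0.g[i]
                + a.0.v * b.1[i][j];
        }
    }
    (low, h)
}
```

Whatever Hessians are attached to the inputs, the first component of `mul2` equals the plain `Jet1::mul` product, which is the property the first-order batch method depends on.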

Source

pub fn out_dim(&self) -> usize

The number of reconstruction output columns.

Source§

impl SaeReconstructionRowProgram

Structural layout signature of a row program: the part that MUST be identical across rows for them to share one SIMD op graph (slot mapping, per-atom basis/latent/decoder shape, primary count). The per-row numeric data (phi/d_phi/d2_phi/decoder VALUES, logits) is what differs between lanes; the layout is what is shared.

Source

pub fn reconstruction_all_columns_batch4<const K: usize>( rows: [&Self; 4], ) -> Option<[Vec<Order2<K>>; 4]>

All out_dim reconstruction columns for FOUR softmax-aligned rows at once, returned per row. Each row’s column vector is BIT-IDENTICAL to Self::reconstruction_all_columns_packed on that row (same hoisting, same Leibniz products in the same order — lane i mirrors the scalar row-i path). Returns None if the four rows are not softmax-aligned, so the caller can fall back to the scalar per-row path.

Source

pub fn beta_border_order1_batch4<const K: usize>( rows: [&Self; 4], channels: &[(usize, usize)], ) -> Option<[Vec<Order1<K>>; 4]>

Packed β-border FIRST-order jets for a batch of (atom, basis_col) channels, for FOUR softmax-aligned rows at once, returned per row. Each row’s channel vector is BIT-IDENTICAL to Self::beta_border_order1_packed on that row. Returns None if the rows are not softmax-aligned.
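The Option return of the batch4 methods suggests a try-batch-then-fall-back calling pattern. The sketch below illustrates it on softmax gate values only. `Row`, its `layout` field, and all three functions are hypothetical, and the alignment test (equal layout signature and equal logit count) is an assumption about what "softmax-aligned" requires.

```rust
/// Hypothetical row: a layout signature plus per-row numeric data.
#[derive(Clone, Debug, PartialEq)]
pub struct Row {
    pub layout: Vec<usize>,
    pub logits: Vec<f64>,
}

/// Scalar per-row path: softmax gate values.
pub fn gates_scalar(row: &Row) -> Vec<f64> {
    let mut max = f64::NEG_INFINITY;
    for &l in &row.logits {
        max = max.max(l);
    }
    let mut exps = Vec::with_capacity(row.logits.len());
    let mut sum = 0.0;
    for &l in &row.logits {
        let e = (l - max).exp();
        sum += e;
        exps.push(e);
    }
    let recip = 1.0 / sum;
    exps.iter().map(|&e| e * recip).collect()
}

/// Four rows at once; lane i repeats the scalar operations of row i in the
/// same order. Returns None when the rows do not share a layout (ASSUMED test).
pub fn gates_batch4(rows: [&Row; 4]) -> Option<[Vec<f64>; 4]> {
    let k = rows[0].logits.len();
    if rows.iter().any(|r| r.layout != rows[0].layout || r.logits.len() != k) {
        return None;
    }
    let mut max = [f64::NEG_INFINITY; 4];
    for j in 0..k {
        for lane in 0..4 {
            max[lane] = max[lane].max(rows[lane].logits[j]);
        }
    }
    let mut exps = vec![[0.0f64; 4]; k];
    let mut sum = [0.0f64; 4];
    for j in 0..k {
        for lane in 0..4 {
            let e = (rows[lane].logits[j] - max[lane]).exp();
            sum[lane] += e;
            exps[j][lane] = e;
        }
    }
    let recip = sum.map(|s| 1.0 / s);
    let mut out: [Vec<f64>; 4] = Default::default();
    for lane in 0..4 {
        out[lane] = exps.iter().map(|e| e[lane] * recip[lane]).collect();
    }
    Some(out)
}

/// Caller pattern: try the batched path, fall back to the scalar one.
pub fn gates_for_rows(rows: [&Row; 4]) -> [Vec<f64>; 4] {
    gates_batch4(rows).unwrap_or_else(|| rows.map(gates_scalar))
}
```

The inner `for lane in 0..4` loops are where a real implementation would presumably use SIMD lanes; here they are ordinary scalar loops, which is enough to show that lane-wise evaluation in the same operation order reproduces the scalar result exactly.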

Trait Implementations§

Source§

impl Clone for SaeReconstructionRowProgram

Source§

fn clone(&self) -> SaeReconstructionRowProgram

Returns a duplicate of the value.
1.0.0 (const: unstable) · Source§

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
Source§

impl Debug for SaeReconstructionRowProgram

Source§

fn fmt(&self, f: &mut Formatter<'_>) -> Result

Formats the value using the given formatter.

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Allocation for T
where T: RefUnwindSafe + Send + Sync,

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value.
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value.
Source§

impl<T> ByRef<T> for T

Source§

fn by_ref(&self) -> &T

Source§

impl<ST, DT> CastableFrom<ST, Initialized, Initialized> for DT
where ST: ?Sized, DT: ?Sized,

Source§

impl<ST, DT> CastableFrom<ST, Uninit, Uninit> for DT
where ST: ?Sized, DT: ?Sized,

Source§

impl<T> CloneToUninit for T
where T: Clone,

Source§

unsafe fn clone_to_uninit(&self, dest: *mut u8)

🔬This is a nightly-only experimental API. (clone_to_uninit)
Performs copy-assignment from self to dest.
Source§

impl<T> DistributionExt for T
where T: ?Sized,

Source§

fn rand<T>(&self, rng: &mut (impl Rng + ?Sized)) -> T
where Self: Distribution<T>,

Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T, U> Imply<T> for U
where T: ?Sized, U: ?Sized,

Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> IntoEither for T

Source§

fn into_either(self, into_left: bool) -> Either<Self, Self>

Converts self into a Left variant of Either<Self, Self> if into_left is true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
where F: FnOnce(&Self) -> bool,

Converts self into a Left variant of Either<Self, Self> if into_left(&self) returns true. Converts self into a Right variant of Either<Self, Self> otherwise.
Source§

impl<T> Pointable for T

Source§

const ALIGN: usize

The alignment of pointer.
Source§

type Init = T

The type for initializers.
Source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer.
Source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer.
Source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer.
Source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer.
Source§

impl<T> Read<Exclusive, BecauseExclusive> for T
where T: ?Sized,

Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<SS, SP> SupersetOf<SS> for SP
where SS: SubsetOf<SP>,

Source§

fn to_subset(&self) -> Option<SS>

The inverse inclusion map: attempts to construct self from the equivalent element of its superset.
Source§

fn is_in_subset(&self) -> bool

Checks if self is actually part of its subset T (and can be converted to it).
Source§

fn to_subset_unchecked(&self) -> SS

Use with care! Same as self.to_subset but without any property checks. Always succeeds.
Source§

fn from_subset(element: &SS) -> SP

The inclusion map: converts self to the equivalent element of its superset.
Source§

impl<T> ToOwned for T
where T: Clone,

Source§

type Owned = T

The resulting type after obtaining ownership.
Source§

fn to_owned(&self) -> T

Creates owned data from borrowed data, usually by cloning.
Source§

fn clone_into(&self, target: &mut T)

Uses borrowed data to replace owned data, usually by cloning.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<V, T> VZip<V> for T
where V: MultiLane<T>,

Source§

fn vzip(self) -> V