Shape and memory planning for high-K dictionary score routing.
Sparse SAE dictionary routers all have the same hot loop: score a minibatch
of n_rows residual rows against n_items candidate atoms/blocks, keep a
tiny online top-s, and never materialize the full n_rows x n_items score
matrix. This module owns the reusable admission and tile-size invariants for
that pattern. Domain crates still own their kernels and selection semantics.
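The hot loop described above can be sketched roughly as follows. This is an illustrative CPU version, not this module's kernel: `route_top_s`, its dense `Vec<Vec<f32>>` inputs, the dot-product score, and the `tile_items` parameter are all assumptions made for the example. It scores one tile of atoms at a time and keeps only the `s` best `(score, item)` pairs per row, so the full n_rows x n_items matrix is never stored.

```rust
/// Hypothetical sketch: per-row online top-s over a dictionary scored in tiles.
/// Returns, for each row, up to `s` `(score, item_index)` pairs, best first.
fn route_top_s(
    rows: &[Vec<f32>],
    atoms: &[Vec<f32>],
    s: usize,
    tile_items: usize,
) -> Vec<Vec<(f32, usize)>> {
    let mut best: Vec<Vec<(f32, usize)>> = vec![Vec::new(); rows.len()];
    let tile_items = tile_items.max(1); // `chunks` panics on a zero chunk size

    for (tile_idx, tile) in atoms.chunks(tile_items).enumerate() {
        let base = tile_idx * tile_items; // global index of the tile's first atom
        for (r, row) in rows.iter().enumerate() {
            for (j, atom) in tile.iter().enumerate() {
                // Assumed score: plain dot product between row and atom.
                let score: f32 = row.iter().zip(atom).map(|(a, b)| a * b).sum();
                let top = &mut best[r];
                top.push((score, base + j));
                // `s` is tiny, so re-sorting s + 1 entries is cheap.
                top.sort_by(|a, b| b.0.total_cmp(&a.0));
                top.truncate(s);
            }
        }
    }
    best
}
```

Peak working memory here is `n_rows * s` candidates plus one scalar score, regardless of how many atoms the dictionary holds. A device kernel would score a whole tile per launch instead of one element at a time, but the bookkeeping has the same shape.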
Structs
- DictionaryScoreRoutePlan - Device admission and tile geometry for one minibatch-by-dictionary score route.
Constants
- DEFAULT_DICTIONARY_SCORE_MIN_ELEMS - Minimum n_rows * n_items score elements before a cold device route is worth its launch and host/device transfer cost.
- DEFAULT_DICTIONARY_SCORE_TILE_ELEMS - Maximum score elements per device launch. With f32 scores this is 8 MiB, matching the library row-chunk target and keeping peak score memory bounded independent of dictionary width.
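To make the two thresholds concrete, here is a minimal sketch of how admission and tile geometry might be derived from them. Everything named here is hypothetical: `PlanSketch`, `plan_sketch`, and both constant values are stand-ins, not the crate's definitions. The tile value of 2,097,152 elements is only inferred from the "8 MiB with f32 scores" statement (8 MiB / 4 bytes), and the minimum value is an arbitrary placeholder. The sketch also assumes tiling is done by chunking rows.

```rust
// Assumed values, not the crate's constants.
const TILE_ELEMS_SKETCH: usize = 2 * 1024 * 1024; // 8 MiB of f32 scores
const MIN_ELEMS_SKETCH: usize = 1 << 20; // placeholder admission threshold

/// Hypothetical stand-in for a route plan: how many rows to score per launch.
#[derive(Debug, PartialEq)]
struct PlanSketch {
    rows_per_launch: usize,
    n_launches: usize,
    peak_score_bytes: usize,
}

/// Returns `None` when the problem is too small to justify a device route.
fn plan_sketch(n_rows: usize, n_items: usize) -> Option<PlanSketch> {
    if n_rows == 0 || n_items == 0 {
        return None;
    }
    // Admission: total score elements must reach the minimum.
    if n_rows.saturating_mul(n_items) < MIN_ELEMS_SKETCH {
        return None;
    }
    // Geometry: as many full rows as fit in one tile, but at least one.
    let rows_per_launch = (TILE_ELEMS_SKETCH / n_items).clamp(1, n_rows);
    let n_launches = (n_rows + rows_per_launch - 1) / rows_per_launch;
    let peak_score_bytes = rows_per_launch * n_items * std::mem::size_of::<f32>();
    Some(PlanSketch { rows_per_launch, n_launches, peak_score_bytes })
}
```

Under these assumptions, 4096 rows against 65,536 items gives 32 rows per launch, 128 launches, and a peak score buffer of exactly 8 MiB. A dictionary wider than the tile budget would exceed that bound in this row-only sketch, so a real plan would presumably also tile along the item axis.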