zilla-muf: shared structured-matrix and numerical primitives for sparse attention and state space models (SSMs).
The unifying idea (Mamba-2’s “SSD” duality): sparse attention and
SSMs both reduce to multiplying by a structured matrix
(semiseparable / low-displacement-rank) instead of a dense one.
Everything here is a reusable, tensor-framework-agnostic piece of
that shared engine — every public function takes plain slices, so the
attention and SSM crates can wrap these calls with their own tensor
types (candle, burn, raw Vec<f32>, …).
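To make the duality concrete, here is a minimal sketch for the simplest case, a scalar state. The function names `ssm_recurrent` and `ssm_matrix` are hypothetical and are not this crate's API; they only illustrate that running the recurrence and multiplying by a lower-triangular semiseparable matrix compute the same output, using plain slices as inputs.

```rust
/// Hypothetical helper (not this crate's API): run the scalar recurrence
///   h_t = a[t] * h_{t-1} + b[t] * x[t],   y_t = c[t] * h_t,   h_{-1} = 0.
fn ssm_recurrent(a: &[f32], b: &[f32], c: &[f32], x: &[f32]) -> Vec<f32> {
    let n = x.len();
    assert!(a.len() == n && b.len() == n && c.len() == n);
    let mut h = 0.0f32;
    let mut y = Vec::with_capacity(n);
    for t in 0..n {
        h = a[t] * h + b[t] * x[t];
        y.push(c[t] * h);
    }
    y
}

/// The same map written as an explicit matvec y = M x, where M is the
/// lower-triangular semiseparable matrix
///   M[i][j] = c[i] * (a[j+1] * ... * a[i]) * b[j]   for j <= i.
/// This is O(n^2) and only meant to show the structure.
fn ssm_matrix(a: &[f32], b: &[f32], c: &[f32], x: &[f32]) -> Vec<f32> {
    let n = x.len();
    assert!(a.len() == n && b.len() == n && c.len() == n);
    let mut y = vec![0.0f32; n];
    for i in 0..n {
        // decay holds a[j+1] * ... * a[i]; it is the empty product (1) at j == i.
        let mut decay = 1.0f32;
        for j in (0..=i).rev() {
            y[i] += c[i] * decay * b[j] * x[j];
            decay *= a[j];
        }
    }
    y
}
```

Both functions agree up to floating-point rounding. The recurrent form costs O(n) per sequence, while the matrix form exposes the structure that attention-style kernels can exploit.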
Modules
- complex_ops: Complex-arithmetic helpers for S4-style models, which diagonalize the state matrix into complex eigenvalues.
- discretize: Continuous -> discrete state-space conversion (zero-order hold, …).
- fft_conv: FFT-based long convolution (S4 kernels, FNet-style attention). Feature-gated because it pulls in rustfft.
- scan: Linear-recurrence scans: h_t = a_t * h_{t-1} + b_t. Holds the sequential reference and the chunked (optionally parallel) scan that both architectures actually call.
- stable_ops: Numerically stable elementwise ops shared by both architectures: softmax / log-sum-exp, segment-sum (segsum), and gating activations.
- structured: Structured matrix-vector products, the unifying object: semiseparable, Toeplitz, Cauchy, and Vandermonde matvecs.
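As an illustration of the scan entry above, the following sketch contrasts a sequential reference with a chunked evaluation of h_t = a_t * h_{t-1} + b_t. The names `scan_sequential` and `scan_chunked`, the explicit `h0` argument, and the three-pass layout are assumptions made for this example, not the crate's actual signatures or implementation. The chunked version relies on the fact that each chunk acts on its incoming state as an affine map h -> A*h + B.

```rust
/// Sequential reference (hypothetical name): h_t = a[t] * h_{t-1} + b[t],
/// starting from the initial state h0.
fn scan_sequential(a: &[f32], b: &[f32], h0: f32) -> Vec<f32> {
    assert_eq!(a.len(), b.len());
    let mut h = h0;
    a.iter()
        .zip(b)
        .map(|(&at, &bt)| {
            h = at * h + bt;
            h
        })
        .collect()
}

/// Chunked scan (hypothetical sketch). Passes 1 and 3 touch each chunk
/// independently, so they are the parts a parallel version could distribute;
/// here everything runs on one thread.
fn scan_chunked(a: &[f32], b: &[f32], h0: f32, chunk: usize) -> Vec<f32> {
    assert_eq!(a.len(), b.len());
    assert!(chunk > 0);

    // Pass 1: summarize each chunk as an affine map h -> pa * h + pb.
    let summaries: Vec<(f32, f32)> = a
        .chunks(chunk)
        .zip(b.chunks(chunk))
        .map(|(ac, bc)| {
            ac.iter()
                .zip(bc)
                .fold((1.0f32, 0.0f32), |(pa, pb), (&at, &bt)| (at * pa, at * pb + bt))
        })
        .collect();

    // Pass 2: carry the state across chunk boundaries (short sequential loop).
    let mut carry = h0;
    let mut starts = Vec::with_capacity(summaries.len());
    for &(pa, pb) in &summaries {
        starts.push(carry);
        carry = pa * carry + pb;
    }

    // Pass 3: re-scan each chunk from its incoming state.
    let mut out = Vec::with_capacity(a.len());
    for ((ac, bc), &s) in a.chunks(chunk).zip(b.chunks(chunk)).zip(&starts) {
        out.extend(scan_sequential(ac, bc, s));
    }
    out
}
```

For any positive chunk size the two functions should agree up to floating-point rounding, which makes the sequential version a convenient test oracle for the chunked one.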