§mamba-rs
Rust implementations of Mamba SSM (Gu & Dao, 2023) and Mamba-3 SISO (Lahoti et al., 2026), with full inference and training pipelines on CPU and, optionally, on CUDA GPUs.
Standalone — no PyTorch, no Triton, no Burn, no Candle. Kernels compile at runtime via NVRTC.
§Capabilities
- Mamba SSM and Mamba-3 SISO architectures
- CPU and GPU (CUDA) paths for both
- Full training with BPTT through the recurrent SSM state + AdamW
- WeightDtype::{F32, Bf16, F16}: f32 compute regardless of storage
- CUDA Graph capture for inference and training steps
- Batch-invariant bf16 inference (custom GEMM kernel; logits are bit-identical across batch sizes for the same prompt)
- HuggingFace safetensors loader for Mamba SSM checkpoints
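To make "f32 compute regardless of storage" concrete, here is a minimal sketch of the general technique: weights are kept as 16-bit bf16 patterns and widened to f32 before every multiply-accumulate. It is not the crate's implementation, and `f32_to_bf16_bits`, `bf16_bits_to_f32`, and `dot_bf16_weights` are hypothetical names chosen for illustration only.

```rust
/// Round an f32 to the nearest bf16 (ties to even) and return the 16 raw bits.
/// Hypothetical helper; NaN handling is simplified (truncate, force a quiet bit).
fn f32_to_bf16_bits(x: f32) -> u16 {
    let bits = x.to_bits();
    if x.is_nan() {
        return ((bits >> 16) as u16) | 0x0040;
    }
    // Add 0x7FFF plus the lowest kept bit so exact ties round to even.
    let rounding_bias = 0x7FFF + ((bits >> 16) & 1);
    (bits.wrapping_add(rounding_bias) >> 16) as u16
}

/// Widen a bf16 bit pattern back to f32. This direction is exact,
/// because bf16 is the upper half of the f32 encoding.
fn bf16_bits_to_f32(b: u16) -> f32 {
    f32::from_bits((b as u32) << 16)
}

/// Dot product with bf16-stored weights and an f32 accumulator:
/// storage is 16-bit, but every multiply and add happens in f32.
fn dot_bf16_weights(weights: &[u16], x: &[f32]) -> f32 {
    assert_eq!(weights.len(), x.len());
    weights
        .iter()
        .zip(x)
        .map(|(&w, &xi)| bf16_bits_to_f32(w) * xi)
        .sum()
}
```

Storing weights this way halves their memory footprint relative to f32, while the arithmetic precision of each step stays at f32. Only the one-time rounding of the weights themselves loses information.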
§Module Structure
- mamba_ssm: Mamba SSM (CPU + GPU forward, backward, training)
- mamba3_siso: Mamba-3 SISO (CPU + GPU forward, backward, training)
- module: high-level backbone and LM wrappers, HF integration
- ops: shared dimensions, BLAS, norms, fast-math helpers
- config, state, weights, serialize: Mamba SSM data types
§References
- Gu & Dao, Mamba: Linear-Time Sequence Modeling with Selective State Spaces, ICLR 2024.
- Lahoti et al., Mamba-3: Improved Sequence Modeling using State Space Principles, ICLR 2026.
Re-exports§
pub use config::MambaConfig;
pub use mamba_ssm::cpu::inference::MambaLayerScratch;
pub use mamba_ssm::cpu::inference::MambaStepScratch;
pub use mamba_ssm::cpu::inference::mamba_block_step;
pub use mamba_ssm::cpu::inference::mamba_layer_step;
pub use mamba_ssm::cpu::inference::mamba_step;
pub use mamba_ssm::cpu::inference::mamba_step_no_proj;
pub use module::MambaBackbone;
pub use state::MambaLayerState;
pub use state::MambaState;
pub use weights::MambaLayerWeights;
pub use weights::MambaWeights;
pub use mamba3_siso::Mamba3Config;
pub use mamba3_siso::Mamba3Dims;
pub use mamba3_siso::Mamba3LayerState;
pub use mamba3_siso::Mamba3LayerWeights;
pub use mamba3_siso::Mamba3State;
pub use mamba3_siso::Mamba3StepScratch;
pub use mamba3_siso::Mamba3Weights;
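The step functions and state types re-exported above suggest a recurrent, one-token-at-a-time inference path. As a rough illustration of what a single recurrent update computes in the selective SSM formulation of Gu & Dao, the sketch below advances one channel of a diagonal SSM by one step. It assumes the simplified discretization exp(dt * A) for the state transition and dt * B for the input. `ToyState` and `toy_ssm_step` are hypothetical names and do not reflect the signatures of `mamba_step` or `MambaState`.

```rust
/// Hypothetical per-channel recurrent state: one hidden vector of length d_state.
struct ToyState {
    h: Vec<f32>,
}

/// One recurrent step for a single channel of a diagonal selective SSM.
/// `a` is the diagonal of A (typically negative for stability), `b` and `c`
/// are the input-dependent projections for this token, `dt` is the positive
/// step size, and `d` is the skip-connection coefficient.
fn toy_ssm_step(
    state: &mut ToyState,
    x: f32,
    dt: f32,
    a: &[f32],
    b: &[f32],
    c: &[f32],
    d: f32,
) -> f32 {
    let n_state = state.h.len();
    assert!(a.len() == n_state && b.len() == n_state && c.len() == n_state);

    let mut y = d * x; // skip path
    for n in 0..n_state {
        let a_bar = (dt * a[n]).exp(); // discretized state transition
        let b_bar = dt * b[n]; // discretized input matrix (Euler approximation)
        state.h[n] = a_bar * state.h[n] + b_bar * x; // h_t = A_bar * h_{t-1} + B_bar * x_t
        y += c[n] * state.h[n]; // y_t = C * h_t + D * x_t
    }
    y
}
```

Because the state has a fixed size, each step costs the same regardless of how many tokens came before. This is the property that makes a step-style inference interface possible.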
Modules§
- config
- inference
- mamba3_siso: Mamba-3 SISO (Single-Input Single-Output) implementation.
- mamba_ssm: Mamba SSM (Selective State Space Model).
- module: High-level Mamba wrappers.
- ops: Shared operations: dimensions, BLAS, math, normalization utilities.
- serialize: Weight serialization via safetensors format (HuggingFace standard).
- state
- train
- weights