
Crate mamba_rs


§mamba-rs

Mamba SSM (Gu & Dao, 2023) and Mamba-3 SISO (Lahoti et al., 2026) in Rust, with full inference and training pipelines on both CPU and GPU (optional CUDA acceleration).

Standalone — no PyTorch, no Triton, no Burn, no Candle. Kernels compile at runtime via NVRTC.
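At the core of both architectures is a discretized diagonal state-space recurrence. As a rough illustration (not this crate's API — the function and parameter names below are made up for the sketch), a single SISO step with a diagonal state matrix looks like:

```rust
/// Illustrative discretized SISO SSM step (zero-order hold on a diagonal A):
///   h[n] = exp(dt * a[n]) * h[n-1] + dt * b[n] * x
///   y    = Σ_n c[n] * h[n]
/// Hypothetical helper, not part of mamba_rs.
fn ssm_step(h: &mut [f32], a: &[f32], b: &[f32], c: &[f32], x: f32, dt: f32) -> f32 {
    let mut y = 0.0f32;
    for n in 0..h.len() {
        // Decay the previous state, then inject the discretized input.
        h[n] = (dt * a[n]).exp() * h[n] + dt * b[n] * x;
        y += c[n] * h[n];
    }
    y
}

fn main() {
    // 4-dimensional state, starting from zero.
    let mut h = vec![0.0f32; 4];
    let (a, b, c) = (vec![-1.0f32; 4], vec![1.0f32; 4], vec![0.25f32; 4]);
    let y = ssm_step(&mut h, &a, &b, &c, 1.0, 0.5);
    println!("y = {y}"); // from zero state: 4 * 0.25 * (0.5 * 1.0) = 0.5
}
```

In the selective (Mamba) variant, `dt`, `b`, and `c` are themselves input-dependent per timestep; the recurrence itself is unchanged.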

§Capabilities

  • Mamba SSM and Mamba-3 SISO architectures
  • CPU and GPU (CUDA) paths for both
  • Full training with BPTT through the recurrent SSM state + AdamW
  • WeightDtype::{F32, Bf16, F16} — f32 compute regardless of storage
  • CUDA Graph capture for inference and training steps
  • Batch-invariant bf16 inference (custom GEMM kernel; logits are bit-identical across batch sizes for the same prompt)
  • HuggingFace safetensors loader for Mamba SSM checkpoints
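The "f32 compute regardless of storage" point for `WeightDtype` rests on the fact that bf16 is just the top 16 bits of an f32, so narrow-stored weights can be widened before any arithmetic. A minimal sketch of that idea (illustrative helpers, not the crate's implementation):

```rust
/// Round-to-nearest-even truncation of an f32 to bf16 bits (sketch only).
fn f32_to_bf16(x: f32) -> u16 {
    let bits = x.to_bits();
    let rounding = 0x7FFF + ((bits >> 16) & 1); // round to nearest, ties to even
    ((bits.wrapping_add(rounding)) >> 16) as u16
}

/// Widen bf16 bits back to f32: shift into the high half, zero mantissa tail.
fn bf16_to_f32(x: u16) -> f32 {
    f32::from_bits((x as u32) << 16)
}

fn main() {
    let w = 0.1234567f32;
    let stored = f32_to_bf16(w);       // 16-bit storage dtype
    let compute = bf16_to_f32(stored); // widened to f32 before any math
    println!("{w} -> {compute}");
}
```

Storing in bf16 halves weight memory while keeping all accumulation in f32, which is also what makes the batch-invariant bf16 GEMM path feasible: with a fixed reduction order in f32, the widened operands produce bit-identical logits regardless of batch size.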

§Module Structure

  • mamba_ssm — Mamba SSM (CPU + GPU forward, backward, training)
  • mamba3_siso — Mamba-3 SISO (CPU + GPU forward, backward, training)
  • module — high-level backbone and LM wrappers, HF integration
  • ops — shared dimensions, BLAS, norms, fast-math helpers
  • config, state, weights, serialize — Mamba SSM data types

§References

  • Gu & Dao, Mamba: Linear-Time Sequence Modeling with Selective State Spaces, ICLR 2024.
  • Lahoti et al., Mamba-3: Improved Sequence Modeling using State Space Principles, ICLR 2026.

Re-exports§

pub use config::MambaConfig;
pub use mamba_ssm::cpu::inference::MambaLayerScratch;
pub use mamba_ssm::cpu::inference::MambaStepScratch;
pub use mamba_ssm::cpu::inference::mamba_block_step;
pub use mamba_ssm::cpu::inference::mamba_layer_step;
pub use mamba_ssm::cpu::inference::mamba_step;
pub use mamba_ssm::cpu::inference::mamba_step_no_proj;
pub use module::MambaBackbone;
pub use state::MambaLayerState;
pub use state::MambaState;
pub use weights::MambaLayerWeights;
pub use weights::MambaWeights;
pub use mamba3_siso::Mamba3Config;
pub use mamba3_siso::Mamba3Dims;
pub use mamba3_siso::Mamba3LayerState;
pub use mamba3_siso::Mamba3LayerWeights;
pub use mamba3_siso::Mamba3State;
pub use mamba3_siso::Mamba3StepScratch;
pub use mamba3_siso::Mamba3Weights;

Modules§

config
inference
mamba3_siso
Mamba-3 SISO (Single-Input Single-Output) implementation.
mamba_ssm
Mamba SSM (Selective State Space Model).
module
High-level Mamba wrappers.
ops
Shared operations: dimensions, BLAS, math, normalization utilities.
serialize
Weight serialization via safetensors format (HuggingFace standard).
state
train
weights