§BitMamba
A 1.58-bit Mamba language model with infinite context window.
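The "infinite context window" follows from the SSM design: the whole input history is folded into a state vector whose size is fixed by the model dimension, not by sequence length. A minimal sketch of a diagonal state-space recurrence, illustrative only (the function name, parameterization, and shapes here are assumptions, not the crate's actual `BitMambaBlock` code):

```rust
/// Illustrative diagonal SSM scan: the state `h` has `d` elements no
/// matter how many tokens are processed, so memory use is constant
/// in sequence length.
fn ssm_scan(a: &[f32], b: &[f32], xs: &[Vec<f32>]) -> Vec<f32> {
    let d = a.len();
    let mut h = vec![0.0_f32; d]; // fixed-size state
    for x in xs {
        // Hypothetical per-token update: h = a * h + b * x (elementwise).
        for i in 0..d {
            h[i] = a[i] * h[i] + b[i] * x[i];
        }
    }
    h
}

fn main() {
    // Two tokens of a 1-dimensional "embedding"; the state stays a
    // single f32 however long `xs` grows.
    let h = ssm_scan(&[0.5], &[1.0], &[vec![2.0], vec![4.0]]);
    println!("{:?}", h); // 0.5 * (0.5 * 0.0 + 2.0) + 4.0 = 5.0
}
```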
§Features
- Infinite Context Window - Mamba’s SSM maintains fixed-size state
- 1.58-bit Weights - BitNet-style quantization
- CPU Inference - No GPU required
- OpenAI-Compatible API - Works with Cline, Continue, etc.
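The 1.58-bit figure comes from ternary weights: each value is stored as one of {-1, 0, +1} (log2(3) ≈ 1.58 bits) plus a per-tensor scale. A minimal sketch, assuming the absmean rounding scheme from BitNet b1.58; the crate's `BitLinear` may differ in detail:

```rust
/// Illustrative absmean ternary quantization: scale by the mean absolute
/// weight, then round each scaled weight to the nearest of {-1, 0, +1}.
/// Dequantized values are recovered as q[i] as f32 * scale.
fn quantize_ternary(weights: &[f32]) -> (Vec<i8>, f32) {
    // Per-tensor scale = mean absolute value (floored to avoid div-by-zero).
    let scale = (weights.iter().map(|w| w.abs()).sum::<f32>()
        / weights.len() as f32)
        .max(1e-8);
    let q = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-1.0, 1.0) as i8)
        .collect();
    (q, scale)
}

fn main() {
    let w = [0.9_f32, -0.05, 0.4, -1.2];
    let (q, scale) = quantize_ternary(&w);
    // Small weights round to 0; the rest keep only their sign.
    println!("{:?} (scale ~ {:.3})", q, scale);
}
```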
§Quick Start
use bitmamba::load;

fn main() -> anyhow::Result<()> {
    // Download (or reuse a cached copy of) the default model and tokenizer.
    let (model, tokenizer) = load()?;
    let prompt = "def fibonacci(n):";
    let tokens = tokenizer.encode(prompt, true).unwrap();
    // Generate up to 50 new tokens at temperature 0.7.
    let output = model.generate(tokens.get_ids(), 50, 0.7)?;
    println!("{}", tokenizer.decode(&output, true).unwrap());
    Ok(())
}
Re-exports§
- pub use model::BitMambaStudent;
- pub use model::BitLinear;
- pub use model::RMSNorm;
- pub use model::BitMambaBlock;
Modules§
- model
- BitMamba Model - Shared model code
Constants§
- DEFAULT_MODEL_REPO - Default model repository on Hugging Face
Functions§
- load
- Load the BitMamba model and tokenizer from Hugging Face
- load_from_repo - Load the BitMamba model and tokenizer from a specific Hugging Face repository
- load_model_from_file - Load the BitMamba model from a local safetensors file
- load_tokenizer_from_file - Load the tokenizer from a local file