Crate bitmamba


§BitMamba

A 1.58-bit Mamba language model with infinite context window.

§Features

  • Infinite Context Window - Mamba’s SSM keeps a fixed-size recurrent state, so memory use does not grow with sequence length
  • 1.58-bit Weights - BitNet-style ternary quantization
  • CPU Inference - No GPU required
  • OpenAI-Compatible API - Works with Cline, Continue, etc.
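The "1.58-bit" figure comes from restricting each weight to one of three values, {-1, 0, +1} (log2(3) ≈ 1.58 bits). A minimal sketch of the idea, using the absmean scaling scheme popularized by BitNet; this is an illustration only, and `quantize_ternary` is a hypothetical helper, not part of this crate's API:

```rust
/// Quantize weights to ternary values {-1, 0, +1}, returning the
/// quantized weights and the per-tensor scale needed to dequantize.
fn quantize_ternary(weights: &[f32]) -> (Vec<i8>, f32) {
    // Per-tensor scale: mean absolute value (epsilon guards against
    // a division by zero for an all-zero tensor).
    let scale = weights.iter().map(|w| w.abs()).sum::<f32>()
        / weights.len() as f32
        + 1e-6;

    // Round each scaled weight to the nearest of -1, 0, +1.
    let q = weights
        .iter()
        .map(|w| (w / scale).round().clamp(-1.0, 1.0) as i8)
        .collect();

    (q, scale)
}

fn main() {
    let w = [0.9_f32, -0.05, 0.4, -1.2];
    let (q, scale) = quantize_ternary(&w);
    println!("quantized: {:?}, scale: {:.4}", q, scale);
}
```

Ternary weights turn the dense matrix multiplies inside each linear layer into additions and subtractions, which is what makes CPU-only inference practical.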

§Quick Start

use bitmamba::load;

fn main() -> anyhow::Result<()> {
    // Fetch (or reuse cached) model weights and tokenizer from Hugging Face.
    let (model, tokenizer) = load()?;

    // Tokenize the prompt, then sample up to 50 tokens at temperature 0.7.
    let prompt = "def fibonacci(n):";
    let tokens = tokenizer.encode(prompt, true).unwrap();
    let output = model.generate(tokens.get_ids(), 50, 0.7)?;

    println!("{}", tokenizer.decode(&output, true).unwrap());
    Ok(())
}

Re-exports§

pub use model::BitMambaStudent;
pub use model::BitLinear;
pub use model::RMSNorm;
pub use model::BitMambaBlock;

Modules§

model
BitMamba Model - Shared model code

Constants§

DEFAULT_MODEL_REPO
Default model repository on Hugging Face

Functions§

load
Load the BitMamba model and tokenizer from Hugging Face
load_from_repo
Load the BitMamba model and tokenizer from a specific Hugging Face repository
load_model_from_file
Load the BitMamba model from a local safetensors file
load_tokenizer_from_file
Load the tokenizer from a local file