Crate oxide_rs

Oxide-rs

Fast AI inference library & CLI in Rust: a lightweight, CPU-based LLM inference engine inspired by llama.cpp.

§Features

  • GGUF model support (LLaMA, LFM2 architectures)
  • Full tokenizer compatibility (SPM, BPE, WPM, UGM, RWKV)
  • Automatic chat templates from GGUF files
  • Streaming token generation
  • Multiple sampling strategies (temperature, top-k, top-p)
  • Interactive REPL and one-shot modes
  • Memory-mapped loading for instant startup
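The sampling strategies listed above are configured through GenerateOptions. A minimal sketch: max_tokens and temperature appear in the Builder API example on this page, but the top_k and top_p field names are assumptions and may differ in the actual struct.

```rust
use oxide_rs::{generate, GenerateOptions};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let opts = GenerateOptions {
        max_tokens: 128,
        temperature: 0.8,
        top_k: 40,   // assumed field: sample from the 40 most likely tokens
        top_p: 0.95, // assumed field: nucleus sampling threshold
        ..Default::default()
    };

    let result = generate("model.gguf", opts, "Write a haiku about Rust.")?;
    println!("{}", result);
    Ok(())
}
```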

§Quick Start

§CLI Usage

# Install via cargo
cargo install oxide-rs

# Run interactively
oxide-rs -m model.gguf

# One-shot generation
oxide-rs -m model.gguf --once --prompt "Hello!"

§Library Usage

use oxide_rs::{generate, GenerateOptions};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let result = generate(
        "model.gguf",
        GenerateOptions::default(),
        "Hello, how are you?",
    )?;
    println!("{}", result);
    Ok(())
}

§Builder API

For more control, use the Model builder:

use oxide_rs::Model;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut model = Model::new("model.gguf")
        .with_options(oxide_rs::GenerateOptions {
            max_tokens: 256,
            temperature: 0.7,
            ..Default::default()
        })
        .load()?;

    let response = model.generate("What is Rust?")?;
    println!("{}", response);
    Ok(())
}
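Because load() returns a reusable model handle, one model can answer several prompts without reloading the weights. A sketch using only the calls shown above:

```rust
use oxide_rs::Model;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load once; the memory-mapped weights are reused for every prompt.
    let mut model = Model::new("model.gguf").load()?;

    for prompt in ["What is Rust?", "What is GGUF?"] {
        let response = model.generate(prompt)?;
        println!("Q: {prompt}\nA: {response}\n");
    }
    Ok(())
}
```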

§Requirements

  • Rust 1.70+ (2021 edition)
  • A GGUF quantized model file with embedded chat template

Re-exports§

pub use inference::BatchConfig;
pub use inference::DynamicBatcher;
pub use inference::Generator;
pub use inference::PagedAttentionConfig;
pub use inference::PagedKvCache;
pub use inference::PrefixCache;
pub use inference::PrefixCacheConfig;
pub use inference::SimdLevel;
pub use inference::StreamEvent;
pub use inference::ThreadPinnerConfig;
pub use inference::ThreadPinner;
pub use model::download;
pub use model::format_size;
pub use model::get_hf_cache_dir;
pub use model::get_model_info;
pub use model::list_models;
pub use model::list_repo_files;
pub use model::register_model;
pub use model::unregister_model;
pub use model::ModelEntry;
pub use model::GgufMetadata;
pub use model::Model as ModelWrapper;
pub use model::TokenizerWrapper;

Modules§

cli
inference
model
server
tui

Structs§

GenerateOptions
Configuration options for text generation.
Model
High-level model wrapper with builder pattern for text generation.

Functions§

generate
Simple one-shot text generation function.