Oxide-rs
Fast AI Inference Library & CLI in Rust - A lightweight, CPU-based LLM inference engine inspired by llama.cpp.
§Features
- GGUF model support (LLaMA, LFM2 architectures)
- Full tokenizer compatibility (SPM, BPE, WPM, UGM, RWKV)
- Automatic chat templates from GGUF files
- Streaming token generation
- Multiple sampling strategies (temperature, top-k, top-p)
- Interactive REPL and one-shot modes
- Memory-mapped loading for instant startup
§Quick Start
§CLI Usage
# Install via cargo
cargo install oxide-rs
# Run interactively
oxide-rs -m model.gguf
# One-shot generation
oxide-rs -m model.gguf --once --prompt "Hello!"
§Library Usage
use oxide_rs::{generate, GenerateOptions};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let result = generate(
        "model.gguf",
        GenerateOptions::default(),
        "Hello, how are you?",
    )?;
    println!("{}", result);
    Ok(())
}
§Builder API
For more control, use the Model builder:
use oxide_rs::Model;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut model = Model::new("model.gguf")
        .with_options(oxide_rs::GenerateOptions {
            max_tokens: 256,
            temperature: 0.7,
            ..Default::default()
        })
        .load()?;
    let response = model.generate("What is Rust?")?;
    println!("{}", response);
    Ok(())
}
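The feature list above mentions temperature, top-k, and top-p sampling. Below is a minimal sketch of how those knobs might be set through GenerateOptions; only max_tokens and temperature appear in the examples in this document, so the top_k and top_p field names here are assumptions rather than confirmed API.

use oxide_rs::{GenerateOptions, Model};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut model = Model::new("model.gguf")
        .with_options(GenerateOptions {
            max_tokens: 128,
            temperature: 0.8,
            // Assumed field names for the top-k / top-p sampling strategies
            // listed under Features; check GenerateOptions for the real ones.
            top_k: 40,
            top_p: 0.95,
            ..Default::default()
        })
        .load()?;
    let response = model.generate("Summarize the borrow checker in one sentence.")?;
    println!("{}", response);
    Ok(())
}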
§Requirements
- Rust 1.70+ (2021 edition)
- A GGUF quantized model file with embedded chat template
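Streaming token generation is listed under Features, and StreamEvent and Generator are re-exported from the inference module (see Re-exports below). The sketch that follows is purely illustrative: the stream method and the Token variant are hypothetical names, not confirmed oxide-rs API.

use oxide_rs::{Model, StreamEvent};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut model = Model::new("model.gguf").load()?;
    // Hypothetical streaming entry point: a callback invoked per StreamEvent.
    // The real crate may expose this through the Generator type instead.
    model.stream("Tell me about GGUF.", |event: StreamEvent| {
        // Hypothetical variant carrying a decoded token's text.
        if let StreamEvent::Token(text) = event {
            print!("{}", text);
        }
    })?;
    Ok(())
}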
Re-exports§
pub use inference::BatchConfig;
pub use inference::DynamicBatcher;
pub use inference::Generator;
pub use inference::PagedAttentionConfig;
pub use inference::PagedKvCache;
pub use inference::PrefixCache;
pub use inference::PrefixCacheConfig;
pub use inference::SimdLevel;
pub use inference::StreamEvent;
pub use inference::ThreadPinnerConfig;
pub use inference::ThreadPinner;
pub use model::download;
pub use model::format_size;
pub use model::get_hf_cache_dir;
pub use model::get_model_info;
pub use model::list_models;
pub use model::list_repo_files;
pub use model::register_model;
pub use model::unregister_model;
pub use model::ModelEntry;
pub use model::GgufMetadata;
pub use model::Model as ModelWrapper;
pub use model::TokenizerWrapper;
Modules§
Structs§
- GenerateOptions - Configuration options for text generation.
- Model - High-level model wrapper with builder pattern for text generation.
Functions§
- generate - Simple one-shot text generation function.