oxibonsai
Pure Rust 1-bit LLM inference engine for PrismML Bonsai models — umbrella crate.
Re-exports all OxiBonsai subcrates for convenience. Add this single dependency to get access to the entire OxiBonsai ecosystem:
[]
= "0.1.0"
# Enable optional subsystems:
= { = "0.1.0", = ["full"] }
Subcrates
| Crate | Description |
|---|---|
oxibonsai-core |
GGUF loader, tensor types, configuration |
oxibonsai-kernels |
1-bit compute kernels (dequant, GEMV, GEMM, SIMD) |
oxibonsai-model |
Qwen3-8B transformer, KV cache, attention |
oxibonsai-runtime |
Inference engine, sampling, OpenAI-compatible server |
oxibonsai-tokenizer |
Pure Rust BPE tokenizer (optional) |
oxibonsai-rag |
Retrieval-augmented generation pipeline (optional) |
oxibonsai-eval |
Model evaluation framework (optional) |
oxibonsai-serve |
Standalone OpenAI-compatible server (optional) |
Feature Flags
| Flag | Description |
|---|---|
server |
HTTP server (axum) |
rag |
RAG pipeline |
native-tokenizer |
Pure Rust BPE tokenizer |
eval |
Evaluation harness |
simd-avx2 |
AVX2+FMA SIMD kernels |
simd-avx512 |
AVX-512 SIMD kernels |
simd-neon |
NEON SIMD kernels |
wasm |
WASM-safe build |
full |
All optional features |
License
Apache-2.0 — COOLJAPAN OU