# 🫏 OddOnkey
A dead-simple Rust wrapper around Ollama. Auto-installs Ollama, auto-pulls models, and lets you prompt a local LLM in two lines of code.
```rust
let mut model = OddOnkey::new("mistral").await?;
let answer = model.prompt("Why is the sky blue?").await?;
```
No config files. No Docker. No API keys. Just add the crate and go.
## Features
- Zero setup — automatically installs Ollama and pulls the requested model if needed.
- Conversation history — multi-turn chat with full context, out of the box.
- Streaming — token-by-token output via a standard `Stream` impl.
- Embeddings — single or batch embedding vectors in one call.
- Generation options — temperature, top-p, top-k, context size, repeat penalty, seed, etc.
- Progress bar (opt-in) — visual feedback during model download & server start.
- Per-prompt report (opt-in) — duration, estimated tokens, throughput, request/response sizes.
- Hexagonal architecture — swap the Ollama backend for any LLM by implementing one trait.
## Quick start
Add to your `Cargo.toml`:

```toml
[dependencies]
oddonkey = { git = "https://github.com/SaucisseBot/Oddonkey.git" }
tokio = { version = "1", features = ["full"] }
```
Then:

```rust
use oddonkey::OddOnkey;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut model = OddOnkey::new("mistral").await?;
    let answer = model.prompt("Why is the sky blue?").await?;
    println!("{answer}");
    Ok(())
}
```
## Optional features
Enable in `Cargo.toml`:

```toml
oddonkey = { git = "...", features = ["progress", "report"] }
```
| Feature | Description |
|---|---|
| `progress` | Shows an indicatif progress bar during model pull and a spinner while the server starts. |
| `report` | Enables the `PromptReport` struct (also togglable at runtime via `.enable_report(true)`). |
## Usage

### Builder pattern
```rust
let mut model = OddOnkey::builder("mistral")
    .base_url("http://localhost:11434") // custom Ollama URL
    .progress(true)                     // show progress bar
    .report(true)                       // collect per-prompt stats
    .build()
    .await?;
```
### System pre-prompts

```rust
model.add_preprompt("You are a helpful assistant.");
// or replace all pre-prompts:
model.set_preprompt("Answer concisely, in English.");
```
### Generation options

```rust
use oddonkey::GenerationOptions;

model.set_options(GenerationOptions {
    temperature: Some(0.7),
    top_p: Some(0.9),
    seed: Some(42),
    ..Default::default()
});
```
### Streaming

```rust
use futures_util::StreamExt;

let mut stream = model.prompt_stream("Tell me a story").await?;
let mut full = String::new();
while let Some(token) = stream.next().await {
    let token = token?;
    print!("{token}");
    full.push_str(&token);
}
model.push_assistant_message(full);
```
### Embeddings

```rust
let vec = model.embed("Hello, world!").await?;
let vecs = model.embed_batch(vec!["first text", "second text"]).await?;
```
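Returned embeddings are plain vectors of floats, so you can compare them directly. A minimal cosine-similarity helper (not part of OddOnkey, shown only for illustration) might look like:

```rust
/// Cosine similarity between two equal-length embedding vectors:
/// dot(a, b) / (|a| * |b|), in the range [-1, 1].
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm_a * norm_b)
}

fn main() {
    // Toy vectors; in practice these would come from the embed calls above.
    let a = [1.0, 0.0, 1.0];
    let b = [0.0, 1.0, 1.0];
    println!("{:.3}", cosine_similarity(&a, &b)); // 0.500
}
```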
### Per-prompt report

```rust
let answer = model.prompt("Why is the sky blue?").await?;
if let Some(report) = model.last_report() {
    println!("{report}");
}
```
Output:

```text
── report ──────────────────────────────────
 model           : mistral
 duration        : 1423 ms
 prompt tokens   : ~12 (est.)
 completion tkns : ~87 (est.)
 tokens/sec      : 61.1
 request size    : 245 bytes
 response size   : 534 bytes
────────────────────────────────────────────
```
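The tokens/sec figure is simply the estimated completion tokens divided by the wall-clock duration. For the sample report above, the arithmetic works out as:

```rust
fn main() {
    // Values from the sample report: ~87 completion tokens in 1423 ms.
    let completion_tokens = 87.0_f64;
    let duration_ms = 1423.0_f64;
    let tokens_per_sec = completion_tokens / (duration_ms / 1000.0);
    println!("{tokens_per_sec:.1}"); // 61.1
}
```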
## Architecture
OddOnkey uses a hexagonal (ports & adapters) architecture:
```text
src/
├── lib.rs                 # re-exports
├── domain/                # pure value objects (no I/O)
│   ├── error.rs           # OddOnkeyError
│   ├── message.rs         # ChatMessage
│   ├── options.rs         # GenerationOptions
│   └── report.rs          # PromptReport
├── ports/
│   └── llm_provider.rs    # LlmProvider trait
├── adapters/
│   └── ollama/            # Ollama HTTP adapter
│       ├── client.rs      # LlmProvider implementation
│       ├── installer.rs   # auto-install & server start
│       ├── pull.rs        # model pull with progress
│       ├── stream.rs      # TokenStream
│       └── types.rs       # Ollama JSON DTOs
└── core/
    ├── oddonkey.rs        # OddOnkey struct (backend-agnostic)
    └── builder.rs         # OddOnkeyBuilder
```
To add a new backend, implement the `LlmProvider` trait — no changes to `core/` needed.
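As a rough sketch of the ports-and-adapters idea, a custom backend might look like the following. The trait here is a simplified, synchronous stand-in; the real `LlmProvider` in `ports/llm_provider.rs` is async and its exact signatures may differ.

```rust
// Simplified stand-in for the LlmProvider port (the real trait is async).
trait LlmProvider {
    fn complete(&mut self, prompt: &str) -> Result<String, String>;
}

// A toy backend: anything implementing the port can replace the
// Ollama adapter without touching core/.
struct EchoBackend;

impl LlmProvider for EchoBackend {
    fn complete(&mut self, prompt: &str) -> Result<String, String> {
        Ok(format!("echo: {prompt}"))
    }
}

fn main() -> Result<(), String> {
    let mut backend = EchoBackend;
    println!("{}", backend.complete("ping")?); // prints "echo: ping"
    Ok(())
}
```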
## Examples

## License

MIT