# 🫏 OddOnkey
A dead-simple Rust wrapper around Ollama. Auto-installs Ollama, auto-pulls models, and lets you prompt a local LLM in two lines of code.
```rust
let mut model = OddOnkey::new("mistral").await?;
let answer = model.prompt("Why is the sky blue?").await?;
```
No config files. No API keys. Just add the crate and go.
## Table of Contents
- Features
- Quick Start
- Optional Features
- Usage
- Architecture
- Examples
- Minimum Supported Rust Version
- Contributing
- License
## Features

| Feature | Description |
|---|---|
| Zero setup | Automatically installs Ollama and pulls the requested model if needed. |
| Conversation history | Multi-turn chat with full context, out of the box. |
| Streaming | Token-by-token output via a standard Stream implementation. |
| Embeddings | Single or batch embedding vectors in one call. |
| Generation options | Temperature, top-p, top-k, context size, repeat penalty, seed, and more. |
| Progress bar (opt-in) | Visual feedback during model download and server start. |
| Per-prompt report (opt-in) | Duration, estimated tokens, throughput, request/response sizes. |
| Docker mode (opt-in) | Run Ollama in a container — zero local install outside Docker. |
| Hexagonal architecture | Swap the Ollama backend for any LLM by implementing one trait. |
## Quick Start

Add to your `Cargo.toml`:

```toml
[dependencies]
oddonkey = "0.2"
tokio = { version = "1", features = ["full"] }
```

Then:

```rust
use oddonkey::OddOnkey;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut model = OddOnkey::new("mistral").await?;
    let answer = model.prompt("Why is the sky blue?").await?;
    println!("{answer}");
    Ok(())
}
```
That's it — Ollama is installed and the model is pulled automatically on first run.
## Optional Features

Enable in `Cargo.toml`:

```toml
oddonkey = { version = "0.2", features = ["progress", "report"] }
```

| Feature | Description |
|---|---|
| `progress` | Shows an indicatif progress bar during model pull and a spinner while the server starts. |
| `report` | Enables the `PromptReport` struct (also togglable at runtime via `.enable_report(true)`). |
| `docker` | Runs Ollama inside a Docker container: zero local install outside Docker. Requires Docker on the host. |
## Usage
### Builder Pattern

For fine-grained control, use the builder (argument values below are illustrative):

```rust
let mut model = OddOnkey::builder("mistral")
    .base_url("http://localhost:11434") // custom Ollama URL
    .progress(true)                     // show progress bar
    .report(true)                       // collect per-prompt stats
    .build()
    .await?;
```
### Docker Mode (zero local install)

With the `docker` feature enabled, Ollama runs entirely inside a Docker container; nothing is installed on the host except Docker itself.

```toml
oddonkey = { version = "0.2", features = ["docker"] }
```

```rust
let mut model = OddOnkey::builder("mistral")
    .docker(true)           // run Ollama in Docker
    .docker_gpu(true)       // optional: GPU passthrough (NVIDIA Container Toolkit)
    .docker_port(11434)     // optional: custom host port
    .docker_cleanup(true)   // optional: remove container + data on drop
    .progress(true)
    .build()
    .await?;

// Same API as always
let answer = model.prompt("Why is the sky blue?").await?;

// When `model` is dropped, the container and its volume are destroyed automatically.
```

The container (`oddonkey-ollama`) persists pulled models across restarts by default. Enable `docker_cleanup(true)` for zero-trace disposable runs.
You can also manage the container directly:

```rust
use oddonkey::DockerManager;

let mgr = DockerManager::new().gpu(true);
mgr.stop()?;    // stop the container (models persist)
mgr.destroy()?; // stop + remove container and volume
```
### System Pre-prompts

```rust
// Argument strings are illustrative.
model.add_preprompt("You are a pirate. Answer in pirate speak.");
// or replace all pre-prompts:
model.set_preprompt("You are a helpful assistant.");
```
### Generation Options

```rust
use oddonkey::GenerationOptions;

// Field names are illustrative; see the crate docs for the full set
// (temperature, top-p, top-k, context size, repeat penalty, seed, ...).
model.set_options(GenerationOptions {
    temperature: Some(0.7),
    top_p: Some(0.9),
    seed: Some(42),
    ..Default::default()
});
```
### Streaming

```rust
use futures_util::StreamExt;

let mut stream = model.prompt_stream("Tell me a short story").await?;
let mut full = String::new();
while let Some(token) = stream.next().await {
    let token = token?;
    print!("{token}");
    full.push_str(&token);
}

// Save the exchange in history for follow-up context
model.push_assistant_message(full);
```
Embeddings
// Single text
let vec = model.embed.await?;
// Batch
let vecs = model.embed_batch.await?;
### Per-prompt Report

Enable with `.report(true)` on the builder or `.enable_report(true)` at runtime:

```rust
let answer = model.prompt("Why is the sky blue?").await?;
if let Some(report) = model.last_report() {
    println!("{report}");
}
```
Output:

```text
── report ──────────────────────────────────
 model           : mistral
 duration        : 1423 ms
 prompt tokens   : ~12 (est.)
 completion tkns : ~87 (est.)
 tokens/sec      : 61.1
 request size    : 245 bytes
 response size   : 534 bytes
────────────────────────────────────────────
```
## Architecture

OddOnkey uses a hexagonal (ports & adapters) architecture. The core logic knows nothing about HTTP or Docker; it depends only on the `LlmProvider` trait.
```text
src/
├── lib.rs                    # public re-exports
├── core/
│   ├── oddonkey.rs           # OddOnkey struct (backend-agnostic)
│   └── builder.rs            # OddOnkeyBuilder
├── domain/                   # pure value objects (no I/O)
│   ├── error.rs              # OddOnkeyError
│   ├── message.rs            # ChatMessage
│   ├── options.rs            # GenerationOptions
│   └── report.rs             # PromptReport
├── ports/
│   └── llm_provider.rs       # LlmProvider trait
└── adapters/
    ├── ollama/               # Ollama HTTP adapter
    │   ├── client.rs         # LlmProvider implementation
    │   ├── installer.rs      # auto-install & server start
    │   ├── pull.rs           # model pull with optional progress
    │   ├── stream.rs         # TokenStream
    │   └── types.rs          # Ollama JSON DTOs
    └── docker/               # Docker adapter (feature-gated)
        └── manager.rs        # container lifecycle management
```
To add a new backend (e.g. llama.cpp, vLLM, a remote API), implement the `LlmProvider` trait; no changes to `core/` are required.
## Examples

Run the bundled examples with `cargo run --example <name>` (see the `examples/` directory for the exact names):

- Interactive pirate chat
- Streaming token-by-token
- Embeddings + cosine similarity
- Per-prompt timing report
- Using a specific model
## Minimum Supported Rust Version
OddOnkey targets Rust 1.75+ (edition 2021).
## Contributing

Contributions are welcome! Please open an issue or submit a pull request on GitHub.

- Fork the repo
- Create a feature branch (`git checkout -b feat/my-feature`)
- Commit your changes (`git commit -m "feat: add my feature"`)
- Push to the branch (`git push origin feat/my-feature`)
- Open a Pull Request