🫏 OddOnkey

A dead-simple Rust wrapper around Ollama. Auto-installs Ollama, auto-pulls models, and lets you prompt a local LLM in two lines of code.

let mut model = OddOnkey::new("mistral").await?;
let answer = model.prompt("What is the capital of France?").await?;

No config files. No API keys. Just add the crate and go.

Features
Quick Start
Optional Features
Usage
Architecture
Examples
Minimum Supported Rust Version
Contributing
License

Features


Feature	Description
Zero setup	Automatically installs Ollama and pulls the requested model if needed.
Conversation history	Multi-turn chat with full context, out of the box.
Streaming	Token-by-token output via a standard `Stream` implementation.
Embeddings	Single or batch embedding vectors in one call.
Generation options	Temperature, top-p, top-k, context size, repeat penalty, seed, and more.
Progress bar (opt-in)	Visual feedback during model download and server start.
Per-prompt report (opt-in)	Duration, estimated tokens, throughput, request/response sizes.
Docker mode (opt-in)	Run Ollama in a container, so nothing but Docker itself is installed on the host.
Hexagonal architecture	Swap the Ollama backend for any LLM by implementing one trait.

Quick Start

Add to your Cargo.toml:

[dependencies]
oddonkey = "0.2"
tokio = { version = "1", features = ["full"] }

Then:

use oddonkey::OddOnkey;

#[tokio::main]
async fn main() {
    let mut model = OddOnkey::new("mistral").await.unwrap();
    model.add_preprompt("You are a helpful assistant.");
    let answer = model.prompt("What is 2+2?").await.unwrap();
    println!("{answer}");
}

That's it — Ollama is installed and the model is pulled automatically on first run.

Optional Features

Enable in Cargo.toml:

oddonkey = { version = "0.2", features = ["progress", "report"] }

Feature	Description
`progress`	Shows an `indicatif` progress bar during model pull and a spinner while the server starts.
`report`	Enables the `PromptReport` struct (also togglable at runtime via `.enable_report(true)`).
`docker`	Runs Ollama inside a Docker container, so nothing but Docker itself is installed on the host. Requires Docker on the host.

Usage

Builder Pattern

For fine-grained control, use the builder:

let mut model = OddOnkey::builder("mistral")
    .base_url("http://localhost:11434") // custom Ollama URL
    .progress(true)                     // show progress bar
    .report(true)                       // collect per-prompt stats
    .build()
    .await?;

Docker Mode (zero local install)

With the docker feature enabled, Ollama runs entirely inside a Docker container — nothing is installed on the host except Docker itself.

oddonkey = { version = "0.2", features = ["docker"] }

let mut model = OddOnkey::builder("mistral")
    .docker(true)           // run Ollama in Docker
    .docker_gpu(true)       // optional: GPU passthrough (NVIDIA Container Toolkit)
    .docker_port(11434)     // optional: custom host port
    .docker_cleanup(true)   // optional: remove container + data on drop
    .progress(true)
    .build()
    .await?;

// Same API as always
let answer = model.prompt("Hello!").await?;
// Because docker_cleanup(true) is set, dropping `model` destroys the container and its volume automatically.

The container (oddonkey-ollama) persists pulled models across restarts by default. Enable docker_cleanup(true) for zero-trace disposable runs.

You can also manage the container directly:

use oddonkey::DockerManager;

let mgr = DockerManager::new().gpu(true);
mgr.stop()?;    // stop the container (models persist)
mgr.destroy()?; // stop + remove container and volume

System Pre-prompts

model.add_preprompt("You are a friendly pirate.");
// or replace all pre-prompts:
model.set_preprompt("You are a concise assistant.");
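
The difference between appending and replacing can be pictured with a small stand-alone sketch. The `Preprompts` type and its fields are hypothetical and only mirror the behavior described in the comments above; this is not the crate's internal representation.

```rust
/// Hypothetical stand-in for the list of system pre-prompts (an assumption,
/// not the crate's actual type).
#[derive(Default)]
struct Preprompts {
    system: Vec<String>,
}

impl Preprompts {
    /// Append one more system message, keeping the earlier ones.
    fn add(&mut self, text: &str) {
        self.system.push(text.to_string());
    }

    /// Discard every existing system message and keep only this one.
    fn set(&mut self, text: &str) {
        self.system.clear();
        self.system.push(text.to_string());
    }
}

fn main() {
    let mut p = Preprompts::default();
    p.add("You are a friendly pirate.");
    p.add("Answer in one sentence.");
    println!("after add x2: {} pre-prompt(s)", p.system.len());

    p.set("You are a concise assistant.");
    println!("after set:    {} pre-prompt(s)", p.system.len());
}
```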

Generation Options

use oddonkey::GenerationOptions;

model.set_options(
    GenerationOptions::default()
        .temperature(0.3)
        .num_ctx(8192)
        .top_p(0.9)
        .top_k(40)
        .repeat_penalty(1.1)
        .seed(42)
);

Streaming

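// `StreamExt` comes from the `tokio-stream` crate; add it to Cargo.toml if it is not already a dependency.
use tokio_stream::StreamExt;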

let mut stream = model.prompt_stream("Tell me a joke").await?;
let mut full = String::new();
while let Some(tok) = stream.next().await {
    let tok = tok?;
    print!("{tok}");
    full.push_str(&tok);
}
// Save the exchange in history for follow-up context
model.push_assistant_message("Tell me a joke", &full);

Embeddings

// Single text
let vec = model.embed("Rust is awesome").await?;

// Batch
let vecs = model.embed_batch(&["hello", "world"]).await?;
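
Embedding vectors are usually compared with cosine similarity, which the bundled embeddings example also mentions. A minimal stdlib-only sketch follows; it assumes the vectors are slices of `f32` (the README does not state the element type), and `cosine_similarity` is an illustrative helper, not part of the crate.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns `None` when the lengths differ, the input is empty,
/// or either vector has zero magnitude.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

fn main() {
    // Tiny stand-ins for embedding vectors; real embeddings are much longer.
    let a = [1.0_f32, 0.0, 1.0];
    let b = [1.0_f32, 0.0, 1.0];
    let c = [0.0_f32, 1.0, 0.0];
    println!("a vs b: {:?}", cosine_similarity(&a, &b));
    println!("a vs c: {:?}", cosine_similarity(&a, &c));
}
```

Values near 1 indicate texts pointing in the same direction in embedding space; values near 0 indicate unrelated texts.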

Per-prompt Report

Enable with .report(true) on the builder or .enable_report(true) at runtime:

let answer = model.prompt("Explain borrow checking.").await?;
if let Some(report) = model.last_report() {
    println!("{report}");
}

Output:

── report ──────────────────────────────────
model           : mistral
duration        : 1423 ms
prompt tokens   : ~12 (est.)
completion tkns : ~87 (est.)
tokens/sec      : 61.1
request size    : 245 bytes
response size   : 534 bytes
────────────────────────────────────────────
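
The sample figures are consistent with throughput being the completion token count divided by the wall-clock duration (87 tokens over 1.423 s). That is an inference from the printed numbers, not a confirmed formula, and `tokens_per_sec` below is a hypothetical helper rather than a crate function.

```rust
use std::time::Duration;

/// Tokens per second, or `None` when the elapsed time is zero.
fn tokens_per_sec(completion_tokens: u32, elapsed: Duration) -> Option<f64> {
    let secs = elapsed.as_secs_f64();
    if secs == 0.0 {
        return None;
    }
    Some(f64::from(completion_tokens) / secs)
}

fn main() {
    // Figures taken from the sample report above.
    let rate = tokens_per_sec(87, Duration::from_millis(1423)).unwrap();
    println!("tokens/sec      : {rate:.1}"); // prints "tokens/sec      : 61.1"
}
```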

Architecture

OddOnkey uses a hexagonal (ports & adapters) architecture. The core logic knows nothing about HTTP or Docker — it depends only on the LlmProvider trait.

src/
├── lib.rs                  # public re-exports
├── core/
│   ├── oddonkey.rs          # OddOnkey struct (backend-agnostic)
│   └── builder.rs           # OddOnkeyBuilder
├── domain/                 # pure value objects (no I/O)
│   ├── error.rs             # OddOnkeyError
│   ├── message.rs           # ChatMessage
│   ├── options.rs           # GenerationOptions
│   └── report.rs            # PromptReport
├── ports/
│   └── llm_provider.rs      # LlmProvider trait
└── adapters/
    ├── ollama/              # Ollama HTTP adapter
    │   ├── client.rs         # LlmProvider implementation
    │   ├── installer.rs      # auto-install & server start
    │   ├── pull.rs           # model pull with optional progress
    │   ├── stream.rs         # TokenStream
    │   └── types.rs          # Ollama JSON DTOs
    └── docker/              # Docker adapter (feature-gated)
        └── manager.rs        # container lifecycle management

To add a new backend (e.g. llama.cpp, vLLM, a remote API), implement the LlmProvider trait — no changes to core/ required.
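
The ports and adapters idea can be sketched without the crate. The README does not list the methods of `LlmProvider`, so the `ChatBackend` trait below is a hypothetical, synchronous stand-in, and `EchoBackend` and `Chat` are illustrative names. The real trait is presumably async and richer.

```rust
/// Hypothetical port: a simplified, synchronous stand-in for the role
/// `LlmProvider` plays (the real trait's methods are an unknown here).
trait ChatBackend {
    /// Takes (role, content) pairs and returns the assistant's reply.
    fn complete(&self, messages: &[(String, String)]) -> Result<String, String>;
}

/// Toy adapter: echoes the most recent user message.
struct EchoBackend;

impl ChatBackend for EchoBackend {
    fn complete(&self, messages: &[(String, String)]) -> Result<String, String> {
        messages
            .iter()
            .rev()
            .find(|(role, _)| role.as_str() == "user")
            .map(|(_, content)| format!("echo: {content}"))
            .ok_or_else(|| "no user message".to_string())
    }
}

/// Core type: owns the conversation history and knows only the trait.
struct Chat<B: ChatBackend> {
    backend: B,
    history: Vec<(String, String)>,
}

impl<B: ChatBackend> Chat<B> {
    fn new(backend: B) -> Self {
        Self { backend, history: Vec::new() }
    }

    /// Record the user turn, ask the backend, then record the reply.
    fn prompt(&mut self, text: &str) -> Result<String, String> {
        self.history.push(("user".to_string(), text.to_string()));
        let reply = self.backend.complete(&self.history)?;
        self.history.push(("assistant".to_string(), reply.clone()));
        Ok(reply)
    }
}

fn main() {
    let mut chat = Chat::new(EchoBackend);
    println!("{}", chat.prompt("Hello!").unwrap()); // prints "echo: Hello!"
}
```

Swapping `EchoBackend` for another type that implements the trait leaves `Chat` untouched, which is the property the architecture section describes.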

Examples

Run the bundled examples:

# Interactive pirate chat
cargo run --example chat

# Streaming token-by-token
cargo run --example stream

# Embeddings + cosine similarity
cargo run --example embeddings

# Per-prompt timing report
cargo run --example report --features report

# Use a specific model
cargo run --example chat -- llama3

Minimum Supported Rust Version

OddOnkey targets Rust 1.75+ (edition 2021).

Contributing

Contributions are welcome! Please open an issue or submit a pull request on GitHub.

1. Fork the repo
2. Create a feature branch (git checkout -b feat/my-feature)
3. Commit your changes (git commit -m "feat: add my feature")
4. Push to the branch (git push origin feat/my-feature)
5. Open a Pull Request

License

MIT

oddonkey 0.2.0