ollama-kit

Runtime helpers for ollama-rs: configure a shared reqwest client (timeouts, auth), cap concurrent calls and retry transient failures, and optionally ensure that a model exists locally. The Ollama client is exposed as-is; the crate does not mirror the generate/chat API.

Cargo.toml

[dependencies]
ollama-kit = "0.1"
# or, to enable `ExecutionGuard::run_stream` and `OllamaRuntime::run_stream`:
# ollama-kit = { version = "0.1", features = ["stream"] }

What it gives you

Piece	Role
`OllamaRuntime::new` + `RuntimeConfig`	Parses `base_url` (default ports, trailing path slash); rejects `Production` mode combined with `auto_pull`.
`client()`	`&ollama_rs::Ollama` — call `generate`, `list_local_models`, etc. unchanged.
`run` / `guard()`	Semaphore (`max_concurrent`), per-attempt timeout, `max_retries`; maps errors to `RuntimeError`.
`ensure_model` / `models()`	Lists local model tags; pulls a missing model only when the mode is `Development` and `auto_pull` is set.
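
To make the retry part of `run` / `guard()` concrete, here is a minimal, std-only sketch of a bounded retry loop. It is not the crate's implementation: `CallError` and `run_with_retries` are hypothetical names, the sketch is synchronous, it leaves out the semaphore and the per-attempt timeout, and it assumes `max_retries` counts retries after the first attempt.

```rust
/// Hypothetical error classification for this sketch only;
/// this is not ollama-kit's `RuntimeError`.
#[derive(Debug, PartialEq)]
pub enum CallError {
    /// Worth retrying (for example a connection reset or a timeout).
    Transient(String),
    /// Retrying cannot help (for example a malformed request).
    Fatal(String),
}

/// Runs `op` once, then up to `max_retries` more times while it keeps
/// failing with a transient error. Returns the first success, the first
/// fatal error, or the last transient error once retries are exhausted.
/// The closure receives the zero-based attempt number.
pub fn run_with_retries<T>(
    max_retries: u32,
    mut op: impl FnMut(u32) -> Result<T, CallError>,
) -> Result<T, CallError> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Transient failure with retries left: try again.
            Err(CallError::Transient(_)) if attempt < max_retries => attempt += 1,
            // Fatal failure, or transient failure with no retries left.
            Err(err) => return Err(err),
        }
    }
}
```

With `max_retries` set to 2, an operation that fails transiently twice and then succeeds is called three times in total; a fatal error is returned after a single call.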

Outbound HTTP only: this crate does not start or supervise the Ollama binary.

Example

use std::time::Duration;

use ollama_kit::{OllamaRuntime, RuntimeConfig, RuntimeMode};

async fn demo() -> ollama_kit::Result<()> {
    let rt = OllamaRuntime::new(RuntimeConfig {
        base_url: "http://127.0.0.1:11434".into(),
        timeout: Duration::from_secs(120),
        connect_timeout: Duration::from_secs(10),
        max_retries: 2,
        max_concurrent: 4,
        auto_pull: false,
        mode: RuntimeMode::Production,
        auth: None,
    })
    .await?;

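    // Check that the model is available locally; in Production mode nothing is pulled.
    rt.ensure_model("mistral").await?;

    // Route a client call through the guard (concurrency cap, timeout, retries).
    let _models = rt.run(|| rt.client().list_local_models()).await?;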
    Ok(())
}
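
The example combines `RuntimeMode::Production` with `auto_pull: false`, the combination the constructor accepts. The rules described for `OllamaRuntime::new` and `ensure_model` can be sketched as two small pure functions. This is an illustration under assumptions, not the crate's code: `Mode`, `ModelAction`, `validate`, and `plan_ensure_model` are hypothetical names, and model names are compared by exact string match, which ignores tag suffixes such as `:latest`.

```rust
/// Hypothetical stand-in for the crate's `RuntimeMode`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Mode {
    Development,
    Production,
}

/// What an `ensure_model`-style check would decide to do (hypothetical).
#[derive(Debug, PartialEq)]
pub enum ModelAction {
    /// The model is already in the local list; nothing to do.
    AlreadyPresent,
    /// The model is missing and pulling is permitted.
    Pull,
    /// The model is missing and pulling is not permitted.
    Missing,
}

/// Rejects the one combination the README says is refused:
/// `Production` together with `auto_pull`.
pub fn validate(mode: Mode, auto_pull: bool) -> Result<(), String> {
    if mode == Mode::Production && auto_pull {
        return Err("auto_pull is not allowed in Production mode".to_string());
    }
    Ok(())
}

/// Decides what to do for `wanted`, given the locally listed model names.
/// Pulling is planned only in `Development` mode with `auto_pull` enabled.
pub fn plan_ensure_model(
    local: &[&str],
    wanted: &str,
    mode: Mode,
    auto_pull: bool,
) -> ModelAction {
    if local.iter().any(|name| *name == wanted) {
        ModelAction::AlreadyPresent
    } else if mode == Mode::Development && auto_pull {
        ModelAction::Pull
    } else {
        ModelAction::Missing
    }
}
```

Keeping the decision separate from the HTTP calls makes the policy easy to unit-test without a running Ollama server.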

Docs

API reference: docs.rs/ollama-kit

MSRV / stability

The ollama-rs dependency is pinned to a patch release in Cargo.toml; bump it only when you deliberately upgrade. Breaking changes follow semver starting from 0.1.

License

MIT — see LICENSE.