# ollama-kit

Runtime helpers for [`ollama-rs`](https://docs.rs/ollama-rs): configure a shared `reqwest` client (timeouts, auth), cap concurrent calls, retry transient failures, and optionally ensure a model exists locally. The `Ollama` client is exposed as-is; the crate does not mirror the generate/chat API.

## Cargo.toml

```toml
[dependencies]
ollama-kit = "0.1"

# Or, instead of the line above, enable the optional `stream` feature
# (`ExecutionGuard::run_stream`, `OllamaRuntime::run_stream`):
# ollama-kit = { version = "0.1", features = ["stream"] }
```

## What it gives you

| Piece | Role |
|-------|------|
| `OllamaRuntime::new` + `RuntimeConfig` | Parses `base_url` (default ports, trailing path slash); rejects `auto_pull` combined with `Production` mode. |
| `client()` | `&ollama_rs::Ollama` — call `generate`, `list_local_models`, etc. unchanged. |
| `run` / `guard()` | Semaphore (`max_concurrent`), per-attempt timeout, `max_retries`; maps errors to `RuntimeError`. |
| `ensure_model` / `models()` | Lists model tags; pulls only when `mode` is `Development` and `auto_pull` is set. |
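
The two mode rules in the table (no `auto_pull` in `Production`, and pulling only in `Development` with `auto_pull`) can be pictured with a small std-only sketch. `Mode`, `Settings`, `validate`, and `should_pull` are hypothetical names invented for this illustration; they are not the crate's types or functions, and the real checks may differ in detail.

```rust
// Illustrative stand-ins only; NOT ollama-kit's actual definitions.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Mode {
    Development,
    Production,
}

#[derive(Clone, Copy, Debug)]
struct Settings {
    mode: Mode,
    auto_pull: bool,
}

/// Reject the combination the table describes: `Production` with `auto_pull`.
fn validate(s: Settings) -> Result<Settings, String> {
    if s.mode == Mode::Production && s.auto_pull {
        return Err("auto_pull is not allowed in Production mode".to_string());
    }
    Ok(s)
}

/// A pull is only considered when the model is absent, the mode is
/// `Development`, and `auto_pull` is enabled.
fn should_pull(s: Settings, model_present: bool) -> bool {
    !model_present && s.mode == Mode::Development && s.auto_pull
}
```

Under these assumptions, a `Production` configuration never triggers a pull: it is either rejected up front (when `auto_pull` is set) or `should_pull` returns `false`.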

Outbound HTTP only: this crate does **not** start or supervise the Ollama binary.

## Example

```rust
use std::time::Duration;

use ollama_kit::{OllamaRuntime, RuntimeConfig, RuntimeMode};

async fn demo() -> ollama_kit::Result<()> {
    let rt = OllamaRuntime::new(RuntimeConfig {
        base_url: "http://127.0.0.1:11434".into(),
        timeout: Duration::from_secs(120),
        connect_timeout: Duration::from_secs(10),
        max_retries: 2,
        max_concurrent: 4,
        auto_pull: false,
        mode: RuntimeMode::Production,
        auth: None,
    })
    .await?;

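    // Check that the model is available; with `auto_pull: false` nothing is pulled.
    rt.ensure_model("mistral").await?;

    // Route a plain `ollama-rs` call through the guard
    // (concurrency cap, per-attempt timeout, retries).
    let _models = rt.run(|| rt.client().list_local_models()).await?;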
    Ok(())
}
```
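
To give a rough idea of what a bounded retry such as `max_retries: 2` means, here is a minimal synchronous sketch using only the standard library. It is not the crate's implementation: `run` is async and also applies a semaphore and a per-attempt timeout. `retry_with_limit` and its `backoff` parameter are hypothetical, and the sketch assumes that `max_retries` counts attempts *after* the first one, which this README does not confirm.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: call `op` once, then retry up to `max_retries` more
/// times while it keeps failing. `op` receives the zero-based attempt index.
/// Returns the first success, or the last error if every attempt fails.
fn retry_with_limit<T, E>(
    max_retries: u32,
    backoff: Duration,
    mut op: impl FnMut(u32) -> Result<T, E>,
) -> Result<T, E> {
    let mut attempt = 0;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Retries exhausted: surface the last error to the caller.
            Err(err) if attempt >= max_retries => return Err(err),
            // Otherwise wait briefly and try again.
            Err(_) => {
                attempt += 1;
                sleep(backoff);
            }
        }
    }
}
```

With `max_retries` set to 2, this sketch makes at most three calls to `op`. A real guard would also need to decide which errors count as transient; the sketch retries on every error for simplicity.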

## Docs

API reference: **[docs.rs/ollama-kit](https://docs.rs/ollama-kit)**

## MSRV / stability

The `ollama-rs` dependency is pinned to a patch version in `Cargo.toml`; bump it only when you deliberately upgrade. Breaking changes follow semver starting from `0.1`.

## License

MIT — see [`LICENSE`](LICENSE).