# ollama-kit
Runtime helpers for [`ollama-rs`](https://docs.rs/ollama-rs): configure a shared `reqwest` client (timeouts, auth), cap concurrent calls, retry transient failures, and optionally ensure a model exists locally. The `Ollama` client is exposed as-is; there is no mirrored generate/chat API.
## Cargo.toml
```toml
[dependencies]
ollama-kit = "0.1"
# Or, to enable `ExecutionGuard::run_stream` and `OllamaRuntime::run_stream`, use this line instead:
# ollama-kit = { version = "0.1", features = ["stream"] }
```
## What it gives you
| API | What it does |
| --- | --- |
| `OllamaRuntime::new` + `RuntimeConfig` | Parses `base_url`, default ports, trailing path slash; rejects `Production` + `auto_pull`. |
| `client()` | `&ollama_rs::Ollama`: call `generate`, `list_local_models`, etc. unchanged. |
| `run` / `guard()` | Semaphore (`max_concurrent`), per-attempt timeout, `max_retries`; maps errors to `RuntimeError`. |
| `ensure_model` / `models()` | Lists tags; pulls only if `Development` and `auto_pull`. |
Outbound HTTP only: this crate does **not** start or supervise the Ollama binary.
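
The mode and `auto_pull` rules from the table can be sketched without any dependencies. This is an illustration of the stated rules only, not the crate's implementation: `Mode`, `validate`, `Decision`, and `decide` are hypothetical names, and exact string matching on the model name is a simplification.

```rust
/// Hypothetical stand-in for the crate's `RuntimeMode`
/// (only the two variants this README names).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Development,
    Production,
}

/// Assumed construction-time check: `Production` combined with
/// `auto_pull` is rejected, every other combination is accepted.
fn validate(mode: Mode, auto_pull: bool) -> Result<(), String> {
    if mode == Mode::Production && auto_pull {
        return Err("auto_pull is not allowed in Production".to_string());
    }
    Ok(())
}

/// Possible outcomes of an `ensure_model`-style check (hypothetical).
#[derive(Debug, PartialEq)]
enum Decision {
    AlreadyPresent,
    Pull,
    Missing,
}

/// Decide what to do after listing local tags: pull only when the model
/// is absent AND the runtime is in `Development` with `auto_pull` set.
fn decide(mode: Mode, auto_pull: bool, local: &[&str], wanted: &str) -> Decision {
    if local.iter().any(|name| *name == wanted) {
        Decision::AlreadyPresent
    } else if mode == Mode::Development && auto_pull {
        Decision::Pull
    } else {
        Decision::Missing
    }
}
```

How the real crate reports a missing model when pulling is disabled is not covered here; check the API reference for the actual error type.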
## Example
```rust
use std::time::Duration;

use ollama_kit::{OllamaRuntime, RuntimeConfig, RuntimeMode};

async fn demo() -> ollama_kit::Result<()> {
    let rt = OllamaRuntime::new(RuntimeConfig {
        base_url: "http://127.0.0.1:11434".into(),
        timeout: Duration::from_secs(120),
        connect_timeout: Duration::from_secs(10),
        max_retries: 2,
        max_concurrent: 4,
        auto_pull: false,
        mode: RuntimeMode::Production,
        auth: None,
    })
    .await?;

    // Checks the local tag list; nothing is pulled because `auto_pull` is false.
    rt.ensure_model("mistral").await?;

    // `run` applies the concurrency cap, per-attempt timeout, and retries to the call.
    let _models = rt.run(|| rt.client().list_local_models()).await?;
    Ok(())
}
```
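
To make the `max_retries: 2` setting in the example concrete, here is a minimal, synchronous sketch of a retry loop. It is not the crate's code: `CallError` and `run_with_retries` are hypothetical, the semaphore and per-attempt timeout are left out to keep it dependency-free, and it assumes `max_retries` counts retries after the first attempt (so 2 means at most 3 attempts).

```rust
/// Hypothetical error split: only `Transient` failures are retried.
#[derive(Debug, PartialEq)]
enum CallError {
    Transient(String),
    Fatal(String),
}

/// Runs `op` once, then retries it up to `max_retries` more times,
/// but only while it keeps failing with a transient error.
fn run_with_retries<T>(
    max_retries: u32,
    mut op: impl FnMut() -> Result<T, CallError>,
) -> Result<T, CallError> {
    let mut retries_used = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Transient failure with budget left: count it and try again.
            Err(CallError::Transient(_)) if retries_used < max_retries => {
                retries_used += 1;
            }
            // Fatal failure, or retry budget exhausted: give up.
            Err(err) => return Err(err),
        }
    }
}
```

Under these assumptions, an operation that fails transiently twice and then succeeds returns `Ok` with `max_retries = 2`, while a `Fatal` error is returned after a single attempt.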
## Docs
API reference: **[docs.rs/ollama-kit](https://docs.rs/ollama-kit)**
## MSRV / stability
The `ollama-rs` dependency is pinned to a specific patch version in `Cargo.toml`; bump it only as a deliberate upgrade. Breaking changes follow semver starting from `0.1`.
## License
MIT — see [`LICENSE`](LICENSE).