# ollama-api-rs
A Rust SDK for the Ollama API with async support and OpenAI compatibility.
[crates.io](https://crates.io/crates/ollama-api-rs) | [docs.rs](https://docs.rs/ollama-api-rs) | [License](https://codeberg.org/cloudflavor/ollama-api-rs/src/branch/main/LICENSE)
## Features
- Async/await support
- Easy client configuration with `ModelClient::builder()`
- Streaming responses (chat and generation)
- Full compatibility with the Ollama API
- OpenAI-compatible endpoints (`/v1/chat/completions`, `/v1/embeddings`, `/v1/responses`)
- Modular design with separate modules for chat, generate, embed, and model operations
- Comprehensive error handling with custom error types
- Convenience constructors: `Message::user()`, `Message::assistant()`, `Message::system()`, `ChatMessage::user()`
- Complete API coverage including:
  - Chat completions with tool calling
  - Text generation
  - Embeddings (single and batch)
  - Model management (list, show, copy, delete, pull, push, create)
  - Model lifecycle (load/unload)
  - Blob management
  - Running models introspection
## Installation
Add this to your `Cargo.toml`:
```toml
[dependencies]
ollama-api-rs = "0.3.0"
```
Then import it in your Rust code as:
```rust
use oai_sdk::{ModelClient, ChatRequest, Message};
```
For local-only features (blob management, model lifecycle, running models introspection):
```toml
[dependencies]
ollama-api-rs = { version = "0.3.0", features = ["local"] }
```
## Authentication
For cloud access to ollama.com or private models, configure authentication:
```rust
let client = ModelClient::builder()
    .base_url("https://ollama.com")
    .auth_token("your-auth-token")
    .build()?;
```
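Hardcoding a token in source makes it easy to leak. The following is a minimal sketch of one way to resolve the token before passing it to `auth_token`. The helper `resolve_token` and the environment variable name `OLLAMA_AUTH_TOKEN` are assumptions made for this example and are not part of the crate.

```rust
use std::env;

/// Returns the trimmed string, or `None` if it is empty or whitespace only.
fn non_blank(s: String) -> Option<String> {
    let trimmed = s.trim();
    if trimmed.is_empty() {
        None
    } else {
        Some(trimmed.to_owned())
    }
}

/// Hypothetical helper (not crate API): prefer an explicitly configured
/// token, then fall back to a value read from the environment.
fn resolve_token(explicit: Option<&str>, from_env: Option<String>) -> Option<String> {
    explicit
        .map(str::to_owned)
        .and_then(non_blank)
        .or_else(|| from_env.and_then(non_blank))
}

fn main() {
    // `OLLAMA_AUTH_TOKEN` is an assumed variable name chosen for this sketch.
    let token = resolve_token(None, env::var("OLLAMA_AUTH_TOKEN").ok());
    match token {
        Some(_) => println!("token found; pass it to `.auth_token(...)`"),
        None => println!("no token configured"),
    }
}
```

Keeping the lookup in a pure function that takes the environment value as an argument makes it testable without mutating process state.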
## OpenAI Compatibility
Ollama provides OpenAI-compatible endpoints that work with standard OpenAI client libraries:
- `POST /v1/chat/completions` - Chat completions
- `POST /v1/embeddings` - Embeddings generation
- `POST /v1/responses` - Response generation
Use base URL `http://localhost:11434/v1/` with any API key:
```python
from openai import OpenAI
client = OpenAI(
    base_url='http://localhost:11434/v1/',
    api_key='ollama',  # required but ignored
)
```
## Usage
### Basic Chat Completion
```rust
use oai_sdk::{ModelClient, ChatRequest, Message};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    let request = ChatRequest {
        model: "llama3.1:8b".to_string(),
        messages: vec![Message::user("Why is the sky blue?")],
        ..Default::default()
    };

    let response = client.chat(request).await?;
    println!("{}", response.message.content);

    Ok(())
}
```
### Streaming Chat Responses
```rust
use oai_sdk::{ModelClient, ChatRequest, Message};
use tokio_stream::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    let request = ChatRequest {
        model: "llama3.1:8b".to_string(),
        messages: vec![Message::user("Write a short story about Rust.")],
        stream: true,
        ..Default::default()
    };

    let mut stream = client.chat_stream(request).await?;
    while let Some(result) = stream.next().await {
        match result {
            Ok(response) => print!("{}", response.message.content),
            Err(e) => eprintln!("Error: {}", e),
        }
    }

    Ok(())
}
```
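When the complete text is needed after streaming, the chunks can be accumulated as they arrive. The following is a minimal sketch that uses a plain iterator of `Result<String, String>` as a stand-in for the stream returned by `chat_stream`. The `collect_chunks` helper and the stand-in types are assumptions for illustration, not crate API.

```rust
/// Hypothetical helper: concatenates streamed text chunks and stops at the
/// first error, returning the partial text together with that error.
fn collect_chunks<I>(chunks: I) -> Result<String, (String, String)>
where
    I: IntoIterator<Item = Result<String, String>>,
{
    let mut full = String::new();
    for chunk in chunks {
        match chunk {
            Ok(text) => full.push_str(&text),
            // Surface the error along with whatever was received so far.
            Err(e) => return Err((full, e)),
        }
    }
    Ok(full)
}

fn main() {
    // Stand-in for the text that would come from `response.message.content`.
    let chunks = vec![
        Ok("Hello".to_string()),
        Ok(", ".to_string()),
        Ok("world".to_string()),
    ];
    match collect_chunks(chunks) {
        Ok(text) => println!("{text}"),
        Err((partial, e)) => eprintln!("stream failed after {partial:?}: {e}"),
    }
}
```

Unlike the loop in the example above, which logs an error and keeps reading, this sketch stops at the first error so the caller can decide whether the partial text is still usable.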
### Text Generation
```rust
use oai_sdk::{ModelClient, GenerateRequest};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    let request = GenerateRequest {
        model: "llama3.1:8b".to_string(),
        prompt: "Why is the sky blue?".to_string(),
        ..Default::default()
    };

    let response = client.generate(request).await?;
    println!("{}", response.response);

    Ok(())
}
```
### Streaming Text Generation
```rust
use oai_sdk::{ModelClient, GenerateRequest};
use tokio_stream::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    let request = GenerateRequest {
        model: "llama3.1:8b".to_string(),
        prompt: "Write a haiku about Rust".to_string(),
        stream: true,
        ..Default::default()
    };

    let mut stream = client.generate_stream(request).await?;
    while let Some(result) = stream.next().await {
        match result {
            Ok(response) => print!("{}", response.response),
            Err(e) => eprintln!("Error: {}", e),
        }
    }

    Ok(())
}
```
### Embeddings (Single)
```rust
use oai_sdk::{ModelClient, EmbedRequest, EmbedInput};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    let request = EmbedRequest {
        model: "llama3:8b".to_string(),
        input: EmbedInput::Single("Hello, world!".to_string()),
        truncate: Some(true),
        ..Default::default()
    };

    let response = client.embed(request).await?;
    println!("Embeddings: {:?}", response.embeddings);

    Ok(())
}
```
### Batch Embeddings
```rust
use oai_sdk::{ModelClient, EmbedRequest, EmbedInput};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    let request = EmbedRequest {
        model: "llama3:8b".to_string(),
        input: EmbedInput::Multiple(vec![
            "Hello, world!".to_string(),
            "Goodbye, world!".to_string(),
        ]),
        truncate: Some(true),
        ..Default::default()
    };

    let response = client.embed(request).await?;
    println!("Batch embeddings: {:?}", response.embeddings);

    Ok(())
}
```
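Embedding vectors are usually compared with cosine similarity. The following is a minimal sketch, assuming each embedding is available as a slice of `f32`. The element type of `response.embeddings` is an assumption here, and `cosine_similarity` is an illustrative helper, not crate API.

```rust
/// Cosine similarity of two equal-length vectors.
/// Returns `None` for empty input, mismatched lengths, or a zero-magnitude vector.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

fn main() {
    // Tiny made-up vectors standing in for two entries of a batch response.
    let first = [1.0_f32, 0.0, 1.0];
    let second = [1.0_f32, 0.0, 0.0];
    match cosine_similarity(&first, &second) {
        Some(score) => println!("similarity: {score:.3}"),
        None => println!("vectors cannot be compared"),
    }
}
```

Returning `Option` instead of panicking keeps malformed or empty responses from crashing a batch comparison.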
### Legacy Embeddings
```rust
use oai_sdk::{ModelClient, EmbeddingsRequest};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    let request = EmbeddingsRequest {
        model: "llama3:8b".to_string(),
        prompt: "Hello, world!".to_string(),
        truncate: Some(true),
        ..Default::default()
    };

    let response = client.embeddings(request).await?;
    println!("Legacy embedding: {:?}", response.embedding);

    Ok(())
}
```
### Tool Calling
```rust
use oai_sdk::{ModelClient, ChatRequest, Message, Tool, ToolFunction};
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    let tools = vec![
        Tool {
            tool_type: "function".to_string(),
            function: ToolFunction {
                name: "get_current_weather".to_string(),
                description: "Get the current weather for a location".to_string(),
                parameters: json!({
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The location to get the weather for"
                        },
                        "format": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"]
                        }
                    },
                    "required": ["location", "format"]
                }),
            }
        }
    ];

    let request = ChatRequest {
        model: "llama3.1:8b".to_string(),
        messages: vec![Message::user("What is the weather in Tokyo?")],
        tools: Some(tools),
        ..Default::default()
    };

    let response = client.chat(request).await?;
    if let Some(tool_calls) = response.message.tool_calls {
        for tool_call in tool_calls {
            println!("Tool call: {}", tool_call.function.name);
            println!("Arguments: {}",
                serde_json::to_string_pretty(&tool_call.function.arguments)?);
        }
    }

    Ok(())
}
```
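After the model returns tool calls, the application has to run the matching function itself. The following is a minimal dispatch sketch that uses a simplified stand-in `ToolCall` struct with string arguments instead of the crate's types. The struct, the `dispatch` helper, and the stubbed weather reply are all assumptions for illustration.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a tool call (hypothetical, not the crate's type).
struct ToolCall {
    name: String,
    arguments: HashMap<String, String>,
}

/// Stub handler; a real one would query a weather service.
fn get_current_weather(location: &str, format: &str) -> String {
    format!("stub weather for {location} ({format})")
}

/// Routes a tool call to its handler by name and checks required arguments.
fn dispatch(call: &ToolCall) -> Result<String, String> {
    match call.name.as_str() {
        "get_current_weather" => {
            let location = call
                .arguments
                .get("location")
                .ok_or("missing argument: location")?;
            let format = call
                .arguments
                .get("format")
                .ok_or("missing argument: format")?;
            Ok(get_current_weather(location, format))
        }
        other => Err(format!("unknown tool: {other}")),
    }
}

fn main() {
    let mut arguments = HashMap::new();
    arguments.insert("location".to_string(), "Tokyo".to_string());
    arguments.insert("format".to_string(), "celsius".to_string());
    let call = ToolCall {
        name: "get_current_weather".to_string(),
        arguments,
    };
    match dispatch(&call) {
        Ok(result) => println!("{result}"),
        Err(e) => eprintln!("tool call failed: {e}"),
    }
}
```

Rejecting unknown tool names and missing arguments explicitly is a deliberate choice here, since model output is not guaranteed to match the declared schema.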
### OpenAI-Compatible Chat
```rust
use oai_sdk::{ModelClient, ChatCompletionsRequest, ChatMessage};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    let request = ChatCompletionsRequest {
        model: "llama3.1:8b".to_string(),
        messages: vec![ChatMessage::user("Why is the sky blue?")],
        stream: Some(false),
        ..Default::default()
    };

    let response = client.chat_completions(request).await?;
    println!("{}", response.choices[0].message.content);

    Ok(())
}
```
### Model Management
```rust
use oai_sdk::{ModelClient, ShowModelRequest, CopyModelRequest, DeleteModelRequest};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    let models = client.list_models().await?;
    for model in models {
        println!("Model: {} ({})", model.name, model.details.parameter_size);
    }

    let request = ShowModelRequest {
        model: "llama3.1:8b".to_string(),
        verbose: Some(true),
    };
    let info = client.show_model(request).await?;
    println!("Model info: {:?}", info);

    let copy_req = CopyModelRequest {
        source: "llama3.1:8b".to_string(),
        destination: "llama3-backup".to_string(),
    };
    client.copy_model(copy_req).await?;
    println!("Model copied successfully");

    let delete_req = DeleteModelRequest {
        model: "llama3-backup".to_string(),
    };
    client.delete_model(delete_req).await?;
    println!("Model deleted successfully");

    Ok(())
}
```
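Model references such as `llama3.1:8b` combine a name and a tag. The following is a minimal sketch of splitting one, assuming Ollama's usual convention that a missing tag means `latest`. `split_model_ref` is an illustrative helper, not crate API.

```rust
/// Splits a model reference into `(name, tag)`, defaulting the tag to "latest".
fn split_model_ref(reference: &str) -> (&str, &str) {
    match reference.rsplit_once(':') {
        // A '/' after the last ':' means the colon belonged to a host:port
        // prefix rather than to a tag.
        Some((_, tag)) if tag.contains('/') => (reference, "latest"),
        // A trailing ':' with nothing after it is treated as "no tag".
        Some((name, "")) => (name, "latest"),
        Some((name, tag)) => (name, tag),
        None => (reference, "latest"),
    }
}

fn main() {
    for reference in ["llama3.1:8b", "llama3-backup"] {
        let (name, tag) = split_model_ref(reference);
        println!("{reference} -> name={name}, tag={tag}");
    }
}
```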
### Model Lifecycle (Load/Unload)
Requires the `local` feature: `cargo add ollama-api-rs --features local`
```rust
use oai_sdk::ModelClient;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    client.load_model("llama3.1:8b").await?;
    println!("Model loaded into memory");

    client.unload_model("llama3.1:8b").await?;
    println!("Model unloaded from memory");

    Ok(())
}
```
### Blob Management
Requires the `local` feature: `cargo add ollama-api-rs --features local`
```rust
use oai_sdk::ModelClient;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    let digest = "sha256:abc123...";
    let exists = client.blob_exists(digest).await?;
    println!("Blob exists: {}", exists);

    let content = b"model blob content";
    client.push_blob(digest, content).await?;
    println!("Blob pushed successfully");

    Ok(())
}
```
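The digest passed to `blob_exists` and `push_blob` identifies a blob by its SHA-256 hash. The following is a minimal sketch of checking the format before making a request, assuming the `sha256:` prefix followed by 64 hexadecimal characters that Ollama uses for blob digests. `is_valid_digest` is an illustrative helper, not crate API, and it only checks the shape of the string, not that the hash matches the content.

```rust
/// Returns true if `digest` looks like `sha256:` plus 64 hex characters.
fn is_valid_digest(digest: &str) -> bool {
    match digest.strip_prefix("sha256:") {
        Some(hex) => hex.len() == 64 && hex.bytes().all(|b| b.is_ascii_hexdigit()),
        None => false,
    }
}

fn main() {
    // A syntactically valid placeholder digest (not a real blob hash).
    let placeholder = format!("sha256:{}", "0".repeat(64));
    println!("{placeholder}: {}", is_valid_digest(&placeholder));
    println!("sha256:abc123: {}", is_valid_digest("sha256:abc123"));
}
```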
## API Coverage
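| Endpoint | Client method(s) | Module | Feature |
|----------|------------------|--------|---------|
| `POST /api/chat` | `chat()`, `chat_stream()` | `chat` | default |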
| `POST /api/generate` | `generate()`, `generate_stream()` | `generate` | default |
| `POST /api/embed` | `embed()` | `embed` | default |
| `POST /api/embeddings` | `embeddings()` | `embed` | default |
| `GET /api/tags` | `list_models()` | `model` | default |
| `POST /api/show` | `show_model()` | `model` | default |
| `POST /api/copy` | `copy_model()` | `model` | default |
| `DELETE /api/delete` | `delete_model()` | `model` | default |
| `POST /api/pull` | `pull_model()` | `model` | default |
| `POST /api/push` | `push_model()` | `model` | default |
| `POST /api/create` | `create_model()` | `model` | default |
| `GET /api/ps` | `list_running_models()` | `model` | `local` |
| `GET /api/version` | `get_version()` | `client` | default |
| `HEAD /api/blobs/:digest` | `blob_exists()` | `client` | `local` |
| `POST /api/blobs/:digest` | `push_blob()` | `client` | `local` |
| `POST /v1/chat/completions` | `chat_completions()` | `openai` | default |
| `POST /v1/embeddings` | `openai_embeddings()` | `openai` | default |
| `POST /v1/responses` | `responses()` | `openai` | default |
### Model Lifecycle (requires `local` feature)
The following methods are available when the `local` feature is enabled:
- `load_model()` / `unload_model()` - Load a model into memory or unload it from memory
## Modules
The crate is organized into the following modules:
- `chat` - Chat completion functionality (with streaming and tool support)
- `generate` - Text generation functionality (with streaming support)
- `embed` - Embeddings functionality (single and batch)
- `model` - Model management functionality (CRUD, pull, push)
- `openai` - OpenAI-compatible endpoints (chat, embeddings, responses)
- `client` - Core client functionality, blob management, and model lifecycle
- `error` - Error types and handling
## Examples
See the [examples](./examples) directory for more comprehensive examples:
- `basic_chat.rs` - Simple chat interface
- `streaming_chat.rs` - Streaming chat responses
- `embeddings.rs` - Generating embeddings with the modern API
- `model_management.rs` - Managing models (list, show, copy, delete)
- `model_lifecycle.rs` - Loading and unloading models into memory (requires `local`)
- `tool_calling.rs` - Using tool calling functionality
- `openai_compatibility.rs` - Using OpenAI-compatible endpoints
## Testing
Run the tests with:
```bash
cargo test
```
The tests include both integration tests that require a running Ollama instance and mock tests that don't.
For E2E tests against a real Ollama instance:
```bash
cargo test --test e2e_test -- --ignored
```
## License
Apache 2.0
## Author
Victor Palade <victor@cloudflavor.io>
Website: https://cloudflavor.io