§Library Name Note
This library is published as ollama-api-rs on crates.io, but it is imported under the crate name oai_sdk.
Users should write use oai_sdk::{ModelClient, ChatRequest, Message};
§Features
- Async/await support - Built on top of Tokio for efficient async operations
- Easy configuration - Simple client setup with ModelClient::builder()
- Streaming responses - Real-time streaming for both chat and generation
- Full Ollama API compatibility - Complete coverage of all Ollama API endpoints
- Modular design - Separate modules for chat, generate, embed, and model operations
- Comprehensive error handling - Custom error types with detailed context
- Tool calling - Support for function/tool calling in chat completions
- Structured outputs - JSON schema validation support for responses
- Model lifecycle management - Load/unload models programmatically
- Blob management - Push and check model blobs
- Batch embeddings - Efficient batch processing for embeddings
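
The "Easy configuration" item refers to the builder pattern used by ModelClient::builder(). As a rough illustration of how such a builder can work, here is a minimal std-only sketch. SketchClient, SketchClientBuilder, the timeout setting, and the default values are all hypothetical stand-ins and are not taken from this crate; only the builder(), base_url(), and build() call shape mirrors the examples in this documentation.

```rust
use std::time::Duration;

/// Hypothetical client type (not the crate's `ModelClient`).
#[derive(Debug, Clone, PartialEq)]
pub struct SketchClient {
    base_url: String,
    timeout: Duration,
}

/// Hypothetical builder: every setting is optional until `build()`.
#[derive(Debug, Default)]
pub struct SketchClientBuilder {
    base_url: Option<String>,
    timeout: Option<Duration>,
}

impl SketchClient {
    /// Entry point, following the `Type::builder()` convention.
    pub fn builder() -> SketchClientBuilder {
        SketchClientBuilder::default()
    }

    pub fn base_url(&self) -> &str {
        &self.base_url
    }

    pub fn timeout(&self) -> Duration {
        self.timeout
    }
}

impl SketchClientBuilder {
    /// Each setter takes and returns `self` so calls can be chained.
    pub fn base_url(mut self, url: impl Into<String>) -> Self {
        self.base_url = Some(url.into());
        self
    }

    /// Assumed setting, included only to show a second option.
    pub fn timeout(mut self, timeout: Duration) -> Self {
        self.timeout = Some(timeout);
        self
    }

    /// Validates the settings and fills in defaults (assumed values).
    pub fn build(self) -> Result<SketchClient, String> {
        let base_url = self
            .base_url
            .unwrap_or_else(|| "http://localhost:11434".to_string());
        if !(base_url.starts_with("http://") || base_url.starts_with("https://")) {
            return Err(format!("invalid base URL: {base_url}"));
        }
        Ok(SketchClient {
            // Trim trailing slashes so endpoint paths can be appended uniformly.
            base_url: base_url.trim_end_matches('/').to_string(),
            timeout: self.timeout.unwrap_or(Duration::from_secs(30)),
        })
    }
}
```

Returning a Result from build() is what lets callers write .build()? as the examples in this documentation do: a misconfigured client is reported once, at construction time, instead of on the first request.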
§Examples
§Basic Chat Completion
use oai_sdk::{ModelClient, ChatRequest, Message};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Connect to a local Ollama server.
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    // Non-streaming request: the whole reply arrives in one response.
    let request = ChatRequest {
        model: "llama3.1:8b".to_string(),
        messages: vec![Message::user("Why is the sky blue?")],
        stream: false,
        ..Default::default()
    };

    let response = client.chat(request).await?;
    println!("{}", response.message.content);
    Ok(())
}

§Streaming Chat
use oai_sdk::{ModelClient, ChatRequest, Message};
use tokio_stream::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    let request = ChatRequest {
        model: "llama3.1:8b".to_string(),
        messages: vec![Message::user("Write a story about Rust")],
        stream: true,
        ..Default::default()
    };

    // Each stream item is a Result carrying the next chunk of the reply.
    let mut stream = client.chat_stream(request).await?;
    while let Some(result) = stream.next().await {
        match result {
            Ok(response) => print!("{}", response.message.content),
            Err(e) => eprintln!("Error: {}", e),
        }
    }
    Ok(())
}

§Text Generation
use oai_sdk::{ModelClient, GenerateRequest};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    // Single-prompt generation, without a chat message history.
    let request = GenerateRequest {
        model: "llama3.1:8b".to_string(),
        prompt: "Why is the sky blue?".to_string(),
        ..Default::default()
    };

    let response = client.generate(request).await?;
    println!("{}", response.response);
    Ok(())
}

§Embeddings
use oai_sdk::{ModelClient, EmbedRequest, EmbedInput};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    // Embed a single input string.
    let request = EmbedRequest {
        model: "llama3:8b".to_string(),
        input: EmbedInput::Single("Hello, world!".to_string()),
        truncate: Some(true),
        ..Default::default()
    };

    let response = client.embed(request).await?;
    println!("Embeddings: {:?}", response.embeddings);
    Ok(())
}

§Tool Calling
use oai_sdk::{ModelClient, ChatRequest, Message, Tool, ToolFunction};
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    // Describe the function and give a JSON Schema for its parameters.
    let tools = vec![
        Tool {
            tool_type: "function".to_string(),
            function: ToolFunction {
                name: "get_current_weather".to_string(),
                description: "Get the current weather for a location".to_string(),
                parameters: json!({
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "The location to get the weather for"
                        },
                        "format": {
                            "type": "string",
                            "enum": ["celsius", "fahrenheit"]
                        }
                    },
                    "required": ["location", "format"]
                }),
            }
        }
    ];

    let request = ChatRequest {
        model: "llama3.1:8b".to_string(),
        messages: vec![Message::user("What is the weather in Tokyo?")],
        tools: Some(tools),
        ..Default::default()
    };

    // Print the name of each tool call the model returned, if any.
    let response = client.chat(request).await?;
    if let Some(tool_calls) = response.message.tool_calls {
        for tool_call in tool_calls {
            println!("Tool call: {}", tool_call.function.name);
        }
    }
    Ok(())
}

§Model Management
use oai_sdk::{ModelClient, ShowModelRequest, CopyModelRequest, DeleteModelRequest};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    // List the available models.
    let models = client.list_models().await?;
    for model in models {
        println!("Model: {}", model.name);
    }

    // Show detailed information about one model.
    let request = ShowModelRequest {
        model: "llama3.1:8b".to_string(),
        verbose: Some(true),
    };
    let info = client.show_model(request).await?;
    println!("Model info: {:?}", info);

    // Copy the model under a new name.
    let copy_req = CopyModelRequest {
        source: "llama3.1:8b".to_string(),
        destination: "llama3-backup".to_string(),
    };
    client.copy_model(copy_req).await?;

    // Delete the copy again.
    let delete_req = DeleteModelRequest {
        model: "llama3-backup".to_string(),
    };
    client.delete_model(delete_req).await?;
    Ok(())
}

§OpenAI-Compatible Endpoints
use oai_sdk::{ModelClient, ChatCompletionsRequest, ChatMessage};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    // Same question as the basic chat example, sent in the OpenAI-style request shape.
    let request = ChatCompletionsRequest {
        model: "llama3.1:8b".to_string(),
        messages: vec![ChatMessage::user("Why is the sky blue?")],
        stream: Some(false),
        ..Default::default()
    };

    let response = client.chat_completions(request).await?;
    println!("{}", response.choices[0].message.content);
    Ok(())
}

§Model Lifecycle (requires local feature)
use oai_sdk::ModelClient;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = ModelClient::builder()
        .base_url("http://localhost:11434")
        .build()?;

    // Load the model, then unload it again.
    client.load_model("llama3.1:8b").await?;
    println!("Model loaded");
    client.unload_model("llama3.1:8b").await?;
    println!("Model unloaded");
    Ok(())
}

§API Modules
- chat - Chat completion with streaming and tool support
- generate - Text generation with streaming support
- embed - Single and batch embeddings
- model - Model management (CRUD, pull, push, running models)
- openai - OpenAI-compatible endpoints (chat, embeddings, responses)
- client - Core client, blob management, model lifecycle
- error - Error types and handling
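
The error module is described only briefly here. As a rough illustration of what "custom error types with detailed context" usually means in Rust, here is a std-only sketch. SketchError, its variants, and check_status are hypothetical and are not taken from this crate's OllamaError; the sketch only shows the general pattern of an error enum that implements Display and std::error::Error.

```rust
use std::fmt;
use std::num::ParseIntError;

/// Hypothetical error type; the variants are illustrative only.
#[derive(Debug)]
pub enum SketchError {
    /// The server answered with a non-success status.
    Api { status: u16, message: String },
    /// A value could not be parsed.
    Parse(ParseIntError),
}

impl fmt::Display for SketchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            SketchError::Api { status, message } => {
                write!(f, "API error {status}: {message}")
            }
            SketchError::Parse(e) => write!(f, "parse error: {e}"),
        }
    }
}

impl std::error::Error for SketchError {
    /// Exposes the underlying cause, if there is one.
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            SketchError::Parse(e) => Some(e),
            SketchError::Api { .. } => None,
        }
    }
}

impl From<ParseIntError> for SketchError {
    fn from(e: ParseIntError) -> Self {
        SketchError::Parse(e)
    }
}

/// Parses a status line such as "404 model not found" and turns
/// codes outside 200..300 into `SketchError::Api`.
pub fn check_status(line: &str) -> Result<u16, SketchError> {
    let (code, message) = line.split_once(' ').unwrap_or((line, ""));
    let status: u16 = code.parse()?; // ParseIntError -> SketchError via From
    if (200..300).contains(&status) {
        Ok(status)
    } else {
        Err(SketchError::Api {
            status,
            message: message.to_string(),
        })
    }
}
```

Because such a type implements std::error::Error, it converts into Box<dyn std::error::Error> automatically, which is why the examples in this documentation can use ? inside a main function returning Result<(), Box<dyn std::error::Error>>.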
Structs§
- ChatCompletionsRequest - Request for chat completions
- ChatCompletionsResponse - Response for chat completions
- ChatMessage - A chat message
- ChatRequest - Request for chat completion.
- ChatResponse - Response for chat completion.
- CopyModelRequest - Request for copying a model.
- CreateModelRequest - Request for creating a model.
- DeleteModelRequest - Request for deleting a model.
- EmbedRequest - Request for embeddings.
- EmbedResponse - Response for embeddings.
- EmbeddingsRequest - Request for legacy embeddings.
- EmbeddingsResponse - Response for legacy embeddings.
- GenerateRequest - Request for text generation.
- GenerateResponse - Response for text generation.
- ListModelsResponse - Response for listing models.
- ListRunningModelsResponse - Response for listing running models.
- Message - A message in a chat.
- ModelClient - A client for interacting with the Ollama API.
- ModelClientBuilder - A builder for creating a ModelClient.
- ModelDetails - Details about a model.
- ModelInfo - Information about a model.
- OpenAIEmbedding - Embedding vector
- OpenAIEmbeddingsRequest - Request for embeddings
- OpenAIEmbeddingsResponse - Response for embeddings
- PullModelRequest - Request for pulling a model.
- PushModelRequest - Request for pushing a model.
- ResponsesRequest - Request for responses endpoint
- ResponsesResponse - Response for responses endpoint
- RunningModel - A running model.
- ShowModelRequest - Request for showing model information.
- ShowModelResponse - Response for showing model information.
- StatusResponse - Status response for streaming operations.
- Tool - A tool that can be used by the model.
- ToolCall - A tool call.
- ToolCallFunction - A tool call function.
- ToolFunction - A tool function.
- VersionResponse - Response for version information.
Enums§
- EmbedInput - Input for embeddings.
- Format - Format for the response.
- License - License information.
- OllamaError - Errors that can occur when using the Ollama client
- OpenAIEmbeddingsInput - Input for embeddings
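
Input enums such as EmbedInput let a single request field accept either one text or several. The following std-only sketch shows one way such an enum can be modelled. SketchInput and its Batch variant are hypothetical names: only a Single variant appears in the examples in this documentation, and the crate's actual definition may differ.

```rust
/// Hypothetical stand-in for an embeddings input type.
#[derive(Debug, Clone, PartialEq)]
pub enum SketchInput {
    /// One text to embed.
    Single(String),
    /// Several texts to embed in one request (assumed variant name).
    Batch(Vec<String>),
}

impl SketchInput {
    /// Views either variant as a list of texts, so downstream code
    /// does not need to care which one was supplied.
    pub fn texts(&self) -> Vec<&str> {
        match self {
            SketchInput::Single(s) => vec![s.as_str()],
            SketchInput::Batch(items) => items.iter().map(String::as_str).collect(),
        }
    }
}

impl From<&str> for SketchInput {
    fn from(s: &str) -> Self {
        SketchInput::Single(s.to_string())
    }
}

impl From<Vec<String>> for SketchInput {
    fn from(items: Vec<String>) -> Self {
        SketchInput::Batch(items)
    }
}
```

With From implementations like these, a caller could write SketchInput::from("Hello, world!") for one text or pass a Vec<String> for a batch, and the code that consumes the input would treat both cases the same way through texts().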
Type Aliases§
- Result - Result type alias for Ollama client operations
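
A crate-level Result alias is a common Rust convention: it fixes the error type so that function signatures only name the success type. The sketch below shows the pattern with std only. The sketch module, its Error enum, and validate_model_name are hypothetical and stand in for this crate's items; the real alias presumably uses OllamaError as its error type, but that definition is not shown in this documentation.

```rust
pub mod sketch {
    use std::fmt;

    /// Hypothetical error type standing in for the crate's error enum.
    #[derive(Debug, PartialEq)]
    pub enum Error {
        EmptyModelName,
    }

    impl fmt::Display for Error {
        fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
            match self {
                Error::EmptyModelName => write!(f, "model name must not be empty"),
            }
        }
    }

    impl std::error::Error for Error {}

    /// The alias: the error type is fixed, only the success type varies.
    pub type Result<T> = std::result::Result<T, Error>;

    /// Hypothetical helper that returns the alias instead of spelling
    /// out `std::result::Result<&str, Error>`.
    pub fn validate_model_name(name: &str) -> Result<&str> {
        let trimmed = name.trim();
        if trimmed.is_empty() {
            Err(Error::EmptyModelName)
        } else {
            Ok(trimmed)
        }
    }
}
```

Keeping the alias inside its module, and referring to it as sketch::Result<T>, avoids shadowing the standard Result in caller code. That matters for functions like the main functions in the examples in this documentation, which return Result<(), Box<dyn std::error::Error>> with two type parameters.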