Open Agent SDK (Rust)
Build production-ready AI agents in Rust using your own hardware
What you can build:
- Copy editors that analyze manuscripts and track writing patterns
- Git commit generators that write meaningful commit messages
- Market analyzers that research competitors and summarize findings
- Code reviewers, data analysts, research assistants, and more
Why local?
- No API costs - use your hardware, not OpenAI's
- Privacy - your data never leaves your machine
- Control - pick your model (Qwen, Llama, Mistral, etc.)
How fast? From zero to working agent in under 5 minutes. Rust-native performance (zero-cost abstractions, no GC), fearless concurrency, and production-ready quality with 85+ tests.
Overview
Open Agent SDK (Rust) provides a clean, streaming API for working with OpenAI-compatible local model servers. 100% feature parity with the Python SDK—complete with streaming, tool call aggregation, hooks, and automatic tool execution—built on Tokio for high-performance async I/O.
Supported Providers
Supported (OpenAI-Compatible Endpoints)
- LM Studio - http://localhost:1234/v1
- Ollama - http://localhost:11434/v1
- llama.cpp server - OpenAI-compatible mode
- vLLM - OpenAI-compatible API
- Text Generation WebUI - OpenAI extension
- Any OpenAI-compatible local endpoint
- Local gateways proxying cloud models - e.g., Ollama or custom gateways that route to cloud providers
Note on LM Studio: LM Studio is particularly well-tested with this SDK and provides reliable OpenAI-compatible API support. If you're looking for a user-friendly local model server with excellent compatibility, it's highly recommended.
Not Supported (Use Official SDKs)
- Claude/OpenAI direct - Use their official SDKs, unless you proxy through a local OpenAI-compatible gateway
- Cloud provider SDKs - Bedrock, Vertex, Azure, etc. (proxied via local gateway is fine)
Quick Start
Installation
[dependencies]
# Crate names reconstructed from this README; verify against crates.io
open-agent-sdk = "0.1.0"
tokio = { version = "1", features = ["full"] }
futures = "0.3"
serde_json = "1.0"
For development:
Simple Query (LM Studio)
use open_agent_sdk::{query, AgentOptions};
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Reconstructed sketch; module paths and signatures follow the
    // API reference in this README — check the crate docs for exact names.
    let options = AgentOptions::builder()
        .model("qwen/qwen3-30b")
        .base_url("http://localhost:1234/v1")
        .build()?;

    let mut stream = query("Hello!", options).await?;
    while let Some(block) = stream.next().await {
        println!("{block:?}");
    }
    Ok(())
}
Multi-Turn Conversation (Ollama)
use open_agent_sdk::{AgentOptions, Client};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Reconstructed sketch; see the API reference for exact names.
    let options = AgentOptions::builder()
        .model("llama3.1")
        .base_url("http://localhost:11434/v1")
        .build()?;

    let mut client = Client::new(options)?;
    client.send("What is Rust?").await?;
    while let Some(block) = client.receive().await {
        println!("{block:?}");
    }
    Ok(())
}
Function Calling with Tools
Define tools using the builder pattern for clean, type-safe function calling:
use open_agent_sdk::{tool, AgentOptions, Client};
use serde_json::json; // used by tool handlers

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Reconstructed sketch; exact builder signatures may differ.
    let weather = tool("get_weather", "Get the current weather for a city")
        .param("city", "string", "City name")
        .build();

    let options = AgentOptions::builder()
        .model("qwen/qwen3-30b")
        .base_url("http://localhost:1234/v1")
        .tool(weather)
        .auto_execute_tools(true)
        .build()?;

    let mut client = Client::new(options)?;
    client.send("What's the weather in Paris?").await?;
    while let Some(block) = client.receive().await {
        println!("{block:?}");
    }
    Ok(())
}
Advanced: Manual Tool Execution
For custom execution logic or result interception:
// Disable auto-execution
let options = AgentOptions::builder()
    .system_prompt("You are a helpful assistant.")
    .model("qwen/qwen3-30b")
    .base_url("http://localhost:1234/v1")
    .tool(my_tool)
    .auto_execute_tools(false) // Manual mode
    .build()?;

let mut client = Client::new(options)?;
client.send("What is 25 * 17?").await?;
while let Some(block) = client.receive().await {
    // Inspect ToolUseBlock items, run the tool yourself,
    // and feed results back as ToolResult blocks.
}
Key Features:
- Automatic execution - Tools run automatically with safety limits
- Type-safe schemas - Automatic JSON schema generation from parameters
- OpenAI-compatible - Works with any OpenAI function calling endpoint
- Clean builder API - Fluent API for tool definition
- Hook integration - PreToolUse/PostToolUse hooks work in both modes
See examples/calculator_tools.rs and examples/auto_execution_demo.rs for complete examples.
Multimodal Vision Support
Send images alongside text to vision-capable models like llava, qwen-vl, or minicpm-v. The SDK handles OpenAI Vision API formatting automatically.
Simple Image + Text
use open_agent_sdk::{ImageBlock, ImageDetail, Message};

// Reconstructed sketch; constructor names come from this README,
// argument shapes are illustrative.

// From URL
let msg = Message::user_with_image("Describe this image", "https://example.com/photo.jpg")?;
client.send_message(msg).await?;

// From local file path (NEW!)
let msg = Message::new(vec![
    "What's in this screenshot?".into(),
    ImageBlock::from_file_path("screenshot.png")?.into(),
]);
client.send_message(msg).await?;

// From base64 data
let msg = Message::user_with_base64_image("Describe this", base64_data, "image/png")?;
client.send_message(msg).await?;

// Control detail level for token costs
let msg = Message::user_with_image_detail("Describe this", url, ImageDetail::Low)?;
client.send_message(msg).await?;
Supported Image Sources:
- ImageBlock::from_url(url) - HTTPS/HTTP URLs
- ImageBlock::from_file_path(path) - Local filesystem (automatically encodes as base64)
  - Supports: .jpg, .jpeg, .png, .gif, .webp, .bmp, .svg
  - MIME type inferred from file extension
  - File is read and encoded automatically
- ImageBlock::from_base64(data, mime) - Manual base64 with explicit MIME type
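The extension-to-MIME mapping that from_file_path is described to perform can be sketched with a simple match. This is a conventional mapping, not the SDK's actual table:

```rust
/// Infer an image MIME type from a file extension, as `from_file_path`
/// is described to do. The mapping below follows common conventions
/// and is an assumption, not the SDK's exact implementation.
fn mime_from_extension(path: &str) -> Option<&'static str> {
    let ext = path.rsplit('.').next()?.to_ascii_lowercase();
    match ext.as_str() {
        "jpg" | "jpeg" => Some("image/jpeg"),
        "png" => Some("image/png"),
        "gif" => Some("image/gif"),
        "webp" => Some("image/webp"),
        "bmp" => Some("image/bmp"),
        "svg" => Some("image/svg+xml"),
        _ => None,
    }
}

fn main() {
    println!("{:?}", mime_from_extension("photo.PNG"));
}
```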
Token Cost Management
Control image processing costs using ImageDetail levels:
- ImageDetail::Low - Lower resolution (typically more cost-effective)
- ImageDetail::High - Higher resolution (typically more detailed analysis)
- ImageDetail::Auto - Model decides (balanced default)
⚠️ Token Costs Vary by Model:
OpenAI's Vision API uses ~85 tokens (Low) and variable tokens based on dimensions (High), but local models may have completely different token costs—or no token costs for images at all. The ImageDetail setting may even be ignored by some models.
Always benchmark your specific model instead of relying on OpenAI's published values for capacity planning.
Complex Multi-Image Messages
use open_agent_sdk::{ContentBlock, ImageBlock, Message};

// Reconstructed sketch: one user message mixing text and several images.
// Constructor names are illustrative; check the crate docs.
let msg = Message::new(vec![
    ContentBlock::text("Compare these two screenshots:"),
    ContentBlock::image(ImageBlock::from_file_path("before.png")?),
    ContentBlock::image(ImageBlock::from_file_path("after.png")?),
]);
client.send_message(msg).await?;
Key Features:
- send_message() API - Send pre-built messages with images via client.send_message(msg).await?
- Automatic serialization - Images converted to OpenAI Vision API format
- Multiple sources - URLs, local file paths, or base64 data
- Backward compatible - Text-only messages still work with send("text")
- Data URIs supported - Base64-encoded images transmitted seamlessly
- Token cost control - Choose detail level based on use case
See examples/vision_example.rs for comprehensive working examples including local file paths.
Context Management
Local models have fixed context windows (typically 8k-32k tokens). The SDK provides utilities for manual history management—no silent mutations, you stay in control.
Token Estimation & Truncation
use open_agent_sdk::{estimate_tokens, truncate_messages, Client};

// Reconstructed sketch; see examples/context_management.rs for real usage.
let mut client = Client::new(options)?;

// Long conversation...
for i in 0..50 {
    client.send(format!("Question {i}")).await?;
    while let Some(_block) = client.receive().await {}
}

// Check token usage
let tokens = estimate_tokens(client.history());
println!("Estimated tokens: {tokens}");

// Manually truncate when needed
if tokens > 28000 {
    *client.history_mut() = truncate_messages(client.history(), 20);
}
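Token estimation for unknown tokenizers is usually a character-count heuristic — roughly four characters per token for English-like text. The sketch below shows that heuristic; it is an assumption about how such utilities commonly work, not the SDK's exact algorithm:

```rust
/// Rough token estimate: ~4 characters per token for English-like text.
/// Heuristic sketch only; real tokenizers vary by model family.
fn rough_token_estimate(text: &str) -> usize {
    // Ceiling division so any non-empty string counts as at least one token
    (text.chars().count() + 3) / 4
}

fn main() {
    let history = "User: summarize this file.\nAssistant: here is a summary...";
    println!("~{} tokens", rough_token_estimate(history));
}
```

Because estimates like this drift from the real tokenizer, leave headroom (the 28000-of-32k threshold above) rather than truncating at the exact window size.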
Recommended Patterns
1. Stateless Agents (Best for single-task agents):
// Process each task independently - no history accumulation
for task in tasks {
    let mut stream = query(task, options.clone()).await?;
    while let Some(_block) = stream.next().await {}
}
2. Manual Truncation (At natural breakpoints):
use open_agent_sdk::truncate_messages;

let mut client = Client::new(options)?;
for task in tasks {
    client.send(task).await?;
    while let Some(_block) = client.receive().await {}
    // Truncate at a natural breakpoint, keeping recent messages
    *client.history_mut() = truncate_messages(client.history(), 20);
}
3. External Memory (RAG-lite for research agents):
// Store important facts in a database, keep conversation context small
let mut database = HashMap::new();
let mut client = Client::new(options)?;

client.send("Research topic X and summarize the key facts").await?;
let response = collect_text(&mut client).await?; // hypothetical helper that drains the stream into a String

// Save response to database
database.insert("topic_x", response);

// Clear history, query the database when needed
*client.history_mut() = truncate_messages(client.history(), 2);
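A typical truncation strategy keeps the system prompt plus the most recent N messages. A stdlib-only sketch of that idea, using a hypothetical Msg type rather than the SDK's real message struct:

```rust
#[derive(Clone, Debug, PartialEq)]
struct Msg {
    role: String,
    content: String,
}

/// Keep the leading system message (if any) plus the last `keep` messages.
/// Sketch of one reasonable strategy, not the SDK's exact `truncate_messages`.
fn truncate_keep_system(history: &[Msg], keep: usize) -> Vec<Msg> {
    let mut out = Vec::new();
    let mut rest = history;
    if let Some(first) = history.first() {
        if first.role == "system" {
            out.push(first.clone());
            rest = &history[1..];
        }
    }
    // saturating_sub avoids underflow when history is shorter than `keep`
    let start = rest.len().saturating_sub(keep);
    out.extend_from_slice(&rest[start..]);
    out
}

fn main() {
    let mut history = vec![Msg { role: "system".into(), content: "Be terse.".into() }];
    for i in 0..5 {
        history.push(Msg { role: "user".into(), content: format!("q{i}") });
    }
    println!("{} messages kept", truncate_keep_system(&history, 2).len());
}
```

Keeping the system message is the important detail: dropping it silently changes agent behavior mid-conversation.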
Why Manual?
The SDK intentionally does not auto-compact history because:
- Domain-specific needs: Copy editors need different strategies than research agents
- Token accuracy varies: Each model family has different tokenizers
- Risk of breaking context: Silently removing messages could break tool chains
- Natural limits exist: Compaction doesn't bypass model context windows
See examples/context_management.rs for complete patterns and usage.
Lifecycle Hooks
Monitor and control agent behavior at key execution points with zero-cost Rust hooks.
Quick Example
use open_agent_sdk::{AgentOptions, Client, HookDecision, Hooks};

// Reconstructed sketch; see examples/hooks_example.rs for exact signatures.

// Security gate - block dangerous operations
let hooks = Hooks::new()
    .add_pre_tool_use(|tool_name, _input| {
        if tool_name == "delete_file" {
            return Some(HookDecision::block("destructive operation"));
        }
        None
    })
    .add_post_tool_use(|tool_name, _result| {
        println!("tool {tool_name} finished");
        None
    });

// Register hooks in AgentOptions
let options = AgentOptions::builder()
    .system_prompt("You are a careful assistant.")
    .model("qwen/qwen3-30b")
    .base_url("http://localhost:1234/v1")
    .hooks(hooks)
    .build()?;

let mut client = Client::new(options)?;
Hook Types
PreToolUse - Fires before tool execution
- Block operations: Return Some(HookDecision::block(reason))
- Modify inputs: Return Some(HookDecision::modify_input(json!({}), reason))
- Allow: Return Some(HookDecision::continue_())

PostToolUse - Fires after tool result added to history
- Observational (tool already executed)
- Use for audit logging, metrics, result validation
- Return None or Some(HookDecision::...)

UserPromptSubmit - Fires before sending prompt to API
- Block prompts: Return Some(HookDecision::block(reason))
- Modify prompts: Return Some(HookDecision::modify_prompt(text, reason))
- Allow: Return Some(HookDecision::continue_())
Common Patterns
Pattern 1: Redirect to Sandbox
// Rewrite file-path inputs so all writes land in a sandbox directory
// (sketch; exact closure signature may differ)
hooks.add_pre_tool_use(|_tool, input| {
    Some(HookDecision::modify_input(
        redirect_to_sandbox(input), // hypothetical path-rewriting helper
        "redirected to sandbox",
    ))
});
Pattern 2: Compliance Audit Log
let audit_log = Arc::new(Mutex::new(Vec::new()));
let log_clone = audit_log.clone();
hooks.add_post_tool_use(move |tool_name, _result| {
    // Record every tool invocation for compliance review
    log_clone.lock().unwrap().push(tool_name.to_string());
    None
});
Hook Execution Flow
- Hooks run sequentially in the order registered
- First non-None decision wins (short-circuit behavior)
- Hooks run inline on async runtime (spawn tasks for heavy work)
- Works with both Client and query() function
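The "first non-None decision wins" rule above is essentially Iterator::find_map over the registered hooks. A synchronous stdlib sketch with a stand-in Decision type (the SDK's real hooks are async and richer):

```rust
#[derive(Debug, PartialEq)]
enum Decision {
    Block(String),
    Continue,
}

// Boxed hook type standing in for the SDK's async hooks (sync for brevity)
type Hook = Box<dyn Fn(&str) -> Option<Decision>>;

/// Run hooks in registration order; the first Some(_) short-circuits.
fn first_decision(hooks: &[Hook], tool: &str) -> Option<Decision> {
    hooks.iter().find_map(|h| h(tool))
}

fn main() {
    let hooks: Vec<Hook> = vec![
        Box::new(|t| (t == "rm").then(|| Decision::Block("dangerous".into()))),
        Box::new(|_| Some(Decision::Continue)),
    ];
    println!("{:?}", first_decision(&hooks, "rm"));
}
```

find_map stops calling hooks as soon as one returns Some, which is exactly the short-circuit behavior described above.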
See examples/hooks_example.rs and examples/multi_tool_agent.rs for comprehensive patterns.
Interrupt Capability
Cancel long-running operations cleanly without corrupting client state. Perfect for timeouts, user cancellations, or conditional interruptions.
Interrupt Quick Example
use open_agent_sdk::{AgentOptions, Client};
use tokio::time::{sleep, Duration};

// Reconstructed sketch: interrupt a stream after a timeout.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let options = AgentOptions::builder()
        .model("qwen/qwen3-30b")
        .base_url("http://localhost:1234/v1")
        .build()?;

    let mut client = Client::new(options)?;
    client.send("Write a very long story").await?;

    sleep(Duration::from_secs(5)).await;
    client.interrupt(); // HTTP stream is closed immediately
    Ok(())
}
Common Interrupt Patterns
1. Conditional Interruption
let mut full_text = String::new();
while let Some(block) = client.receive().await {
    // Append streamed text; interrupt once a stop condition is met
    // (field access illustrative)
    if let ContentBlock::Text(t) = block {
        full_text.push_str(&t.text);
        if full_text.contains("STOP") {
            client.interrupt();
            break;
        }
    }
}
2. Concurrent Cancellation
use tokio::select;

let stream_task = async { /* drain client.receive() until None */ };
let cancel_task = async { /* wait for a user cancel signal or timeout */ };

select! {
    _ = stream_task => { /* stream finished normally */ }
    _ = cancel_task => { /* call interrupt() from the cancel path */ }
}
How It Works
When you call client.interrupt():
- Active stream closure - HTTP stream closed immediately (not just a flag)
- Clean state - Client remains in valid state for reuse
- Partial output - Text blocks flushed to history, incomplete tools skipped
- Idempotent - Safe to call multiple times
- Thread-safe - Can be called from separate async tasks
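Idempotent, thread-safe cancellation like this is typically modeled as an atomic flag shared between tasks. A stdlib sketch of those semantics (not the SDK's internals, which also close the HTTP stream):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Shared cancellation flag: cloneable across tasks, idempotent to set.
/// Stdlib sketch of the semantics described above, not the SDK's internals.
#[derive(Clone)]
struct InterruptHandle(Arc<AtomicBool>);

impl InterruptHandle {
    fn new() -> Self {
        InterruptHandle(Arc::new(AtomicBool::new(false)))
    }

    /// Returns true only for the call that actually flipped the flag;
    /// later calls are harmless no-ops.
    fn interrupt(&self) -> bool {
        !self.0.swap(true, Ordering::SeqCst)
    }

    fn is_interrupted(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

fn main() {
    let handle = InterruptHandle::new();
    let clone = handle.clone(); // could live on another async task
    clone.interrupt();
    println!("interrupted: {}", handle.is_interrupted());
}
```

The swap makes repeated calls safe (idempotent), and the Arc lets any task hold a handle — the two properties the list above promises.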
See examples/interrupt_demo.rs for comprehensive patterns.
Practical Examples
We've included production-ready agents that demonstrate real-world usage:
Git Commit Agent
examples/git_commit_agent.rs
Analyzes your staged git changes and writes professional commit messages following conventional commit format.
# Stage your changes
git add .

# Run the agent
cargo run --example git_commit_agent
# Output:
# Found staged changes in 3 file(s)
# Analyzing changes and generating commit message...
#
# Suggested commit message:
# feat(auth): Add OAuth2 integration with refresh tokens
#
# - Implement token refresh mechanism
# - Add secure cookie storage for tokens
# - Update login flow to support OAuth2 providers
Features:
- Analyzes diff to determine commit type (feat/fix/docs/etc)
- Writes clear, descriptive commit messages
- Follows conventional commit standards
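The header the agent emits has the conventional-commit shape type(scope): subject. A small formatting sketch of that shape (the type/scope names here are illustrative):

```rust
/// Format a conventional-commit header: type(scope): subject.
/// The scope is optional, matching the conventional commit spec.
fn commit_header(kind: &str, scope: Option<&str>, subject: &str) -> String {
    match scope {
        Some(s) => format!("{kind}({s}): {subject}"),
        None => format!("{kind}: {subject}"),
    }
}

fn main() {
    println!("{}", commit_header("feat", Some("auth"), "Add OAuth2 integration with refresh tokens"));
}
```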
Log Analyzer Agent
examples/log_analyzer_agent.rs
Intelligently analyzes application logs to identify patterns, errors, and provide actionable insights.
# Analyze a log file
cargo run --example log_analyzer_agent
Features:
- Automatic error pattern detection
- Time-based analysis (peak error times)
- Root cause suggestions
- Supports multiple log formats
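At its simplest, "automatic error pattern detection" starts with counting lines per severity before handing a summary to the model. A stdlib sketch of that pre-analysis step (the severity keywords are an assumption about common log formats):

```rust
use std::collections::HashMap;

/// Count log lines per severity level as a cheap pre-analysis step
/// before asking the model for root-cause suggestions.
fn count_levels(log: &str) -> HashMap<&'static str, usize> {
    let mut counts = HashMap::new();
    for line in log.lines() {
        // First matching level wins, so ERROR lines aren't double-counted
        for level in ["ERROR", "WARN", "INFO"] {
            if line.contains(level) {
                *counts.entry(level).or_insert(0) += 1;
                break;
            }
        }
    }
    counts
}

fn main() {
    let log = "2024-01-01 ERROR db down\n2024-01-01 INFO ok\n2024-01-01 ERROR db down";
    println!("{:?}", count_levels(log));
}
```

Summarizing counts locally and sending only the digest keeps large log files within the model's context window.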
Why These Examples?
These agents demonstrate:
- Practical Value: Solve real problems developers face daily
- Tool Integration: Show how to integrate with system commands (git, file I/O)
- Structured Output: Parse and format LLM responses for actionable results
- Privacy-First: Keep your code and logs local while getting AI assistance
Why Not Just Use OpenAI Client?
Without open-agent-sdk (raw reqwest):
use reqwest::Client;

let client = Client::new();
let response = client
    .post("http://localhost:1234/v1/chat/completions")
    .json(&request_body)
    .send()
    .await?;
// Complex parsing of SSE chunks
// Extract delta content
// Handle tool calls manually
// Track conversation state yourself
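The "complex parsing of SSE chunks" the raw approach requires looks roughly like this: each event arrives as a data: {...} line, terminated by data: [DONE] (the OpenAI-compatible convention). A minimal extraction sketch:

```rust
/// Pull the JSON payloads out of an SSE response body: each event is a
/// `data: {...}` line, and the stream ends with `data: [DONE]`
/// (the OpenAI-compatible convention). Real code parses chunks
/// incrementally as bytes arrive rather than from a full body.
fn sse_payloads(body: &str) -> Vec<&str> {
    body.lines()
        .filter_map(|line| line.strip_prefix("data: "))
        .take_while(|payload| *payload != "[DONE]")
        .collect()
}

fn main() {
    let body = "data: {\"choices\":[]}\n\ndata: [DONE]\n";
    println!("{} events", sse_payloads(body).len());
}
```

Each payload then still needs JSON parsing, delta accumulation, and tool-call aggregation — the boilerplate the SDK absorbs.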
With open-agent-sdk:
use open_agent_sdk::{query, AgentOptions};

let options = AgentOptions::builder()
    .system_prompt("You are a helpful assistant.")
    .model("qwen/qwen3-30b")
    .base_url("http://localhost:1234/v1")
    .build()?;

let mut stream = query("Hello!", options).await?;
// Clean message types (TextBlock, ToolUseBlock)
// Automatic streaming and tool call handling
Value: Familiar patterns + Less boilerplate + Rust performance
Why Rust?
Performance: Zero-cost abstractions mean no runtime overhead. Streaming responses with Tokio delivers throughput comparable to C/C++ while maintaining memory safety.
Safety: Compile-time guarantees prevent data races, null pointer dereferences, and buffer overflows. Your agents won't crash from memory issues.
Concurrency: Fearless concurrency with async/await lets you run multiple agents or handle hundreds of concurrent requests without fear of race conditions.
Production Ready: Strong type system catches bugs at compile time. Comprehensive error handling with Result types. No surprises in production.
Small Binaries: Standalone executables under 10MB. Deploy anywhere without runtime dependencies.
API Reference
AgentOptions
AgentOptions::builder()
    .system_prompt("...")        // System prompt
    .model("...")                // Model name (required)
    .base_url("...")             // OpenAI-compatible endpoint (required)
    .tool(my_tool)               // Add tools for function calling
    .hooks(hooks)                // Lifecycle hooks for monitoring/control
    .auto_execute_tools(true)    // Enable automatic tool execution
    .max_tool_iterations(5)      // Max tool calls per query in auto mode
    .max_tokens(1024)            // Tokens to generate (None = provider default)
    .temperature(0.7)            // Sampling temperature
    .timeout(60)                 // Request timeout in seconds
    .api_key("not-needed")       // API key (default: "not-needed")
    .build()?
query()
Simple single-turn query function.
pub async fn query(
    prompt: impl Into<String>,
    options: AgentOptions,
) -> Result<impl Stream<Item = ContentBlock>, Error> // signature illustrative
Returns a stream yielding ContentBlock items.
Client
Multi-turn conversation client with tool monitoring.
let mut client = Client::new(options)?;
client.send("Hello!").await?;
while let Some(block) = client.receive().await {
    // Handle ContentBlock items as they stream in
}
Message Types
- ContentBlock::Text(TextBlock) - Text content from model
- ContentBlock::ToolUse(ToolUseBlock) - Tool calls from model
- ContentBlock::ToolResult(ToolResultBlock) - Tool execution results
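Streaming consumers typically match on the block variant. A sketch with simplified stand-in types (the SDK's real blocks carry more fields):

```rust
// Simplified stand-ins for the SDK's richer block types (illustrative only)
enum Block {
    Text(String),
    ToolUse { name: String },
    ToolResult { output: String },
}

/// Typical consumer: match on the variant as blocks stream in.
fn describe(block: &Block) -> String {
    match block {
        Block::Text(t) => format!("text: {t}"),
        Block::ToolUse { name } => format!("tool call: {name}"),
        Block::ToolResult { output } => format!("tool result: {output}"),
    }
}

fn main() {
    println!("{}", describe(&Block::ToolUse { name: "get_weather".into() }));
}
```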
Tool System
use open_agent_sdk::tool;

// Sketch; exact parameter API may differ
let my_tool = tool("get_weather", "Get the current weather for a city")
    .param("city", "string", "City name")
    .build();
Recommended Models
Local models (LM Studio, Ollama, llama.cpp):
- GPT-OSS-120B - Best in class for speed and quality
- Qwen 3 30B - Excellent instruction following, good for most tasks
- GPT-OSS-20B - Solid all-around performance
- Mistral 7B - Fast and efficient for simple agents
Cloud-proxied via local gateway:
- kimi-k2:1t-cloud - Tested and working via Ollama gateway
- deepseek-v3.1:671b-cloud - High-quality reasoning model
- qwen3-coder:480b-cloud - Code-focused models
Project Structure
open-agent-sdk-rust/
├── src/
│ ├── client.rs # query() and Client implementation
│ ├── config.rs # Configuration builder
│ ├── context.rs # Token estimation and truncation
│ ├── error.rs # Error types
│ ├── hooks.rs # Lifecycle hooks
│ ├── lib.rs # Public exports
│ ├── retry.rs # Retry logic with exponential backoff
│ ├── tools.rs # Tool system
│ ├── types.rs # Core types (AgentOptions, ContentBlock, etc.)
│ └── utils.rs # SSE parsing and tool call aggregation
├── examples/
│ ├── simple_query.rs # Basic streaming query
│ ├── calculator_tools.rs # Function calling (manual mode)
│ ├── auto_execution_demo.rs # Automatic tool execution
│ ├── multi_tool_agent.rs # Production agent with 5 tools and hooks
│ ├── hooks_example.rs # Lifecycle hooks patterns
│ ├── context_management.rs # Context management patterns
│ ├── interrupt_demo.rs # Interrupt capability patterns
│ ├── git_commit_agent.rs # Production: Git commit generator
│ ├── log_analyzer_agent.rs # Production: Log analyzer
│ └── advanced_patterns.rs # Retry logic and concurrent requests
├── tests/
│ ├── integration_tests.rs
│ ├── hooks_integration_test.rs # Hooks integration tests
│ ├── auto_execution_test.rs # Auto-execution tests
│ └── advanced_integration_test.rs # Advanced integration tests
├── Cargo.toml
└── README.md
Examples
Production Agents
- git_commit_agent.rs – Analyzes git diffs and writes professional commit messages
- log_analyzer_agent.rs – Parses logs, finds patterns, suggests fixes
- multi_tool_agent.rs – Complete production setup with 5 tools, hooks, and auto-execution
Core SDK Usage
- simple_query.rs – Minimal streaming query (simplest quickstart)
- calculator_tools.rs – Manual tool execution pattern
- auto_execution_demo.rs – Automatic tool execution pattern
- hooks_example.rs – Lifecycle hooks patterns (security gates, audit logging)
- context_management.rs – Manual history management patterns
- interrupt_demo.rs – Interrupt capability patterns (timeout, conditional, concurrent)
- advanced_patterns.rs – Retry logic and concurrent request handling
Documentation
- API Documentation
- Python SDK - Reference implementation
- Examples - Comprehensive usage examples
Testing
# Run all tests
cargo test

# Run with output
cargo test -- --nocapture

# Run specific test
cargo test test_name
Test Coverage:
- 57 unit tests across 10 modules
- 28 integration tests
- 6 hooks integration tests
- 13 auto-execution tests
- 9 advanced integration tests
Requirements
- Rust 1.85+
- Tokio 1.0+ (async runtime)
- serde, serde_json (serialization)
- reqwest (HTTP client)
- futures (async streams)
License
MIT License - see LICENSE for details.
Acknowledgments
- Rust port of open-agent-sdk Python library
- API design inspired by claude-agent-sdk
- Built for local/open-source LLM enthusiasts
Status: v0.1.0 Published - 100% feature parity with Python SDK, production-ready
Star this repo if you're building AI agents with local models in Rust!