Open Agent SDK (Rust)
Build production-ready AI agents in Rust using your own hardware
What you can build:
- Copy editors that analyze manuscripts and track writing patterns
- Git commit generators that write meaningful commit messages
- Market analyzers that research competitors and summarize findings
- Code reviewers, data analysts, research assistants, and more
Why local?
- No API costs - use your hardware, not OpenAI's
- Privacy - your data never leaves your machine
- Control - pick your model (Qwen, Llama, Mistral, etc.)
How fast? From zero to working agent in under 5 minutes. Rust-native performance (zero-cost abstractions, no GC), fearless concurrency, and production-ready quality with 85+ tests.
Overview
Open Agent SDK (Rust) provides a clean, streaming API for working with OpenAI-compatible local model servers. 100% feature parity with the Python SDK—complete with streaming, tool call aggregation, hooks, and automatic tool execution—built on Tokio for high-performance async I/O.
Supported Providers
Supported (OpenAI-Compatible Endpoints)
- LM Studio - http://localhost:1234/v1
- Ollama - http://localhost:11434/v1
- llama.cpp server - OpenAI-compatible mode
- vLLM - OpenAI-compatible API
- Text Generation WebUI - OpenAI extension
- Any OpenAI-compatible local endpoint
- Local gateways proxying cloud models - e.g., Ollama or custom gateways that route to cloud providers
Not Supported (Use Official SDKs)
- Claude/OpenAI direct - Use their official SDKs, unless you proxy through a local OpenAI-compatible gateway
- Cloud provider SDKs - Bedrock, Vertex, Azure, etc. (proxied via local gateway is fine)
Quick Start
Installation
```toml
[dependencies]
open-agent-sdk = "0.1.0"
tokio = { version = "1", features = ["full"] }
futures = "0.3"
serde_json = "1.0"
```
Simple Query (LM Studio)
```rust
use open_agent_sdk::{query, AgentOptions, ContentBlock};
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let options = AgentOptions::builder()
        .system_prompt("You are a helpful assistant.")
        .model("qwen3-30b") // whichever model LM Studio has loaded
        .base_url("http://localhost:1234/v1")
        .build()?;

    let mut stream = query("Write a haiku about Rust.", options).await?;
    while let Some(block) = stream.next().await {
        if let ContentBlock::Text(text) = block {
            print!("{}", text.text);
        }
    }
    Ok(())
}
```
Multi-Turn Conversation (Ollama)
```rust
use open_agent_sdk::{AgentOptions, Client, ContentBlock};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let options = AgentOptions::builder()
        .system_prompt("You are a helpful assistant.")
        .model("qwen3:30b") // any model pulled into Ollama
        .base_url("http://localhost:11434/v1")
        .build()?;

    let mut client = Client::new(options);

    client.send("What is the capital of France?").await?;
    while let Some(block) = client.receive().await {
        if let ContentBlock::Text(text) = block {
            print!("{}", text.text);
        }
    }

    // History is kept, so follow-ups have full context
    client.send("What is its population?").await?;
    while let Some(block) = client.receive().await {
        if let ContentBlock::Text(text) = block {
            print!("{}", text.text);
        }
    }
    Ok(())
}
```
Function Calling with Tools
Define tools using the builder pattern for clean, type-safe function calling:
```rust
use open_agent_sdk::{tool, AgentOptions, Client, ContentBlock};
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Define a tool with a typed parameter schema
    // (handler signature sketched here; see examples/calculator_tools.rs)
    let add = tool("add", "Add two numbers", |input| async move {
        let a = input["a"].as_f64().unwrap_or(0.0);
        let b = input["b"].as_f64().unwrap_or(0.0);
        Ok(json!({ "sum": a + b }))
    })
    .param("a", "number", "First operand")
    .param("b", "number", "Second operand")
    .build();

    let options = AgentOptions::builder()
        .system_prompt("Use tools when helpful.")
        .model("qwen3-30b")
        .base_url("http://localhost:1234/v1")
        .tool(add)
        .auto_execute_tools(true)
        .build()?;

    let mut client = Client::new(options);
    client.send("What is 2 + 3?").await?;
    while let Some(block) = client.receive().await {
        if let ContentBlock::Text(text) = block {
            print!("{}", text.text);
        }
    }
    Ok(())
}
```
Advanced: Manual Tool Execution
For custom execution logic or result interception:
```rust
// Disable auto-execution
let options = AgentOptions::builder()
    .system_prompt("Use tools when helpful.")
    .model("qwen3-30b")
    .base_url("http://localhost:1234/v1")
    .tool(my_tool)
    .auto_execute_tools(false) // Manual mode
    .build()?;

let mut client = Client::new(options);
client.send("What is 2 + 3?").await?;
while let Some(block) = client.receive().await {
    // Inspect ToolUseBlock items, run the tool yourself,
    // then feed the result back as a ToolResultBlock
}
```
Key Features:
- Automatic execution - Tools run automatically with safety limits
- Type-safe schemas - Automatic JSON schema generation from parameters
- OpenAI-compatible - Works with any OpenAI function calling endpoint
- Clean builder API - Fluent API for tool definition
- Hook integration - PreToolUse/PostToolUse hooks work in both modes
See examples/calculator_tools.rs and examples/auto_execution_demo.rs for complete examples.
Context Management
Local models have fixed context windows (typically 8k-32k tokens). The SDK provides utilities for manual history management—no silent mutations, you stay in control.
Token Estimation & Truncation
```rust
use open_agent_sdk::{estimate_tokens, truncate_messages};

let mut client = Client::new(options);

// Long conversation...
for i in 0..50 {
    client.send(format!("Task {i}")).await?;
    while let Some(_block) = client.receive().await {}
}

// Check token usage
let tokens = estimate_tokens(client.history());
println!("History is roughly {tokens} tokens");

// Manually truncate when needed
if tokens > 28000 {
    *client.history_mut() = truncate_messages(client.history(), 20);
}
```
Recommended Patterns
1. Stateless Agents (Best for single-task agents):
```rust
// Process each task independently - no history accumulation
for task in tasks {
    let mut stream = query(&task, options.clone()).await?;
    while let Some(_block) = stream.next().await {}
}
```
2. Manual Truncation (At natural breakpoints):
```rust
use open_agent_sdk::truncate_messages;

let mut client = Client::new(options);
for task in tasks {
    client.send(&task).await?;
    while let Some(_block) = client.receive().await {}

    // Truncate at a natural breakpoint between tasks
    *client.history_mut() = truncate_messages(client.history(), 10);
}
```
3. External Memory (RAG-lite for research agents):
```rust
use std::collections::HashMap;

// Store important facts in database, keep conversation context small
let mut database = HashMap::new();
let mut client = Client::new(options);

client.send("Research competitor pricing.").await?;
let response = collect_text(&mut client).await?; // helper that drains the stream (not shown)

// Save response to database
database.insert("pricing", response);

// Clear history, query database when needed
let truncated = truncate_messages(client.history(), 2);
*client.history_mut() = truncated;
```
Why Manual?
The SDK intentionally does not auto-compact history because:
- Domain-specific needs: Copy editors need different strategies than research agents
- Token accuracy varies: Each model family has different tokenizers
- Risk of breaking context: Silently removing messages could break tool chains
- Natural limits exist: Compaction doesn't bypass model context windows
See examples/context_management.rs for complete patterns and usage.
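One common strategy is keeping the system prompt plus the most recent N messages. A self-contained sketch of that idea (the `Message` struct and `truncate_keep_recent` helper here are illustrative stand-ins, not the SDK's types):

```rust
// Illustrative stand-in for the SDK's message type
#[derive(Clone, Debug, PartialEq)]
struct Message {
    role: &'static str,
    content: String,
}

// Keep the leading system message (if any) plus the last `keep` messages.
fn truncate_keep_recent(history: &[Message], keep: usize) -> Vec<Message> {
    let mut out = Vec::new();
    let mut rest = history;
    if let Some(first) = history.first() {
        if first.role == "system" {
            out.push(first.clone()); // never drop the system prompt
            rest = &history[1..];
        }
    }
    let start = rest.len().saturating_sub(keep);
    out.extend_from_slice(&rest[start..]);
    out
}
```

Note the caveat from the list above: a policy like this can still split a tool-use/tool-result pair, which is exactly why the SDK leaves the strategy to you.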
Lifecycle Hooks
Monitor and control agent behavior at key execution points with zero-cost Rust hooks.
Quick Example
```rust
use open_agent_sdk::{AgentOptions, Client, HookDecision, Hooks};

// Security gate - block dangerous operations
let hooks = Hooks::new()
    .add_pre_tool_use(|tool_name, _input| async move {
        if tool_name == "delete_file" {
            return Some(HookDecision::block("Deletion is not allowed"));
        }
        None
    })
    .add_post_tool_use(|tool_name, _result| async move {
        println!("Tool executed: {tool_name}");
        None
    });

// Register hooks in AgentOptions
let options = AgentOptions::builder()
    .system_prompt("You are a careful assistant.")
    .model("qwen3-30b")
    .base_url("http://localhost:1234/v1")
    .hooks(hooks)
    .build()?;

let mut client = Client::new(options);
```
Hook Types
PreToolUse - Fires before tool execution
- Block operations: Return `Some(HookDecision::block(reason))`
- Modify inputs: Return `Some(HookDecision::modify_input(json!({}), reason))`
- Allow: Return `Some(HookDecision::continue_())`
PostToolUse - Fires after tool result added to history
- Observational (tool already executed)
- Use for audit logging, metrics, result validation
- Return `None` or `Some(HookDecision::...)`
UserPromptSubmit - Fires before sending prompt to API
- Block prompts: Return `Some(HookDecision::block(reason))`
- Modify prompts: Return `Some(HookDecision::modify_prompt(text, reason))`
- Allow: Return `Some(HookDecision::continue_())`
Common Patterns
Pattern 1: Redirect to Sandbox
```rust
hooks.add_pre_tool_use(|tool_name, input| async move {
    if tool_name == "write_file" {
        let mut input = input.clone();
        // Rewrite the target path into a sandbox directory
        input["path"] = json!(format!("/tmp/sandbox/{}", input["path"].as_str().unwrap_or("output")));
        return Some(HookDecision::modify_input(input, "Redirected to sandbox"));
    }
    None
});
```
Pattern 2: Compliance Audit Log
```rust
use std::sync::Arc;
use tokio::sync::Mutex;

let audit_log = Arc::new(Mutex::new(Vec::new()));
let log_clone = audit_log.clone();
hooks.add_post_tool_use(move |tool_name, result| {
    let log = log_clone.clone();
    async move {
        // Record every tool invocation for compliance review
        log.lock().await.push((tool_name.to_string(), result.clone()));
        None
    }
});
```
Hook Execution Flow
- Hooks run sequentially in the order registered
- First non-None decision wins (short-circuit behavior)
- Hooks run inline on async runtime (spawn tasks for heavy work)
- Works with both Client and query() function
See examples/hooks_example.rs and examples/multi_tool_agent.rs for comprehensive patterns.
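The "first non-None decision wins" rule can be pictured with plain closures. The `Decision` enum and `run_hooks` function below are illustrative stand-ins, not the SDK's API:

```rust
// Illustrative stand-in for the SDK's HookDecision
#[derive(Debug, PartialEq)]
enum Decision {
    Block(String),
    Continue,
}

// Run hooks in registration order; the first non-None decision wins
// and later hooks never run (short-circuit).
fn run_hooks(hooks: &[Box<dyn Fn(&str) -> Option<Decision>>], tool: &str) -> Option<Decision> {
    for hook in hooks {
        if let Some(decision) = hook(tool) {
            return Some(decision);
        }
    }
    None
}
```

Because of the short-circuit, register restrictive security gates before permissive catch-all hooks.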
Interrupt Capability
Cancel long-running operations cleanly without corrupting client state. Perfect for timeouts, user cancellations, or conditional interruptions.
Interrupt Quick Example
```rust
use open_agent_sdk::{AgentOptions, Client};
use tokio::time::{sleep, Duration};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let options = AgentOptions::builder()
        .model("qwen3-30b")
        .base_url("http://localhost:1234/v1")
        .build()?;
    let mut client = Client::new(options);

    client.send("Write a very long story.").await?;
    sleep(Duration::from_secs(5)).await;
    client.interrupt().await?; // stream closes, client stays reusable
    Ok(())
}
```
Common Interrupt Patterns
1. Conditional Interruption
```rust
let mut full_text = String::new();
while let Some(block) = client.receive().await {
    if let ContentBlock::Text(text) = block {
        full_text.push_str(&text.text);
        if full_text.len() > 2000 {
            client.interrupt().await?; // stop once we have enough
            break;
        }
    }
}
```
2. Concurrent Cancellation
```rust
use tokio::select;

let stream_task = async {
    while let Some(_block) = client.receive().await {}
};
let cancel_task = async {
    tokio::time::sleep(std::time::Duration::from_secs(30)).await;
};

select! {
    _ = stream_task => println!("Stream completed"),
    _ = cancel_task => println!("Timed out, interrupting"),
}
```
How It Works
When you call client.interrupt():
- Active stream closure - HTTP stream closed immediately (not just a flag)
- Clean state - Client remains in valid state for reuse
- Partial output - Text blocks flushed to history, incomplete tools skipped
- Idempotent - Safe to call multiple times
- Thread-safe - Can be called from separate async tasks
See examples/interrupt_demo.rs for comprehensive patterns.
Practical Examples
We've included production-ready agents that demonstrate real-world usage:
Git Commit Agent
examples/git_commit_agent.rs
Analyzes your staged git changes and writes professional commit messages following conventional commit format.
```bash
# Stage your changes
git add .

# Run the agent
cargo run --example git_commit_agent

# Output:
# Found staged changes in 3 file(s)
# Analyzing changes and generating commit message...
#
# Suggested commit message:
# feat(auth): Add OAuth2 integration with refresh tokens
#
# - Implement token refresh mechanism
# - Add secure cookie storage for tokens
# - Update login flow to support OAuth2 providers
```
Features:
- Analyzes diff to determine commit type (feat/fix/docs/etc)
- Writes clear, descriptive commit messages
- Follows conventional commit standards
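To give a feel for commit-type classification, here is a standalone heuristic over changed file paths. This is an illustrative sketch only; the agent itself asks the model to classify the diff:

```rust
// Rough conventional-commit type guess from changed file paths.
// Illustrative heuristic, not the agent's actual classifier.
fn guess_commit_type(changed_paths: &[&str]) -> &'static str {
    let all_docs = changed_paths
        .iter()
        .all(|p| p.ends_with(".md") || p.starts_with("docs/"));
    let any_tests = changed_paths
        .iter()
        .any(|p| p.contains("tests/") || p.ends_with("_test.rs"));

    if changed_paths.is_empty() {
        "chore"
    } else if all_docs {
        "docs"
    } else if any_tests {
        "test"
    } else {
        "feat"
    }
}
```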
Log Analyzer Agent
examples/log_analyzer_agent.rs
Intelligently analyzes application logs to identify patterns, errors, and provide actionable insights.
```bash
# Analyze a log file
cargo run --example log_analyzer_agent -- app.log
```
Features:
- Automatic error pattern detection
- Time-based analysis (peak error times)
- Root cause suggestions
- Supports multiple log formats
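As a flavor of the time-based analysis, a standalone sketch that buckets `ERROR` lines by hour and reports the peak (illustrative only; the agent delegates the real analysis to the model):

```rust
use std::collections::HashMap;

// Count ERROR lines per hour from logs like "2024-05-01 14:03:22 ERROR ...",
// then return the hour with the most errors. Heuristic sketch only.
fn peak_error_hour(log: &str) -> Option<(String, usize)> {
    let mut counts: HashMap<String, usize> = HashMap::new();
    for line in log.lines() {
        if line.contains("ERROR") {
            // Timestamp prefix "YYYY-MM-DD HH" is the hour bucket
            if let Some(hour) = line.get(0..13) {
                *counts.entry(hour.to_string()).or_insert(0) += 1;
            }
        }
    }
    counts.into_iter().max_by_key(|(_, n)| *n)
}
```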
Why These Examples?
These agents demonstrate:
- Practical Value: Solve real problems developers face daily
- Tool Integration: Show how to integrate with system commands (git, file I/O)
- Structured Output: Parse and format LLM responses for actionable results
- Privacy-First: Keep your code and logs local while getting AI assistance
Why Not Just Use OpenAI Client?
Without open-agent-sdk (raw reqwest):
```rust
use reqwest::Client;

let client = Client::new();
let response = client
    .post("http://localhost:1234/v1/chat/completions")
    .json(&request_body) // hand-built request payload
    .send()
    .await?;
// Complex parsing of SSE chunks
// Extract delta content
// Handle tool calls manually
// Track conversation state yourself
```
With open-agent-sdk:
```rust
use open_agent_sdk::{query, AgentOptions};

let options = AgentOptions::builder()
    .system_prompt("You are a helpful assistant.")
    .model("qwen3-30b")
    .base_url("http://localhost:1234/v1")
    .build()?;

let mut stream = query("Hello!", options).await?;
// Clean message types (TextBlock, ToolUseBlock)
// Automatic streaming and tool call handling
```
Value: Familiar patterns + Less boilerplate + Rust performance
Why Rust?
Performance: Zero-cost abstractions mean no runtime overhead. Streaming responses with Tokio delivers throughput comparable to C/C++ while maintaining memory safety.
Safety: Compile-time guarantees prevent data races, null pointer dereferences, and buffer overflows. Your agents won't crash from memory issues.
Concurrency: Fearless concurrency with async/await lets you run multiple agents or handle hundreds of concurrent requests without fear of race conditions.
Production Ready: Strong type system catches bugs at compile time. Comprehensive error handling with Result types. No surprises in production.
Small Binaries: Standalone executables under 10MB. Deploy anywhere without runtime dependencies.
API Reference
AgentOptions
```rust
AgentOptions::builder()
    .system_prompt("...")       // System prompt
    .model("...")               // Model name (required)
    .base_url("...")            // OpenAI-compatible endpoint (required)
    .tool(my_tool)              // Add tools for function calling
    .hooks(hooks)               // Lifecycle hooks for monitoring/control
    .auto_execute_tools(true)   // Enable automatic tool execution
    .max_tool_iterations(10)    // Max tool calls per query in auto mode
    .max_tokens(None)           // Tokens to generate (None = provider default)
    .temperature(0.7)           // Sampling temperature
    .timeout(120)               // Request timeout in seconds
    .api_key("not-needed")      // API key (default: "not-needed")
    .build()?
```
query()
Simple single-turn query function.
```rust
pub async fn query(prompt: impl Into<String>, options: AgentOptions) -> Result<impl Stream<Item = ContentBlock>, Error>
```
Returns a stream yielding ContentBlock items.
Client
Multi-turn conversation client with tool monitoring.
```rust
let mut client = Client::new(options);

client.send("Hello!").await?;
while let Some(block) = client.receive().await {
    // handle ContentBlock items as they stream in
}
```
Message Types
- `ContentBlock::Text(TextBlock)` - Text content from model
- `ContentBlock::ToolUse(ToolUseBlock)` - Tool calls from model
- `ContentBlock::ToolResult(ToolResultBlock)` - Tool execution results
Tool System
```rust
use open_agent_sdk::tool;

let my_tool = tool("name", "description", handler)
    .param("arg", "string", "What this argument means")
    .build();
```
Recommended Models
Local models (LM Studio, Ollama, llama.cpp):
- GPT-OSS-120B - Best in class for speed and quality
- Qwen 3 30B - Excellent instruction following, good for most tasks
- GPT-OSS-20B - Solid all-around performance
- Mistral 7B - Fast and efficient for simple agents
Cloud-proxied via local gateway:
- kimi-k2:1t-cloud - Tested and working via Ollama gateway
- deepseek-v3.1:671b-cloud - High-quality reasoning model
- qwen3-coder:480b-cloud - Code-focused model
Project Structure
open-agent-sdk-rust/
├── src/
│ ├── client.rs # query() and Client implementation
│ ├── config.rs # Configuration builder
│ ├── context.rs # Token estimation and truncation
│ ├── error.rs # Error types
│ ├── hooks.rs # Lifecycle hooks
│ ├── lib.rs # Public exports
│ ├── retry.rs # Retry logic with exponential backoff
│ ├── tools.rs # Tool system
│ ├── types.rs # Core types (AgentOptions, ContentBlock, etc.)
│ └── utils.rs # SSE parsing and tool call aggregation
├── examples/
│ ├── simple_query.rs # Basic streaming query
│ ├── calculator_tools.rs # Function calling (manual mode)
│ ├── auto_execution_demo.rs # Automatic tool execution
│ ├── multi_tool_agent.rs # Production agent with 5 tools and hooks
│ ├── hooks_example.rs # Lifecycle hooks patterns
│ ├── context_management.rs # Context management patterns
│ ├── interrupt_demo.rs # Interrupt capability patterns
│ ├── git_commit_agent.rs # Production: Git commit generator
│ ├── log_analyzer_agent.rs # Production: Log analyzer
│ └── advanced_patterns.rs # Retry logic and concurrent requests
├── tests/
│ ├── integration_tests.rs
│ ├── hooks_integration_test.rs # Hooks integration tests
│ ├── auto_execution_test.rs # Auto-execution tests
│ └── advanced_integration_test.rs # Advanced integration tests
├── Cargo.toml
└── README.md
Examples
Production Agents
- git_commit_agent.rs – Analyzes git diffs and writes professional commit messages
- log_analyzer_agent.rs – Parses logs, finds patterns, suggests fixes
- multi_tool_agent.rs – Complete production setup with 5 tools, hooks, and auto-execution
Core SDK Usage
- simple_query.rs – Minimal streaming query (simplest quickstart)
- calculator_tools.rs – Manual tool execution pattern
- auto_execution_demo.rs – Automatic tool execution pattern
- hooks_example.rs – Lifecycle hooks patterns (security gates, audit logging)
- context_management.rs – Manual history management patterns
- interrupt_demo.rs – Interrupt capability patterns (timeout, conditional, concurrent)
- advanced_patterns.rs – Retry logic and concurrent request handling
Documentation
- API Documentation
- Python SDK - Reference implementation
- Examples - Comprehensive usage examples
Testing
```bash
# Run all tests
cargo test

# Run with output
cargo test -- --nocapture

# Run specific test
cargo test <test_name>
```
Test Coverage:
- 57 unit tests across 10 modules
- 28 integration tests
- 6 hooks integration tests
- 13 auto-execution tests
- 9 advanced integration tests
Requirements
- Rust 1.85+
- Tokio 1.0+ (async runtime)
- serde, serde_json (serialization)
- reqwest (HTTP client)
- futures (async streams)
License
MIT License - see LICENSE for details.
Acknowledgments
- Rust port of open-agent-sdk Python library
- API design inspired by claude-agent-sdk
- Built for local/open-source LLM enthusiasts
Status: v0.1.0 Published - 100% feature parity with Python SDK, production-ready
Star this repo if you're building AI agents with local models in Rust!