oxi-ai
Unified LLM API for Rust — streaming, multi-provider, tool calling, and context management.
Overview
oxi-ai provides a single, provider-agnostic interface for interacting with large language models. It handles streaming responses, tool/function calling, conversation context, token estimation, message compaction, and cross-provider message transformation.
Design Principles
- Provider-agnostic: the same Context and Message types work across all providers
- Streaming-first: all LLM calls return async streams of ProviderEvents
- Type-safe: strongly typed messages, tool definitions, and content blocks
- Zero-cost: no runtime overhead for provider abstraction
Quick Start
Add to your Cargo.toml:
[dependencies]
oxi-ai = { path = "path/to/oxi-ai" }
Basic usage:
use ;
async
Providers
Supported Providers
| Provider | API | Environment Variable |
|---|---|---|
| OpenAI | openai-completions | OPENAI_API_KEY |
| Anthropic | anthropic-messages | ANTHROPIC_API_KEY |
| Google | google-generative-ai | GOOGLE_API_KEY |
| DeepSeek | openai-completions | DEEPSEEK_API_KEY |
| Mistral | openai-completions | MISTRAL_API_KEY |
| Groq | openai-completions | GROQ_API_KEY |
| Cerebras | openai-completions | CEREBRAS_API_KEY |
| xAI | openai-completions | XAI_API_KEY |
| OpenRouter | openai-completions | OPENROUTER_API_KEY |
| Azure OpenAI | azure-openai-responses | AZURE_OPENAI_API_KEY |
Providers that use the openai-completions API share the same OpenAiProvider implementation with different base URLs.
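To make the shared-implementation idea concrete, here is a minimal sketch of one struct serving several OpenAI-compatible vendors by swapping the base URL and the environment variable. `OpenAiCompatible`, `preset`, and the base URLs are illustrative assumptions, not the crate's actual `OpenAiProvider` definition.

```rust
use std::env;

/// Hypothetical stand-in for a provider that speaks the OpenAI-style API.
/// The real `OpenAiProvider` may be shaped differently.
#[derive(Debug, Clone, PartialEq)]
pub struct OpenAiCompatible {
    pub name: &'static str,
    pub base_url: &'static str,
    pub api_key_env: &'static str,
}

impl OpenAiCompatible {
    /// Join the base URL and a path without doubling the slash.
    pub fn endpoint(&self, path: &str) -> String {
        format!(
            "{}/{}",
            self.base_url.trim_end_matches('/'),
            path.trim_start_matches('/')
        )
    }

    /// Read the API key from the provider's environment variable, if set.
    pub fn api_key(&self) -> Option<String> {
        env::var(self.api_key_env).ok()
    }
}

/// Illustrative presets. The base URLs are assumptions; check each vendor's docs.
pub fn preset(name: &str) -> Option<OpenAiCompatible> {
    let (name, base_url, api_key_env) = match name {
        "openai" => ("openai", "https://api.openai.com/v1", "OPENAI_API_KEY"),
        "groq" => ("groq", "https://api.groq.com/openai/v1", "GROQ_API_KEY"),
        "openrouter" => ("openrouter", "https://openrouter.ai/api/v1", "OPENROUTER_API_KEY"),
        _ => return None,
    };
    Some(OpenAiCompatible { name, base_url, api_key_env })
}
```

With this shape, supporting another compatible vendor means adding one more row of data rather than a new implementation.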
Provider Trait
Implement the Provider trait to add custom providers:
use async_trait;
use ;
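The trait definition did not survive in this copy, so the following is only a rough sketch of what a custom provider could look like. It is deliberately synchronous and uses a boxed iterator instead of an async stream so that it compiles with the standard library alone. The trait methods, the `Event` enum, and `EchoProvider` are all hypothetical and will differ from the crate's real `Provider` trait.

```rust
/// Simplified, hypothetical event type (the crate's `ProviderEvent` has more variants).
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    TextDelta(String),
    Done,
}

/// Hypothetical, synchronous stand-in for the `Provider` trait.
pub trait Provider {
    fn name(&self) -> &str;
    /// Return the response as a sequence of events.
    fn stream(&self, prompt: &str) -> Box<dyn Iterator<Item = Event>>;
}

/// A toy provider that echoes the prompt back one word at a time.
pub struct EchoProvider;

impl Provider for EchoProvider {
    fn name(&self) -> &str {
        "echo"
    }

    fn stream(&self, prompt: &str) -> Box<dyn Iterator<Item = Event>> {
        let mut events: Vec<Event> = prompt
            .split_whitespace()
            .map(|w| Event::TextDelta(w.to_string()))
            .collect();
        events.push(Event::Done);
        Box::new(events.into_iter())
    }
}

/// Callers depend only on the trait, not on a concrete provider.
pub fn collect_text(provider: &dyn Provider, prompt: &str) -> String {
    let mut words = Vec::new();
    for event in provider.stream(prompt) {
        match event {
            Event::TextDelta(w) => words.push(w),
            Event::Done => break,
        }
    }
    words.join(" ")
}
```

Because `collect_text` takes `&dyn Provider`, any type implementing the trait can be passed in without changing the caller.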
Provider Events
All streaming responses produce ProviderEvent variants:
| Event | Description |
|---|---|
| TextStart | Text content block begins |
| TextDelta { delta } | Incremental text chunk |
| TextEnd | Text content block ends |
| ThinkingStart | Thinking/reasoning block begins |
| ThinkingDelta { delta } | Incremental thinking text |
| ThinkingEnd | Thinking block ends |
| ToolCallStart | Tool call begins |
| ToolCallDelta { delta } | Incremental tool call arguments |
| ToolCallEnd { tool_call } | Complete tool call received |
| Done { message } | Response complete |
| Error { error } | Error response |
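A consumer typically folds these events into a final result. The sketch below mirrors the variant names from the table, but the payload types (`String`, the `ToolCall` struct) and the `collect` helper are assumptions made for illustration. It also reads from a plain iterator rather than an async stream.

```rust
/// Assumed shape of a finished tool call.
#[derive(Debug, Clone, PartialEq)]
pub struct ToolCall {
    pub name: String,
    pub arguments: String, // raw JSON text
}

/// Variant names follow the table above; payload types are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub enum ProviderEvent {
    TextStart,
    TextDelta { delta: String },
    TextEnd,
    ThinkingStart,
    ThinkingDelta { delta: String },
    ThinkingEnd,
    ToolCallStart,
    ToolCallDelta { delta: String },
    ToolCallEnd { tool_call: ToolCall },
    Done { message: String },
    Error { error: String },
}

#[derive(Debug, Default, PartialEq)]
pub struct Collected {
    pub text: String,
    pub thinking: String,
    pub tool_calls: Vec<ToolCall>,
}

/// Accumulate deltas until `Done`, or stop early on `Error`.
pub fn collect(events: impl IntoIterator<Item = ProviderEvent>) -> Result<Collected, String> {
    let mut out = Collected::default();
    for event in events {
        match event {
            ProviderEvent::TextDelta { delta } => out.text.push_str(&delta),
            ProviderEvent::ThinkingDelta { delta } => out.thinking.push_str(&delta),
            ProviderEvent::ToolCallEnd { tool_call } => out.tool_calls.push(tool_call),
            ProviderEvent::Error { error } => return Err(error),
            ProviderEvent::Done { .. } => break,
            // Start/end markers and partial tool-call arguments carry no
            // information this accumulator needs.
            _ => {}
        }
    }
    Ok(out)
}
```

A UI would instead react to each `TextDelta` as it arrives; the accumulator form is shown because it is the easiest to check.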
API Reference
Core Types
// Model definition
// Thinking levels
// Cache retention
// Stop reasons
Messages
Context
let mut ctx = new
.with_system_prompt;
ctx.add_user_message;
ctx.add_tool;
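The arguments in the snippet above were lost, so here is a self-contained sketch of the builder pattern those method names suggest. The field layout, the `Message` enum, and the parameter types are assumptions. Only the method names `with_system_prompt`, `add_user_message`, and `add_tool` come from this README.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Message {
    User(String),
    Assistant(String),
}

/// Hypothetical conversation context; the crate's `Context` may hold more state.
#[derive(Debug, Default)]
pub struct Context {
    pub system_prompt: Option<String>,
    pub messages: Vec<Message>,
    pub tools: Vec<String>, // tool names only, for brevity
}

impl Context {
    pub fn new() -> Self {
        Self::default()
    }

    /// Builder-style setter: consumes and returns the context so calls can be chained.
    pub fn with_system_prompt(mut self, prompt: impl Into<String>) -> Self {
        self.system_prompt = Some(prompt.into());
        self
    }

    pub fn add_user_message(&mut self, text: impl Into<String>) {
        self.messages.push(Message::User(text.into()));
    }

    pub fn add_tool(&mut self, name: impl Into<String>) {
        self.tools.push(name.into());
    }
}
```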
Tools
use ;
let tool = new;
// Validate arguments
validate_args?;
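As an illustration of what argument validation might involve, the sketch below checks required parameters, simple types, and unknown keys. It uses a hand-rolled `Value` enum instead of JSON so it stays dependency-free. `ToolSpec`, `Param`, `Kind`, and this `validate_args` signature are hypothetical and are not the crate's definitions.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Str(String),
    Num(f64),
    Bool(bool),
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Kind {
    Str,
    Num,
    Bool,
}

pub struct Param {
    pub name: &'static str,
    pub kind: Kind,
    pub required: bool,
}

pub struct ToolSpec {
    pub name: &'static str,
    pub params: Vec<Param>,
}

fn kind_of(v: &Value) -> Kind {
    match v {
        Value::Str(_) => Kind::Str,
        Value::Num(_) => Kind::Num,
        Value::Bool(_) => Kind::Bool,
    }
}

/// Check required parameters and types, then reject undeclared arguments.
pub fn validate_args(spec: &ToolSpec, args: &HashMap<String, Value>) -> Result<(), String> {
    for p in &spec.params {
        match args.get(p.name) {
            None if p.required => {
                return Err(format!("missing required argument `{}`", p.name))
            }
            None => {}
            Some(v) if kind_of(v) != p.kind => {
                return Err(format!("argument `{}` has the wrong type", p.name))
            }
            Some(_) => {}
        }
    }
    for key in args.keys() {
        if !spec.params.iter().any(|p| p.name == key.as_str()) {
            return Err(format!("unknown argument `{}`", key));
        }
    }
    Ok(())
}
```

Validating before dispatch lets a malformed tool call be reported back to the model as an error instead of reaching the tool itself.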
Model Registry
use ;
// Get a specific model
let model = get_model;
// List providers
let providers = get_providers; // ["anthropic", "cerebras", "deepseek", ...]
// List models for a provider
let models = get_models;
// Search by pattern
let results = search;
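The lookup functions above could be backed by a structure like the following. This is a sketch under assumptions: the `Registry` and `Model` types, the method signatures, and the case-insensitive substring search are illustrative, and the model ids and context-window sizes in the usage are made up. A `BTreeMap` is used so that provider names come back in sorted order, matching the alphabetical list in the comment above.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
pub struct Model {
    pub provider: String,
    pub id: String,
    pub context_window: u32,
}

#[derive(Default)]
pub struct Registry {
    // provider -> (model id -> model); BTreeMap keeps both levels sorted.
    by_provider: BTreeMap<String, BTreeMap<String, Model>>,
}

impl Registry {
    pub fn insert(&mut self, provider: &str, id: &str, context_window: u32) {
        let model = Model { provider: provider.into(), id: id.into(), context_window };
        self.by_provider
            .entry(provider.into())
            .or_default()
            .insert(id.into(), model);
    }

    pub fn get_model(&self, provider: &str, id: &str) -> Option<&Model> {
        self.by_provider.get(provider)?.get(id)
    }

    pub fn get_providers(&self) -> Vec<&str> {
        self.by_provider.keys().map(String::as_str).collect()
    }

    pub fn get_models(&self, provider: &str) -> Vec<&Model> {
        self.by_provider
            .get(provider)
            .map(|models| models.values().collect())
            .unwrap_or_default()
    }

    /// Case-insensitive substring match on the model id.
    pub fn search(&self, pattern: &str) -> Vec<&Model> {
        let needle = pattern.to_lowercase();
        self.by_provider
            .values()
            .flat_map(|models| models.values())
            .filter(|m| m.id.to_lowercase().contains(&needle))
            .collect()
    }
}
```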
Token Estimation
use estimate_tokens;
let tokens = estimate_tokens;
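This README does not say how the estimate is computed. A common rough heuristic, shown here purely as an assumption, is about four characters per token for English text. The function names and the per-message overhead parameter below are hypothetical, and a real tokenizer will give different counts.

```rust
/// Rough token estimate: about four characters per token, rounded up.
/// This is a heuristic, not a tokenizer.
pub fn estimate_tokens_approx(text: &str) -> usize {
    let chars = text.chars().count();
    (chars + 3) / 4
}

/// Estimate a whole conversation, adding a fixed overhead per message
/// for role markers and framing (the overhead value is the caller's guess).
pub fn estimate_conversation(messages: &[&str], per_message_overhead: usize) -> usize {
    messages
        .iter()
        .map(|m| estimate_tokens_approx(m) + per_message_overhead)
        .sum()
}
```

An estimate like this is cheap enough to run on every message, which is what makes it useful for deciding when to compact.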
Context Compaction
use ;
let manager = new;
// Automatically compacts context when it exceeds 80% of the context window
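The 80% threshold can be sketched as follows. Note the swapped-in strategy: this example drops the oldest non-system messages, whereas a real compaction manager may summarize them instead. `Compactor`, `Msg`, and the percent-based threshold field are hypothetical names chosen for the illustration.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct Msg {
    pub role: &'static str,
    pub tokens: usize,
}

pub struct Compactor {
    pub context_window: usize,
    pub threshold_percent: usize, // e.g. 80 for 80% of the window
}

fn total(messages: &[Msg]) -> usize {
    messages.iter().map(|m| m.tokens).sum()
}

impl Compactor {
    /// Token budget above which compaction kicks in.
    fn budget(&self) -> usize {
        self.context_window * self.threshold_percent / 100
    }

    pub fn needs_compaction(&self, messages: &[Msg]) -> bool {
        total(messages) > self.budget()
    }

    /// Drop the oldest non-system messages until the total fits the budget.
    /// Returns the number of messages removed.
    pub fn compact(&self, messages: &mut Vec<Msg>) -> usize {
        let mut removed = 0;
        while total(messages.as_slice()) > self.budget() {
            match messages.iter().position(|m| m.role != "system") {
                Some(i) => {
                    messages.remove(i);
                    removed += 1;
                }
                None => break, // only system messages are left
            }
        }
        removed
    }
}
```

Triggering below 100% of the window leaves headroom for the model's reply and for estimation error.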
High-Level API
use complete;
let response = complete.await?;
Streaming Options
let options = StreamOptions ;
License
MIT