# openai-oxide
Idiomatic Rust client for the OpenAI API — 1:1 parity with the official Python SDK.
## Performance

Benchmarked against the official Python SDK and two Rust alternatives. All clients use the Responses API (`POST /responses`), GPT-5.4, warm connections, and 5 iterations; times are medians.
### Sequential requests
| Test | openai-oxide | genai 0.6 | async-openai 0.33 | Python SDK 2.29 |
|---|---|---|---|---|
| Plain text | 922ms | 948ms | 968ms | 966ms |
| Structured output | 1404ms | 1428ms | 3407ms | 1258ms |
| Function calling | 975ms | 1044ms | 1244ms | 1039ms |
| Multi-turn (2 reqs) | 2042ms | 2303ms | 2289ms | 2188ms |
| Web search | 2969ms | — | — | 3176ms |
| Nested structured | 5013ms | — | — | 4286ms |
| Agent loop (FC→result→JSON) | 3933ms | — | — | 4113ms |
| Rapid-fire (5 calls) | 4521ms | — | — | 4646ms |
| Prompt-cached | 4433ms | — | — | 4712ms |
### Advanced patterns (oxide-only)
| Test | oxide | Python | Speedup |
|---|---|---|---|
| Streaming TTFT | 588ms | 659ms | 11% faster |
| Stream FC (early parse) | 909ms | — | -38% vs normal FC |
| Parallel 3x fan-out | 926ms | 1462ms | 37% faster |
| Hedged 2x race | 893ms | 958ms | 7% faster |
| WebSocket plain text | 721ms | — | -22% vs HTTP |
| WebSocket multi-turn | 1650ms | — | -19% vs HTTP |
oxide wins 10/13 tests vs Python. No other Rust or Python client has WebSocket mode, streaming FC early parse, hedged requests, or parallel fan-out built in.
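The parallel fan-out pattern claimed above can be sketched with nothing but the standard library. This is an illustration of the idea, not the crate's API: `fake_request` stands in for one network round-trip, and the threads stand in for `tokio` tasks multiplexed over one HTTP/2 connection, so three answers cost roughly one latency.

```rust
use std::thread;

// Stand-in for one API round-trip (illustrative, not a crate function).
fn fake_request(prompt: &str) -> String {
    format!("answer to: {prompt}")
}

// Fan out all requests concurrently, then collect every answer in order.
fn fan_out(prompts: &[&str]) -> Vec<String> {
    let handles: Vec<_> = prompts
        .iter()
        .map(|p| {
            let p = p.to_string();
            thread::spawn(move || fake_request(&p))
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let answers = fan_out(&["compare A", "compare B", "compare C"]);
    println!("{answers:?}");
}
```

The real client replaces the threads with `tokio::join!` on in-flight futures, which is what lets all requests share one warm connection.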
### Why it's fast

| Technique | What it does | Savings |
|---|---|---|
| HTTP/2 keep-alive while idle | Connections stay warm between requests | -200ms cold start |
| HTTP/2 adaptive windows | Auto-tuned flow control | Better throughput |
| Parallel fan-out | `tokio::join!` + HTTP/2 multiplexing | 3 answers ≈ 1 latency |
| Hedged requests | Send 2 copies, take the fastest | P99 -50-96% |
| Streaming TTFT | First token in ~588ms | -36% vs full response |
| Stream FC early parse | Yield function call on `arguments.done` | -38% vs `response.completed` |
| WebSocket mode | Persistent `wss://`, no per-turn HTTP | -20-25% per request |
| Prompt cache key | Server-side system prompt caching | Up to -80% TTFT |
| Fast-path retry | No loop overhead for successful requests | -5-15ms |
| gzip + `from_slice` | Compressed responses, zero-copy deserialization | Bandwidth + fewer allocations |
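Hedging is the simplest of these tricks to demonstrate: race two copies of the same request and keep whichever lands first. A minimal std-only sketch, where sleeps stand in for variable network latency (the durations and function name are illustrative, not crate APIs):

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Issue the "request" twice and return whichever copy finishes first.
fn hedged_race(latencies_ms: [u64; 2]) -> u64 {
    let (tx, rx) = mpsc::channel();
    for ms in latencies_ms {
        let tx = tx.clone();
        thread::spawn(move || {
            thread::sleep(Duration::from_millis(ms));
            // First send wins; the loser's send fails silently once the
            // receiver has what it needs, which is fine for a hedge.
            let _ = tx.send(ms);
        });
    }
    rx.recv().unwrap()
}

fn main() {
    // A 300 ms straggler hedged by a 30 ms copy completes in ~30 ms.
    println!("winning latency: {} ms", hedged_race([300, 30]));
}
```

This is why hedging attacks tail latency specifically: the median barely moves, but a P99 straggler gets capped at the second copy's latency.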
Run the benchmark yourself:

```bash
OPENAI_API_KEY=sk-...
```
## Features
- Async-first (tokio + reqwest 0.13)
- Strongly typed requests and responses (serde)
- SSE streaming for Chat Completions and Responses API
- Automatic retries with exponential backoff
- Chainable builder pattern for requests
- Responses API with tool support (WebSearch, FileSearch, MCP, etc.)
- Structured outputs (JSON Schema with strict mode)
- Reasoning model support (o-series: effort, summary)
- Realtime API session creation (ephemeral tokens)
- 100% OpenAPI field coverage for Chat Completions
- Same resource structure as the Python SDK: `client.chat().completions().create()`
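The retry behavior listed above rests on an exponential backoff schedule. As a sketch of the delay math only: the constants, the cap, and the absence of jitter here are assumptions for illustration, not a transcript of the crate's actual policy.

```rust
use std::time::Duration;

// Base delay doubles each attempt and is capped so a long outage never
// produces absurd waits. A production policy would usually add jitter.
fn backoff_delay(attempt: u32, base_ms: u64, cap_ms: u64) -> Duration {
    let factor = 1u64 << attempt.min(16); // clamp the shift to avoid overflow
    Duration::from_millis(base_ms.saturating_mul(factor).min(cap_ms))
}

fn main() {
    for attempt in 0..5 {
        println!("attempt {attempt}: wait {:?}", backoff_delay(attempt, 100, 2_000));
    }
}
```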
## Feature Flags

Each API resource is behind an optional Cargo feature (all enabled by default):

```toml
# All resources (default)
openai-oxide = "0.9"

# Only chat + embeddings
openai-oxide = { version = "0.9", default-features = false, features = ["chat", "embeddings"] }
```

Available features: `chat`, `responses`, `embeddings`, `images`, `audio`, `files`, `fine-tuning`, `models`, `moderations`, `batches`, `uploads`, `beta`.
## Quick Start

Add to `Cargo.toml`:

```toml
[dependencies]
openai-oxide = "0.9"
tokio = { version = "1", features = ["full"] }
```

```rust
use openai_oxide::{ChatCompletionRequest, OpenAI};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Reads OPENAI_API_KEY from the environment
    let client = OpenAI::from_env()?;

    // Request type and builder method names are illustrative
    let resp = client
        .chat()
        .completions()
        .create(
            ChatCompletionRequest::builder()
                .model("gpt-5.4")
                .user("Say hello!")
                .build(),
        )
        .await?;

    println!("{resp:?}");
    Ok(())
}
```
## Responses API

```rust
use openai_oxide::{OpenAI, ResponseRequest};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = OpenAI::from_env()?;

    // Request type and accessor names are illustrative
    let resp = client
        .responses()
        .create(
            ResponseRequest::builder()
                .model("gpt-5.4")
                .input("Write a haiku about Rust.")
                .build(),
        )
        .await?;

    println!("{}", resp.output_text());
    Ok(())
}
```
## Streaming

```rust
use futures::StreamExt;
use openai_oxide::{ChatCompletionRequest, OpenAI};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = OpenAI::from_env()?;

    // Request type and chunk accessor names are illustrative
    let request = ChatCompletionRequest::builder()
        .model("gpt-5.4")
        .user("Count to ten.")
        .build();

    let mut stream = client.chat().completions().create_stream(request).await?;
    while let Some(chunk) = stream.next().await {
        // Print each delta as it arrives
        print!("{}", chunk?.delta_text().unwrap_or_default());
    }
    Ok(())
}
```
## BYOT (Bring Your Own Types)

Send custom fields or get raw JSON responses using `create_raw()`:

```rust
use openai_oxide::OpenAI;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = OpenAI::from_env()?;

    // Pass an arbitrary JSON body; unknown fields go through untouched
    let raw = client
        .chat()
        .completions()
        .create_raw(json!({
            "model": "gpt-5.4",
            "messages": [{ "role": "user", "content": "Hello!" }],
            "my_custom_field": true
        }))
        .await?;

    println!("{}", raw["choices"][0]["message"]["content"]);
    Ok(())
}
```

Also available on `client.responses().create_raw()` and `client.embeddings().create_raw()`.
## Image Save Helper

Save generated images directly to disk:

```rust
// `ImageRequest` and the `save()` signature are illustrative names
let resp = client
    .images()
    .generate(ImageRequest::builder().prompt("a rusty robot").build())
    .await?;

if let Some(data) = &resp.data {
    for (i, image) in data.iter().enumerate() {
        image.save(format!("image_{i}.png")).await?;
    }
}
```
## Pagination

All list endpoints support automatic cursor-based pagination:

```rust
use futures::StreamExt;
use openai_oxide::OpenAI;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = OpenAI::from_env()?;

    // `into_stream()` is an illustrative name for the paginating adapter;
    // it fetches subsequent pages automatically as the stream is consumed
    let mut models = client.models().list().await?.into_stream();
    while let Some(model) = models.next().await {
        println!("{}", model?.id);
    }
    Ok(())
}
```
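Cursor pagination itself is a simple loop: each page carries items plus an optional next cursor, and the client keeps fetching until the cursor is gone. A self-contained sketch of that mechanic, where the in-memory `fetch_page` stands in for a real list endpoint (both function names are illustrative):

```rust
// One "page" of results plus the cursor for the next page, if any.
fn fetch_page(data: &[u32], cursor: usize, limit: usize) -> (Vec<u32>, Option<usize>) {
    let end = (cursor + limit).min(data.len());
    let next = if end < data.len() { Some(end) } else { None };
    (data[cursor..end].to_vec(), next)
}

// Follow cursors until the endpoint reports there are no more pages.
fn collect_all(data: &[u32], limit: usize) -> Vec<u32> {
    let mut out = Vec::new();
    let mut cursor = Some(0);
    while let Some(c) = cursor {
        let (page, next) = fetch_page(data, c, limit);
        out.extend(page);
        cursor = next;
    }
    out
}

fn main() {
    let ids: Vec<u32> = (1..=7).collect();
    // limit 3 gives pages of 3, 3, and 1 items, reassembled in order
    println!("{:?}", collect_all(&ids, 3));
}
```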
## Configuration

```rust
use openai_oxide::{Config, OpenAI};

// From environment variable OPENAI_API_KEY
let client = OpenAI::from_env()?;

// Explicit API key
let client = OpenAI::new("sk-...");

// Full configuration (`Config` and the argument values are illustrative)
let config = Config::new("sk-...")
    .base_url("https://my-proxy.example.com/v1")
    .timeout_secs(30)
    .max_retries(5);
let client = OpenAI::with_config(config);
```
## Implemented APIs

| API | Method | Status |
|---|---|---|
| Chat Completions | `client.chat().completions().create()` | Done |
| Chat Completions (streaming) | `client.chat().completions().create_stream()` | Done |
| Responses | `client.responses().create()` / `create_stream()` | Done |
| Responses Tools | Function, WebSearch, FileSearch, CodeInterpreter, ComputerUse, Mcp | Done |
| Embeddings | `client.embeddings().create()` | Done |
| Models | `client.models().list()` / `retrieve()` / `delete()` | Done |
| Images | `client.images().generate()` / `edit()` / `create_variation()` | Done |
| Audio Transcription | `client.audio().transcriptions().create()` | Done |
| Audio Translation | `client.audio().translations().create()` | Done |
| Audio Speech (TTS) | `client.audio().speech().create()` | Done |
| Files | `client.files().create()` / `list()` / `retrieve()` / `delete()` / `content()` | Done |
| Fine-tuning | `client.fine_tuning().jobs().create()` / `list()` / `cancel()` / `list_events()` | Done |
| Moderations | `client.moderations().create()` | Done |
| Batches | `client.batches().create()` / `list()` / `retrieve()` / `cancel()` | Done |
| Uploads | `client.uploads().create()` / `cancel()` / `complete()` | Done |
| Assistants (beta) | `client.beta().assistants().create()` / `list()` / `retrieve()` / `delete()` | Done |
| Threads (beta) | `client.beta().threads().create()` / `retrieve()` / `delete()` / `messages()` | Done |
| Runs (beta) | `client.beta().runs(thread_id).create()` / `retrieve()` / `cancel()` | Done |
| Vector Stores (beta) | `client.beta().vector_stores().create()` / `list()` / `retrieve()` / `delete()` | Done |
| Realtime (beta) | `client.beta().realtime().sessions().create()` | Done |
## Development

## License

MIT