openai-oxide 0.9.0

Idiomatic Rust client for the OpenAI API — 1:1 parity with the official Python SDK

Performance

Benchmarked against the official Python SDK and two Rust alternatives (genai 0.6 and async-openai 0.33). All clients call the Responses API (POST /responses) with GPT-5.4 over warm connections; each figure is the median of 5 iterations.
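To make the "warm connections, 5 iterations, median" methodology concrete, here is a minimal std-only sketch of such a timing loop. It is not the project's benchmark harness; `median` and `bench_median` are hypothetical names, and `op` stands in for one blocking API call.

```rust
use std::time::{Duration, Instant};

/// Median of the samples; for an even count, the mean of the two middle values.
fn median(mut samples: Vec<Duration>) -> Option<Duration> {
    if samples.is_empty() {
        return None;
    }
    samples.sort();
    let mid = samples.len() / 2;
    if samples.len() % 2 == 1 {
        Some(samples[mid])
    } else {
        Some((samples[mid - 1] + samples[mid]) / 2)
    }
}

/// Runs `op` once untimed as a warm-up, then `iterations` timed runs,
/// and returns the median wall-clock duration.
fn bench_median<F: FnMut()>(iterations: usize, mut op: F) -> Option<Duration> {
    op(); // warm-up: connection setup is paid outside the measurement
    let samples: Vec<Duration> = (0..iterations)
        .map(|_| {
            let start = Instant::now();
            op();
            start.elapsed()
        })
        .collect();
    median(samples)
}
```

Reporting the median rather than the mean keeps a single slow outlier, which is common with network calls, from skewing the figure.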

Sequential requests

| Test | openai-oxide | genai 0.6 | async-openai 0.33 | Python 2.29 |
|---|---|---|---|---|
| Plain text | 922ms | 948ms | 968ms | 966ms |
| Structured output | 1404ms | 1428ms | 3407ms | 1258ms |
| Function calling | 975ms | 1044ms | 1244ms | 1039ms |
| Multi-turn (2 reqs) | 2042ms | 2303ms | 2289ms | 2188ms |
| Web search | 2969ms | | | 3176ms |
| Nested structured | 5013ms | | | 4286ms |
| Agent loop (FC→result→JSON) | 3933ms | | | 4113ms |
| Rapid-fire (5 calls) | 4521ms | | | 4646ms |
| Prompt-cached | 4433ms | | | 4712ms |

Advanced patterns (oxide-only)

| Test | oxide | Python | Speedup |
|---|---|---|---|
| Streaming TTFT | 588ms | 659ms | 11% faster |
| Stream FC (early parse) | 909ms | | -38% vs normal FC |
| Parallel 3x fan-out | 926ms | 1462ms | 37% faster |
| Hedged 2x race | 893ms | 958ms | 7% faster |
| WebSocket plain text | 721ms | | -22% vs HTTP |
| WebSocket multi-turn | 1650ms | | -19% vs HTTP |

oxide wins 10/13 tests vs Python. No other Rust or Python client has WebSocket mode, streaming FC early parse, hedged requests, or parallel fan-out built in.
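The parallel fan-out result above comes from issuing independent requests concurrently, so total latency tracks the slowest request rather than the sum. The sketch below illustrates the idea with std scoped threads instead of tokio so that it runs without dependencies; `fake_request` and `fan_out` are hypothetical names, and the sleep stands in for a network round trip.

```rust
use std::thread;
use std::time::Duration;

/// Stand-in for a network call: sleeps for `latency`, then returns a label.
/// Hypothetical; a real client would send an HTTP request here.
fn fake_request(prompt: &str, latency: Duration) -> String {
    thread::sleep(latency);
    format!("answer to {prompt}")
}

/// Issues every prompt concurrently and returns the answers in input order.
fn fan_out(prompts: &[&str], latency: Duration) -> Vec<String> {
    thread::scope(|scope| {
        // Spawn all workers first so that they overlap in time...
        let handles: Vec<_> = prompts
            .iter()
            .map(|&prompt| scope.spawn(move || fake_request(prompt, latency)))
            .collect();
        // ...then wait for each one; join order preserves input order.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .collect()
    })
}
```

With three prompts and a 200ms simulated latency, `fan_out` finishes in roughly 200ms rather than 600ms. In async code the same shape is usually written with `tokio::join!` over futures that share one HTTP/2 connection.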

Why it's fast

| Technique | What it does | Savings |
|---|---|---|
| HTTP/2 keep-alive while idle | Connections stay warm between requests | -200ms cold start |
| HTTP/2 adaptive windows | Auto-tuned flow control | Better throughput |
| Parallel fan-out | tokio::join! + HTTP/2 multiplex | 3 answers ≈ 1 latency |
| Hedged requests | Send 2 copies, take fastest | P99 -50% to -96% |
| Streaming TTFT | First token in ~588ms | -36% vs full response |
| Stream FC early parse | Yield function call on arguments.done | -38% vs response.completed |
| WebSocket mode | Persistent wss:// connection, no per-turn HTTP | -20% to -25% per request |
| Prompt cache key | Server-side system prompt caching | Up to -80% TTFT |
| Fast-path retry | No loop overhead for successful requests | -5ms to -15ms |
| gzip + from_slice | Compressed responses, zero-copy deserialization | Bandwidth + alloc |
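The hedged-request technique ("send 2 copies, take fastest") can be sketched without any HTTP at all. The version below is an illustration under stated assumptions, not the crate's implementation: `hedged` is a hypothetical helper, `attempt` stands in for one network round trip, and std threads with a channel replace the async race.

```rust
use std::sync::{mpsc, Arc};
use std::thread;

/// Runs `copies` identical attempts concurrently and returns the first result
/// to arrive, together with the index of the attempt that produced it.
fn hedged<T, F>(copies: usize, attempt: F) -> Option<(usize, T)>
where
    T: Send + 'static,
    F: Fn(usize) -> T + Send + Sync + 'static,
{
    let attempt = Arc::new(attempt);
    let (tx, rx) = mpsc::channel();
    for i in 0..copies {
        let tx = tx.clone();
        let attempt = Arc::clone(&attempt);
        thread::spawn(move || {
            // The receiver may already be gone if another copy won; ignore that.
            let _ = tx.send((i, (*attempt)(i)));
        });
    }
    drop(tx); // with no senders left, `recv` returns an error instead of blocking
    rx.recv().ok()
}
```

The slower thread here simply runs to completion in the background, whereas an async implementation would normally drop the losing future. Hedging trades extra requests (and tokens) for lower tail latency, so it suits idempotent, latency-sensitive calls.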

Run the benchmark yourself:

OPENAI_API_KEY=sk-... cargo run --example benchmark --features responses --release
python3 examples/bench_python.py  # Python comparison

Features

  • Async-first (tokio + reqwest 0.13)
  • Strongly typed requests and responses (serde)
  • SSE streaming for Chat Completions and Responses API
  • Automatic retries with exponential backoff
  • Chainable builder pattern for requests
  • Responses API with tool support (WebSearch, FileSearch, MCP, etc.)
  • Structured outputs (JSON Schema with strict mode)
  • Reasoning model support (o-series: effort, summary)
  • Realtime API session creation (ephemeral tokens)
  • 100% OpenAPI field coverage for Chat Completions
  • Same resource structure as Python SDK: client.chat().completions().create()

Feature Flags

Each API resource is behind an optional Cargo feature (all enabled by default):

# All resources (default)
openai-oxide = "0.9"

# Only chat + embeddings
openai-oxide = { version = "0.9", default-features = false, features = ["chat", "embeddings"] }

Available features: chat, responses, embeddings, images, audio, files, fine-tuning, models, moderations, batches, uploads, beta.

Quick Start

Add to Cargo.toml:

[dependencies]
openai-oxide = "0.9"
tokio = { version = "1", features = ["full"] }

Then send a chat completion request:

use openai_oxide::{OpenAI, types::chat::*};

#[tokio::main]
async fn main() -> Result<(), openai_oxide::OpenAIError> {
    let client = OpenAI::from_env()?;

    let request = ChatCompletionRequest::new(
        "gpt-4o-mini",
        vec![
            ChatCompletionMessageParam::System {
                content: "You are a helpful assistant.".into(),
                name: None,
            },
            ChatCompletionMessageParam::User {
                content: UserContent::Text("Hello!".into()),
                name: None,
            },
        ],
    );

    let response = client.chat().completions().create(request).await?;
    println!("{}", response.choices[0].message.content.as_deref().unwrap_or(""));
    Ok(())
}

Responses API

use openai_oxide::{OpenAI, types::responses::*};

#[tokio::main]
async fn main() -> Result<(), openai_oxide::OpenAIError> {
    let client = OpenAI::from_env()?;

    let response = client.responses().create(
        ResponseCreateRequest::new("gpt-5.4")
            .input("What are the latest developments in Rust?")
            .tools(vec![ResponseTool::WebSearch {
                search_context_size: Some("medium".into()),
                user_location: None,
            }])
            .max_output_tokens(1024)
    ).await?;

    println!("{}", response.output_text());

    // Extract function calls
    for fc in response.function_calls() {
        println!("Tool: {}({})", fc.name, fc.arguments);
    }
    Ok(())
}

Streaming

use futures_util::StreamExt;
use openai_oxide::{OpenAI, types::chat::*};

#[tokio::main]
async fn main() -> Result<(), openai_oxide::OpenAIError> {
    let client = OpenAI::from_env()?;

    let request = ChatCompletionRequest::new(
        "gpt-4o-mini",
        vec![ChatCompletionMessageParam::User {
            content: UserContent::Text("Tell me a joke".into()),
            name: None,
        }],
    );

    let mut stream = client.chat().completions().create_stream(request).await?;
    while let Some(chunk) = stream.next().await {
        let chunk = chunk?;
        if let Some(delta) = chunk.choices.first().and_then(|c| c.delta.content.as_deref()) {
            print!("{delta}");
        }
    }
    Ok(())
}
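For readers curious what the stream carries on the wire: Chat Completions streaming uses server-sent events, where each chunk arrives as a `data: {json}` line and the stream ends with `data: [DONE]`. The sketch below extracts those payloads from a raw body. It is a simplified illustration, not the crate's parser: `sse_data_payloads` is a hypothetical name, and the sketch ignores multi-line events and the `event:` and `id:` fields.

```rust
/// Extracts the payload of every `data:` line from a raw SSE body,
/// stopping at the `[DONE]` sentinel that ends a Chat Completions stream.
fn sse_data_payloads(raw: &str) -> Vec<&str> {
    raw.lines()
        // Keep only `data:` lines; blank separators and `:` comments drop out.
        .filter_map(|line| line.strip_prefix("data:"))
        // Simplification: the SSE format strips one leading space; this trims all.
        .map(str::trim_start)
        .take_while(|payload| *payload != "[DONE]")
        .collect()
}
```

Each returned payload is a JSON document that a typed client would then deserialize into a chunk struct, for example with `serde_json::from_str`.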

BYOT (Bring Your Own Types)

Send custom fields or get raw JSON responses using create_raw():

use openai_oxide::OpenAI;
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), openai_oxide::OpenAIError> {
    let client = OpenAI::from_env()?;

    let raw = client.chat().completions().create_raw(&json!({
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "Hi"}],
        "custom_field": true
    })).await?;

    println!("{}", raw["choices"][0]["message"]["content"]);
    Ok(())
}

create_raw() is also available on client.responses() and client.embeddings().

Image Save Helper

Save generated images directly to disk:

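let resp = client.images().generate(req).await?;
// `first()` avoids a panic if the response contains no images
if let Some(image) = resp.data.as_ref().and_then(|images| images.first()) {
    image.save("output.png").await?;  // handles both URL and b64_json
}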

Pagination

All list endpoints support automatic cursor-based pagination:

use futures_util::StreamExt;
use openai_oxide::{OpenAI, types::file::FileListParams};

#[tokio::main]
async fn main() -> Result<(), openai_oxide::OpenAIError> {
    let client = OpenAI::from_env()?;

    // Single page with params
    let page = client.files().list_page(
        FileListParams::new().limit(10)
    ).await?;

    // Auto-paginate through all results
    let mut stream = client.files().list_auto(FileListParams::new());
    while let Some(file) = stream.next().await {
        let file = file?;
        println!("{}: {}", file.id, file.filename);
    }
    Ok(())
}
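Cursor-based pagination on OpenAI list endpoints works by passing the last item's ID of one page as the `after` parameter of the next request until `has_more` is false. The sketch below shows that loop in isolation. It is an assumption about what an auto-paginating helper has to do, not the crate's code: `Page` and `collect_all` are hypothetical, and `fetch` stands in for one HTTP call.

```rust
/// One page of a cursor-paginated list, modeled on the `data`, `last_id`,
/// and `has_more` fields of OpenAI list responses. Hypothetical type.
struct Page<T> {
    data: Vec<T>,
    last_id: Option<String>,
    has_more: bool,
}

/// Fetches pages until `has_more` is false, passing the previous page's
/// `last_id` as the `after` cursor of the next call.
fn collect_all<T, E>(
    mut fetch: impl FnMut(Option<&str>) -> Result<Page<T>, E>,
) -> Result<Vec<T>, E> {
    let mut items = Vec::new();
    let mut after: Option<String> = None;
    loop {
        let page = fetch(after.as_deref())?;
        items.extend(page.data);
        match page.last_id {
            // More pages remain: continue from the last item seen.
            Some(id) if page.has_more => after = Some(id),
            // Last page, or no cursor to continue from: stop.
            _ => return Ok(items),
        }
    }
}
```

A streaming variant yields items as each page arrives instead of collecting them, which keeps memory flat for long lists.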

Configuration

use openai_oxide::{OpenAI, ClientConfig};

// From environment variable OPENAI_API_KEY
let client = OpenAI::from_env()?;

// Explicit API key
let client = OpenAI::new("sk-...");

// Full configuration
let config = ClientConfig::new("sk-...")
    .base_url("https://api.openai.com/v1")
    .timeout_secs(30)
    .max_retries(3);
let client = OpenAI::with_config(config);

Implemented APIs

| API | Method | Status |
|---|---|---|
| Chat Completions | client.chat().completions().create() | Done |
| Chat Completions (streaming) | client.chat().completions().create_stream() | Done |
| Responses | client.responses().create() / create_stream() | Done |
| Responses Tools | Function, WebSearch, FileSearch, CodeInterpreter, ComputerUse, Mcp | Done |
| Embeddings | client.embeddings().create() | Done |
| Models | client.models().list() / retrieve() / delete() | Done |
| Images | client.images().generate() / edit() / create_variation() | Done |
| Audio Transcription | client.audio().transcriptions().create() | Done |
| Audio Translation | client.audio().translations().create() | Done |
| Audio Speech (TTS) | client.audio().speech().create() | Done |
| Files | client.files().create() / list() / retrieve() / delete() / content() | Done |
| Fine-tuning | client.fine_tuning().jobs().create() / list() / cancel() / list_events() | Done |
| Moderations | client.moderations().create() | Done |
| Batches | client.batches().create() / list() / retrieve() / cancel() | Done |
| Uploads | client.uploads().create() / cancel() / complete() | Done |
| Assistants (beta) | client.beta().assistants().create() / list() / retrieve() / delete() | Done |
| Threads (beta) | client.beta().threads().create() / retrieve() / delete() / messages() | Done |
| Runs (beta) | client.beta().runs(thread_id).create() / retrieve() / cancel() | Done |
| Vector Stores (beta) | client.beta().vector_stores().create() / list() / retrieve() / delete() | Done |
| Realtime (beta) | client.beta().realtime().sessions().create() | Done |

Development

cargo test                          # all tests
cargo test --features live-tests    # tests hitting real API (needs OPENAI_API_KEY)
cargo clippy -- -D warnings         # lint
cargo fmt -- --check                # format check
cargo run --example benchmark --features responses --release  # benchmark

License

MIT