# ModelRelay Rust SDK

```toml
[dependencies]
modelrelay = "0.29.0"
```
## API Matrix

All four combinations of async/blocking × streaming/non-streaming are supported:

| Mode     | Non-Streaming             | Streaming                   |
|----------|---------------------------|-----------------------------|
| Async    | `.send(&client)`          | `.stream(&client)`          |
| Blocking | `.send_blocking(&client)` | `.stream_blocking(&client)` |
Use cases:

- **Async + Streaming** (default): real-time UIs, chatbots, lowest latency to first token
- **Async + Non-Streaming**: async backends where you don't need progressive output
- **Blocking + Streaming**: CLI tools with live output, sync apps needing progressive display
- **Blocking + Non-Streaming**: scripts, CLI tools, sync backends without Tokio
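For instance, a blocking + streaming CLI tool might look like the sketch below. It uses the `stream_blocking` entry point from the matrix above; the chunk shape (`text_delta`) mirrors the async streaming example and is assumed to be the same in blocking mode:

```rust
use modelrelay::{ChatRequestBuilder, Client, Config};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // No #[tokio::main]: the blocking client runs without an async runtime.
    let client = Client::new(Config {
        api_key: Some(std::env::var("MODELRELAY_API_KEY")?),
        ..Default::default()
    })?;

    // Blocking + streaming: tokens still arrive progressively.
    let mut stream = ChatRequestBuilder::new("claude-sonnet-4-20250514")
        .user("Hello!")
        .stream_blocking(&client.llm())?;

    while let Some(chunk) = stream.next()? {
        if let Some(delta) = chunk.text_delta {
            print!("{}", delta);
        }
    }
    Ok(())
}
```

This requires the `blocking` feature (see Features below).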
## Streaming Chat

```rust
use modelrelay::{ChatRequestBuilder, Client, Config};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::new(Config {
        api_key: Some(std::env::var("MODELRELAY_API_KEY")?),
        ..Default::default()
    })?;

    let mut stream = ChatRequestBuilder::new("claude-sonnet-4-20250514")
        .system("You are helpful.")
        .user("Hello!")
        .stream(&client.llm())
        .await?;

    while let Some(chunk) = stream.next().await? {
        if let Some(delta) = chunk.text_delta {
            print!("{}", delta);
        }
    }
    Ok(())
}
```
## Structured Outputs

```rust
use schemars::JsonSchema;
use serde::{Deserialize, Serialize};

#[derive(Debug, Serialize, Deserialize, JsonSchema)]
struct Person {
    name: String,
    age: u32,
}

let result = ChatRequestBuilder::new("claude-sonnet-4-20250514")
    .user("Extract: John Doe is 30 years old")
    .structured::<Person>()
    .max_retries(2)
    .send(&client.llm())
    .await?;

println!("Name: {}, Age: {}", result.value.name, result.value.age);
```
## Streaming Structured Outputs

Build progressive UIs that render fields as they complete:

```rust
#[derive(Debug, Serialize, Deserialize, JsonSchema)]
struct Article {
    title: String,
    summary: String,
    body: String,
}

let mut stream = ChatRequestBuilder::new("claude-sonnet-4-20250514")
    .user("Write an article about Rust")
    .stream_structured::<Article>(&client.llm())
    .await?;

while let Some(event) = stream.next().await? {
    if event.complete_fields.contains("title") {
        render_title(&event.payload.title);
    }
    if event.complete_fields.contains("summary") {
        render_summary(&event.payload.summary);
    }
    if !event.complete_fields.contains("body") {
        render_body_preview(&format!("{}▋", event.payload.body));
    }
}
```
## Customer-Attributed Requests

For metered billing, use `for_customer()` — the customer's tier determines the model, so no model name is given:

```rust
let response = client.llm()
    .for_customer("customer-123")
    .user("Hello!")
    .send(&client.llm())
    .await?;
```
## Customer Management (Backend)

```rust
// Create or update a customer record, keyed by your own external ID.
let customer = client.customers().upsert(CustomerUpsertRequest {
    tier_id: "tier-uuid".into(),
    external_id: "your-user-id".into(),
    email: Some("user@example.com".into()),
    metadata: None,
}).await?;

// Start a checkout session so the customer can subscribe.
let session = client.customers().create_checkout_session(
    "customer-uuid",
    CheckoutSessionRequest {
        success_url: "https://myapp.com/success".into(),
        cancel_url: "https://myapp.com/cancel".into(),
    },
).await?;

// Look up the customer's current subscription status.
let status = client.customers().get_subscription("customer-uuid").await?;
```
## Configuration

```rust
let client = Client::new(Config {
    api_key: Some("mr_sk_...".into()),
    connect_timeout: Some(Duration::from_secs(5)),
    retry: Some(RetryConfig { max_attempts: 3, ..Default::default() }),
    ..Default::default()
})?;
```
## Features

- `client` (default): async reqwest + Tokio
- `blocking`: blocking client (no Tokio)
- `streaming` (default): SSE streaming
- `tracing`: spans/events for observability
- `mock`: in-memory client for tests
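For example, a sync-only build (a sketch; feature names taken from the list above, assuming `blocking` does not require the default `client` feature) would disable default features in `Cargo.toml`:

```toml
[dependencies]
modelrelay = { version = "0.29.0", default-features = false, features = ["blocking"] }
```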