# ModelRelay Rust SDK
The ModelRelay Rust SDK is a responses-first, streaming-first client for building cross-provider LLM features without committing to any single vendor API.
It’s designed to feel great in Rust:
- One fluent builder (`ResponseBuilder`) for sync/async, streaming/non-streaming, text/structured, and customer-attributed requests.
- Structured outputs powered by real Rust types (`schemars::JsonSchema` + `serde::Deserialize`) with schema generation, validation, and retry.
- A practical tool-use toolkit (registry, typed arg parsing, retry loops, streaming tool deltas) for "LLM + tools" apps.
```toml
[dependencies]
modelrelay = "5.1.0"
```
## Quick Start (Async)

```rust
use modelrelay::{Client, ResponseBuilder};

// The API key and model id below are placeholders.
#[tokio::main]
async fn main() -> Result<(), modelrelay::Error> {
    let client = Client::from_api_key("YOUR_API_KEY")?.build()?;

    let response = ResponseBuilder::new()
        .model("model-id")
        .user("Hello!")
        .send(&client)
        .await?;

    println!("{}", response.text());
    Ok(())
}
```
## Chat-Like Text Helpers

For the most common path (system + user → assistant text), use the built-in convenience:

```rust
// Arguments shown are illustrative placeholders.
let text = client
    .responses
    .text("model-id", "You are a helpful assistant.", "Say hello.")
    .await?;

println!("{text}");
```
For customer-attributed requests where the backend selects the model:

```rust
let customer = client.for_customer("cus_123")?;

let text = customer
    .responses
    .text("You are a helpful assistant.", "Say hello.")
    .await?;
```
## Extracting Assistant Text

If you just need the assistant text, use:

```rust
let text = response.text();
let parts = response.text_chunks(); // each assistant text content part, in order
```
These helpers:

- include only output items with `role == assistant`
- include only `text` content parts
## Why This SDK Feels Good

### Fluent request building (value-style)

`ResponseBuilder` is a small, clonable value. You can compose "base requests" and reuse them:

```rust
use modelrelay::ResponseBuilder;

let base = ResponseBuilder::new()
    .model("model-id")
    .system("You are a helpful assistant.");

let a = base.clone().user("First question");
let b = base.clone().user("Second question");
```
### Streaming you can actually use

If you only want text, stream just deltas:

```rust
use futures::StreamExt;
use modelrelay::ResponseBuilder;

let mut deltas = ResponseBuilder::new()
    .model("model-id")
    .user("Write a haiku.")
    .stream_deltas(&client)
    .await?;

while let Some(delta) = deltas.next().await {
    print!("{}", delta?);
}
```
If you want full control, stream typed events (message start/delta/stop, tool deltas, ping/custom):

```rust
use futures::StreamExt;
use modelrelay::{ResponseBuilder, StreamEvent};

let mut stream = ResponseBuilder::new()
    .model("model-id")
    .user("Write a haiku.")
    .stream(&client)
    .await?;

while let Some(event) = stream.next().await {
    // Variant names are illustrative; match the events you care about.
    match event? {
        StreamEvent::MessageDelta(delta) => print!("{delta}"),
        StreamEvent::MessageStop => break,
        _ => {}
    }
}
```
## Workflows

Build multi-step AI pipelines with the workflow helpers.

### Sequential Chain

```rust
use modelrelay::workflows::chain;

// Step names and prompts are illustrative.
let spec = chain()
    .llm("draft", "Write a short summary of the input.")
    .llm("polish", "Tighten the summary.")
    .output("polish")
    .build()?;

let run = client.runs.create(spec).await?;
```
### Parallel with Aggregation

```rust
use modelrelay::workflows::parallel;

// Node names and prompts are illustrative.
let spec = parallel()
    .llm("pros", "List the pros of the proposal.")
    .llm("cons", "List the cons of the proposal.")
    .llm("aggregate", "Combine the pros and cons into a verdict.")
    .edge("pros", "aggregate")
    .edge("cons", "aggregate")
    .output("aggregate")
    .build()?;
```
### Map Fan-out

```rust
use modelrelay::workflows::workflow;

// Node names and prompts are illustrative.
let spec = workflow()
    .name("review-pipeline")
    .model("model-id")
    .llm("split", "Split the document into sections.")
    .map_fanout("review", "Review this section.") // runs once per fanned-out item
    .llm("merge", "Combine the section reviews.")
    .output("merge")
    .build()?;
```
### Structured outputs from Rust types (with retry)

Structured outputs are the "Rust-native" path: you describe a type, and you get a typed value back.

```rust
use modelrelay::{Client, ResponseBuilder};
use schemars::JsonSchema;
use serde::Deserialize;

#[derive(JsonSchema, Deserialize, Debug)]
struct Recipe {
    name: String,
    ingredients: Vec<String>,
}

let client = Client::from_api_key("YOUR_API_KEY")?.build()?;

let result: Recipe = ResponseBuilder::new()
    .model("model-id")
    .user("Give me a pancake recipe.")
    .structured::<Recipe>() // schema generated from the type; method name is illustrative
    .max_retries(2)
    .send(&client)
    .await?;

println!("{result:?}");
```
And you can stream typed JSON with field-level completion for progressive UIs:

```rust
use futures::StreamExt;
use schemars::JsonSchema;
use serde::Deserialize;
use modelrelay::ResponseBuilder;

#[derive(JsonSchema, Deserialize)]
struct Profile {
    name: String,
    bio: String,
}

let mut stream = ResponseBuilder::new()
    .model("model-id")
    .user("Generate a short user profile.")
    .structured::<Profile>()
    .stream(&client)
    .await?;

while let Some(update) = stream.next().await {
    // Each update carries the partial value and which fields have completed.
    render(update?); // render: your progressive-UI hook (illustrative)
}
```
### Tool use is end-to-end (not just a schema)

The SDK ships the pieces you need to build a complete tool loop:

- create tool schemas from types
- parse/validate tool args into typed structs
- execute tool calls via a registry
- feed results back as tool result messages
- retry tool calls when args are malformed (with model-facing error formatting)
```rust
use modelrelay::{ResponseBuilder, ToolRegistry, function, schema_for};
use schemars::JsonSchema;
use serde::Deserialize;

// Tool name, argument type, and helper signatures are illustrative.
#[derive(JsonSchema, Deserialize)]
struct WeatherArgs {
    city: String,
}

let registry = ToolRegistry::new().register("get_weather", |args: WeatherArgs| async move {
    Ok(format!("Sunny in {}", args.city))
});

let schema = schema_for::<WeatherArgs>()?;
let tool = function("get_weather", "Get the current weather for a city", schema);

let response = ResponseBuilder::new()
    .model("model-id")
    .user("What's the weather in Paris?")
    .tools(vec![tool])
    .tool_choice("auto")
    .send(&client)
    .await?;

if response.has_tool_calls() {
    // Execute via the registry, then feed results back as tool result messages.
    let results = registry.execute(&response).await?;
}
```
## tools.v0 local filesystem tools (fs.*)

The Rust SDK includes a safe-by-default local filesystem tool pack that implements
`fs.read_file`, `fs.list_files`, `fs.search`, and `fs.edit`:

```rust
use modelrelay::{ToolRegistry, tools::FsTools};

let mut registry = ToolRegistry::new();
let fs_tools = FsTools::new("."); // type name and sandbox-root argument are illustrative
fs_tools.register_into(&mut registry);

// Now registry can execute fs.read_file/fs.list_files/fs.search/fs.edit tool calls.
```
## Customer-Attributed Requests

For metered billing, set `customer_id(...)`. The customer's tier can determine the model (so `model(...)` can be omitted):

```rust
use modelrelay::ResponseBuilder;

let response = ResponseBuilder::new()
    .customer_id("cus_123")
    .user("Hello!")
    .send(&client)
    .await?;
```
## Blocking API (No Tokio)

Enable the `blocking` feature and use the same builder ergonomics:

```rust
use modelrelay::blocking::Client;
use modelrelay::ResponseBuilder;

let client = Client::new("YOUR_API_KEY")?;

let response = ResponseBuilder::new()
    .model("model-id")
    .user("Hello!")
    .send_blocking(&client)?;
```
## Feature Flags

| Feature | Default | Description |
|---|---|---|
| `streaming` | Yes | NDJSON streaming support |
| `blocking` | No | Sync client without Tokio |
| `tracing` | No | OpenTelemetry spans/events |
| `mock` | No | In-memory client for tests |
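Assuming the crate is published as `modelrelay` (as in the install snippet above), opting into the optional features follows the usual Cargo pattern:

```toml
[dependencies]
# "streaming" is on by default; add optional features as needed.
modelrelay = { version = "5.1.0", features = ["blocking", "mock"] }
```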
## Errors

Errors are typed so callers can branch cleanly:

```rust
use modelrelay::{Error, ResponseBuilder};

let result = ResponseBuilder::new()
    .model("model-id")
    .user("Hello!")
    .send(&client)
    .await;

// Variant names are illustrative of the typed error surface.
match result {
    Ok(response) => println!("{}", response.text()),
    Err(Error::RateLimited { .. }) => { /* back off and retry */ }
    Err(e) => eprintln!("request failed: {e}"),
}
```
## Documentation

For detailed guides and API reference, visit docs.modelrelay.ai:
- Rust SDK Reference — Full SDK documentation
- First Request — Make your first API call
- Streaming — Real-time response streaming
- Structured Output — Get typed JSON responses
- Tool Use — Let models call functions
- Error Handling — Handle errors gracefully
- Workflows — Multi-step AI pipelines