modelrelay 6.5.0

# ModelRelay Rust SDK

The ModelRelay Rust SDK is a **responses-first**, **streaming-first** client for building cross-provider LLM features without committing to any single vendor API.

It’s designed to feel great in Rust:
- One fluent builder (`ResponseBuilder`) for **sync/async**, **streaming/non-streaming**, **text/structured**, and **customer-attributed** requests.
- Structured outputs powered by real Rust types (`schemars::JsonSchema` + `serde::Deserialize`) with schema generation, validation, and retry.
- A practical tool-use toolkit (registry, typed arg parsing, retry loops, streaming tool deltas) for “LLM + tools” apps.

```toml
[dependencies]
modelrelay = "5.7.0"
```

## Convenience API

The convenience API is the simplest way to get started. Three methods cover the most common use cases:

### Ask — Get a Quick Answer

```rust
use modelrelay::Client;

let client = Client::from_api_key(std::env::var("MODELRELAY_API_KEY")?)?.build()?;

let answer = client.ask("claude-sonnet-4-5", "What is 2 + 2?", None).await?;
println!("{}", answer); // "4"
```

### Chat — Full Response with Metadata

```rust
use modelrelay::{Client, ChatOptions};

let client = Client::from_api_key(std::env::var("MODELRELAY_API_KEY")?)?.build()?;

let response = client.chat(
    "claude-sonnet-4-5",
    "Explain quantum computing",
    Some(ChatOptions::new().with_system("You are a physics professor")),
).await?;

println!("{}", response.text());
println!("Tokens: {}", response.usage.total_tokens);
```

### Agent — Agentic Tool Loops

Run an agent that automatically executes tools until completion:

```rust
use modelrelay::{Client, AgentOptions, ToolBuilder};
use schemars::JsonSchema;
use serde::Deserialize;

#[derive(JsonSchema, Deserialize)]
struct ReadFileArgs {
    /// File path to read
    path: String,
}

let client = Client::from_api_key(std::env::var("MODELRELAY_API_KEY")?)?.build()?;

let tools = ToolBuilder::new()
    .add_sync::<ReadFileArgs, _>("read_file", "Read a file", |args, _call| {
        let content = std::fs::read_to_string(&args.path)
            .map_err(|e| e.to_string())?;
        Ok(serde_json::json!({ "content": content }))
    });

let result = client.agent(
    "claude-sonnet-4-5",
    AgentOptions::new(tools, "Read config.json and summarize it")
        .with_system("You are a helpful file assistant"),
).await?;

println!("{}", result.output);
println!("Tool calls: {}", result.usage.tool_calls);
```

## Quick Start (Async)

```rust
use modelrelay::{Client, ResponseBuilder};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Client::from_api_key(std::env::var("MODELRELAY_API_KEY")?)?.build()?;

    let response = ResponseBuilder::new()
        .model("claude-sonnet-4-5")
        .system("Answer concisely.")
        .user("Write one line about Rust.")
        .send(&client.responses())
        .await?;

    // The response is structured: output items, tool calls, citations, usage, etc.
    // For the common case, you can extract assistant text directly:
    println!("{}", response.text());
    println!("tokens: {}", response.usage.total());
    Ok(())
}
```

## Chat-Like Text Helpers

For the most common path (**system + user → assistant text**), use the built-in convenience method:

```rust
let text = client
    .responses()
    .text("claude-sonnet-4-5", "Answer concisely.", "Say hi.")
    .await?;
println!("{text}");
```

For customer-attributed requests where the backend selects the model:

```rust
let customer = client.for_customer("customer-123")?;
let text = customer
    .responses()
    .text("Answer concisely.", "Say hi.")
    .await?;
```

## Extracting Assistant Text

If you just need the assistant text, use:

```rust
let text = response.text();
let parts = response.text_chunks(); // each assistant text content part, in order
```

These helpers:
- include only output items with `role == assistant`
- include only `text` content parts
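
As a rough mental model of that filtering, here is a self-contained sketch. `OutputItem`, `ContentPart`, and both functions are hypothetical stand-ins rather than the SDK's real types, and joining chunks without a separator is an assumption:

```rust
// Hypothetical stand-ins for the SDK's response types (illustration only).
#[derive(Debug, Clone, PartialEq)]
enum ContentPart {
    Text(String),
    // Any non-text part (for example a tool call); the text helpers skip it.
    Other,
}

#[derive(Debug, Clone)]
struct OutputItem {
    role: String,
    content: Vec<ContentPart>,
}

/// Collects assistant text parts in order, skipping other roles and part kinds.
fn text_chunks(output: &[OutputItem]) -> Vec<String> {
    output
        .iter()
        .filter(|item| item.role == "assistant")
        .flat_map(|item| item.content.iter())
        .filter_map(|part| match part {
            ContentPart::Text(t) => Some(t.clone()),
            ContentPart::Other => None,
        })
        .collect()
}

/// Assumption: the full text is the chunks concatenated with no separator.
fn text(output: &[OutputItem]) -> String {
    text_chunks(output).concat()
}
```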

## Why This SDK Feels Good

### Fluent request building (value-style)

`ResponseBuilder` is a small, clonable value. You can compose “base requests” and reuse them:

```rust
use modelrelay::ResponseBuilder;

let base = ResponseBuilder::new()
    .model("gpt-4.1")
    .system("You are a careful reviewer.");

let a = base.clone().user("Summarize this changelog…");
let b = base.clone().user("Extract 3 risks…");
```
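
The reason this composes cleanly is the value-style pattern itself: each method takes `self` by value and returns the updated builder, so a clone is an independent branch. A minimal sketch of that pattern, using a hypothetical `RequestSketch` type (not the SDK's `ResponseBuilder`):

```rust
// Hypothetical miniature of a value-style builder (illustration only).
#[derive(Debug, Clone, Default, PartialEq)]
struct RequestSketch {
    model: Option<String>,
    messages: Vec<(String, String)>, // (role, text)
}

impl RequestSketch {
    fn new() -> Self {
        Self::default()
    }

    // Each method consumes `self` and returns the updated value,
    // so calls chain without any shared mutable state.
    fn model(mut self, model: &str) -> Self {
        self.model = Some(model.to_string());
        self
    }

    fn system(mut self, text: &str) -> Self {
        self.messages.push(("system".to_string(), text.to_string()));
        self
    }

    fn user(mut self, text: &str) -> Self {
        self.messages.push(("user".to_string(), text.to_string()));
        self
    }
}
```

Cloning `base` and calling `.user(...)` on each clone leaves `base` untouched, which is what makes reusable "base requests" safe.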

### Streaming you can actually use

If you only want text, stream just deltas:

```rust
use futures_util::StreamExt;
use modelrelay::ResponseBuilder;

let mut deltas = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .user("Write a haiku about type systems.")
    .stream_deltas(&client.responses())
    .await?;

while let Some(delta) = deltas.next().await {
    print!("{}", delta?);
}
```
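
If you also need the complete text once streaming finishes, you can accumulate the deltas as they arrive. The sketch below shows the idea with a plain iterator standing in for the async stream; `collect_deltas` is a hypothetical helper, not part of the SDK:

```rust
/// Accumulates text deltas into the full message, stopping at the first error.
/// A synchronous iterator stands in for the async delta stream here.
fn collect_deltas<E>(
    deltas: impl IntoIterator<Item = Result<String, E>>,
) -> Result<String, E> {
    let mut full = String::new();
    for delta in deltas {
        // `?` propagates the first stream error and drops later deltas.
        full.push_str(&delta?);
    }
    Ok(full)
}
```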

If you want full control, stream typed events (message start/delta/stop, tool deltas, ping/custom):

```rust
use futures_util::StreamExt;
use modelrelay::{ResponseBuilder, StreamEventKind};

let mut stream = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .user("Think step by step, but only output the final answer.")
    .stream(&client.responses())
    .await?;

while let Some(evt) = stream.next().await {
    let evt = evt?;
    if evt.kind == StreamEventKind::MessageDelta {
        if let Some(text) = evt.text_delta {
            print!("{}", text);
        }
    }
}
```

## Workflows

Build multi-step AI pipelines with the workflow helpers.

### Sequential Chain

```rust
use modelrelay::{chain, llm, ChainOptions};

let spec = chain(
    vec![
        llm("summarize", |n| n.system("Summarize.").user("{{task}}")),
        llm("translate", |n| n.system("Translate to French.").user("{{summarize}}")),
    ],
    ChainOptions { name: Some("summarize-translate".into()), model: Some("claude-sonnet-4-5".into()), ..Default::default() },
)
.output("result", "translate", None)
.build()?;

let run = client.runs().create(spec).await?;
```
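
One way to read the `{{...}}` placeholders: `{{task}}` is bound from the run input, and `{{summarize}}` from the output of the step with that id. The actual substitution is performed by ModelRelay, so the following is only a local, conceptual sketch; `render`, `run_chain`, and the stub `model` closure are hypothetical:

```rust
use std::collections::HashMap;

/// Replaces each `{{name}}` with its bound value in a single pass.
/// Unknown or unterminated placeholders are kept verbatim.
fn render(template: &str, bindings: &HashMap<String, String>) -> String {
    let mut out = String::new();
    let mut rest = template;
    while let Some(start) = rest.find("{{") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        match after.find("}}") {
            Some(end) => {
                let name = after[..end].trim();
                match bindings.get(name) {
                    Some(value) => out.push_str(value),
                    // Keep the whole `{{name}}` token when nothing is bound.
                    None => out.push_str(&rest[start..start + 2 + end + 2]),
                }
                rest = &after[end + 2..];
            }
            None => {
                out.push_str(&rest[start..]);
                rest = "";
            }
        }
    }
    out.push_str(rest);
    out
}

/// Runs steps in order. Each step's output is bound under the step id so
/// later templates can reference it. `model` stands in for an LLM call.
fn run_chain(
    steps: &[(&str, &str)], // (step id, user template)
    task: &str,
    model: impl Fn(&str) -> String,
) -> HashMap<String, String> {
    let mut bindings = HashMap::new();
    bindings.insert("task".to_string(), task.to_string());
    for (id, template) in steps {
        let prompt = render(template, &bindings);
        bindings.insert(id.to_string(), model(&prompt));
    }
    bindings
}
```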

### Parallel with Aggregation

```rust
use modelrelay::{parallel, llm, ParallelOptions};

let spec = parallel(
    vec![
        llm("agent_a", |n| n.user("Write 3 ideas for {{task}}.")),
        llm("agent_b", |n| n.user("Write 3 objections for {{task}}.")),
    ],
    ParallelOptions { name: Some("multi-agent".into()), model: Some("claude-sonnet-4-5".into()), ..Default::default() },
)
.llm("aggregate", |n| n.system("Synthesize.").user("{{join}}"))
.edge("join", "aggregate")
.output("result", "aggregate", None)
.build()?;
```

### Map Fan-out

```rust
use modelrelay::{workflow, MapFanoutOptions, LLMNodeBuilder};

let spec = workflow()
    .name("fanout-example")
    .model("claude-sonnet-4-5")
    .llm("generator", |n| n.user("Generate 3 subquestions for {{task}}"))
    .map_fanout("fanout", MapFanoutOptions {
        items_from: Some("generator".into()),
        items_from_input: None,
        items_path: Some("/questions".into()),
        subnode: LLMNodeBuilder::new("answer").user("Answer: {{item}}").build(),
        max_parallelism: Some(4),
    })
    .llm("aggregate", |n| n.user("Combine: {{fanout}}"))
    .output("result", "aggregate", None)
    .build()?;
```
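
`max_parallelism` presumably caps how many subnode runs are in flight at once. The fan-out itself runs on the ModelRelay side; the sketch below only illustrates the idea locally with scoped threads. `map_fanout` here is a hypothetical helper, and it processes items in fixed-size batches, which is simpler than (and not necessarily the same as) whatever scheduling the service uses:

```rust
use std::thread;

/// Applies `f` to every item with at most `max_parallelism` running at a time
/// and returns the results in input order.
fn map_fanout<T, R, F>(items: Vec<T>, max_parallelism: usize, f: F) -> Vec<R>
where
    T: Send,
    R: Send,
    F: Fn(T) -> R + Sync,
{
    let limit = max_parallelism.max(1); // treat 0 as "one at a time"
    let mut results = Vec::with_capacity(items.len());
    let mut pending = items.into_iter().peekable();

    while pending.peek().is_some() {
        // Take the next batch of up to `limit` items.
        let batch: Vec<T> = pending.by_ref().take(limit).collect();
        let f = &f;
        let batch_results: Vec<R> = thread::scope(|scope| {
            // Spawn one worker per item in the batch...
            let handles: Vec<_> = batch
                .into_iter()
                .map(|item| scope.spawn(move || f(item)))
                .collect();
            // ...then join in spawn order so output order matches input order.
            handles
                .into_iter()
                .map(|h| h.join().expect("worker panicked"))
                .collect()
        });
        results.extend(batch_results);
    }
    results
}
```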

### Precompiled Workflows

For workflows that run repeatedly, compile once and reuse:

```rust
use modelrelay::RunsCreateOptions;
use serde_json::json;

// Compile once
let compiled = client.workflows().compile(spec).await?;

// Run multiple times with different inputs
for task in &tasks {
    let run = client.runs().create_from_plan_with_options(
        compiled.plan_hash.clone(),
        RunsCreateOptions {
            input: Some(json!({ "task": task })),
            ..Default::default()
        },
    ).await?;
}
```

### Plugins (Workflows)

Load GitHub-hosted plugins (markdown commands + agents), convert to workflows via `/responses`, then run them with `/runs`:

```rust
use modelrelay::{Client, OrchestrationMode, PluginRunConfig, new_local_fs_tools};

let client = Client::from_secret_key(std::env::var("MODELRELAY_API_KEY")?)?.build()?;
let tools = new_local_fs_tools(std::env::current_dir()?);

let plugin = client.plugins().load("github.com/your-org/your-plugin").await?;
let result = client.plugins().run(
    &plugin,
    "run",
    PluginRunConfig {
        user_task: "Summarize the repo and suggest next steps.".to_string(),
        orchestration_mode: Some(OrchestrationMode::Dynamic),
        tool_registry: Some(std::sync::Arc::new(tools)),
        ..Default::default()
    },
).await?;

println!("{:?}", result.outputs.get("result"));
```

### Structured outputs from Rust types (with retry)

Structured outputs are the “Rust-native” path: you describe a type, and you get a typed value back.

```rust
use modelrelay::{Client, ResponseBuilder};
use schemars::JsonSchema;
use serde::Deserialize;

#[derive(Debug, Deserialize, JsonSchema)]
struct Person {
    name: String,
    age: u32,
    email: Option<String>,
}

let client = Client::from_api_key(std::env::var("MODELRELAY_API_KEY")?)?.build()?;

let result = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .user("Extract: John Doe is 30 years old, john@example.com")
    .structured::<Person>()
    .max_retries(2)
    .send(&client.responses())
    .await?;

println!("{:?}", result.value);
```

You can also stream typed JSON with field-level completion, which suits progressive UIs:

```rust
use futures_util::StreamExt;
use schemars::JsonSchema;
use serde::Deserialize;
use modelrelay::ResponseBuilder;

#[derive(Debug, Deserialize, JsonSchema)]
struct Article {
    title: String,
    summary: String,
    body: String,
}

let mut stream = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .user("Write an article about Rust's ownership model.")
    .structured::<Article>()
    .stream(&client.responses())
    .await?;

while let Some(evt) = stream.next().await {
    let evt = evt?;
    for field in &evt.complete_fields {
        if field == "title" {
            println!("Title: {}", evt.payload.title);
        }
    }
}
```

### Tool use is end-to-end (not just a schema)

The SDK ships the pieces you need to build a complete tool loop:
- create tool schemas from types
- parse/validate tool args into typed structs
- execute tool calls via a registry
- feed results back as tool result messages
- retry tool calls when args are malformed (with model-facing error formatting)

```rust
use modelrelay::{
    function_tool_from_type, parse_tool_args, respond_to_tool_call_json, ResponseBuilder, Tool,
    ToolChoice, ToolRegistry, ResponseExt,
};
use schemars::JsonSchema;
use serde::Deserialize;

#[derive(Debug, Deserialize, JsonSchema)]
struct WeatherArgs {
    location: String,
}

let registry = ToolRegistry::new().register(
    "get_weather",
    modelrelay::sync_handler(|_args_json, call| {
        let args: WeatherArgs = parse_tool_args(call)?;
        Ok(serde_json::json!({ "location": args.location, "temp_f": 72 }))
    }),
);

let schema = function_tool_from_type::<WeatherArgs>()?;
let tool = Tool::function(
    "get_weather",
    Some("Get current weather for a location".into()),
    Some(schema.parameters),
);

let response = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .user("Use the tool to get the weather in San Francisco.")
    .tools(vec![tool])
    .tool_choice(ToolChoice::auto())
    .send(&client.responses())
    .await?;

if response.has_tool_calls() {
    let call = response.first_tool_call().unwrap();
    let result = registry.execute(call).await;
    let tool_result = respond_to_tool_call_json(call, &result.result)?;

    // Feed the tool result back as an input item and continue the conversation.
    let followup = ResponseBuilder::new()
        .model("claude-sonnet-4-5")
        .user("Great—now summarize it in one sentence.")
        .item(tool_result)
        .send(&client.responses())
        .await?;

    println!("followup tokens: {}", followup.usage.total());
}
```
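
To make the retry-on-malformed-arguments step concrete, here is a dependency-free sketch of the control flow. Everything in it is hypothetical (`MiniRegistry`, `call_with_retry`, and the `propose_args` closure that stands in for the model); the SDK's own registry and retry helpers have their own types and signatures:

```rust
use std::collections::HashMap;

type Handler = Box<dyn Fn(&str) -> Result<String, String>>;

/// Hypothetical mini registry mapping tool names to handlers.
struct MiniRegistry {
    handlers: HashMap<String, Handler>,
}

impl MiniRegistry {
    fn new() -> Self {
        Self { handlers: HashMap::new() }
    }

    fn register(mut self, name: &str, handler: Handler) -> Self {
        self.handlers.insert(name.to_string(), handler);
        self
    }

    fn execute(&self, name: &str, args: &str) -> Result<String, String> {
        match self.handlers.get(name) {
            Some(handler) => handler(args),
            None => Err(format!("unknown tool: {name}")),
        }
    }
}

/// Asks `propose_args` (a stand-in for the model) for arguments until the
/// tool accepts them or `max_retries` extra attempts are used up. The
/// previous error is passed back so the "model" can correct itself.
fn call_with_retry(
    registry: &MiniRegistry,
    tool: &str,
    max_retries: usize,
    mut propose_args: impl FnMut(Option<&str>) -> String,
) -> Result<String, String> {
    let mut last_error: Option<String> = None;
    for _ in 0..=max_retries {
        let args = propose_args(last_error.as_deref());
        match registry.execute(tool, &args) {
            Ok(output) => return Ok(output),
            Err(e) => last_error = Some(e),
        }
    }
    Err(last_error.unwrap_or_else(|| "no attempts made".to_string()))
}
```

The key design point is that the error text is fed back into the next attempt rather than discarded, which is what gives the model a chance to repair its arguments.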

### User Interaction — `user.ask`

Use the built-in `user.ask` tool to request human input in a workflow run:

```rust
use futures_util::StreamExt;
use modelrelay::{
    user_ask_result_freeform, user_ask_tool, RunEventPayload, RunsToolCallV0,
    RunsToolResultItemV0, RunsToolResultsRequest,
};

let tools = vec![user_ask_tool()];
let run = client.runs().create(spec).await?;

let mut events = client.runs().stream_events(run.run_id, None, None).await?;
while let Some(event) = events.next().await {
    let event = event?;
    if let RunEventPayload::NodeUserAsk { node_id, user_ask } = event.payload {
        let answer = prompt_user(&user_ask.question); // your UI/input here
        let output = user_ask_result_freeform(answer)?;

        client
            .runs()
            .submit_tool_results(
                run.run_id,
                RunsToolResultsRequest {
                    node_id,
                    step: user_ask.step,
                    request_id: user_ask.request_id,
                    results: vec![RunsToolResultItemV0 {
                        tool_call: RunsToolCallV0 {
                            id: user_ask.tool_call.id,
                            name: user_ask.tool_call.name,
                            arguments: None,
                        },
                        output,
                    }],
                },
            )
            .await?;
    }
}
```

### tools.v0 local filesystem tools (fs.*)

The Rust SDK includes a safe-by-default local filesystem tool pack that implements:
`fs.read_file`, `fs.list_files`, `fs.search`, and `fs.edit`.

```rust
use modelrelay::{LocalFSToolPack, ToolRegistry};

let mut registry = ToolRegistry::new();
let fs_tools = LocalFSToolPack::new(".", Vec::new());
fs_tools.register_into(&mut registry);

// Now registry can execute fs.read_file/fs.list_files/fs.search/fs.edit tool calls.
```

## Customer-Attributed Requests

For metered billing, set `customer_id(...)`. The customer's tier can determine the model (so `model(...)` can be omitted):

```rust
use modelrelay::ResponseBuilder;

let response = ResponseBuilder::new()
    .customer_id("customer-123")
    .user("Hello!")
    .send(&client.responses())
    .await?;
```

## Blocking API (No Tokio)

Enable the `blocking` feature and use the same builder ergonomics:

```rust
use modelrelay::{BlockingClient, BlockingConfig, ResponseBuilder};

let client = BlockingClient::new(BlockingConfig {
    api_key: Some(std::env::var("MODELRELAY_API_KEY")?),
    ..Default::default()
})?;

let response = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .user("Hello!")
    .send_blocking(&client.responses())?;
```

## Feature Flags

| Feature | Default | Description |
|---------|---------|-------------|
| `streaming` | Yes | NDJSON streaming support |
| `blocking` | No | Sync client without Tokio |
| `tracing` | No | OpenTelemetry spans/events |
| `mock` | No | In-memory client for tests |

## Errors

Errors are typed so callers can branch cleanly:

```rust
use modelrelay::{Error, ResponseBuilder};

let result = ResponseBuilder::new()
    .model("claude-sonnet-4-5")
    .user("Hello!")
    .send(&client.responses())
    .await;

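match result {
    // Success: use the response.
    Ok(_response) => {}
    // Rate limited: back off and retry, or surface the error to the caller.
    Err(Error::Api(e)) if e.is_rate_limit() => {}
    // Unauthorized: check the API key.
    Err(Error::Api(e)) if e.is_unauthorized() => {}
    // Network-level failure.
    Err(Error::Transport(_)) => {}
    // Anything else: propagate.
    Err(e) => return Err(e.into()),
}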
```
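
A common use of this branching is retrying with backoff. The sketch below shows one possible policy using only the standard library. `CallError` and `with_backoff` are hypothetical; in real code you would match on the SDK's `Error` as shown above, and use an async sleep instead of `std::thread::sleep` inside a Tokio runtime:

```rust
use std::thread::sleep;
use std::time::Duration;

// Hypothetical error shape for illustration; the SDK's `Error` type differs.
#[derive(Debug, PartialEq)]
enum CallError {
    RateLimited,
    Unauthorized,
    Transport(String),
}

/// Retries `op` on rate limits and transport failures with exponential
/// backoff, and fails fast on errors that retrying cannot fix.
/// Always makes at least one attempt.
fn with_backoff<T>(
    max_attempts: u32,
    base_delay: Duration,
    mut op: impl FnMut() -> Result<T, CallError>,
) -> Result<T, CallError> {
    let mut attempt: u32 = 0;
    loop {
        attempt += 1;
        match op() {
            Ok(value) => return Ok(value),
            // Retrying will not fix bad credentials.
            Err(CallError::Unauthorized) => return Err(CallError::Unauthorized),
            // Out of attempts: return the last error.
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => {
                // Delay doubles each attempt: base, 2*base, 4*base, ...
                // The exponent is capped to avoid integer overflow.
                sleep(base_delay * 2u32.pow((attempt - 1).min(16)));
            }
        }
    }
}
```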

## Documentation

For detailed guides and API reference, visit [docs.modelrelay.ai](https://docs.modelrelay.ai):

- [Rust SDK Reference](https://docs.modelrelay.ai/sdks/rust) — Full SDK documentation
- [First Request](https://docs.modelrelay.ai/getting-started/first-request) — Make your first API call
- [Streaming](https://docs.modelrelay.ai/guides/streaming) — Real-time response streaming
- [Structured Output](https://docs.modelrelay.ai/guides/structured-output) — Get typed JSON responses
- [Tool Use](https://docs.modelrelay.ai/guides/tools) — Let models call functions
- [Error Handling](https://docs.modelrelay.ai/guides/error-handling) — Handle errors gracefully
- [Workflows](https://docs.modelrelay.ai/guides/workflows) — Multi-step AI pipelines