anyllm_client 0.9.7

Async HTTP client for Anthropic-to-OpenAI translation with retry, SSRF protection, and SSE streaming

anyllm_client

Async HTTP client that accepts Anthropic Messages API requests, translates them to OpenAI Chat Completions, sends them to any OpenAI-compatible backend, and translates the response back to Anthropic format. Part of the anyllm-proxy workspace.

What this crate is

A self-contained client library, not a CLI or server. Use it when you want Anthropic-shaped requests and responses in your own Rust code without running the proxy as a sidecar.

It owns:

  • A reqwest-based HTTP client with TLS, mTLS, and (by default) SSRF-safe DNS resolution.
  • Retry with exponential backoff and Retry-After parsing.
  • A framework-agnostic SSE frame parser for streaming responses.
  • Anthropic-shaped tool builders (ToolBuilder, ToolChoiceBuilder).
  • Rate-limit header extraction and conversion between vendor formats.

It deliberately does not own:

  • The format mapping itself: that lives in anyllm_translate and is re-exported where it makes sense.
  • A queue, batch engine, or admin UI: those are separate crates.
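
To make the "conversion between vendor formats" bullet above concrete, here is a minimal sketch of renaming OpenAI-style rate-limit headers to Anthropic-style ones. The function names are hypothetical and are not this crate's API. The reset headers are skipped on purpose, because the two vendors encode reset times differently and this sketch makes no assumption about how the crate converts them.

```rust
/// Maps one OpenAI-style rate-limit header name to an Anthropic-style name.
/// Hypothetical helper for illustration; not the crate's actual API.
fn to_anthropic_header(openai_name: &str) -> Option<String> {
    let lower = openai_name.to_ascii_lowercase();
    // OpenAI-style names look like x-ratelimit-<kind>-<resource>.
    let rest = lower.strip_prefix("x-ratelimit-")?;
    let (kind, resource) = rest.split_once('-')?;
    match (kind, resource) {
        // Anthropic-style names put the resource before the kind.
        ("limit" | "remaining", "requests" | "tokens") => {
            Some(format!("anthropic-ratelimit-{resource}-{kind}"))
        }
        // Reset headers and unrelated headers are dropped in this sketch.
        _ => None,
    }
}

/// Converts a list of (name, value) pairs, keeping only recognized headers.
fn convert_headers(headers: &[(&str, &str)]) -> Vec<(String, String)> {
    headers
        .iter()
        .filter_map(|(name, value)| {
            to_anthropic_header(name).map(|n| (n, value.to_string()))
        })
        .collect()
}

fn main() {
    let upstream = [
        ("x-ratelimit-remaining-requests", "59"),
        ("x-ratelimit-limit-tokens", "150000"),
        ("content-type", "application/json"),
    ];
    for (name, value) in convert_headers(&upstream) {
        println!("{name}: {value}");
    }
}
```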

Where it fits

Five-crate workspace:

  • anyllm_translate - pure format mapping, no I/O.
  • anyllm_providers - provider and model catalog.
  • anyllm_client (this crate) - async HTTP client wrapping translate + transport.
  • anyllm_batch_engine - batch job queue and webhook delivery.
  • anyllm_proxy - axum HTTP server, admin UI, config parsing.

Depend on this crate directly when you need Anthropic-in / Anthropic-out from inside your own application. Depend on anyllm_proxy when you want a standalone HTTP server with config files, an admin UI, virtual keys, and metrics.

Add it

[dependencies]
anyllm_client = "0.9"

Default features include ssrf-protection. Disable it only for local development against 127.0.0.1 backends. Note that default-features = false turns off every default feature, not only ssrf-protection:

anyllm_client = { version = "0.9", default-features = false }
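
For context on why a loopback backend needs the feature off, here is a rough sketch of the kind of address check an SSRF filter typically applies after DNS resolution. It uses only the standard library. The function names and the exact set of blocked ranges are assumptions for illustration; the crate's real rules may differ.

```rust
use std::net::{IpAddr, Ipv4Addr};

/// Returns true for IPv4 addresses that are not publicly routable.
/// Hypothetical helper; the blocked ranges are an assumption.
fn is_blocked_v4(v4: Ipv4Addr) -> bool {
    v4.is_loopback()          // 127.0.0.0/8
        || v4.is_private()    // 10/8, 172.16/12, 192.168/16
        || v4.is_link_local() // 169.254.0.0/16 (cloud metadata lives here)
        || v4.is_unspecified()
        || v4.is_broadcast()
}

/// Returns true when a resolved address should be refused.
fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_blocked_v4(v4),
        IpAddr::V6(v6) => {
            // IPv4-mapped addresses (::ffff:a.b.c.d) are judged as IPv4.
            if let Some(v4) = v6.to_ipv4_mapped() {
                return is_blocked_v4(v4);
            }
            let first = v6.segments()[0];
            v6.is_loopback()
                || v6.is_unspecified()
                || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
                || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
        }
    }
}

fn main() {
    for addr in ["127.0.0.1", "10.0.0.5", "8.8.8.8", "::1"] {
        let ip: IpAddr = addr.parse().expect("valid IP literal");
        println!("{addr}: blocked = {}", is_blocked_ip(ip));
    }
}
```

A filter like this has to run on the resolved address rather than the hostname, which is presumably why the feature is described as SSRF-safe DNS resolution.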

Library examples

1. Minimal: builder shorthand

The Client::builder() shorthand is the smallest path from "I have an API key" to "I have a working client". It defaults Auth to Bearer and uses the default TranslationConfig (1:1 model passthrough).

use anyllm_client::Client;
use anyllm_translate::anthropic::MessageCreateRequest;

let client = Client::builder()
    .base_url("https://api.openai.com/v1/chat/completions")
    .api_key(&std::env::var("OPENAI_API_KEY")?)
    .build()?;

let req: MessageCreateRequest = serde_json::from_str(r#"{
    "model": "gpt-4o-mini",
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Summarize Rust's borrow checker"}]
}"#)?;

let resp = client.messages(&req).await?;
println!("{:#?}", resp.content);

2. Anthropic-shaped clients on top of OpenAI-compatible providers

If your application speaks Anthropic Messages but you want to point it at Groq, OpenRouter, Together, or any local OpenAI-compatible server, use ClientConfig::builder() with a TranslationConfig that maps your Anthropic model aliases onto whatever the backend actually serves.

use anyllm_client::{Auth, Client, ClientConfig};
use anyllm_translate::TranslationConfig;

let translation = TranslationConfig::builder()
    .model_map("claude-3-5-haiku-latest",  "llama-3.1-8b-instant")
    .model_map("claude-3-5-sonnet-latest", "llama-3.3-70b-versatile")
    .build();

let client = Client::new(
    ClientConfig::builder()
        .backend_url("https://api.groq.com/openai/v1/chat/completions")
        .auth(Auth::Bearer(std::env::var("GROQ_API_KEY")?.into()))
        .translation(translation)
        .build(),
);

The same shape works for http://localhost:11434/v1/chat/completions (Ollama) or http://localhost:1234/v1/chat/completions (LM Studio). For keyless local backends, pass Auth::Bearer("".into()).
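
Conceptually, the model map is a lookup table consulted before the request goes out. The sketch below shows that idea with a plain HashMap. ModelMap and its methods are hypothetical stand-ins, not the TranslationConfig API, and the fallback for unmapped names (pass the name through unchanged) is an assumption based on the 1:1 passthrough default described earlier.

```rust
use std::collections::HashMap;

/// Hypothetical alias table; a stand-in for illustration only.
struct ModelMap {
    aliases: HashMap<String, String>,
}

impl ModelMap {
    fn new() -> Self {
        ModelMap { aliases: HashMap::new() }
    }

    /// Registers an alias, builder-style.
    fn with(mut self, from: &str, to: &str) -> Self {
        self.aliases.insert(from.to_string(), to.to_string());
        self
    }

    /// Returns the backend model for `requested`, or `requested` itself
    /// when no alias is registered (assumed passthrough behavior).
    fn resolve<'a>(&'a self, requested: &'a str) -> &'a str {
        self.aliases
            .get(requested)
            .map(String::as_str)
            .unwrap_or(requested)
    }
}

fn main() {
    let map = ModelMap::new()
        .with("claude-3-5-haiku-latest", "llama-3.1-8b-instant")
        .with("claude-3-5-sonnet-latest", "llama-3.3-70b-versatile");

    println!("{}", map.resolve("claude-3-5-haiku-latest"));
    println!("{}", map.resolve("some-unmapped-model"));
}
```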

3. Streaming SSE

messages_stream returns a stream of Anthropic-shaped StreamEvent values. Translation happens incrementally so you can render tokens as they arrive.

use futures::StreamExt;
use anyllm_translate::anthropic::streaming::{Delta, StreamEvent};

let (mut stream, _rate_limits) = client.messages_stream(&req).await?;
while let Some(event) = stream.next().await {
    match event? {
        StreamEvent::ContentBlockDelta { delta: Delta::TextDelta { text }, .. } => {
            print!("{text}");
        }
        StreamEvent::MessageStop {} => break,
        _ => {}
    }
}
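
Underneath the event stream sits SSE framing: the response body arrives in arbitrary chunks, and a frame is only complete once a blank line is seen. The sketch below illustrates that buffering step with the standard library alone. SseParser and SseFrame here are hypothetical types, not the crate's sse module, and the sketch is simplified: it only recognizes "\n\n" as the frame boundary and ignores the id and retry fields.

```rust
/// One parsed frame. Hypothetical type for illustration.
#[derive(Debug, PartialEq)]
struct SseFrame {
    event: Option<String>,
    data: String,
}

/// Accumulates chunks until a full frame is available.
#[derive(Default)]
struct SseParser {
    buf: String,
}

impl SseParser {
    /// Feeds one chunk and returns every frame completed by it.
    fn feed(&mut self, chunk: &str) -> Vec<SseFrame> {
        self.buf.push_str(chunk);
        let mut frames = Vec::new();
        // A blank line ("\n\n") terminates a frame.
        while let Some(end) = self.buf.find("\n\n") {
            let raw: String = self.buf.drain(..end + 2).collect();
            if let Some(frame) = parse_frame(&raw) {
                frames.push(frame);
            }
        }
        frames
    }
}

/// Parses the lines of one frame; returns None if it carries no data.
fn parse_frame(raw: &str) -> Option<SseFrame> {
    let mut event = None;
    let mut data_lines: Vec<&str> = Vec::new();
    for line in raw.lines() {
        // Skip blank lines and comments such as ": keep-alive".
        if line.is_empty() || line.starts_with(':') {
            continue;
        }
        let (field, value) = match line.split_once(':') {
            // A single leading space after the colon is not part of the value.
            Some((f, v)) => (f, v.strip_prefix(' ').unwrap_or(v)),
            None => (line, ""),
        };
        match field {
            "event" => event = Some(value.to_string()),
            "data" => data_lines.push(value),
            _ => {} // id, retry, and unknown fields are ignored here
        }
    }
    if data_lines.is_empty() {
        return None;
    }
    // Multiple data lines are joined with newlines.
    Some(SseFrame { event, data: data_lines.join("\n") })
}

fn main() {
    let mut parser = SseParser::default();
    // The JSON payload is split across two chunks on purpose.
    for chunk in ["event: content_block_delta\ndata: {\"text\":", "\"Hi\"}\n\n"] {
        for frame in parser.feed(chunk) {
            println!("{frame:?}");
        }
    }
}
```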

4. Tool use with the fluent builders

ToolBuilder and ToolChoiceBuilder produce Anthropic-shaped tool definitions without raw JSON.

use anyllm_client::{ToolBuilder, ToolChoiceBuilder};
use serde_json::json;

let weather = ToolBuilder::new("get_weather")
    .description("Get the current weather for a location")
    .input_schema(json!({
        "type": "object",
        "properties": { "location": { "type": "string" } },
        "required": ["location"]
    }))
    .build();

let mut req: anyllm_translate::anthropic::MessageCreateRequest =
    serde_json::from_str(r#"{
        "model": "claude-3-5-sonnet-latest",
        "max_tokens": 512,
        "messages": [{"role": "user", "content": "What's the weather in Paris?"}]
    }"#)?;
req.tools = Some(vec![weather]);
req.tool_choice = Some(ToolChoiceBuilder::auto());

let resp = client.messages(&req).await?;

5. Custom auth header (Azure, custom gateways)

Backends that expect an api-key header instead of Authorization: Bearer can use Auth::Header.

use anyllm_client::{Auth, Client, ClientConfig};

let client = Client::new(
    ClientConfig::builder()
        .backend_url("https://my-resource.openai.azure.com/openai/deployments/gpt-4o/chat/completions?api-version=2024-10-21")
        .auth(Auth::Header { name: "api-key".into(), value: std::env::var("AZURE_OPENAI_KEY")?.into() })
        .build(),
);

6. Tuning timeouts and retries

use std::time::Duration;
use anyllm_client::Client;

let client = Client::builder()
    .base_url("https://api.openai.com/v1/chat/completions")
    .api_key(&std::env::var("OPENAI_API_KEY")?)
    .connect_timeout(Duration::from_secs(5))
    .read_timeout(Duration::from_secs(120))
    .max_retries(5)
    .build()?;

Retries fire on 429 and 5xx responses, use exponential backoff, and honor the Retry-After header when the backend sends one. The retry helpers (backoff_delay, is_retryable, parse_retry_after, send_with_retry) are re-exported if you want to reuse them in your own HTTP code.
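
As a rough illustration of that policy, the sketch below computes a retry delay from the attempt number and an optional Retry-After value. The signatures of the crate's own helpers are not shown here, so the functions are given deliberately different, hypothetical names. Assumptions: delays double per attempt from a base value up to a cap, only the delta-seconds form of Retry-After is parsed (the HTTP-date form is ignored), and no jitter is applied.

```rust
use std::time::Duration;

/// True for the status codes described above: 429 and any 5xx.
fn is_retryable_status(status: u16) -> bool {
    status == 429 || (500..=599).contains(&status)
}

/// Exponential backoff: base * 2^attempt, saturating and capped.
fn exponential_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt);
    base.saturating_mul(factor).min(cap)
}

/// Parses the delta-seconds form of Retry-After, e.g. "7".
fn retry_after_secs(value: &str) -> Option<Duration> {
    value.trim().parse::<u64>().ok().map(Duration::from_secs)
}

/// Prefers a parseable Retry-After value, else falls back to backoff.
fn next_delay(
    attempt: u32,
    retry_after: Option<&str>,
    base: Duration,
    cap: Duration,
) -> Duration {
    retry_after
        .and_then(retry_after_secs)
        .map(|d| d.min(cap)) // never wait longer than the cap
        .unwrap_or_else(|| exponential_delay(attempt, base, cap))
}

fn main() {
    let base = Duration::from_millis(500);
    let cap = Duration::from_secs(30);
    for attempt in 0..5 {
        println!("attempt {attempt}: {:?}", next_delay(attempt, None, base, cap));
    }
    println!("with Retry-After: {:?}", next_delay(0, Some("7"), base, cap));
}
```

Production retry loops usually add random jitter to the computed delay so that many clients do not retry in lockstep; it is left out here to keep the arithmetic easy to follow.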

7. Sharing one reqwest::Client across many Clients

This is useful when you fan out requests to multiple backends and want them to share a single connection pool.

use anyllm_client::{build_http_client, Auth, Client, ClientConfig, HttpClientConfig};

let http = build_http_client(&HttpClientConfig::new());

let openai = Client::with_http_client(http.clone(), ClientConfig::builder()
    .backend_url("https://api.openai.com/v1/chat/completions")
    .auth(Auth::Bearer(std::env::var("OPENAI_API_KEY")?.into()))
    .build());

let groq = Client::with_http_client(http, ClientConfig::builder()
    .backend_url("https://api.groq.com/openai/v1/chat/completions")
    .auth(Auth::Bearer(std::env::var("GROQ_API_KEY")?.into()))
    .build());

Modules

Module      Purpose
client      High-level Client, ClientBuilder, ClientConfig, Auth.
http        HTTP client builder with TLS and SSRF protection.
retry       Generic retry with exponential backoff and Retry-After parsing.
rate_limit  Vendor rate-limit header extraction and format conversion.
sse         Framework-agnostic SSE frame parser.
tools       Builder helpers for Tool and ToolChoice.
error       ClientError and related types.

Tests

cargo test -p anyllm_client