iron-providers
Multi-provider inference layer for AgentIron.
Install
[dependencies]
iron-providers = "0.1"
Normalizes requests and responses across OpenAI Responses, OpenAI Chat Completions, and Anthropic Messages API families through a profile-driven generic provider and registry.
Provider Slugs
Built-in provider profiles are identified by slug. Each slug maps to a specific API family, base URL, and authentication strategy.
| Slug | models.dev ID | API Family | Purpose | Auth |
|---|---|---|---|---|
| anthropic | anthropic | Anthropic Messages | General | x-api-key header |
| minimax | minimax | Anthropic Messages | General | Bearer token |
| minimax-code | minimax-coding-plan | Anthropic Messages | Coding | Bearer token |
| zai | zai | OpenAI Chat Completions | General | Bearer token |
| zai-code | zai-coding-plan | OpenAI Chat Completions | Coding | Bearer token |
| kimi | moonshotai | OpenAI Chat Completions | General | Bearer token |
| kimi-code | kimi-for-coding | Anthropic Messages | Coding | x-api-key header or OAuth Bearer |
| openrouter | openrouter | OpenAI Chat Completions | General | Bearer token |
| requesty | requesty | OpenAI Chat Completions | General | Bearer token |
| codex | openai | Codex Responses | Coding | OAuth Bearer |
Coding-purpose slugs route to endpoints optimized for code generation tasks.
API Families
Providers are grouped into four adapter families:
- OpenAiResponses: Uses the OpenAI Responses API via async-openai.
- OpenAiChatCompletions: Uses the /chat/completions endpoint via reqwest. Compatible with any OpenAI-compatible API.
- AnthropicMessages: Uses the Anthropic /v1/messages endpoint via reqwest.
- CodexResponses: Uses the Codex /responses endpoint via reqwest. Supports store: false, reasoning, and parallel_tool_calls.
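As a rough illustration of the list above, the four families can be modeled as an enum that records which HTTP client each adapter uses and which endpoint path it targets. The `ApiFamily` enum and its methods here are hypothetical stand-ins written for this sketch, not the crate's actual definitions.

```rust
/// Hypothetical mirror of the four adapter families listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApiFamily {
    OpenAiResponses,
    OpenAiChatCompletions,
    AnthropicMessages,
    CodexResponses,
}

impl ApiFamily {
    /// HTTP client library the adapter is built on, per the list above.
    pub fn client_library(self) -> &'static str {
        match self {
            ApiFamily::OpenAiResponses => "async-openai",
            _ => "reqwest",
        }
    }

    /// Endpoint path named in this README, if one is stated for the family.
    pub fn endpoint_path(self) -> Option<&'static str> {
        match self {
            // No explicit path is given for the Responses adapter.
            ApiFamily::OpenAiResponses => None,
            ApiFamily::OpenAiChatCompletions => Some("/chat/completions"),
            ApiFamily::AnthropicMessages => Some("/v1/messages"),
            ApiFamily::CodexResponses => Some("/responses"),
        }
    }
}
```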
Registry Usage
use ;
// Create a registry with built-in providers
let registry = default;
// Look up a provider by slug
let provider = registry.get?;
// Use it for inference
let request = new;
let events = provider.infer.await?;
HTTP Timeouts
All adapters apply connect_timeout (default 30s) and read_timeout
(default 60s between socket reads) to their HTTP clients so a stalled
provider surfaces as a transport error instead of hanging the caller.
Override per session on RuntimeConfig:
use std::time::Duration;
let runtime = new
.with_connect_timeout
.with_read_timeout;
OpenAiConfig exposes equivalent with_connect_timeout /
with_read_timeout builders for direct callers of openai::infer.
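The default-plus-override behavior described in this section can be pictured with a small builder. This is only a sketch: `TimeoutSettings` is a hypothetical type invented for illustration, and only the method names and the 30s/60s defaults come from the text above.

```rust
use std::time::Duration;

/// Hypothetical holder for the two timeouts described above.
#[derive(Debug, Clone, PartialEq)]
pub struct TimeoutSettings {
    pub connect_timeout: Duration,
    pub read_timeout: Duration,
}

impl Default for TimeoutSettings {
    fn default() -> Self {
        Self {
            // Defaults stated in this README: 30s connect, 60s between reads.
            connect_timeout: Duration::from_secs(30),
            read_timeout: Duration::from_secs(60),
        }
    }
}

impl TimeoutSettings {
    /// Override the connect timeout, leaving the read timeout unchanged.
    pub fn with_connect_timeout(mut self, timeout: Duration) -> Self {
        self.connect_timeout = timeout;
        self
    }

    /// Override the read timeout, leaving the connect timeout unchanged.
    pub fn with_read_timeout(mut self, timeout: Duration) -> Self {
        self.read_timeout = timeout;
        self
    }
}
```

For example, `TimeoutSettings::default().with_read_timeout(Duration::from_secs(120))` keeps the 30s connect timeout and doubles the read timeout.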
Credentials
Providers accept either an API key or an OAuth bearer token via RuntimeConfig.
API key (default):
use RuntimeConfig;
let runtime = new;
OAuth bearer token:
use ;
use ;
let runtime = from_credential;
Each ProviderProfile declares which credential kinds it supports via credential_auth. When a credential is passed, GenericProvider validates:
- The credential kind is supported by the profile.
- OAuth bearer tokens have not expired.
- The underlying secret is non-empty.
If validation fails, ProviderError::auth() is returned immediately with a clear message.
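The three checks above could look roughly like the following. All names here (`CredentialKind`, `Credential`, `validate`) are hypothetical and chosen for the sketch; the real crate returns a `ProviderError` rather than a `String`, and its credential types may be shaped differently.

```rust
use std::time::SystemTime;

/// Hypothetical credential kinds, mirroring "API key" and "OAuth bearer token".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CredentialKind {
    ApiKey,
    OAuthBearer,
}

/// Hypothetical credential value; `expires_at` only matters for OAuth tokens.
#[derive(Debug, Clone)]
pub struct Credential {
    pub kind: CredentialKind,
    pub secret: String,
    pub expires_at: Option<SystemTime>,
}

/// Runs the three checks in order and returns the first failure as a message.
pub fn validate(
    credential: &Credential,
    supported: &[CredentialKind],
    now: SystemTime,
) -> Result<(), String> {
    // 1. The credential kind must be supported by the profile.
    if !supported.contains(&credential.kind) {
        return Err(format!("credential kind {:?} is not supported", credential.kind));
    }
    // 2. OAuth bearer tokens must not have expired.
    if credential.kind == CredentialKind::OAuthBearer {
        if let Some(expiry) = credential.expires_at {
            if expiry <= now {
                return Err("OAuth bearer token has expired".to_string());
            }
        }
    }
    // 3. The underlying secret must be non-empty.
    if credential.secret.trim().is_empty() {
        return Err("credential secret is empty".to_string());
    }
    Ok(())
}
```

Passing `now` as a parameter instead of calling `SystemTime::now()` inside the function keeps the expiry check deterministic in tests.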
Request Model
InferenceRequest now carries an InferenceContext:
- context.transcript: model-visible conversation history
- context.runtime_records: runtime-only structured records
Runtime records are not projected into assistant-visible message history by default. They exist so adapters and runtimes can carry structured state without polluting the model transcript.
use ;
use serde_json::json;
let mut request = new;
request
.context
.add_record;
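To make the separation concrete, here is a self-contained sketch of a context that keeps the two collections apart and exposes only the transcript to the model. `ContextSketch`, `SketchMessage`, and `SketchRecord` are hypothetical stand-ins; the real `InferenceContext` and its record type will differ in detail.

```rust
/// Hypothetical stand-in for a transcript message.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchMessage {
    pub role: String,
    pub content: String,
}

/// Hypothetical stand-in for a runtime-only record.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchRecord {
    pub kind: String,
    pub payload: String,
}

/// Hypothetical context holding both collections side by side.
#[derive(Debug, Default)]
pub struct ContextSketch {
    pub transcript: Vec<SketchMessage>,
    pub runtime_records: Vec<SketchRecord>,
}

impl ContextSketch {
    pub fn add_message(&mut self, role: &str, content: &str) {
        self.transcript.push(SketchMessage {
            role: role.to_string(),
            content: content.to_string(),
        });
    }

    pub fn add_record(&mut self, kind: &str, payload: &str) {
        self.runtime_records.push(SketchRecord {
            kind: kind.to_string(),
            payload: payload.to_string(),
        });
    }

    /// What an adapter would serialize for the model: the transcript only.
    /// Runtime records are deliberately not included.
    pub fn model_visible(&self) -> &[SketchMessage] {
        &self.transcript
    }
}
```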
Custom Providers
Register custom providers with a profile:
use ;
let mut registry = new;
registry.register;
let provider = registry.get?;
URL Pattern Resolution
For auto-detection based on endpoint URLs:
registry.register_by_url_pattern;
let profile = registry.resolve_by_url;
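One plausible way to implement URL-based resolution is a list of patterns checked in registration order. The sketch below assumes plain substring matching, which may not be how the crate matches patterns; `UrlPatternTable` and the example host are hypothetical.

```rust
/// Hypothetical table mapping URL patterns to provider slugs.
#[derive(Debug, Default)]
pub struct UrlPatternTable {
    entries: Vec<(String, String)>,
}

impl UrlPatternTable {
    /// Associates a URL pattern with a provider slug.
    pub fn register(&mut self, pattern: &str, slug: &str) {
        self.entries.push((pattern.to_string(), slug.to_string()));
    }

    /// Returns the slug of the first registered pattern contained in `url`.
    /// Substring matching is an assumption made for this sketch.
    pub fn resolve(&self, url: &str) -> Option<&str> {
        self.entries
            .iter()
            .find(|(pattern, _)| url.contains(pattern.as_str()))
            .map(|(_, slug)| slug.as_str())
    }
}
```

Because the first match wins, more specific patterns should be registered before broader ones under this scheme.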
Listing Available Providers
let slugs: = registry.slugs;
// Returns sorted list: ["anthropic", "kimi", "kimi-code", "minimax", "minimax-code", "openrouter", "requesty", "zai", "zai-code"]
models.dev Integration
Built-in and custom profiles can optionally declare a distinct models.dev provider
identifier for client-side model discovery and caching.
let profile = new
.with_models_dev_id;
assert_eq!;
Provider Trait
All providers implement the Provider trait:
The GenericProvider dispatches to the correct adapter (Responses, Chat Completions, or Anthropic) based on the profile's API family.
Streaming Contract
Streaming is requested by calling infer_stream; non-streaming is requested by
calling infer. There is no stream field on InferenceRequest.
ProviderEvent::Complete has a strict meaning:
- Complete means the stream ended successfully.
- If a provider emits an unrecoverable Error, the stream ends without a later Complete event.
- Usage events, when present, carry cumulative provider-reported token usage snapshots for the current request. If a stream emits multiple usage snapshots, consumers should treat the latest snapshot as superseding earlier snapshots, not as an additive delta.
This contract now holds across OpenAI Responses, Chat Completions, and Anthropic adapters.
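A consumer that follows this contract might look like the sketch below. The `Event` enum is a simplified, hypothetical stand-in for `ProviderEvent` (it omits tool calls, status, and structured errors), and the usage fields are assumed names.

```rust
/// Simplified, hypothetical stand-in for the streamed event type.
#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    Output(String),
    Usage { input_tokens: u64, output_tokens: u64 },
    Complete,
    Error(String),
}

/// What the consumer learned from one stream.
#[derive(Debug, Default, PartialEq)]
pub struct StreamSummary {
    pub text: String,
    /// Latest cumulative snapshot as (input, output) tokens.
    pub latest_usage: Option<(u64, u64)>,
    /// Some(Ok) after Complete, Some(Err) after Error, None if neither arrived.
    pub outcome: Option<Result<(), String>>,
}

pub fn consume(events: impl IntoIterator<Item = Event>) -> StreamSummary {
    let mut summary = StreamSummary::default();
    for event in events {
        match event {
            Event::Output(chunk) => summary.text.push_str(&chunk),
            // Snapshots are cumulative, so the newest one replaces the old one.
            Event::Usage { input_tokens, output_tokens } => {
                summary.latest_usage = Some((input_tokens, output_tokens));
            }
            // Complete is success-only and ends the stream.
            Event::Complete => {
                summary.outcome = Some(Ok(()));
                break;
            }
            // An unrecoverable error ends the stream with no later Complete.
            Event::Error(message) => {
                summary.outcome = Some(Err(message));
                break;
            }
        }
    }
    summary
}
```

Summing usage snapshots instead of replacing them would double-count tokens whenever a provider reports usage more than once per request.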
Profile Semantics
ProviderProfile is authoritative across provider families:
- base_url is used for all families
- auth_strategy is honored for all families, including OpenAiResponses
- default_headers are validated and applied consistently
- invalid profile auth/header configuration fails fast during client construction instead of silently falling back to a default client
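The fail-fast behavior for headers can be illustrated with an up-front validation pass that rejects a bad configuration before any client is built. `validate_headers` is a hypothetical helper; the character rules follow the general HTTP header grammar and are not necessarily the exact checks the crate performs.

```rust
/// Hypothetical up-front check of default headers.
/// Returns an error for the first invalid header instead of skipping it.
pub fn validate_headers(headers: &[(&str, &str)]) -> Result<(), String> {
    for (name, value) in headers {
        // Header names must be non-empty HTTP tokens.
        let name_ok = !name.is_empty()
            && name
                .bytes()
                .all(|b| b.is_ascii_alphanumeric() || b"!#$%&'*+-.^_`|~".contains(&b));
        if !name_ok {
            return Err(format!("invalid header name: {name:?}"));
        }
        // Header values must not contain control characters other than tab.
        let value_ok = value
            .bytes()
            .all(|b| b == b'\t' || (b >= 0x20 && b != 0x7f));
        if !value_ok {
            return Err(format!("invalid value for header {name:?}"));
        }
    }
    Ok(())
}
```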
Key Types
- ProviderProfile: Slug, optional models.dev ID, API family, base URL, auth strategy, headers, purpose, and quirks.
- RuntimeConfig: API key and optional default model for a session.
- InferenceRequest: Normalized request with model, context, tools, and generation config.
- InferenceContext: Separates model-visible Transcript from runtime-only RuntimeRecord values.
- ProviderEvent: Streamed events (Output, ToolCall, ChoiceRequest, Usage, Complete, Error, Status). Usage carries cumulative provider-reported token usage when available. Complete is success-only. Error carries a structured ProviderError with classification (auth, rate-limit, transport, etc.).
- Transcript / Message: Conversation history in provider-agnostic format.
- ToolDefinition / ToolPolicy: Tool schema and usage policy.
- GenerationConfig: Temperature, max tokens, top-p, stop sequences.
Development
Install the task runner and security tooling if you want to use the local invoke
workflow:
Available tasks:
These tasks print a short summary with warnings, failures, and the count of successful steps only.
Configure the repository pre-commit hook after cloning:
The pre-commit hook runs cargo fmt --manifest-path Cargo.toml -- --check when
staged Rust files are present. Before opening or updating a pull request, run
locally the same checks that CI runs:
GitHub Workflow
- Open a GitHub issue before starting work.
- Create a feature branch for the issue and open a pull request against main.
- Pull requests should reference an issue in the title or body when possible, for example Closes #123, but CI does not block PRs solely for missing issue references.
- The Pull Request workflow runs inv build and inv test on every PR to main.
- The Pull Request workflow runs inv security as a non-blocking audit and posts the output as a PR comment.
- Merges to main trigger an automatic patch release that bumps Cargo.toml, creates a vX.Y.Z tag, creates a GitHub release, and publishes the crate to crates.io.
- Coordinated minor and major releases are handled through the Release Manual workflow in GitHub Actions.
Repository configuration still matters:
- crates.io Trusted Publishing is supported by the release workflows through GitHub OIDC plus rust-lang/crates-io-auth-action, so CRATES_IO_TOKEN is not required when Trusted Publishing is configured for this repository and workflow.
- If branch protection blocks workflow pushes to main, add a RELEASE_GITHUB_TOKEN secret for a token that is allowed to push the automated release commit and tag.
- Branch protection should require the Rust Checks status check before merge.
build runs:
- cargo build --manifest-path Cargo.toml --all-targets
- cargo fmt --manifest-path Cargo.toml -- --check
- cargo clippy --manifest-path Cargo.toml --all-targets --all-features -- -D warnings
test runs:
- cargo test --manifest-path Cargo.toml
security runs:
- cargo generate-lockfile --manifest-path Cargo.toml when needed
- cargo audit
Testing
Tests include unit tests for message mapping, tool mapping, and error handling, plus mock HTTP integration tests for all built-in provider slugs using mockito.
The test suite also includes protocol-level unit coverage for:
- SSE framing with split chunks and multi-line data: payloads
- success vs failure stream termination semantics
- multiple tool-call assembly for Chat Completions and Anthropic streaming
- fail-fast invalid profile header handling
- runtime-record non-leakage into provider-visible transcripts
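To show what the SSE framing cases in the first item involve, here is a minimal incremental parser sketch. It is not the crate's parser: `SseParser` is a hypothetical type, it only handles `data:` fields, and it takes `&str` chunks, so it assumes chunk boundaries never fall inside a multi-byte UTF-8 character.

```rust
/// Hypothetical incremental SSE parser that tolerates chunks split mid-line.
#[derive(Debug, Default)]
pub struct SseParser {
    /// Text received so far that is not yet terminated by a newline.
    buffer: String,
    /// `data:` lines collected for the event currently being assembled.
    data_lines: Vec<String>,
}

impl SseParser {
    /// Feeds one chunk and returns the payloads of all events it completed.
    pub fn feed(&mut self, chunk: &str) -> Vec<String> {
        self.buffer.push_str(chunk);
        let mut events = Vec::new();
        while let Some(pos) = self.buffer.find('\n') {
            // Take one full line (including its newline) out of the buffer.
            let line: String = self.buffer.drain(..=pos).collect();
            let line = line.trim_end_matches(|c| c == '\n' || c == '\r');
            if line.is_empty() {
                // A blank line dispatches the event; multi-line data is joined with '\n'.
                if !self.data_lines.is_empty() {
                    events.push(self.data_lines.join("\n"));
                    self.data_lines.clear();
                }
            } else if let Some(rest) = line.strip_prefix("data:") {
                // A single leading space after the colon is not part of the payload.
                self.data_lines
                    .push(rest.strip_prefix(' ').unwrap_or(rest).to_string());
            }
            // Other fields (event:, id:, comments) are ignored in this sketch.
        }
        events
    }
}
```

Feeding `data: line1\nda` followed by `ta: line2\n\n` yields a single event whose payload is the two lines joined by a newline, which is the split-chunk, multi-line case in one.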
Migration Notes
See MIGRATION.md for breaking API changes including:
- removal of InferenceRequest.stream
- replacement of transcript-only request state with InferenceContext
- removal of Message::SystemStructured
Dependency Notes
async-openai was upgraded to 0.36.
reqwest was upgraded to 0.13.
License
This project is licensed under the Apache License 2.0. See LICENSE-APACHE.