agentix
Multi-provider LLM client for Rust — streaming, non-streaming, tool calls, agentic loops, and MCP support.
DeepSeek · OpenAI · Anthropic · Gemini · Kimi · GLM · MiniMax · Grok · OpenRouter — one unified API.
Philosophy: Stream as Agent Structure
An agent is not an object. It is a Stream.
agentix models agents as lazy, composable streams rather than stateful objects or DAG frameworks:
// token-level stream — full control, live progress
let mut stream = agent(tools, request, history);
while let Some(event) = stream.next().await { /* render each event as it arrives */ }

// turn-level stream — one CompleteResponse per LLM turn
let result = agent_turns(tools, request, history)
    .last_content()
    .await;

// multi-agent pipeline — just Rust concurrency
let findings = join_all(agents).await;
Concurrency is join_all. Pipelines are sequential .await. No orchestrator, no DAG, no magic — just streams composed with ordinary Rust.
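For instance, a fan-out/fan-in pipeline needs nothing beyond join_all and sequential .await. A sketch (the argument order for agent(), the Clone bounds, and the prompts are illustrative):

```rust
use futures::{future::join_all, StreamExt};

// fan out: one agent per topic, all running concurrently
let topics = ["latency", "cost", "reliability"];
let researchers = topics.iter().map(|topic| {
    agent(tools.clone(), request.clone().user(format!("Research {topic}.")), Vec::new())
        .collect::<Vec<_>>() // drain each agent's event stream into a Vec<AgentEvent>
});
let findings = join_all(researchers).await;

// fan in: the next stage is just another sequential .await
let report = agent(tools, request.user("Merge the findings into one report."), Vec::new())
    .collect::<Vec<_>>()
    .await;
```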
Quick Start
use agentix::{LlmEvent, Request};
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let api_key = std::env::var("DEEPSEEK_API_KEY")?;
    let http = reqwest::Client::new();

    let mut stream = Request::deepseek(&api_key)
        .user("Write a haiku about Rust.")
        .stream(&http)
        .await?;

    while let Some(event) = stream.next().await {
        match event {
            LlmEvent::Token(token) => print!("{token}"),
            LlmEvent::Error(err) => eprintln!("error: {err}"),
            _ => {} // LlmEvent is #[non_exhaustive]
        }
    }
    Ok(())
}
vs. other frameworks
| | agentix | rig | llm-chain | LangGraph |
|---|---|---|---|---|
| Language | Rust | Rust | Rust | Python |
| Agentic loop | ✅ agent() | manual | manual | ✅ graph nodes |
| Multi-agent pipeline | ✅ join_all + streams | manual | manual | ✅ graph edges |
| Streaming tokens | ✅ | ✅ | ❌ | ✅ |
| Streaming tool calls | ✅ | ❌ | ❌ | ❌ |
| MCP support | ✅ | ❌ | ❌ | ✅ (partial) |
| Proc-macro tools | ✅ #[tool] | ✅ #[rig_tool] | ❌ | ❌ |
| Concurrent tool execution | ✅ | ❌ | ❌ | ✅ |
| Provider support | 9 | 10+ | 4 | 30+ |
| Agent abstraction | Stream | Object | Chain | DAG |
vs LangGraph: LangGraph models agents as DAGs with explicit nodes and edges. agentix models them as Streams — no graph definition, no state schema, no framework lock-in. Multi-agent pipelines are just join_all and sequential .await.
vs rig's #[rig_tool]: rig requires one annotated function per tool, with descriptions passed as attribute arguments and return type fixed to Result<T, ToolError>. agentix uses doc comments for descriptions, accepts any return type, and lets you group related tools in a single impl block with shared state:
// rig: one #[rig_tool] per function
// agentix: one #[tool] for the whole impl block, descriptions from doc comments
struct MathTools { precision: u32 } // shared state across all methods

#[tool]
impl MathTools {
    /// Round x to the configured precision.
    /// x: input value
    async fn round(&self, x: f64) -> String { format!("{:.1$}", x, self.precision as usize) }
}

// standalone fn also works — doc comment = description
/// Square root of x.
/// x: input value
#[tool]
async fn sqrt(x: f64) -> f64 { x.sqrt() }

let bundle = sqrt + MathTools { precision: 2 }; // compose with +
Installation
[dependencies]
agentix = "0.18.2"
# Optional: Model Context Protocol (MCP) tool support
# agentix = { version = "0.18.2", features = ["mcp"] }
# Optional: drive `claude -p` as the agentic loop using a Claude Max OAuth session
# agentix = { version = "0.18.2", features = ["claude-code"] }
Logging Full Request / Response Bodies
Full request bodies, response bodies, streaming chunks, and MCP raw request bodies are treated as sensitive and are disabled by default.
To enable them, you must opt in at both compile time and runtime:
- Compile-time gate: the `sensitive-logs` cargo feature
- Runtime gate: the `AGENTIX_LOG_BODIES=1` environment variable
If either one is missing, agentix will not print full bodies.
This affects:
- outbound HTTP request bodies
- non-streaming HTTP response bodies
- raw SSE streaming chunks
- MCP raw HTTP request bodies
Providers
Nine built-in providers, all using the same API:
use agentix::Request;

// Shortcut constructors (provider + default model in one call)
let req = Request::deepseek(&api_key);
let req = Request::openai(&api_key);
let req = Request::anthropic(&api_key);
let req = Request::gemini(&api_key);
let req = Request::kimi(&api_key);       // Moonshot AI — kimi-k2.5
let req = Request::glm(&api_key);        // Zhipu AI — glm-5
let req = Request::minimax(&api_key);    // MiniMax — MiniMax-M2.7 (Anthropic API)
let req = Request::grok(&api_key);
let req = Request::openrouter(&api_key); // OpenRouter with prompt caching support

// For OpenAI-compatible endpoints (Azure, vLLM, LocalAI, Ollama, etc.),
// route through `Provider::OpenRouter` with a custom base URL —
// `Provider::OpenAI` exclusively targets `/v1/responses` and won't work
// against servers that only speak Chat Completions.
let req = Request::openrouter(&api_key)
    .base_url("http://localhost:11434/v1") // e.g. a local Ollama endpoint
    .model("llama3.1");
OpenAI specifically uses the Responses API — reasoning summaries surface via LlmEvent::Reasoning, encrypted_content round-trips automatically for multi-turn tool loops, and UsageStats.reasoning_tokens tracks hidden-reasoning cost. Non-official OpenAI endpoints should use Provider::OpenRouter.
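For example, reasoning summaries and hidden-reasoning cost can be read straight off the event stream (a sketch; key handling and the exact UsageStats field shape are assumptions):

```rust
let mut stream = Request::openai(&api_key)
    .reasoning_effort(ReasoningEffort::High)
    .user("Plan a three-step refactor of this module.")
    .stream(&http)
    .await?;

while let Some(event) = stream.next().await {
    match event {
        LlmEvent::Reasoning(summary) => eprintln!("[reasoning] {summary}"),
        LlmEvent::Token(t) => print!("{t}"),
        LlmEvent::Usage(usage) => eprintln!("reasoning tokens: {:?}", usage.reasoning_tokens),
        _ => {}
    }
}
```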
Request API
Request is a self-contained value type — it carries provider, credentials, model,
messages, tools, and tuning. Call stream() or complete() with a shared reqwest::Client.
stream() — streaming completion
let http = reqwest::Client::new();
let mut stream = Request::new(Provider::DeepSeek, &api_key)
    .system_prompt("You are a concise assistant.")
    .user("Explain Rust lifetimes in one paragraph.")
    .stream(&http)
    .await?;

while let Some(event) = stream.next().await {
    if let LlmEvent::Token(token) = event {
        print!("{token}");
    }
}
complete() — non-streaming completion
let resp = Request::new(Provider::DeepSeek, &api_key)
    .user("What is the capital of France?")
    .complete(&http)
    .await?;

println!("{}", resp.content);      // assistant text
println!("{:?}", resp.reasoning);  // reasoning trace, if any
println!("{:?}", resp.tool_calls); // tool calls, if any
println!("{:?}", resp.usage);      // UsageStats for the request
Builder methods
let req = Request::new(Provider::DeepSeek, &api_key)
    .model("deepseek-reasoner")
    .base_url("https://api.deepseek.com")
    .system_prompt("You are a helpful assistant.")
    .max_tokens(4096)
    .temperature(0.7)
    .reasoning_effort(ReasoningEffort::High)
    .retries(3, 500)    // max retries, initial delay ms
    .user("Hello!")     // convenience for adding a user message
    .message(msg)       // add any Message variant
    .messages(history)  // set full history
    .tools(tools);      // set tool definitions
Reasoning control (ReasoningEffort)
A single cross-provider dial for "how much should the model think". Providers that expose a thinking toggle and/or effort level map this to their own wire format; providers that don't, ignore it.
use agentix::{ReasoningEffort, Request};

let req = Request::deepseek(&api_key)
    .reasoning_effort(ReasoningEffort::Max) // maximum effort
    .user("Prove that there are infinitely many primes.");
| Variant | DeepSeek | Anthropic (Claude 4.6+) | OpenAI (Responses API) | Gemini 3+ | Gemini 2.5 | OpenRouter | Others (Grok/Kimi/GLM) |
|---|---|---|---|---|---|---|---|
| None | thinking: disabled (sampling valid) | thinking: disabled | omit reasoning (no toggle) | thinkingLevel: minimal* | thinkingBudget: 0 | reasoning.effort: none | ignored |
| Minimal | thinking: enabled, effort high | adaptive, effort low | reasoning.effort: minimal | thinkingLevel: minimal | thinkingBudget: 512 | reasoning.effort: minimal | ignored |
| Low | thinking: enabled, effort high | adaptive, effort low | reasoning.effort: low | thinkingLevel: low | thinkingBudget: 1024 | reasoning.effort: low | ignored |
| Medium | thinking: enabled, effort high | adaptive, effort medium | reasoning.effort: medium | thinkingLevel: medium | thinkingBudget: 4096 | reasoning.effort: medium | ignored |
| High | thinking: enabled, effort high | adaptive, effort high | reasoning.effort: high | thinkingLevel: high | thinkingBudget: 8192 | reasoning.effort: high | ignored |
| XHigh | thinking: enabled, effort max | adaptive, effort xhigh | reasoning.effort: xhigh | thinkingLevel: high | thinkingBudget: 16384 | reasoning.effort: xhigh | ignored |
| Max | thinking: enabled, effort max | adaptive, effort max | reasoning.effort: high† | thinkingLevel: high | thinkingBudget: 24576 | reasoning.effort: max | ignored |
| unset | omit (default: thinking on) | omit (default: thinking off) | omit | omit | omit | omit | no field |
* Gemini 3 Pro can't fully disable thinking; None collapses to the floor (minimal).
† OpenAI has no max variant; collapses to high. xhigh only works on gpt-5.1-codex-max.
Notes:
- `None` vs unset matter. `None` emits an explicit disable toggle where the provider supports one and keeps sampling params valid. Leaving it unset accepts the provider's own default — which for DeepSeek is thinking on and for most others is thinking off (see the sketch below).
- DeepSeek forbids sampling params in thinking mode; setting `.temperature()` while thinking is on drops temperature before the wire with a `tracing::warn!`. Use `.reasoning_effort(ReasoningEffort::None)` to re-enable sampling.
- Round-trip for thinking + tool use is automatic on Anthropic (thinking blocks + signatures), OpenAI (`encrypted_content` reasoning items), Gemini (`thoughtSignature` parts), and OpenRouter (typed `reasoning_details[]` entries). On each of these the full opaque state is captured into `Message::Assistant.provider_data` and re-emitted verbatim on the next turn, preserving the interleaved ordering that those APIs validate against (Anthropic's signature check, OpenAI's `'function_call' was provided without its required 'reasoning' item` rule, Gemini 3's 400 on missing `thoughtSignature`).
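A minimal sketch of the None-vs-unset distinction (constructor and key handling as in the examples above):

```rust
// explicit off: disable thinking and keep sampling params valid
let fast = Request::deepseek(&api_key)
    .reasoning_effort(ReasoningEffort::None)
    .temperature(0.9)
    .user("Give me a one-line answer.");

// unset: accept the provider default (DeepSeek: thinking on; temperature would be dropped)
let deliberate = Request::deepseek(&api_key)
    .user("Work through this step by step.");
```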
See examples/11_reasoning.rs for a live comparison of the four states.
LlmEvent (what you receive from stream())
LlmEvent is #[non_exhaustive]; always include a wildcard _ => {} arm to stay forward-compatible.
- `Token(String)` — incremental response text
- `Reasoning(String)` — thinking/reasoning trace (DeepSeek `reasoning_content`, Claude thinking blocks, OpenAI reasoning summary, Gemini thought parts, OpenRouter `reasoning` / `reasoning.text` entries)
- `ToolCallChunk(ToolCallChunk)` — partial tool call for real-time UI
- `ToolCall(ToolCall)` — completed tool call
- `AssistantState(serde_json::Value)` — opaque per-turn provider state. Emitted by Anthropic (thinking blocks + signatures), OpenAI (encrypted reasoning items), Gemini (`thoughtSignature` parts), and OpenRouter (typed `reasoning_details[]`). The agent loop attaches it to `Message::Assistant.provider_data` for round-trip; most user code can ignore it.
- `Usage(UsageStats)` — token usage for the turn (includes `reasoning_tokens` where the provider reports it)
- `Done` — stream ended
- `Error(String)` — provider error
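A typical consumer matches only the events it needs and keeps the wildcard arm (a sketch; tool-call and usage payloads are printed via Debug here):

```rust
while let Some(event) = stream.next().await {
    match event {
        LlmEvent::Token(t) => print!("{t}"),
        LlmEvent::Reasoning(r) => eprintln!("[thinking] {r}"),
        LlmEvent::ToolCall(call) => println!("tool call: {call:?}"),
        LlmEvent::Usage(usage) => eprintln!("usage: {usage:?}"),
        LlmEvent::Error(e) => eprintln!("provider error: {e}"),
        LlmEvent::Done => break,
        _ => {} // ToolCallChunk, AssistantState, future variants
    }
}
```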
Defining Tools
Two styles are supported: standalone function (simpler) and impl block (multiple tools in one struct).
Standalone function
use agentix::tool;

/// Add two numbers.
/// a: first number
/// b: second number
#[tool]
async fn add(a: f64, b: f64) -> f64 { a + b }

/// Divide a by b.
#[tool]
async fn divide(a: f64, b: f64) -> Result<f64, String> {
    if b == 0.0 { Err("division by zero".into()) } else { Ok(a / b) }
}

// Combine with + operator
let tools = add + divide;
let mut stream = agent(tools, request, history);
The macro generates a unit struct with the same name as the function and implements Tool for it.
Impl block (multiple methods per struct)
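A sketch of the impl-block style (struct fields and method bodies are illustrative):

```rust
use agentix::tool;

struct Calculator {
    precision: u32, // shared state available to every tool method
}

#[tool]
impl Calculator {
    /// Divide a by b, rounded to the configured precision.
    /// a: dividend
    /// b: divisor
    async fn divide(&self, a: f64, b: f64) -> Result<f64, String> {
        if b == 0.0 {
            return Err("division by zero".to_string()); // surfaced to the LLM as {"error": "..."}
        }
        let scale = 10f64.powi(self.precision as i32);
        Ok((a / b * scale).round() / scale)
    }

    /// Multiply a by b.
    async fn multiply(&self, a: f64, b: f64) -> f64 { a * b }
}
```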
- Doc comment → tool description
- `/// param: description` lines → argument descriptions
- `Result::Err` automatically propagates as `{"error": "..."}` to the LLM
Streaming tools
Add #[streaming] to yield ToolOutput::Progress / ToolOutput::Result incrementally:
Normal and streaming methods can be freely mixed in the same #[tool] block.
MCP Tools
Use external processes as tools via the Model Context Protocol:
use agentix::{McpTool, ToolBundle};
use std::time::Duration;

// Spawn an MCP server over stdio and expose its tools (the server command is illustrative)
let tool = McpTool::stdio("npx", ["-y", "@modelcontextprotocol/server-filesystem", "."]).await?
    .with_timeout(Duration::from_secs(30));

// Add to a ToolBundle alongside regular tools
let mut bundle = ToolBundle::new();
bundle.push(tool);
Runtime add / remove
let mut bundle = ToolBundle::default();
bundle += Calculator; // AddAssign — add tool in-place
bundle -= Calculator; // SubAssign — remove all functions Calculator provides
let bundle2 = bundle + Calculator - Calculator; // Sub — returns new bundle
Structured Output
Constrain the model to emit JSON matching a Rust struct using Request::json_schema().
Derive schemars::JsonSchema on your struct and pass the generated schema:
use schemars::{schema_for, JsonSchema};
use serde::Deserialize;

#[derive(Deserialize, JsonSchema)]
struct Review {
    rating: u8,
    summary: String,
}

let schema = serde_json::to_value(schema_for!(Review))?;

let response = Request::openai(&api_key)
    .system_prompt("You are a strict film critic.")
    .user("Review the movie Inception.")
    .json_schema(schema, true) // strict=true enforces the schema
    .complete(&http)
    .await?;

let review: Review = response.json()?;
See examples/08_structured_output.rs for a runnable example.
Provider support:
- OpenAI — full `json_schema` support (gpt-4o and later)
- Gemini — `responseSchema` + `responseMimeType: application/json` (fully supported)
- DeepSeek — `json_object` only; `json_schema` is automatically degraded with a `tracing::warn`
- Anthropic — `response_format` is ignored; use prompt engineering instead
Reliability
- Automatic retries — exponential backoff for 429 / 5xx responses
- Usage tracking — per-request token accounting across all providers; `AgentEvent::Done` contains cumulative totals across all turns (see the snippet below)
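Both knobs surface through the existing API; a short sketch (retry counts and prompts are illustrative):

```rust
// retry 429 / 5xx up to 3 times, starting from a 500 ms backoff
let request = Request::deepseek(&api_key)
    .retries(3, 500)
    .user("Summarize the design.");

// cumulative usage across all turns arrives in the final Done event
let mut events = agent(tools, request, Vec::new());
while let Some(event) = events.next().await {
    if let AgentEvent::Done(total) = event {
        println!("total usage: {total:?}");
    }
}
```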
Agent (agentic loop)
agentix::agent() drives the full LLM ↔ tool-call loop and yields typed AgentEvents.
Pass it a ToolBundle, a base Request, and an initial history — it handles
repeated LLM calls, tool execution, and history accumulation automatically.
use agentix::{agent, AgentEvent, Request, ToolBundle};
use futures::StreamExt;

// ToolBundle, base Request, initial history: agent() drives the rest
let tools = ToolBundle::default() + add + divide;
let request = Request::deepseek(&api_key).user("What is 9.9 / 3.3? Use the divide tool.");
let mut events = agent(tools, request, Vec::new());

while let Some(event) = events.next().await {
    match event {
        AgentEvent::Token(t) => print!("{t}"),
        AgentEvent::ToolResult { name, content, .. } => println!("\n[{name}] → {content}"),
        AgentEvent::Done(usage) => println!("\n{usage:?}"),
        _ => {}
    }
}
AgentEvent variants
- `Token(String)` — incremental response text
- `Reasoning(String)` — thinking trace
- `ToolCallChunk(ToolCallChunk)` — streaming partial tool call
- `ToolCallStart(ToolCall)` — complete tool call, about to execute
- `ToolProgress { id, name, progress }` — intermediate tool output
- `ToolResult { id, name, content }` — final tool result
- `Usage(UsageStats)` — token usage per LLM request
- `Done(UsageStats)` — emitted once when the loop finishes normally; contains cumulative totals across all turns
- `Warning(String)` — recoverable stream error
- `Error(String)` — fatal error
agentix::agent() returns a BoxStream<'static, AgentEvent> — drop it to abort.
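Because the loop is just a stream, cancellation is ordinary Rust: drop the stream and the loop stops. A sketch bounding a run with a timeout (the duration is illustrative):

```rust
use std::time::Duration;

let events = agent(tools, request, Vec::new());
let run = async move {
    let mut events = events; // owned by this future; dropping the future drops the stream
    while let Some(event) = events.next().await {
        if let AgentEvent::Token(t) = event { print!("{t}"); }
    }
};

// if the timeout fires first, `run` (and the agent stream inside it) is dropped, aborting the loop
if tokio::time::timeout(Duration::from_secs(60), run).await.is_err() {
    eprintln!("agent run timed out");
}
```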
Claude Code (Max OAuth)
Provider::ClaudeCode is a regular provider backed by claude -p, so you can
ride an existing Claude Max subscription instead of paying per-token via
ANTHROPIC_API_KEY. It plugs into agent() like any other provider — agentix
owns the loop, tool calls dispatch locally through the Tool trait, and the
loopback MCP server only surfaces tool schemas. Auth comes from the CLI's
OAuth session in the OS keychain.
Requires the claude-code feature and the claude CLI installed + logged in.
agentix = { version = "0.18.2", features = ["claude-code"] }
use agentix::{agent, AgentEvent, Provider, Request, ToolBundle};
use futures::StreamExt;

// No API key needed — auth comes from the claude CLI's OAuth session in the OS keychain
let request = Request::new(Provider::ClaudeCode, "")
    .user("Summarize this repository.");

let mut events = agent(ToolBundle::default(), request, Vec::new());
while let Some(event) = events.next().await {
    if let AgentEvent::Token(t) = event { print!("{t}"); }
}
Each turn spawns a fresh claude -p, replays prior history via --resume,
and kills the subprocess once the first assistant turn lands — so the agent
loop keeps full control over tool dispatch and multi-turn state.
See examples/10_claude_code.rs for a runnable example.
License
MIT OR Apache-2.0