# tiy-core

Unified LLM API and stateful Agent runtime in Rust
tiy-core is a Rust library that provides a single, provider-agnostic interface for streaming LLM completions and running agentic tool-use loops. Write your application logic once, then swap between OpenAI, Anthropic, Google, Ollama, and 8+ other providers by changing a config value.
## Highlights

- **One interface, many providers** — 5 protocol-level implementations (OpenAI Completions, OpenAI Responses, Anthropic Messages, Google Generative AI / Vertex AI, Ollama) and 9 delegation providers (OpenAI-Compatible, xAI, Groq, OpenRouter, DeepSeek, MiniMax, Kimi Coding, ZAI, Zenmux) behind a single `LLMProtocol` trait.
- **Streaming-first** — `EventStream<T, R>` backed by `parking_lot::Mutex<VecDeque>` implements `futures::Stream`. Every provider returns an `AssistantMessageEventStream` with fine-grained deltas: text, thinking, tool call arguments, and completion events.
- **Tool / Function calling** — Define tools via JSON Schema, validate arguments with the `jsonschema` crate, and execute tools in parallel or sequentially within the agent loop.
- **Stateful Agent runtime** — `Agent` manages a full conversation loop: stream LLM → detect tool calls → execute tools → re-prompt → repeat. Supports steering (interrupt mid-turn), follow-up queues, event subscription (observer pattern), abort, and configurable max turns (default 25).
- **Extended Thinking** — Provider-specific thinking/reasoning support with a unified `ThinkingLevel` enum (Off → XHigh). Cross-provider thinking block conversion is handled automatically during message transformation.
- **Thread-safe by default** — All mutable state uses `parking_lot` locks and `AtomicBool` for non-poisoning concurrency.
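The queue-draining pattern behind `EventStream` can be sketched with a simplified, std-only analogue. The real type uses `parking_lot` locks and implements `futures::Stream`; the `EventQueue` type below is purely illustrative and not part of the crate's API:

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};

// Simplified stand-in for EventStream's internal buffer: producers push
// events onto a shared queue, the consumer drains them in FIFO order.
#[derive(Clone)]
struct EventQueue<T> {
    inner: Arc<Mutex<VecDeque<T>>>,
}

impl<T> EventQueue<T> {
    fn new() -> Self {
        Self { inner: Arc::new(Mutex::new(VecDeque::new())) }
    }

    fn push(&self, event: T) {
        self.inner.lock().unwrap().push_back(event);
    }

    // In the real crate this role is played by Stream::poll_next;
    // here it is reduced to a plain non-blocking pop.
    fn next(&self) -> Option<T> {
        self.inner.lock().unwrap().pop_front()
    }
}

fn main() {
    let q = EventQueue::new();
    q.push("text-delta");
    q.push("done");
    assert_eq!(q.next(), Some("text-delta"));
    assert_eq!(q.next(), Some("done"));
    assert_eq!(q.next(), None);
}
```

Because the queue is behind an `Arc`, producer (the provider task) and consumer (your application) can hold clones of the same handle across threads.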
## Architecture

```mermaid
graph TD
    A[Your Application] --> B[Agent]
    A --> C[LLMProtocol trait]
    B --> C
    C --> D[Protocol Providers]
    C --> E[Delegation Providers]
    D --> D1[OpenAI Completions]
    D --> D2[OpenAI Responses]
    D --> D3[Anthropic Messages]
    D --> D4[Google GenAI / Vertex]
    D --> D5[Ollama]
    E --> E1[OpenAI-Compatible → OpenAI Completions]
    E --> E2[xAI → OpenAI Completions]
    E --> E3[Groq → OpenAI Completions]
    E --> E4[OpenRouter → OpenAI Completions]
    E --> E5[ZAI → OpenAI Completions]
    E --> E6[DeepSeek → OpenAI Completions]
    E --> E7[MiniMax → Anthropic]
    E --> E8[Kimi Coding → Anthropic]
    E --> E9[Zenmux → adaptive routing]
```
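The split between protocol providers and delegation providers can be sketched as follows. Everything here is illustrative: the real `LLMProtocol` trait is async and streams events, and none of these type or method names are the crate's actual signatures:

```rust
// Illustrative sketch of the protocol/delegation split, reduced to a
// synchronous method so only the routing shape is visible.
trait LlmProtocolSketch {
    fn complete(&self, prompt: &str) -> String;
}

// A protocol-level implementation owns the wire format.
struct OpenAiCompletionsSketch;

impl LlmProtocolSketch for OpenAiCompletionsSketch {
    fn complete(&self, prompt: &str) -> String {
        format!("openai-completions:{prompt}")
    }
}

// A delegation provider only swaps credentials / base URL, then forwards
// to an existing protocol implementation.
struct GroqSketch {
    inner: OpenAiCompletionsSketch,
}

impl LlmProtocolSketch for GroqSketch {
    fn complete(&self, prompt: &str) -> String {
        // Real code would apply GROQ_API_KEY and the Groq base URL here.
        self.inner.complete(prompt)
    }
}

fn main() {
    let groq = GroqSketch { inner: OpenAiCompletionsSketch };
    assert_eq!(groq.complete("hi"), "openai-completions:hi");
}
```

This is why adding a new OpenAI-compatible vendor is cheap: only the configuration layer changes, while the wire format stays in one place.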
### Core Layers

| Layer | Path | Purpose |
|---|---|---|
| Types | `src/types/` | Provider-agnostic data model: `Message`, `ContentBlock`, `Model`, `Tool`, `Context`, `SecurityConfig` |
| Protocol | `src/protocol/` | Wire-format implementations (full docs) |
| Provider | `src/provider/` | Service vendor facades (full docs) |
| Stream | `src/stream/` | Generic `EventStream<T, R>` implementing `futures::Stream` |
| Agent | `src/agent/` | Stateful conversation manager with tool execution loop (full docs) |
| Transform | `src/transform/` | Cross-provider message transformation (thinking blocks, tool call IDs, orphan resolution) |
| Thinking | `src/thinking/` | `ThinkingLevel` enum and provider-specific thinking options |
| Validation | `src/validation/` | JSON Schema validation for tool parameters |
| Models | `src/models/` | `ModelRegistry` with predefined models (GPT-4o, Claude Sonnet 4, Gemini 2.5 Flash, etc.) |
| Catalog | `src/catalog/` | Native model listing, snapshot refresh, and optional metadata enrichment for display (full docs) |
## Quick Start

Add the dependency to your `Cargo.toml`:

```toml
[dependencies]
tiy-core = "0.1.0"
tokio = { version = "1", features = ["full"] }
futures = "0.3"
```

For local development before publishing, you can still use:

```toml
[dependencies]
tiy-core = { path = "../tiy-core" }
```
### Streaming Completion

A minimal sketch; the lookup and `stream` calls below are illustrative assumptions based on the types this README describes, not verified signatures (see the protocol docs for the real API):

```rust
use futures::StreamExt;
use tiy_core::{Context, Message, StreamOptions};

#[tokio::main]
async fn main() {
    // Hypothetical model lookup and streaming call.
    let model = tiy_core::models::registry().get("gpt-4o").unwrap();
    let context = Context::new().with_message(Message::user("Hello!"));
    let mut stream = model.stream(&context, StreamOptions::default()).await.unwrap();
    while let Some(event) = stream.next().await {
        // Text, thinking, and tool-call deltas arrive as AssistantMessageEvent values.
        println!("{event:?}");
    }
}
```
### Agent with Tool Calling

A minimal sketch; the `Agent` constructor and method names shown are illustrative assumptions, not the crate's verified signatures:

```rust
use tiy_core::agent::{Agent, AgentConfig};

#[tokio::main]
async fn main() {
    // Hypothetical setup: register a JSON Schema-backed tool, then prompt.
    let mut agent = Agent::new(AgentConfig::default());
    // agent.add_tool(weather_tool); // an AgentTool with a schema + executor
    let reply = agent.prompt("What's the weather in Paris?").await;
    println!("{reply:?}");
}
```

The `Agent` also supports hooks (`beforeToolCall` / `afterToolCall` / `onPayload`), a context pipeline (`transformContext` / `convertToLlm`), event subscription, steering & follow-up queues, thinking budgets, custom messages, and more. See the full Agent Module Documentation for details.
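The hook surface can be approximated with a trait of default no-op methods, so implementors override only what they need. The names below mirror the README's `beforeToolCall` / `afterToolCall`; the actual `AgentHooks` signatures live in `src/agent/types.rs` and may differ:

```rust
// Illustrative hook trait: every method has a default, so an implementor
// opts in to exactly the interception points it cares about.
trait AgentHooksSketch {
    // Return false to veto the tool call before it executes.
    fn before_tool_call(&self, _tool: &str) -> bool {
        true
    }

    fn after_tool_call(&self, _tool: &str, _result: &str) {}
}

// Example policy: block a hypothetical "shell" tool, allow everything else.
struct DenyShell;

impl AgentHooksSketch for DenyShell {
    fn before_tool_call(&self, tool: &str) -> bool {
        tool != "shell"
    }
}

fn main() {
    let hooks = DenyShell;
    assert!(hooks.before_tool_call("weather"));
    assert!(!hooks.before_tool_call("shell"));
}
```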
## Supported Providers

| Provider | Type | Env Var |
|---|---|---|
| OpenAI | Direct | `OPENAI_API_KEY` |
| Anthropic | Direct | `ANTHROPIC_API_KEY` |
| Google | Direct | `GOOGLE_API_KEY` |
| Ollama | Direct | — |
| OpenAI-Compatible | Delegation → OpenAI Completions | `OPENAI_API_KEY` |
| xAI | Delegation → OpenAI Completions | `XAI_API_KEY` |
| Groq | Delegation → OpenAI Completions | `GROQ_API_KEY` |
| OpenRouter | Delegation → OpenAI Completions | `OPENROUTER_API_KEY` |
| ZAI | Delegation → OpenAI Completions | `ZAI_API_KEY` |
| DeepSeek | Delegation → OpenAI Completions | `DEEPSEEK_API_KEY` |
| MiniMax | Delegation → Anthropic | `MINIMAX_API_KEY` |
| Kimi Coding | Delegation → Anthropic | `KIMI_API_KEY` |
| Zenmux | Adaptive multi-protocol | `ZENMUX_API_KEY` |
For detailed provider configuration, compat flags, Zenmux adaptive routing, and how to add new providers, see the Provider Documentation.
For wire-format protocol internals (SSE parsing, request building, delegation macros), see the Protocol Documentation.
## Publishing

To make downstream projects depend on tiy-core without a Git URL, publish the crate to crates.io and then depend on it by version:

```sh
cargo publish
```

After publishing, consumers can keep using:

```toml
[dependencies]
tiy-core = "0.1.0"
```
## API Key Resolution

Keys are resolved in priority order:

1. `StreamOptions.api_key` (per-request override)
2. Provider's `default_api_key()` method
3. Environment variable (e.g. `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`)

Base URLs follow the same pattern: `StreamOptions.base_url` > `model.base_url` > provider's `DEFAULT_BASE_URL`.
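The priority chain maps naturally onto `Option::or_else`. A sketch under assumptions: the function name and the injected `env_lookup` closure are illustrative (real code would call `std::env::var`):

```rust
// Resolve an API key the way the README describes: per-request override,
// then a provider-supplied default, then an environment lookup.
fn resolve_api_key(
    request_key: Option<&str>,
    provider_default: Option<&str>,
    env_lookup: impl Fn() -> Option<String>,
) -> Option<String> {
    request_key
        .map(String::from)
        .or_else(|| provider_default.map(String::from))
        .or_else(env_lookup)
}

fn main() {
    let env = || Some("from-env".to_string());

    // A per-request override wins over everything else.
    let key = resolve_api_key(Some("override"), None, &env);
    assert_eq!(key.as_deref(), Some("override"));

    // With no override or provider default, the environment is consulted.
    let key = resolve_api_key(None, None, &env);
    assert_eq!(key.as_deref(), Some("from-env"));
}
```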
## Security Configuration

tiy-core ships with a centralized `SecurityConfig` struct that controls all security limits and policies. Every field has a safe default value — you only need to override what you want to change.
### Enabling Security Config

In code (programmatic); the arguments to the `with_http` / `with_agent` builders are illustrative:

```rust
use tiy_core::{AgentLimits, HttpLimits, SecurityConfig, StreamOptions};

// Method 1: Use defaults (zero-config)
let options = StreamOptions::default();
// options.security is None → all defaults apply automatically

// Method 2: Override specific values (argument shapes shown are assumptions)
let security = SecurityConfig::default()
    .with_http(HttpLimits { connect_timeout_secs: 10, ..Default::default() })
    .with_agent(AgentLimits { max_parallel_tool_calls: 4, ..Default::default() });
let options = StreamOptions { security: Some(security), ..Default::default() };
```
From a JSON config file:

```rust
use std::fs::read_to_string;
use tiy_core::SecurityConfig;

// Load from file — only specified fields are overridden, rest use defaults
let json = read_to_string("security.json").unwrap();
let security: SecurityConfig = serde_json::from_str(&json).unwrap();
```

From a TOML config file (requires the `toml` crate):

```rust
// File name is illustrative
let toml_str = read_to_string("security.toml").unwrap();
let security: SecurityConfig = toml::from_str(&toml_str).unwrap();
```
### JSON Configuration Reference

A full `security.json` with all fields and their defaults (comments are shown for readability; strict JSON parsers require them to be stripped):

```jsonc
{
  // HTTP client and SSE stream parsing limits (applied per provider request)
  "http": {
    "connect_timeout_secs": 30,           // TCP connect timeout
    "request_timeout_secs": 1800,         // Total request timeout including streaming (30 min)
    "max_sse_line_buffer_bytes": 2097152, // SSE line buffer cap, prevents OOM (2 MiB)
    "max_error_body_bytes": 65536,        // Max error response body to read (64 KiB)
    "max_error_message_chars": 4096       // Max error message length stored in events
  },
  // Agent runtime limits
  "agent": {
    "max_messages": 1000,                 // Conversation history cap (0 = unlimited, FIFO eviction)
    "max_parallel_tool_calls": 16,        // Concurrent tool execution limit
    "tool_execution_timeout_secs": 120,   // Per-tool execution timeout (2 min)
    "validate_tool_calls": true,          // Validate tool args against JSON Schema before execution
    "max_subscriber_slots": 128           // Max event subscriber slots
  },
  // EventStream infrastructure limits
  "stream": {
    "max_event_queue_size": 10000,        // Event buffer cap (0 = unlimited)
    "result_timeout_secs": 600            // EventStream::result() blocking timeout (10 min)
  },
  // Header security policy — prevents custom headers from overriding auth headers
  "headers": {
    "protected_headers": [
      "authorization",
      "x-api-key",
      "x-goog-api-key",
      "anthropic-version",
      "anthropic-beta"
    ]
  },
  // Base URL validation policy (SSRF protection)
  "url": {
    "require_https": true,                // Enforce HTTPS (localhost/127.0.0.1 exempted)
    "block_private_ips": false,           // Block private/loopback IPs (off for local dev)
    "allowed_schemes": ["https", "http"]  // Allowed URL schemes
  }
}
```

**Partial overrides:** You only need to include the fields you want to change. Omitted fields and entire sections fall back to their defaults. For example, `{}` gives you all defaults, and `{"http": {"connect_timeout_secs": 10}}` only changes the connect timeout.
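The protected-headers policy described above amounts to a case-insensitive filter over user-supplied headers. A sketch (the function name is illustrative; the real enforcement lives in the protocol layer):

```rust
// Drop any custom header whose lowercased name appears in the protected
// list, so callers cannot override auth headers like `authorization`.
fn filter_headers(
    custom: Vec<(String, String)>,
    protected: &[&str],
) -> Vec<(String, String)> {
    custom
        .into_iter()
        .filter(|(name, _)| !protected.contains(&name.to_ascii_lowercase().as_str()))
        .collect()
}

fn main() {
    let protected = ["authorization", "x-api-key"];
    let custom = vec![
        ("Authorization".to_string(), "Bearer fake".to_string()), // blocked
        ("x-trace-id".to_string(), "abc".to_string()),            // kept
    ];
    let kept = filter_headers(custom, &protected);
    assert_eq!(kept.len(), 1);
    assert_eq!(kept[0].0, "x-trace-id");
}
```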
### TOML Configuration Reference

The same config in TOML format:

```toml
[http]
connect_timeout_secs = 30
request_timeout_secs = 1800
max_sse_line_buffer_bytes = 2097152
max_error_body_bytes = 65536
max_error_message_chars = 4096

[agent]
max_messages = 1000
max_parallel_tool_calls = 16
tool_execution_timeout_secs = 120
validate_tool_calls = true
max_subscriber_slots = 128

[stream]
max_event_queue_size = 10000
result_timeout_secs = 600

[headers]
protected_headers = [
    "authorization",
    "x-api-key",
    "x-goog-api-key",
    "anthropic-version",
    "anthropic-beta",
]

[url]
require_https = true
block_private_ips = false
allowed_schemes = ["https", "http"]
```
### Default Values Quick Reference

| Section | Field | Default | Description |
|---|---|---|---|
| `http` | `connect_timeout_secs` | 30 | TCP connect timeout |
| | `request_timeout_secs` | 1800 | Total request timeout (30 min) |
| | `max_sse_line_buffer_bytes` | 2097152 | SSE buffer cap (2 MiB) |
| | `max_error_body_bytes` | 65536 | Error body read cap (64 KiB) |
| | `max_error_message_chars` | 4096 | Error message truncation |
| `agent` | `max_messages` | 1000 | History cap (0 = unlimited) |
| | `max_parallel_tool_calls` | 16 | Parallel tool exec limit |
| | `tool_execution_timeout_secs` | 120 | Per-tool timeout (2 min) |
| | `validate_tool_calls` | true | JSON Schema validation |
| | `max_subscriber_slots` | 128 | Subscriber slots |
| `stream` | `max_event_queue_size` | 10000 | Event queue cap (0 = unlimited) |
| | `result_timeout_secs` | 600 | Result blocking timeout (10 min) |
| `headers` | `protected_headers` | `["authorization", ...]` | Cannot be overridden |
| `url` | `require_https` | true | HTTPS enforced (localhost exempt) |
| | `block_private_ips` | false | Private IP blocking |
| | `allowed_schemes` | `["https", "http"]` | Allowed URL schemes |
## Build & Test

```sh
cargo build
cargo test

# Run examples (requires API keys)
cargo run --example <name>
```
## Project Structure

```
src/
├── lib.rs                      # Crate root, public re-exports
├── types/                      # Provider-agnostic data model
│   ├── model.rs                # Model, Provider, Api, Cost, OpenAICompletionsCompat
│   ├── message.rs              # Message (User/Assistant/ToolResult), StopReason
│   ├── content.rs              # ContentBlock (Text/Thinking/ToolCall/Image)
│   ├── context.rs              # Context, Tool, StreamOptions
│   ├── limits.rs               # SecurityConfig, HttpLimits, AgentLimits, StreamLimits, UrlPolicy, HeaderPolicy
│   ├── events.rs               # AssistantMessageEvent (streaming events)
│   └── usage.rs                # Token usage tracking
├── protocol/                   # Wire-format protocol implementations (README.md)
│   ├── traits.rs               # LLMProtocol trait
│   ├── registry.rs             # Global ProtocolRegistry
│   ├── common.rs               # Shared infrastructure (URL resolution, payload hooks, error handling)
│   ├── delegation.rs           # Macros for generating delegation providers
│   ├── openai_completions.rs   # OpenAI Chat Completions protocol
│   ├── openai_responses.rs     # OpenAI Responses API protocol
│   ├── anthropic.rs            # Anthropic Messages protocol
│   └── google.rs               # Google GenAI + Vertex AI (dual-mode)
├── provider/                   # Service vendor facades (README.md)
│   ├── openai.rs               # OpenAI → protocol::openai_responses
│   ├── anthropic.rs            # Anthropic → protocol::anthropic
│   ├── google.rs               # Google → protocol::google
│   ├── ollama.rs               # Ollama → protocol::openai_completions
│   ├── xai.rs                  # Delegation → OpenAI Completions
│   ├── groq.rs                 # Delegation → OpenAI Completions
│   ├── openrouter.rs           # Delegation → OpenAI Completions
│   ├── zai.rs                  # Delegation → OpenAI Completions
│   ├── minimax.rs              # Delegation → Anthropic
│   ├── kimi_coding.rs          # Delegation → Anthropic
│   └── zenmux.rs               # Adaptive 3-way routing
├── catalog/
│   ├── README.md               # Catalog fetch/enrichment/snapshot documentation
│   └── mod.rs                  # Native model listing + snapshot refresh + metadata stores
├── stream/
│   └── event_stream.rs         # Generic EventStream<T, R> + AssistantMessageEventStream
├── agent/
│   ├── README.md               # Full Agent module documentation
│   ├── agent.rs                # Agent loop: stream → tools → re-prompt
│   ├── state.rs                # Thread-safe AgentState
│   └── types.rs                # AgentConfig, AgentEvent, AgentTool, AgentHooks, ToolExecutor, ToolExecutionMode
├── transform/
│   ├── messages.rs             # Thinking block conversion, orphan tool call handling
│   └── tool_calls.rs           # Tool call ID normalization
├── thinking/
│   └── config.rs               # ThinkingLevel, provider-specific options
├── validation/
│   └── tool_validation.rs      # JSON Schema validation for tool args
└── models/
    ├── mod.rs                  # ModelRegistry + global predefined models
    └── predefined.rs
```