ai-lib-rust
Protocol Runtime for AI-Protocol - A high-performance Rust reference implementation.
ai-lib-rust is the Rust runtime implementation for the AI-Protocol specification. It embodies the core design principle: 一切逻辑皆算子,一切配置皆协议 (All logic is operators, all configuration is protocol).
🎯 Design Philosophy
Unlike traditional adapter libraries that hardcode provider-specific logic, ai-lib-rust is a protocol-driven runtime that executes AI-Protocol specifications. This means:
- Zero hardcoded provider logic: All behavior is driven by YAML protocol files
- Operator-based architecture: Processing is done through composable operators (Decoder → Selector → Accumulator → FanOut → EventMapper)
- Hot-reloadable: Protocol configurations can be updated without restarting the application
- Unified interface: Developers interact with a single, consistent API regardless of the underlying provider
🏗️ Architecture
The library is organized into three layers:
1. Protocol Specification Layer (protocol/)
- Loader: Loads protocol files from local filesystem, embedded assets, or remote URLs
- Validator: Validates protocols against JSON Schema
- Schema: Protocol structure definitions
2. Pipeline Interpreter Layer (pipeline/)
- Decoder: Parses raw bytes into protocol frames (SSE, JSON Lines, etc.)
- Selector: Filters frames using JSONPath expressions
- Accumulator: Accumulates stateful data (e.g., tool call arguments)
- FanOut: Handles multi-candidate scenarios
- EventMapper: Converts protocol frames to unified events
3. User Interface Layer (client/, types/)
- Client: Unified client interface
- Types: Standard type system based on the AI-Protocol `standard_schema`
🧩 Feature flags & re-exports
ai-lib-rust keeps the runtime core small, and exposes optional higher-level helpers behind feature flags.
For a deeper overview, see docs/ARCHITECTURE.md.
- Always available re-exports (crate root):
  `AiClient`, `AiClientBuilder`, `CancelHandle`, `CallStats`, `ChatBatchRequest`, `EndpointExt`, `Message`, `MessageRole`, `StreamingEvent`, `ToolCall`, `Result<T>`, `Error`, `ErrorContext`
- Feature-gated re-exports:
  - `routing_mvp`: pure-logic model management helpers (`CustomModelManager`, `ModelArray`, etc.)
  - `interceptors`: application-layer call hooks (`InterceptorPipeline`, `Interceptor`, `RequestContext`)
Enable with:

```toml
[dependencies]
ai-lib-rust = { version = "0.5.1", features = ["routing_mvp", "interceptors"] }
```
🗺️ Capability map (layered tools)
This is a structured view of what the crate provides, grouped by layers.
1) Protocol layer (src/protocol/)
- `ProtocolLoader`: load provider manifests from local paths / env paths / GitHub raw URLs
- `ProtocolValidator`: JSON Schema validation (supports offline use via embedded schema)
- `ProtocolManifest`: typed representation of provider manifests
- `UnifiedRequest`: provider-agnostic request payload used by the runtime
2) Transport layer (src/transport/)
- `HttpTransport`: reqwest-based transport with proxy/timeout defaults and env knobs
- API key resolution: keyring → `<PROVIDER_ID>_API_KEY` env variable
3) Pipeline layer (src/pipeline/)
- Operator pipeline: decoder → selector → accumulator → fanout → event mapper
- Streaming normalization: maps provider frames to `StreamingEvent`
4) Client layer (src/client/)
- `AiClient`: runtime entry point; model-driven ("provider/model")
- Chat builder: `client.chat().messages(...).stream().execute_stream()`
- Batch: `chat_batch`, `chat_batch_smart`
- Observability: `call_model_with_stats` returns `CallStats`
- Cancellation: `execute_stream_with_cancel()` → `CancelHandle`
- Services: `EndpointExt` for calling `services` declared in protocol manifests
5) Resilience layer (src/resilience/ + client/policy)
- Policy engine: capability validation + retry/fallback decisions
- Rate limiter: token-bucket + adaptive header-driven mode
- Circuit breaker: minimal breaker with env or builder defaults
- Backpressure: max in-flight permit gating (a builder-based setup is sketched after this capability map)
6) Types layer (src/types/)
- Messages: `Message`, `MessageRole`, `MessageContent`, `ContentBlock`
- Tools: `ToolDefinition`, `FunctionDefinition`, `ToolCall`
- Events: `StreamingEvent`
7) Telemetry layer (src/telemetry/)
- `FeedbackSink` / `FeedbackEvent`: opt-in feedback reporting
8) Utils (src/utils/)
- JSONPath mapping helpers, tool-call assembler, and small runtime utilities
9) Optional helpers (feature-gated)
- `routing_mvp` (src/routing/): model selection + endpoint-array load balancing (pure logic)
- `interceptors` (src/interceptors/): hooks around calls for logging/metrics/audit
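As a minimal sketch of the resilience layer above: the backpressure and circuit-breaker knobs can be set through the builder. `max_inflight` and `circuit_breaker_default` are the methods named in this README; the `new()`/`build()` call chain is an assumption:

```rust
use ai_lib_rust::AiClientBuilder;

// Illustrative builder chain; check the crate docs for the exact constructor/finisher.
async fn build_client() -> Result<ai_lib_rust::AiClient, ai_lib_rust::Error> {
    let client = AiClientBuilder::new()
        .max_inflight(32)           // backpressure: cap concurrent in-flight requests
        .circuit_breaker_default()  // circuit breaker with env/builder defaults
        .build()
        .await?;
    Ok(client)
}
```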
🚀 Quick Start
Basic Usage
```rust
use ai_lib_rust::{AiClient, Message, StreamingEvent};
use futures::StreamExt;

// Reconstructed from the API surface described in this README; see
// examples/deepseek_chat_stream.rs for an authoritative version.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Model-driven entry point: "provider/model" selects the protocol manifest.
    let client = AiClient::new("deepseek/deepseek-chat").await?;

    let mut stream = client
        .chat()
        .messages(vec![Message::user("Hello!")])
        .stream()
        .execute_stream()
        .await?;

    while let Some(event) = stream.next().await {
        println!("{:?}", event?); // StreamingEvent: content deltas, tool calls, StreamEnd, ...
    }

    Ok(())
}
```
Multimodal (Image / Audio)
Multimodal inputs are represented as MessageContent::Blocks(Vec<ContentBlock>).
```rust
use ai_lib_rust::types::{ContentBlock, MessageContent};

// Sketch: the ContentBlock constructors shown here are illustrative; see the
// multimodal examples in examples/ for the exact API.
let content = MessageContent::Blocks(vec![
    ContentBlock::text("Describe this image"),
    ContentBlock::image_url("https://example.com/photo.png"),
]);
```
Useful environment variables
- `AI_PROTOCOL_DIR` / `AI_PROTOCOL_PATH`: path to your local ai-protocol repo root (containing `v1/`)
- `AI_LIB_ATTEMPT_TIMEOUT_MS`: per-attempt timeout guard used by the unified policy engine
- `AI_LIB_BATCH_CONCURRENCY`: override concurrency limit for batch operations
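For example (paths and values are illustrative):

```bash
export AI_PROTOCOL_DIR="$HOME/src/ai-protocol"   # repo root containing v1/
export AI_LIB_ATTEMPT_TIMEOUT_MS=30000           # 30s per-attempt guard
export AI_LIB_BATCH_CONCURRENCY=8                # cap batch concurrency
```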
Custom Protocol
```rust
use ai_lib_rust::protocol::ProtocolLoader;

// Argument values are illustrative; the builder methods match those listed above.
let loader = ProtocolLoader::new()
    .with_base_path("./my-protocols")
    .with_hot_reload(true);

let manifest = loader.load_provider("deepseek").await?;
```
📦 Installation
Add to your Cargo.toml:
```toml
[dependencies]
ai-lib-rust = "0.5.1"
tokio = { version = "1.0", features = ["full"] }
futures = "0.3"
```
🔧 Configuration
The library automatically looks for protocol files in the following locations (in order):
- Custom path set via `ProtocolLoader::with_base_path()`
- `ai-protocol/` subdirectory (Git submodule)
- `../ai-protocol/` (sibling directory)
- `../../ai-protocol/` (parent's sibling)
Protocol files should follow the AI-Protocol v1.5 specification structure. The runtime validates manifests against the official JSON Schema from the AI-Protocol repository.
🔐 Provider Requirements (API Keys)
Most providers require an API key. The runtime reads keys from (in order):
- OS Keyring (optional, convenience feature)
  - Windows: Uses Windows Credential Manager
  - macOS: Uses Keychain
  - Linux: Uses Secret Service API
  - Service: `ai-protocol`, Username: provider id
  - Note: Keyring is optional and may not work in containers/WSL. Falls back to environment variables automatically.
- Environment Variable (recommended for production)
  - Format: `<PROVIDER_ID>_API_KEY` (e.g. `DEEPSEEK_API_KEY`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`)
  - Recommended for: CI/CD, containers, WSL, production deployments

Example:

```bash
# Set API key via environment variable (recommended)
export DEEPSEEK_API_KEY="sk-..."

# Or use keyring (optional, for local development)
# Windows: Stored in Credential Manager
# macOS:   Stored in Keychain
```
Provider-specific details vary, but ai-lib-rust normalizes them behind a unified client API.
🌐 Proxy / Timeout / Backpressure (Production knobs)
- Proxy: set `AI_PROXY_URL` (e.g. `http://user:pass@host:port`)
- HTTP timeout: set `AI_HTTP_TIMEOUT_SECS` (fallback: `AI_TIMEOUT_SECS`)
- In-flight limit: set `AI_LIB_MAX_INFLIGHT` or use `AiClientBuilder::max_inflight(n)`
- Rate limiting (optional): set either `AI_LIB_RPS` (requests per second) or `AI_LIB_RPM` (requests per minute)
- Circuit breaker (optional): enable via `AiClientBuilder::circuit_breaker_default()` or env `AI_LIB_BREAKER_FAILURE_THRESHOLD` (default 5) and `AI_LIB_BREAKER_COOLDOWN_SECS` (default 30)
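A typical environment-driven setup might look like this (all values are illustrative):

```bash
export AI_PROXY_URL="http://user:pass@proxy.internal:8080"
export AI_HTTP_TIMEOUT_SECS=60
export AI_LIB_MAX_INFLIGHT=64
export AI_LIB_RPS=10                       # or AI_LIB_RPM for per-minute limits
export AI_LIB_BREAKER_FAILURE_THRESHOLD=5
export AI_LIB_BREAKER_COOLDOWN_SECS=30
```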
📊 Observability: CallStats
If you need per-call stats (latency, retries, request ids, endpoint), use:
```rust
// Sketch: the request argument is elided; `call_model_with_stats` also returns CallStats.
let (response, stats) = client.call_model_with_stats(request).await?;
println!("{stats:?}"); // latency, retry count, request ids, endpoint
```
🛑 Cancellable Streaming
```rust
let (mut stream, cancel) = client
    .chat()
    .messages(vec![Message::user("Tell me a long story")])
    .stream()
    .execute_stream_with_cancel()
    .await?;

// cancel.cancel(); // emits StreamEnd { finish_reason: "cancelled" }, drops the
//                  // underlying network stream, and releases the in-flight permit
```
🧾 Optional Feedback (Choice Selection)
Telemetry is opt-in. You can inject a FeedbackSink and report feedback explicitly:
```rust
use ai_lib_rust::telemetry::{FeedbackEvent, FeedbackSink};

// Build a FeedbackEvent describing the selected choice (construction elided),
// then report it explicitly through the injected FeedbackSink:
client.report_feedback(event).await?;
```
🎨 Key Features
Protocol-Driven Architecture
No `match provider` statements. All logic is derived from protocol configuration:
```rust
// The pipeline is built dynamically from the protocol manifest
// (`Pipeline` stands in for the interpreter type in src/pipeline/; name shown is illustrative).
let pipeline = Pipeline::from_manifest(&manifest)?;

// Operators are configured via YAML, not hardcoded.
// Adding a new provider requires zero code changes.
```
Multi-Candidate Support
Automatically handles multi-candidate scenarios through the FanOut operator:
```yaml
streaming:
  candidate:
    candidate_id_path: "$.choices[*].index"
    fan_out: true
```
Tool Accumulation
Stateful accumulation of tool call arguments:
```yaml
streaming:
  accumulator:
    stateful_tool_parsing: true
    key_path: "$.delta.partial_json"
    flush_on: "$.type == 'content_block_stop'"
```
Hot Reload
Protocol configurations can be updated at runtime:
```rust
let loader = ProtocolLoader::new().with_hot_reload(true);
// Protocol changes are automatically picked up
```
📚 Examples
See the examples/ directory:
- `basic_usage.rs`: Simple non-streaming chat completion
- `deepseek_chat_stream.rs`: Streaming chat example
- `deepseek_tool_call_stream.rs`: Tool calling with streaming
- `custom_protocol.rs`: Loading custom protocol configurations
- `list_models.rs`: Listing available models from a provider
- `service_discovery.rs`: Service discovery and custom service calls
- `test_protocol_loading.rs`: Protocol loading sanity check
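Each example can be run with Cargo once an API key for the target provider is set (provider and key below are illustrative):

```bash
export DEEPSEEK_API_KEY="sk-..."
cargo run --example deepseek_chat_stream
```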
🧪 Testing
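The standard Cargo workflow applies; provider-dependent integration tests typically need the corresponding API keys set as described above.

```bash
cargo test
```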
📦 Batch (Chat)
For batch execution (order-preserving), use:
```rust
use ai_lib_rust::{AiClient, ChatBatchRequest};

let client = AiClient::new("deepseek/deepseek-chat").await?;

// Build the batch requests (construction elided; see examples/ for details).
let reqs: Vec<ChatBatchRequest> = vec![/* ... */];

// Order-preserving: results[i] corresponds to reqs[i].
let results = client.chat_batch(reqs).await;
```
Smart batch tuning
If you prefer a conservative default heuristic, use:
```rust
let results = client.chat_batch_smart(reqs).await;
```
Override the concurrency limit with the `AI_LIB_BATCH_CONCURRENCY` environment variable.
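For example (the value is illustrative):

```bash
export AI_LIB_BATCH_CONCURRENCY=16
```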
🤝 Contributing
Contributions are welcome! Please ensure that:
- All protocol configurations follow the AI-Protocol v1.5 specification
- New operators are properly documented
- Tests are included for new features
- Code follows Rust best practices and passes `cargo clippy`
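A typical local check sequence before opening a PR (adjust if the project's CI differs):

```bash
cargo fmt --all
cargo clippy --all-targets --all-features
cargo test
```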
📄 License
This project is licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT License (LICENSE-MIT)
at your option.
🔗 Related Projects
- AI-Protocol: Protocol specification (v1.5)
ai-lib-rust - Where protocol meets performance. 🚀