# ai-lib-rust

Protocol Runtime for AI-Protocol - a high-performance Rust reference implementation.
ai-lib-rust is the Rust runtime implementation for the AI-Protocol specification. It embodies the core design principle: 一切逻辑皆算子,一切配置皆协议 (All logic is operators, all configuration is protocol).
## 🎯 Design Philosophy
Unlike traditional adapter libraries that hardcode provider-specific logic, ai-lib-rust is a protocol-driven runtime that executes AI-Protocol specifications. This means:
- Zero hardcoded provider logic: All behavior is driven by protocol manifests (source YAML or dist JSON)
- Operator-based architecture: Processing is done through composable operators (Decoder → Selector → Accumulator → FanOut → EventMapper)
- Hot-reloadable: Protocol configurations can be updated without restarting the application
- Unified interface: Developers interact with a single, consistent API regardless of the underlying provider
## 🏗️ Architecture
The library is organized into three layers:
1. Protocol Specification Layer (protocol/)
- Loader: Loads protocol files from local filesystem, embedded assets, or remote URLs
- Validator: Validates protocols against JSON Schema
- Schema: Protocol structure definitions
2. Pipeline Interpreter Layer (pipeline/)
- Decoder: Parses raw bytes into protocol frames (SSE, JSON Lines, etc.)
- Selector: Filters frames using JSONPath expressions
- Accumulator: Accumulates stateful data (e.g., tool call arguments)
- FanOut: Handles multi-candidate scenarios
- EventMapper: Converts protocol frames to unified events
3. User Interface Layer (client/, types/)
- Client: Unified client interface
- Types: Standard type system based on the AI-Protocol `standard_schema`
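The Decoder → Selector → Accumulator → FanOut → EventMapper chain can be pictured as plain function composition. The sketch below is illustrative, not the crate's real operator types; the SSE framing and event labels are simplified assumptions:

```rust
// Hypothetical sketch of the operator idea: each stage transforms the
// previous stage's output. Not the crate's actual API.
fn decode(raw: &str) -> Vec<String> {
    // Decoder: split an SSE body into `data:` frames
    raw.lines()
        .filter_map(|l| l.strip_prefix("data: "))
        .map(|s| s.to_string())
        .collect()
}

fn select(frames: Vec<String>) -> Vec<String> {
    // Selector: keep only frames that look like content deltas
    frames.into_iter().filter(|f| f.contains("delta")).collect()
}

fn map_events(frames: Vec<String>) -> Vec<String> {
    // EventMapper: turn provider frames into unified event labels
    frames
        .into_iter()
        .map(|f| format!("ContentDelta({f})"))
        .collect()
}

fn run_pipeline(raw: &str) -> Vec<String> {
    map_events(select(decode(raw)))
}

fn main() {
    let sse = "data: {\"delta\":\"Hi\"}\ndata: [DONE]\n";
    println!("{:?}", run_pipeline(sse));
}
```

In the real runtime each stage is configured from the protocol manifest rather than hardcoded, which is what makes the pipeline provider-agnostic.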
## 🔄 V2 Protocol Alignment
Starting with v0.7.0, ai-lib-rust aligns with the AI-Protocol V2 specification. V0.8.0 adds full V2 runtime support including V2 manifest parsing, provider drivers, MCP, Computer Use, and extended multimodal.
### Standard Error Codes (V2)
All provider errors are classified into 13 standard error codes with unified retry/fallback semantics:
| Code | Name | Retryable | Fallbackable |
|---|---|---|---|
| E1001 | invalid_request | No | No |
| E1002 | authentication | No | Yes |
| E1003 | permission_denied | No | No |
| E1004 | not_found | No | No |
| E1005 | request_too_large | No | No |
| E2001 | rate_limited | Yes | Yes |
| E2002 | quota_exhausted | No | Yes |
| E3001 | server_error | Yes | Yes |
| E3002 | overloaded | Yes | Yes |
| E3003 | timeout | Yes | Yes |
| E4001 | conflict | Yes | No |
| E4002 | cancelled | No | No |
| E9999 | unknown | No | No |
Classification follows a priority pipeline: provider-specific error code → HTTP status override → standard HTTP mapping → E9999.
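The priority pipeline can be sketched as a single lookup function. The E-codes below come from the table above, but the provider-specific codes and the exact HTTP mapping are illustrative assumptions, not the crate's real tables:

```rust
// Illustrative sketch of the classification priority:
// provider code -> HTTP mapping -> E9999 fallback. Simplified.
fn classify(provider_code: Option<&str>, http_status: u16) -> &'static str {
    // 1. Provider-specific error code takes priority (hypothetical codes)
    if let Some(code) = provider_code {
        match code {
            "insufficient_quota" => return "E2002",       // quota_exhausted
            "context_length_exceeded" => return "E1005",  // request_too_large
            _ => {}
        }
    }
    // 2./3. HTTP status override, then standard HTTP mapping
    match http_status {
        400 => "E1001",       // invalid_request
        401 => "E1002",       // authentication
        403 => "E1003",       // permission_denied
        404 => "E1004",       // not_found
        408 | 504 => "E3003", // timeout
        409 => "E4001",       // conflict
        413 => "E1005",       // request_too_large
        429 => "E2001",       // rate_limited
        500 | 502 => "E3001", // server_error
        503 => "E3002",       // overloaded
        // 4. Everything else falls through to unknown
        _ => "E9999",
    }
}

fn main() {
    println!("{}", classify(None, 429));
    println!("{}", classify(Some("insufficient_quota"), 429));
}
```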
### Compliance Tests
Cross-runtime behavioral consistency is verified by a shared YAML-based test suite from the ai-protocol repository:
```bash
# Run compliance tests
cargo test --test compliance

# With explicit compliance directory
COMPLIANCE_DIR=../ai-protocol/tests/compliance cargo test --test compliance
```
For details, see CROSS_RUNTIME.md.
### Testing with ai-protocol-mock
For integration and MCP tests without real API calls, use ai-protocol-mock:
```bash
# Start mock server (from ai-protocol-mock repo)

# Run tests with mock
MOCK_HTTP_URL=http://localhost:4010 MOCK_MCP_URL=http://localhost:4010/mcp cargo test

# Run specific mock integration tests
MOCK_HTTP_URL=http://localhost:4010 cargo test
```
Or in code: `AiClientBuilder::new().base_url_override("http://localhost:4010").build(...)`
## 🧩 Feature flags & re-exports
ai-lib-rust keeps the runtime core small, and exposes optional capabilities behind feature flags. This aligns with the V2 "lean core, progressive complexity" design principle.
For a deeper overview, see docs/ARCHITECTURE.md.
- Always-available re-exports (crate root): `AiClient`, `AiClientBuilder`, `CancelHandle`, `CallStats`, `ChatBatchRequest`, `ClientMetrics`, `EndpointExt`, `Message`, `MessageRole`, `StreamingEvent`, `ToolCall`, `Result<T>`, `Error`, `ErrorContext`, `FeedbackEvent`, `FeedbackSink` (core feedback types)
- Capability features (V2 aligned):
  - `embeddings`: embedding generation (`EmbeddingClient`)
  - `batch`: batch API processing (`BatchExecutor`)
  - `guardrails`: input/output validation
  - `tokens`: token counting and cost estimation
  - `telemetry`: advanced observability sinks (`InMemoryFeedbackSink`, `ConsoleFeedbackSink`, etc.)
  - `mcp`: MCP (Model Context Protocol) tool bridge with namespace-based tool conversion and filtering
  - `computer_use`: Computer Use abstraction with safety policies, domain allowlists, and action validation
  - `multimodal`: extended multimodal support (vision, audio, and video modality validation and format checks)
  - `reasoning`: extended reasoning / chain-of-thought support
- Infrastructure features:
  - `routing_mvp`: pure-logic model management helpers (`CustomModelManager`, `ModelArray`, etc.)
  - `interceptors`: application-layer call hooks (`InterceptorPipeline`, `Interceptor`, `RequestContext`)
- Meta-feature:
  - `full`: enables all capability and infrastructure features
Enable with:

```toml
[dependencies]
# Lean core (default)
ai-lib-rust = "0.8.0"

# With specific capabilities
ai-lib-rust = { version = "0.8.0", features = ["embeddings", "telemetry"] }

# Everything enabled
ai-lib-rust = { version = "0.8.0", features = ["full"] }
```
## 🗺️ Capability map (layered tools)
This is a structured view of what the crate provides, grouped by layers.
1) Protocol layer (src/protocol/)
- `ProtocolLoader`: load provider manifests from local paths / env paths / GitHub raw URLs
- `ProtocolValidator`: JSON Schema validation (supports offline use via an embedded schema)
- `ProtocolManifest`: typed representation of provider manifests
- `UnifiedRequest`: provider-agnostic request payload used by the runtime
2) Transport layer (src/transport/)
- `HttpTransport`: reqwest-based transport with proxy/timeout defaults and env knobs
- API key resolution: keyring → `<PROVIDER_ID>_API_KEY` env variable
3) Pipeline layer (src/pipeline/)
- Operator pipeline: decoder → selector → accumulator → fanout → event mapper
- Streaming normalization: maps provider frames to `StreamingEvent`
4) Client layer (src/client/)
- `AiClient`: runtime entry point; model-driven (`"provider/model"`)
- Chat builder: `client.chat().messages(...).stream().execute_stream()`
- Batch: `chat_batch`, `chat_batch_smart`
- Observability: `call_model_with_stats` returns `CallStats`
- Cancellation: `execute_stream_with_cancel()` → `CancelHandle`
- Services: `EndpointExt` for calling `services` declared in protocol manifests
5) Resilience layer (src/resilience/ + client/policy)
- Policy engine: capability validation + retry/fallback decisions
- Rate limiter: token-bucket + adaptive header-driven mode
- Circuit breaker: minimal breaker with env or builder defaults
- Backpressure: max in-flight permit gating
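The rate limiter's token-bucket behavior can be illustrated with a minimal, self-contained sketch. This is not the crate's internal type; the time handling is simplified to an explicit `tick` for clarity:

```rust
// Minimal token-bucket sketch (illustrative, not the crate's limiter).
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
}

impl TokenBucket {
    fn new(rps: f64) -> Self {
        Self { capacity: rps, tokens: rps, refill_per_sec: rps }
    }

    // Advance time by `dt` seconds, refilling up to capacity.
    fn tick(&mut self, dt: f64) {
        self.tokens = (self.tokens + self.refill_per_sec * dt).min(self.capacity);
    }

    // Try to take one permit; returns false when the bucket is empty.
    fn try_acquire(&mut self) -> bool {
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut b = TokenBucket::new(2.0); // ~2 requests per second
    assert!(b.try_acquire());
    assert!(b.try_acquire());
    assert!(!b.try_acquire()); // bucket exhausted
    b.tick(0.5);               // half a second refills one token
    assert!(b.try_acquire());
    println!("ok");
}
```

The adaptive header-driven mode mentioned above would adjust `refill_per_sec` from provider rate-limit headers instead of keeping it fixed.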
6) Types layer (src/types/)
- Messages: `Message`, `MessageRole`, `MessageContent`, `ContentBlock`
- Tools: `ToolDefinition`, `FunctionDefinition`, `ToolCall`
- Events: `StreamingEvent`
7) Telemetry layer (src/telemetry/)
- `FeedbackSink` / `FeedbackEvent`: opt-in feedback reporting
- Extended feedback types: `RatingFeedback`, `ThumbsFeedback`, `TextFeedback`, `CorrectionFeedback`, `RegenerateFeedback`, `StopFeedback`
- Multiple sinks: `InMemoryFeedbackSink`, `ConsoleFeedbackSink`, `CompositeFeedbackSink`
- Global sink management: `get_feedback_sink()`, `set_feedback_sink()`, `report_feedback()`
8) Embedding layer (src/embeddings/) - NEW in v0.6.5
- `EmbeddingClient` / `EmbeddingClientBuilder`: generate embeddings from text
- Types: `Embedding`, `EmbeddingRequest`, `EmbeddingResponse`, `EmbeddingUsage`
- Vector operations: `cosine_similarity`, `dot_product`, `euclidean_distance`, `manhattan_distance`
- Utilities: `normalize_vector`, `average_vectors`, `weighted_average_vectors`, `find_most_similar`
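The math behind the vector operations is compact. A self-contained sketch of two of them (the crate exposes its own versions behind the `embeddings` feature; the signatures here are assumptions for illustration):

```rust
// Illustrative implementations of the dot product and cosine similarity.
fn dot_product(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    // cos(theta) = (a . b) / (|a| * |b|)
    let norm = |v: &[f32]| dot_product(v, v).sqrt();
    dot_product(a, b) / (norm(a) * norm(b))
}

fn main() {
    let a = [1.0, 0.0];
    let b = [0.0, 1.0];
    println!("{}", cosine_similarity(&a, &a)); // identical direction -> 1
    println!("{}", cosine_similarity(&a, &b)); // orthogonal -> 0
}
```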
9) Cache layer (src/cache/) - NEW in v0.6.5
- `CacheBackend` trait with `MemoryCache` and `NullCache` implementations
- `CacheManager`: TTL-based caching with statistics
- `CacheKey` / `CacheKeyGenerator`: deterministic cache key generation
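Deterministic key generation can be sketched with the standard-library hasher. The fields hashed here are illustrative choices, not the internals of the crate's `CacheKeyGenerator`:

```rust
// Illustrative deterministic cache key: hash the request's salient fields.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn cache_key(model: &str, messages: &[&str], temperature_milli: u32) -> u64 {
    let mut h = DefaultHasher::new();
    model.hash(&mut h);
    messages.hash(&mut h);
    temperature_milli.hash(&mut h); // fixed-point to avoid hashing raw floats
    h.finish()
}

fn main() {
    let k1 = cache_key("deepseek/deepseek-chat", &["hi"], 700);
    let k2 = cache_key("deepseek/deepseek-chat", &["hi"], 700);
    let k3 = cache_key("deepseek/deepseek-chat", &["hello"], 700);
    assert_eq!(k1, k2); // same inputs -> same key
    assert_ne!(k1, k3); // different prompt -> different key
    println!("ok");
}
```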
10) Token layer (src/tokens/) - NEW in v0.6.5
- `TokenCounter` trait: `CharacterEstimator`, `AnthropicEstimator`, `CachingCounter`
- `ModelPricing`: pre-configured pricing for GPT-4o and Claude models
- `CostEstimate`: calculate request costs
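A character-based estimator in the spirit of `CharacterEstimator` can be sketched as follows. The ~4-characters-per-token ratio and the per-million-token prices are illustrative assumptions, not real model pricing:

```rust
// Illustrative token estimation and cost calculation (not the crate's types).
fn estimate_tokens(text: &str) -> u32 {
    // Rough heuristic: ~4 characters per token, rounded up.
    (text.chars().count() as u32 + 3) / 4
}

fn estimate_cost_usd(
    input_tokens: u32,
    output_tokens: u32,
    in_per_mtok: f64,  // USD per 1M input tokens (illustrative)
    out_per_mtok: f64, // USD per 1M output tokens (illustrative)
) -> f64 {
    (input_tokens as f64 * in_per_mtok + output_tokens as f64 * out_per_mtok) / 1e6
}

fn main() {
    let t = estimate_tokens("hello world!"); // 12 chars -> 3 tokens
    let cost = estimate_cost_usd(t, 100, 2.5, 10.0);
    println!("{t} tokens, ~${cost:.6}");
}
```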
11) Batch layer (src/batch/) - NEW in v0.6.5
- `BatchCollector` / `BatchConfig`: accumulate requests for batch processing
- `BatchExecutor`: execute batches with configurable strategies
- `BatchResult`: structured batch execution results
12) Plugin layer (src/plugins/) - NEW in v0.6.5
- `Plugin` trait with lifecycle hooks
- `PluginRegistry`: centralized plugin management
- Hook system: `HookType`, `Hook`, `HookManager`
- Middleware: `Middleware`, `MiddlewareChain` for request/response transformation
13) Utils (src/utils/)
- JSONPath mapping helpers, tool-call assembler, and small runtime utilities
14) Optional helpers (feature-gated)
- `routing_mvp` (`src/routing/`): model selection + endpoint array load balancing (pure logic)
- `interceptors` (`src/interceptors/`): hooks around calls for logging/metrics/audit
## 🚀 Quick Start
### Sharing the client across tasks
`AiClient` does not implement `Clone` (by design, for API key and provider ToS compliance).
Use `Arc<AiClient>` to share it across async tasks:
```rust
// Sketch: the constructor argument and module paths are illustrative;
// consult the crate docs for the exact API.
use ai_lib_rust::AiClient;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = Arc::new(AiClient::new("deepseek/deepseek-chat").await?);

    let c = Arc::clone(&client);
    tokio::spawn(async move {
        // use `c` in this task
    });
    Ok(())
}
```
### Basic Usage
```rust
// Sketch reconstructed from the crate's documented builder chain;
// module paths and message construction are illustrative.
use ai_lib_rust::AiClient;
use ai_lib_rust::types::StreamingEvent;
use futures::StreamExt;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = AiClient::new("deepseek/deepseek-chat").await?;
    let mut stream = client
        .chat()
        .messages(/* ... */)
        .stream()
        .execute_stream()
        .await?;

    while let Some(event) = stream.next().await {
        // Each item is a unified `StreamingEvent` (content delta, tool call, end)
        println!("{:?}", event?);
    }
    Ok(())
}
```
### Multimodal (Image / Audio)
Multimodal inputs are represented as MessageContent::Blocks(Vec<ContentBlock>).
```rust
// Sketch: field and constructor shapes are illustrative; see the crate's
// `types` module for the real definitions.
use ai_lib_rust::types::{ContentBlock, Message, MessageContent, MessageRole};

let message = Message {
    role: MessageRole::User,
    content: MessageContent::Blocks(vec![
        // e.g. a text block plus an image block
    ]),
};
```
### Useful environment variables

- `AI_PROTOCOL_DIR` / `AI_PROTOCOL_PATH`: path to your local `ai-protocol` repo root (containing `v1/`)
- `AI_LIB_ATTEMPT_TIMEOUT_MS`: per-attempt timeout guard used by the unified policy engine
- `AI_LIB_BATCH_CONCURRENCY`: override the concurrency limit for batch operations
### Custom Protocol
```rust
// Sketch: the base path and provider id are illustrative.
use ai_lib_rust::protocol::ProtocolLoader;

let loader = ProtocolLoader::new()
    .with_base_path("../ai-protocol")
    .with_hot_reload(true);
let manifest = loader.load_provider("deepseek").await?;
```
## 📦 Installation
Add to your Cargo.toml:
```toml
[dependencies]
ai-lib-rust = "0.8.0"
tokio = { version = "1.0", features = ["full"] }
futures = "0.3"
```
## 🔧 Configuration
The library automatically looks for protocol manifests in the following locations (in order):
- Custom path set via `ProtocolLoader::with_base_path()`
- `AI_PROTOCOL_DIR` / `AI_PROTOCOL_PATH` (local path or GitHub raw URL)
- Common dev paths: `ai-protocol/`, `../ai-protocol/`, `../../ai-protocol/`
- Last resort: GitHub raw `hiddenpath/ai-protocol` (main)
For each base path, provider manifests are resolved in a backward-compatible order:
`dist/v1/providers/<id>.json` → `v1/providers/<id>.yaml`.
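The resolution order for a single base path can be sketched as a function that yields candidate paths in priority order (the helper name is hypothetical, the path layout is from the text above):

```rust
// Illustrative: produce manifest candidates in backward-compatible order.
use std::path::{Path, PathBuf};

fn resolve_manifest(base: &Path, provider: &str) -> Vec<PathBuf> {
    vec![
        // Prefer the compiled dist JSON...
        base.join(format!("dist/v1/providers/{provider}.json")),
        // ...then fall back to the source YAML.
        base.join(format!("v1/providers/{provider}.yaml")),
    ]
}

fn main() {
    for candidate in resolve_manifest(Path::new("../ai-protocol"), "deepseek") {
        println!("{}", candidate.display());
    }
}
```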
Protocol manifests should follow the AI-Protocol v1.5 specification structure. The runtime validates manifests against the official JSON Schema from the AI-Protocol repository.
## 🔐 Provider Requirements (API Keys)
Most providers require an API key. The runtime reads keys from (in order):
1. OS Keyring (optional, convenience feature)
   - Windows: uses Windows Credential Manager
   - macOS: uses Keychain
   - Linux: uses the Secret Service API
   - Service: `ai-protocol`, Username: provider id
   - Note: the keyring is optional and may not work in containers/WSL; it falls back to environment variables automatically.
2. Environment Variable (recommended for production)
   - Format: `<PROVIDER_ID>_API_KEY` (e.g. `DEEPSEEK_API_KEY`, `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`)
   - Recommended for: CI/CD, containers, WSL, production deployments
Example:
```bash
# Set API key via environment variable (recommended)
export DEEPSEEK_API_KEY=your-key-here

# Or use keyring (optional, for local development)
# Windows: Stored in Credential Manager
# macOS: Stored in Keychain
```
Provider-specific details vary, but ai-lib-rust normalizes them behind a unified client API.
## 🌐 Proxy / Timeout / Backpressure (Production knobs)

- Proxy: set `AI_PROXY_URL` (e.g. `http://user:pass@host:port`)
- HTTP timeout: set `AI_HTTP_TIMEOUT_SECS` (fallback: `AI_TIMEOUT_SECS`)
- In-flight limit: set `AI_LIB_MAX_INFLIGHT` or use `AiClientBuilder::max_inflight(n)`
- Rate limiting (optional): set either `AI_LIB_RPS` (requests per second) or `AI_LIB_RPM` (requests per minute)
- Circuit breaker (optional): enable via `AiClientBuilder::circuit_breaker_default()` or env:
  - `AI_LIB_BREAKER_FAILURE_THRESHOLD` (default 5)
  - `AI_LIB_BREAKER_COOLDOWN_SECS` (default 30)
## 📊 Observability: CallStats

If you need per-call stats (latency, retries, request ids, endpoint), use:

```rust
// Sketch: the destructured binding and call arguments are illustrative;
// see the crate docs for the exact return type.
let (response, stats) = client.call_model_with_stats(/* ... */).await?;
println!("{stats:?}");
```
## 🛑 Cancellable Streaming

```rust
// Sketch: the binding is illustrative; `execute_stream_with_cancel`
// yields the stream together with a `CancelHandle`.
let (mut stream, cancel) = client
    .chat()
    .messages(/* ... */)
    .stream()
    .execute_stream_with_cancel()
    .await?;
// cancel.cancel(); // emits StreamEnd { finish_reason: "cancelled" },
// drops the underlying network stream, and releases the in-flight permit
```
## 🧾 Optional Feedback (Choice Selection)

Telemetry is opt-in. You can inject a `FeedbackSink` and report feedback explicitly:

```rust
// Sketch: the import path and event payload are illustrative.
use ai_lib_rust::telemetry::FeedbackEvent;

client.report_feedback(/* FeedbackEvent value */).await?;
```
## 🎨 Key Features
### Protocol-Driven Architecture

No `match provider` statements. All logic is derived from protocol configuration:

```rust
// Sketch: the constructor name is illustrative.
// The pipeline is built dynamically from the protocol manifest.
let pipeline = Pipeline::from_manifest(&manifest)?;

// Operators are configured via manifests (YAML/JSON), not hardcoded.
// Adding a new provider requires zero code changes.
```
### Multi-Candidate Support

Automatically handles multi-candidate scenarios through the FanOut operator:

```yaml
streaming:
  candidate:
    candidate_id_path: "$.choices[*].index"
    fan_out: true
```
### Tool Accumulation

Stateful accumulation of tool call arguments:

```yaml
streaming:
  accumulator:
    stateful_tool_parsing: true
    key_path: "$.delta.partial_json"
    flush_on: "$.type == 'content_block_stop'"
```
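The accumulator's behavior can be illustrated with a self-contained sketch: buffer `partial_json` fragments per key until the flush condition fires. Type and method names here are illustrative, not the crate's real accumulator:

```rust
// Illustrative stateful accumulation of tool-call argument fragments.
use std::collections::HashMap;

#[derive(Default)]
struct ToolArgAccumulator {
    buffers: HashMap<u32, String>,
}

impl ToolArgAccumulator {
    // Append a `partial_json` fragment for a given tool-call index.
    fn push(&mut self, index: u32, fragment: &str) {
        self.buffers.entry(index).or_default().push_str(fragment);
    }

    // Flush on the stop condition (e.g. `content_block_stop`),
    // returning the fully assembled arguments.
    fn flush(&mut self, index: u32) -> Option<String> {
        self.buffers.remove(&index)
    }
}

fn main() {
    let mut acc = ToolArgAccumulator::default();
    acc.push(0, "{\"city\":");
    acc.push(0, "\"Paris\"}");
    let args = acc.flush(0).unwrap();
    println!("{args}"); // {"city":"Paris"}
}
```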
### Hot Reload

Protocol configurations can be updated at runtime:

```rust
// Sketch: the argument is illustrative.
let loader = ProtocolLoader::new().with_hot_reload(true);
// Protocol changes are automatically picked up
```
## 📚 Examples
See the examples/ directory:
- `basic_usage.rs`: simple non-streaming chat completion
- `deepseek_chat_stream.rs`: streaming chat example
- `deepseek_tool_call_stream.rs`: tool calling with streaming
- `custom_protocol.rs`: loading custom protocol configurations
- `list_models.rs`: listing available models from providers
- `service_discovery.rs`: service discovery and custom service calls
- `test_protocol_loading.rs`: protocol loading sanity check
## 🧪 Testing

```bash
# Run all tests
cargo test

# Run compliance tests (cross-runtime consistency)
cargo test --test compliance

# Run with all features enabled
cargo test --all-features
```
## 📦 Batch (Chat)

For batch execution (order-preserving), use:

```rust
// Sketch: constructor and call arguments are illustrative.
use ai_lib_rust::{AiClient, ChatBatchRequest};

let client = AiClient::new("deepseek/deepseek-chat").await?;
let reqs: Vec<ChatBatchRequest> = vec![/* ... */];
let results = client.chat_batch(reqs).await;
```
### Smart batch tuning

If you prefer a conservative default heuristic, use:

```rust
// Sketch: arguments are illustrative.
let results = client.chat_batch_smart(reqs).await;
```

Override concurrency with `AI_LIB_BATCH_CONCURRENCY`.
## 🤝 Contributing
Contributions are welcome! Please ensure that:
- All protocol configurations follow the AI-Protocol specification (v1.5 / V2)
- New operators are properly documented
- Tests are included for new features
- Compliance tests pass for cross-runtime behaviors (`cargo test --test compliance`)
- Code follows Rust best practices and passes `cargo clippy`
## 📄 License
This project is licensed under either of:
- Apache License, Version 2.0 (LICENSE-APACHE)
- MIT License (LICENSE-MIT)
at your option.
## 🔗 Related Projects
- AI-Protocol: Protocol specification (v1.5 / V2)
- ai-lib-python: Python runtime implementation
ai-lib-rust - Where protocol meets performance. 🚀