ai-lib 🦀✨
A unified, reliable, high-performance multi-provider AI SDK for Rust
A production-grade, provider-agnostic SDK that provides a unified Rust API for 17+ AI platforms (OpenAI, Groq, Anthropic, Gemini, Mistral, Cohere, Azure OpenAI, Ollama, DeepSeek, Qwen, Baidu ERNIE, Tencent Hunyuan, iFlytek Spark, Kimi, HuggingFace, TogetherAI, xAI Grok, etc.).
Eliminates fragmented authentication flows, divergent streaming formats, inconsistent error semantics, model naming differences, and incompatible function calling conventions. Scale from one-liner scripts to multi-region, multi-provider systems without rewriting integration code.
🚀 Core Value (TL;DR)
ai-lib unifies:
- Chat and multimodal requests across heterogeneous model providers
- Unified streaming (unified SSE parser + JSONL protocol) with consistent deltas
- Function calling semantics (including OpenAI-style tool_calls alignment)
- Reasoning model support (structured, streaming, JSON formats)
- Batch processing workflows
- Reliability primitives (retry, backoff, timeout, proxy, health checks, load strategies)
- Model selection (cost/performance/health/weighted)
- Observability hooks
- Progressive configuration (env vars → builder → explicit injection → custom transport)
You focus on product logic; ai-lib handles infrastructure friction.
⚙️ Quick Start
Installation
```toml
[dependencies]
ai-lib = "0.3.2"
tokio = { version = "1", features = ["full"] }
futures = "0.3"
```
Fastest Way
```rust
use ai_lib::{AiClient, Provider};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // One line gets a provider-agnostic client; the API key
    // (e.g. GROQ_API_KEY) is read from the environment.
    let _client = AiClient::new(Provider::Groq)?;
    Ok(())
}
```
Standard Chat
```rust
use ai_lib::{AiClient, ChatCompletionRequest, Content, Message, Provider, Role};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = AiClient::new(Provider::Groq)?;
    let request = ChatCompletionRequest::new(
        "llama-3.1-8b-instant".to_string(), // model name is illustrative
        vec![Message {
            role: Role::User,
            content: Content::Text("Hello, ai-lib!".to_string()),
            function_call: None,
        }],
    );
    let response = client.chat_completion(request).await?;
    println!("{}", response.choices[0].message.content.as_text());
    Ok(())
}
```
Streaming
```rust
use futures::StreamExt;

// `client` and `request` as in the standard chat example above.
let mut stream = client.chat_completion_stream(request).await?;
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    // Chunk shape is illustrative: each item carries an incremental content delta.
    if let Some(choice) = chunk.choices.first() {
        if let Some(delta) = &choice.delta.content {
            print!("{delta}");
        }
    }
}
```
🧠 Core Concepts
Concept | Purpose |
---|---|
Provider | Enumerates all supported providers |
AiClient / Builder | Main entry point; configuration encapsulation |
ChatCompletionRequest | Unified request payload |
Message / Content | Text, image, audio (structured content planned) |
Function / Tool | Unified function calling semantics |
Streaming Event | Provider-standardized delta streams |
ModelManager / ModelArray | Strategy-driven model orchestration |
ConnectionOptions | Explicit runtime overrides |
Metrics Trait | Custom observability integration |
Transport | Injectable HTTP + streaming implementation |
💡 Key Feature Clusters
- Unified provider abstraction (no per-provider branching)
- Universal streaming (unified SSE parser + JSONL; with fallback simulation)
- Multimodal primitives (text/image/audio)
- Function calling (consistent tool patterns; tool_calls compatibility; see the sketch after this list)
- Reasoning model support (structured, streaming, JSON formats)
- Batch processing (sequential/bounded concurrency/smart strategies)
- Reliability: retry, error classification, timeout, proxy, pooling, interceptor pipeline (features)
- Model management: performance/cost/health/round-robin/weighted
- Observability: pluggable metrics and timing
- Security: isolation, no default content logging
- Extensibility: custom transport, metrics, strategy injection
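A hedged sketch of the unified tool pattern. The helper names (`Tool::new_json`, `with_functions`, `FunctionCallPolicy::Auto`) and the response field layout are indicative, not verbatim API, and `serde_json` is assumed as a dependency; see the function_call examples for the exact surface:

```rust
use ai_lib::{
    AiClient, ChatCompletionRequest, Content, FunctionCallPolicy, Message, Provider, Role, Tool,
};
use serde_json::json;

async fn tool_demo() -> Result<(), Box<dyn std::error::Error>> {
    let client = AiClient::new(Provider::OpenAI)?;

    // One tool, described with a JSON-schema parameter block.
    let weather = Tool::new_json(
        "get_weather",
        Some("Look up current weather for a city".to_string()),
        json!({
            "type": "object",
            "properties": { "city": { "type": "string" } },
            "required": ["city"]
        }),
    );

    let request = ChatCompletionRequest::new(
        "gpt-4o-mini".to_string(), // model name is illustrative
        vec![Message {
            role: Role::User,
            content: Content::Text("What's the weather in Paris?".to_string()),
            function_call: None,
        }],
    )
    .with_functions(vec![weather])
    .with_function_call(FunctionCallPolicy::Auto("auto".to_string()));

    let response = client.chat_completion(request).await?;
    // Inspect the first choice for a tool call requested by the model.
    if let Some(call) = &response.choices[0].message.function_call {
        println!("model requested {} with {:?}", call.name, call.arguments);
    }
    Ok(())
}
```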
🌍 Supported Providers (Snapshot)
Provider | Adapter Type | Streaming | Notes |
---|---|---|---|
Groq | Config-driven | ✅ | Ultra-low latency |
OpenAI | Independent | ✅ | Function calling |
Anthropic (Claude) | Config-driven | ✅ | High quality |
Google Gemini | Independent | ✅ | Uses x-goog-api-key header |
Mistral | Independent | ✅ | European models |
Cohere | Independent | ✅ | RAG-optimized |
HuggingFace | Config-driven | ✅ | Open models |
TogetherAI | Config-driven | ✅ | Cost-effective |
DeepSeek | Config-driven | ✅ | Reasoning models |
Qwen | Config-driven | ✅ | Chinese ecosystem |
Baidu ERNIE | Config-driven | ✅ | Enterprise CN |
Tencent Hunyuan | Config-driven | ✅ | Cloud integration |
iFlytek Spark | Config-driven | ✅ | Voice + multimodal |
Moonshot Kimi | Config-driven | ✅ | Long context |
Azure OpenAI | Config-driven | ✅ | Enterprise compliance |
Ollama | Config-driven | ✅ | Local/air-gapped |
xAI Grok | Config-driven | ✅ | Real-time oriented |
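Because the integration surface is identical across rows, swapping providers is a one-line change (variant names are indicative):

```rust
use ai_lib::{AiClient, Provider};

// The same request and streaming code path serves every provider above.
let anthropic = AiClient::new(Provider::Anthropic)?;
let qwen = AiClient::new(Provider::Qwen)?;
let local = AiClient::new(Provider::Ollama)?;
```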
🔑 Configuration & Diagnostics
Environment Variables (Convention-based)
```bash
# API keys (convention: <PROVIDER>_API_KEY)
export GROQ_API_KEY=...
export OPENAI_API_KEY=...
export ANTHROPIC_API_KEY=...

# Optional base URLs (override a provider's default endpoint)
export OLLAMA_BASE_URL=http://localhost:11434

# Proxy
export AI_PROXY_URL=http://proxy.internal:8080

# Global timeout in seconds, HTTP connection pool tuning (pooling is enabled by
# default), and cost metrics (with the `cost_metrics` feature) are likewise
# env-driven; see the crate docs for the exact variable names.
```
Explicit Overrides
```rust
use ai_lib::{AiClient, ConnectionOptions, Provider};

// Field set is illustrative; ConnectionOptions carries explicit runtime overrides.
let client = AiClient::with_options(
    Provider::OpenAI,
    ConnectionOptions {
        base_url: Some("https://gateway.internal/v1".to_string()),
        proxy: None,
        api_key: std::env::var("OPENAI_API_KEY").ok(),
        timeout: Some(std::time::Duration::from_secs(30)),
        disable_proxy: false,
    },
)?;
```
Backpressure & Concurrency Cap (Optional)
- Simple: pass `concurrency_limit` to the batch APIs
- Global: set a max concurrency gate via the builder:
```rust
use ai_lib::{AiClientBuilder, Provider};

let client = AiClientBuilder::new(Provider::Groq)
    .with_max_concurrency(32) // permit count is illustrative
    .for_production()
    .build()?;
```
Notes:
- The gate acquires a permit for `chat_completion` and streaming calls, and releases it when the call finishes.
- If no permits are available, `RateLimitExceeded` is returned; combine with retry/queueing if needed (see the sketch below).
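A minimal retry sketch for that case. Error matching is shown generically; the exact `RateLimitExceeded` variant lives in the crate's error enum, the response type name is an assumption, and `ChatCompletionRequest: Clone` is assumed:

```rust
use std::time::Duration;
use ai_lib::{AiClient, ChatCompletionRequest, ChatCompletionResponse};

// Retry a bounded number of times, pausing briefly between attempts so a
// saturated concurrency gate has a chance to free permits.
async fn chat_with_retry(
    client: &AiClient,
    request: ChatCompletionRequest,
    attempts: usize,
) -> Result<ChatCompletionResponse, Box<dyn std::error::Error>> {
    let mut last_err = None;
    for _ in 0..attempts {
        match client.chat_completion(request.clone()).await {
            Ok(resp) => return Ok(resp),
            Err(e) => {
                last_err = Some(e);
                tokio::time::sleep(Duration::from_millis(200)).await;
            }
        }
    }
    Err(Box::new(last_err.expect("attempts must be > 0")))
}
```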
🛡️ Reliability & Resilience
Aspect | Capability |
---|---|
Retry | Exponential backoff + classification |
Errors | Distinguish transient vs permanent |
Timeout | Per-request configurable |
Proxy | Global/per-connection/disable |
Connection Pool | Tunable size + lifecycle |
Health Checks | Endpoint status + policy-based avoidance |
Load Strategies | Round-robin/weighted/health/performance/cost |
Fallback | Multi-provider arrays/manual layering |
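Manual layering from the Fallback row can be as small as this sketch (production code would first classify transient vs permanent errors; type names follow the crate's conventions as assumed here):

```rust
use ai_lib::{AiClient, ChatCompletionRequest, ChatCompletionResponse, Provider};

// Try a primary provider; on failure, replay the same request on a backup.
async fn chat_with_fallback(
    request: ChatCompletionRequest,
) -> Result<ChatCompletionResponse, Box<dyn std::error::Error>> {
    let primary = AiClient::new(Provider::Groq)?;
    match primary.chat_completion(request.clone()).await {
        Ok(resp) => Ok(resp),
        Err(_) => {
            let backup = AiClient::new(Provider::OpenAI)?;
            Ok(backup.chat_completion(request).await?)
        }
    }
}
```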
📊 Observability & Metrics
Implement the `Metrics` trait to bridge Prometheus, OpenTelemetry, StatsD, etc.:

```rust
use std::sync::Arc;
use ai_lib::{AiClient, Provider};

// `MyMetrics` is your own type implementing the crate's `Metrics` trait;
// the argument shape here is illustrative.
let client = AiClient::new_with_metrics(Provider::Groq, Arc::new(MyMetrics))?;
```
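If you prefer not to wire a full `Metrics` implementation, latency can also be measured externally. A self-contained sketch (the response type name is an assumption):

```rust
use std::time::Instant;
use ai_lib::{AiClient, ChatCompletionRequest, ChatCompletionResponse};

// Time one call and hand the latency to any metrics backend you like.
async fn timed_chat(
    client: &AiClient,
    request: ChatCompletionRequest,
) -> Result<(ChatCompletionResponse, u128), Box<dyn std::error::Error>> {
    let start = Instant::now();
    let response = client.chat_completion(request).await?;
    Ok((response, start.elapsed().as_millis()))
}
```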
Feature Flags (Optional)
- `interceptors`: Interceptor trait & pipeline
- `unified_sse`: Common SSE parser
- `unified_transport`: Shared reqwest client factory
- `cost_metrics`: Minimal cost accounting via env vars
- `routing_mvp`: Enable `ModelArray` routing
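For example, opting into the unified SSE parser and `ModelArray` routing in Cargo.toml:

```toml
[dependencies]
ai-lib = { version = "0.3.2", features = ["unified_sse", "routing_mvp"] }
```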
Enterprise Features
For advanced enterprise capabilities, consider [ai-lib-pro]:
- Advanced Routing: Policy-driven routing, health monitoring, automatic failover
- Enterprise Observability: Structured logging, metrics, distributed tracing
- Cost Management: Centralized pricing tables and budget tracking
- Quota Management: Tenant/organization quotas and rate limiting
- Audit & Compliance: Comprehensive audit trails with redaction
- Security: Envelope encryption and key management
- Configuration: Hot-reload configuration management
ai-lib-pro layers on top of the open-source ai-lib without breaking changes, providing a seamless upgrade path for enterprise users.
Tiering: OSS vs PRO
- OSS (this crate): unified API, streaming, retries/timeouts/proxy, configurable pool, lightweight rate limiting and backpressure, batch concurrency controls. Simple env-driven setup; zero external services required.
- PRO: multi-tenant quotas & priorities, adaptive concurrency/limits, policy-driven routing, centralized config and hot-reload, deep observability/exporters, audit/compliance, cost catalog and budget guardrails. Drop-in upgrade without code changes.
🗂️ Examples Directory (in /examples)
Category | Examples |
---|---|
Getting Started | quickstart / basic_usage / builder_pattern |
Configuration | explicit_config / proxy_example / custom_transport_config |
Streaming | test_streaming / cohere_stream |
Reliability | custom_transport |
Multi-Provider | config_driven_example / model_override_demo |
Model Management | model_management |
Batch Processing | batch_processing |
Function Calling | function_call_openai / function_call_exec |
Multimodal | multimodal_example |
Architecture Demo | architecture_progress |
Professional | ascii_horse / hello_groq |
📄 License
Dual-licensed:
- MIT
- Apache License (Version 2.0)
You may choose the license that best fits your project.
🤝 Contributing Guide
- Fork & clone the repository
- Create a feature branch: `git checkout -b feature/your-feature`
- Run the tests: `cargo test`
- Add examples if introducing new features
- Follow the adapter layering (prefer config-driven before custom)
- Open a PR with rationale + benchmarks (if there is a performance impact)
We value: clarity, test coverage, minimal surface area creep, incremental composability.
📚 Citation