# AI-lib: Unified AI SDK for Rust

A unified Rust SDK that provides a single interface to multiple AI providers using a hybrid architecture.
## Overview
ai-lib is a unified AI SDK for Rust that offers a single, consistent interface for interacting with multiple large language model providers. It uses a hybrid architecture that balances developer ergonomics with provider-specific features.
Note: upgrade guides and PR notes have been moved to the `docs/` directory to keep the repository root clean. See `docs/UPGRADE_0.2.0.md` and `docs/PR_0.2.0.md` for migration and PR details.
## Supported AI Providers
- ✅ Groq (config-driven) — supports llama3, mixtral models
- ✅ xAI Grok (config-driven) — supports grok models
- ✅ DeepSeek (config-driven) — supports deepseek-chat, deepseek-reasoner
- ✅ Anthropic Claude (config-driven) — supports claude-3.5-sonnet
- ✅ Google Gemini (independent adapter) — supports gemini-1.5-pro, gemini-1.5-flash
- ✅ OpenAI (independent adapter) — supports gpt-3.5-turbo, gpt-4 (may require a proxy in some regions)
- ✅ Qwen / Tongyi Qianwen (Alibaba Cloud) (config-driven) — supports Qwen family (OpenAI-compatible)
- ✅ Cohere (independent adapter) — supports command/generate models (SSE streaming + fallback)
- ✅ Baidu Wenxin (Baidu ERNIE) (config-driven) — supports ernie-3.5, ernie-4.0 (OpenAI-compatible endpoints via the Qianfan platform; may require AK/SK and OAuth)
- ✅ Tencent Hunyuan (config-driven) — supports the hunyuan family (OpenAI-compatible endpoints; cloud account and keys required)
- ✅ iFlytek Spark (config-driven) — supports spark models (OpenAI-compatible, good for mixed voice+text scenarios)
- ✅ Moonshot / Kimi (config-driven) — supports kimi series (OpenAI-compatible, suitable for long-text scenarios)
- ✅ Mistral (independent adapter) — supports mistral models
- ✅ Hugging Face Inference (config-driven) — supports hub-hosted models
- ✅ TogetherAI (config-driven) — supports together.ai hosted models
- ✅ Azure OpenAI (config-driven) — supports Azure-hosted OpenAI endpoints
- ✅ Ollama (config-driven / local) — supports local Ollama instances
## Core features

### 🚀 Zero-cost provider switching
Switch between AI providers with a single line of code — the unified API ensures a seamless experience:
```rust
use ai_lib::{AiClient, Provider};

// Instant provider switching — same API, different backends
let groq_client = AiClient::new(Provider::Groq)?;
let gemini_client = AiClient::new(Provider::Gemini)?;
let claude_client = AiClient::new(Provider::Anthropic)?;
```
Runtime selection is supported (for example via environment variables or other logic).
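For example, the provider can be chosen at startup from an environment variable (a minimal sketch; the `AI_PROVIDER` variable name is illustrative, not part of the SDK):

```rust
use ai_lib::{AiClient, Provider};

// Pick a backend at runtime; AI_PROVIDER is a hypothetical variable name.
let provider = match std::env::var("AI_PROVIDER").as_deref() {
    Ok("gemini") => Provider::Gemini,
    Ok("anthropic") => Provider::Anthropic,
    _ => Provider::Groq,
};
let client = AiClient::new(provider)?;
```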
### 🌊 Universal streaming support
Provides realtime streaming responses for all providers; SSE parsing and fallback emulation ensure consistent behavior:
```rust
use futures::StreamExt;

let mut stream = client.chat_completion_stream(request).await?;
while let Some(item) = stream.next().await {
    let chunk = item?;
    // Print each incremental delta as it arrives (OpenAI-style chunk shape).
    if let Some(content) = chunk.choices.first().and_then(|c| c.delta.content.as_deref()) {
        print!("{}", content);
    }
}
```
Includes a cancel handle (`CancelHandle`) and a planned backpressure API, suitable for low-latency UI applications.
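A hedged sketch of cancelling a stream from another task (the constructor that returns the handle alongside the stream is an assumption, not a confirmed signature):

```rust
// Assumption: a streaming constructor variant hands back a CancelHandle.
let (mut stream, cancel) = client.chat_completion_stream_with_cancel(request).await?;
tokio::spawn(async move {
    // Cancel from the UI side after a deadline.
    tokio::time::sleep(std::time::Duration::from_secs(5)).await;
    cancel.cancel();
});
```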
### 🔄 Enterprise-grade reliability and error handling
- Automatic retries with exponential backoff: intelligently retry transient failures (e.g., timeouts, rate limits).
- Smart error classification: distinguish retryable errors (network issues) from permanent errors (authentication failures) and provide recovery guidance.
- Proxy support: HTTP/HTTPS proxies with auth for enterprise environments.
- Timeout management: configurable timeouts and graceful degradation to ensure production stability.
Example error handling:

```rust
// A sketch: exact error variants are provider-specific, but each error is
// classified so callers can tell transient failures from permanent ones.
match client.chat_completion(request).await {
    Ok(response) => println!("{:?}", response.choices[0].message.content),
    Err(e) => {
        // Retryable failures (timeouts, rate limits) were already retried with
        // exponential backoff; anything surfacing here is worth inspecting.
        eprintln!("request failed: {e}");
    }
}
```
### ⚡ Hybrid architecture
- Config-driven adapters: for OpenAI-compatible APIs; require minimal wiring (≈15 lines) and inherit SSE streaming, proxy, and upload behaviors.
- Independent adapters: full control for providers with unique APIs, including custom auth and response parsing.
- Four-layer design: unified client layer, adapter layer, transport layer (HttpTransport with proxy and retry), and common types — ensuring type safety with no extra runtime dependencies.
- Benefits: major code reuse, flexible extensibility, and automatic feature inheritance.
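As a rough illustration of the config-driven path, a new OpenAI-compatible provider amounts to a small static description (a hedged sketch; `ProviderConfig` and its field names, and `GenericAdapter`, are assumptions based on the design above, not a verbatim API):

```rust
// Hypothetical config for an OpenAI-compatible provider; field names are
// illustrative. The generic adapter then inherits SSE streaming, proxy,
// and upload behavior from the shared transport layer.
let config = ProviderConfig {
    base_url: "https://api.example.com/v1".to_string(),
    chat_endpoint: "/chat/completions".to_string(),
    api_key_env: "EXAMPLE_API_KEY".to_string(),
    ..Default::default()
};
let adapter = GenericAdapter::new(config)?;
```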
### 📊 Metrics & observability

A minimal metrics surface (the `Metrics` and `Timer` traits) with a default `NoopMetrics` implementation. Adapters include request counters and duration timers and accept injected metrics implementations for testing or production monitoring.
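A hedged sketch of wiring in a metrics implementation (`with_metrics` and `NoopMetrics::new` are assumed constructor names; only the `Metrics`, `Timer`, and `NoopMetrics` type names are confirmed above):

```rust
use std::sync::Arc;

// Inject a metrics sink into an adapter; swap NoopMetrics for a custom
// Metrics implementation in production monitoring or tests.
let metrics = Arc::new(NoopMetrics::new());
let adapter = GenericAdapter::with_metrics(config, metrics)?;
```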
### 📁 Multimodal & file support
- Supports text, JSON, image, and audio content types.
- File upload / inline helpers with size checks and upload fallbacks.
- Function-calling / tool support: unified `Tool` and `FunctionCall` types with cross-provider parsing and execution.
Minimal tool-calling example:
```rust
let mut req = ChatCompletionRequest::new(model, messages);
// `tool` is a Tool value describing one callable function
// (name, description, JSON-schema parameters).
req.functions = Some(vec![tool]);
// Let the provider decide when to invoke the function; the policy
// type and variant name here are assumptions.
req.function_call = Some(FunctionCallPolicy::Auto);
```
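On the response side, a returned call can be inspected in a provider-agnostic way (a sketch; the field names follow the unified `FunctionCall` type and are assumptions):

```rust
// Inspect a tool call the model asked for; name/arguments fields are assumed.
if let Some(call) = response.choices[0].message.function_call.clone() {
    println!("model requested {} with args {:?}", call.name, call.arguments);
}
```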
### 🔧 Dependency injection & testability

- An object-safe transport abstraction (`DynHttpTransportRef`) allows injecting mock transports for unit tests.
- Adapter constructors support custom transport injection.
Example:

```rust
// Inject a mock transport (e.g. from a unit test). `my_test_transport` is any
// implementation of the transport abstraction; the adapter type shown is
// illustrative.
let transport: DynHttpTransportRef = my_test_transport.into();
let adapter = GenericAdapter::with_transport_ref(transport, config)?;
```
### 🚀 Performance & scalability

- Benchmarks: memory <2MB, client overhead <1ms, streaming chunk latency <10ms.
- Connection pooling: automatic reuse with tunable `reqwest::Client` options (max idle connections, idle timeout).
- Custom configuration: timeouts, proxy, and pool parameters via `HttpTransportConfig`.
Custom pool example:

```rust
use std::time::Duration;

let reqwest_client = reqwest::Client::builder()
    .pool_max_idle_per_host(16)
    .pool_idle_timeout(Duration::from_secs(90))
    .build()?;
let transport = HttpTransport::with_client(reqwest_client);
```
## Quickstart

### Installation

Add to your `Cargo.toml`:
```toml
[dependencies]
ai-lib = "0.2.0"
tokio = { version = "1.0", features = ["full"] }
futures = "0.3"
```
### One-minute tryout (no API key required)

Construct a client and request without making network calls (a sketch; the request and message shapes follow the crate's unified chat types, and the model name is just an example):

```rust
use ai_lib::{AiClient, ChatCompletionRequest, Content, Message, Provider, Role};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Build a client and a request locally; nothing is sent over the network.
    let client = AiClient::new(Provider::Groq)?;
    let request = ChatCompletionRequest::new(
        "llama3-8b-8192".to_string(),
        vec![Message {
            role: Role::User,
            content: Content::Text("Hello, ai-lib!".to_string()),
            function_call: None,
        }],
    );
    println!("request ready for model: {}", request.model);
    Ok(())
}
```
### Real requests

Set API keys and a proxy via environment variables:

- API keys: e.g. `GROQ_API_KEY`, `OPENAI_API_KEY`, etc.
- Proxy: `AI_PROXY_URL` supports HTTP/HTTPS and auth.
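With a key exported, the same request performs a real call (a sketch, reusing the client and request from the tryout above):

```rust
// Requires GROQ_API_KEY (and optionally AI_PROXY_URL) in the environment.
let response = client.chat_completion(request).await?;
println!("{:?}", response.choices[0].message.content);
```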
## Examples & tests

- Hybrid architecture: `cargo run --example test_hybrid_architecture`
- Streaming: `cargo run --example test_streaming_improved`
- Retry behavior: `cargo run --example test_retry_mechanism`
- Provider tests: `cargo run --example test_groq_generic`, etc.
## Provider details

| Provider | Status | Architecture | Streaming | Models | Notes |
|---|---|---|---|---|---|
| Groq | ✅ production-ready | config-driven | ✅ | llama3-8b/70b, mixtral-8x7b | fast inference, proxy support |
| DeepSeek | ✅ production-ready | config-driven | ✅ | deepseek-chat, deepseek-reasoner | China-focused, direct access |
| Anthropic | ✅ production-ready | config-driven | ✅ | claude-3.5-sonnet | custom auth required |
| Google Gemini | ✅ production-ready | independent adapter | 🔄 | gemini-1.5-pro/flash | URL parameter auth |
| OpenAI | ✅ production-ready | independent adapter | ✅ | gpt-3.5-turbo, gpt-4 | may require proxy in some regions |
| Qwen | ✅ production-ready | config-driven | ✅ | Qwen family | uses `DASHSCOPE_API_KEY` |
| Baidu Wenxin (ERNIE) | ✅ production-ready | config-driven | ✅ | ernie-3.5, ernie-4.0 | OpenAI-compatible via Qianfan; may require AK/SK and OAuth — see Baidu Cloud console |
| Tencent Hunyuan | ✅ production-ready | config-driven | ✅ | hunyuan family | OpenAI-compatible endpoints (cloud account & keys required) — see Tencent docs |
| iFlytek Spark | ✅ production-ready | config-driven | ✅ | spark family | voice+text friendly, OpenAI-compatible endpoints — see iFlytek docs |
| Moonshot Kimi | ✅ production-ready | config-driven | ✅ | kimi family | OpenAI-compatible endpoints; suitable for long-text scenarios — see Moonshot platform |
## Roadmap

### Implemented
- Hybrid architecture and universal streaming support.
- Enterprise-grade error handling, retry, and proxy support.
- Multimodal primitives, function-calling, and metrics scaffold.
- Transport injection and upload tests.
### Planned
- Advanced backpressure API and benchmark CI.
- Connection pool tuning and plugin system.
- Built-in caching and load balancing.
## Contributing
Contributions are welcome (new providers, performance work, docs).
- Clone: `git clone https://github.com/hiddenpath/ai-lib.git`
- Create a branch: `git checkout -b feature/new-feature`
- Test: `cargo test`
- Open a PR.
## Community & support
- 📖 Docs: docs.rs/ai-lib
- 🐛 Issues: GitHub Issues
- 💬 Discussions: GitHub Discussions
## Acknowledgements & license
Thanks to the AI providers and the Rust community. Dual licensed: MIT or Apache 2.0.
Citation: