ai-lib 🦀✨
A unified, reliable, high-performance multi-provider AI SDK for Rust
A production-grade, provider-agnostic SDK that provides a unified Rust API for 17+ AI platforms (OpenAI, Groq, Anthropic, Gemini, Mistral, Cohere, Azure OpenAI, Ollama, DeepSeek, Qwen, Baidu ERNIE, Tencent Hunyuan, iFlytek Spark, Kimi, HuggingFace, TogetherAI, xAI Grok, etc.).
Eliminates fragmented authentication flows, divergent streaming formats, inconsistent error semantics, model naming differences, and incompatible function calling conventions. Scale from one-liner scripts to multi-region, multi-provider systems without rewriting integration code.
🚀 Core Value (TL;DR)
ai-lib unifies:
- Chat and multimodal requests across heterogeneous model providers
- Unified streaming (unified SSE parser + JSONL protocol) with consistent deltas
- Function calling semantics (including OpenAI-style tool_calls alignment)
- Reasoning model support (structured, streaming, JSON formats)
- Batch processing workflows
- Reliability primitives (retry, backoff, timeout, proxy, health checks, load strategies)
- Model selection (cost/performance/health/weighted)
- Observability hooks
- Progressive configuration (env vars → builder → explicit injection → custom transport)
You focus on product logic; ai-lib handles infrastructure friction.
⚙️ Quick Start
Installation
```toml
[dependencies]
ai-lib = "0.3.2"
tokio = { version = "1", features = ["full"] }
futures = "0.3"
```
Fastest Way
```rust
use ai_lib::{AiClient, Provider};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // One line gets a provider-agnostic client; the API key
    // (e.g. GROQ_API_KEY) is read from the environment.
    let _client = AiClient::new(Provider::Groq)?;
    Ok(())
}
```
Standard Chat
```rust
use ai_lib::{AiClient, ChatCompletionRequest, Content, Message, Provider, Role};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = AiClient::new(Provider::Groq)?;
    let request = ChatCompletionRequest::new(
        "llama-3.1-8b-instant".to_string(), // model name is illustrative
        vec![Message {
            role: Role::User,
            content: Content::Text("Hello, ai-lib!".to_string()),
            function_call: None,
        }],
    );
    let response = client.chat_completion(request).await?;
    println!("{}", response.choices[0].message.content.as_text());
    Ok(())
}
```
Streaming
```rust
use futures::StreamExt;

// `client` and `request` as in the standard chat example above.
let mut stream = client.chat_completion_stream(request).await?;
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    // Chunk shape is illustrative: each item carries an incremental content delta.
    if let Some(choice) = chunk.choices.first() {
        if let Some(delta) = &choice.delta.content {
            print!("{delta}");
        }
    }
}
```
🧠 Core Concepts
Concept | Purpose |
---|---|
Provider | Enumerates all supported providers |
AiClient / Builder | Main entry point; configuration encapsulation |
ChatCompletionRequest | Unified request payload |
Message / Content | Text, image, audio (structured content planned) |
Function / Tool | Unified function calling semantics |
Streaming Event | Provider-standardized delta streams |
ModelManager / ModelArray | Strategy-driven model orchestration |
ConnectionOptions | Explicit runtime overrides |
Metrics Trait | Custom observability integration |
Transport | Injectable HTTP + streaming implementation |
💡 Key Feature Clusters
- Unified provider abstraction (no per-provider branching)
- Universal streaming (unified SSE parser + JSONL; with fallback simulation)
- Multimodal primitives (text/image/audio)
- Function calling (consistent tool patterns; tool_calls compatibility; see the sketch after this list)
- Reasoning model support (structured, streaming, JSON formats)
- Batch processing (sequential/bounded concurrency/smart strategies)
- Reliability: retry, error classification, timeout, proxy, pooling, interceptor pipeline (features)
- Model management: performance/cost/health/round-robin/weighted
- Observability: pluggable metrics and timing
- Security: isolation, no default content logging
- Extensibility: custom transport, metrics, strategy injection
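A hedged sketch of the unified tool pattern. The helper names (`Tool::new_json`, `with_functions`, `FunctionCallPolicy::Auto`) and the response field layout are indicative, not verbatim API, and `serde_json` is assumed as a dependency; see the function_call examples for the exact surface:

```rust
use ai_lib::{
    AiClient, ChatCompletionRequest, Content, FunctionCallPolicy, Message, Provider, Role, Tool,
};
use serde_json::json;

async fn tool_demo() -> Result<(), Box<dyn std::error::Error>> {
    let client = AiClient::new(Provider::OpenAI)?;

    // One tool, described with a JSON-schema parameter block.
    let weather = Tool::new_json(
        "get_weather",
        Some("Look up current weather for a city".to_string()),
        json!({
            "type": "object",
            "properties": { "city": { "type": "string" } },
            "required": ["city"]
        }),
    );

    let request = ChatCompletionRequest::new(
        "gpt-4o-mini".to_string(), // model name is illustrative
        vec![Message {
            role: Role::User,
            content: Content::Text("What's the weather in Paris?".to_string()),
            function_call: None,
        }],
    )
    .with_functions(vec![weather])
    .with_function_call(FunctionCallPolicy::Auto("auto".to_string()));

    let response = client.chat_completion(request).await?;
    // Inspect the first choice for a tool call requested by the model.
    if let Some(call) = &response.choices[0].message.function_call {
        println!("model requested {} with {:?}", call.name, call.arguments);
    }
    Ok(())
}
```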
🌍 Supported Providers (Snapshot)
Provider | Adapter Type | Streaming | Notes |
---|---|---|---|
Groq | Config-driven | ✅ | Ultra-low latency |
OpenAI | Independent | ✅ | Function calling |
Anthropic (Claude) | Config-driven | ✅ | High quality |
Google Gemini | Independent | ✅ | Uses x-goog-api-key header |
Mistral | Independent | ✅ | European models |
Cohere | Independent | ✅ | RAG-optimized |
HuggingFace | Config-driven | ✅ | Open models |
TogetherAI | Config-driven | ✅ | Cost-effective |
DeepSeek | Config-driven | ✅ | Reasoning models |
Qwen | Config-driven | ✅ | Chinese ecosystem |
Baidu ERNIE | Config-driven | ✅ | Enterprise CN |
Tencent Hunyuan | Config-driven | ✅ | Cloud integration |
iFlytek Spark | Config-driven | ✅ | Voice + multimodal |
Moonshot Kimi | Config-driven | ✅ | Long context |
Azure OpenAI | Config-driven | ✅ | Enterprise compliance |
Ollama | Config-driven | ✅ | Local/air-gapped |
xAI Grok | Config-driven | ✅ | Real-time oriented |
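Because the integration surface is identical across rows, swapping providers is a one-line change (variant names are indicative):

```rust
use ai_lib::{AiClient, Provider};

// The same request and streaming code path serves every provider above.
let anthropic = AiClient::new(Provider::Anthropic)?;
let qwen = AiClient::new(Provider::Qwen)?;
let local = AiClient::new(Provider::Ollama)?;
```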
🔑 Configuration & Diagnostics
Environment Variables (Convention-based)
```bash
# API keys (convention: <PROVIDER>_API_KEY)
export GROQ_API_KEY=...
export OPENAI_API_KEY=...
export ANTHROPIC_API_KEY=...

# Optional base URLs (override a provider's default endpoint)
export OLLAMA_BASE_URL=http://localhost:11434

# Proxy
export AI_PROXY_URL=http://proxy.internal:8080

# Global timeout in seconds, HTTP connection pool tuning (pooling is enabled by
# default), and cost metrics (with the `cost_metrics` feature) are likewise
# env-driven; see the crate docs for the exact variable names.
```
Explicit Overrides
```rust
use ai_lib::{AiClient, ConnectionOptions, Provider};

// Field set is illustrative; ConnectionOptions carries explicit runtime overrides.
let client = AiClient::with_options(
    Provider::OpenAI,
    ConnectionOptions {
        base_url: Some("https://gateway.internal/v1".to_string()),
        proxy: None,
        api_key: std::env::var("OPENAI_API_KEY").ok(),
        timeout: Some(std::time::Duration::from_secs(30)),
        disable_proxy: false,
    },
)?;
```
Backpressure & Concurrency Cap (Optional)
- Simple: pass `concurrency_limit` to the batch APIs
- Global: set a max concurrency gate via the builder:
```rust
use ai_lib::{AiClientBuilder, Provider};

let client = AiClientBuilder::new(Provider::Groq)
    .with_max_concurrency(32) // permit count is illustrative
    .for_production()
    .build()?;
```
Notes:
- The gate acquires a permit for `chat_completion` and streaming calls, and releases it when the call finishes.
- If no permits are available, `RateLimitExceeded` is returned; combine with retry/queueing if needed (see the sketch below).
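A minimal retry sketch for that case. Error matching is shown generically; the exact `RateLimitExceeded` variant lives in the crate's error enum, the response type name is an assumption, and `ChatCompletionRequest: Clone` is assumed:

```rust
use std::time::Duration;
use ai_lib::{AiClient, ChatCompletionRequest, ChatCompletionResponse};

// Retry a bounded number of times, pausing briefly between attempts so a
// saturated concurrency gate has a chance to free permits.
async fn chat_with_retry(
    client: &AiClient,
    request: ChatCompletionRequest,
    attempts: usize,
) -> Result<ChatCompletionResponse, Box<dyn std::error::Error>> {
    let mut last_err = None;
    for _ in 0..attempts {
        match client.chat_completion(request.clone()).await {
            Ok(resp) => return Ok(resp),
            Err(e) => {
                last_err = Some(e);
                tokio::time::sleep(Duration::from_millis(200)).await;
            }
        }
    }
    Err(Box::new(last_err.expect("attempts must be > 0")))
}
```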
🛡️ Reliability & Resilience
Aspect | Capability |
---|---|
Retry | Exponential backoff + classification |
Errors | Distinguish transient vs permanent |
Timeout | Per-request configurable |
Proxy | Global/per-connection/disable |
Connection Pool | Tunable size + lifecycle |
Health Checks | Endpoint status + policy-based avoidance |
Load Strategies | Round-robin/weighted/health/performance/cost |
Fallback | Multi-provider arrays/manual layering |
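Manual layering from the Fallback row can be as small as this sketch (production code would first classify transient vs permanent errors; type names follow the crate's conventions as assumed here):

```rust
use ai_lib::{AiClient, ChatCompletionRequest, ChatCompletionResponse, Provider};

// Try a primary provider; on failure, replay the same request on a backup.
async fn chat_with_fallback(
    request: ChatCompletionRequest,
) -> Result<ChatCompletionResponse, Box<dyn std::error::Error>> {
    let primary = AiClient::new(Provider::Groq)?;
    match primary.chat_completion(request.clone()).await {
        Ok(resp) => Ok(resp),
        Err(_) => {
            let backup = AiClient::new(Provider::OpenAI)?;
            Ok(backup.chat_completion(request).await?)
        }
    }
}
```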
📊 Observability & Metrics
Implement the `Metrics` trait to bridge Prometheus, OpenTelemetry, StatsD, etc.:

```rust
use std::sync::Arc;
use ai_lib::{AiClient, Provider};

// `MyMetrics` is your own type implementing the crate's `Metrics` trait;
// the argument shape here is illustrative.
let client = AiClient::new_with_metrics(Provider::Groq, Arc::new(MyMetrics))?;
```
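If you prefer not to wire a full `Metrics` implementation, latency can also be measured externally. A self-contained sketch (the response type name is an assumption):

```rust
use std::time::Instant;
use ai_lib::{AiClient, ChatCompletionRequest, ChatCompletionResponse};

// Time one call and hand the latency to any metrics backend you like.
async fn timed_chat(
    client: &AiClient,
    request: ChatCompletionRequest,
) -> Result<(ChatCompletionResponse, u128), Box<dyn std::error::Error>> {
    let start = Instant::now();
    let response = client.chat_completion(request).await?;
    Ok((response, start.elapsed().as_millis()))
}
```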
Feature Flags (Optional)
- `interceptors`: Interceptor trait & pipeline
- `unified_sse`: Common SSE parser
- `unified_transport`: Shared reqwest client factory
- `cost_metrics`: Minimal cost accounting via env vars
- `routing_mvp`: Enable `ModelArray` routing
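For example, opting into the unified SSE parser and `ModelArray` routing in Cargo.toml:

```toml
[dependencies]
ai-lib = { version = "0.3.2", features = ["unified_sse", "routing_mvp"] }
```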
Enterprise Features
For advanced enterprise capabilities, consider [ai-lib-pro]:
- Advanced Routing: Policy-driven routing, health monitoring, automatic failover
- Enterprise Observability: Structured logging, metrics, distributed tracing
- Cost Management: Centralized pricing tables and budget tracking
- Quota Management: Tenant/organization quotas and rate limiting
- Audit & Compliance: Comprehensive audit trails with redaction
- Security: Envelope encryption and key management
- Configuration: Hot-reload configuration management
ai-lib-pro layers on top of the open-source ai-lib without breaking changes, providing a seamless upgrade path for enterprise users.
Tiering: OSS vs PRO
- OSS (this crate): unified API, streaming, retries/timeouts/proxy, configurable pool, lightweight rate limiting and backpressure, batch concurrency controls. Simple env-driven setup; zero external services required.
- PRO: multi-tenant quotas & priorities, adaptive concurrency/limits, policy-driven routing, centralized config and hot-reload, deep observability/exporters, audit/compliance, cost catalog and budget guardrails. Drop-in upgrade without code changes.
🗂️ Examples Directory (in /examples)
Category | Examples |
---|---|
Getting Started | quickstart / basic_usage / builder_pattern |
Configuration | explicit_config / proxy_example / custom_transport_config |
Streaming | test_streaming / cohere_stream |
Reliability | custom_transport |
Multi-Provider | config_driven_example / model_override_demo |
Model Management | model_management |
Batch Processing | batch_processing |
Function Calling | function_call_openai / function_call_exec |
Multimodal | multimodal_example |
Architecture Demo | architecture_progress |
Professional | ascii_horse / hello_groq |
📄 License
Dual-licensed:
- MIT
- Apache License (Version 2.0)
You may choose the license that best fits your project.
🤝 Contributing Guide
- Fork & clone the repository
- Create a feature branch: `git checkout -b feature/your-feature`
- Run the tests: `cargo test`
- Add examples if introducing new features
- Follow the adapter layering (prefer config-driven before custom)
- Open a PR with rationale + benchmarks (if there is a performance impact)
We value: clarity, test coverage, minimal surface area creep, incremental composability.
📚 Citation