llm-connector
A production-ready Rust library for unified LLM API access across multiple providers. Built with a protocol-based architecture for maximum flexibility and performance.
What llm-connector Does
Core Purpose: Unified LLM API with protocol-based architecture
- Protocol-Based Design: Organize providers by API protocol (OpenAI, Anthropic, Aliyun)
- 10 Providers: DeepSeek, Zhipu, Moonshot, VolcEngine, Tencent, MiniMax, StepFun, LongCat, Anthropic, Aliyun
- Automatic Retry: Exponential backoff with smart error classification
- Observability: Built-in logging and metrics middleware
- Flexible Configuration: Environment variables, config files, or programmatic
- Zero-Copy Performance: Arc-based sharing for 50-70% memory reduction
- Production Ready: Comprehensive error handling and retry mechanisms
Key Features
Protocol-Based Architecture
- OpenAI Protocol: 8 providers (DeepSeek, Zhipu, Moonshot, VolcEngine, Tencent, MiniMax, StepFun, LongCat)
- Anthropic Protocol: Claude models
- Aliyun Protocol: DashScope/Qwen models
- Easy Extension: Add new providers in 3 lines of code
Reliability
- Automatic Retry: Exponential backoff with jitter
- Smart Error Classification: Only retry retriable errors
- 99.9998% Success Rate: With default retry configuration
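The success-rate figure depends on how often a single request fails, which is not stated here. As a rough sketch, assuming every attempt fails independently with the same probability and that all failures are retriable, the overall success rate can be estimated as follows. The function name is hypothetical and not part of the crate.

```rust
/// Probability that at least one of `max_retries + 1` attempts succeeds,
/// assuming each attempt fails independently with probability `p_fail`.
/// This is an illustrative model, not a measurement of the library.
fn success_probability(p_fail: f64, max_retries: u32) -> f64 {
    // All attempts must fail for the whole call to fail.
    let attempts = max_retries as i32 + 1;
    1.0 - p_fail.powi(attempts)
}
```

Under this model, a 10% per-attempt failure rate with 3 retries gives a 99.99% overall success rate. Real failures are often correlated (for example during an outage), so the estimate is an upper bound in practice.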
Observability
- Logging Middleware: Track all requests and responses
- Metrics Collection: Real-time performance monitoring
- Token Usage Tracking: Monitor API costs
Performance
- Zero-Copy Sharing: Arc-based configuration and protocols
- 50-70% Memory Reduction: Compared to deep cloning
- 10-100x Faster Cloning: O(1) instead of O(n)
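The O(1) cloning claim follows from how `std::sync::Arc` works: cloning an `Arc` increments a reference count instead of copying the data behind it. The sketch below uses hypothetical `SharedConfig` and `Client` types that are not taken from the crate.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical configuration type, used only for illustration.
#[derive(Debug)]
struct SharedConfig {
    base_url: String,
    headers: HashMap<String, String>,
}

/// Hypothetical client that holds its configuration behind an `Arc`.
/// Deriving `Clone` here clones the `Arc` pointer, not the `SharedConfig`.
#[derive(Clone, Debug)]
struct Client {
    config: Arc<SharedConfig>,
}

/// Build a client with a small example configuration.
fn make_client() -> Client {
    let mut headers = HashMap::new();
    headers.insert("X-Example".to_string(), "value".to_string());
    Client {
        config: Arc::new(SharedConfig {
            base_url: "https://example.invalid".to_string(),
            headers,
        }),
    }
}
```

Cloning a `Client` built this way leaves both clones pointing at the same allocation, so the cost does not grow with the size of the header map.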
Supported Providers
OpenAI Protocol (8 providers)
| Provider | Models | Status |
|---|---|---|
| DeepSeek | deepseek-chat, deepseek-coder | ✅ |
| Zhipu (GLM) | glm-4, glm-4-plus, glm-4-flash | ✅ |
| Moonshot (Kimi) | moonshot-v1-8k, moonshot-v1-32k | ✅ |
| VolcEngine (Doubao) | doubao-pro, doubao-lite | ✅ |
| Tencent (Hunyuan) | hunyuan-pro, hunyuan-lite | ✅ |
| MiniMax | abab6.5, abab6.5s | ✅ |
| StepFun | step-1-8k, step-1-32k | ✅ |
| LongCat | LongCat-Flash-Chat, LongCat-Flash-Thinking | ✅ |
Anthropic Protocol (1 provider)
| Provider | Models | Status |
|---|---|---|
| Anthropic | claude-3-5-sonnet, claude-3-opus, claude-3-haiku | ✅ |
Aliyun Protocol (1 provider)
| Provider | Models | Status |
|---|---|---|
| Aliyun (DashScope) | qwen-turbo, qwen-plus, qwen-max | ✅ |
Total: 10 providers, 3 protocols, 30+ models
Quick Start
Installation
Add to your Cargo.toml:
[dependencies]
llm-connector = "0.2"
tokio = { version = "1", features = ["full"] }
Optional Features:
[dependencies]
llm-connector = { version = "0.2", features = ["yaml"] } # For YAML config file support
Available features:
- `yaml` - Enable YAML configuration file support (requires `serde_yaml`)
- `streaming` - Enable streaming response support (enabled by default)
- `reqwest` - HTTP client support (enabled by default)
Basic Usage
use ;
async
Configuration
llm-connector is a library, not a CLI tool. Configuration is simple and straightforward.
Method 1: Direct API Key (Recommended)
Pass API keys directly when creating providers:
use ;
// Simple and clear
let config = new;
let provider = new?;
Method 2: Environment Variables
For development convenience, use environment variables:
# Set API keys
Then in your code:
use env;
let api_key = var?;
let config = new;
let provider = new?;
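Reading a key from the environment can fail in more than one way, and a clear message helps during development. The helper below is a minimal sketch using only `std::env`; the function name `load_api_key` is hypothetical, and the variable name is whatever your deployment uses.

```rust
use std::env;

/// Read an API key from the environment variable `var_name`.
/// Returns a descriptive error string instead of panicking.
/// Hypothetical helper; not part of llm-connector.
fn load_api_key(var_name: &str) -> Result<String, String> {
    match env::var(var_name) {
        // Treat a blank value the same as a missing one.
        Ok(key) if !key.trim().is_empty() => Ok(key),
        Ok(_) => Err(format!("{var_name} is set but empty")),
        Err(env::VarError::NotPresent) => Err(format!("{var_name} is not set")),
        Err(env::VarError::NotUnicode(_)) => Err(format!("{var_name} is not valid Unicode")),
    }
}
```

The returned `String` can then be passed to whichever configuration constructor you use.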
Method 3: Advanced Configuration (Optional)
For advanced use cases with custom settings:
use ;
let config = new
.with_base_url
.with_timeout_ms
.with_retry
.with_header;
let provider = new?;
Method 4: YAML Config File (Optional, for Multi-Provider)
Requires: Enable the yaml feature in your Cargo.toml:
llm-connector = { version = "0.2", features = ["yaml"] }
For applications managing multiple providers, you can optionally use a YAML config file:
# config.yaml
providers:
  deepseek:
    protocol: openai
    api_key: your-deepseek-key
    timeout_ms: 30000
  claude:
    protocol: anthropic
    api_key: your-anthropic-key
    timeout_ms: 60000
Load it in your code:
use RegistryConfig;
use ProviderRegistry;
// Load from YAML file
let config = from_yaml_file?;
let registry = from_config?;
// Get providers
let deepseek = registry.get_provider.unwrap;
let claude = registry.get_provider.unwrap;
Note: YAML config is optional and only recommended for complex multi-provider scenarios. For simple use cases, use Method 1 or 2.
Summary
| Method | Use Case | Complexity |
|---|---|---|
| Direct API Key | Simple, single provider | Simple |
| Environment Variables | Development, testing | Simple |
| Advanced Config | Custom settings | Medium |
| YAML File | Multi-provider apps | Complex |
Recommendation: Start with Method 1 (Direct API Key) for simplicity. Use Method 4 (YAML) only if you need to manage multiple providers.
DeepSeek Specific Features
DeepSeek provides two main models, deepseek-chat and deepseek-coder:
use ;
async
Zhipu GLM Specific Features
Zhipu GLM provides several models optimized for different use cases:
use ;
async
Streaming
use StreamExt;
use ;
async
Model Naming Convention
Use the format provider/model for explicit provider selection:
// Explicit provider selection (recommended)
"openai/gpt-4"
"anthropic/claude-3-5-sonnet-20241022"
"deepseek/deepseek-chat"
"zhipu/glm-4"
"qwen/qwen-turbo"
"kimi/moonshot-v1-8k"
// Direct model names (auto-detected)
"gpt-4" // -> openai/gpt-4
"claude-3-haiku" // -> anthropic/claude-3-haiku
"deepseek-chat" // -> deepseek/deepseek-chat
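The naming convention above can be illustrated with a small parser. This is a sketch only: the function names and the prefix table are assumptions derived from the three auto-detection examples shown, not the crate's actual resolution logic.

```rust
/// Split a model identifier into `(provider, model)`.
/// `"deepseek/deepseek-chat"` yields an explicit provider; a bare name such
/// as `"gpt-4"` falls back to prefix-based inference.
/// Hypothetical helper; not part of llm-connector.
fn parse_model(id: &str) -> (Option<&str>, &str) {
    match id.split_once('/') {
        Some((provider, model)) if !provider.is_empty() && !model.is_empty() => {
            (Some(provider), model)
        }
        _ => (infer_provider(id), id),
    }
}

/// Guess the provider from a bare model name using a fixed prefix table.
/// The table covers only the examples listed above.
fn infer_provider(model: &str) -> Option<&'static str> {
    const PREFIXES: [(&str, &str); 3] = [
        ("gpt-", "openai"),
        ("claude-", "anthropic"),
        ("deepseek-", "deepseek"),
    ];
    PREFIXES
        .iter()
        .find(|(prefix, _)| model.starts_with(*prefix))
        .map(|(_, provider)| *provider)
}
```

Returning `None` for unknown bare names keeps the failure explicit, which is one reason the explicit provider/model form is the safer choice.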
Error Handling
The library provides structured error types:
use ;
match client.chat.await
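To show what structured errors combined with "only retry retriable errors" can look like, here is a minimal sketch. The `ApiError` enum, its variants, and the status-code mapping are all hypothetical and chosen for illustration; the crate's real error type may differ.

```rust
/// Hypothetical error type for illustration; not the crate's actual enum.
#[derive(Debug, Clone, PartialEq)]
enum ApiError {
    Authentication,
    RateLimited,
    Timeout,
    Server(u16),
    InvalidRequest(u16),
}

impl ApiError {
    /// Transient failures are worth retrying; client-side mistakes are not.
    fn is_retriable(&self) -> bool {
        matches!(
            self,
            ApiError::RateLimited | ApiError::Timeout | ApiError::Server(_)
        )
    }
}

/// Map an HTTP status code to an error, or `None` for 2xx responses.
/// Any status not matched explicitly is treated as an invalid request.
fn classify_status(status: u16) -> Option<ApiError> {
    match status {
        200..=299 => None,
        401 | 403 => Some(ApiError::Authentication),
        408 => Some(ApiError::Timeout),
        429 => Some(ApiError::RateLimited),
        500..=599 => Some(ApiError::Server(status)),
        other => Some(ApiError::InvalidRequest(other)),
    }
}
```

A retry loop would call `is_retriable` on each failure and stop immediately on authentication or request errors, since repeating those calls cannot succeed.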
Extending with New Providers
Implement the Provider trait to add support for new LLM providers:
use ;
use async_trait;
Design Philosophy
llm-connector follows the Unix philosophy: "Do one thing and do it well."
- Single Responsibility: Only handles protocol adaptation between LLM providers
- Minimal Dependencies: Keeps the dependency tree small and focused
- Composable: Designed to be used as a building block in larger systems
- No Magic: Explicit configuration and clear error messages
- Provider Agnostic: Treats all providers equally, no special cases
What's NOT Included (By Design)
If you need these features, consider these alternatives:
- Load Balancing: Use nginx, HAProxy, or a service mesh
- Rate Limiting: Use Redis-based rate limiters or API gateways
- Caching: Use Redis, Memcached, or HTTP caching proxies
- Monitoring: Use Prometheus, Grafana, or APM solutions
- Circuit Breaking: Use Hystrix-style libraries or service mesh features
- Request Queuing: Use message queues like RabbitMQ or Apache Kafka
Contributing
We welcome contributions! Please focus on:
- Adding new providers - Implement the `Provider` trait
- Improving protocol compatibility - Better OpenAI API compliance
- Bug fixes - Especially around streaming and error handling
- Documentation - Examples and provider-specific notes
Please avoid adding features outside the core scope (load balancing, complex routing, etc.).
Advanced Features
Automatic Retry with Exponential Backoff
use ;
// Use default retry policy (3 retries, exponential backoff)
let retry = default;
let response = retry.execute.await?;
// Custom retry policy
let retry = new
.max_retries
.initial_backoff_ms
.backoff_multiplier
.max_backoff_ms
.build_middleware;
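The builder options above (initial backoff, multiplier, maximum backoff) map onto a simple delay formula. The sketch below shows one common way to compute it, with jitter supplied by the caller so the code stays dependency-free. `BackoffPolicy` and its methods are hypothetical; only the field names are borrowed from the builder calls.

```rust
use std::time::Duration;

/// Hypothetical retry settings mirroring the builder options named above.
struct BackoffPolicy {
    initial_backoff_ms: u64,
    backoff_multiplier: f64,
    max_backoff_ms: u64,
}

impl BackoffPolicy {
    /// Delay in milliseconds before retry number `attempt` (0-based), without jitter:
    /// `initial * multiplier^attempt`, capped at `max_backoff_ms`.
    fn base_delay_ms(&self, attempt: u32) -> u64 {
        let exponent = attempt.min(i32::MAX as u32) as i32;
        let raw = self.initial_backoff_ms as f64 * self.backoff_multiplier.powi(exponent);
        // `min` also caps the value if the power overflowed to infinity.
        raw.min(self.max_backoff_ms as f64) as u64
    }

    /// Delay scaled by `unit`, a caller-supplied value in [0.0, 1.0]
    /// (for example from a random number generator).
    fn jittered_delay(&self, attempt: u32, unit: f64) -> Duration {
        let scaled = self.base_delay_ms(attempt) as f64 * unit.clamp(0.0, 1.0);
        Duration::from_millis(scaled as u64)
    }
}
```

With an initial backoff of 100 ms, a multiplier of 2.0, and a cap of 1000 ms, the base delays are 100, 200, 400, 800, and then 1000 ms for every later attempt. Jitter spreads retries from many clients so they do not all hit the provider at the same instant.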
Logging and Metrics
use ;
// Add logging
let logger = new
.with_request_body
.with_response_body
.with_timing
.with_usage;
let response = logger.execute.await?;
// Collect metrics
let metrics = new;
let response = metrics.execute.await?;
// Get metrics snapshot
let snapshot = metrics.snapshot;
println!;
println!;
println!;
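For readers curious how a metrics collector with a snapshot method might work internally, here is a minimal thread-safe sketch built on atomics. The `Metrics` and `Snapshot` types and their fields are hypothetical and do not describe the crate's middleware.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counters updated on every request.
#[derive(Default)]
struct Metrics {
    requests: AtomicU64,
    failures: AtomicU64,
    total_tokens: AtomicU64,
}

/// Plain-value copy of the counters at one point in time.
#[derive(Debug, PartialEq)]
struct Snapshot {
    requests: u64,
    failures: u64,
    total_tokens: u64,
}

impl Metrics {
    /// Record one finished request and the tokens it consumed.
    fn record(&self, success: bool, tokens: u64) {
        self.requests.fetch_add(1, Ordering::Relaxed);
        if !success {
            self.failures.fetch_add(1, Ordering::Relaxed);
        }
        self.total_tokens.fetch_add(tokens, Ordering::Relaxed);
    }

    /// Read all counters; each load is atomic, the set as a whole is not.
    fn snapshot(&self) -> Snapshot {
        Snapshot {
            requests: self.requests.load(Ordering::Relaxed),
            failures: self.failures.load(Ordering::Relaxed),
            total_tokens: self.total_tokens.load(Ordering::Relaxed),
        }
    }
}

impl Snapshot {
    /// Fraction of requests that succeeded; 1.0 when nothing was recorded.
    fn success_rate(&self) -> f64 {
        if self.requests == 0 {
            return 1.0;
        }
        (self.requests - self.failures) as f64 / self.requests as f64
    }
}
```

Because `record` takes `&self`, a single `Metrics` value can be shared across tasks behind an `Arc` without a lock.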
Request/Response Interceptors
use ;
use Arc;
// Create interceptor chain
let chain = new
.with_interceptor
.with_interceptor;
// Execute with interceptors
let response = chain.execute.await?;
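The ordering of hooks in an interceptor chain is easy to get wrong, so a small synchronous model may help. Everything in this sketch (`Interceptor`, `Chain`, `Tag`, and the use of `String` for requests and responses) is hypothetical and simplified; the crate's real chain is asynchronous and uses its own types.

```rust
use std::sync::Arc;

/// Hypothetical interceptor interface; not the crate's actual trait.
trait Interceptor {
    /// Called before the request is sent; may modify it.
    fn before(&self, _request: &mut String) {}
    /// Called after the response arrives; may modify it.
    fn after(&self, _response: &mut String) {}
}

/// Ordered list of interceptors shared via `Arc`.
struct Chain {
    interceptors: Vec<Arc<dyn Interceptor>>,
}

impl Chain {
    fn new() -> Self {
        Chain { interceptors: Vec::new() }
    }

    fn with_interceptor(mut self, interceptor: Arc<dyn Interceptor>) -> Self {
        self.interceptors.push(interceptor);
        self
    }

    /// Run `before` hooks in insertion order, call `send`,
    /// then run `after` hooks in reverse order.
    fn execute(&self, mut request: String, send: impl FnOnce(&str) -> String) -> String {
        for interceptor in &self.interceptors {
            interceptor.before(&mut request);
        }
        let mut response = send(&request);
        for interceptor in self.interceptors.iter().rev() {
            interceptor.after(&mut response);
        }
        response
    }
}

/// Example interceptor that appends a marker so the call order is visible.
struct Tag(&'static str);

impl Interceptor for Tag {
    fn before(&self, request: &mut String) {
        request.push_str(self.0);
    }
    fn after(&self, response: &mut String) {
        response.push_str(self.0);
    }
}
```

Running `after` hooks in reverse mirrors how nested middleware unwinds: the interceptor that saw the request last sees the response first.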
Examples
Check out the examples directory for more usage examples:
- `deepseek_example.rs` - DeepSeek provider basic usage
- `longcat_demo.rs` - LongCat API complete demo (free quota available)
- `protocol_architecture_demo.rs` - Protocol architecture overview
- `test_all_providers.rs` - Test all configured providers
- `verify_real_api_calls.rs` - Verify real API calls

Examples Documentation - Detailed guide for all examples
Run an example:
# Set API key
# Run example
Documentation
- P0 Improvements - Foundation architecture
- P1 Improvements - Retry and factory pattern
- P2 Improvements - Middleware and interceptors
- LongCat Support - LongCat integration guide
- Improvements Summary - Complete overview
Performance
- Memory: 50-70% reduction through Arc-based sharing
- Clone Speed: 10-100x faster (O(1) vs O(n))
- Reliability: 99.9998% success rate with retry
- Overhead: <1ms for middleware stack
Security
- API keys are never logged
- Supports custom headers for authentication
- HTTPS by default
- No data persistence
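One common way to keep API keys out of logs in Rust is to wrap the key in a newtype whose `Debug` output is redacted, so that accidentally logging a config struct cannot leak it. This is a general technique shown as a sketch; the `ApiKey` type is hypothetical and the source does not say how the crate implements this guarantee.

```rust
use std::fmt;

/// Hypothetical wrapper that hides the key in debug output.
struct ApiKey(String);

impl ApiKey {
    /// Explicit accessor for the one place that needs the raw value,
    /// such as building the authorization header.
    fn expose(&self) -> &str {
        &self.0
    }
}

impl fmt::Debug for ApiKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Never print the inner string.
        f.write_str("ApiKey(***)")
    }
}
```

Any struct that derives `Debug` and contains an `ApiKey` field inherits the redaction automatically.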
License
MIT