§Ultrafast Client Module
This module provides the main client implementation for the Ultrafast Models SDK. It includes both standalone and gateway modes, with comprehensive provider management, routing, caching, and error handling.
§Overview
The client module provides:
- Dual Mode Operation: Standalone and gateway modes
- Provider Management: Multiple AI provider integration
- Intelligent Routing: Automatic provider selection
- Circuit Breakers: Automatic failover and recovery
- Caching Layer: Response caching for performance
- Retry Logic: Configurable retry policies
- Metrics Collection: Performance monitoring
- Streaming Support: Real-time response streaming
§Client Modes
§Standalone Mode
Direct communication with AI providers:
use ultrafast_models_sdk::{UltrafastClient, ChatRequest, Message, RoutingStrategy};

let client = UltrafastClient::standalone()
    .with_openai("your-openai-key")
    .with_anthropic("your-anthropic-key")
    .with_routing_strategy(RoutingStrategy::LoadBalance {
        weights: vec![0.6, 0.4],
    })
    .build()?;

let response = client.chat_completion(ChatRequest {
    model: "gpt-4".to_string(),
    messages: vec![Message::user("Hello!")],
    ..Default::default()
}).await?;
§Gateway Mode
Communication through the Ultrafast Gateway:
use std::time::Duration;
use ultrafast_models_sdk::UltrafastClient;

let client = UltrafastClient::gateway("http://localhost:3000")
    .with_api_key("your-gateway-key")
    .with_timeout(Duration::from_secs(30))
    .build()?;

let response = client.chat_completion(request).await?;
§Provider Integration
The client supports multiple providers:
- OpenAI: GPT-4, GPT-3.5, and other models
- Anthropic: Claude-3, Claude-2, Claude Instant
- Google: Gemini Pro, Gemini Pro Vision, PaLM
- Azure OpenAI: Azure-hosted OpenAI models
- Ollama: Local and remote Ollama instances
- Mistral AI: Mistral 7B, Mixtral models
- Cohere: Command, Command R models
- Custom Providers: Extensible provider system
§Routing Strategies
Multiple routing strategies for provider selection:
- Single: Route all requests to one provider
- Load Balance: Distribute requests across providers
- Failover: Primary provider with automatic fallback
- Conditional: Route based on request characteristics
- A/B Testing: Route for testing different providers
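To illustrate how weighted load balancing can choose a provider, here is a minimal std-only sketch. It is not the SDK's actual selection logic; `pick_weighted` is a hypothetical helper, and the random sample is passed in as a parameter so the function stays deterministic.

```rust
/// Pick a provider index from normalized weights, given a sample in [0, 1).
/// Illustrative sketch only; `RoutingStrategy::LoadBalance` in the SDK may
/// select providers differently.
fn pick_weighted(weights: &[f64], sample: f64) -> usize {
    let mut cumulative = 0.0;
    for (i, w) in weights.iter().enumerate() {
        cumulative += w;
        if sample < cumulative {
            return i;
        }
    }
    weights.len() - 1 // guard against floating-point rounding
}

fn main() {
    let weights = [0.6, 0.4];
    // Samples below 0.6 go to provider 0, the rest to provider 1.
    assert_eq!(pick_weighted(&weights, 0.25), 0);
    assert_eq!(pick_weighted(&weights, 0.80), 1);
}
```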
§Circuit Breakers
Automatic failover and recovery mechanisms:
- Closed State: Normal operation
- Open State: Provider failing, requests blocked
- Half-Open State: Testing if provider recovered
- Automatic Recovery: Automatic state transitions
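The state machine above can be sketched in a few lines of std-only Rust. This is an illustration of the closed/open/half-open transitions, not the SDK's implementation (which exposes more knobs, e.g. `half_open_max_calls`):

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum State { Closed, Open, HalfOpen }

/// Minimal circuit breaker sketch (illustrative only).
struct CircuitBreaker {
    state: State,
    failures: u32,
    failure_threshold: u32,
    opened_at: Option<Instant>,
    recovery_timeout: Duration,
}

impl CircuitBreaker {
    fn new(failure_threshold: u32, recovery_timeout: Duration) -> Self {
        Self { state: State::Closed, failures: 0, failure_threshold,
               opened_at: None, recovery_timeout }
    }

    /// Should a request be allowed through right now?
    fn allow(&mut self) -> bool {
        match self.state {
            State::Closed | State::HalfOpen => true,
            State::Open => {
                // After the recovery timeout, probe the provider again.
                if self.opened_at.map_or(false, |t| t.elapsed() >= self.recovery_timeout) {
                    self.state = State::HalfOpen;
                    true
                } else {
                    false
                }
            }
        }
    }

    fn record_success(&mut self) {
        self.failures = 0;
        self.state = State::Closed;
    }

    fn record_failure(&mut self) {
        self.failures += 1;
        if self.failures >= self.failure_threshold || self.state == State::HalfOpen {
            self.state = State::Open;
            self.opened_at = Some(Instant::now());
        }
    }
}

fn main() {
    let mut cb = CircuitBreaker::new(3, Duration::from_secs(60));
    assert!(cb.allow());
    cb.record_failure();
    cb.record_failure();
    cb.record_failure(); // threshold reached: breaker opens
    assert!(!cb.allow()); // requests blocked while open
    cb.record_success();
    assert!(cb.allow()); // closed again after a success
}
```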
§Caching
Built-in response caching:
- In-Memory Cache: Fast local caching
- Redis Cache: Distributed caching
- Automatic TTL: Configurable cache expiration
- Cache Keys: Intelligent cache key generation
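"Intelligent cache key generation" generally means hashing exactly the request fields that determine the response, so identical requests hit the same cache entry. A hypothetical std-only sketch (the SDK's actual key format is not documented here; `cache_key` and its fixed-point temperature encoding are assumptions):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Derive a cache key from the request fields that determine the response.
/// Floats (like temperature) are passed as fixed-point integers so hashing
/// stays deterministic.
fn cache_key(model: &str, messages: &[(&str, &str)], temperature_milli: u32) -> u64 {
    let mut h = DefaultHasher::new();
    model.hash(&mut h);
    for (role, content) in messages {
        role.hash(&mut h);
        content.hash(&mut h);
    }
    temperature_milli.hash(&mut h);
    h.finish()
}

fn main() {
    let a = cache_key("gpt-4", &[("user", "Hello!")], 700);
    let b = cache_key("gpt-4", &[("user", "Hello!")], 700);
    let c = cache_key("gpt-4", &[("user", "Hi!")], 700);
    assert_eq!(a, b); // identical requests share a key
    assert_ne!(a, c); // different content gets a different key
}
```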
§Retry Logic
Configurable retry policies:
- Exponential Backoff: Smart retry delays
- Max Retries: Configurable retry limits
- Retryable Errors: Automatic retry on specific errors
- Jitter: Randomized retry delays to prevent thundering herd
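The backoff-with-jitter policy above can be expressed as a small pure function. A sketch, not the SDK's `RetryPolicy`: the jitter value (normally drawn from an RNG) is passed in as a parameter so the function is deterministic and testable.

```rust
use std::time::Duration;

/// Delay before retry attempt `attempt` (0-based): base * 2^attempt,
/// capped at `cap`, plus up to 50% jitter scaled by `jitter` in [0, 1).
fn backoff_delay(base: Duration, attempt: u32, cap: Duration, jitter: f64) -> Duration {
    // Shift is clamped so the u32 multiplier cannot overflow.
    let exp = base.checked_mul(1u32 << attempt.min(16)).unwrap_or(cap);
    let capped = exp.min(cap);
    capped + capped.mul_f64(jitter * 0.5)
}

fn main() {
    let base = Duration::from_secs(1);
    let cap = Duration::from_secs(30);
    assert_eq!(backoff_delay(base, 0, cap, 0.0), Duration::from_secs(1));
    assert_eq!(backoff_delay(base, 3, cap, 0.0), Duration::from_secs(8));
    assert_eq!(backoff_delay(base, 10, cap, 0.0), cap); // capped at 30s
}
```

Spreading retries with jitter prevents many clients from retrying in lockstep (the thundering herd) after a shared outage.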
§Performance Features
- Connection Pooling: Reusable HTTP connections
- Request Batching: Batch multiple requests
- Compression: Automatic request/response compression
- Async Operations: Non-blocking I/O throughout
- Memory Efficiency: Minimal memory footprint
§Error Handling
Comprehensive error handling with specific error types:
- Authentication Errors: Invalid API keys or tokens
- Rate Limit Errors: Exceeded rate limits with retry info
- Provider Errors: Provider-specific error messages
- Network Errors: Connection and timeout issues
- Validation Errors: Invalid request parameters
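A taxonomy like this typically also drives the retry logic: only transient failures should be retried. A sketch under assumed names (the SDK defines its own error type; these variants are hypothetical):

```rust
use std::time::Duration;

/// Illustrative error taxonomy mirroring the categories above.
#[derive(Debug)]
enum ClientError {
    Authentication(String),
    RateLimited { retry_after: Option<Duration> },
    Provider { provider: String, message: String },
    Network(String),
    Validation(String),
}

impl ClientError {
    /// Only transient failures should be retried automatically.
    fn is_retryable(&self) -> bool {
        matches!(self, ClientError::RateLimited { .. } | ClientError::Network(_))
    }
}

fn main() {
    let rate = ClientError::RateLimited { retry_after: Some(Duration::from_secs(2)) };
    assert!(rate.is_retryable());
    assert!(!ClientError::Authentication("bad key".into()).is_retryable());
    assert!(!ClientError::Validation("missing model".into()).is_retryable());
}
```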
§Configuration
Highly configurable client behavior:
- Timeouts: Per-request and per-provider timeouts
- Rate Limits: Per-provider rate limiting
- Circuit Breakers: Failure thresholds and recovery settings
- Caching: Cache TTL and size limits
- Logging: Structured logging configuration
§Examples
§Basic Usage
use ultrafast_models_sdk::{UltrafastClient, ChatRequest, Message};
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = UltrafastClient::standalone()
        .with_openai("your-key")
        .build()?;

    let request = ChatRequest {
        model: "gpt-4".to_string(),
        messages: vec![Message::user("Hello, world!")],
        ..Default::default()
    };

    let response = client.chat_completion(request).await?;
    println!("Response: {}", response.choices[0].message.content);
    Ok(())
}
§Multi-Provider Setup
let client = UltrafastClient::standalone()
    .with_openai("openai-key")
    .with_anthropic("anthropic-key")
    .with_google("google-key", "project-id")
    .with_ollama("http://localhost:11434")
    .with_routing_strategy(RoutingStrategy::LoadBalance {
        weights: vec![0.4, 0.3, 0.2, 0.1],
    })
    .build()?;
§Advanced Configuration
use std::time::Duration;
use ultrafast_models_sdk::{UltrafastClient, ClientConfig};

let config = ClientConfig {
    timeout: Duration::from_secs(30),
    max_retries: 5,
    retry_delay: Duration::from_secs(1),
    user_agent: Some("MyApp/1.0".to_string()),
    ..Default::default()
};

let client = UltrafastClient::standalone()
    .with_config(config)
    .with_openai("your-key")
    .build()?;
§Circuit Breaker Configuration
use std::time::Duration;
use ultrafast_models_sdk::circuit_breaker::CircuitBreakerConfig;

let circuit_config = CircuitBreakerConfig {
    failure_threshold: 5,
    recovery_timeout: Duration::from_secs(60),
    request_timeout: Duration::from_secs(30),
    half_open_max_calls: 3,
};

let client = UltrafastClient::standalone()
    .with_openai("your-key")
    .with_circuit_breaker_config(circuit_config)
    .build()?;
§Caching Configuration
use std::time::Duration;
use ultrafast_models_sdk::cache::{CacheBackend, CacheConfig};

let cache_config = CacheConfig {
    enabled: true,
    ttl: Duration::from_secs(3600), // one hour
    max_size: 1000,
    backend: CacheBackend::Memory,
};

let client = UltrafastClient::standalone()
    .with_cache_config(cache_config)
    .with_openai("your-key")
    .build()?;
§Testing
The client includes testing utilities:
#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn test_client_creation() {
        let client = UltrafastClient::standalone()
            .with_openai("test-key")
            .build();
        assert!(client.is_ok());
    }

    #[tokio::test]
    async fn test_chat_completion() {
        let client = UltrafastClient::standalone()
            .with_openai("test-key")
            .build()
            .unwrap();

        let request = ChatRequest {
            model: "gpt-4".to_string(),
            messages: vec![Message::user("Hello")],
            ..Default::default()
        };

        let result = client.chat_completion(request).await;
        // Handle the result based on the test environment (mock vs. live keys).
        let _ = result;
    }
}
§Performance Tips
For optimal performance:
- Use Connection Pooling: Configure appropriate pool sizes
- Enable Caching: Cache responses for repeated requests
- Configure Timeouts: Set appropriate timeouts for your use case
- Use Streaming: For long responses, use streaming endpoints
- Batch Requests: Group multiple requests when possible
§Migration from Other SDKs
§From OpenAI SDK
// Before
use openai::Client;

let client = Client::new("your-key");
let response = client.chat().create(request).await?;

// After
use ultrafast_models_sdk::UltrafastClient;

let client = UltrafastClient::standalone()
    .with_openai("your-key")
    .build()?;
let response = client.chat_completion(request).await?;
§From Anthropic SDK
// Before
use anthropic::Client;

let client = Client::new("your-key");
let response = client.messages().create(request).await?;

// After
use ultrafast_models_sdk::UltrafastClient;

let client = UltrafastClient::standalone()
    .with_anthropic("your-key")
    .build()?;
let response = client.chat_completion(request).await?;
§Troubleshooting
Common issues and solutions:
§Authentication Errors
- Verify API keys are correct
- Check API key permissions
- Ensure proper provider configuration
§Rate Limit Issues
- Implement exponential backoff
- Use multiple API keys
- Configure appropriate rate limits
§Connection Issues
- Check network connectivity
- Verify provider endpoints
- Configure appropriate timeouts
§Contributing
We welcome contributions! Please see our contributing guide for details on:
- Code style and formatting
- Testing requirements
- Documentation standards
- Pull request process
Structs§
- ConnectionPool - Connection pool for HTTP connections.
- GatewayClientBuilder - Builder for creating gateway mode UltrafastClient instances.
- PooledConnection - A pooled HTTP connection.
- RetryPolicy - Retry policy configuration.
- StandaloneClientBuilder - Builder for creating standalone mode UltrafastClient instances.
- UltrafastClient - The main client for interacting with multiple AI/LLM providers.
- UltrafastClientBuilder - Builder for creating UltrafastClient instances with custom configuration.
Enums§
- ClientMode - Client operation mode.