pub struct UltrafastClient { /* private fields */ }
The main client for interacting with multiple AI/LLM providers.
The UltrafastClient provides a unified interface to multiple AI providers
with intelligent routing, circuit breakers, caching, and comprehensive error handling.
§Modes
The client supports two operation modes:
- Standalone Mode: Direct communication with AI providers
- Gateway Mode: Communication through the Ultrafast Gateway
§Features
- Multi-Provider Support: Integrate with OpenAI, Anthropic, Google, and more
- Intelligent Routing: Automatic provider selection and load balancing
- Circuit Breakers: Automatic failover and recovery
- Response Caching: Built-in caching for performance
- Rate Limiting: Per-provider rate limiting
- Retry Logic: Configurable retry policies with exponential backoff
- Performance Metrics: Real-time provider performance tracking
- Streaming Support: Real-time response streaming
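As one illustration of the retry policy above, an exponential-backoff delay schedule can be computed as follows. This is a minimal sketch of the general technique; the SDK's actual retry-policy type and configuration surface are not shown here:

```rust
use std::time::Duration;

// Exponential backoff: delay = base * 2^attempt, capped at `max`.
// The shift is clamped so the multiplier cannot overflow.
fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    let delay = base.checked_mul(1u32 << attempt.min(16)).unwrap_or(max);
    delay.min(max)
}

fn main() {
    let base = Duration::from_millis(100);
    let max = Duration::from_secs(10);
    let delays: Vec<_> = (0..5).map(|a| backoff_delay(base, max, a)).collect();
    // 100ms, 200ms, 400ms, 800ms, 1600ms — doubling each attempt
    assert_eq!(delays[0], Duration::from_millis(100));
    assert_eq!(delays[4], Duration::from_millis(1600));
    println!("delays: {:?}", delays);
}
```

In practice a random jitter is usually added to each delay so that many clients retrying after the same failure do not synchronize.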
§Examples
§Basic Usage
use ultrafast_models_sdk::{UltrafastClient, ChatRequest, Message};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = UltrafastClient::standalone()
        .with_openai("your-openai-key")
        .build()?;

    let request = ChatRequest {
        model: "gpt-4".to_string(),
        messages: vec![Message::user("Hello, world!")],
        ..Default::default()
    };

    let response = client.chat_completion(request).await?;
    println!("Response: {}", response.choices[0].message.content);
    Ok(())
}

§Multi-Provider Setup
let client = UltrafastClient::standalone()
    .with_openai("openai-key")
    .with_anthropic("anthropic-key")
    .with_google("google-key", "project-id")
    .with_routing_strategy(RoutingStrategy::LoadBalance {
        // One weight per configured provider, in registration order
        weights: vec![0.4, 0.3, 0.3],
    })
    .build()?;

§Gateway Mode
let client = UltrafastClient::gateway("http://localhost:3000")
    .with_api_key("your-gateway-key")
    .with_timeout(Duration::from_secs(30))
    .build()?;

§Thread Safety
The client is thread-safe and can be shared across threads using Arc<UltrafastClient>.
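The sharing pattern looks like this. The `Client` struct below is a stand-in used so the sketch is self-contained; the real `UltrafastClient` is `Send + Sync`, so the same `Arc` pattern applies to it directly:

```rust
use std::sync::Arc;
use std::thread;

// Stand-in for UltrafastClient; any Send + Sync type shares the same way.
struct Client {
    name: String,
}

impl Client {
    fn call(&self) -> String {
        format!("handled by {}", self.name)
    }
}

fn main() {
    let client = Arc::new(Client { name: "shared".to_string() });
    // Clone the Arc (cheap pointer copy) into each worker thread.
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let c = Arc::clone(&client);
            thread::spawn(move || c.call())
        })
        .collect();
    for h in handles {
        assert_eq!(h.join().unwrap(), "handled by shared");
    }
    println!("ok");
}
```

In async code the same pattern applies with `tokio::spawn` instead of `thread::spawn`.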
§Performance
- Latency: <1ms routing overhead
- Throughput: 10,000+ requests/second
- Memory: <100MB under normal load
- Concurrency: 100,000+ concurrent requests
§Error Handling
The client provides comprehensive error handling with specific error types:
- AuthenticationError: Invalid API keys or tokens
- RateLimitExceeded: Exceeded rate limits with retry information
- ProviderError: Provider-specific error messages
- NetworkError: Connection and timeout issues
- ValidationError: Invalid request parameters
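Handling these variants typically looks like the following. The enum below is a hypothetical stand-in written for illustration; consult the crate's `ClientError` type for the real variant names and payloads:

```rust
use std::time::Duration;

// Hypothetical stand-in mirroring the error categories listed above.
#[derive(Debug)]
enum ClientError {
    Authentication(String),
    RateLimitExceeded { retry_after: Duration },
    Provider(String),
    Network(String),
    Validation(String),
}

fn describe(err: &ClientError) -> String {
    match err {
        ClientError::Authentication(msg) => format!("check API key: {msg}"),
        ClientError::RateLimitExceeded { retry_after } => {
            format!("rate limited, retry in {:?}", retry_after)
        }
        ClientError::Provider(msg) => format!("provider failed: {msg}"),
        ClientError::Network(msg) => format!("network issue: {msg}"),
        ClientError::Validation(msg) => format!("bad request: {msg}"),
    }
}

fn main() {
    let err = ClientError::RateLimitExceeded { retry_after: Duration::from_secs(2) };
    assert!(describe(&err).starts_with("rate limited"));
    println!("{}", describe(&err));
}
```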
§Circuit Breakers
Each provider has an independent circuit breaker that automatically:
- Opens when failure threshold is reached
- Prevents requests to failing providers
- Tests recovery with limited requests
- Automatically closes when provider recovers
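The lifecycle above can be sketched as a small state machine. This is an illustrative reimplementation of the general pattern, not the SDK's internal type (names like `Breaker` are assumptions):

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum State { Closed, Open, HalfOpen }

// Closed -> Open after `threshold` consecutive failures;
// Open -> HalfOpen after `cooldown` elapses (recovery probe);
// HalfOpen -> Closed on success, back to Open on failure.
struct Breaker {
    state: State,
    failures: u32,
    threshold: u32,
    opened_at: Option<Instant>,
    cooldown: Duration,
}

impl Breaker {
    fn new(threshold: u32, cooldown: Duration) -> Self {
        Breaker { state: State::Closed, failures: 0, threshold, opened_at: None, cooldown }
    }

    // Should a request be allowed through right now?
    fn allow(&mut self) -> bool {
        if self.state == State::Open
            && self.opened_at.map_or(false, |t| t.elapsed() >= self.cooldown)
        {
            self.state = State::HalfOpen; // test recovery with a probe request
        }
        self.state != State::Open
    }

    fn record_success(&mut self) {
        self.failures = 0;
        self.state = State::Closed;
    }

    fn record_failure(&mut self) {
        self.failures += 1;
        if self.state == State::HalfOpen || self.failures >= self.threshold {
            self.state = State::Open;
            self.opened_at = Some(Instant::now());
        }
    }
}

fn main() {
    let mut b = Breaker::new(3, Duration::from_millis(10));
    for _ in 0..3 { b.record_failure(); }
    assert!(!b.allow()); // open: requests short-circuit without hitting the provider
    std::thread::sleep(Duration::from_millis(15));
    assert!(b.allow()); // half-open: one probe allowed
    b.record_success();
    assert!(b.allow()); // closed again
    println!("circuit breaker recovered");
}
```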
§Caching
The client supports multiple caching backends:
- In-Memory Cache: Fast local caching (default)
- Redis Cache: Distributed caching for multiple instances
- Custom Backends: Extensible cache system
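A minimal TTL cache in the spirit of the default in-memory backend can be sketched as follows (an illustration of the idea, assuming string keys and values; the SDK's actual cache trait and types are not shown):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Entries expire `ttl` after insertion; expired entries are
// evicted lazily on lookup.
struct TtlCache {
    ttl: Duration,
    entries: HashMap<String, (String, Instant)>,
}

impl TtlCache {
    fn new(ttl: Duration) -> Self {
        TtlCache { ttl, entries: HashMap::new() }
    }

    fn put(&mut self, key: &str, value: &str) {
        self.entries.insert(key.to_string(), (value.to_string(), Instant::now()));
    }

    fn get(&mut self, key: &str) -> Option<String> {
        let expired = match self.entries.get(key) {
            Some((_, inserted_at)) => inserted_at.elapsed() >= self.ttl,
            None => return None,
        };
        if expired {
            self.entries.remove(key); // lazy eviction
            return None;
        }
        self.entries.get(key).map(|(v, _)| v.clone())
    }
}

fn main() {
    let mut cache = TtlCache::new(Duration::from_millis(20));
    cache.put("prompt-hash", "cached response");
    assert_eq!(cache.get("prompt-hash").as_deref(), Some("cached response"));
    std::thread::sleep(Duration::from_millis(25));
    assert_eq!(cache.get("prompt-hash"), None); // expired after TTL
    println!("ttl cache ok");
}
```

For chat completions, the cache key is typically a hash of the request (model, messages, parameters), so only identical requests hit the cache.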
§Rate Limiting
Per-provider rate limiting with:
- Request-based limits (requests per minute/hour)
- Token-based limits (tokens per minute)
- Burst handling with configurable burst sizes
- Automatic retry with exponential backoff
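Request-based limits with burst handling are commonly implemented as a token bucket; a self-contained sketch of that technique (not the SDK's internal limiter) follows:

```rust
use std::time::Instant;

// Token bucket: refills at `rate` tokens/second up to `capacity`
// (the burst size); each request consumes one token.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    rate: f64,
    last: Instant,
}

impl TokenBucket {
    fn new(rate: f64, burst: f64) -> Self {
        TokenBucket { capacity: burst, tokens: burst, rate, last: Instant::now() }
    }

    fn try_acquire(&mut self) -> bool {
        // Refill proportionally to elapsed time, clamped to capacity.
        let now = Instant::now();
        let elapsed = now.duration_since(self.last).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.rate).min(self.capacity);
        self.last = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut bucket = TokenBucket::new(10.0, 2.0); // 10 req/s, burst of 2
    assert!(bucket.try_acquire());
    assert!(bucket.try_acquire());
    assert!(!bucket.try_acquire()); // burst exhausted; must wait for refill
    std::thread::sleep(std::time::Duration::from_millis(150));
    assert!(bucket.try_acquire()); // ~1.5 tokens refilled
    println!("token bucket ok");
}
```

Token-based limits (tokens per minute) follow the same shape, consuming the request's estimated token count instead of a flat 1.0 per call.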
§Metrics
Real-time performance metrics including:
- Provider response times
- Success/failure rates
- Circuit breaker status
- Cache hit rates
- Rate limit usage
§Configuration
The client is highly configurable with:
- Per-provider timeouts and retry policies
- Global and per-provider rate limits
- Circuit breaker thresholds and recovery settings
- Cache TTL and size limits
- Connection pool sizes and timeouts
§Best Practices
- Use connection pooling for high-throughput applications
- Enable caching for repeated requests
- Configure appropriate timeouts for your use case
- Use streaming for long responses
- Monitor circuit breaker status
- Implement proper error handling and retry logic
§See Also
- UltrafastClientBuilder - For building client instances
- Provider - For custom provider implementations
- Router - For custom routing strategies
- Cache - For custom caching backends
§Implementations
impl UltrafastClient
pub fn new() -> UltrafastClientBuilder
pub fn standalone() -> StandaloneClientBuilder
pub fn gateway(base_url: String) -> GatewayClientBuilder
pub async fn chat_completion(&self, request: ChatRequest) -> Result<ChatResponse, ClientError>
pub async fn stream_chat_completion(&self, request: ChatRequest) -> Result<Box<dyn Stream<Item = Result<StreamChunk, ClientError>> + Send + Unpin>, ClientError>
pub async fn get_last_used_provider(&self) -> Option<String>
pub async fn get_provider_circuit_state(&self, provider_id: &str) -> Option<CircuitState>
pub async fn is_provider_healthy(&self, provider_id: &str) -> bool
pub async fn get_circuit_breaker_metrics(&self) -> HashMap<String, CircuitBreakerMetrics>
pub async fn get_provider_health_status(&self) -> HashMap<String, bool>
pub async fn embedding(&self, request: EmbeddingRequest) -> Result<EmbeddingResponse, ClientError>
pub async fn image_generation(&self, request: ImageRequest) -> Result<ImageResponse, ClientError>
pub async fn audio_transcription(&self, request: AudioRequest) -> Result<AudioResponse, ClientError>
pub async fn text_to_speech(&self, request: SpeechRequest) -> Result<SpeechResponse, ClientError>
§Auto Trait Implementations
impl Freeze for UltrafastClient
impl !RefUnwindSafe for UltrafastClient
impl Send for UltrafastClient
impl Sync for UltrafastClient
impl Unpin for UltrafastClient
impl !UnwindSafe for UltrafastClient
§Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,

fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.