Module client


§Ultrafast Client Module

This module provides the main client implementation for the Ultrafast Models SDK. It includes both standalone and gateway modes, with comprehensive provider management, routing, caching, and error handling.

§Overview

The client module provides:

  • Dual Mode Operation: Standalone and gateway modes
  • Provider Management: Multiple AI provider integration
  • Intelligent Routing: Automatic provider selection
  • Circuit Breakers: Automatic failover and recovery
  • Caching Layer: Response caching for performance
  • Retry Logic: Configurable retry policies
  • Metrics Collection: Performance monitoring
  • Streaming Support: Real-time response streaming

§Client Modes

§Standalone Mode

Direct communication with AI providers:

use ultrafast_models_sdk::{UltrafastClient, ChatRequest, Message, RoutingStrategy};

let client = UltrafastClient::standalone()
    .with_openai("your-openai-key")
    .with_anthropic("your-anthropic-key")
    .with_routing_strategy(RoutingStrategy::LoadBalance {
        weights: vec![0.6, 0.4],
    })
    .build()?;

let response = client.chat_completion(ChatRequest {
    model: "gpt-4".to_string(),
    messages: vec![Message::user("Hello!")],
    ..Default::default()
}).await?;

§Gateway Mode

Communication through the Ultrafast Gateway:

use std::time::Duration;

let client = UltrafastClient::gateway("http://localhost:3000")
    .with_api_key("your-gateway-key")
    .with_timeout(Duration::from_secs(30))
    .build()?;

// `request` is a ChatRequest built as in the standalone example above
let response = client.chat_completion(request).await?;

§Provider Integration

The client supports multiple providers:

  • OpenAI: GPT-4, GPT-3.5, and other models
  • Anthropic: Claude-3, Claude-2, Claude Instant
  • Google: Gemini Pro, Gemini Pro Vision, PaLM
  • Azure OpenAI: Azure-hosted OpenAI models
  • Ollama: Local and remote Ollama instances
  • Mistral AI: Mistral 7B, Mixtral models
  • Cohere: Command, Command R models
  • Custom Providers: Extensible provider system

§Routing Strategies

Multiple routing strategies for provider selection:

  • Single: Route all requests to one provider
  • Load Balance: Distribute requests across providers
  • Failover: Primary provider with automatic fallback
  • Conditional: Route based on request characteristics
  • A/B Testing: Split traffic between providers to compare them
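
As a sketch, a failover setup might look like the following; the Failover variant's field names (primary, fallbacks) are assumptions inferred from the strategy list above, not confirmed API:

use ultrafast_models_sdk::{UltrafastClient, RoutingStrategy};

// Sketch only: the Failover field names are assumed, not confirmed API.
let client = UltrafastClient::standalone()
    .with_openai("openai-key")
    .with_anthropic("anthropic-key")
    .with_routing_strategy(RoutingStrategy::Failover {
        primary: "openai".to_string(),
        fallbacks: vec!["anthropic".to_string()],
    })
    .build()?;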

§Circuit Breakers

Automatic failover and recovery mechanisms:

  • Closed State: Normal operation
  • Open State: Provider failing, requests blocked
  • Half-Open State: Testing if provider recovered
  • Automatic Recovery: Automatic state transitions
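
A minimal sketch of this three-state model, independent of the SDK's internal implementation:

use std::time::{Duration, Instant};

// Illustrative state machine for the states described above.
enum BreakerState {
    Closed,                  // normal operation, requests flow through
    Open { since: Instant }, // provider failing, requests blocked
    HalfOpen { calls: u32 }, // probing whether the provider recovered
}

fn advance(state: BreakerState, recovery_timeout: Duration) -> BreakerState {
    match state {
        // After the recovery timeout elapses, allow limited probe calls.
        BreakerState::Open { since } if since.elapsed() >= recovery_timeout => {
            BreakerState::HalfOpen { calls: 0 }
        }
        other => other,
    }
}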

§Caching

Built-in response caching:

  • In-Memory Cache: Fast local caching
  • Redis Cache: Distributed caching
  • Automatic TTL: Configurable cache expiration
  • Cache Keys: Intelligent cache key generation
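
As an illustration of key generation, a key can be derived by hashing the request fields that determine the response; the SDK's actual scheme may differ:

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Sketch only: hash the fields that affect the response so identical
// requests map to the same cache entry.
fn cache_key(model: &str, messages: &[String]) -> u64 {
    let mut hasher = DefaultHasher::new();
    model.hash(&mut hasher);
    for message in messages {
        message.hash(&mut hasher);
    }
    hasher.finish()
}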

§Retry Logic

Configurable retry policies:

  • Exponential Backoff: Smart retry delays
  • Max Retries: Configurable retry limits
  • Retryable Errors: Automatic retry on specific errors
  • Jitter: Randomized retry delays to prevent thundering herd
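
A dependency-free sketch of the delay computation; the SDK's RetryPolicy struct configures this behavior, so the function below is illustrative only:

use std::time::{Duration, SystemTime, UNIX_EPOCH};

// Exponential backoff capped at `max`, with crude full jitter drawn from
// the clock to keep the sketch dependency-free.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let exp = base.saturating_mul(2u32.saturating_pow(attempt));
    let capped = exp.min(max);
    let seed = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.subsec_nanos() as u64)
        .unwrap_or(0)
        .wrapping_mul(0x9E37_79B9_7F4A_7C15); // spread bits across the range
    // Full jitter: pick a delay roughly uniformly in [0, capped).
    Duration::from_nanos(seed % (capped.as_nanos() as u64).max(1))
}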

§Performance Features

  • Connection Pooling: Reusable HTTP connections
  • Request Batching: Group multiple requests into fewer round trips (see the sketch after this list)
  • Compression: Automatic request/response compression
  • Async Operations: Non-blocking I/O throughout
  • Memory Efficiency: Minimal memory footprint
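
For request batching, concurrent dispatch with the futures crate approximates the idea; whether the SDK exposes a dedicated batching API is not shown here, so treat this as a sketch:

use futures::future::join_all;

// Fire several requests concurrently and await them together.
// `client` and `requests` (a Vec<ChatRequest>) are built as in the
// earlier examples.
let results = join_all(requests.into_iter().map(|r| client.chat_completion(r))).await;
// Each entry is an independent Result; handle failures per request.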

§Error Handling

Comprehensive error handling with specific error types:

  • Authentication Errors: Invalid API keys or tokens
  • Rate Limit Errors: Exceeded rate limits with retry info
  • Provider Errors: Provider-specific error messages
  • Network Errors: Connection and timeout issues
  • Validation Errors: Invalid request parameters
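
The categories above map naturally onto an error enum; the one below is a hypothetical mirror of those categories, not the SDK's real error type:

use std::time::Duration;

// Hypothetical error enum: variant names mirror the list above and are
// assumptions, not confirmed SDK types.
enum ClientError {
    Authentication(String),
    RateLimit { retry_after: Option<Duration> },
    Provider(String),
    Network(String),
    Validation(String),
}

fn handle(err: ClientError) {
    match err {
        // Not worth retrying: fix the credentials instead.
        ClientError::Authentication(msg) => eprintln!("check your API key: {msg}"),
        // Honor the provider's retry-after hint before retrying.
        ClientError::RateLimit { retry_after } => {
            eprintln!("rate limited, retry after {retry_after:?}")
        }
        ClientError::Provider(msg) => eprintln!("provider error: {msg}"),
        ClientError::Network(msg) => eprintln!("network error: {msg}"),
        ClientError::Validation(msg) => eprintln!("invalid request: {msg}"),
    }
}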

§Configuration

Highly configurable client behavior:

  • Timeouts: Per-request and per-provider timeouts
  • Rate Limits: Per-provider rate limiting
  • Circuit Breakers: Failure thresholds and recovery settings
  • Caching: Cache TTL and size limits
  • Logging: Structured logging configuration

§Examples

§Basic Usage

use ultrafast_models_sdk::{UltrafastClient, ChatRequest, Message};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = UltrafastClient::standalone()
        .with_openai("your-key")
        .build()?;

    let request = ChatRequest {
        model: "gpt-4".to_string(),
        messages: vec![Message::user("Hello, world!")],
        ..Default::default()
    };

    let response = client.chat_completion(request).await?;
    println!("Response: {}", response.choices[0].message.content);
    Ok(())
}

§Multi-Provider Setup

use ultrafast_models_sdk::{UltrafastClient, RoutingStrategy};

let client = UltrafastClient::standalone()
    .with_openai("openai-key")
    .with_anthropic("anthropic-key")
    .with_google("google-key", "project-id")
    .with_ollama("http://localhost:11434")
    .with_routing_strategy(RoutingStrategy::LoadBalance {
        weights: vec![0.4, 0.3, 0.2, 0.1],
    })
    .build()?;

§Advanced Configuration

use std::time::Duration;
use ultrafast_models_sdk::{UltrafastClient, ClientConfig};

let config = ClientConfig {
    timeout: Duration::from_secs(30),
    max_retries: 5,
    retry_delay: Duration::from_secs(1),
    user_agent: Some("MyApp/1.0".to_string()),
    ..Default::default()
};

let client = UltrafastClient::standalone()
    .with_config(config)
    .with_openai("your-key")
    .build()?;

§Circuit Breaker Configuration

use std::time::Duration;
use ultrafast_models_sdk::circuit_breaker::CircuitBreakerConfig;

let circuit_config = CircuitBreakerConfig {
    failure_threshold: 5,
    recovery_timeout: Duration::from_secs(60),
    request_timeout: Duration::from_secs(30),
    half_open_max_calls: 3,
};

let client = UltrafastClient::standalone()
    .with_openai("your-key")
    .with_circuit_breaker_config(circuit_config)
    .build()?;

§Caching Configuration

use std::time::Duration;
use ultrafast_models_sdk::cache::{CacheBackend, CacheConfig};

let cache_config = CacheConfig {
    enabled: true,
    ttl: Duration::from_secs(60 * 60), // one hour; `Duration::from_hours` is not stable
    max_size: 1000,
    backend: CacheBackend::Memory,
};

let client = UltrafastClient::standalone()
    .with_cache_config(cache_config)
    .with_openai("your-key")
    .build()?;

§Testing

The client includes testing utilities:

#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn test_client_creation() {
        let client = UltrafastClient::standalone()
            .with_openai("test-key")
            .build();
        assert!(client.is_ok());
    }

    #[tokio::test]
    async fn test_chat_completion() {
        let client = UltrafastClient::standalone()
            .with_openai("test-key")
            .build()
            .unwrap();

        let request = ChatRequest {
            model: "gpt-4".to_string(),
            messages: vec![Message::user("Hello")],
            ..Default::default()
        };

        // Without a valid API key this call will fail; assert on the
        // outcome appropriate to your test environment.
        let _result = client.chat_completion(request).await;
    }
}

§Performance Tips

For optimal performance:

  • Use Connection Pooling: Configure appropriate pool sizes
  • Enable Caching: Cache responses for repeated requests
  • Configure Timeouts: Set appropriate timeouts for your use case
  • Use Streaming: For long responses, use streaming endpoints (see the sketch after this list)
  • Batch Requests: Group multiple requests when possible
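
A minimal streaming sketch, assuming the client exposes a chat_completion_stream method and the chunk layout shown below; both are assumptions, not confirmed SDK signatures:

use futures::StreamExt;

// Assumed API: method name and chunk fields are illustrative only.
// `client` and `request` are built as in the earlier examples.
let mut stream = client.chat_completion_stream(request).await?;
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    if let Some(delta) = chunk.choices.first().and_then(|c| c.delta.content.as_deref()) {
        print!("{delta}");
    }
}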

§Migration from Other SDKs

§From OpenAI SDK

// Before
use openai::Client;
let client = Client::new("your-key");
let response = client.chat().create(request).await?;

// After
use ultrafast_models_sdk::UltrafastClient;
let client = UltrafastClient::standalone()
    .with_openai("your-key")
    .build()?;
let response = client.chat_completion(request).await?;

§From Anthropic SDK

// Before
use anthropic::Client;
let client = Client::new("your-key");
let response = client.messages().create(request).await?;

// After
use ultrafast_models_sdk::UltrafastClient;
let client = UltrafastClient::standalone()
    .with_anthropic("your-key")
    .build()?;
let response = client.chat_completion(request).await?;

§Troubleshooting

Common issues and solutions:

§Authentication Errors

  • Verify API keys are correct
  • Check API key permissions
  • Ensure proper provider configuration

§Rate Limit Issues

  • Implement exponential backoff
  • Use multiple API keys
  • Configure appropriate rate limits

§Connection Issues

  • Check network connectivity
  • Verify provider endpoints
  • Configure appropriate timeouts

§Contributing

We welcome contributions! Please see our contributing guide for details on:

  • Code style and formatting
  • Testing requirements
  • Documentation standards
  • Pull request process

Structs§

ConnectionPool
Connection pool for HTTP connections.
GatewayClientBuilder
Builder for creating gateway mode UltrafastClient instances.
PooledConnection
A pooled HTTP connection.
RetryPolicy
Retry policy configuration.
StandaloneClientBuilder
Builder for creating standalone mode UltrafastClient instances.
UltrafastClient
The main client for interacting with multiple AI/LLM providers.
UltrafastClientBuilder
Builder for creating UltrafastClient instances with custom configuration.

Enums§

ClientMode
Client operation mode.