Module client


§Ultrafast Client Module

This module provides the main client implementation for the Ultrafast Models SDK. It includes both standalone and gateway modes, with comprehensive provider management, routing, caching, and error handling.

§Overview

The client module provides:

  • Dual Mode Operation: Standalone and gateway modes
  • Provider Management: Multiple AI provider integration
  • Intelligent Routing: Automatic provider selection
  • Circuit Breakers: Automatic failover and recovery
  • Caching Layer: Response caching for performance
  • Retry Logic: Configurable retry policies
  • Metrics Collection: Performance monitoring
  • Streaming Support: Real-time response streaming

§Client Modes

§Standalone Mode

Direct communication with AI providers:

use ultrafast_models_sdk::{UltrafastClient, ChatRequest, Message, RoutingStrategy};

let client = UltrafastClient::standalone()
    .with_openai("your-openai-key")
    .with_anthropic("your-anthropic-key")
    .with_routing_strategy(RoutingStrategy::LoadBalance {
        weights: vec![0.6, 0.4],
    })
    .build()?;

let response = client.chat_completion(ChatRequest {
    model: "gpt-4".to_string(),
    messages: vec![Message::user("Hello!")],
    ..Default::default()
}).await?;

§Gateway Mode

Communication through the Ultrafast Gateway:

use std::time::Duration;

let client = UltrafastClient::gateway("http://localhost:3000")
    .with_api_key("your-gateway-key")
    .with_timeout(Duration::from_secs(30))
    .build()?;

// `request` is a ChatRequest built as in the standalone example above
let response = client.chat_completion(request).await?;

§Provider Integration

The client supports multiple providers:

  • OpenAI: GPT-4, GPT-3.5, and other models
  • Anthropic: Claude-3, Claude-2, Claude Instant
  • Google: Gemini Pro, Gemini Pro Vision, PaLM
  • Azure OpenAI: Azure-hosted OpenAI models
  • Ollama: Local and remote Ollama instances
  • Mistral AI: Mistral 7B, Mixtral models
  • Cohere: Command, Command R models
  • Custom Providers: Extensible provider system

§Routing Strategies

Multiple routing strategies for provider selection:

  • Single: Route all requests to one provider
  • Load Balance: Distribute requests across providers
  • Failover: Primary provider with automatic fallback
  • Conditional: Route based on request characteristics
  • A/B Testing: Split traffic between providers to compare them
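
As a sketch, a failover setup might look like the following; the Failover variant's field names (primary, fallbacks) are assumptions inferred from the strategy list above, not confirmed API:

use ultrafast_models_sdk::{UltrafastClient, RoutingStrategy};

// Sketch only: the Failover field names are assumed, not confirmed API.
let client = UltrafastClient::standalone()
    .with_openai("openai-key")
    .with_anthropic("anthropic-key")
    .with_routing_strategy(RoutingStrategy::Failover {
        primary: "openai".to_string(),
        fallbacks: vec!["anthropic".to_string()],
    })
    .build()?;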

§Circuit Breakers

Automatic failover and recovery mechanisms:

  • Closed State: Normal operation
  • Open State: Provider failing, requests blocked
  • Half-Open State: Testing if provider recovered
  • Automatic Recovery: Automatic state transitions
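
A minimal sketch of this three-state model, independent of the SDK's internal implementation:

use std::time::{Duration, Instant};

// Illustrative state machine for the states described above.
enum BreakerState {
    Closed,                  // normal operation, requests flow through
    Open { since: Instant }, // provider failing, requests blocked
    HalfOpen { calls: u32 }, // probing whether the provider recovered
}

fn advance(state: BreakerState, recovery_timeout: Duration) -> BreakerState {
    match state {
        // After the recovery timeout elapses, allow limited probe calls.
        BreakerState::Open { since } if since.elapsed() >= recovery_timeout => {
            BreakerState::HalfOpen { calls: 0 }
        }
        other => other,
    }
}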

§Caching

Built-in response caching:

  • In-Memory Cache: Fast local caching
  • Redis Cache: Distributed caching
  • Automatic TTL: Configurable cache expiration
  • Cache Keys: Intelligent cache key generation
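
As an illustration of key generation, a key can be derived by hashing the request fields that determine the response; the SDK's actual scheme may differ:

use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Sketch only: hash the fields that affect the response so identical
// requests map to the same cache entry.
fn cache_key(model: &str, messages: &[String]) -> u64 {
    let mut hasher = DefaultHasher::new();
    model.hash(&mut hasher);
    for message in messages {
        message.hash(&mut hasher);
    }
    hasher.finish()
}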

§Retry Logic

Configurable retry policies:

  • Exponential Backoff: Smart retry delays
  • Max Retries: Configurable retry limits
  • Retryable Errors: Automatic retry on specific errors
  • Jitter: Randomized retry delays to prevent thundering herd
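
A dependency-free sketch of the delay computation; the SDK's RetryPolicy struct configures this behavior, so the function below is illustrative only:

use std::time::{Duration, SystemTime, UNIX_EPOCH};

// Exponential backoff capped at `max`, with crude full jitter drawn from
// the clock to keep the sketch dependency-free.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let exp = base.saturating_mul(2u32.saturating_pow(attempt));
    let capped = exp.min(max);
    let seed = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.subsec_nanos() as u64)
        .unwrap_or(0)
        .wrapping_mul(0x9E37_79B9_7F4A_7C15); // spread bits across the range
    // Full jitter: pick a delay roughly uniformly in [0, capped).
    Duration::from_nanos(seed % (capped.as_nanos() as u64).max(1))
}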

§Performance Features

  • Connection Pooling: Reusable HTTP connections
  • Request Batching: Group multiple requests into fewer round trips (see the sketch after this list)
  • Compression: Automatic request/response compression
  • Async Operations: Non-blocking I/O throughout
  • Memory Efficiency: Minimal memory footprint
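
For request batching, concurrent dispatch with the futures crate approximates the idea; whether the SDK exposes a dedicated batching API is not shown here, so treat this as a sketch:

use futures::future::join_all;

// Fire several requests concurrently and await them together.
// `client` and `requests` (a Vec<ChatRequest>) are built as in the
// earlier examples.
let results = join_all(requests.into_iter().map(|r| client.chat_completion(r))).await;
// Each entry is an independent Result; handle failures per request.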

§Error Handling

Comprehensive error handling with specific error types:

  • Authentication Errors: Invalid API keys or tokens
  • Rate Limit Errors: Exceeded rate limits with retry info
  • Provider Errors: Provider-specific error messages
  • Network Errors: Connection and timeout issues
  • Validation Errors: Invalid request parameters
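
The categories above map naturally onto an error enum; the one below is a hypothetical mirror of those categories, not the SDK's real error type:

use std::time::Duration;

// Hypothetical error enum: variant names mirror the list above and are
// assumptions, not confirmed SDK types.
enum ClientError {
    Authentication(String),
    RateLimit { retry_after: Option<Duration> },
    Provider(String),
    Network(String),
    Validation(String),
}

fn handle(err: ClientError) {
    match err {
        // Not worth retrying: fix the credentials instead.
        ClientError::Authentication(msg) => eprintln!("check your API key: {msg}"),
        // Honor the provider's retry-after hint before retrying.
        ClientError::RateLimit { retry_after } => {
            eprintln!("rate limited, retry after {retry_after:?}")
        }
        ClientError::Provider(msg) => eprintln!("provider error: {msg}"),
        ClientError::Network(msg) => eprintln!("network error: {msg}"),
        ClientError::Validation(msg) => eprintln!("invalid request: {msg}"),
    }
}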

§Configuration

Highly configurable client behavior:

  • Timeouts: Per-request and per-provider timeouts
  • Rate Limits: Per-provider rate limiting
  • Circuit Breakers: Failure thresholds and recovery settings
  • Caching: Cache TTL and size limits
  • Logging: Structured logging configuration

§Examples

§Basic Usage

use ultrafast_models_sdk::{UltrafastClient, ChatRequest, Message};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = UltrafastClient::standalone()
        .with_openai("your-key")
        .build()?;

    let request = ChatRequest {
        model: "gpt-4".to_string(),
        messages: vec![Message::user("Hello, world!")],
        ..Default::default()
    };

    let response = client.chat_completion(request).await?;
    println!("Response: {}", response.choices[0].message.content);
    Ok(())
}

§Multi-Provider Setup

use ultrafast_models_sdk::{UltrafastClient, RoutingStrategy};

let client = UltrafastClient::standalone()
    .with_openai("openai-key")
    .with_anthropic("anthropic-key")
    .with_google("google-key", "project-id")
    .with_ollama("http://localhost:11434")
    .with_routing_strategy(RoutingStrategy::LoadBalance {
        weights: vec![0.4, 0.3, 0.2, 0.1],
    })
    .build()?;

§Advanced Configuration

use std::time::Duration;
use ultrafast_models_sdk::{UltrafastClient, ClientConfig};

let config = ClientConfig {
    timeout: Duration::from_secs(30),
    max_retries: 5,
    retry_delay: Duration::from_secs(1),
    user_agent: Some("MyApp/1.0".to_string()),
    ..Default::default()
};

let client = UltrafastClient::standalone()
    .with_config(config)
    .with_openai("your-key")
    .build()?;

§Circuit Breaker Configuration

use std::time::Duration;
use ultrafast_models_sdk::circuit_breaker::CircuitBreakerConfig;

let circuit_config = CircuitBreakerConfig {
    failure_threshold: 5,
    recovery_timeout: Duration::from_secs(60),
    request_timeout: Duration::from_secs(30),
    half_open_max_calls: 3,
};

let client = UltrafastClient::standalone()
    .with_openai("your-key")
    .with_circuit_breaker_config(circuit_config)
    .build()?;

§Caching Configuration

use std::time::Duration;
use ultrafast_models_sdk::cache::{CacheBackend, CacheConfig};

let cache_config = CacheConfig {
    enabled: true,
    ttl: Duration::from_secs(60 * 60), // one hour; `Duration::from_hours` is not stable
    max_size: 1000,
    backend: CacheBackend::Memory,
};

let client = UltrafastClient::standalone()
    .with_cache_config(cache_config)
    .with_openai("your-key")
    .build()?;

§Testing

The client includes testing utilities:

#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn test_client_creation() {
        let client = UltrafastClient::standalone()
            .with_openai("test-key")
            .build();
        assert!(client.is_ok());
    }

    #[tokio::test]
    async fn test_chat_completion() {
        let client = UltrafastClient::standalone()
            .with_openai("test-key")
            .build()
            .unwrap();

        let request = ChatRequest {
            model: "gpt-4".to_string(),
            messages: vec![Message::user("Hello")],
            ..Default::default()
        };

        // Without a valid API key this call will fail; assert on the
        // outcome appropriate to your test environment.
        let _result = client.chat_completion(request).await;
    }
}

§Performance Tips

For optimal performance:

  • Use Connection Pooling: Configure appropriate pool sizes
  • Enable Caching: Cache responses for repeated requests
  • Configure Timeouts: Set appropriate timeouts for your use case
  • Use Streaming: For long responses, use streaming endpoints (see the sketch after this list)
  • Batch Requests: Group multiple requests when possible
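
A minimal streaming sketch, assuming the client exposes a chat_completion_stream method and the chunk layout shown below; both are assumptions, not confirmed SDK signatures:

use futures::StreamExt;

// Assumed API: method name and chunk fields are illustrative only.
// `client` and `request` are built as in the earlier examples.
let mut stream = client.chat_completion_stream(request).await?;
while let Some(chunk) = stream.next().await {
    let chunk = chunk?;
    if let Some(delta) = chunk.choices.first().and_then(|c| c.delta.content.as_deref()) {
        print!("{delta}");
    }
}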

§Migration from Other SDKs

§From OpenAI SDK

// Before
use openai::Client;
let client = Client::new("your-key");
let response = client.chat().create(request).await?;

// After
use ultrafast_models_sdk::UltrafastClient;
let client = UltrafastClient::standalone()
    .with_openai("your-key")
    .build()?;
let response = client.chat_completion(request).await?;

§From Anthropic SDK

// Before
use anthropic::Client;
let client = Client::new("your-key");
let response = client.messages().create(request).await?;

// After
use ultrafast_models_sdk::UltrafastClient;
let client = UltrafastClient::standalone()
    .with_anthropic("your-key")
    .build()?;
let response = client.chat_completion(request).await?;

§Troubleshooting

Common issues and solutions:

§Authentication Errors

  • Verify API keys are correct
  • Check API key permissions
  • Ensure proper provider configuration

§Rate Limit Issues

  • Implement exponential backoff
  • Use multiple API keys
  • Configure appropriate rate limits

§Connection Issues

  • Check network connectivity
  • Verify provider endpoints
  • Configure appropriate timeouts

§Contributing

We welcome contributions! Please see our contributing guide for details on:

  • Code style and formatting
  • Testing requirements
  • Documentation standards
  • Pull request process

Structs§

ConnectionPool
Connection pool for HTTP connections.
GatewayClientBuilder
Builder for creating gateway mode UltrafastClient instances.
PooledConnection
A pooled HTTP connection.
RetryPolicy
Retry policy configuration.
StandaloneClientBuilder
Builder for creating standalone mode UltrafastClient instances.
UltrafastClient
The main client for interacting with multiple AI/LLM providers.
UltrafastClientBuilder
Builder for creating UltrafastClient instances with custom configuration.

Enums§

ClientMode
Client operation mode.