pub struct UltrafastClient { /* private fields */ }
The main client for interacting with multiple AI/LLM providers.
The UltrafastClient provides a unified interface to multiple AI providers
with intelligent routing, circuit breakers, caching, and comprehensive error handling.
§Modes
The client supports two operation modes:
- Standalone Mode: Direct communication with AI providers
- Gateway Mode: Communication through the Ultrafast Gateway
§Features
- Multi-Provider Support: Integrate with OpenAI, Anthropic, Google, and more
- Intelligent Routing: Automatic provider selection and load balancing
- Circuit Breakers: Automatic failover and recovery
- Response Caching: Built-in caching for performance
- Rate Limiting: Per-provider rate limiting
- Retry Logic: Configurable retry policies with exponential backoff
- Performance Metrics: Real-time provider performance tracking
- Streaming Support: Real-time response streaming
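As one illustration of the retry policy above, an exponential-backoff delay schedule can be computed as follows. This is a minimal sketch of the general technique; the SDK's actual retry-policy type and configuration surface are not shown here:

```rust
use std::time::Duration;

// Exponential backoff: delay = base * 2^attempt, capped at `max`.
// The shift is clamped so the multiplier cannot overflow.
fn backoff_delay(base: Duration, max: Duration, attempt: u32) -> Duration {
    let delay = base.checked_mul(1u32 << attempt.min(16)).unwrap_or(max);
    delay.min(max)
}

fn main() {
    let base = Duration::from_millis(100);
    let max = Duration::from_secs(10);
    let delays: Vec<_> = (0..5).map(|a| backoff_delay(base, max, a)).collect();
    // 100ms, 200ms, 400ms, 800ms, 1600ms — doubling each attempt
    assert_eq!(delays[0], Duration::from_millis(100));
    assert_eq!(delays[4], Duration::from_millis(1600));
    println!("delays: {:?}", delays);
}
```

In practice a random jitter is usually added to each delay so that many clients retrying after the same failure do not synchronize.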
§Examples
§Basic Usage
use ultrafast_models_sdk::{UltrafastClient, ChatRequest, Message};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = UltrafastClient::standalone()
        .with_openai("your-openai-key")
        .build()?;

    let request = ChatRequest {
        model: "gpt-4".to_string(),
        messages: vec![Message::user("Hello, world!")],
        ..Default::default()
    };

    let response = client.chat_completion(request).await?;
    println!("Response: {}", response.choices[0].message.content);
    Ok(())
}

§Multi-Provider Setup
let client = UltrafastClient::standalone()
    .with_openai("openai-key")
    .with_anthropic("anthropic-key")
    .with_google("google-key", "project-id")
    .with_routing_strategy(RoutingStrategy::LoadBalance {
        // One weight per configured provider, in registration order
        weights: vec![0.4, 0.3, 0.3],
    })
    .build()?;

§Gateway Mode
let client = UltrafastClient::gateway("http://localhost:3000")
    .with_api_key("your-gateway-key")
    .with_timeout(Duration::from_secs(30))
    .build()?;

§Thread Safety
The client is thread-safe and can be shared across threads using Arc<UltrafastClient>.
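The sharing pattern looks like this. The `Client` struct below is a stand-in used so the sketch is self-contained; the real `UltrafastClient` is `Send + Sync`, so the same `Arc` pattern applies to it directly:

```rust
use std::sync::Arc;
use std::thread;

// Stand-in for UltrafastClient; any Send + Sync type shares the same way.
struct Client {
    name: String,
}

impl Client {
    fn call(&self) -> String {
        format!("handled by {}", self.name)
    }
}

fn main() {
    let client = Arc::new(Client { name: "shared".to_string() });
    // Clone the Arc (cheap pointer copy) into each worker thread.
    let handles: Vec<_> = (0..4)
        .map(|_| {
            let c = Arc::clone(&client);
            thread::spawn(move || c.call())
        })
        .collect();
    for h in handles {
        assert_eq!(h.join().unwrap(), "handled by shared");
    }
    println!("ok");
}
```

In async code the same pattern applies with `tokio::spawn` instead of `thread::spawn`.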
§Performance
- Latency: <1ms routing overhead
- Throughput: 10,000+ requests/second
- Memory: <100MB under normal load
- Concurrency: 100,000+ concurrent requests
§Error Handling
The client provides comprehensive error handling with specific error types:
- AuthenticationError: Invalid API keys or tokens
- RateLimitExceeded: Exceeded rate limits with retry information
- ProviderError: Provider-specific error messages
- NetworkError: Connection and timeout issues
- ValidationError: Invalid request parameters
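Handling these variants typically looks like the following. The enum below is a hypothetical stand-in written for illustration; consult the crate's `ClientError` type for the real variant names and payloads:

```rust
use std::time::Duration;

// Hypothetical stand-in mirroring the error categories listed above.
#[derive(Debug)]
enum ClientError {
    Authentication(String),
    RateLimitExceeded { retry_after: Duration },
    Provider(String),
    Network(String),
    Validation(String),
}

fn describe(err: &ClientError) -> String {
    match err {
        ClientError::Authentication(msg) => format!("check API key: {msg}"),
        ClientError::RateLimitExceeded { retry_after } => {
            format!("rate limited, retry in {:?}", retry_after)
        }
        ClientError::Provider(msg) => format!("provider failed: {msg}"),
        ClientError::Network(msg) => format!("network issue: {msg}"),
        ClientError::Validation(msg) => format!("bad request: {msg}"),
    }
}

fn main() {
    let err = ClientError::RateLimitExceeded { retry_after: Duration::from_secs(2) };
    assert!(describe(&err).starts_with("rate limited"));
    println!("{}", describe(&err));
}
```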
§Circuit Breakers
Each provider has an independent circuit breaker that automatically:
- Opens when failure threshold is reached
- Prevents requests to failing providers
- Tests recovery with limited requests
- Automatically closes when provider recovers
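The lifecycle above can be sketched as a small state machine. This is an illustrative reimplementation of the general pattern, not the SDK's internal type (names like `Breaker` are assumptions):

```rust
use std::time::{Duration, Instant};

#[derive(Debug, PartialEq)]
enum State { Closed, Open, HalfOpen }

// Closed -> Open after `threshold` consecutive failures;
// Open -> HalfOpen after `cooldown` elapses (recovery probe);
// HalfOpen -> Closed on success, back to Open on failure.
struct Breaker {
    state: State,
    failures: u32,
    threshold: u32,
    opened_at: Option<Instant>,
    cooldown: Duration,
}

impl Breaker {
    fn new(threshold: u32, cooldown: Duration) -> Self {
        Breaker { state: State::Closed, failures: 0, threshold, opened_at: None, cooldown }
    }

    // Should a request be allowed through right now?
    fn allow(&mut self) -> bool {
        if self.state == State::Open
            && self.opened_at.map_or(false, |t| t.elapsed() >= self.cooldown)
        {
            self.state = State::HalfOpen; // test recovery with a probe request
        }
        self.state != State::Open
    }

    fn record_success(&mut self) {
        self.failures = 0;
        self.state = State::Closed;
    }

    fn record_failure(&mut self) {
        self.failures += 1;
        if self.state == State::HalfOpen || self.failures >= self.threshold {
            self.state = State::Open;
            self.opened_at = Some(Instant::now());
        }
    }
}

fn main() {
    let mut b = Breaker::new(3, Duration::from_millis(10));
    for _ in 0..3 { b.record_failure(); }
    assert!(!b.allow()); // open: requests short-circuit without hitting the provider
    std::thread::sleep(Duration::from_millis(15));
    assert!(b.allow()); // half-open: one probe allowed
    b.record_success();
    assert!(b.allow()); // closed again
    println!("circuit breaker recovered");
}
```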
§Caching
The client supports multiple caching backends:
- In-Memory Cache: Fast local caching (default)
- Redis Cache: Distributed caching for multiple instances
- Custom Backends: Extensible cache system
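A minimal TTL cache in the spirit of the default in-memory backend can be sketched as follows (an illustration of the idea, assuming string keys and values; the SDK's actual cache trait and types are not shown):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Entries expire `ttl` after insertion; expired entries are
// evicted lazily on lookup.
struct TtlCache {
    ttl: Duration,
    entries: HashMap<String, (String, Instant)>,
}

impl TtlCache {
    fn new(ttl: Duration) -> Self {
        TtlCache { ttl, entries: HashMap::new() }
    }

    fn put(&mut self, key: &str, value: &str) {
        self.entries.insert(key.to_string(), (value.to_string(), Instant::now()));
    }

    fn get(&mut self, key: &str) -> Option<String> {
        let expired = match self.entries.get(key) {
            Some((_, inserted_at)) => inserted_at.elapsed() >= self.ttl,
            None => return None,
        };
        if expired {
            self.entries.remove(key); // lazy eviction
            return None;
        }
        self.entries.get(key).map(|(v, _)| v.clone())
    }
}

fn main() {
    let mut cache = TtlCache::new(Duration::from_millis(20));
    cache.put("prompt-hash", "cached response");
    assert_eq!(cache.get("prompt-hash").as_deref(), Some("cached response"));
    std::thread::sleep(Duration::from_millis(25));
    assert_eq!(cache.get("prompt-hash"), None); // expired after TTL
    println!("ttl cache ok");
}
```

For chat completions, the cache key is typically a hash of the request (model, messages, parameters), so only identical requests hit the cache.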
§Rate Limiting
Per-provider rate limiting with:
- Request-based limits (requests per minute/hour)
- Token-based limits (tokens per minute)
- Burst handling with configurable burst sizes
- Automatic retry with exponential backoff
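Request-based limits with burst handling are commonly implemented as a token bucket; a self-contained sketch of that technique (not the SDK's internal limiter) follows:

```rust
use std::time::Instant;

// Token bucket: refills at `rate` tokens/second up to `capacity`
// (the burst size); each request consumes one token.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    rate: f64,
    last: Instant,
}

impl TokenBucket {
    fn new(rate: f64, burst: f64) -> Self {
        TokenBucket { capacity: burst, tokens: burst, rate, last: Instant::now() }
    }

    fn try_acquire(&mut self) -> bool {
        // Refill proportionally to elapsed time, clamped to capacity.
        let now = Instant::now();
        let elapsed = now.duration_since(self.last).as_secs_f64();
        self.tokens = (self.tokens + elapsed * self.rate).min(self.capacity);
        self.last = now;
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut bucket = TokenBucket::new(10.0, 2.0); // 10 req/s, burst of 2
    assert!(bucket.try_acquire());
    assert!(bucket.try_acquire());
    assert!(!bucket.try_acquire()); // burst exhausted; must wait for refill
    std::thread::sleep(std::time::Duration::from_millis(150));
    assert!(bucket.try_acquire()); // ~1.5 tokens refilled
    println!("token bucket ok");
}
```

Token-based limits (tokens per minute) follow the same shape, consuming the request's estimated token count instead of a flat 1.0 per call.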
§Metrics
Real-time performance metrics including:
- Provider response times
- Success/failure rates
- Circuit breaker status
- Cache hit rates
- Rate limit usage
§Configuration
The client is highly configurable with:
- Per-provider timeouts and retry policies
- Global and per-provider rate limits
- Circuit breaker thresholds and recovery settings
- Cache TTL and size limits
- Connection pool sizes and timeouts
§Best Practices
- Use connection pooling for high-throughput applications
- Enable caching for repeated requests
- Configure appropriate timeouts for your use case
- Use streaming for long responses
- Monitor circuit breaker status
- Implement proper error handling and retry logic
§See Also
- UltrafastClientBuilder - For building client instances
- Provider - For custom provider implementations
- Router - For custom routing strategies
- Cache - For custom caching backends
§Implementations
impl UltrafastClient
pub fn new() -> UltrafastClientBuilder
pub fn standalone() -> StandaloneClientBuilder
pub fn gateway(base_url: String) -> GatewayClientBuilder
pub async fn chat_completion(&self, request: ChatRequest) -> Result<ChatResponse, ClientError>
pub async fn stream_chat_completion(&self, request: ChatRequest) -> Result<Box<dyn Stream<Item = Result<StreamChunk, ClientError>> + Send + Unpin>, ClientError>
pub async fn get_last_used_provider(&self) -> Option<String>
pub async fn get_provider_circuit_state(&self, provider_id: &str) -> Option<CircuitState>
pub async fn is_provider_healthy(&self, provider_id: &str) -> bool
pub async fn get_circuit_breaker_metrics(&self) -> HashMap<String, CircuitBreakerMetrics>
pub async fn get_provider_health_status(&self) -> HashMap<String, bool>
pub async fn embedding(&self, request: EmbeddingRequest) -> Result<EmbeddingResponse, ClientError>
pub async fn image_generation(&self, request: ImageRequest) -> Result<ImageResponse, ClientError>
pub async fn audio_transcription(&self, request: AudioRequest) -> Result<AudioResponse, ClientError>
pub async fn text_to_speech(&self, request: SpeechRequest) -> Result<SpeechResponse, ClientError>
§Auto Trait Implementations
impl Freeze for UltrafastClient
impl !RefUnwindSafe for UltrafastClient
impl Send for UltrafastClient
impl Sync for UltrafastClient
impl Unpin for UltrafastClient
impl !UnwindSafe for UltrafastClient
§Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,

fn borrow_mut(&mut self) -> &mut T
Mutably borrows from an owned value.