§Ultrafast Models SDK
A high-performance Rust SDK for interacting with multiple AI/LLM providers through a unified interface, with built-in support for services including OpenAI, Anthropic, and Google.
§Overview
The Ultrafast Models SDK provides:
- Unified Interface: Single API for multiple AI providers
- Intelligent Routing: Automatic provider selection and load balancing
- Circuit Breakers: Automatic failover and recovery mechanisms
- Caching Layer: Built-in response caching for performance
- Rate Limiting: Per-provider rate limiting and throttling
- Error Handling: Comprehensive error handling and retry logic
- Metrics Collection: Performance monitoring and analytics
§Supported Providers
The SDK supports a wide range of AI providers:
- OpenAI: GPT-4, GPT-3.5, and other OpenAI models
- Anthropic: Claude-3, Claude-2, and Claude Instant
- Google: Gemini Pro, Gemini Pro Vision, and PaLM
- Azure OpenAI: Azure-hosted OpenAI models
- Ollama: Local and remote Ollama instances
- Mistral AI: Mistral 7B, Mixtral, and other models
- Cohere: Command, Command R, and other Cohere models
- Custom Providers: Extensible provider system (see the sketch after this list)
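The custom-provider sketch below is illustrative only: this page does not show the re-exported `Provider` trait's actual surface, so the `chat_completion` method, the `async_trait` usage, and the struct shape are assumptions to verify against the `providers` module.

use async_trait::async_trait;
use ultrafast_models_sdk::providers::Provider;
use ultrafast_models_sdk::{ChatRequest, ChatResponse, Result};

// Hypothetical custom provider backed by an in-house HTTP endpoint.
struct MyProvider {
    endpoint: String,
}

// Assumption: `Provider` is an async trait with a chat-completion entry
// point mirroring `UltrafastClient::chat_completion`. Check the real
// trait definition before relying on this sketch.
#[async_trait]
impl Provider for MyProvider {
    async fn chat_completion(&self, request: ChatRequest) -> Result<ChatResponse> {
        // Forward `request` to `self.endpoint` and map the reply into a
        // `ChatResponse`; elided here.
        todo!()
    }
}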
§Quick Start
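Add the SDK to your project's `Cargo.toml` first. The crate name is inferred from the library path `ultrafast_models_sdk`, and the version below is a placeholder; check the current published release:

[dependencies]
ultrafast-models-sdk = "0.1"  # placeholder; use the latest release
tokio = { version = "1", features = ["full"] }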
use ultrafast_models_sdk::{UltrafastClient, ChatRequest, Message};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a client with multiple providers
    let client = UltrafastClient::standalone()
        .with_openai("your-openai-key")
        .with_anthropic("your-anthropic-key")
        .with_ollama("http://localhost:11434")
        .build()?;

    // Create a chat request
    let request = ChatRequest {
        model: "gpt-4".to_string(),
        messages: vec![Message::user("Hello, world!")],
        temperature: Some(0.7),
        max_tokens: Some(100),
        stream: Some(false),
        ..Default::default()
    };

    // Send the request
    let response = client.chat_completion(request).await?;
    println!("Response: {}", response.choices[0].message.content);
    Ok(())
}

§Client Modes
The SDK supports two client modes:
§Standalone Mode
Direct provider communication without gateway:
let client = UltrafastClient::standalone()
    .with_openai("your-key")
    .with_anthropic("your-key")
    .build()?;

§Gateway Mode
Communication through the Ultrafast Gateway:
let client = UltrafastClient::gateway("http://localhost:3000")
    .with_api_key("your-gateway-key")
    .build()?;

§Routing Strategies
The SDK provides multiple routing strategies:
- Single Provider: Route all requests to one provider
- Load Balancing: Distribute requests across providers
- Failover: Primary provider with automatic fallback
- Conditional Routing: Route based on request characteristics (sketched after the load-balancing example below)
- A/B Testing: Route requests for testing different providers
use ultrafast_models_sdk::routing::RoutingStrategy;

// Load balancing with custom weights
let client = UltrafastClient::standalone()
    .with_openai("openai-key")
    .with_anthropic("anthropic-key")
    .with_routing_strategy(RoutingStrategy::LoadBalance {
        weights: vec![0.6, 0.4], // 60% OpenAI, 40% Anthropic
    })
    .build()?;
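Conditional routing is sketched below under stated assumptions: the `Conditional` variant, its `rules` field, and the shapes of the re-exported `Condition` and `RoutingRule` types are guesses, not confirmed API.

use ultrafast_models_sdk::routing::{Condition, RoutingRule, RoutingStrategy};

// Hypothetical conditional routing: send Claude-family models to
// Anthropic and everything else to the default provider. Variant and
// field names are illustrative only.
let client = UltrafastClient::standalone()
    .with_openai("openai-key")
    .with_anthropic("anthropic-key")
    .with_routing_strategy(RoutingStrategy::Conditional {
        rules: vec![RoutingRule {
            condition: Condition::ModelPrefix("claude".to_string()),
            provider: "anthropic".to_string(),
        }],
    })
    .build()?;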
// Failover strategy
let client = UltrafastClient::standalone()
    .with_openai("primary-key")
    .with_anthropic("fallback-key")
    .with_routing_strategy(RoutingStrategy::Failover)
    .build()?;

§Advanced Features
§Circuit Breakers
Automatic failover and recovery:
use std::time::Duration;
use ultrafast_models_sdk::circuit_breaker::CircuitBreakerConfig;

let client = UltrafastClient::standalone()
    .with_openai("your-key")
    .with_circuit_breaker_config(CircuitBreakerConfig {
        failure_threshold: 5,
        recovery_timeout: Duration::from_secs(60),
        request_timeout: Duration::from_secs(30),
        half_open_max_calls: 3,
    })
    .build()?;

§Caching
Built-in response caching:
use std::time::Duration;
use ultrafast_models_sdk::cache::CacheConfig;

let client = UltrafastClient::standalone()
    .with_openai("your-key")
    .with_cache_config(CacheConfig {
        enabled: true,
        ttl: Duration::from_secs(60 * 60), // 1 hour
        max_size: 1000,
    })
    .build()?;

§Rate Limiting
Per-provider rate limiting:
use ultrafast_models_sdk::rate_limiting::RateLimitConfig;

let client = UltrafastClient::standalone()
    .with_openai("your-key")
    .with_rate_limit_config(RateLimitConfig {
        requests_per_minute: 100,
        tokens_per_minute: 10000,
        burst_size: 10,
    })
    .build()?;

§API Examples
§Chat Completions
use ultrafast_models_sdk::{ChatRequest, Message, Role};

let request = ChatRequest {
    model: "gpt-4".to_string(),
    messages: vec![
        Message {
            role: Role::System,
            content: "You are a helpful assistant.".to_string(),
        },
        Message {
            role: Role::User,
            content: "What is the capital of France?".to_string(),
        },
    ],
    temperature: Some(0.7),
    max_tokens: Some(150),
    stream: Some(false),
    ..Default::default()
};

let response = client.chat_completion(request).await?;
println!("Response: {}", response.choices[0].message.content);

§Streaming Responses
use futures::StreamExt;

let mut stream = client
    .stream_chat_completion(ChatRequest {
        model: "gpt-4".to_string(),
        messages: vec![Message::user("Tell me a story")],
        stream: Some(true),
        ..Default::default()
    })
    .await?;

while let Some(chunk) = stream.next().await {
    match chunk {
        Ok(chunk) => {
            if let Some(content) = &chunk.choices[0].delta.content {
                print!("{}", content);
            }
        }
        Err(e) => eprintln!("Error: {:?}", e),
    }
}

§Embeddings
use ultrafast_models_sdk::{EmbeddingRequest, EmbeddingInput};

let request = EmbeddingRequest {
    model: "text-embedding-ada-002".to_string(),
    input: EmbeddingInput::String("This is a test sentence.".to_string()),
    ..Default::default()
};

let response = client.embedding(request).await?;
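// Hedged usage sketch (not part of the SDK): embedding vectors are
// typically compared by cosine similarity. This assumes the `embedding`
// field is a `Vec<f32>`; adjust the element type if the SDK differs.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (norm(a) * norm(b))
}
// e.g. cosine_similarity(&response.data[0].embedding, &other_embedding)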
println!("Embedding dimensions: {}", response.data[0].embedding.len());§Image Generation
use ultrafast_models_sdk::ImageRequest;

let request = ImageRequest {
    model: "dall-e-3".to_string(),
    prompt: "A beautiful sunset over the ocean".to_string(),
    n: Some(1),
    size: Some("1024x1024".to_string()),
    ..Default::default()
};

let response = client.generate_image(request).await?;
println!("Image URL: {}", response.data[0].url);

§Error Handling
Comprehensive error handling with specific error types:
use ultrafast_models_sdk::error::ClientError;

match client.chat_completion(request).await {
    Ok(response) => println!("Success: {:?}", response),
    Err(ClientError::AuthenticationError { .. }) => {
        eprintln!("Authentication failed");
    }
    Err(ClientError::RateLimitExceeded { retry_after, .. }) => {
        eprintln!("Rate limit exceeded, retry after: {:?}", retry_after);
    }
    Err(ClientError::ProviderError { provider, message, .. }) => {
        eprintln!("Provider {} error: {}", provider, message);
    }
    Err(e) => eprintln!("Other error: {:?}", e),
}

§Configuration
Advanced client configuration:
use std::time::Duration;
use ultrafast_models_sdk::{UltrafastClient, ClientConfig};

let config = ClientConfig {
    timeout: Duration::from_secs(30),
    max_retries: 3,
    retry_delay: Duration::from_secs(1),
    user_agent: Some("MyApp/1.0".to_string()),
    ..Default::default()
};

let client = UltrafastClient::standalone()
    .with_config(config)
    .with_openai("your-key")
    .build()?;

§Testing
The SDK includes testing utilities; a basic async test looks like this:
#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn test_chat_completion() {
        let client = UltrafastClient::standalone()
            .with_openai("test-key")
            .build()
            .unwrap();

        let request = ChatRequest {
            model: "gpt-4".to_string(),
            messages: vec![Message::user("Hello")],
            ..Default::default()
        };

        // With a valid API key this request should succeed; a placeholder
        // key will yield an authentication error instead.
        let result = client.chat_completion(request).await;
        assert!(result.is_ok());
    }
}

§Performance Optimization
Tips for optimal performance:
use std::time::Duration;

// Use connection pooling
let client = UltrafastClient::standalone()
    .with_connection_pool_size(10)
    .with_openai("your-key")
    .build()?;

// Enable compression
let client = UltrafastClient::standalone()
    .with_compression(true)
    .with_openai("your-key")
    .build()?;

// Configure timeouts
let client = UltrafastClient::standalone()
    .with_timeout(Duration::from_secs(15))
    .with_openai("your-key")
    .build()?;

§Migration Guide
§From OpenAI SDK
// Before (OpenAI SDK)
use openai::Client;

let client = Client::new("your-key");
let response = client.chat().create(request).await?;

// After (Ultrafast SDK)
use ultrafast_models_sdk::UltrafastClient;

let client = UltrafastClient::standalone()
    .with_openai("your-key")
    .build()?;
let response = client.chat_completion(request).await?;

§From Anthropic SDK
// Before (Anthropic SDK)
use anthropic::Client;

let client = Client::new("your-key");
let response = client.messages().create(request).await?;

// After (Ultrafast SDK)
use ultrafast_models_sdk::UltrafastClient;

let client = UltrafastClient::standalone()
    .with_anthropic("your-key")
    .build()?;
let response = client.chat_completion(request).await?;

§Contributing
We welcome contributions! Please see our contributing guide for details on:
- Code style and formatting
- Testing requirements
- Documentation standards
- Pull request process
§License
This project is licensed under the MIT License - see the LICENSE file for details.
§Support
For support and questions:
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Project Wiki
§Re-exports
pub use circuit_breaker::CircuitBreaker;
pub use circuit_breaker::CircuitBreakerConfig;
pub use circuit_breaker::CircuitState;
pub use client::ClientMode;
pub use client::UltrafastClient;
pub use client::UltrafastClientBuilder;
pub use error::ClientError;
pub use error::ProviderError;
pub use models::AudioRequest;
pub use models::AudioResponse;
pub use models::ChatRequest;
pub use models::ChatResponse;
pub use models::Choice;
pub use models::EmbeddingRequest;
pub use models::EmbeddingResponse;
pub use models::ImageRequest;
pub use models::ImageResponse;
pub use models::Message;
pub use models::Role;
pub use models::SpeechRequest;
pub use models::SpeechResponse;
pub use models::Usage;
pub use providers::create_provider_with_circuit_breaker;
pub use providers::Provider;
pub use providers::ProviderConfig;
pub use providers::ProviderMetrics;
pub use routing::Condition;
pub use routing::RoutingRule;
pub use routing::RoutingStrategy;
§Modules
- cache: Caching Module
- circuit_breaker: Circuit Breaker Module
- client: Ultrafast Client Module
- common
- error: Error Handling Module
- models: AI Model Types and Structures
- providers: Provider System Module
- routing: Intelligent Routing Module
§Type Aliases
- Result: Result type for SDK operations.