§Ultrafast Gateway Library
A high-performance AI gateway built in Rust that provides a unified interface to multiple LLM providers with advanced routing, caching, and monitoring capabilities.
§Overview
The Ultrafast Gateway is designed to be a production-ready, enterprise-grade solution for managing multiple AI/LLM providers through a single, unified API. It supports both standalone mode for direct provider calls and gateway mode for centralized server operations.
§Key Features
- Multi-Provider Support: Unified interface for 10+ LLM providers (OpenAI, Anthropic, Azure, etc.)
- Advanced Routing: Load balancing, fallback, conditional routing, and A/B testing
- Enterprise Security: Authentication, rate limiting, request validation, and content filtering
- High Performance: <1ms routing overhead, 10,000+ requests/second throughput
- Real-time Monitoring: Comprehensive metrics, cost tracking, and health monitoring
- Caching & Optimization: Redis and in-memory caching with JSON optimization
- Fault Tolerance: Circuit breakers, automatic failover, and error recovery
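As a crate-independent illustration of the fallback-routing idea above, the sketch below tries providers in priority order and returns the first healthy one. The `Provider` type and `healthy` flag are hypothetical stand-ins, not the library's API.

```rust
// Minimal fallback-routing sketch: pick the first healthy provider
// in priority order. All names here are illustrative, not the
// ultrafast_gateway API.
#[derive(Debug, PartialEq)]
struct Provider {
    name: &'static str,
    healthy: bool,
}

fn route<'a>(providers: &'a [Provider]) -> Option<&'a Provider> {
    // Priority order is the slice order; skip unhealthy providers.
    providers.iter().find(|p| p.healthy)
}

fn main() {
    let providers = [
        Provider { name: "openai", healthy: false },   // primary is down
        Provider { name: "anthropic", healthy: true }, // first fallback
        Provider { name: "azure", healthy: true },
    ];
    let chosen = route(&providers).expect("no healthy provider");
    assert_eq!(chosen.name, "anthropic");
    println!("routed to {}", chosen.name);
}
```

The real router layers load balancing, A/B testing, and conditional rules on top of this basic "first healthy candidate" decision.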
§Architecture
The library is organized into several core modules:
- auth: Authentication, authorization, and rate limiting
- config: Configuration management and validation
- server: HTTP server setup and request handling
- handlers: API endpoint handlers and business logic
- middleware: Request/response middleware and validation
- metrics: Performance monitoring and analytics
- gateway_caching: Caching layer with Redis support
- advanced_routing: Intelligent request routing strategies
- error_handling: Comprehensive error handling utilities
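To make the gateway_caching idea concrete, here is a minimal in-memory TTL cache using only the standard library. It is a sketch of the concept, not the module's actual types; the real layer adds Redis support and JSON optimization.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Minimal in-memory TTL cache, illustrating the gateway_caching idea.
struct TtlCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, String)>,
}

impl TtlCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    fn put(&mut self, key: &str, value: &str) {
        self.entries
            .insert(key.to_string(), (Instant::now(), value.to_string()));
    }

    fn get(&self, key: &str) -> Option<&str> {
        self.entries.get(key).and_then(|(at, v)| {
            // Serve only entries younger than the TTL.
            if at.elapsed() < self.ttl { Some(v.as_str()) } else { None }
        })
    }
}

fn main() {
    let mut cache = TtlCache::new(Duration::from_secs(60));
    cache.put("chat:gpt-4:hello", "cached response");
    assert_eq!(cache.get("chat:gpt-4:hello"), Some("cached response"));
    assert_eq!(cache.get("missing"), None);
    println!("cache hit: {:?}", cache.get("chat:gpt-4:hello"));
}
```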
§Quick Start
use ultrafast_gateway::{create_server, config::Config};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load configuration
    let config = Config::from_file("config.toml")?;

    // Create and start the server
    let app = create_server(config).await?;

    // The server is now ready to handle requests
    Ok(())
}
§Provider Integration
The gateway supports multiple providers through a unified interface:
use ultrafast_models_sdk::{UltrafastClient, ChatRequest, Message};

let client = UltrafastClient::standalone()
    .with_openai("your-openai-key")
    .with_anthropic("your-anthropic-key")
    .build()?;

let response = client.chat_completion(ChatRequest {
    model: "gpt-4".to_string(),
    messages: vec![Message::user("Hello, world!")],
    max_tokens: Some(100),
    temperature: Some(0.7),
    ..Default::default()
}).await?;
§Configuration
The gateway uses TOML configuration files for easy setup:
[server]
host = "0.0.0.0"
port = 3000

[providers.openai]
enabled = true
api_key = "your-openai-key"
base_url = "https://api.openai.com/v1"

[auth]
enabled = true
jwt_secret = "your-jwt-secret"
§Performance
- Latency: <1ms routing overhead
- Throughput: 10,000+ requests/second
- Concurrency: 100,000+ concurrent connections
- Memory: <1GB under normal load
- Uptime: 99.9% with automatic failover
§Security
The gateway implements enterprise-grade security features:
- API Key Management: Virtual API keys with rate limiting
- JWT Authentication: Stateless token-based authentication
- Request Validation: Comprehensive input sanitization
- Content Filtering: Plugin-based content moderation
- Rate Limiting: Per-user and per-provider limits
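The per-user and per-provider limits above amount to token-bucket accounting. The sketch below is a std-only illustration of that mechanism, not the crate's actual implementation; the capacity and refill numbers are made up for the example.

```rust
// Token-bucket rate limiter sketch: each key (user or provider)
// gets a bucket; a request is admitted only if a token is available.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec }
    }

    // Advance time by `elapsed_secs`, refilling up to capacity.
    fn tick(&mut self, elapsed_secs: f64) {
        self.tokens =
            (self.tokens + elapsed_secs * self.refill_per_sec).min(self.capacity);
    }

    // Try to admit one request; returns false when rate-limited.
    fn try_acquire(&mut self) -> bool {
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut bucket = TokenBucket::new(2.0, 1.0); // burst of 2, 1 req/sec
    assert!(bucket.try_acquire());
    assert!(bucket.try_acquire());
    assert!(!bucket.try_acquire()); // burst exhausted
    bucket.tick(1.0);               // one second passes
    assert!(bucket.try_acquire());  // refilled
    println!("token bucket behaves as expected");
}
```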
§Monitoring & Observability
Built-in monitoring capabilities include:
- Real-time Dashboard: WebSocket-based live metrics
- Performance Metrics: Latency, throughput, and error rates
- Provider Health: Circuit breaker status and health checks
- Cost Tracking: Real-time cost monitoring per provider
- Logging: Structured logging with multiple output formats
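Two of the headline dashboard numbers, average latency and error rate, reduce to simple per-request accounting. The aggregator below is an illustrative std-only sketch of that bookkeeping, not the metrics module's API.

```rust
// Tiny metrics aggregator: records per-request latency and outcome,
// then reports average latency and error rate.
struct Metrics {
    latencies_ms: Vec<u64>,
    errors: u64,
}

impl Metrics {
    fn new() -> Self {
        Self { latencies_ms: Vec::new(), errors: 0 }
    }

    fn record(&mut self, latency_ms: u64, ok: bool) {
        self.latencies_ms.push(latency_ms);
        if !ok {
            self.errors += 1;
        }
    }

    fn avg_latency_ms(&self) -> f64 {
        let total: u64 = self.latencies_ms.iter().sum();
        total as f64 / self.latencies_ms.len().max(1) as f64
    }

    fn error_rate(&self) -> f64 {
        self.errors as f64 / self.latencies_ms.len().max(1) as f64
    }
}

fn main() {
    let mut m = Metrics::new();
    m.record(100, true);
    m.record(300, true);
    m.record(200, false);
    assert_eq!(m.avg_latency_ms(), 200.0);
    assert!((m.error_rate() - 1.0 / 3.0).abs() < 1e-9);
    println!("avg={}ms err={:.1}%", m.avg_latency_ms(), m.error_rate() * 100.0);
}
```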
§Examples
§Basic Server Setup
use ultrafast_gateway::{create_server, config::Config};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load configuration from file
    let config = Config::from_file("config.toml")?;

    // Create and start the server
    let app = create_server(config).await?;

    println!("Gateway server started successfully!");
    Ok(())
}
§Custom Configuration
use std::collections::HashMap;
use std::time::Duration;

use ultrafast_gateway::config::{Config, ServerConfig, ProviderConfig};

let config = Config {
    server: ServerConfig {
        host: "0.0.0.0".to_string(),
        port: 8080,
        timeout: Duration::from_secs(30),
        ..Default::default()
    },
    providers: HashMap::new(),
    auth: Default::default(),
    cache: Default::default(),
    routing: Default::default(),
    metrics: Default::default(),
    logging: Default::default(),
    plugins: Vec::new(),
};
§Plugin System
use ultrafast_gateway::plugins::{Plugin, PluginConfig};

#[derive(Debug)]
struct CustomPlugin;

impl Plugin for CustomPlugin {
    fn name(&self) -> &'static str { "custom_plugin" }

    fn process_request(&self, request: &mut Request) -> Result<(), Error> {
        // Custom request processing logic
        Ok(())
    }
}
§Error Handling
The gateway provides comprehensive error handling:
use ultrafast_gateway::error_handling::{ErrorHandler, ErrorType};

match result {
    Ok(response) => println!("Success: {:?}", response),
    Err(ErrorType::AuthenticationError) => println!("Authentication failed"),
    Err(ErrorType::RateLimitExceeded) => println!("Rate limit exceeded"),
    Err(ErrorType::ProviderUnavailable) => println!("Provider unavailable"),
    Err(e) => println!("Other error: {:?}", e),
}
§Testing
The library includes comprehensive testing utilities:
#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn test_server_creation() {
        let config = Config::default();
        let result = create_server(config).await;
        assert!(result.is_ok());
    }
}
§Performance Tuning
For optimal performance, consider these settings:
[server]
# Use multiple worker threads
worker_threads = 8

[cache]
# Enable Redis for distributed caching
backend = "Redis"
ttl = "6h"

[routing]
# Aggressive health checking for fast failover
health_check_interval = "10s"
failover_threshold = 0.7
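One way a failover threshold like the 0.7 above can be applied is via a circuit breaker that tracks a provider's recent success ratio and stops sending traffic once it drops below the threshold. The sketch below is a std-only illustration of that idea, not the gateway's actual breaker.

```rust
// Circuit-breaker sketch: open (stop sending traffic) once the
// success ratio falls below the configured threshold.
struct CircuitBreaker {
    threshold: f64, // minimum acceptable success ratio
    successes: u64,
    failures: u64,
}

impl CircuitBreaker {
    fn new(threshold: f64) -> Self {
        Self { threshold, successes: 0, failures: 0 }
    }

    fn record(&mut self, ok: bool) {
        if ok { self.successes += 1 } else { self.failures += 1 }
    }

    fn is_open(&self) -> bool {
        let total = self.successes + self.failures;
        total > 0 && (self.successes as f64 / total as f64) < self.threshold
    }
}

fn main() {
    let mut cb = CircuitBreaker::new(0.7);
    for _ in 0..7 { cb.record(true); }
    for _ in 0..2 { cb.record(false); }
    assert!(!cb.is_open()); // 7/9 is above 0.7, keep routing
    cb.record(false);
    cb.record(false);
    assert!(cb.is_open()); // 7/11 is below 0.7, fail over
    println!("breaker open: {}", cb.is_open());
}
```

A production breaker would also use a sliding window and a half-open probing state so a recovered provider can be brought back automatically.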
§Deployment
The gateway is designed for production deployment:
- Docker Support: Multi-stage Dockerfile with Alpine Linux
- Health Checks: Built-in health check endpoints
- Graceful Shutdown: Proper signal handling and cleanup
- Resource Limits: Configurable memory and CPU limits
- Monitoring: Prometheus metrics and Grafana dashboards
§Contributing
We welcome contributions! Please see our contributing guide for details on:
- Code style and formatting
- Testing requirements
- Documentation standards
- Pull request process
§License
This project is licensed under the MIT License - see the LICENSE file for details.
§Support
For support and questions:
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Project Wiki
Re-exports§
pub use server::create_server;
Modules§
- advanced_routing - Advanced Routing and Load Balancing Module
- auth - Authentication and Authorization Module
- config - Configuration Management Module
- dashboard - Dashboard Module
- error_handling - Error Handling and Validation Module
- gateway_caching - Gateway Caching Module
- gateway_error - Gateway Error Types Module
- handlers - HTTP Request Handlers Module
- json_optimization - JSON Optimization Module
- metrics - Metrics and Monitoring Module
- middleware - HTTP Middleware Module
- plugins - Plugin System Module
- request_context - Request Context Module
- server - HTTP Server Module
Macros§
- error_context - Macro for creating error context
- handle_error - Macro for handling errors with context
- validate_config - Macro for validating configuration