§Ultrafast Gateway Library
A high-performance AI gateway built in Rust that provides a unified interface to multiple LLM providers with advanced routing, caching, and monitoring capabilities.
§Overview
The Ultrafast Gateway is designed to be a production-ready, enterprise-grade solution for managing multiple AI/LLM providers through a single, unified API. It supports both standalone mode for direct provider calls and gateway mode for centralized server operations.
§Key Features
- Multi-Provider Support: Unified interface for 10+ LLM providers (OpenAI, Anthropic, Azure, etc.)
- Advanced Routing: Load balancing, fallback, conditional routing, and A/B testing
- Enterprise Security: Authentication, rate limiting, request validation, and content filtering
- High Performance: <1ms routing overhead, 10,000+ requests/second throughput
- Real-time Monitoring: Comprehensive metrics, cost tracking, and health monitoring
- Caching & Optimization: Redis and in-memory caching with JSON optimization
- Fault Tolerance: Circuit breakers, automatic failover, and error recovery
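As a crate-independent illustration of the fallback-routing idea above, the sketch below tries providers in priority order and returns the first healthy one. The `Provider` type and `healthy` flag are hypothetical stand-ins, not the library's API.

```rust
// Minimal fallback-routing sketch: pick the first healthy provider
// in priority order. All names here are illustrative, not the
// ultrafast_gateway API.
#[derive(Debug, PartialEq)]
struct Provider {
    name: &'static str,
    healthy: bool,
}

fn route<'a>(providers: &'a [Provider]) -> Option<&'a Provider> {
    // Priority order is the slice order; skip unhealthy providers.
    providers.iter().find(|p| p.healthy)
}

fn main() {
    let providers = [
        Provider { name: "openai", healthy: false },   // primary is down
        Provider { name: "anthropic", healthy: true }, // first fallback
        Provider { name: "azure", healthy: true },
    ];
    let chosen = route(&providers).expect("no healthy provider");
    assert_eq!(chosen.name, "anthropic");
    println!("routed to {}", chosen.name);
}
```

The real router layers load balancing, A/B testing, and conditional rules on top of this basic "first healthy candidate" decision.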
§Architecture
The library is organized into several core modules:
- auth: Authentication, authorization, and rate limiting
- config: Configuration management and validation
- server: HTTP server setup and request handling
- handlers: API endpoint handlers and business logic
- middleware: Request/response middleware and validation
- metrics: Performance monitoring and analytics
- gateway_caching: Caching layer with Redis support
- advanced_routing: Intelligent request routing strategies
- error_handling: Comprehensive error handling utilities
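To make the gateway_caching idea concrete, here is a minimal in-memory TTL cache using only the standard library. It is a sketch of the concept, not the module's actual types; the real layer adds Redis support and JSON optimization.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Minimal in-memory TTL cache, illustrating the gateway_caching idea.
struct TtlCache {
    ttl: Duration,
    entries: HashMap<String, (Instant, String)>,
}

impl TtlCache {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    fn put(&mut self, key: &str, value: &str) {
        self.entries
            .insert(key.to_string(), (Instant::now(), value.to_string()));
    }

    fn get(&self, key: &str) -> Option<&str> {
        self.entries.get(key).and_then(|(at, v)| {
            // Serve only entries younger than the TTL.
            if at.elapsed() < self.ttl { Some(v.as_str()) } else { None }
        })
    }
}

fn main() {
    let mut cache = TtlCache::new(Duration::from_secs(60));
    cache.put("chat:gpt-4:hello", "cached response");
    assert_eq!(cache.get("chat:gpt-4:hello"), Some("cached response"));
    assert_eq!(cache.get("missing"), None);
    println!("cache hit: {:?}", cache.get("chat:gpt-4:hello"));
}
```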
§Quick Start
use ultrafast_gateway::{create_server, config::Config};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load configuration
    let config = Config::from_file("config.toml")?;

    // Create and start the server
    let app = create_server(config).await?;

    // The server is now ready to handle requests
    Ok(())
}
§Provider Integration
The gateway supports multiple providers through a unified interface:
use ultrafast_models_sdk::{UltrafastClient, ChatRequest, Message};

let client = UltrafastClient::standalone()
    .with_openai("your-openai-key")
    .with_anthropic("your-anthropic-key")
    .build()?;

let response = client.chat_completion(ChatRequest {
    model: "gpt-4".to_string(),
    messages: vec![Message::user("Hello, world!")],
    max_tokens: Some(100),
    temperature: Some(0.7),
    ..Default::default()
}).await?;
§Configuration
The gateway uses TOML configuration files for easy setup:
[server]
host = "0.0.0.0"
port = 3000

[providers.openai]
enabled = true
api_key = "your-openai-key"
base_url = "https://api.openai.com/v1"

[auth]
enabled = true
jwt_secret = "your-jwt-secret"
§Performance
- Latency: <1ms routing overhead
- Throughput: 10,000+ requests/second
- Concurrency: 100,000+ concurrent connections
- Memory: <1GB under normal load
- Uptime: 99.9% with automatic failover
§Security
The gateway implements enterprise-grade security features:
- API Key Management: Virtual API keys with rate limiting
- JWT Authentication: Stateless token-based authentication
- Request Validation: Comprehensive input sanitization
- Content Filtering: Plugin-based content moderation
- Rate Limiting: Per-user and per-provider limits
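The per-user and per-provider limits above amount to token-bucket accounting. The sketch below is a std-only illustration of that mechanism, not the crate's actual implementation; the capacity and refill numbers are made up for the example.

```rust
// Token-bucket rate limiter sketch: each key (user or provider)
// gets a bucket; a request is admitted only if a token is available.
struct TokenBucket {
    capacity: f64,
    tokens: f64,
    refill_per_sec: f64,
}

impl TokenBucket {
    fn new(capacity: f64, refill_per_sec: f64) -> Self {
        Self { capacity, tokens: capacity, refill_per_sec }
    }

    // Advance time by `elapsed_secs`, refilling up to capacity.
    fn tick(&mut self, elapsed_secs: f64) {
        self.tokens =
            (self.tokens + elapsed_secs * self.refill_per_sec).min(self.capacity);
    }

    // Try to admit one request; returns false when rate-limited.
    fn try_acquire(&mut self) -> bool {
        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}

fn main() {
    let mut bucket = TokenBucket::new(2.0, 1.0); // burst of 2, 1 req/sec
    assert!(bucket.try_acquire());
    assert!(bucket.try_acquire());
    assert!(!bucket.try_acquire()); // burst exhausted
    bucket.tick(1.0);               // one second passes
    assert!(bucket.try_acquire());  // refilled
    println!("token bucket behaves as expected");
}
```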
§Monitoring & Observability
Built-in monitoring capabilities include:
- Real-time Dashboard: WebSocket-based live metrics
- Performance Metrics: Latency, throughput, and error rates
- Provider Health: Circuit breaker status and health checks
- Cost Tracking: Real-time cost monitoring per provider
- Logging: Structured logging with multiple output formats
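Two of the headline dashboard numbers, average latency and error rate, reduce to simple per-request accounting. The aggregator below is an illustrative std-only sketch of that bookkeeping, not the metrics module's API.

```rust
// Tiny metrics aggregator: records per-request latency and outcome,
// then reports average latency and error rate.
struct Metrics {
    latencies_ms: Vec<u64>,
    errors: u64,
}

impl Metrics {
    fn new() -> Self {
        Self { latencies_ms: Vec::new(), errors: 0 }
    }

    fn record(&mut self, latency_ms: u64, ok: bool) {
        self.latencies_ms.push(latency_ms);
        if !ok {
            self.errors += 1;
        }
    }

    fn avg_latency_ms(&self) -> f64 {
        let total: u64 = self.latencies_ms.iter().sum();
        total as f64 / self.latencies_ms.len().max(1) as f64
    }

    fn error_rate(&self) -> f64 {
        self.errors as f64 / self.latencies_ms.len().max(1) as f64
    }
}

fn main() {
    let mut m = Metrics::new();
    m.record(100, true);
    m.record(300, true);
    m.record(200, false);
    assert_eq!(m.avg_latency_ms(), 200.0);
    assert!((m.error_rate() - 1.0 / 3.0).abs() < 1e-9);
    println!("avg={}ms err={:.1}%", m.avg_latency_ms(), m.error_rate() * 100.0);
}
```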
§Examples
§Basic Server Setup
use ultrafast_gateway::{create_server, config::Config};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load configuration from file
    let config = Config::from_file("config.toml")?;

    // Create and start the server
    let app = create_server(config).await?;

    println!("Gateway server started successfully!");
    Ok(())
}
§Custom Configuration
use std::collections::HashMap;
use std::time::Duration;

use ultrafast_gateway::config::{Config, ServerConfig, ProviderConfig};

let config = Config {
    server: ServerConfig {
        host: "0.0.0.0".to_string(),
        port: 8080,
        timeout: Duration::from_secs(30),
        ..Default::default()
    },
    providers: HashMap::new(),
    auth: Default::default(),
    cache: Default::default(),
    routing: Default::default(),
    metrics: Default::default(),
    logging: Default::default(),
    plugins: Vec::new(),
};
§Plugin System
use ultrafast_gateway::plugins::{Plugin, PluginConfig};

#[derive(Debug)]
struct CustomPlugin;

impl Plugin for CustomPlugin {
    fn name(&self) -> &'static str { "custom_plugin" }

    fn process_request(&self, request: &mut Request) -> Result<(), Error> {
        // Custom request processing logic
        Ok(())
    }
}
§Error Handling
The gateway provides comprehensive error handling:
use ultrafast_gateway::error_handling::{ErrorHandler, ErrorType};

match result {
    Ok(response) => println!("Success: {:?}", response),
    Err(ErrorType::AuthenticationError) => println!("Authentication failed"),
    Err(ErrorType::RateLimitExceeded) => println!("Rate limit exceeded"),
    Err(ErrorType::ProviderUnavailable) => println!("Provider unavailable"),
    Err(e) => println!("Other error: {:?}", e),
}
§Testing
The library includes comprehensive testing utilities:
#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn test_server_creation() {
        let config = Config::default();
        let result = create_server(config).await;
        assert!(result.is_ok());
    }
}
§Performance Tuning
For optimal performance, consider these settings:
[server]
# Use multiple worker threads
worker_threads = 8

[cache]
# Enable Redis for distributed caching
backend = "Redis"
ttl = "6h"

[routing]
# Aggressive health checking for fast failover
health_check_interval = "10s"
failover_threshold = 0.7
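One way a failover threshold like the 0.7 above can be applied is via a circuit breaker that tracks a provider's recent success ratio and stops sending traffic once it drops below the threshold. The sketch below is a std-only illustration of that idea, not the gateway's actual breaker.

```rust
// Circuit-breaker sketch: open (stop sending traffic) once the
// success ratio falls below the configured threshold.
struct CircuitBreaker {
    threshold: f64, // minimum acceptable success ratio
    successes: u64,
    failures: u64,
}

impl CircuitBreaker {
    fn new(threshold: f64) -> Self {
        Self { threshold, successes: 0, failures: 0 }
    }

    fn record(&mut self, ok: bool) {
        if ok { self.successes += 1 } else { self.failures += 1 }
    }

    fn is_open(&self) -> bool {
        let total = self.successes + self.failures;
        total > 0 && (self.successes as f64 / total as f64) < self.threshold
    }
}

fn main() {
    let mut cb = CircuitBreaker::new(0.7);
    for _ in 0..7 { cb.record(true); }
    for _ in 0..2 { cb.record(false); }
    assert!(!cb.is_open()); // 7/9 is above 0.7, keep routing
    cb.record(false);
    cb.record(false);
    assert!(cb.is_open()); // 7/11 is below 0.7, fail over
    println!("breaker open: {}", cb.is_open());
}
```

A production breaker would also use a sliding window and a half-open probing state so a recovered provider can be brought back automatically.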
§Deployment
The gateway is designed for production deployment:
- Docker Support: Multi-stage Dockerfile with Alpine Linux
- Health Checks: Built-in health check endpoints
- Graceful Shutdown: Proper signal handling and cleanup
- Resource Limits: Configurable memory and CPU limits
- Monitoring: Prometheus metrics and Grafana dashboards
§Contributing
We welcome contributions! Please see our contributing guide for details on:
- Code style and formatting
- Testing requirements
- Documentation standards
- Pull request process
§License
This project is licensed under the MIT License - see the LICENSE file for details.
§Support
For support and questions:
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Documentation: Project Wiki
Re-exports§
pub use server::create_server;
Modules§
- advanced_routing - Advanced Routing and Load Balancing Module
- auth - Authentication and Authorization Module
- config - Configuration Management Module
- dashboard - Dashboard Module
- error_handling - Error Handling and Validation Module
- gateway_caching - Gateway Caching Module
- gateway_error - Gateway Error Types Module
- handlers - HTTP Request Handlers Module
- json_optimization - JSON Optimization Module
- metrics - Metrics and Monitoring Module
- middleware - HTTP Middleware Module
- plugins - Plugin System Module
- request_context - Request Context Module
- server - HTTP Server Module
Macros§
- error_context - Macro for creating error context
- handle_error - Macro for handling errors with context
- validate_config - Macro for validating configuration