Crate ultrafast_gateway


§Ultrafast Gateway Library

A high-performance AI gateway built in Rust that provides a unified interface to multiple LLM providers with advanced routing, caching, and monitoring capabilities.

§Overview

The Ultrafast Gateway is designed to be a production-ready, enterprise-grade solution for managing multiple AI/LLM providers through a single, unified API. It supports both standalone mode for direct provider calls and gateway mode for centralized server operations.

§Key Features

  • Multi-Provider Support: Unified interface for 10+ LLM providers (OpenAI, Anthropic, Azure, etc.)
  • Advanced Routing: Load balancing, fallback, conditional routing, and A/B testing
  • Enterprise Security: Authentication, rate limiting, request validation, and content filtering
  • High Performance: <1ms routing overhead, 10,000+ requests/second throughput
  • Real-time Monitoring: Comprehensive metrics, cost tracking, and health monitoring
  • Caching & Optimization: Redis and in-memory caching with JSON optimization
  • Fault Tolerance: Circuit breakers, automatic failover, and error recovery

§Architecture

The library is organized into several core modules:

  • auth: Authentication, authorization, and rate limiting
  • config: Configuration management and validation
  • server: HTTP server setup and request handling
  • handlers: API endpoint handlers and business logic
  • middleware: Request/response middleware and validation
  • metrics: Performance monitoring and analytics
  • gateway_caching: Caching layer with Redis support
  • advanced_routing: Intelligent request routing strategies
  • error_handling: Comprehensive error handling utilities

§Quick Start

use ultrafast_gateway::{create_server, config::Config};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load configuration
    let config = Config::from_file("config.toml")?;
     
    // Create and start the server
    let app = create_server(config).await?;
     
    // The server is now ready to handle requests
    Ok(())
}

§Provider Integration

The gateway supports multiple providers through a unified interface:

use ultrafast_models_sdk::{UltrafastClient, ChatRequest, Message};

let client = UltrafastClient::standalone()
    .with_openai("your-openai-key")
    .with_anthropic("your-anthropic-key")
    .build()?;

let response = client.chat_completion(ChatRequest {
    model: "gpt-4".to_string(),
    messages: vec![Message::user("Hello, world!")],
    max_tokens: Some(100),
    temperature: Some(0.7),
    ..Default::default()
}).await?;

§Configuration

The gateway uses TOML configuration files for easy setup:

[server]
host = "0.0.0.0"
port = 3000

[providers.openai]
enabled = true
api_key = "your-openai-key"
base_url = "https://api.openai.com/v1"

[auth]
enabled = true
jwt_secret = "your-jwt-secret"

§Performance

  • Latency: <1ms routing overhead
  • Throughput: 10,000+ requests/second
  • Concurrency: 100,000+ concurrent connections
  • Memory: <1GB under normal load
  • Uptime: 99.9% with automatic failover

§Security

The gateway implements enterprise-grade security features:

  • API Key Management: Virtual API keys with rate limiting (see the request sketch after this list)
  • JWT Authentication: Stateless token-based authentication
  • Request Validation: Comprehensive input sanitization
  • Content Filtering: Plugin-based content moderation
  • Rate Limiting: Per-user and per-provider limits
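
As a rough illustration of how a client might authenticate against the gateway, the sketch below sends a chat completion request with a virtual API key. The /v1/chat/completions path, the Bearer scheme, and the key format are assumptions made for this example; consult the handlers and auth modules for the actual routes and header requirements.

use serde_json::json;

// Minimal sketch: authenticated request to a locally running gateway.
// Assumes an OpenAI-style /v1/chat/completions route and Bearer auth;
// requires reqwest with its "json" feature enabled.
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::Client::new();
    let response = client
        .post("http://localhost:3000/v1/chat/completions")
        .bearer_auth("vk-your-virtual-api-key") // hypothetical virtual API key
        .json(&json!({
            "model": "gpt-4",
            "messages": [{ "role": "user", "content": "Hello, world!" }]
        }))
        .send()
        .await?;

    println!("status: {}", response.status());
    Ok(())
}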

§Monitoring & Observability

Built-in monitoring capabilities include:

  • Real-time Dashboard: WebSocket-based live metrics
  • Performance Metrics: Latency, throughput, and error rates
  • Provider Health: Circuit breaker status and health checks
  • Cost Tracking: Real-time cost monitoring per provider
  • Logging: Structured logging with multiple output formats
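
As a minimal sketch of consuming the structured logging, the snippet below initializes a JSON-formatted tracing subscriber before the server starts. It assumes the gateway emits its logs through the tracing crate and that the embedding application installs the subscriber; this is standard tracing-subscriber usage, not gateway-specific API.

use tracing_subscriber::EnvFilter;

fn init_logging() {
    // JSON-formatted structured logs, filtered via the RUST_LOG env var.
    // Requires tracing-subscriber with the "json" and "env-filter" features.
    tracing_subscriber::fmt()
        .json()
        .with_env_filter(EnvFilter::from_default_env())
        .init();
}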

§Examples

§Basic Server Setup

use ultrafast_gateway::{create_server, config::Config};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Load configuration from file
    let config = Config::from_file("config.toml")?;
     
    // Create and start the server
    let app = create_server(config).await?;
     
    println!("Gateway server started successfully!");
    Ok(())
}

§Custom Configuration

use std::collections::HashMap;
use std::time::Duration;

use ultrafast_gateway::config::{Config, ServerConfig, ProviderConfig};

let config = Config {
    server: ServerConfig {
        host: "0.0.0.0".to_string(),
        port: 8080,
        timeout: Duration::from_secs(30),
        ..Default::default()
    },
    providers: HashMap::new(),
    auth: Default::default(),
    cache: Default::default(),
    routing: Default::default(),
    metrics: Default::default(),
    logging: Default::default(),
    plugins: Vec::new(),
};

§Plugin System

use ultrafast_gateway::plugins::{Plugin, PluginConfig};

#[derive(Debug)]
struct CustomPlugin;

impl Plugin for CustomPlugin {
    fn name(&self) -> &'static str { "custom_plugin" }
    fn process_request(&self, request: &mut Request) -> Result<(), Error> {
        // Custom request processing logic
        Ok(())
    }
}

§Error Handling

The gateway provides comprehensive error handling:

use ultrafast_gateway::error_handling::{ErrorHandler, ErrorType};

// `result` is the outcome of a prior gateway call, shown here as a free-standing variable.
match result {
    Ok(response) => println!("Success: {:?}", response),
    Err(ErrorType::AuthenticationError) => println!("Authentication failed"),
    Err(ErrorType::RateLimitExceeded) => println!("Rate limit exceeded"),
    Err(ErrorType::ProviderUnavailable) => println!("Provider unavailable"),
    Err(e) => println!("Other error: {:?}", e),
}

§Testing

The library includes comprehensive testing utilities:

#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn test_server_creation() {
        let config = Config::default();
        let result = create_server(config).await;
        assert!(result.is_ok());
    }
}

§Performance Tuning

For optimal performance, consider these settings:

[server]
# Use multiple worker threads
worker_threads = 8

[cache]
# Enable Redis for distributed caching
backend = "Redis"
ttl = "6h"

[routing]
# Aggressive health checking for fast failover
health_check_interval = "10s"
failover_threshold = 0.7

§Deployment

The gateway is designed for production deployment:

  • Docker Support: Multi-stage Dockerfile with Alpine Linux
  • Health Checks: Built-in health check endpoints
  • Graceful Shutdown: Proper signal handling and cleanup (see the sketch after this list)
  • Resource Limits: Configurable memory and CPU limits
  • Monitoring: Prometheus metrics and Grafana dashboards
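
The graceful shutdown point above maps onto the usual Tokio signal-handling pattern. The sketch below assumes the app returned by create_server is an axum Router and wires it to axum::serve with a Ctrl-C shutdown hook; the bind address and the exact return type of create_server are assumptions for illustration, not the crate's documented startup code.

use ultrafast_gateway::{create_server, config::Config};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let config = Config::from_file("config.toml")?;
    let app = create_server(config).await?;

    // Bind and serve, shutting down cleanly on Ctrl-C.
    // Container deployments would also listen for SIGTERM in the same future.
    let listener = tokio::net::TcpListener::bind("0.0.0.0:3000").await?;
    axum::serve(listener, app)
        .with_graceful_shutdown(async {
            tokio::signal::ctrl_c()
                .await
                .expect("failed to install Ctrl-C handler");
        })
        .await?;

    Ok(())
}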

§Contributing

We welcome contributions! Please see our contributing guide for details on:

  • Code style and formatting
  • Testing requirements
  • Documentation standards
  • Pull request process

§License

This project is licensed under the MIT License - see the LICENSE file for details.

§Support

For support and questions:

Re-exports§

pub use server::create_server;

Modules§

  • advanced_routing: Advanced Routing and Load Balancing Module
  • auth: Authentication and Authorization Module
  • config: Configuration Management Module
  • dashboard: Dashboard Module
  • error_handling: Error Handling and Validation Module
  • gateway_caching: Gateway Caching Module
  • gateway_error: Gateway Error Types Module
  • handlers: HTTP Request Handlers Module
  • json_optimization: JSON Optimization Module
  • metrics: Metrics and Monitoring Module
  • middleware: HTTP Middleware Module
  • plugins: Plugin System Module
  • request_context: Request Context Module
  • server: HTTP Server Module

Macros§

  • error_context: Macro for creating error context
  • handle_error: Macro for handling errors with context
  • validate_config: Macro for validating configuration