§Gateway Caching Module
This module provides comprehensive caching functionality for the Ultrafast Gateway, supporting both in-memory and Redis-based caching with automatic expiration and performance optimization.
§Overview
The caching system provides:
- Dual Backend Support: In-memory and Redis caching
- Automatic Expiration: TTL-based cache invalidation
- Fallback Mechanism: Redis to memory fallback on failures
- Performance Optimization: Reduces API calls and improves response times
- Cache Statistics: Hit rates, memory usage, and performance metrics
- Atomic Operations: Thread-safe cache operations
- Key Management: Structured cache key generation
§Cache Backends
§In-Memory Caching
Fast local caching suitable for single-instance deployments:
- Low Latency: Sub-millisecond access times
- Memory Efficient: Configurable size limits
- Automatic Cleanup: Expired entries removed automatically
- Thread Safe: Concurrent access support
§Redis Caching
Distributed caching for multi-instance deployments:
- Shared State: Cache shared across multiple instances
- Persistence: Optional data persistence
- High Availability: Redis cluster support
- Atomic Operations: Thread-safe distributed operations
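The two backends above could be modeled as a configuration enum; the following is a minimal sketch (the variant shapes are assumptions, mirroring the `CacheBackend::Redis { url }` form used in the usage example below):

```rust
/// Which cache backend the gateway uses (sketch; field names are assumed).
#[derive(Debug, Clone)]
pub enum CacheBackend {
    /// Fast local cache for single-instance deployments.
    Memory,
    /// Shared distributed cache for multi-instance deployments.
    Redis { url: String },
}

impl CacheBackend {
    /// True when cache state is shared across gateway instances.
    pub fn is_distributed(&self) -> bool {
        matches!(self, CacheBackend::Redis { .. })
    }
}

fn main() {
    let backend = CacheBackend::Redis { url: "redis://localhost:6379".to_string() };
    assert!(backend.is_distributed());
    assert!(!CacheBackend::Memory.is_distributed());
}
```

Keeping the backend choice in a plain enum lets the manager select memory or Redis at construction time while the rest of the code stays backend-agnostic.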
§Cache Key Strategy
The system uses structured cache keys for different content types:
- Chat Completions:
chat:{model}:{messages_hash}
- Embeddings:
embedding:{model}:{input_hash}
- Image Generation:
image:{model}:{prompt_hash}
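The key patterns above can be sketched with the standard library's hasher; the real `CacheKeyBuilder` may use a different hash function, so treat `content_hash` here as an illustrative stand-in:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash request content into a short hex digest (sketch; the actual
/// hash used by CacheKeyBuilder is not specified in this module).
fn content_hash(content: &str) -> String {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Build a structured chat-completion key: chat:{model}:{messages_hash}.
fn chat_completion_key(model: &str, messages: &str) -> String {
    format!("chat:{}:{}", model, content_hash(messages))
}

fn main() {
    let key = chat_completion_key("gpt-4", r#"[{"role":"user","content":"hi"}]"#);
    assert!(key.starts_with("chat:gpt-4:"));
    // Identical requests hash to the same key, so they share a cache slot.
    assert_eq!(key, chat_completion_key("gpt-4", r#"[{"role":"user","content":"hi"}]"#));
}
```

Because the key embeds the model name, the same prompt sent to two different models never collides in the cache.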
§Usage
use ultrafast_gateway::gateway_caching::{CacheManager, CacheKeyBuilder};
use ultrafast_gateway::config::{CacheBackend, CacheConfig};
use std::time::Duration;

// Initialize the cache manager
let config = CacheConfig {
    enabled: true,
    backend: CacheBackend::Redis { url: "redis://localhost:6379".to_string() },
    ttl: Duration::from_secs(3600),
    max_size: 1000,
};
let cache_manager = CacheManager::new(config).await?;

// Cache a chat completion
let key = CacheKeyBuilder::chat_completion_key("gpt-4", &messages_hash);
cache_manager.set(&key, response_data, None).await;

// Retrieve from cache
if let Some(cached_response) = cache_manager.get(&key).await {
    return Ok(cached_response);
}
§Configuration
Cache behavior can be configured via CacheConfig:
[cache]
enabled = true
backend = "redis" # or "memory"
ttl = "1h"
max_size = 1000
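The `ttl = "1h"` value implies a human-readable duration format. How the gateway actually parses it is not shown in this module, but a minimal sketch of such a parser, supporting only single-unit suffixes, could look like:

```rust
use std::time::Duration;

/// Parse simple duration strings like "30s", "15m", "1h" (sketch; the
/// gateway's real TTL parsing is an assumption here).
fn parse_ttl(s: &str) -> Option<Duration> {
    let (num, unit) = s.split_at(s.len().checked_sub(1)?);
    let n: u64 = num.parse().ok()?;
    match unit {
        "s" => Some(Duration::from_secs(n)),
        "m" => Some(Duration::from_secs(n * 60)),
        "h" => Some(Duration::from_secs(n * 3600)),
        _ => None,
    }
}

fn main() {
    assert_eq!(parse_ttl("1h"), Some(Duration::from_secs(3600)));
    assert_eq!(parse_ttl("90s"), Some(Duration::from_secs(90)));
    assert_eq!(parse_ttl("oops"), None);
}
```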
§Performance Benefits
The caching system provides significant performance improvements:
- Reduced Latency: Cached responses served in <1ms
- Lower Costs: Fewer API calls to providers
- Improved Throughput: Higher request handling capacity
- Better User Experience: Faster response times
§Cache Invalidation
The system supports multiple invalidation strategies:
- TTL-based: Automatic expiration after configured time
- Manual Invalidation: Explicit cache entry removal
- Pattern-based: Remove entries matching patterns
- Full Clear: Clear entire cache (admin only)
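TTL-based invalidation hinges on each entry remembering when it was created. A minimal sketch of that check, assuming the real `CacheEntry` carries additional metadata beyond what is shown:

```rust
use std::time::{Duration, Instant};

/// A cached value with its creation time and TTL (sketch).
struct CacheEntry {
    data: Vec<u8>,
    created_at: Instant,
    ttl: Duration,
}

impl CacheEntry {
    fn new(data: Vec<u8>, ttl: Duration) -> Self {
        Self { data, created_at: Instant::now(), ttl }
    }

    /// TTL-based invalidation: an entry is stale once its age exceeds its TTL.
    fn is_expired(&self) -> bool {
        self.created_at.elapsed() > self.ttl
    }
}

fn main() {
    let fresh = CacheEntry::new(b"response".to_vec(), Duration::from_secs(3600));
    assert!(!fresh.is_expired());

    let stale = CacheEntry::new(b"response".to_vec(), Duration::ZERO);
    std::thread::sleep(Duration::from_millis(5));
    assert!(stale.is_expired());
    assert_eq!(fresh.data.len(), 8);
}
```

With Redis, expiration would instead be delegated to the server (e.g. via a key TTL), so this lazy check applies mainly to the in-memory backend.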
§Monitoring
Cache performance can be monitored through:
- Hit Rates: Cache effectiveness metrics
- Memory Usage: Current cache size and memory consumption
- Latency: Cache access times
- Error Rates: Cache operation failures
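The hit-rate metric above is a simple ratio of hits to total lookups. A sketch of how counters like these might be kept (the real `CacheStats` likely tracks more fields, such as memory usage and latency):

```rust
/// Running cache counters (sketch; field set is an assumption).
#[derive(Default)]
struct CacheStats {
    hits: u64,
    misses: u64,
}

impl CacheStats {
    /// Fraction of lookups served from cache; 0.0 when no lookups yet.
    fn hit_rate(&self) -> f64 {
        let total = self.hits + self.misses;
        if total == 0 { 0.0 } else { self.hits as f64 / total as f64 }
    }
}

fn main() {
    let stats = CacheStats { hits: 75, misses: 25 };
    assert_eq!(stats.hit_rate(), 0.75);
    // Guard against division by zero before any traffic arrives.
    assert_eq!(CacheStats::default().hit_rate(), 0.0);
}
```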
Structs§
- CacheEntry - A cache entry containing data and metadata.
- CacheKeyBuilder
- CacheManager - Cache manager for handling both Redis and in-memory caching.
- CacheStats