§Gateway Caching Module
This module provides comprehensive caching functionality for the Ultrafast Gateway, supporting both in-memory and Redis-based caching with automatic expiration and performance optimization.
§Overview
The caching system provides:
- Dual Backend Support: In-memory and Redis caching
- Automatic Expiration: TTL-based cache invalidation
- Fallback Mechanism: Redis to memory fallback on failures
- Performance Optimization: Reduces API calls and improves response times
- Cache Statistics: Hit rates, memory usage, and performance metrics
- Atomic Operations: Thread-safe cache operations
- Key Management: Structured cache key generation
§Cache Backends
§In-Memory Caching
Fast local caching suitable for single-instance deployments:
- Low Latency: Sub-millisecond access times
- Memory Efficient: Configurable size limits
- Automatic Cleanup: Expired entries removed automatically
- Thread Safe: Concurrent access support
§Redis Caching
Distributed caching for multi-instance deployments:
- Shared State: Cache shared across multiple instances
- Persistence: Optional data persistence
- High Availability: Redis cluster support
- Atomic Operations: Thread-safe distributed operations
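The two backends above could be modeled as a configuration enum; the following is a minimal sketch (the variant shapes are assumptions, mirroring the `CacheBackend::Redis { url }` form used in the usage example below):

```rust
/// Which cache backend the gateway uses (sketch; field names are assumed).
#[derive(Debug, Clone)]
pub enum CacheBackend {
    /// Fast local cache for single-instance deployments.
    Memory,
    /// Shared distributed cache for multi-instance deployments.
    Redis { url: String },
}

impl CacheBackend {
    /// True when cache state is shared across gateway instances.
    pub fn is_distributed(&self) -> bool {
        matches!(self, CacheBackend::Redis { .. })
    }
}

fn main() {
    let backend = CacheBackend::Redis { url: "redis://localhost:6379".to_string() };
    assert!(backend.is_distributed());
    assert!(!CacheBackend::Memory.is_distributed());
}
```

Keeping the backend choice in a plain enum lets the manager select memory or Redis at construction time while the rest of the code stays backend-agnostic.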
§Cache Key Strategy
The system uses structured cache keys for different content types:
- Chat Completions:
chat:{model}:{messages_hash}
- Embeddings:
embedding:{model}:{input_hash}
- Image Generation:
image:{model}:{prompt_hash}
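The key patterns above can be sketched with the standard library's hasher; the real `CacheKeyBuilder` may use a different hash function, so treat `content_hash` here as an illustrative stand-in:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hash request content into a short hex digest (sketch; the actual
/// hash used by CacheKeyBuilder is not specified in this module).
fn content_hash(content: &str) -> String {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Build a structured chat-completion key: chat:{model}:{messages_hash}.
fn chat_completion_key(model: &str, messages: &str) -> String {
    format!("chat:{}:{}", model, content_hash(messages))
}

fn main() {
    let key = chat_completion_key("gpt-4", r#"[{"role":"user","content":"hi"}]"#);
    assert!(key.starts_with("chat:gpt-4:"));
    // Identical requests hash to the same key, so they share a cache slot.
    assert_eq!(key, chat_completion_key("gpt-4", r#"[{"role":"user","content":"hi"}]"#));
}
```

Because the key embeds the model name, the same prompt sent to two different models never collides in the cache.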
§Usage
use ultrafast_gateway::gateway_caching::{CacheManager, CacheKeyBuilder};
use ultrafast_gateway::config::{CacheBackend, CacheConfig};
use std::time::Duration;

// Initialize the cache manager
let config = CacheConfig {
    enabled: true,
    backend: CacheBackend::Redis { url: "redis://localhost:6379".to_string() },
    ttl: Duration::from_secs(3600),
    max_size: 1000,
};
let cache_manager = CacheManager::new(config).await?;

// Cache a chat completion
let key = CacheKeyBuilder::chat_completion_key("gpt-4", &messages_hash);
cache_manager.set(&key, response_data, None).await;

// Retrieve from cache
if let Some(cached_response) = cache_manager.get(&key).await {
    return Ok(cached_response);
}
§Configuration
Cache behavior can be configured via CacheConfig:
[cache]
enabled = true
backend = "redis" # or "memory"
ttl = "1h"
max_size = 1000
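The `ttl = "1h"` value implies a human-readable duration format. How the gateway actually parses it is not shown in this module, but a minimal sketch of such a parser, supporting only single-unit suffixes, could look like:

```rust
use std::time::Duration;

/// Parse simple duration strings like "30s", "15m", "1h" (sketch; the
/// gateway's real TTL parsing is an assumption here).
fn parse_ttl(s: &str) -> Option<Duration> {
    let (num, unit) = s.split_at(s.len().checked_sub(1)?);
    let n: u64 = num.parse().ok()?;
    match unit {
        "s" => Some(Duration::from_secs(n)),
        "m" => Some(Duration::from_secs(n * 60)),
        "h" => Some(Duration::from_secs(n * 3600)),
        _ => None,
    }
}

fn main() {
    assert_eq!(parse_ttl("1h"), Some(Duration::from_secs(3600)));
    assert_eq!(parse_ttl("90s"), Some(Duration::from_secs(90)));
    assert_eq!(parse_ttl("oops"), None);
}
```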
§Performance Benefits
The caching system provides significant performance improvements:
- Reduced Latency: Cached responses served in <1ms
- Lower Costs: Fewer API calls to providers
- Improved Throughput: Higher request handling capacity
- Better User Experience: Faster response times
§Cache Invalidation
The system supports multiple invalidation strategies:
- TTL-based: Automatic expiration after configured time
- Manual Invalidation: Explicit cache entry removal
- Pattern-based: Remove entries matching patterns
- Full Clear: Clear entire cache (admin only)
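TTL-based invalidation hinges on each entry remembering when it was created. A minimal sketch of that check, assuming the real `CacheEntry` carries additional metadata beyond what is shown:

```rust
use std::time::{Duration, Instant};

/// A cached value with its creation time and TTL (sketch).
struct CacheEntry {
    data: Vec<u8>,
    created_at: Instant,
    ttl: Duration,
}

impl CacheEntry {
    fn new(data: Vec<u8>, ttl: Duration) -> Self {
        Self { data, created_at: Instant::now(), ttl }
    }

    /// TTL-based invalidation: an entry is stale once its age exceeds its TTL.
    fn is_expired(&self) -> bool {
        self.created_at.elapsed() > self.ttl
    }
}

fn main() {
    let fresh = CacheEntry::new(b"response".to_vec(), Duration::from_secs(3600));
    assert!(!fresh.is_expired());

    let stale = CacheEntry::new(b"response".to_vec(), Duration::ZERO);
    std::thread::sleep(Duration::from_millis(5));
    assert!(stale.is_expired());
    assert_eq!(fresh.data.len(), 8);
}
```

With Redis, expiration would instead be delegated to the server (e.g. via a key TTL), so this lazy check applies mainly to the in-memory backend.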
§Monitoring
Cache performance can be monitored through:
- Hit Rates: Cache effectiveness metrics
- Memory Usage: Current cache size and memory consumption
- Latency: Cache access times
- Error Rates: Cache operation failures
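The hit-rate metric above is a simple ratio of hits to total lookups. A sketch of how counters like these might be kept (the real `CacheStats` likely tracks more fields, such as memory usage and latency):

```rust
/// Running cache counters (sketch; field set is an assumption).
#[derive(Default)]
struct CacheStats {
    hits: u64,
    misses: u64,
}

impl CacheStats {
    /// Fraction of lookups served from cache; 0.0 when no lookups yet.
    fn hit_rate(&self) -> f64 {
        let total = self.hits + self.misses;
        if total == 0 { 0.0 } else { self.hits as f64 / total as f64 }
    }
}

fn main() {
    let stats = CacheStats { hits: 75, misses: 25 };
    assert_eq!(stats.hit_rate(), 0.75);
    // Guard against division by zero before any traffic arrives.
    assert_eq!(CacheStats::default().hit_rate(), 0.0);
}
```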
Structs§
- CacheEntry - A cache entry containing data and metadata.
- CacheKeyBuilder
- CacheManager - Cache manager for handling both Redis and in-memory caching.
- CacheStats