Multi-Tier Caching System for LLM Edge Agent
This module implements a high-performance multi-tier caching system with:
- L1: In-memory cache (Moka) - <1ms latency, TinyLFU eviction
- L2: Distributed cache (Redis) - 1-2ms latency, persistent across instances
§Architecture
Request → L1 Lookup (in-memory)
   ├─ HIT → Return (0.1ms)
   └─ MISS
        ↓
      L2 Lookup (Redis)
        ├─ HIT → Populate L1 + Return (2ms)
        └─ MISS
             ↓
           Provider Execution
             ↓
           Async Write → L1 + L2 (non-blocking)

§Performance Targets
- L1 Latency: <1ms (typically <100μs)
- L2 Latency: 1-2ms
- Overall Hit Rate: >50% (MVP), >70% (Beta)
- L1 TTL: 5 minutes (default)
- L2 TTL: 1 hour (default)
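The flow above can be sketched directly against the two backing stores. This is a minimal illustration, not the module's actual `CacheManager` API: the `moka` and `redis` crates are the ones named in this documentation, but the `String` value type, the 10,000-entry capacity, the `build_l1`/`lookup_or_execute`/`call_provider` names, and the Tokio runtime are assumptions, and method signatures can vary slightly between crate versions.

```rust
// Sketch of the read-through flow; assumes a Tokio runtime and the
// `moka` + `redis` async APIs. Names and value types are illustrative.
use std::time::Duration;

use moka::future::Cache as MokaCache;
use redis::AsyncCommands;

fn build_l1() -> MokaCache<String, String> {
    // L1: bounded in-memory cache (TinyLFU policy) with the default
    // 5-minute TTL listed above. The capacity is an arbitrary example.
    MokaCache::builder()
        .max_capacity(10_000)
        .time_to_live(Duration::from_secs(300))
        .build()
}

async fn lookup_or_execute(
    l1: &MokaCache<String, String>,
    l2: &mut redis::aio::MultiplexedConnection,
    key: &str,
) -> redis::RedisResult<String> {
    // L1 lookup (in-memory, sub-millisecond).
    if let Some(value) = l1.get(key).await {
        return Ok(value);
    }

    // L2 lookup (Redis, ~1-2ms). On a hit, backfill L1 before returning.
    let cached: Option<String> = l2.get(key).await?;
    if let Some(value) = cached {
        l1.insert(key.to_owned(), value.clone()).await;
        return Ok(value);
    }

    // Miss on both tiers: execute the provider call, then write both
    // tiers asynchronously so the caller is not blocked on cache writes.
    let value = call_provider(key).await;
    let (l1c, mut l2c, k, v) = (l1.clone(), l2.clone(), key.to_owned(), value.clone());
    tokio::spawn(async move {
        l1c.insert(k.clone(), v.clone()).await;
        // L2 TTL: 1 hour (the default listed above).
        let _: redis::RedisResult<()> = l2c.set_ex(k, v, 3600).await;
    });
    Ok(value)
}

async fn call_provider(_key: &str) -> String {
    // Stand-in for the real LLM provider execution.
    "provider response".to_owned()
}
```

Writing back through `tokio::spawn` is what keeps the final step non-blocking, matching the "Async Write" arrow in the diagram.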
Modules§
- key - Cache key generation using SHA-256 hashing (see the sketch after this list)
- l1 - L1 In-Memory Cache using Moka
- l2 - L2 Distributed Cache using Redis
- metrics - Cache metrics tracking and reporting
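The `key` module's exact inputs are not listed on this page, so the following is only a sketch of SHA-256 key derivation with the `sha2` crate; the hashed fields (`model`, `prompt`, `temperature`) and the `llm:` prefix are assumptions.

```rust
// Hypothetical cache-key derivation using the `sha2` crate.
use sha2::{Digest, Sha256};

fn cache_key(model: &str, prompt: &str, temperature: f32) -> String {
    let mut hasher = Sha256::new();
    hasher.update(model.as_bytes());
    hasher.update(prompt.as_bytes());
    hasher.update(temperature.to_le_bytes());
    // Hex-encode the 32-byte digest so it can double as a Redis key.
    format!("llm:{:x}", hasher.finalize())
}
```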
Structs§
- CacheHealthStatus - Cache health status (see the sketch after this list)
- CacheManager - Multi-tier cache orchestrator
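The fields of `CacheHealthStatus` are not documented on this page; the sketch below shows one plausible way a health snapshot could be assembled from atomic hit/miss counters, using the hit-rate targets above as the yardstick. All names here are hypothetical.

```rust
// Hypothetical counters and health snapshot; not the module's real types.
use std::sync::atomic::{AtomicU64, Ordering};

#[derive(Debug)]
struct HealthSnapshot {
    l1_hits: u64,
    l2_hits: u64,
    misses: u64,
    overall_hit_rate: f64,
}

struct Counters {
    l1_hits: AtomicU64,
    l2_hits: AtomicU64,
    misses: AtomicU64,
}

impl Counters {
    fn snapshot(&self) -> HealthSnapshot {
        let l1 = self.l1_hits.load(Ordering::Relaxed);
        let l2 = self.l2_hits.load(Ordering::Relaxed);
        let miss = self.misses.load(Ordering::Relaxed);
        let total = (l1 + l2 + miss).max(1);
        HealthSnapshot {
            l1_hits: l1,
            l2_hits: l2,
            misses: miss,
            // Compare against the targets above: >50% (MVP), >70% (Beta).
            overall_hit_rate: (l1 + l2) as f64 / total as f64,
        }
    }
}
```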
Enums§
- CacheLookupResult - Result of a cache lookup operation (see the sketch below)
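The actual variants of `CacheLookupResult` are likewise not shown on this page; the hypothetical enum below only illustrates how a caller might branch on which tier answered.

```rust
// Hypothetical shape of a lookup result; the real enum may differ.
enum LookupOutcome {
    L1Hit(String),
    L2Hit(String),
    Miss,
}

fn describe(outcome: &LookupOutcome) -> &'static str {
    match outcome {
        LookupOutcome::L1Hit(_) => "served from the in-memory cache (<1ms)",
        LookupOutcome::L2Hit(_) => "served from Redis, L1 backfilled (~2ms)",
        LookupOutcome::Miss => "forwarded to the provider",
    }
}
```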