Model Cache and Registry
Hierarchical caching system implementing Toyota Way Just-In-Time principles:
- Right amount: Cache only what’s needed for current inference
- Right time: Prefetch before access, evict after use
- Right place: L1 = hot, L2 = warm, L3 = cold storage
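The "right place" and "right time" principles can be illustrated with a small sketch: an entry moves one tier hotter when it is prefetched or accessed, and one tier colder when it is evicted. The `SketchTier` type and its `promote`/`demote` methods are hypothetical names chosen for illustration; they are not taken from this module's API.

```rust
/// Hypothetical tier labels mirroring the hot/warm/cold split.
/// Variants are declared hottest first, so the derived ordering
/// gives `Hot < Warm < Cold`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum SketchTier {
    Hot,  // L1: in memory
    Warm, // L2: on disk
    Cold, // L3: filesystem or network
}

impl SketchTier {
    /// Move one step hotter (assumed to happen on prefetch or access).
    /// `Hot` stays `Hot`.
    pub fn promote(self) -> SketchTier {
        match self {
            SketchTier::Cold => SketchTier::Warm,
            SketchTier::Warm | SketchTier::Hot => SketchTier::Hot,
        }
    }

    /// Move one step colder (assumed to happen on eviction).
    /// `Cold` stays `Cold`.
    pub fn demote(self) -> SketchTier {
        match self {
            SketchTier::Hot => SketchTier::Warm,
            SketchTier::Warm | SketchTier::Cold => SketchTier::Cold,
        }
    }
}
```

Stepping one tier at a time is a design assumption of this sketch; a real implementation might load cold data straight into the hot tier.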
§Cache Hierarchy
L0: Trueno Tensor Cache (SIMD-aligned, in-register)
L1: Hot Model Cache (heap-allocated, aligned buffers)
L2: Warm Disk Cache (memory-mapped files)
L3: Cold Storage (filesystem or network)
§References
- [Megiddo & Modha 2003] ARC: A Self-Tuning, Low Overhead Replacement Cache
Structs§
- AccessStats - Access statistics for cache entries
- CacheConfig - Cache configuration
- CacheEntry - Cache entry with comprehensive metadata
- CacheMetadata - Cache entry metadata
- CacheStats - Cache statistics
- MemoryBudget - Memory budget enforcement (Heijunka principle)
- ModelInfo - Model information for registry
- ModelRegistry - Model registry for bundled and dynamic models
- ModelType - Model type identifier
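As a rough illustration of what memory budget enforcement could look like, the sketch below tracks reserved bytes against a fixed limit and refuses reservations that would exceed it. `SketchBudget` and all of its methods are hypothetical; the fields and methods of the memory budget struct listed above are not documented here.

```rust
/// Hypothetical byte budget; not this module's actual API.
#[derive(Debug)]
pub struct SketchBudget {
    limit_bytes: usize,
    used_bytes: usize,
}

impl SketchBudget {
    /// Create a budget with nothing reserved yet.
    pub fn new(limit_bytes: usize) -> Self {
        SketchBudget { limit_bytes, used_bytes: 0 }
    }

    /// Reserve `bytes` if they fit within the limit.
    /// Returns `false` and leaves the budget unchanged otherwise.
    pub fn try_reserve(&mut self, bytes: usize) -> bool {
        match self.used_bytes.checked_add(bytes) {
            Some(total) if total <= self.limit_bytes => {
                self.used_bytes = total;
                true
            }
            _ => false,
        }
    }

    /// Return `bytes` to the budget, for example after an eviction.
    /// Saturates at zero so an over-release cannot underflow.
    pub fn release(&mut self, bytes: usize) {
        self.used_bytes = self.used_bytes.saturating_sub(bytes);
    }

    /// Bytes still available for new reservations.
    pub fn available(&self) -> usize {
        self.limit_bytes - self.used_bytes
    }
}
```

A caller would typically attempt `try_reserve` before loading a model into the hot tier and evict entries when it returns `false`.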
Enums§
- CacheData - Cached data variants
- CacheTier - Cache tier classification
- EvictionPolicy - Eviction policy selection (configurable per deployment)
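A configurable eviction policy can be pictured as an enum that selects how the next victim is chosen. The sketch below uses two simple policies, LRU and LFU, purely for illustration; it does not implement ARC from the reference above, and the variants of the real eviction policy enum are not shown on this page. `SketchPolicy`, `SketchEntry`, and `pick_victim` are hypothetical names.

```rust
/// Hypothetical policy selector (the real enum's variants are unknown).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SketchPolicy {
    /// Evict the entry with the oldest access.
    Lru,
    /// Evict the entry with the fewest hits.
    Lfu,
}

/// Hypothetical per-entry bookkeeping used to choose a victim.
#[derive(Debug, Clone)]
pub struct SketchEntry {
    pub key: String,
    /// Logical timestamp of the most recent access (larger = more recent).
    pub last_access: u64,
    /// Number of times the entry has been read.
    pub hits: u64,
}

/// Pick the entry to evict under `policy`, or `None` if there are no entries.
/// On ties, the earliest entry in the slice is chosen.
pub fn pick_victim(entries: &[SketchEntry], policy: SketchPolicy) -> Option<&SketchEntry> {
    match policy {
        SketchPolicy::Lru => entries.iter().min_by_key(|e| e.last_access),
        SketchPolicy::Lfu => entries.iter().min_by_key(|e| e.hits),
    }
}
```

The linear scan keeps the sketch short; a production cache would more likely maintain an ordered structure so that victim selection does not touch every entry.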