Module cache

Model Cache and Registry

Hierarchical caching system implementing Toyota Way Just-In-Time principles:

Right amount: Cache only what’s needed for current inference
Right time: Prefetch before access, evict after use
Right place: L1 = hot, L2 = warm, L3 = cold storage
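
The "right time" principle can be illustrated with a minimal sketch. The names here (`JitCache`, `prefetch`, `use_once`) are hypothetical and are not this module's API; the sketch only assumes that an entry is loaded ahead of its first access and dropped once it has been consumed.

```rust
use std::collections::HashMap;

/// Hypothetical just-in-time cache: entries are loaded before use
/// and removed as soon as they have been consumed.
struct JitCache {
    entries: HashMap<String, Vec<u8>>,
}

impl JitCache {
    fn new() -> Self {
        JitCache { entries: HashMap::new() }
    }

    /// Load `key` ahead of access. The loader runs only if the key is absent.
    fn prefetch(&mut self, key: &str, load: impl FnOnce() -> Vec<u8>) {
        self.entries.entry(key.to_string()).or_insert_with(load);
    }

    /// Hand the cached bytes to `f`, then drop them (evict after use).
    /// Returns `None` if the key was never prefetched.
    fn use_once<R>(&mut self, key: &str, f: impl FnOnce(&[u8]) -> R) -> Option<R> {
        let data = self.entries.remove(key)?;
        Some(f(&data))
    }

    /// Number of entries currently held.
    fn len(&self) -> usize {
        self.entries.len()
    }
}
```

A caller would invoke `prefetch` shortly before inference and `use_once` during it, so the cache holds only what the current step needs.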

§Cache Hierarchy

L0: Trueno Tensor Cache (SIMD-aligned, in-register)
L1: Hot Model Cache (heap-allocated, aligned buffers)
L2: Warm Disk Cache (memory-mapped files)
L3: Cold Storage (filesystem or network)
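
As a rough illustration of how the four tiers relate, the sketch below models them as an ordered enum with promotion and demotion, plus a lookup that probes the fastest tier first. `Tier`, `promote`, `demote`, and `lookup` are hypothetical names chosen for this example; the module's own `CacheTier` enum may be shaped differently.

```rust
/// Hypothetical tier labels mirroring the hierarchy above
/// (not the module's `CacheTier`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    L0Tensor,
    L1Hot,
    L2Warm,
    L3Cold,
}

impl Tier {
    /// Next faster tier, or `None` if already at the top.
    fn promote(self) -> Option<Tier> {
        match self {
            Tier::L0Tensor => None,
            Tier::L1Hot => Some(Tier::L0Tensor),
            Tier::L2Warm => Some(Tier::L1Hot),
            Tier::L3Cold => Some(Tier::L2Warm),
        }
    }

    /// Next slower tier, or `None` if already at the bottom.
    fn demote(self) -> Option<Tier> {
        match self {
            Tier::L0Tensor => Some(Tier::L1Hot),
            Tier::L1Hot => Some(Tier::L2Warm),
            Tier::L2Warm => Some(Tier::L3Cold),
            Tier::L3Cold => None,
        }
    }
}

/// Probe tiers in the order given (assumed fastest first) and
/// return the first tier that holds `key`.
fn lookup(tiers: &[(Tier, Vec<&str>)], key: &str) -> Option<Tier> {
    tiers
        .iter()
        .find(|(_, keys)| keys.iter().any(|k| *k == key))
        .map(|(tier, _)| *tier)
}
```

A hit in a slower tier would typically be followed by `promote`, and memory pressure by `demote`; that policy is an assumption of this sketch, not something the description above states.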

§References

[Megiddo & Modha 2003] ARC: A Self-Tuning, Low Overhead Replacement Cache

Structs§

AccessStats: Access statistics for cache entries
CacheConfig: Cache configuration
CacheEntry: Cache entry with comprehensive metadata
CacheMetadata: Cache entry metadata
CacheStats: Cache statistics
MemoryBudget: Memory budget enforcement (Heijunka principle)
ModelInfo: Model information for registry
ModelRegistry: Model registry for bundled and dynamic models
ModelType: Model type identifier
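
The fields and methods of `MemoryBudget` are not shown on this page. As a minimal sketch of what budget enforcement could look like, the hypothetical `Budget` type below admits a reservation only if it fits under a fixed byte limit; none of these names are taken from the module.

```rust
/// Hypothetical byte budget (not the module's `MemoryBudget`).
/// Invariant: `used_bytes <= limit_bytes`.
#[derive(Debug)]
struct Budget {
    limit_bytes: usize,
    used_bytes: usize,
}

impl Budget {
    fn new(limit_bytes: usize) -> Self {
        Budget { limit_bytes, used_bytes: 0 }
    }

    /// Reserve `bytes` if they fit under the limit; otherwise leave
    /// the budget unchanged and return `false`.
    fn try_reserve(&mut self, bytes: usize) -> bool {
        match self.used_bytes.checked_add(bytes) {
            Some(total) if total <= self.limit_bytes => {
                self.used_bytes = total;
                true
            }
            _ => false,
        }
    }

    /// Return `bytes` to the budget, clamping at zero.
    fn release(&mut self, bytes: usize) {
        self.used_bytes = self.used_bytes.saturating_sub(bytes);
    }

    /// Bytes still available for new reservations.
    fn available(&self) -> usize {
        self.limit_bytes - self.used_bytes
    }
}
```

In this sketch a failed `try_reserve` is the signal to evict something before retrying, which keeps memory use level instead of letting it spike.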

Enums§

CacheData: Cached data variants
CacheTier: Cache tier classification
EvictionPolicy: Eviction policy selection (configurable per deployment)
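
The variants of `EvictionPolicy` are not listed on this page. To show what a selectable policy might look like, the sketch below chooses an eviction victim under two well-known strategies, least recently used (LRU) and least frequently used (LFU). `Policy`, `Stats`, and `pick_victim` are hypothetical names, and the sketch does not implement the ARC algorithm cited under References.

```rust
/// Hypothetical policy selector (not the module's `EvictionPolicy`).
#[derive(Debug, Clone, Copy)]
enum Policy {
    /// Evict the entry with the oldest access timestamp.
    Lru,
    /// Evict the entry with the fewest recorded hits.
    Lfu,
}

/// Hypothetical per-entry access statistics.
#[derive(Debug, Clone)]
struct Stats {
    key: String,
    last_access: u64, // logical timestamp; larger means more recent
    hits: u64,
}

/// Return the key of the entry to evict, or `None` if there are no entries.
fn pick_victim(policy: Policy, entries: &[Stats]) -> Option<&str> {
    let victim = match policy {
        Policy::Lru => entries.iter().min_by_key(|e| e.last_access),
        Policy::Lfu => entries.iter().min_by_key(|e| e.hits),
    };
    victim.map(|e| e.key.as_str())
}
```

Keeping the policy as a plain enum value means a deployment can switch strategies through configuration without changing the eviction call site.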