Module cache

Model Cache and Registry

Hierarchical caching system implementing Toyota Way Just-In-Time principles:

  • Right amount: Cache only what’s needed for current inference
  • Right time: Prefetch before access, evict after use
  • Right place: L1 = hot, L2 = warm, L3 = cold storage

§Cache Hierarchy

L0: Trueno Tensor Cache (SIMD-aligned, in-register)
L1: Hot Model Cache (heap-allocated, aligned buffers)
L2: Warm Disk Cache (memory-mapped files)
L3: Cold Storage (filesystem or network)
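The lookup order implied by this hierarchy can be sketched as a probe from the fastest tier to the slowest, copying an entry into the hot tier when it is found lower down. This is a minimal illustration only: `Tier`, `TieredCache`, and their methods are hypothetical names, not this module's `CacheTier` or `CacheEntry` API, and plain `HashMap`s stand in for the aligned buffers and memory-mapped files.

```rust
use std::collections::HashMap;

/// Hypothetical tier labels mirroring the hierarchy above
/// (an assumption, not the crate's `CacheTier` definition).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Tier {
    L0Tensor,
    L1Hot,
    L2Warm,
    L3Cold,
}

/// Probe order: fastest tier first.
const ORDER: [Tier; 4] = [Tier::L0Tensor, Tier::L1Hot, Tier::L2Warm, Tier::L3Cold];

/// Toy hierarchy: one in-memory map per tier.
struct TieredCache {
    tiers: [HashMap<String, Vec<u8>>; 4],
}

impl TieredCache {
    fn new() -> Self {
        Self {
            tiers: [HashMap::new(), HashMap::new(), HashMap::new(), HashMap::new()],
        }
    }

    fn insert(&mut self, tier: Tier, key: &str, data: Vec<u8>) {
        self.tiers[tier as usize].insert(key.to_string(), data);
    }

    /// Returns the tier where `key` was found along with its data.
    /// A hit in L2 or L3 also copies the entry into L1 (promotion).
    fn get(&mut self, key: &str) -> Option<(Tier, Vec<u8>)> {
        for tier in ORDER {
            if let Some(data) = self.tiers[tier as usize].get(key).cloned() {
                if tier > Tier::L1Hot {
                    self.tiers[Tier::L1Hot as usize].insert(key.to_string(), data.clone());
                }
                return Some((tier, data));
            }
        }
        None
    }
}
```

With this sketch, the first `get` for an entry stored only in `Tier::L3Cold` reports an L3 hit, and the next `get` for the same key is served from `Tier::L1Hot`.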

§References

  • [Megiddo & Modha 2003] ARC: A Self-Tuning, Low Overhead Replacement Cache

Structs§

AccessStats
Access statistics for cache entries
CacheConfig
Cache configuration
CacheEntry
Cache entry with comprehensive metadata
CacheMetadata
Cache entry metadata
CacheStats
Cache statistics
MemoryBudget
Memory budget enforcement (Heijunka principle)
ModelInfo
Model information for registry
ModelRegistry
Model registry for bundled and dynamic models
ModelType
Model type identifier
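As a rough illustration of what memory budget enforcement might look like, the sketch below tracks bytes in use against a fixed limit and reports the shortfall when a reservation does not fit. The `Budget` type, its fields, and its methods are assumptions made for this example; they are not the signatures of `MemoryBudget`.

```rust
/// Hypothetical byte budget (illustrative only, not the crate's `MemoryBudget`).
#[derive(Debug)]
struct Budget {
    limit_bytes: usize,
    used_bytes: usize,
}

impl Budget {
    fn new(limit_bytes: usize) -> Self {
        Self { limit_bytes, used_bytes: 0 }
    }

    /// Reserves `bytes` if they fit within the limit.
    /// Otherwise returns the number of bytes that must be freed first.
    fn try_reserve(&mut self, bytes: usize) -> Result<(), usize> {
        let needed = self.used_bytes.saturating_add(bytes);
        if needed <= self.limit_bytes {
            self.used_bytes = needed;
            Ok(())
        } else {
            Err(needed - self.limit_bytes)
        }
    }

    /// Returns `bytes` to the budget, for example after an eviction.
    fn release(&mut self, bytes: usize) {
        self.used_bytes = self.used_bytes.saturating_sub(bytes);
    }
}
```

Returning the shortfall in the error lets a caller decide how much to evict before retrying, rather than evicting blindly.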

Enums§

CacheData
Cached data variants
CacheTier
Cache tier classification
EvictionPolicy
Eviction policy selection (configurable per deployment)
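To show how a configurable eviction policy could drive victim selection, here is a small sketch with two stand-in policies. The variants `Lru` and `Lfu`, the `Stats` struct, and `pick_victim` are hypothetical; the variants of `EvictionPolicy` are not listed on this page, and this sketch does not implement ARC from the reference above.

```rust
/// Hypothetical policy variants (the crate's `EvictionPolicy` may differ).
#[derive(Debug, Clone, Copy)]
enum Policy {
    /// Evict the entry with the oldest access.
    Lru,
    /// Evict the entry with the fewest hits.
    Lfu,
}

/// Hypothetical per-entry counters used to rank eviction candidates.
#[derive(Debug, Clone)]
struct Stats {
    key: &'static str,
    last_access: u64, // logical timestamp of the most recent access
    hits: u64,        // total number of accesses
}

/// Picks the key to evict under `policy`, or `None` if there are no entries.
fn pick_victim(policy: Policy, entries: &[Stats]) -> Option<&'static str> {
    let victim = match policy {
        Policy::Lru => entries.iter().min_by_key(|e| e.last_access),
        Policy::Lfu => entries.iter().min_by_key(|e| e.hits),
    };
    victim.map(|e| e.key)
}
```

Given one entry that was accessed recently but rarely and another that was accessed long ago but often, `Policy::Lru` selects the latter and `Policy::Lfu` selects the former.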