# LHD (Learning Hit Density)
## Goal
Maximize value per byte by using a learned estimate of hit probability over
remaining lifetime, especially when object sizes vary substantially.
## Core Idea
LHD estimates each object's future hit density (roughly expected future hits per
byte over a horizon) from observed reuse behavior. Eviction prefers entries with
the lowest estimated hit density.
Compared to simple size-aware scoring, LHD adapts estimates from workload data
instead of relying on fixed formulas.
## Core Data Structures (Typical)
- Hash index `K -> EntryMeta`
- Size-aware entry metadata (`size`, age/timestamp, access stats)
- Lightweight learned model/state (bucketized statistics tables)
- Victim selector keyed by estimated hit density
## Complexity & Overhead
- Higher implementation complexity than GDS/GDSF
- Additional memory for model tables and per-entry predictors
- Runtime cost depends on estimator complexity (can be O(1) with table lookups)
## Notes For CacheKit
- High-value candidate for byte hit rate optimization and heterogeneous object sizes.
- Start with coarse bucketed estimators to keep hot paths predictable.
- Benchmark against `GDS`, `GDSF`, and `Hyperbolic` on mixed-size traces.
## References
- Beckmann et al. (2018): “LHD: Improving Cache Hit Rate by Maximizing Hit Density”, USENIX OSDI 2018 paper with follow-up implementations.