cachekit 0.2.0-alpha

# Cache Replacement Policies

This document summarizes common cache replacement (eviction) policies, their tradeoffs, and when to use (or avoid) each. It’s written as a practical companion to `docs/design.md`.

Implementation notes live in `docs/policies/README.md`.

Terminology used below:
- **Admission**: whether an item is allowed into cache at all (some “policies” combine admission + eviction).
- **Eviction**: which resident item to remove when making space.
- **Scan pollution**: one-time accesses (e.g., large scans) pushing out genuinely hot items.
- **Metadata cost**: extra per-entry memory and CPU needed to maintain the policy.
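
To make "scan pollution" concrete, here is a minimal sketch that counts hits for a plain LRU on two traces. The helper name `lru_hits`, the key names, and the cache size are illustrative assumptions, not part of any library.

```python
from collections import OrderedDict

def lru_hits(trace, capacity):
    """Count hits for a plain LRU of the given capacity on a list of keys."""
    cache, hits = OrderedDict(), 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # refresh recency on a hit
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used key
            cache[key] = True
    return hits

hot = ["h1", "h2", "h3"]
scan = [f"s{i}" for i in range(10)]        # ten keys that are touched only once

print(lru_hits(hot * 4, capacity=4))                   # 9
print(lru_hits(hot * 2 + scan + hot * 2, capacity=4))  # 6
```

Both traces touch the hot keys twelve times, but the scan flushes them out of the cache, so three of those accesses turn back into misses.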

## How To Choose (Quick Guidance)

Pick based on workload first:
- **Strong temporal locality (hot keys repeat quickly)**: `LRU` / `Clock` usually works.
- **One-hit-wonder dominated / scan-heavy**: prefer scan-resistant policies like `LRU-K`, `2Q`, `ARC`.
- **Frequency matters more than recency** (hot keys repeat but with long gaps): `LFU` (with aging) or hybrids.
- **Need low overhead & simple**: `FIFO`, `Random`, `Clock`.
- **Need adaptive across shifting workloads**: `ARC` (or related adaptive policies).

If you can only implement one “general purpose” policy for mixed workloads, `ARC`-style adaptivity or `LRU-K`/`2Q`-style scan resistance usually beats plain `LRU`, at the cost of more metadata and complexity.

## Policy Catalog (Summaries)

### OPT / MIN (Belady’s Optimal)

**Idea**: evict the item whose *next* use is farthest in the future.

- **Pros**: best possible hit rate for a known future; gold-standard for evaluation.
- **Cons**: requires knowing the future; not implementable in real systems (except offline traces/simulators).
- **Use when**: benchmarking and comparing other policies on recorded traces.
- **Avoid when**: building a real cache.
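
As a benchmarking aid, OPT can be simulated offline on a recorded trace. The sketch below assumes the trace is a list of hashable keys and that `capacity >= 1`; the function name `simulate_opt` is hypothetical. It also assumes every miss is inserted (no bypass).

```python
def simulate_opt(trace, capacity):
    """Return the number of hits Belady's OPT achieves on `trace`."""
    # For each position, precompute the index of the next access to the same key.
    next_use = [float("inf")] * len(trace)
    last_seen = {}
    for i in range(len(trace) - 1, -1, -1):
        next_use[i] = last_seen.get(trace[i], float("inf"))
        last_seen[trace[i]] = i

    resident = {}  # key -> index of that key's next use
    hits = 0
    for i, key in enumerate(trace):
        if key in resident:
            hits += 1
        elif len(resident) >= capacity:
            # Evict the resident key whose next use is farthest in the future.
            victim = max(resident, key=resident.get)
            del resident[victim]
        resident[key] = next_use[i]
    return hits
```

Comparing another policy's hit count on the same trace against `simulate_opt` gives a rough idea of how much headroom is left.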

### Random

**Idea**: evict a uniformly random resident item.

- **Pros**: trivial; very low overhead; no access pattern reliably defeats it, so it can beat `LRU` on adversarial patterns such as a loop slightly larger than the cache.
- **Cons**: ignores locality; unstable hit rate; can evict hot items.
- **Use when**: you need the simplest possible eviction with constant overhead.
- **Avoid when**: locality exists and you can afford minimal tracking.

### FIFO (First-In, First-Out)

**Idea**: evict the oldest inserted item.

- **Pros**: simple; O(1); predictable; low metadata.
- **Cons**: ignores reuse; can be very poor when early inserts stay hot.
- **Use when**: insert order correlates with staleness (e.g., streaming-ish workloads), or you want predictability.
- **Avoid when**: strong temporal locality; “old but hot” keys are common.
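
A minimal FIFO sketch, assuming a single-threaded cache built on `collections.OrderedDict`; the class name `FIFOCache` is illustrative.

```python
from collections import OrderedDict

class FIFOCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # insertion order, oldest first

    def get(self, key, default=None):
        return self.data.get(key, default)  # hits never change eviction order

    def put(self, key, value):
        if key not in self.data and len(self.data) >= self.capacity:
            self.data.popitem(last=False)   # evict the oldest insertion
        self.data[key] = value              # overwriting keeps the original position
```

Because `get` does no bookkeeping, a key that is read constantly is still evicted once it becomes the oldest insertion.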

### LIFO / FILO (Last-In, First-Out)

**Idea**: evict the most recently inserted item.

- **Pros**: can work for some cyclic/scan-like patterns where newest items are least likely to be reused.
- **Cons**: counterproductive under temporal locality; uncommon in general-purpose caches.
- **Use when**: you have evidence newest items are least reusable.
- **Avoid when**: typical request caches with recency locality.

### LRU (Least Recently Used)

**Idea**: evict the item not accessed for the longest time.

- **Pros**: strong default for temporal locality; intuitive; stable.
- **Cons**: vulnerable to scan pollution; maintaining exact LRU can be costly under high concurrency.
- **Use when**: workloads have strong recency locality; you can tolerate metadata and updates on every access.
- **Avoid when**: large sequential scans are common; cache is highly contended and strict ordering is too expensive.
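
A minimal LRU sketch along the same lines, again assuming a single-threaded `OrderedDict`-backed cache with the illustrative name `LRUCache`.

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # least recently used first

    def get(self, key, default=None):
        if key not in self.data:
            return default
        self.data.move_to_end(key)          # every hit mutates the ordering
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        elif len(self.data) >= self.capacity:
            self.data.popitem(last=False)   # evict the least recently used key
        self.data[key] = value
```

The `move_to_end` call inside `get` is the per-hit write that makes strict LRU expensive to share between threads.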

### MRU (Most Recently Used)

**Idea**: evict the most recently accessed item.

- **Pros**: can outperform LRU for some “looping scan” patterns where the just-touched item won’t be reused soon.
- **Cons**: performs poorly for typical temporal locality.
- **Use when**: known cyclic access where the most-recently-used item is least likely to be reused next.
- **Avoid when**: you’re unsure; MRU is rarely a safe default.

### Second-Chance / Clock

**Idea**: approximate LRU using a circular list and a referenced bit; give items a “second chance”.

- **Pros**: O(1) amortized; lower overhead than strict LRU; good concurrency properties in practice.
- **Cons**: approximation quality depends on implementation; still suffers from scan pollution in many forms.
- **Use when**: you want LRU-like behavior with cheaper metadata and fewer writes.
- **Avoid when**: you specifically need scan resistance or frequency awareness.
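
One possible shape for Clock is sketched below, assuming `capacity >= 1` and a variant in which new entries start with the referenced bit cleared (some variants set it on insert). The class name `ClockCache` is illustrative.

```python
class ClockCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = []   # each slot is [key, value, referenced]
        self.index = {}   # key -> slot position
        self.hand = 0

    def get(self, key, default=None):
        pos = self.index.get(key)
        if pos is None:
            return default
        self.slots[pos][2] = True   # a hit only sets a bit; nothing is reordered
        return self.slots[pos][1]

    def put(self, key, value):
        pos = self.index.get(key)
        if pos is not None:
            self.slots[pos][1] = value
            self.slots[pos][2] = True
            return
        if len(self.slots) < self.capacity:
            self.index[key] = len(self.slots)
            self.slots.append([key, value, False])
            return
        # Sweep: clear referenced bits until an unreferenced slot is found.
        while self.slots[self.hand][2]:
            self.slots[self.hand][2] = False
            self.hand = (self.hand + 1) % self.capacity
        del self.index[self.slots[self.hand][0]]
        self.slots[self.hand] = [key, value, False]
        self.index[key] = self.hand
        self.hand = (self.hand + 1) % self.capacity
```

Compared with strict LRU, the hit path touches a single flag, which is why Clock tends to be friendlier under contention.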

### NRU (Not Recently Used)

**Idea**: evict an item whose “referenced” bit is not set; bits are periodically cleared (epochs).

- **Pros**: very low overhead; works well when you can batch/reset reference bits cheaply.
- **Cons**: coarse recency signal; behavior depends heavily on epoch length.
- **Use when**: you already have hardware/software reference bits or can cheaply track “touched this epoch”.
- **Avoid when**: you need tight recency ordering.

### LFU (Least Frequently Used)

**Idea**: evict the item with the smallest access count.

- **Pros**: strong when popularity is stable and skewed; resists scan pollution better than LRU.
- **Cons**: “cache pollution by history” (once-hot items stick around); needs **aging/decay** to adapt; counters add overhead.
- **Use when**: hot items remain hot for long periods; frequency is the primary predictor.
- **Avoid when**: the hot set shifts quickly; you can’t implement decay/aging safely.

### MFU (Most Frequently Used)

**Idea**: evict the most frequently used item.

- **Pros**: can work in specific “burst then never again” patterns (items that were heavily used are now “done”).
- **Cons**: usually the opposite of what you want; not a general-purpose choice.
- **Use when**: you have evidence “most frequent so far” implies “least likely to be reused now”.
- **Avoid when**: almost always.

### Aging / Decayed LFU (LFU with time decay)

**Idea**: combine frequency with time so old counts lose influence (e.g., periodic halving, exponential decay).

- **Pros**: avoids LFU’s “stale hot” problem; adapts to changing popularity.
- **Cons**: more complexity; decay schedule can be tricky; still more metadata than LRU/Clock.
- **Use when**: you want frequency but with adaptivity to phase changes.
- **Avoid when**: extremely latency-sensitive hot paths where counter maintenance dominates.
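
The periodic-halving idea can be sketched as follows. `DecayedLFUCache` and its `decay_every` parameter are hypothetical names, and the O(n) victim scan and O(n) halving pass are simplifications that a production cache would likely replace.

```python
class DecayedLFUCache:
    def __init__(self, capacity, decay_every=100):
        self.capacity = capacity
        self.decay_every = decay_every  # assumed knob: accesses between halvings
        self.values = {}
        self.counts = {}
        self.ops = 0

    def _touch(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        self.ops += 1
        if self.ops % self.decay_every == 0:
            # Periodic halving: old popularity fades geometrically.
            for k in self.counts:
                self.counts[k] //= 2

    def get(self, key, default=None):
        if key not in self.values:
            return default
        self._touch(key)
        return self.values[key]

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            # Evict the key with the smallest (decayed) count.
            victim = min(self.counts, key=self.counts.get)
            del self.values[victim], self.counts[victim]
        self.values[key] = value
        self._touch(key)
```

Setting `decay_every` very high approximates plain LFU; setting it low makes the policy behave more like a recency-based one.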

### LRU-K

**Idea**: evict the item whose K-th most recent access is oldest (e.g., with `K=2`, items are compared by their 2nd most recent touch); items with fewer than K accesses are evicted first.

- **Pros**: filters one-time accesses; much more scan-resistant than LRU.
- **Cons**: more metadata per entry; more expensive updates (a straightforward implementation keeps a priority queue, O(log n) per access); needs careful implementation to approach O(1) in practice.
- **Use when**: mixed point-lookups + scans; DB buffer pools; workloads with many one-hit-wonders.
- **Avoid when**: you need the simplest possible policy or can’t afford per-entry history.
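
A simplified LRU-K sketch follows, assuming a logical clock and an O(n) victim scan. It omits the retained history for evicted keys and the correlated-reference period described in the LRU-K paper, and the class name `LRUKCache` is illustrative.

```python
from collections import deque
import itertools

class LRUKCache:
    def __init__(self, capacity, k=2):
        self.capacity = capacity
        self.k = k
        self.values = {}
        self.history = {}               # key -> deque of the last k access ticks
        self.clock = itertools.count()  # logical time, starts at 0

    def _touch(self, key):
        self.history.setdefault(key, deque(maxlen=self.k)).append(next(self.clock))

    def _priority(self, key):
        h = self.history[key]
        # Fewer than k accesses: treat the k-th access as infinitely old, so
        # such keys are evicted first (oldest last access breaks ties).
        kth = h[0] if len(h) == self.k else -1
        return (kth, h[-1])

    def get(self, key, default=None):
        if key not in self.values:
            return default
        self._touch(key)
        return self.values[key]

    def put(self, key, value):
        if key not in self.values and len(self.values) >= self.capacity:
            victim = min(self.values, key=self._priority)
            del self.values[victim], self.history[victim]
        self.values[key] = value
        self._touch(key)
```

With `k=2`, a key that has been touched twice outlives a newer key that has been touched only once, which is the behavior that filters out scans.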

### 2Q

**Idea**: use two queues: a short “probation” FIFO for new items and a main LRU for items that are accessed again.

- **Pros**: simple scan resistance; cheaper than LRU-K; widely used pattern.
- **Cons**: requires tuning queue sizes; still mainly recency-based once admitted to main queue.
- **Use when**: you want an easy scan-resistant upgrade over LRU.
- **Avoid when**: you can’t tolerate tuning knobs or workload changes dramatically.
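
The two-queue structure described above can be sketched as below. This is the simplified variant (promotion on a second access while still in probation); the full 2Q algorithm in the paper also keeps a ghost queue of recently evicted keys. `TwoQueueCache`, `probation_size`, and `main_size` are illustrative names.

```python
from collections import OrderedDict

class TwoQueueCache:
    def __init__(self, probation_size, main_size):
        self.probation_size = probation_size
        self.main_size = main_size
        self.probation = OrderedDict()  # FIFO of keys seen once
        self.main = OrderedDict()       # LRU of keys seen at least twice

    def _insert_main(self, key, value):
        if len(self.main) >= self.main_size:
            self.main.popitem(last=False)
        self.main[key] = value

    def get(self, key, default=None):
        if key in self.main:
            self.main.move_to_end(key)
            return self.main[key]
        if key in self.probation:
            value = self.probation.pop(key)  # second access: promote
            self._insert_main(key, value)
            return value
        return default

    def put(self, key, value):
        if key in self.main:
            self.main[key] = value
            self.main.move_to_end(key)
        elif key in self.probation:
            self.probation[key] = value      # update in place, no promotion
        else:
            if len(self.probation) >= self.probation_size:
                self.probation.popitem(last=False)
            self.probation[key] = value
```

A scan only churns the probation queue, so keys already promoted to the main queue are unaffected by it.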

### SLRU (Segmented LRU)

**Idea**: split LRU into segments (e.g., probationary + protected); promotion requires reuse.

- **Pros**: reduces scan pollution; simple; common in practice.
- **Cons**: needs segment sizing; not as adaptive as ARC-style approaches.
- **Use when**: you want low-complexity scan resistance with LRU semantics.
- **Avoid when**: workload shifts require continual retuning.
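
A two-segment SLRU sketch, under the assumption that overflow from the protected segment is demoted back to probation rather than dropped. The names `SLRUCache`, `probation_size`, and `protected_size` are illustrative.

```python
from collections import OrderedDict

class SLRUCache:
    def __init__(self, probation_size, protected_size):
        self.probation_size = probation_size
        self.protected_size = protected_size
        self.probation = OrderedDict()  # LRU order, oldest first
        self.protected = OrderedDict()  # LRU order, oldest first

    def _insert_probation(self, key, value):
        if len(self.probation) >= self.probation_size:
            self.probation.popitem(last=False)  # the only place data leaves the cache
        self.probation[key] = value

    def get(self, key, default=None):
        if key in self.protected:
            self.protected.move_to_end(key)
            return self.protected[key]
        if key in self.probation:
            value = self.probation.pop(key)     # reuse: promote to protected
            self.protected[key] = value
            if len(self.protected) > self.protected_size:
                # Demote the protected LRU entry instead of evicting it outright.
                old_key, old_value = self.protected.popitem(last=False)
                self._insert_probation(old_key, old_value)
            return value
        return default

    def put(self, key, value):
        if key in self.protected:
            self.protected[key] = value
            self.protected.move_to_end(key)
        elif key in self.probation:
            self.probation[key] = value
            self.probation.move_to_end(key)
        else:
            self._insert_probation(key, value)
```

The demotion step gives a formerly hot key one more pass through probation before it can be evicted.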

### ARC (Adaptive Replacement Cache)

**Idea**: adaptively balances recency vs frequency using two LRU lists plus “ghost” history lists to tune itself.

- **Pros**: strong across many workloads; self-tuning between scan resistance and frequency-ish behavior.
- **Cons**: more complex; more metadata (including ghost entries); harder to implement lock-efficiently.
- **Use when**: you need robust performance across shifting patterns and can afford complexity.
- **Avoid when**: memory overhead must be minimal or implementation complexity is a hard constraint.
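
The sketch below follows the case analysis in the ARC paper as closely as a short example allows. It is single-threaded, uses integer division for the adaptation step, and assumes `capacity >= 1`; the class and attribute names (`ARCCache`, `t1`, `t2`, `b1`, `b2`, `p`) mirror the paper's notation but are otherwise illustrative.

```python
from collections import OrderedDict

class ARCCache:
    def __init__(self, capacity):
        self.c = capacity
        self.p = 0  # adaptive target size for t1
        self.t1, self.t2 = OrderedDict(), OrderedDict()  # resident: seen once / seen again
        self.b1, self.b2 = OrderedDict(), OrderedDict()  # ghosts: keys only, no values

    def _replace(self, in_b2):
        # Demote one resident entry to its ghost list, guided by the target p.
        if self.t1 and (len(self.t1) > self.p or (in_b2 and len(self.t1) == self.p)):
            key, _ = self.t1.popitem(last=False)
            self.b1[key] = None
        else:
            key, _ = self.t2.popitem(last=False)
            self.b2[key] = None

    def get(self, key, default=None):
        for t in (self.t1, self.t2):
            if key in t:
                value = t.pop(key)
                self.t2[key] = value  # any hit moves the key to the MRU end of t2
                return value
        return default

    def put(self, key, value):
        if key in self.t1 or key in self.t2:
            self.t1.pop(key, None)
            self.t2.pop(key, None)
            self.t2[key] = value
            return
        if key in self.b1:  # ghost hit: recency list was too small, grow p
            self.p = min(self.c, self.p + max(len(self.b2) // len(self.b1), 1))
            self._replace(False)
            del self.b1[key]
            self.t2[key] = value
            return
        if key in self.b2:  # ghost hit: frequency list was too small, shrink p
            self.p = max(0, self.p - max(len(self.b1) // len(self.b2), 1))
            self._replace(True)
            del self.b2[key]
            self.t2[key] = value
            return
        l1 = len(self.t1) + len(self.b1)
        total = l1 + len(self.t2) + len(self.b2)
        if l1 == self.c:
            if len(self.t1) < self.c:
                self.b1.popitem(last=False)
                self._replace(False)
            else:
                self.t1.popitem(last=False)  # b1 is empty: drop the t1 LRU outright
        elif total >= self.c:
            if total == 2 * self.c:
                self.b2.popitem(last=False)
            self._replace(False)
        self.t1[key] = value
```

The ghost lists hold keys only, which is the extra metadata cost mentioned above: up to `capacity` additional keys beyond the resident set.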

### CAR (Clock with Adaptive Replacement)

**Idea**: ARC-like adaptivity but with Clock structures to reduce overhead.

- **Pros**: retains ARC’s adaptivity with lower overhead in some implementations.
- **Cons**: still complex; behavior depends on details.
- **Use when**: you want ARC-like behavior but prefer Clock-style mechanics.
- **Avoid when**: you need simplicity or have no room for ghost/history metadata.

### LIRS (Low Inter-reference Recency Set)

**Idea**: use inter-reference recency (IRR: the number of distinct other items accessed between two consecutive touches of the same item) to classify items, protecting those with low IRR as the frequently reused set.

- **Pros**: excellent scan resistance in many workloads; strong theoretical grounding.
- **Cons**: implementation complexity; metadata overhead; harder to explain/debug than LRU variants.
- **Use when**: you can invest in a high-quality scan-resistant policy for DB-like workloads.
- **Avoid when**: you need a small, simple policy surface.

### CLOCK-Pro

**Idea**: Clock-based policy that differentiates hot/cold pages and tracks recent history to handle scans better than Clock.

- **Pros**: good scan resistance with Clock mechanics; practical for OS/DB buffer caches.
- **Cons**: more complex than Clock; tuning/implementation details matter.
- **Use when**: you want better-than-Clock scan handling without full ARC machinery.
- **Avoid when**: you want the simplest possible eviction logic.

### Size/Cost-Aware Policies (GDS / GDSF family)

**Idea**: evict based on a “value” score that accounts for retrieval cost and/or object size (common in web caches).

- **Pros**: optimizes for byte hit rate or cost-weighted hit rate; better than LRU when object sizes vary widely.
- **Cons**: more bookkeeping; needs cost/size signals; can be less intuitive.
- **Use when**: objects have large size variance; misses have heterogeneous cost (e.g., network fetch cost).
- **Avoid when**: costs are uniform and you only care about request hit rate.
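
A GreedyDual-Size style sketch: each entry gets priority `inflation + cost / size`, the lowest priority is evicted, and `inflation` rises to the evicted priority so that older entries age relative to new ones. The class name, the `[value, size, cost, priority]` layout, and the O(n) victim scan are assumptions for illustration; sizes are assumed to be positive.

```python
class GreedyDualSizeCache:
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.inflation = 0.0  # the running "L" value in GreedyDual-Size
        self.entries = {}     # key -> [value, size, cost, priority]

    def _priority(self, size, cost):
        return self.inflation + cost / size

    def get(self, key, default=None):
        entry = self.entries.get(key)
        if entry is None:
            return default
        entry[3] = self._priority(entry[1], entry[2])  # refresh priority on a hit
        return entry[0]

    def put(self, key, value, size, cost=1.0):
        if key in self.entries:
            self.used -= self.entries.pop(key)[1]
        if size > self.capacity:
            return  # an object larger than the cache is never admitted
        while self.used + size > self.capacity:
            victim = min(self.entries, key=lambda k: self.entries[k][3])
            self.inflation = self.entries[victim][3]
            self.used -= self.entries.pop(victim)[1]
        self.entries[key] = [value, size, cost, self._priority(size, cost)]
        self.used += size
```

With uniform `cost`, large objects get low priority and are evicted first, which favors request hit rate; setting `cost` proportional to `size` shifts the balance toward byte hit rate.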

### TTL / Time-Based Expiration (Not a Replacement Policy)

**Idea**: entries expire after a time-to-live, regardless of recency/frequency.

- **Pros**: bounds staleness; essential for correctness in many domains.
- **Cons**: does not optimize hit rate by itself; still needs an eviction policy when full.
- **Use when**: correctness requires freshness bounds (configs, tokens, CDN-like caching).
- **Avoid when**: you treat TTL as a substitute for eviction optimization.
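
A minimal sketch of lazy TTL expiration, assuming an injectable clock for testability; the class name `TTLCache` and the `clock` parameter are illustrative. Note that it has no capacity bound, which is exactly why TTL still needs to be paired with an eviction policy.

```python
import time

class TTLCache:
    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so tests can control time
        self.data = {}      # key -> (value, expires_at)

    def put(self, key, value):
        self.data[key] = (value, self.clock() + self.ttl)

    def get(self, key, default=None):
        item = self.data.get(key)
        if item is None:
            return default
        value, expires_at = item
        if self.clock() >= expires_at:
            del self.data[key]  # lazy expiration: removed when next read
            return default
        return value
```

Entries that are never read again stay in memory until something else removes them, so a real cache would typically add a background sweep or a size-bounded eviction policy on top.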

## Practical Tradeoffs (What Changes In Real Systems)

- **Scan resistance**: `LRU`/`Clock` are vulnerable; `LRU-K`/`2Q`/`ARC`/`LIRS` handle scans better.
- **Metadata & CPU**: `Random`/`FIFO` < `Clock` < `LRU` < `2Q`/`SLRU` < `LRU-K`/`ARC`/`LIRS`.
- **Concurrency**: strict global `LRU` lists can contend; `Clock` and sharded designs often scale better.
- **Adaptivity**: `LFU` needs decay to adapt; `ARC`-family adapts via history; static partitions (`2Q`/`SLRU`) need tuning.
- **Predictability**: simpler policies are easier to reason about under tail-latency constraints; complex policies can have more edge cases.

## When To Use / Not Use (Rules Of Thumb)

- Use `LRU` when you have **temporal locality** and can tolerate per-hit metadata updates.
- Prefer `Clock` when you want **LRU-like** behavior with **lower overhead**.
- Avoid plain `LRU` for workloads with **large scans** unless you add scan resistance (e.g., `2Q`, `SLRU`, `LRU-K`).
- Use `LFU` (with aging) when **popularity is stable** and you care about long-term hot items.
- Use `ARC` when workload is **mixed or shifting** and you can afford the complexity and memory overhead.
- Use cost/size-aware policies (GDS/GDSF) when optimizing **byte hit rate** or **miss cost**, not just request count.

## Reference Material

- Wikipedia: Cache replacement policies: https://en.wikipedia.org/wiki/Cache_replacement_policies
- LRU-K: “The LRU-K page replacement algorithm for database disk buffering” (O’Neil, O’Neil, Weikum), 1993.
- 2Q: “2Q: A Low Overhead High Performance Buffer Management Replacement Algorithm” (Johnson, Shasha), 1994.
- ARC: “ARC: A Self-Tuning, Low Overhead Replacement Cache” (Megiddo, Modha), 2003.
- LIRS: “LIRS: An Efficient Low Inter-reference Recency Set Replacement Policy to Improve Buffer Cache Performance” (Jiang, Zhang), 2002.
- OPT (Belady): “A study of replacement algorithms for a virtual-storage computer” (Belady), 1966.
- GDS/GDSF: “Cost-Aware WWW Proxy Caching Algorithms” (Cao, Irani), 1997.