
Module cache_alignment

KV-Cache alignment for commercial LLM prompt caching.

Claude’s prompt caching stores KV tensors and reuses them only on byte-exact prefix matches; GPT models have similar mechanisms. This module ensures lean-ctx outputs are structured to maximize cache hit rates.

Key strategies:

  1. Stable prefix: invariant content (instructions, tool defs) comes first
  2. Cache-block alignment: content segmented to match provider breakpoints
  3. Delta-only after cached prefix: send only the changes; the rest stays in the KV-cache
  4. Deterministic ordering: same inputs always produce byte-identical output
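The strategies above can be sketched together in a few lines. This is a hypothetical illustration, not this module's API: `assemble_prompt` and its parameters are invented for the example. It puts invariant content first and uses a `BTreeMap` (sorted-key iteration) so identical inputs always serialize to byte-identical output.

```rust
use std::collections::BTreeMap;

/// Hypothetical sketch: place the invariant prefix (instructions, then
/// tool definitions in sorted-name order) before any variable content.
/// BTreeMap iterates in key order, so the same inputs always produce
/// byte-identical output, regardless of insertion order.
fn assemble_prompt(
    instructions: &str,
    tool_defs: &BTreeMap<String, String>,
    variable_context: &str,
) -> String {
    let mut out = String::new();
    // 1. Stable prefix: instructions never change across calls.
    out.push_str(instructions);
    out.push('\n');
    // 2. Tool definitions, deterministically ordered by name.
    for (name, def) in tool_defs {
        out.push_str(&format!("tool {name}: {def}\n"));
    }
    // 3. Variable content last, so the cached prefix stays intact.
    out.push_str(variable_context);
    out
}
```

Because the variable portion is appended last, two calls that differ only in `variable_context` share the entire stable prefix, which is exactly what provider-side prefix caching can reuse.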

Structs§

CacheAlignedOutput
CacheBlock
DeltaResult

Functions§

cache_order_code
Order file contents for maximum cache reuse across tool calls. Stable elements (imports, type defs) first, then variable elements (function bodies).
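A minimal sketch of the idea behind `cache_order_code` (the function name `order_for_cache` and the line-based heuristic are assumptions for illustration, not the real implementation): partition source lines into stable elements and variable elements, emitting the stable group first while preserving relative order within each group.

```rust
/// Hypothetical sketch: emit stable lines (imports, type definitions)
/// before variable lines (everything else). `Iterator::partition`
/// preserves relative order within each group, so the stable prefix
/// is identical across tool calls as long as imports and types are.
fn order_for_cache(source: &str) -> String {
    let (stable, variable): (Vec<&str>, Vec<&str>) =
        source.lines().partition(|l| {
            let t = l.trim_start();
            t.starts_with("use ") || t.starts_with("struct ") || t.starts_with("type ")
        });
    let mut out = stable.join("\n");
    out.push('\n');
    out.push_str(&variable.join("\n"));
    out
}
```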
compute_delta
Generate a delta between two versions of content for cache-efficient updates. Returns only the changed portions, prefixed with stable context identifiers.
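The delta idea can be sketched at line granularity. This `line_delta` function is a simplified stand-in for `compute_delta` (its name, signature, and use of line numbers as the stable identifiers are assumptions): it returns only the lines that changed, so everything before the first change can remain in the provider's KV-cache.

```rust
/// Hypothetical sketch: compare old and new content line by line and
/// return only the changed or added lines, each tagged with its line
/// number as a stable identifier. Unchanged lines are omitted and can
/// stay cached on the provider side.
fn line_delta(old: &str, new: &str) -> Vec<(usize, String)> {
    let old_lines: Vec<&str> = old.lines().collect();
    new.lines()
        .enumerate()
        .filter(|(i, line)| old_lines.get(*i) != Some(line))
        .map(|(i, line)| (i, line.to_string()))
        .collect()
}
```

A real delta would also need to represent deletions and moved blocks; a line-index tag is the simplest possible stable identifier and breaks down once earlier lines are inserted or removed.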