
Crate pf_cache



§pf-cache

Paged KV-cache capture, per-page content addressing (copy-on-write across forks), and a CachePager trait that the per-engine adapters implement.
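To make per-page content addressing with copy-on-write more concrete, here is a minimal, std-only sketch. It is not the crate's implementation: `PageId`, `PageStore`, and `SeqManifest` are hypothetical names invented for illustration, and `DefaultHasher` stands in for whatever hash the real format uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical content address: a 64-bit hash of the page bytes.
/// `DefaultHasher` keeps the sketch std-only; its algorithm is not
/// guaranteed stable across Rust releases, so a persistent format
/// would need a fixed (likely cryptographic) hash instead.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct PageId(u64);

fn page_id(bytes: &[u8]) -> PageId {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    PageId(hasher.finish())
}

/// Hypothetical in-memory page store keyed by content address.
#[derive(Default)]
pub struct PageStore {
    pages: HashMap<PageId, Vec<u8>>,
}

impl PageStore {
    /// Stores a page once; identical bytes always map to the same id.
    pub fn put(&mut self, bytes: &[u8]) -> PageId {
        let id = page_id(bytes);
        self.pages.entry(id).or_insert_with(|| bytes.to_vec());
        id
    }

    /// Looks a page up by its content address.
    pub fn get(&self, id: PageId) -> Option<&[u8]> {
        self.pages.get(&id).map(Vec::as_slice)
    }

    /// Number of distinct pages currently stored.
    pub fn page_count(&self) -> usize {
        self.pages.len()
    }
}

/// A sequence is an ordered list of page ids, so a fork copies ids, not bytes.
#[derive(Clone, Default)]
pub struct SeqManifest {
    pub pages: Vec<PageId>,
}

impl SeqManifest {
    /// Forking shares every existing page with the parent.
    pub fn fork(&self) -> Self {
        self.clone()
    }

    /// Appends a page to the sequence.
    pub fn push_page(&mut self, store: &mut PageStore, bytes: &[u8]) {
        self.pages.push(store.put(bytes));
    }

    /// Copy-on-write: the new bytes get their own id and only this
    /// manifest is repointed. Panics if `index` is out of bounds.
    pub fn replace_page(&mut self, store: &mut PageStore, index: usize, bytes: &[u8]) {
        self.pages[index] = store.put(bytes);
    }
}
```

In this sketch, forking a two-page sequence adds nothing to the store. Replacing one page in the fork adds exactly one new page, and the parent's manifest is unchanged.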

See agent_docs/cache-layer.md for the spec and .claude/skills/kvcache-format/SKILL.md for the page-out / page-in pseudo-code. The on-disk format is paged-batchinvariant-v1.

§What ships in Phase 4 (this commit)

§Bit-exact replay

Bit-exact restore requires batch-invariant kernels (vLLM --enforce-deterministic, SGLang --deterministic-mode). The CUDA-host integration test (tests/cache_bit_exact_vllm.rs) runs only when the PF_HAS_GPU environment variable is set to 1; the in-process round-trip in tests/cache_round_trip.rs is the build-host proxy and runs everywhere.
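A sketch of what an environment-gated, bit-exact check could look like follows. It assumes the gate is read at test run time and that cache values are `f32`; the helper names (`gpu_enabled`, `has_gpu`, `bit_exact`) are hypothetical and the actual tests may be structured differently. Comparing `to_bits()` rather than using `==` matters because `==` treats `0.0` and `-0.0` as equal and never treats NaN as equal to itself.

```rust
use std::env;

/// Interprets the raw value of the gate variable; only "1" opts in.
pub fn gpu_enabled(value: Option<&str>) -> bool {
    value == Some("1")
}

/// Reads PF_HAS_GPU from the process environment.
pub fn has_gpu() -> bool {
    gpu_enabled(env::var("PF_HAS_GPU").ok().as_deref())
}

/// Compares two f32 buffers bit for bit, so that 0.0 vs -0.0 counts as
/// a mismatch and identical NaN payloads count as a match.
pub fn bit_exact(a: &[f32], b: &[f32]) -> bool {
    a.len() == b.len() && a.iter().zip(b).all(|(x, y)| x.to_bits() == y.to_bits())
}

#[test]
fn restored_cache_is_bit_exact() {
    if !has_gpu() {
        eprintln!("PF_HAS_GPU is not 1, skipping GPU check");
        return;
    }
    // Placeholder data: a real test would capture from the engine,
    // restore, and compare the two buffers.
    let captured = [1.0_f32, -0.0, 2.5];
    let restored = captured;
    assert!(bit_exact(&captured, &restored));
}
```

Returning early keeps the test visible in the test list on hosts without a GPU, at the cost of reporting it as passed rather than skipped. `#[ignore]` is an alternative if an explicit opt-in on the command line is preferred.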

Re-exports§

pub use capture::capture_cache;
pub use capture::restore_cache;
pub use format::CacheMeta;
pub use format::Dtype;
pub use format::LAYOUT_V1;
pub use format::LogicalSeq;
pub use format::Page;
pub use format::PageManifest;
pub use pager::CachePager;
pub use pager::SyntheticCachePager;
pub use serialize::deserialize_pages;
pub use serialize::serialize_pages;

Modules§

capture
High-level capture / restore one-shot helpers.
format
Wire format for the cache layer (paged-batchinvariant-v1).
pager
Engine-agnostic paged-cache interface.
serialize
Serialize / deserialize a paged KV cache via a BlobStore.