§pf-cache
Paged KV-cache capture with per-page content addressing (copy-on-write
across forks), plus a CachePager trait that the per-engine adapters implement.
See agent_docs/cache-layer.md for the spec and
.claude/skills/kvcache-format/SKILL.md for the page-out / page-in
pseudo-code. The on-disk format is paged-batchinvariant-v1.
§What ships in Phase 4 (this commit)
format::PageManifest: the wire-format struct mirrored from the spec.
serialize::serialize_pages / serialize::deserialize_pages: portable round-trip via the pf_core::cas::BlobStore trait, no GPU.
pager::CachePager: the engine-agnostic interface every adapter implements (vLLM, SGLang, …).
pager::SyntheticCachePager: in-memory implementation used by every test in this crate. Lets us prove serialize+restore round-trip without booting an inference engine.
capture::capture_cache / capture::restore_cache: high-level one-shot helpers that the snapshotter calls.
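To illustrate the per-page content addressing and GPU-free round-trip described above, here is a minimal standalone sketch. It does not use this crate's API: `PageId`, `MemStore`, `page_out`, and `page_in` are hypothetical names invented for the example, and `DefaultHasher` stands in for whatever digest the real blob store uses. The point is only that identical pages resolve to the same address, so a fork that changes one page adds one blob rather than a full copy.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical content address: a 64-bit digest of the page bytes.
/// ASSUMPTION: DefaultHasher is a non-cryptographic stand-in; a real
/// content-addressed store would use a collision-resistant hash.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct PageId(u64);

/// Hypothetical in-memory blob store keyed by content address.
#[derive(Default)]
pub struct MemStore {
    blobs: HashMap<PageId, Vec<u8>>,
}

impl MemStore {
    /// Store a page and return its address. Identical bytes hash to the
    /// same address, so a repeated page is kept only once.
    pub fn put(&mut self, bytes: &[u8]) -> PageId {
        let mut hasher = DefaultHasher::new();
        bytes.hash(&mut hasher);
        let id = PageId(hasher.finish());
        self.blobs.entry(id).or_insert_with(|| bytes.to_vec());
        id
    }

    /// Look up a page by address.
    pub fn get(&self, id: PageId) -> Option<&[u8]> {
        self.blobs.get(&id).map(Vec::as_slice)
    }

    /// Number of distinct pages currently stored.
    pub fn blob_count(&self) -> usize {
        self.blobs.len()
    }
}

/// Page out: split the cache bytes into fixed-size pages, store each one,
/// and return the ordered list of addresses (a stand-in for a manifest).
/// `page_size` must be non-zero, otherwise `chunks` panics.
pub fn page_out(store: &mut MemStore, cache: &[u8], page_size: usize) -> Vec<PageId> {
    cache.chunks(page_size).map(|page| store.put(page)).collect()
}

/// Page in: concatenate the pages in manifest order.
/// Returns None if any referenced page is missing from the store.
pub fn page_in(store: &MemStore, manifest: &[PageId]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for id in manifest {
        out.extend_from_slice(store.get(*id)?);
    }
    Some(out)
}
```

Paging out a parent sequence and then a fork that differs in its last page should leave the two manifests sharing every unchanged address, with the store growing by exactly one blob.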
§Bit-exact replay
Bit-exact restore requires batch-invariant kernels (vLLM
--enforce-deterministic, SGLang --deterministic-mode). The CUDA-host
integration test (tests/cache_bit_exact_vllm.rs) is gated behind
$PF_HAS_GPU=1; the in-process round-trip in tests/cache_round_trip.rs
is the build-host proxy and runs everywhere.
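One common way to gate a hardware-dependent test on an environment variable is an early return when the variable is unset. The sketch below shows that pattern under the assumption that the gate is a simple string comparison against "1"; the helper names `gpu_tests_enabled` and `has_gpu` are hypothetical, and the actual mechanism used in tests/cache_bit_exact_vllm.rs may differ.

```rust
use std::env;

/// Pure helper (hypothetical): the gate is open only when the value is exactly "1".
pub fn gpu_tests_enabled(value: Option<&str>) -> bool {
    value == Some("1")
}

/// Reads PF_HAS_GPU from the process environment.
/// An unset or non-UTF-8 variable is treated as "no GPU".
pub fn has_gpu() -> bool {
    gpu_tests_enabled(env::var("PF_HAS_GPU").ok().as_deref())
}

#[test]
fn gpu_only_check() {
    if !has_gpu() {
        // Skip quietly on build hosts without a GPU.
        eprintln!("skipping: PF_HAS_GPU is not set to 1");
        return;
    }
    // GPU-backed assertions would go here.
}
```

Keeping the comparison in a pure function makes the gate itself testable without mutating the process environment.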
Re-exports§
pub use capture::capture_cache;
pub use capture::restore_cache;
pub use format::CacheMeta;
pub use format::Dtype;
pub use format::LAYOUT_V1;
pub use format::LogicalSeq;
pub use format::Page;
pub use format::PageManifest;
pub use pager::CachePager;
pub use pager::SyntheticCachePager;
pub use serialize::deserialize_pages;
pub use serialize::serialize_pages;