Shared blob cache for ContentAddressedMount.
The cache holds materialised file bytes keyed by ContentHash so a
hot kernel read syscall is a pointer-bump + slice copy rather than
a full get_blob + decompress chain. It’s the difference between
mount reads beating std::fs::read (warm) and being ~10× slower
(cold) — see benches/mount_read_paths.rs.
Why a separate, shared pool instead of owning the cache inline?
Heddle’s content-addressed model means two threads forked from the
same parent share every blob hash on the parts of the tree they
haven’t diverged on yet. If each mount carries its own LRU, every
freshly-opened mount starts cold even when a sibling mount in the
same process just decompressed the exact same bytes a millisecond
ago. By making the cache an Arc<BlobCachePool> the daemon can
attach one pool to itself, hand it to every new mount, and every
cache-hot blob anywhere in the process is hot for the new mount
too. Cap stays the same; hit rate goes up.
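The sharing argument above can be sketched in a few lines. This is a minimal illustration, not the crate's code: `SharedPool`, `DemoMount`, `DemoHash`, and the `read` method are hypothetical stand-ins for `BlobCachePool`, `ContentAddressedMount`, and `ContentHash`, and the sketch deliberately omits the byte cap and eviction.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the crate's `ContentHash`.
type DemoHash = [u8; 32];

/// Hypothetical stand-in for `BlobCachePool` (no cap, no eviction here).
#[derive(Default)]
struct SharedPool {
    blobs: Mutex<HashMap<DemoHash, Arc<[u8]>>>,
}

/// Hypothetical stand-in for a mount that holds a handle to the shared pool.
struct DemoMount {
    pool: Arc<SharedPool>,
}

impl DemoMount {
    fn new(pool: Arc<SharedPool>) -> Self {
        DemoMount { pool }
    }

    /// Returns the blob bytes and whether they came from the cache.
    /// `load` stands in for the expensive fetch-and-decompress path.
    /// For simplicity the lock is held while loading; a real cache
    /// would likely avoid that.
    fn read(&self, hash: DemoHash, load: impl FnOnce() -> Vec<u8>) -> (Arc<[u8]>, bool) {
        let mut blobs = self.pool.blobs.lock().unwrap();
        if let Some(bytes) = blobs.get(&hash) {
            return (Arc::clone(bytes), true);
        }
        let bytes: Arc<[u8]> = Arc::from(load());
        blobs.insert(hash, Arc::clone(&bytes));
        (bytes, false)
    }
}
```

With one `Arc<SharedPool>` cloned into two mounts, a blob loaded through the first mount is a cache hit on the second, which is the warm-start behaviour described above.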
The pool is byte-bounded, not entry-bounded — a 256 MiB cap holds roughly 25 × 10 MiB blobs or 250 000 × 1 KiB blobs, whichever the workload happens to be. Eviction is LRU. A single blob larger than the cap bypasses the cache entirely so one giant file can’t evict the rest of the working set.
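As a rough sketch of the policy just described (byte-bounded capacity, LRU eviction, oversize bypass), assuming nothing about the pool's real internals: `ByteLru` and `HashKey` are hypothetical names, and the linear-scan recency queue is chosen for brevity rather than speed.

```rust
use std::collections::{HashMap, VecDeque};
use std::sync::Arc;

/// Hypothetical stand-in for the crate's `ContentHash`.
type HashKey = [u8; 32];

/// Illustrative byte-bounded LRU; not the crate's actual implementation.
struct ByteLru {
    cap_bytes: usize,
    used_bytes: usize,
    entries: HashMap<HashKey, Arc<[u8]>>,
    /// Recency order: front is least recently used, back is most recent.
    order: VecDeque<HashKey>,
}

impl ByteLru {
    fn new(cap_bytes: usize) -> Self {
        ByteLru {
            cap_bytes,
            used_bytes: 0,
            entries: HashMap::new(),
            order: VecDeque::new(),
        }
    }

    /// Drops `key` from the recency queue if present (O(n) scan).
    fn forget(&mut self, key: &HashKey) {
        if let Some(pos) = self.order.iter().position(|k| k == key) {
            self.order.remove(pos);
        }
    }

    /// Looks up a blob and marks it most recently used on a hit.
    fn get(&mut self, key: &HashKey) -> Option<Arc<[u8]>> {
        let bytes = self.entries.get(key).cloned()?;
        self.forget(key);
        self.order.push_back(*key);
        Some(bytes)
    }

    /// Returns `false` when the blob exceeds the cap and was not cached.
    fn insert(&mut self, key: HashKey, bytes: Arc<[u8]>) -> bool {
        if bytes.len() > self.cap_bytes {
            return false; // oversize blob bypasses the cache entirely
        }
        // Replace any existing entry for this key before accounting.
        if let Some(old) = self.entries.remove(&key) {
            self.used_bytes -= old.len();
            self.forget(&key);
        }
        // Evict least recently used entries until the new blob fits.
        while self.used_bytes + bytes.len() > self.cap_bytes {
            let Some(victim) = self.order.pop_front() else { break };
            if let Some(old) = self.entries.remove(&victim) {
                self.used_bytes -= old.len();
            }
        }
        self.used_bytes += bytes.len();
        self.entries.insert(key, bytes);
        self.order.push_back(key);
        true
    }
}
```

The cap is checked against summed blob lengths rather than entry count, so the same structure holds a few large blobs or many small ones. Rejecting an oversize blob before any eviction is what keeps one giant file from flushing the working set.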
Structs§
- BlobCachePool - Process-shared blob cache. Construct once per Repository (or once per daemon process) and hand the same Arc<BlobCachePool> to every crate::ContentAddressedMount that wants to share its warm state with sibling mounts.
- BlobCacheStats
Constants§
- DEFAULT_BLOB_CACHE_BYTES - Default cap. Picked so a typical agent workspace fits in memory without the cache dominating RSS. For daemon deployments the recommended sizing is min(4 GiB, 25% of physical RAM), set via BlobCachePool::with_capacity.
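The daemon sizing rule can be written as a small helper. This is a sketch under assumptions: `recommended_daemon_cap_bytes` is a hypothetical function, the caller supplies the physical RAM figure (the standard library has no portable way to query it), and the argument type of `BlobCachePool::with_capacity` is not shown on this page, so the call itself is left out.

```rust
/// One gibibyte in bytes.
const GIB: u64 = 1024 * 1024 * 1024;

/// Hypothetical helper: min(4 GiB, 25% of physical RAM), in bytes.
fn recommended_daemon_cap_bytes(physical_ram_bytes: u64) -> u64 {
    (physical_ram_bytes / 4).min(4 * GIB)
}
```

Under this rule the 4 GiB ceiling applies to any host with 16 GiB of RAM or more; smaller hosts get a quarter of their memory.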