Module cache

Shared blob cache for ContentAddressedMount.

The cache holds materialised file bytes keyed by ContentHash, so a cache-hot kernel read syscall costs a pointer bump plus a slice copy instead of a full get_blob + decompress chain. That is the difference between mount reads beating std::fs::read (warm) and being ~10× slower (cold); see benches/mount_read_paths.rs.

Why a separate, shared pool instead of owning the cache inline?

Heddle’s content-addressed model means two threads forked from the same parent share every blob hash on the parts of the tree they haven’t diverged on yet. If each mount carries its own LRU, every freshly-opened mount starts cold even when a sibling mount in the same process just decompressed the exact same bytes a millisecond ago. By making the cache an Arc<BlobCachePool> the daemon can attach one pool to itself, hand it to every new mount, and every cache-hot blob anywhere in the process is hot for the new mount too. Cap stays the same; hit rate goes up.
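The sharing argument above can be illustrated with a small sketch. It is not the crate's implementation: SketchPool, SketchMount, get_or_load, fake_decompress and the [u8; 32] key type are hypothetical stand-ins for BlobCachePool, ContentAddressedMount and ContentHash, and eviction is left out. The point is only that two mounts holding clones of one Arc pay the slow path once.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `ContentHash`.
type Key = [u8; 32];

/// Hypothetical stand-in for the shared pool (no eviction in this sketch).
#[derive(Default)]
pub struct SketchPool {
    blobs: Mutex<HashMap<Key, Arc<[u8]>>>,
    slow_loads: AtomicUsize,
}

impl SketchPool {
    /// Returns cached bytes, or runs `load` (the slow path) and caches the result.
    pub fn get_or_load(&self, key: Key, load: impl FnOnce() -> Vec<u8>) -> Arc<[u8]> {
        let cached = self.blobs.lock().unwrap().get(&key).cloned();
        if let Some(hit) = cached {
            return hit;
        }
        // Slow path runs outside the lock so other readers are not blocked.
        self.slow_loads.fetch_add(1, Ordering::Relaxed);
        let bytes: Arc<[u8]> = Arc::from(load());
        self.blobs.lock().unwrap().insert(key, Arc::clone(&bytes));
        bytes
    }

    /// Number of times the slow path has run.
    pub fn slow_loads(&self) -> usize {
        self.slow_loads.load(Ordering::Relaxed)
    }
}

/// Hypothetical stand-in for a mount: it owns only a handle to the pool.
pub struct SketchMount {
    pool: Arc<SketchPool>,
}

impl SketchMount {
    pub fn new(pool: Arc<SketchPool>) -> Self {
        Self { pool }
    }

    pub fn read(&self, key: Key) -> Arc<[u8]> {
        self.pool.get_or_load(key, || fake_decompress(key))
    }
}

/// Placeholder for the fetch + decompress chain; returns 16 dummy bytes.
fn fake_decompress(key: Key) -> Vec<u8> {
    vec![key[0]; 16]
}
```

In this sketch a daemon would create one Arc<SketchPool> and pass a clone to each SketchMount::new call. A mount opened after a sibling has already read a blob gets the same allocation back without calling fake_decompress again.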

The pool is byte-bounded, not entry-bounded: a 256 MiB cap holds roughly 25 × 10 MiB blobs or roughly 262 000 × 1 KiB blobs, depending on the workload. Eviction is LRU. A single blob larger than the cap bypasses the cache entirely, so one giant file can’t evict the rest of the working set.
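A minimal sketch of the policy described above (byte budget, LRU eviction, oversized blobs bypassing the cache) might look like the following. ByteBoundedLru and its methods are hypothetical names chosen for illustration, the [u8; 32] key stands in for ContentHash, and the recency index built on a BTreeMap is an assumption made for simplicity, not a description of how BlobCachePool is actually built.

```rust
use std::collections::{BTreeMap, HashMap};
use std::sync::Arc;

/// Hypothetical stand-in for `ContentHash`.
type Key = [u8; 32];

/// Illustrative byte-bounded LRU; not the crate's implementation.
pub struct ByteBoundedLru {
    cap_bytes: usize,
    used_bytes: usize,
    tick: u64,
    /// key -> (bytes, last-use stamp)
    entries: HashMap<Key, (Arc<[u8]>, u64)>,
    /// last-use stamp -> key; the smallest stamp is least recently used.
    order: BTreeMap<u64, Key>,
}

impl ByteBoundedLru {
    pub fn new(cap_bytes: usize) -> Self {
        Self {
            cap_bytes,
            used_bytes: 0,
            tick: 0,
            entries: HashMap::new(),
            order: BTreeMap::new(),
        }
    }

    pub fn used_bytes(&self) -> usize {
        self.used_bytes
    }

    /// Returns the cached bytes and marks the entry as most recently used.
    pub fn get(&mut self, key: &Key) -> Option<Arc<[u8]>> {
        let entry = self.entries.get_mut(key)?;
        self.order.remove(&entry.1);
        self.tick += 1;
        entry.1 = self.tick;
        self.order.insert(self.tick, *key);
        Some(Arc::clone(&entry.0))
    }

    /// Inserts a blob. Returns false, leaving the cache untouched, when the
    /// blob alone exceeds the cap (the bypass rule).
    pub fn insert(&mut self, key: Key, bytes: Arc<[u8]>) -> bool {
        let len = bytes.len();
        if len > self.cap_bytes {
            return false;
        }
        // Drop any previous entry for this key so its bytes are not double counted.
        if let Some((old, stamp)) = self.entries.remove(&key) {
            self.order.remove(&stamp);
            self.used_bytes -= old.len();
        }
        // Evict least recently used entries until the new blob fits.
        while self.used_bytes + len > self.cap_bytes {
            let Some((_, victim)) = self.order.pop_first() else { break };
            if let Some((old, _)) = self.entries.remove(&victim) {
                self.used_bytes -= old.len();
            }
        }
        self.tick += 1;
        self.entries.insert(key, (bytes, self.tick));
        self.order.insert(self.tick, key);
        self.used_bytes += len;
        true
    }
}
```

Because the budget is counted in bytes, the number of resident entries is whatever fits. The size check at the top of insert is what keeps a single oversized blob from flushing everything else.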

Structs§

BlobCachePool
Process-shared blob cache. Construct once per Repository (or once per daemon process) and hand the same Arc<BlobCachePool> to every crate::ContentAddressedMount that wants to share its warm state with sibling mounts.
BlobCacheStats

Constants§

DEFAULT_BLOB_CACHE_BYTES
Default cap. Picked so a typical agent workspace fits in memory without the cache dominating RSS. For daemon deployments the recommended sizing is min(4 GiB, 25% of physical RAM), set via BlobCachePool::with_capacity.
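The daemon sizing rule is simple arithmetic and can be sketched as below. recommended_daemon_cap is a hypothetical helper, not part of the crate; how physical RAM is measured, and the exact argument type BlobCachePool::with_capacity expects, are left as assumptions.

```rust
const GIB: u64 = 1 << 30;

/// Hypothetical helper: min(4 GiB, 25% of physical RAM), in bytes.
pub fn recommended_daemon_cap(physical_ram_bytes: u64) -> u64 {
    (physical_ram_bytes / 4).min(4 * GIB)
}
```

Under this rule a host with 8 GiB of RAM gets a 2 GiB cap, while hosts with 16 GiB or more all land on the 4 GiB ceiling.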