CachePadded<T> — target-aware cache-line alignment.
Wrap a contended atomic in CachePadded to keep it from sharing a
cache line with neighboring fields. Without this, two atomics on the
same line cause cache-coherency ping-pong between cores even when the
threads writing to them touch logically independent data (“false
sharing”). Each core-to-core transfer to re-fetch an invalidated line costs
tens of nanoseconds, which is catastrophic in a tight allocate-deallocate loop.
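As a rough illustration of the pattern described above, the sketch below puts two independently written counters on separate lines. `Padded`, `Counters`, and `run` are hypothetical names invented for this example, and the fixed 128-byte alignment is an assumption standing in for the crate's per-target choice; it is not the crate's actual definition.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;

// Hypothetical stand-in for the crate's wrapper. The fixed 128-byte
// alignment is an assumption made for this sketch only.
#[repr(align(128))]
pub struct Padded<T>(pub T);

// Two counters written by different threads. Each one is padded, so a
// write to `allocs` does not invalidate the line holding `frees`.
pub struct Counters {
    pub allocs: Padded<AtomicUsize>,
    pub frees: Padded<AtomicUsize>,
}

// Spawns one writer per counter and returns the final totals.
pub fn run(iterations: usize) -> (usize, usize) {
    let counters = Counters {
        allocs: Padded(AtomicUsize::new(0)),
        frees: Padded(AtomicUsize::new(0)),
    };
    thread::scope(|s| {
        s.spawn(|| {
            for _ in 0..iterations {
                counters.allocs.0.fetch_add(1, Ordering::Relaxed);
            }
        });
        s.spawn(|| {
            for _ in 0..iterations {
                counters.frees.0.fetch_add(1, Ordering::Relaxed);
            }
        });
    });
    (
        counters.allocs.0.load(Ordering::Relaxed),
        counters.frees.0.load(Ordering::Relaxed),
    )
}
```

Removing the `#[repr(align(128))]` attribute leaves the results unchanged but lets both atomics share one line, which is the layout the wrapper exists to prevent.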
The alignment used is per-target:
- x86_64, aarch64, powerpc64: 128 bytes. x86_64’s L1 line is 64 bytes but the adjacent-line prefetcher pulls cache lines in pairs, so a 64-byte pad still allows false sharing across the prefetched neighbor; 128 closes that gap. Apple Silicon (M-series) AArch64 uses 128-byte coherency granularity natively.
- arm, mips, mips64, sparc, hexagon: 32 bytes.
- m68k: 16 bytes.
- s390x: 256 bytes.
- Anything else: 64 bytes (the historical x86 line size).
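One plausible way to encode the table above is a stack of `cfg_attr` attributes, similar in spirit to what crossbeam_utils does. This is a sketch under assumptions, not the crate's source: the field name, the `new` constructor, and deriving `CACHE_LINE` from `align_of` are all choices made for illustration.

```rust
use std::mem::align_of;

// Exactly one of these attributes is active on any given target.
#[cfg_attr(
    any(target_arch = "x86_64", target_arch = "aarch64", target_arch = "powerpc64"),
    repr(align(128))
)]
#[cfg_attr(
    any(
        target_arch = "arm",
        target_arch = "mips",
        target_arch = "mips64",
        target_arch = "sparc",
        target_arch = "hexagon"
    ),
    repr(align(32))
)]
#[cfg_attr(target_arch = "m68k", repr(align(16)))]
#[cfg_attr(target_arch = "s390x", repr(align(256)))]
#[cfg_attr(
    not(any(
        target_arch = "x86_64",
        target_arch = "aarch64",
        target_arch = "powerpc64",
        target_arch = "arm",
        target_arch = "mips",
        target_arch = "mips64",
        target_arch = "sparc",
        target_arch = "hexagon",
        target_arch = "m68k",
        target_arch = "s390x"
    )),
    repr(align(64))
)]
pub struct CachePadded<T> {
    pub value: T,
}

impl<T> CachePadded<T> {
    // Hypothetical constructor; the real API may differ.
    pub const fn new(value: T) -> Self {
        CachePadded { value }
    }
}

// Assumption: the constant is read back from the wrapper's own alignment,
// so the two cannot drift apart.
pub const CACHE_LINE: usize = align_of::<CachePadded<u8>>();
```

Because `repr(align(N))` only accepts a literal, the alignment cannot be written in terms of a constant; reading the constant back from the type is one way to keep a single source of truth.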
The cfg matrix mirrors crossbeam_utils::CachePadded’s choices so
benchmarks and reasoning carry across crates. We inline the
definition rather than depending on crossbeam_utils to keep
forge-alloc dependency-free at the runtime layer.
Structs
- CachePadded: Wraps a value so it occupies a whole cache line, preventing the neighboring fields in a struct from being invalidated when the wrapped atomic is written by another core.
Constants
- CACHE_LINE: The cache-line alignment used by CachePadded on this target. Surfaced so dependent crates and const _: () = assert!(...) layout pins can reference the same value the wrapper itself uses.
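A layout pin of the kind mentioned above might look like the following in a dependent crate. To keep the sketch self-contained, `CACHE_LINE` and `CachePadded` are local stand-ins fixed at 64 bytes (an assumption; in real use both would be imported from this module), and `FreeList` and `on_distinct_lines` are hypothetical names.

```rust
use std::mem::{align_of, size_of};
use std::sync::atomic::AtomicUsize;

// Local stand-ins, assumed for illustration only.
pub const CACHE_LINE: usize = 64;

#[repr(align(64))]
pub struct CachePadded<T>(pub T);

// Hypothetical structure with two independently contended fields.
pub struct FreeList {
    pub head: CachePadded<AtomicUsize>,
    pub len: CachePadded<AtomicUsize>,
}

// Compile-time layout pins: the build fails if the layout ever changes.
const _: () = assert!(align_of::<CachePadded<AtomicUsize>>() == CACHE_LINE);
const _: () = assert!(align_of::<FreeList>() == CACHE_LINE);
const _: () = assert!(size_of::<FreeList>() == 2 * CACHE_LINE);

impl FreeList {
    pub fn new() -> Self {
        FreeList {
            head: CachePadded(AtomicUsize::new(0)),
            len: CachePadded(AtomicUsize::new(0)),
        }
    }
}

// Runtime cross-check: the two fields start on different cache lines.
pub fn on_distinct_lines(list: &FreeList) -> bool {
    let head = &list.head as *const CachePadded<AtomicUsize> as usize;
    let len = &list.len as *const CachePadded<AtomicUsize> as usize;
    head / CACHE_LINE != len / CACHE_LINE
}
```

The `const _` assertions cost nothing at runtime and turn an accidental layout regression, such as someone removing a wrapper from a field, into a compile error.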