16-byte-aligned memory pool for SIMD-friendly byte buffers.
Provides AlignedPool for raw 16-byte-aligned byte storage, and
AlignedByteSlice for transparent alignment of tensor weight data.
When source bytes are already aligned, no copy is performed; when
misaligned, a one-time aligned copy is made into the pool.
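The wrap-or-copy behaviour described above can be sketched as follows.
This is a minimal illustration using only the standard library. The names
MaybeAligned and Block and their methods are hypothetical, not the actual
AlignedByteSlice or AlignedPool API, and the copy lands in a plain Vec
rather than in a pool.

```rust
/// 16-byte-aligned storage unit (hypothetical, for illustration only).
#[derive(Clone, Copy)]
#[repr(C, align(16))]
struct Block([u8; 16]);

/// Hypothetical stand-in for the wrap-or-copy idea; not the crate's type.
enum MaybeAligned<'a> {
    /// Source was already 16-byte aligned: borrow it, no copy.
    Borrowed(&'a [u8]),
    /// Source was misaligned: bytes were copied once into aligned blocks.
    Copied { blocks: Vec<Block>, len: usize },
}

impl<'a> MaybeAligned<'a> {
    fn new(src: &'a [u8]) -> Self {
        if src.as_ptr() as usize % 16 == 0 {
            MaybeAligned::Borrowed(src)
        } else {
            // Round the length up to whole 16-byte blocks, then copy.
            let mut blocks = vec![Block([0u8; 16]); (src.len() + 15) / 16];
            for (block, chunk) in blocks.iter_mut().zip(src.chunks(16)) {
                block.0[..chunk.len()].copy_from_slice(chunk);
            }
            MaybeAligned::Copied { blocks, len: src.len() }
        }
    }

    /// Returns the bytes; the pointer is 16-byte aligned in both cases.
    fn as_bytes(&self) -> &[u8] {
        match self {
            MaybeAligned::Borrowed(s) => *s,
            MaybeAligned::Copied { blocks, len } => {
                // SAFETY: Block is repr(C) around [u8; 16] with no padding,
                // and `blocks` holds at least `len` initialized bytes.
                unsafe {
                    std::slice::from_raw_parts(blocks.as_ptr() as *const u8, *len)
                }
            }
        }
    }

    /// True if construction had to copy the source.
    fn is_copied(&self) -> bool {
        matches!(self, MaybeAligned::Copied { .. })
    }
}
```

In this sketch the alignment check is made once, at construction, so later
reads through as_bytes pay no extra cost on either path.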
§SIMD relevance
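The AVX intrinsic _mm256_load_pd requires a 32-byte-aligned address and
may fault on a misaligned one, so 16-byte alignment alone is not enough
for it. What 16-byte alignment does guarantee is that a 16-byte (SSE)
access never straddles a cache-line or page boundary on x86-64; a 32-byte
access at a 16-byte-aligned address can still split.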
The tensor_simd module uses unaligned loads
(_mm256_loadu_pd) so alignment is not strictly required, but aligned
data is faster on older microarchitectures.
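As a rough illustration of why unaligned loads remove the hard alignment
requirement, the sketch below sums f64 values with _mm256_loadu_pd. The
functions sum_f64 and sum_avx are hypothetical and are not taken from
tensor_simd; on targets without AVX the sketch falls back to a scalar loop.

```rust
/// Sums `data`, using AVX unaligned loads when the CPU supports them.
/// Hypothetical helper for illustration; not part of tensor_simd.
pub fn sum_f64(data: &[f64]) -> f64 {
    #[cfg(target_arch = "x86_64")]
    {
        if is_x86_feature_detected!("avx") {
            // SAFETY: AVX support was verified at run time just above.
            return unsafe { sum_avx(data) };
        }
    }
    // Scalar fallback for non-x86-64 targets or CPUs without AVX.
    data.iter().sum()
}

#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn sum_avx(data: &[f64]) -> f64 {
    use std::arch::x86_64::*;
    unsafe {
        let mut acc = _mm256_setzero_pd();
        let chunks = data.chunks_exact(4);
        let tail = chunks.remainder();
        for c in chunks {
            // loadu accepts any address. Swapping in _mm256_load_pd here
            // would additionally require c.as_ptr() to be 32-byte aligned.
            acc = _mm256_add_pd(acc, _mm256_loadu_pd(c.as_ptr()));
        }
        // Spill the four lanes and add the leftover elements.
        let mut lanes = [0.0f64; 4];
        _mm256_storeu_pd(lanes.as_mut_ptr(), acc);
        lanes.iter().sum::<f64>() + tail.iter().sum::<f64>()
    }
}
```

Because the load is unaligned, the same function works on a slice that
starts at any element offset; alignment then affects only speed, not
correctness.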
Structs§
- AlignedByteSlice - An alignment-aware byte slice that guarantees 16-byte alignment for
tensor weight mapping. If the source bytes are already aligned, it
wraps them directly. If misaligned, it copies into an
AlignedPool.
- AlignedPool - A pre-allocated memory pool with a 16-byte alignment guarantee.