prefetch-index
A small crate to prefetch an element of an array.
Provides prefetch_index<T>(data: &[T], index: usize)
that prefetches the cache line containing (the first byte of) data[index].
prefetch_index_nta is also provided for non-temporal accesses.
This can be used to overlap multiple independent memory accesses, and is particularly useful when the CPU's reorder buffer is not large enough for the CPU to start the loads early by itself.
On my machine (i7-10750H at 3.0 GHz), the version without prefetching takes 40 ns/it, while the version with prefetching takes 8 ns/it, i.e., 5x faster!
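
To make the idea of overlapping independent accesses concrete, here is a minimal sketch. It is not the crate's source and not the benchmark measured above: `prefetch_index` below is a local stand-in with the same signature, assumed to wrap the x86_64 `_mm_prefetch` intrinsic, and `sum_with_prefetch` and the look-ahead distance `AHEAD` are hypothetical names and values chosen for illustration.

```rust
/// Local stand-in for the crate's `prefetch_index` (an assumption, not its actual source).
/// Hints the CPU to load the cache line containing the first byte of `data[index]`.
#[cfg(target_arch = "x86_64")]
#[inline(always)]
pub fn prefetch_index<T>(data: &[T], index: usize) {
    use std::arch::x86_64::{_mm_prefetch, _MM_HINT_T0};
    // `wrapping_add` only computes an address and never dereferences it,
    // so an out-of-range index is harmless here.
    let ptr = data.as_ptr().wrapping_add(index) as *const i8;
    // A prefetch is only a hint and does not fault on invalid addresses.
    unsafe { _mm_prefetch::<_MM_HINT_T0>(ptr) };
}

/// Fallback for other architectures: do nothing.
#[cfg(not(target_arch = "x86_64"))]
#[inline(always)]
pub fn prefetch_index<T>(_data: &[T], _index: usize) {}

/// Hypothetical example: sum `data[i]` for every `i` in `indices`,
/// prefetching the element needed `AHEAD` iterations from now.
pub fn sum_with_prefetch(data: &[u64], indices: &[usize]) -> u64 {
    // Assumed look-ahead distance; the best value depends on the workload.
    const AHEAD: usize = 32;
    let mut sum = 0u64;
    for (i, &idx) in indices.iter().enumerate() {
        // Start loading a future element while the current one is processed.
        if let Some(&future) = indices.get(i + AHEAD) {
            prefetch_index(data, future);
        }
        sum = sum.wrapping_add(data[idx]);
    }
    sum
}
```

The result is the same as a plain loop over `indices`; only the timing of the memory loads changes. Whether this helps, and by how much, depends on the access pattern and the machine, so it is worth measuring both variants.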