SimdQuickHeap: A fast SIMD-based priority queue
The SimdQuickHeap is, as the name suggests, a fast priority queue. It has some similarities to QuickSelect, and uses SIMD for fast partitioning and insertion of elements.
Unlike the classic binary heap (and $d$-ary heap variants), it is I/O-efficient and does not suffer from bad cache-locality when the size of the queue exceeds the cache.
Currently, it only supports i32, u32, i64, and u64 keys and requires
either AVX2 or AVX-512.
See the preprint for details and benchmarks:
SimdQuickHeap: The QuickHeap Reconsidered Johannes Breitling, Ragnar Groot Koerkamp, Marvin Williams, Arxiv 2026 https://doi.org/10.48550/ARXIV.2604.25681
An older blogpost can be found here: https://curiouscoding.nl/posts/quickheap.
Example
let mut q = default;
q.push;
q.push;
q.push;
assert_eq!;
q.push;
q.push;
assert_eq!;
assert_eq!;
assert_eq!;
assert_eq!;
assert_eq!;
Results
Below you can see the results for 32-bit and 64-bit data when we first push n
elements and then how long it takes on average to do a pair of push and pop
operations. Note that the y-axis is normalized by log_2(n) as well.
The SimdQuickHeap is around 2x faster than the Radix heap, and up to 10x faster than the binary heap and d-ary heap.
Running the benchmarks
Benchmarks can be found in the bench/ directory, and Rust FFI wrappers around
external C++ implementations can be found in ext/, with submodules
containing the original C++ sources in nested third_party directories.
(Run git submodule update --init --recursive to initialize them.)
Run just bench-small plot-all (see the justfile) for a small subset of
the experiments. Note that these need nightly Rust for some unstable convenience features.
If you get linker errors, comment out the .flag("-flto") in ext/{s3q-sys,sequence-heap-sys,boost-heap-sys}/build.rs.