Swap Queue
A lock-free, thread-owned queue in which stealers take tasks in their entirety via buffer swapping. For batching use cases, this has the advantage that all tasks can be taken as a single batch in constant time regardless of batch size, whereas alternatives using crossbeam_deque::Worker and tokio::sync::mpsc need to collect each task separately and, in some situations, lack a clear cutoff point. This design ensures that when you are waiting on a resource, such as a connection, to become available, there is no further delay before a task batch can be processed once it does. While push alone is slower than crossbeam_deque::Worker and faster than tokio::sync::mpsc, overall batching performance is roughly 11-19% faster than crossbeam_deque::Worker and 28-45% faster than tokio::sync::mpsc on ARM, and there is never a slow cutoff between batches.
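To make the constant-time claim concrete, here is a minimal sketch of batch-taking by buffer swap. It is a simplified illustration, not this library's implementation: it guards the buffer with a `Mutex` rather than being lock-free, and `SwapBuffer`, `push`, `take_all`, and `batch_demo` are hypothetical names chosen for the example.

```rust
use std::mem;
use std::sync::Mutex;

/// Hypothetical stand-in used only for illustration; not the swap-queue API.
struct SwapBuffer<T> {
    buffer: Mutex<Vec<T>>,
}

impl<T> SwapBuffer<T> {
    fn new() -> Self {
        SwapBuffer { buffer: Mutex::new(Vec::new()) }
    }

    /// Push a task. Returns true when the buffer was empty beforehand,
    /// which is the natural moment to schedule a consumer for the next batch.
    fn push(&self, task: T) -> bool {
        let mut buffer = self.buffer.lock().unwrap();
        let was_empty = buffer.is_empty();
        buffer.push(task);
        was_empty
    }

    /// Take every queued task at once by swapping in an empty Vec.
    /// Only the Vec's pointer, length, and capacity are moved, so the cost
    /// does not depend on how many tasks are queued.
    fn take_all(&self) -> Vec<T> {
        let mut buffer = self.buffer.lock().unwrap();
        mem::take(&mut *buffer)
    }
}

/// Queue three tasks, take them as one batch, then start a second batch.
fn batch_demo() -> (Vec<u64>, Vec<u64>) {
    let queue = SwapBuffer::new();
    for i in 0..3 {
        queue.push(i);
    }
    let first = queue.take_all();
    queue.push(3);
    let second = queue.take_all();
    (first, second)
}
```

Each call to `take_all` marks an unambiguous cutoff: everything pushed before the swap belongs to the returned batch, and everything pushed afterwards starts the next one. A channel-based consumer, by contrast, has to receive items one at a time and decide for itself when a batch ends.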
Example
use Worker;
use ;
// Jemalloc makes this library substantially faster
static GLOBAL: Jemalloc = Jemalloc;
// Worker needs to be thread local because it is !Sync
thread_local!
// This mechanism batches optimally and without overhead within an async context, because the spawned task runs after everything that is already scheduled
async
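The thread-local requirement noted in the example follows from a general Rust rule: a plain `static` must be `Sync`, so a `!Sync` value has to live in `thread_local!` storage, where each thread gets its own instance. The sketch below shows that pattern using only the standard library. It assumes a `RefCell<Vec<u64>>` as a stand-in for the queue, and `LOCAL_QUEUE`, `push_local`, `take_local`, and `per_thread_demo` are hypothetical names, not part of this library.

```rust
use std::cell::RefCell;
use std::thread;

thread_local! {
    // RefCell is !Sync, so it cannot be a plain `static`;
    // thread_local! gives every thread its own independent copy.
    static LOCAL_QUEUE: RefCell<Vec<u64>> = RefCell::new(Vec::new());
}

/// Push onto the calling thread's queue and return its new length.
fn push_local(task: u64) -> usize {
    LOCAL_QUEUE.with(|queue| {
        let mut queue = queue.borrow_mut();
        queue.push(task);
        queue.len()
    })
}

/// Take the calling thread's whole queue, leaving an empty one behind.
fn take_local() -> Vec<u64> {
    LOCAL_QUEUE.with(|queue| queue.take())
}

/// Show that two threads never observe each other's tasks.
fn per_thread_demo() -> (Vec<u64>, Vec<u64>) {
    push_local(1);
    push_local(2);
    let other = thread::spawn(|| {
        push_local(10);
        take_local()
    })
    .join()
    .unwrap();
    (take_local(), other)
}
```

Because each thread owns its queue outright, pushes need no cross-thread coordination; only the hand-off of a full batch to another task has to cross a thread boundary.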
Benchmarks
Benchmarks were run on a t4g.medium instance using ami-06391d741144b83c2.
Async Batching
Push
Batch collecting
CI tested under ThreadSanitizer, LeakSanitizer, Miri and Loom.