Zero-Pool: Consistent High-Performance Thread Pool
When microseconds matter and allocation is the enemy.
This is an experimental thread pool implementation focused on exploring lock-free FIFO MPMC queue techniques. Consider this a performance playground rather than a production-ready library.
Key Features:
- Zero locks - lock-free MPMC queue operations
- Zero queue limit - unbounded
- Zero channels - no std/crossbeam channel overhead
- Zero virtual dispatch - function pointer dispatch avoids vtable lookups
- Zero core spinning - event-based wakeups instead of busy-waiting
- Zero result transport cost - tasks write directly to caller-provided memory
- Zero per-worker queues - a single global queue structure gives even workload balancing
- Zero external dependencies - standard library only, stable Rust
Using a result-via-parameters pattern, workers write results directly into caller-provided memory, eliminating the cost of transporting results between threads. The single global queue structure distributes work evenly without the complexity of work-stealing or load-redistribution algorithms.
Since the library uses raw pointers, you must uphold three invariants: parameter structs remain valid until TaskFuture::wait() completes, result pointers remain valid until the task finishes, and task functions are thread-safe. The library provides type-safe methods such as submit_task and submit_batch_uniform for convenient usage.
Note: TaskFuture uses a small Mutex + Condvar to efficiently block waiting threads. Core pool operations remain lock-free.
Benchmarks
test bench_heavy_compute_rayon            ... bench: 4,612,748.55 ns/iter
test bench_heavy_compute_zeropool         ... bench: 4,433,491.25 ns/iter
test bench_indexed_computation_rayon      ... bench:    37,142.34 ns/iter
test bench_indexed_computation_zeropool   ... bench:    35,483.85 ns/iter
test bench_individual_tasks_rayon_empty   ... bench:    46,815.76 ns/iter
test bench_individual_tasks_zeropool_empty ... bench:  154,867.12 ns/iter
test bench_task_overhead_rayon            ... bench:    34,408.66 ns/iter
test bench_task_overhead_zeropool         ... bench:    33,548.53 ns/iter
Note: the pool fares poorly for workloads dominated by millions of tiny, individual task submissions (see bench_individual_tasks_zeropool_empty). This is mainly due to per-submission allocation; prefer batching or larger tasks for best throughput.
Example Usage
Submitting a Single Task
use zero_pool::{ZeroPool, zp_define_task_fn};

// Parameter struct: inputs plus a raw pointer to the caller's result slot.
struct CalculationParams { value: u64, result: *mut u64 }

// Macro arguments reconstructed for illustration; see the crate docs for the exact syntax.
zp_define_task_fn!(calculation_task, CalculationParams, |params| {
    unsafe { *params.result = params.value * 2 };
});

let pool = ZeroPool::new();
let mut result = 0u64;
let task = CalculationParams { value: 21, result: &mut result };
let future = pool.submit_task(calculation_task, &task);
future.wait();
println!("result = {}", result);
Submitting Uniform Batches
Submits multiple tasks of the same type to the thread pool.
use zero_pool::{ZeroPool, zp_define_task_fn};

struct FillParams { index: usize, result: *mut u64 }

zp_define_task_fn!(fill_task, FillParams, |params| {
    unsafe { *params.result = params.index as u64 * 10 };
});

let pool = ZeroPool::new();
let mut results = vec![0u64; 100];
let tasks: Vec<FillParams> = results.iter_mut().enumerate()
    .map(|(index, result)| FillParams { index, result })
    .collect();
let future = pool.submit_batch_uniform(fill_task, &tasks);
future.wait();
println!("first = {}, last = {}", results[0], results[99]);
Submitting Multiple Independent Tasks
You can submit individual tasks and uniform batches in parallel:
use zero_pool::{ZeroPool, zp_define_task_fn};

// Define first task type
struct ComputeParams { input: u64, result: *mut u64 }
zp_define_task_fn!(compute_task, ComputeParams, |params| {
    unsafe { *params.result = params.input * params.input };
});

// Define second task type
struct BatchParams { index: usize, result: *mut u64 }
zp_define_task_fn!(batch_task, BatchParams, |params| {
    unsafe { *params.result = params.index as u64 + 1 };
});

let pool = ZeroPool::new();

// Individual task - separate memory location
let mut single_result = 0u64;
let single_task_params = ComputeParams { input: 7, result: &mut single_result };

// Uniform batch - separate memory from above
let mut batch_results = vec![0u64; 50];
let batch_task_params: Vec<BatchParams> = batch_results.iter_mut().enumerate()
    .map(|(index, result)| BatchParams { index, result })
    .collect();

// Submit all batches
let future1 = pool.submit_task(compute_task, &single_task_params);
let future2 = pool.submit_batch_uniform(batch_task, &batch_task_params);

// Wait on them in any order; completion order is not guaranteed
future1.wait();
future2.wait();

println!("single = {}", single_result);
println!("batch[0] = {}", batch_results[0]);
zp_define_task_fn!
Defines a task function that safely dereferences the parameter struct.
use zero_pool::zp_define_task_fn;

struct MyParams { value: u64, result: *mut u64 }

// The generated task function casts the type-erased pointer back to &MyParams
// before running the body (argument syntax reconstructed for illustration).
zp_define_task_fn!(my_task, MyParams, |params| {
    unsafe { *params.result = params.value + 1 };
});
zp_write!
Eliminates explicit unsafe blocks when writing to result pointers.
use zero_pool::{zp_define_task_fn, zp_write};

struct MyParams { value: u64, result: *mut u64 }

zp_define_task_fn!(my_task, MyParams, |params| {
    // zp_write! wraps the unsafe pointer write shown in the examples above.
    zp_write!(params.result, params.value + 1);
});
zp_write_indexed!
Safely writes a value to a specific index in a Vec or array via raw pointer, useful for batch processing where each task writes to a different index.
use zero_pool::{ZeroPool, zp_define_task_fn, zp_write_indexed};

struct IndexedParams { index: usize, results: *mut u64 }

// Argument order reconstructed for illustration; see the crate docs for the exact syntax.
zp_define_task_fn!(indexed_task, IndexedParams, |params| {
    zp_write_indexed!(params.results, params.index, params.index as u64 * 2);
});

// Usage with a pre-allocated vector
let pool = ZeroPool::new();
let mut results = vec![0u64; 100];
let base = results.as_mut_ptr();
let task_params: Vec<IndexedParams> = (0..results.len())
    .map(|index| IndexedParams { index, results: base })
    .collect();
let future = pool.submit_batch_uniform(indexed_task, &task_params);
future.wait();