Skip to main content

Crate micropool

Crate micropool 

Source
Expand description

§🌊 micropool: low-latency thread pool with parallel iterators

Crates.io Docs.rs

micropool is a rayon-style thread pool designed for games and other low-latency scenarios. It implements the ability to spread work across multiple CPU threads in blocking and non-blocking ways. It also has full support for paralight’s parallel iterators, which cleanly facilitate multithreading in a synchronous codebase. micropool uses a work-stealing scheduling system, but is unique in several aspects:

  1. 🧵🤝 External threads participate: when a non-pool thread is blocked on micropool (from calling join or using a parallel iterator), it will actively help complete the work. This eliminates the overhead of a context switch.
  2. ⏩✅ Forward progress is guaranteed: any synchronous work that uses micropool is guaranteed to have at least one thread processing it, from the moment of creation.
  3. 🎯🛡️ Scope-based work stealing: a thread that is blocked will only steal work related to its current task. Blocked threads will never steal unrelated work, which might take an unpredictable amount of time to finish.
  4. 🎚️⚡ Two priority tiers: foreground work created by a blocking call is always prioritized over background tasks created via spawn.
  5. 🔄💤 Spinning before sleeping: threads will spin for a configurable interval before sleeping with the operating system scheduler.

§Usage

§Foreground work with join

A single operation can be split between two threads using the join primitive:

micropool::join(|| {
    println!("A {:?}", std::thread::current().id());
}, || {
    println!("B {:?}", std::thread::current().id());
});

// Possible output:
// B ThreadId(2)
// A ThreadId(1)
§Foreground work with parallel iterators

Parallel iterators allow for splitting common list operations across multiple threads. micropool re-exports the paralight library:

use micropool::iter::*;

let len = 10_000;
let input = (0..len as u64).collect::<Vec<u64>>();
let input_slice = input.as_slice();
let result = input_slice
    .par_iter()
    .with_thread_pool(micropool::split_by_threads())
    .sum::<u64>();

assert_eq!(result, 49995000);

The .with_thread_pool line specifies that the current micropool instance should be used, and split_by_threads indicates that each pool thread should process an equal-sized chunk of the data. Other data-splitting strategies available are split_by, split_per_item, and split_per.

§Background work with spawn

Tasks can be spawned asynchronously, then joined later:

let task = micropool::spawn_owned(|| 2 + 2);
println!("Is my task complete yet? {}", task.complete());
println!("The result: {}", task.join());

// Possible output:
// Is my task complete yet? false
// The result: 4

§Scheduling system

The following example illustrates the properties of the micropool scheduling system:

println!("A {:?}", std::thread::current().id());
let background_task = micropool::spawn_owned(|| println!("B {:?}", std::thread::current().id()));

micropool::join(|| {
    std::thread::sleep(std::time::Duration::from_millis(20));
    println!("C {:?}", std::thread::current().id())
}, || {
    println!("D {:?}", std::thread::current().id());
    micropool::join(|| {
        std::thread::sleep(std::time::Duration::from_millis(200));
        println!("E {:?}", std::thread::current().id());
    }, || {
        println!("F {:?}", std::thread::current().id());
    });
});

One possible output of this code might be:

A ThreadId(1)      // The main thread is #1
D ThreadId(2)      // Thread #2 begins helping the outer micropool::join call
C ThreadId(1)      // Thread #1 helps to finish the outer micropool::join call
F ThreadId(1)      // Thread #1 steals work from thread #2, to help complete the inner micropool::join call 
E ThreadId(2)      // Thread #2 finishes the inner micropool::join call
B ThreadId(2)      // Thread #2 grabs and completes the background task; thread #1 will *never* execute this

There are several key differences between micropool’s behavior and rayon, for instance:

  1. The outer call to join occurs on an external thread. With rayon, this call would simply block and the main thread would wait for pool threads to finish both halves of join. With micropool, the external thread helps.
  2. Because the calling thread always helps complete its work, progress on a synchronous task never stalls. In contrast, if the rayon thread pool is saturated with tasks, the call to join might be long and unpredictable - the rayon workers would need to finish their current tasks first, even if those tasks are unrelated.
  3. When the external thread finishes its work, and is blocking on the result of join, there is other work available: the background_task. In this case, completion of background_task is not required for join to return. As such, the external thread will never run it. In contrast, if a rayon thread is blocked, it may run unrelated work in the meantime, so it may take a long/unpredictable amount of time before control flow returns from the join.
  4. Worker threads will always help with synchronous work (like join) before processing asynchronous tasks created via spawn. This natural separation of foreground and background work ensures that the most important foreground tasks - like per-frame rendering or physics in a agame engine - happen first.
  5. According to Dennis Gustafsson, workers that spin while waiting for new tasks sometimes perform better than workers that sleep. When many short tasks are scheduled, the overhead of operating system calls for sleeping can outweight the wasted compute. micropool compensates for this by spinning threads before they sleep.

Modules§

iter
Iterator adaptors to define parallel pipelines more conveniently.

Structs§

OwnedTask
A task whose result is exclusively owned by the caller.
SharedTask
A clonable task whose result can be shared by reference.
ThreadPool
Represents a user-created thread pool.
ThreadPoolBuilder
Determines how a thread pool will behave.

Functions§

join
Takes two closures and potentially runs them in parallel. It returns a pair of the results from those closures.
num_threads
The total number of worker threads in the current pool.
spawn_owned
Spawns an asynchronous task on the current thread pool. The returned handle can be used to obtain the result.
spawn_shared
Spawns a shared asynchronous task on the current thread pool. The returned handle can be used to obtain the result.
split_by
Execute paralight iterators by batching elements. Every iterator will be broken up into chunks separate work units, which may be processed in parallel.
split_by_threads
Execute paralight iterators by batching elements. Every iterator will be broken up into N separate work units, where N is the number of pool threads. Each unit may be processed in parallel.
split_per
Execute paralight iterators by batching elements. Each group of chunk_size elements may be processed by a single thread.
split_per_item
Execute paralight iterators with maximal parallelism. Every iterator item may be processed on a separate thread.