Crate micropool


§🌊 micropool: low-latency thread pool with parallel iterators


micropool is a rayon-style thread pool designed for games and other low-latency scenarios. It can spread work across multiple CPU threads in both blocking and non-blocking ways, and it has full support for paralight's parallel iterators, which cleanly facilitate multithreading in a synchronous codebase. micropool uses a work-stealing scheduler, but is unique in several aspects:

  1. ๐Ÿงต๐Ÿค External threads participate: when a non-pool thread is blocked on micropool (from calling join or using a parallel iterator), it will actively help complete the work. This eliminates the overhead of a context switch.
  2. ๐Ÿ”’โŒ Lock-free scheduler: the job scheduling system uses atomic bitsets to share work. Threads will never block while trying to read data for a job. Threads only sleep if no work is available.
  3. ๐ŸŽฏ๐Ÿ›ก๏ธ Scope-based work stealing: a thread that is blocked will attempt steal work related to its current task. Blocked threads avoid stealing unrelated work, which might take an unpredictable amount of time to finish.
  4. ๐ŸŽš๏ธโšก Two priority tiers: foreground work created by a blocking call is always prioritized over background tasks created via spawn.
  5. ๐Ÿ”„๐Ÿ’ค Spinning before sleeping: threads will spin for a configurable interval before sleeping with the operating system scheduler.

§Usage

§Foreground work with join

A single operation can be split between two threads using the join primitive:

micropool::join(|| {
    println!("A {:?}", std::thread::current().id());
}, || {
    println!("B {:?}", std::thread::current().id());
});

// Possible output:
// B ThreadId(2)
// A ThreadId(1)
§Foreground work with parallel iterators

Parallel iterators allow for splitting common list operations across multiple threads. micropool re-exports the paralight library:

use micropool::iter::*;

let len = 10_000;
let input = (0..len as u64).collect::<Vec<u64>>();
let input_slice = input.as_slice();
let result = input_slice
    .par_iter()
    .with_thread_pool(micropool::split_by_threads())
    .sum::<u64>();

assert_eq!(result, 49995000);

The .with_thread_pool line specifies that the current micropool instance should be used, and split_by_threads indicates that each pool thread should process an equal-sized chunk of the data. Other data-splitting strategies available are split_by, split_per_item, and split_per.

§Background work with spawn

Tasks can be spawned asynchronously, then joined later:

let task = micropool::spawn_owned(|| 2 + 2);
println!("Is my task complete yet? {}", task.complete());
println!("The result: {}", task.join());

// Possible output:
// Is my task complete yet? false
// The result: 4

§Scheduling system

The following example illustrates the properties of the micropool scheduling system:

println!("A {:?}", std::thread::current().id());
let background_task = micropool::spawn_owned(|| println!("B {:?}", std::thread::current().id()));

micropool::join(|| {
    std::thread::sleep(std::time::Duration::from_millis(20));
    println!("C {:?}", std::thread::current().id())
}, || {
    println!("D {:?}", std::thread::current().id());
    micropool::join(|| {
        std::thread::sleep(std::time::Duration::from_millis(200));
        println!("E {:?}", std::thread::current().id());
    }, || {
        println!("F {:?}", std::thread::current().id());
    });
});

One possible output of this code might be:

A ThreadId(1)      // The main thread is #1
D ThreadId(2)      // Thread #2 begins helping the outer micropool::join call
C ThreadId(1)      // Thread #1 helps to finish the outer micropool::join call
F ThreadId(1)      // Thread #1 steals work from thread #2, to help complete the inner micropool::join call 
E ThreadId(2)      // Thread #2 finishes the inner micropool::join call
B ThreadId(2)      // Thread #2 grabs and completes the background task; thread #1 should not execute this

There are several key differences between micropool's behavior and rayon's. For instance:

  1. The outer call to join occurs on an external thread. With rayon, this call would simply block and the main thread would wait for pool threads to finish both halves of join. With micropool, the external thread helps.
  2. Because the calling thread always helps complete its work, progress on a synchronous task does not stall. In contrast, if the rayon thread pool is saturated with tasks, the call to join might be long and unpredictable - the rayon workers would need to finish their current tasks first, even if those tasks are unrelated.
  3. When the external thread finishes its work and is blocking on the result of join, there is other work available: the background_task. Completion of background_task is not required for join to return, so the external thread will not run it. In contrast, a blocked rayon thread may run unrelated work in the meantime, so it may take a long, unpredictable amount of time before control flow returns from join.
  4. Worker threads always help with synchronous work (like join) before processing asynchronous tasks created via spawn. This natural separation of foreground and background work ensures that the most important foreground tasks - like per-frame rendering or physics in a game engine - happen first.
  5. According to Dennis Gustafsson, workers that spin while waiting for new tasks sometimes perform better than workers that sleep: when many short tasks are scheduled, the overhead of operating system calls for sleeping can outweigh the compute wasted by spinning. micropool compensates by spinning threads before putting them to sleep.

Modules§

iter
Iterator adaptors to define parallel pipelines more conveniently.

Structs§

OwnedTask
A task whose result is exclusively owned by the caller.
SharedTask
A clonable task whose result can be shared by reference.
ThreadPool
Represents a user-created thread pool.
ThreadPoolBuilder
Determines how a thread pool will behave.

Functions§

join
Takes two closures and potentially runs them in parallel. It returns a pair of the results from those closures.
num_threads
The total number of worker threads in the current pool.
spawn_owned
Spawns an asynchronous task on the current thread pool. The returned handle can be used to obtain the result.
spawn_shared
Spawns a shared asynchronous task on the current thread pool. The returned handle can be used to obtain the result.
split_by
Execute paralight iterators by batching elements. Every iterator will be broken up into a chosen number of separate work units, which may be processed in parallel.
split_by_threads
Execute paralight iterators by batching elements. Every iterator will be broken up into N separate work units, where N is the number of pool threads. Each unit may be processed in parallel.
split_per
Execute paralight iterators by batching elements. Each group of chunk_size elements may be processed by a single thread.
split_per_item
Execute paralight iterators with maximal parallelism. Every iterator item may be processed on a separate thread.