use derive_more::Display;

/// How work is distributed among processors during a benchmark run.
///
/// The work is redistributed for each benchmark iteration, ensuring that hardware-specific
/// performance anomalies are averaged out (e.g. if some processors have worse thermals and
/// throttle more often).
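///
/// # Example
///
/// A minimal sketch of picking a mode and checking it against the full set returned by
/// [`WorkDistribution::all`]; the crate path in the import is a placeholder, not the
/// real crate name:
///
/// ```ignore
/// use my_benchmark_crate::WorkDistribution; // hypothetical crate path
///
/// let distribution = WorkDistribution::PinnedMemoryRegionPairs;
/// assert!(WorkDistribution::all().contains(&distribution));
/// ```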
#[derive(Copy, Clone, Debug, Display, Eq, PartialEq)]
pub enum WorkDistribution {
    /// One worker pair is spawned for each pair of numerically neighboring memory regions.
    ///
    /// For example, with 3 memory regions, we would have 3 pairs of workers: (0, 1), (1, 2)
    /// and (2, 0) - see the sketch at the end of this comment.
    ///
    /// Each pair will work together, processing one payload per pair. In total, there will
    /// be two workers per memory region (one working with the "previous" memory region and
    /// one working with the "next" one).
    ///
    /// Each worker is pinned to a specific processor.
    ///
    /// Different pairs of memory regions may be different distances apart, so covering every
    /// neighboring pair averages out those differences - some pairs are faster, some are
    /// slower - and every benchmark run stays consistent, instead of depending on two
    /// randomly picked memory regions.
    ///
    /// This option can only be used if there are at least two memory regions. Benchmark runs
    /// with this distribution will be skipped if the system only has a single memory region.
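    ///
    /// A self-contained sketch of the neighboring-pair scheme described above, for
    /// illustration only (the real pairing logic lives elsewhere in the benchmark harness):
    ///
    /// ```
    /// // With N memory regions, pair each region with its numeric successor,
    /// // wrapping around at the end.
    /// let region_count: usize = 3;
    /// let pairs: Vec<(usize, usize)> = (0..region_count)
    ///     .map(|i| (i, (i + 1) % region_count))
    ///     .collect();
    ///
    /// assert_eq!(pairs, [(0, 1), (1, 2), (2, 0)]);
    /// ```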
    PinnedMemoryRegionPairs,
    /// Both workers in each pair are spawned in the same memory region.
    ///
    /// Each pair will work together, processing one payload between the two members.
    /// Different pairs may be in different memory regions.
    ///
    /// Each worker is pinned to a specific processor.
    ///
    /// The number of pairs will match the number that would have been used with
    /// `PinnedMemoryRegionPairs`, for optimal comparability. There will be a minimum of
    /// one pair.
    PinnedSameMemoryRegion,
    /// Both workers in each pair are spawned on the same processor, picked arbitrarily.
    ///
    /// Each pair will work together, processing one payload between the two members.
    /// Different pairs may be in different memory regions.
    ///
    /// The number of pairs will match the number that would have been used with
    /// `PinnedMemoryRegionPairs`, for optimal comparability. There will be a minimum of
    /// one pair.
    PinnedSameProcessor,
    /// All workers are spawned without regard to memory region, each pinned to a randomly
    /// picked processor; the processors are picked anew for each iteration.
    ///
    /// Each worker is given back its own payload - while we still spawn the same number of
    /// workers as in the paired scenarios, each member of a pair operates independently and
    /// processes its own payload.
    ///
    /// Note that this requires the benchmark scenario logic to be capable of handling its
    /// own data set. If the benchmark logic requires two collaborating workers, you cannot
    /// use this work distribution, as it would likely end in a deadlock due to the lack of
    /// a partner.
    PinnedSelf,
    /// Like `PinnedMemoryRegionPairs`, but each worker is allowed to float among all the
    /// processors in the memory region, based on the operating system's scheduling decisions.
    ///
    /// We still have the same total number of workers, to keep total system load equivalent.
    UnpinnedMemoryRegionPairs,
    /// Like `PinnedSameMemoryRegion`, but each worker is allowed to float among half the
    /// processors in the memory region, based on the operating system's scheduling
    /// decisions. Each member of the pair gets one half of the processors in the memory
    /// region - see the sketch at the end of this comment.
    ///
    /// We still have the same total number of workers, to keep total system load equivalent.
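    ///
    /// A self-contained sketch of the half-and-half split described above (illustrative
    /// only; the real processor-selection logic lives elsewhere in the harness):
    ///
    /// ```
    /// // Split the processors of one memory region between the two pair members.
    /// let processors: Vec<usize> = (0..8).collect();
    /// let (first_half, second_half) = processors.split_at(processors.len() / 2);
    ///
    /// assert_eq!(first_half, &[0, 1, 2, 3]);
    /// assert_eq!(second_half, &[4, 5, 6, 7]);
    /// ```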
    UnpinnedSameMemoryRegion,
    /// All workers are spawned without regard to memory region or processor, leaving it up
    /// to the operating system to decide where to run them. Note that, typically, this will
    /// still result in them running in the same memory region, as that tends to be the
    /// default behavior.
    ///
    /// Each worker is given back its own payload - while we still spawn the same number of
    /// workers as in the paired scenarios, each member of a pair operates independently and
    /// processes its own payload.
    ///
    /// Note that this requires the benchmark scenario logic to be capable of handling its
    /// own data set. If the benchmark logic requires two collaborating workers, you cannot
    /// use this work distribution, as it would likely end in a deadlock due to the lack of
    /// a partner.
    UnpinnedSelf,
}

impl WorkDistribution {
    /// All the work distribution modes.
    pub fn all() -> &'static [WorkDistribution] {
        &[
            WorkDistribution::PinnedMemoryRegionPairs,
            WorkDistribution::PinnedSameMemoryRegion,
            WorkDistribution::PinnedSameProcessor,
            WorkDistribution::PinnedSelf,
            WorkDistribution::UnpinnedMemoryRegionPairs,
            WorkDistribution::UnpinnedSameMemoryRegion,
            WorkDistribution::UnpinnedSelf,
        ]
    }
}
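
// A minimal sketch exercising the API above; the module name and assertions are
// illustrative, not part of the crate's real test suite.
#[cfg(test)]
mod work_distribution_examples {
    use super::WorkDistribution;

    #[test]
    fn all_lists_every_mode_exactly_once() {
        let all = WorkDistribution::all();

        // One entry per enum variant.
        assert_eq!(all.len(), 7);

        // No duplicates: every pairwise combination of entries differs.
        for (i, a) in all.iter().enumerate() {
            for b in &all[i + 1..] {
                assert_ne!(a, b);
            }
        }
    }
}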