1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
//! A benchmark framework designed for testing key-value stores with easily customizable
//! workloads.
//!
//! Key features:
//!
//! 1. Flexible and ergonomic control over benchmark specifications using TOML configuration files.
//! 2. Collecting diverse metrics, including throughput, latency (w/ CDF), and rate-limited latency.
//! 3. One-shot execution of multiple benchmark steps with different properties.
//! 4. Various built-in key-value stores in place as well as a client/server implementation.
//! 5. Highly extensible and can be seamlessly integrated into your own store.
//!
//! # Benchmark Configuration
//!
//! A benchmark in kvbench consists of one or more benchmark runs, termed as *phases*.
//! Phases will be run sequentially following their order in the configuration file.
//!
//! A benchmark configuration file is formatted in TOML. It consists of the definition of each
//! phase in an array named `benchmark`, so the configuration of each phase starts with
//! `[[benchmark]]`. The file also optionally contains a `[global]` section which will override the
//! unspecified field in each phase. This can eliminate redundant options in each phase, for
//! example, when those options are the same across the board.
//!
//! A configuration file generally looks like the following:
//!
//! ```toml
//! [global]
//! # global options
//!
//! [[benchmark]]
//! # phase 1 configuration
//!
//! [[benchmark]]
//! # phase 2 configuration
//!
//! ...
//! ```
//! Options in `[global]` section can also be overwritten via environment variables without
//! modifying the TOML file. For example, if the user needs to override `x` in `[global]`, one can
//! set the environment variable `global.x` (case insensitive). This is helpful when the user would
//! like to run different benchmarks when changing only a few options using a shell script.
//!
//! **Reference**
//!
//! - [`BenchmarkOpt`]: the available options for benchmark phase configuration.
//! - [`GlobalOpt`]: the available options for global configuration.
//!
//! # Key-Value Store Configuration
//!
//! In addition to the specification of the benchmark itself, kvbench also requires the
//! parameters of the key-value store it runs against. Only one key-value store runs at a time.
//!
//! The configuration of a key-value store is stored in a dictionary `map`.
//! A store's configuration file looks like the following:
//!
//! ```toml
//! [map]
//! name = "..."
//! # option1 = ...
//! # option2 = ...
//!
//! ...
//! ```
//! The field `name` must be given and it should be equal to the name registered by the store.
//! Other than `name`, all the fields are parsed as a string map and will be passed to the
//! store's constructor function. The options in `[map]` section can also be overwritten via
//! environment variables (e.g., setting `map.x` overrides property `x`).
//!
//! **Reference**
//!
//! - [`mod@stores`]: the available options for built-in stores and how to register new stores.
//!
//! # Run a Benchmark
//!
//! Once the configuration files of the benchmark along with the key-value store are ready, a
//! benchmark can be started by using the `bench` mode of the built-in command-line interface.
//!
//! **Reference**
//!
//! - [`cmdline()`]: the usage of the default command-line interface.
//!
//! # Metrics Collection
//!
//! Currently, all outputs are in plain text format. This makes the output easy to process using
//! shell scripts and tools including gnuplot. If there are new data added to the output, it
//! will be appended at the end of existing entries (but before `cdf` if it exists, see below)
//! to make sure outputs from old versions can still be processed without changes.
//!
//! ## Throughput-only Output (default case)
//!
//! When measuring throughput, an output may look like the following:
//! ```txt
//! phase 0 repeat 0 duration 1.00 elapsed 1.00 total 1000000 mops 1.00
//! phase 0 repeat 1 duration 1.00 elapsed 2.00 total 1000000 mops 1.00
//! phase 0 repeat 2 duration 1.00 elapsed 3.00 total 1000000 mops 1.00
//! phase 0 finish . duration 1.00 elapsed 3.00 total 3000000 mops 1.00
//! ```
//!
//! The general format is:
//!
//! ```txt
//! phase <p> repeat <r> duration <d> elapsed <e> total <o> mops <t>
//! ```
//!
//! Where:
//!
//! - `<p>`: phase id.
//! - `<r>`: repeat id in a phase, or string `finish .`, if the line is the aggregated report
//! of a whole phase.
//! - `<d>`: the duration of the repeat/phase, in seconds.
//! - `<e>`: the total elapsed seconds since the starting of the program.
//! - `<o>`: the total key-value operations executed by all worker threads in the repeat/phase.
//! - `<t>`: followed by the throughput in million operations per second of the repeat/phase.
//!
//! ## Throughput + Latency Output (when `latency` is `true`)
//!
//! When latency measurement is enabled, the latency metrics shall be printed at the end of each
//! benchmark. It is not shown after each repeat, because unlike throughput which is a singleton
//! value at a given time, latency is a set of values and it usually matters only when we aggregate
//! a lot of them. The output format in this case is generally the same as throughput-only
//! measurements, but the `finish` line has extra output like the following:
//!
//! ```txt
//! phase 0 repeat 0 duration 1.00 elapsed 1.00 total 1000000 mops 1.00
//! phase 0 repeat 1 duration 1.00 elapsed 2.00 total 1000000 mops 1.00
//! phase 0 repeat 2 duration 1.00 elapsed 3.00 total 1000000 mops 1.00
//! phase 0 finish . duration 1.00 elapsed 3.00 total 3000000 mops 1.00 min_us 0.05 max_us 100.00 avg_us 50.00 p50_us 50.00 p95_us 95.00 p99_us 99.00 p999_us 100.00
//! ```
//!
//! The extra output on the last line has a format of:
//!
//! ```txt
//! min_us <i> max_us <a> avg_us <v> p50_us <m> p95_us <n> p99_us <p> p999_us <t>
//! ```
//!
//! Where (all units are microseconds):
//!
//! - `<i>`: minimum latency
//! - `<a>`: maximum latency
//! - `<v>`: mean latency
//! - `<m>`: median latency (50% percentile)
//! - `<n>`: P95 latency
//! - `<p>`: P99 latency
//! - `<t>`: P999 latency (99.9%)
//!
//! ## Throughput + Latency + Latency CDF Mode (when both `latency` and `cdf` are `true`)
//!
//! When `cdf` is enabled, the latency CDF data will be printed at the end of the same line as the
//! latency metrics above. In that case, the output will be like the following:
//!
//! ```txt
//! phase 0 repeat 0 duration 1.00 elapsed 1.00 total 1000000 mops 1.00
//! phase 0 repeat 1 duration 1.00 elapsed 2.00 total 1000000 mops 1.00
//! phase 0 repeat 2 duration 1.00 elapsed 3.00 total 1000000 mops 1.00
//! phase 0 finish . duration 1.00 elapsed 3.00 total 3000000 mops 1.00 min_us 0.05 max_us 100.00 avg_us 50.00 p50_us 50.00 p95_us 95.00 p99_us 99.00 p999_us 100.00 cdf_us percentile ...
//! ```
//! Since the latency metrics vary a lot between different benchmarks/runs, the number of data
//! points of the CDF is different. Therefore, it is printed at the end of the output only. It is
//! printed as a tuple of `<us> <percentile>` where `<us>` is the latency in microseconds and
//! `<percentile>` is the percentile of the accumulated operations with latency higher than between
//! `<us> - 1` and `<us>`, inclusively, ranging from 0 to 100 (two digit precision).
//! There can be arbitrary number of tuples. The output ends when the maximum recorded latency is
//! reached.
//!
//! An example of the CDF data will look like:
//!
//! ```txt
//! cdf_us percentile 1 0.00 2 0.00 3 0.00 4 10.00 5 20.00 6 20.00 ...
//! ```
//!
//! It means there are not data points at 1/2/3 microseconds. At 4 microseconds, there are 10% data
//! points. At 5 microseconds, there are another 10% data points which makes the total percentile
//! 20.00. At 6 microseconds, there are no data points so the percentile is still 20.00. Users can
//! post-process the output and make a smooth CDF plot out of it.
//!
//! # Server Mode
//! A key-value client/server implementation is available in kvbench. The server can be backed by
//! an arbitrary key-value store defined by a TOML file as in a benchmark, and the server can be
//! started using the `server` mode of the built-in command-line interface.
//!
//! To benchmark the server's performance, users can use the built-in client implementation.
//!
//! **Reference**
//!
//! - [`cmdline()`]: the usage of the default command-line interface.
//! - [`stores::remote`]: the available options of the key-value store client.
use ;
use RefCell;
use Rc;
use Arc;
/// A synchronous, thread-safe key-value store.
///
/// This trait is used for owned stores, with which a per-thread handle can be created. The default
/// benchmark/server implementation is provided, unless the use case needs to use specific thread
/// management implementations.
/// A per-thread handle that references a [`KVMap`].
///
/// The handle is the real object that exposes a key-value interface.
/// A single operation that is applied to the key-value store.
///
/// This trait is used mainly in [`AsyncKVMap`] and server/client implementation.
/// A request submitted by an asynchronous store.
/// A response received by an asynchronous store.
/// A non-blocking, thread-safe key-value map.
///
/// Unlike [`KVMap`], [`AsyncKVMap`] works in request/response style. Where each handle needs to be
/// created by registering an explicit responder that serves as the "callback" when the underlying
/// routine produces a response.
///
/// Specifically, for benchmark, each worker thread maintains just one handle, so the buffer is
/// per-worker. For server, each worker may manage multiple connections. Each connection needs its
/// own responder (we should not mix responses for different connections, obviously). Therefore, it
/// creates a handle for each incoming connection, and maintains a responder for it.
/// A per-thread handle that references a [`AsyncKVMap`].
///
/// The handle is the real object that exposes a key-value interface.
/// An asynchronous entry point to callback when a response returns.
pub use ;
pub use cmdline;
pub use WorkloadOpt;
pub extern crate inventory;
pub extern crate toml;