//! # Metrics oriented profiler + bencher.
//!
//! This library provides a way to define metrics through the [`Metrics`] trait and gather them
//! in places instrumented by the [`tracing`] crate.
//!
//! ## Bencher and profiler in one place
//!
//! Imagine having crate with some defined pipeline:
//! ```rust,ignore
//! fn pipeline(data: &[u8]) -> Vec<u8> {
//!     serialize(process(parse(data)))
//! }
//! ```
//! At some point the pipeline's performance no longer suits your needs and you want to start
//! optimising, but first you need to know which part is slowing the pipeline down.
//!
//! So you start to integrate a bencher and a profiler.
//!
//! The profiler shows which parts took the most time, and the bencher snapshots the
//! current state of performance to guard against future regressions.
//!
//! With the classic toolset you end up with:
//! 1. A benchmark for each pipeline phase (with setup code and synthetic data for each phase)
//! 2. An entrypoint with test data suitable for the profiler (it can be shared with the bench, but with care)
//! 3. `[Optional]` Some metrics in production, to gather performance stats there as well
//!
//! This approach involves a lot of duplication and boilerplate, and also forces you to expose some private APIs (the inputs/outputs of pipeline phases).
//!
//! Instead, you can use [`profiler`] to simplify the process:
//! ```rust,no_run
//! use profiler::Metrics;
//!
//! // Instrument the functions you want to observe using `tracing::instrument`
//! #[tracing::instrument(skip_all)]
//! fn parse(data: &[u8]) -> Vec<u32> {
//!     data.chunks(4).map(|c| u32::from_le_bytes(c.try_into().unwrap_or([0; 4]))).collect()
//! }
//! fn process(items: Vec<u32>) -> u64 {
//!     // Or use the `tracing::span` api
//!     let _span = tracing::info_span!("process").entered();
//!     items.iter().map(|&x| x as u64).sum()
//! }
//! #[tracing::instrument(skip_all)]
//! fn serialize(result: u64) -> Vec<u8> {
//!     result.to_le_bytes().to_vec()
//! }
//! fn pipeline(data: &[u8]) -> Vec<u8> {
//!     serialize(process(parse(data)))
//! }
//!
//! // -- And create a single entrypoint with custom setup.
//! fn bench_pipeline() {
//!     let data: Vec<u8> = (0..1024u16).flat_map(|x| x.to_le_bytes()).collect();
//!     pipeline(&data);
//! }
//! profiler::bench_main!(bench_pipeline);
//! ```
//! Put this file in `<CARGO_ROOT>/benches/bench_name.rs` and add a bench section to `Cargo.toml`:
//! ```toml
//! [[bench]]
//! name = "bench_name"
//! harness = false
//! ```
//!
//! Run it with `cargo bench`, and you now have a single entrypoint where you can observe and debug performance regressions.
//!
//! ## Extend metrics
//!
//! By default [`profiler`] provides multiple [metrics providers](metrics)
//! and implements a default [`bench::MetricsProvider`] used in benchmarking.
//! You can decide what is important to you by deriving your own combination of the [`Metrics`] trait
//! with `#[derive(Metrics)]`, and use it in [`bench_main!`].
//!
//! If you need to track something unique to your application (bytes read, slab size, etc.), you can define your own provider
//! using the [`SingleMetric`] trait.
//!
//! If you want to collect metrics outside of a benchmark, you can use the [`Collector`] api directly.
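//!
//! As a minimal sketch (the `Collector::default()` constructor, `run_instrumented_code`, and the
//! printed entry shape are assumptions for illustration, not part of the documented api),
//! registering the collector as a [`tracing_subscriber`] layer and draining it could look like:
//! ```rust,ignore
//! use tracing_subscriber::prelude::*;
//!
//! // The collector is cheaply cloneable, so keep a handle around for draining later.
//! let collector = profiler::Collector::default();
//! tracing_subscriber::registry().with(collector.clone()).init();
//!
//! run_instrumented_code(); // any code containing `tracing` spans
//!
//! // Retrieve the `ProfileEntry` records gathered so far.
//! for entry in collector.drain() {
//!     println!("{entry:?}");
//! }
//! ```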
//!
//! [`profiler`]: https://docs.rs/profiler
use ;
use ;
pub use crate::metrics::PerfEventMetric;
pub use crate;
pub use crate;
/// Derive macros for [`Metrics`] trait.
///
/// Using this macro emits implementations of the [`Metrics`] and [`Default`] traits, plus some helper assertions.
///
/// By default each metric field is initialized using its [`Default::default()`] implementation,
/// but the constructor can be customized with the `#[new(...)]` attribute,
/// which provides the arguments for the metric type's `new` method.
///
/// In the example below, the `cycles` field will be initialized with `PerfEventMetric::new(perf_event::events::Hardware::CPU_CYCLES)`.
///
/// # Example:
/// ```rust
/// use profiler::{InstantProvider, Metrics, PerfEventMetric};
/// use profiler::metrics::perf_event;
///
/// #[derive(Metrics)]
/// pub struct MetricsProvider {
///     /// Without the `#[new]` attribute, the metric is initialized with `Default::default()`.
///     /// wall_time can be gathered from `Instant` or from `perf_event(CPU_CLOCK)`; the results
///     /// are similar, but `Instant` is more portable.
///     pub wall_time: InstantProvider,
///     /// CPU cycles spent in the span.
///     /// The first metric in the list is used as the primary metric and adds a %-of-parent
///     /// column to the report.
///     #[new(perf_event::events::Hardware::CPU_CYCLES)]
///     pub cycles: PerfEventMetric,
/// }
/// ```
///
/// The macro only applies to structs where every member implements the [`SingleMetric`] trait.
///
/// <div class="warning">
/// Note: <code>Metrics</code> and <code>SingleMetric</code> are different traits.
/// </div>
///
/// The `#[config]` attribute customizes display options for each metric; see [`MetricReportInfo`] for more details.
///
/// # Example:
/// ```rust
/// use profiler::{Metrics, PerfEventMetric};
/// use profiler::metrics::perf_event;
///
/// #[derive(Metrics)]
/// pub struct MetricsProvider {
///     #[new(perf_event::events::Hardware::CPU_CYCLES)]
///     #[config(show_spread = false)]
///     pub cycles: PerfEventMetric,
/// }
/// ```
///
/// [`MetricReportInfo`]: crate::metrics::MetricReportInfo
pub use Metrics;
pub use black_box;
///
/// Entry collected by [`Collector`].
///
/// This is a pure data transfer object, storing information about a span
/// and the result of the metrics calculation at span close.
/// Single collector: a [`tracing_subscriber::Layer`] that captures
/// [`ProfileEntry`] events on span enter/exit into an internal buffer.
///
/// Cheaply cloneable (the inner state is behind an [`Arc`]).
///
/// The current implementation collects all entries into one synchronized [`Vec`];
/// call [`Collector::drain()`] at some point to retrieve the entries.