//! Inferno is a set of tools that let you produce [flame graphs] from performance profiles of
//! your application. It's a port of parts of Brendan Gregg's original [flamegraph toolkit] that
//! aims to improve the performance of the original flamegraph tools and provide programmatic
//! access to them to facilitate integration with _other_ tools (like [not-perf]).
//!
//! Inferno, like the original flame graph toolkit, consists of two "stages": stack collapsing and
//! plotting. In the original Perl implementations, these were represented by the `stackcollapse-*`
//! binaries and `flamegraph.pl` respectively. In Inferno, collapsing is available through the
//! [`collapse`] module and the `inferno-collapse-*` binaries, and plotting can be found in the
//! [`flamegraph`] module and the `inferno-flamegraph` binary.
//!
//! # Command-line use
//!
//! ## Collapsing stacks
//!
//! Most sampling profilers (as opposed to [tracing profilers]) work by repeatedly recording the
//! state of the [call stack]. The stack can be sampled based on a fixed sampling interval, based
//! on [hardware or software events], or some combination of the two. In the end, you get a series
//! of [stack traces], each of which represents a snapshot of where the program was at different
//! points in time.
//!
//! Given enough of these snapshots, you can get a pretty good idea of where your program is
//! spending its time by looking at which functions appear in many of the traces. To ease this
//! analysis, we want to "collapse" the stack traces so if a particular trace occurs more than
//! once, we instead just keep it _once_ along with a count of how many times we've seen it. This
//! is what the various collapsing tools do! You'll sometimes see the resulting tuples of stack +
//! count called a "folded stack trace".
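//!
//! The idea can be sketched with nothing but the standard library. This is only an illustration of
//! the counting step, not how Inferno's collapsers are implemented: the `collapse` function is
//! hypothetical, and it assumes the folded format joins frames with `;` (outermost first) and
//! appends the count after a space.
//!
//! ```rust
//! use std::collections::BTreeMap;
//!
//! /// Collapse raw stack traces (each a list of frames, outermost first) into
//! /// folded lines of the form `frame1;frame2;... count`, sorted by stack.
//! /// Hypothetical helper for illustration; not part of Inferno's API.
//! fn collapse(traces: &[Vec<&str>]) -> Vec<String> {
//!     let mut counts: BTreeMap<String, usize> = BTreeMap::new();
//!     for trace in traces {
//!         // Identical traces share the same key, so they are stored once and counted.
//!         *counts.entry(trace.join(";")).or_insert(0) += 1;
//!     }
//!     counts
//!         .into_iter()
//!         .map(|(stack, count)| format!("{} {}", stack, count))
//!         .collect()
//! }
//! ```
//!
//! Given the traces `main -> parse`, `main -> render`, and `main -> parse` again, this sketch
//! yields the two folded lines `main;parse 2` and `main;render 1`.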
//!
//! Since profiling tools produce stack traces in a myriad of different formats, and the flame
//! graph plotter expects input in a particular folded stack trace format, each profiler needs a
//! separate collapse implementation. While the original Perl implementation supports _lots_ of
//! profilers, Inferno currently only supports four: the widely used [`perf`] tool (specifically
//! the output from `perf script`), [DTrace], [sample], and [VTune]. Support for xdebug is
//! [hopefully coming soon], and [`bpftrace`] should get [native support] before too long.
//!
//! Inferno supports profiles from applications written in any language, but we'll walk through an
//! example with a Rust program. To profile a Rust application, you would first set
//!
//! ```toml
//! [profile.release]
//! debug = true
//! ```
//!
//! in your `Cargo.toml` so that your profile will have useful function names and such included.
//! Then compile with `--release` and run your favorite performance profiler:
//!
//! ### perf (Linux)
//!
//! ```console
//! # perf record --call-graph dwarf target/release/mybin
//! $ perf script | inferno-collapse-perf > stacks.folded
//! ```
//!
//! For more advanced uses, see Brendan Gregg's excellent [perf examples] page.
//!
//! ### DTrace (macOS)
//!
//! ```console
//! $ target/release/mybin &
//! $ pid=$!
//! # dtrace -x ustackframes=100 -n "profile-97 /pid == $pid/ { @[ustack()] = count(); } tick-60s { exit(0); }"  -o out.user_stacks
//! $ cat out.user_stacks | inferno-collapse-dtrace > stacks.folded
//! ```
//!
//! For more advanced uses, see also upstream FlameGraph's [DTrace examples].
//! You may also be interested in something like [NodeJS's ustack helper].
//!
//! ### sample (macOS)
//!
//! ```console
//! $ target/release/mybin &
//! $ pid=$!
//! $ sample $pid 30 -file sample.txt
//! $ inferno-collapse-sample sample.txt > stacks.folded
//! ```
//!
//! ### VTune (Windows and Linux)
//!
//! ```console
//! $ amplxe-cl -collect hotspots -r resultdir -- target/release/mybin
//! $ amplxe-cl -R top-down -call-stack-mode all -column="CPU Time:Self","Module" -report-out result.csv -filter "Function Stack" -format csv -csv-delimiter comma -r resultdir
//! $ inferno-collapse-vtune result.csv > stacks.folded
//! ```
//!
//! ## Producing a flame graph
//!
//! Once you have a folded stack file, you're ready to produce the flame graph SVG image. To do so,
//! simply provide the folded stack file to `inferno-flamegraph`, and it will print the resulting
//! SVG. Following on from the example above:
//!
//! ```console
//! $ cat stacks.folded | inferno-flamegraph > profile.svg
//! ```
//!
//! And then open `profile.svg` in your viewer of choice.
//!
//! ## Differential flame graphs
//!
//! You can debug CPU performance regressions with the help of differential flame graphs.
//! They let you easily visualize the differences between two profiles taken before and
//! after a code change. See Brendan Gregg's [differential flame graphs] blog post for a great
//! writeup. To create one, first pass the two folded stack files to `inferno-diff-folded`,
//! then send the output to `inferno-flamegraph`. For example:
//!
//! ```console
//! $ inferno-diff-folded folded1 folded2 | inferno-flamegraph > diff2.svg
//! ```
//!
//! The flame graph will be colored by how the sample counts changed: red where they increased and
//! blue where they decreased. The frame widths will be based on the 2nd folded profile. This might
//! be confusing if stack frames disappear entirely; it will make the most sense to also create a
//! differential based on the 1st profile widths, while switching the hues. To do this, reverse the
//! order of the input files and pass the `--negate` flag to `inferno-flamegraph` like this:
//!
//! ```console
//! $ inferno-diff-folded folded2 folded1 | inferno-flamegraph --negate > diff1.svg
//! ```
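//!
//! To make the intermediate step more concrete, here is a rough, standard-library-only sketch of
//! what diffing two folded profiles involves. It is not Inferno's implementation: `parse_folded`
//! and `diff_folded` are hypothetical names, and the three-column `stack before after` output is
//! an assumption modeled on upstream FlameGraph's `difffolded.pl`.
//!
//! ```rust
//! use std::collections::BTreeMap;
//!
//! /// Parse folded lines (`stack count`) into a map from stack to total count.
//! /// Hypothetical helper for illustration; lines without a numeric count are skipped.
//! fn parse_folded(input: &str) -> BTreeMap<&str, u64> {
//!     let mut counts = BTreeMap::new();
//!     for line in input.lines() {
//!         // The count follows the last space, since frame names may contain spaces.
//!         if let Some((stack, count)) = line.rsplit_once(' ') {
//!             if let Ok(n) = count.trim().parse::<u64>() {
//!                 *counts.entry(stack).or_insert(0) += n;
//!             }
//!         }
//!     }
//!     counts
//! }
//!
//! /// Emit one `stack before after` line for every stack seen in either profile.
//! /// A stack missing from one profile gets a count of 0 on that side.
//! fn diff_folded(before: &str, after: &str) -> Vec<String> {
//!     let before = parse_folded(before);
//!     let after = parse_folded(after);
//!     let mut stacks: Vec<&str> = before.keys().chain(after.keys()).copied().collect();
//!     stacks.sort_unstable();
//!     stacks.dedup();
//!     stacks
//!         .into_iter()
//!         .map(|stack| {
//!             let b = before.get(stack).copied().unwrap_or(0);
//!             let a = after.get(stack).copied().unwrap_or(0);
//!             format!("{} {} {}", stack, b, a)
//!         })
//!         .collect()
//! }
//! ```
//!
//! A stack that exists only in the second profile shows up with a "before" count of 0, which is
//! why frames can appear or vanish between the two differential graphs described above.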
//!
//! # Development
//!
//! This crate was initially developed through [a series of live coding sessions]. If you want to
//! contribute to the code, that may be a good way to learn why it's all designed the way it is!
//!
//!   [flame graphs]: http://www.brendangregg.com/flamegraphs.html
//!   [flamegraph toolkit]: https://github.com/brendangregg/FlameGraph
//!   [not-perf]: https://github.com/nokia/not-perf
//!   [tracing profilers]: https://danluu.com/perf-tracing/
//!   [call stack]: https://en.wikipedia.org/wiki/Call_stack
//!   [hardware or software events]: https://perf.wiki.kernel.org/index.php/Tutorial#Events
//!   [stack traces]: https://en.wikipedia.org/wiki/Stack_trace
//!   [`perf`]: https://perf.wiki.kernel.org/index.php/Main_Page
//!   [DTrace]: https://www.joyent.com/dtrace
//!   [hopefully coming soon]: https://twitter.com/DanielLockyer/status/1094605231155900416
//!   [native support]: https://github.com/jonhoo/inferno/issues/51#issuecomment-466732304
//!   [`bpftrace`]: https://github.com/iovisor/bpftrace
//!   [perf examples]: http://www.brendangregg.com/perf.html
//!   [DTrace examples]: http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html#DTrace
//!   [NodeJS's ustack helper]: http://dtrace.org/blogs/dap/2012/01/05/where-does-your-node-program-spend-its-time/
//!   [a series of live coding sessions]: https://www.youtube.com/watch?v=jTpK-bNZiA4&list=PLqbS7AVVErFimAvMW-kIJUwxpPvcPBCsz
//!   [differential flame graphs]: http://www.brendangregg.com/blog/2014-11-09/differential-flame-graphs.html
//!   [sample]: https://gist.github.com/loderunner/36724cc9ee8db66db305#profiling-with-sample
//!   [VTune]: https://software.intel.com/en-us/vtune-amplifier-help-command-line-interface

#![cfg_attr(doc, warn(rustdoc::all))]
#![cfg_attr(doc, allow(rustdoc::missing_doc_code_examples))]
#![deny(missing_docs)]
#![warn(unreachable_pub)]
#![cfg_attr(all(test, feature = "nightly"), feature(test))]
#![allow(clippy::disallowed_names)]

#[cfg(all(test, feature = "nightly"))]
extern crate test;

/// Stack collapsing for various input formats.
///
/// See the [crate-level documentation] for details.
///
///   [crate-level documentation]: ../index.html
pub mod collapse;

/// Tools for creating the output required to generate differential flame graphs.
///
/// See the [crate-level documentation] for details.
///
///   [crate-level documentation]: ../index.html
pub mod differential;

/// Tools for producing flame graphs from folded stack traces.
///
/// See the [crate-level documentation] for details.
///
///   [crate-level documentation]: ../index.html
pub mod flamegraph;