Inferno is a set of tools that let you produce flame graphs from performance profiles of your application. It’s a port of parts of Brendan Gregg’s original flamegraph toolkit that aims to improve the performance of the original flamegraph tools and provide programmatic access to them to facilitate integration with other tools (like not-perf).
Inferno, like the original flame graph toolkit, consists of two “stages”: stack collapsing and plotting. In the original Perl implementations, these were represented by the stackcollapse-* scripts and flamegraph.pl respectively. In Inferno, collapsing is available through the collapse module and the inferno-collapse-* binaries, and plotting can be found in the flamegraph module and the inferno-flamegraph binary.
Collapsing stacks
Most sampling profilers (as opposed to tracing profilers) work by repeatedly recording the state of the call stack. The stack can be sampled based on a fixed sampling interval, based on hardware or software events, or some combination of the two. In the end, you get a series of stack traces, each of which is a snapshot of where the program was at one point in time.
Given enough of these snapshots, you can get a pretty good idea of where your program is spending its time by looking at which functions appear in many of the traces. To ease this analysis, we want to “collapse” the stack traces so if a particular trace occurs more than once, we instead just keep it once along with a count of how many times we’ve seen it. This is what the various collapsing tools do! You’ll sometimes see the resulting tuples of stack + count called a “folded stack trace”.
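To make the idea concrete, here is a minimal sketch of collapsing in Rust using only the standard library. It is not Inferno’s implementation: the function name fold_stacks and the in-memory input (each sample listing its frames from the outermost caller inward) are assumptions made for illustration. The output uses the usual folded form, frames joined with semicolons followed by a space and the count.

```rust
use std::collections::BTreeMap;

/// Collapses sampled stacks into folded lines of the form `root;child;leaf count`.
/// `fold_stacks` is a hypothetical helper, not part of Inferno's API.
/// Each input stack lists its frames from the outermost caller to the innermost callee.
fn fold_stacks(samples: &[Vec<&str>]) -> Vec<String> {
    // A BTreeMap keeps the output sorted by stack, which makes it deterministic.
    let mut counts: BTreeMap<String, u64> = BTreeMap::new();
    for frames in samples {
        // Identical stacks share one key, so a repeated sample only bumps the count.
        *counts.entry(frames.join(";")).or_insert(0) += 1;
    }
    counts
        .into_iter()
        .map(|(stack, count)| format!("{} {}", stack, count))
        .collect()
}
```

Feeding it three samples, two of which are the same stack, yields two folded lines, one of them with a count of 2.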
Since profiling tools produce stack traces in a myriad of different formats, and the flame graph plotter expects input in a particular folded stack trace format, each profiler needs a separate collapse implementation. While the original Perl implementation supports lots of profilers, Inferno currently only supports four: the widely used perf tool (specifically the output from perf script), DTrace, sample, and VTune. Support for xdebug is hopefully coming soon, and bpftrace should get native support before too long.
Inferno supports profiles from applications written in any language, but we’ll walk through an example with a Rust program. To profile a Rust application, you would first set
[profile.release]
debug = true
in your Cargo.toml so that your profile will have useful function names and such included.
Then, compile with --release, and then run your favorite performance profiler:
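perf (Linux)
perf script reads the perf.data file in the current directory, so first record a profile with perf record (for example, perf record --call-graph dwarf target/release/mybin). Then collapse it:
$ perf script | inferno-collapse-perf > stacks.folded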
For more advanced uses, see Brendan Gregg’s excellent perf examples page.
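DTrace (macOS)
$ target/release/mybin &
$ pid=$!
$ cat out.user_stacks | inferno-collapse-dtrace > stacks.folded
Here out.user_stacks is assumed to hold the user-level stacks that dtrace captured for $pid before the collapse step.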
For more advanced uses, see also upstream FlameGraph’s DTrace examples. You may also be interested in something like NodeJS’s ustack helper.
sample (macOS)
$ target/release/mybin &
$ pid=$!
$ sample $pid 30 -file sample.txt
$ inferno-collapse-sample sample.txt > stacks.folded
VTune (Windows and Linux)
$ amplxe-cl -collect hotspots -r resultdir -- target/release/mybin
$ amplxe-cl -R top-down -call-stack-mode all -column=\"CPU Time:Self\",\"Module\" -report-out result.csv -filter \"Function Stack\" -format csv -csv-delimiter comma -r resultdir
$ inferno-collapse-vtune result.csv > stacks.folded
Producing a flame graph
Once you have a folded stack file, you’re ready to produce the flame graph SVG image. To do so, simply provide the folded stack file to inferno-flamegraph, and it will print the resulting SVG. Following on from the example above:
$ cat stacks.folded | inferno-flamegraph > profile.svg
And then open profile.svg in your viewer of choice.
Differential flame graphs
You can debug CPU performance regressions with the help of differential flame graphs. They let you easily visualize the differences between two profiles taken before and after a code change. See Brendan Gregg’s differential flame graphs blog post for a great writeup. To create one you must first pass the two folded stack files to inferno-diff-folded, then send the output to inferno-flamegraph. Typical usage:
$ inferno-diff-folded folded1 folded2 | inferno-flamegraph > diff2.svg
The flame graph will be colored by how each frame’s sample count changed between the two profiles: red where it grew and blue where it shrank. The frame widths will be based on the second folded profile. This might be confusing if stack frames disappear entirely; it will make the most sense to also create a differential based on the first profile’s widths, while switching the hues. To do this, reverse the order of the input files and pass the --negate flag to inferno-flamegraph like this:
$ inferno-diff-folded folded2 folded1 | inferno-flamegraph --negate > diff1.svg
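The core of a differential is pairing up each stack’s count from the two profiles. The following is a rough sketch of that step in Rust using only the standard library. It is not how inferno-diff-folded is implemented: parse_folded and diff_folded are hypothetical names, and the input is assumed to be well-formed folded text with one stack and one count per line.

```rust
use std::collections::BTreeMap;

/// Parses folded lines (`stack count`) into a map from stack to total count.
/// Hypothetical helper for illustration; malformed lines are skipped.
fn parse_folded(input: &str) -> BTreeMap<String, u64> {
    let mut counts = BTreeMap::new();
    for line in input.lines() {
        // The count is whatever follows the last space; frame names may contain spaces.
        if let Some((stack, count)) = line.trim().rsplit_once(' ') {
            if let Ok(n) = count.parse::<u64>() {
                *counts.entry(stack.to_string()).or_insert(0) += n;
            }
        }
    }
    counts
}

/// Returns `(stack, count_before, count_after)` for every stack seen in either
/// profile, using 0 where a stack is missing from one of them.
fn diff_folded(before: &str, after: &str) -> Vec<(String, u64, u64)> {
    let mut merged: BTreeMap<String, (u64, u64)> = BTreeMap::new();
    for (stack, n) in parse_folded(before) {
        merged.entry(stack).or_insert((0, 0)).0 = n;
    }
    for (stack, n) in parse_folded(after) {
        merged.entry(stack).or_insert((0, 0)).1 = n;
    }
    merged
        .into_iter()
        .map(|(stack, (b, a))| (stack, b, a))
        .collect()
}
```

A stack whose second count is larger than its first would be drawn in red, and one whose count dropped in blue. A stack that is absent from the profile used for widths has a count of 0 there, which is why frames can seem to vanish and why producing both diff1.svg and diff2.svg is useful.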
This crate was initially developed through a series of live coding sessions. If you want to contribute to the code, that may be a good way to learn why it’s all designed the way it is!
Stack collapsing for various input formats.
Tool for creating the output required to generate differential flame graphs.
Tools for producing flame graphs from folded stack traces.