
Inferno is a set of tools that let you produce flame graphs from performance profiles of your application. It’s a port of parts of Brendan Gregg’s original flamegraph toolkit that aims to improve the performance of the original tools and to provide programmatic access to them, making it easier to integrate with other tools (like not-perf).

Inferno, like the original flame graph toolkit, consists of two “stages”: stack collapsing and plotting. In the original Perl implementations, these were represented by the stackcollapse-* binaries and flamegraph.pl respectively. In Inferno, collapsing is available through the collapse module and the inferno-collapse-* binaries, and plotting can be found in the flamegraph module and the inferno-flamegraph binary.

Command-line use

Collapsing stacks

Most sampling profilers (as opposed to tracing profilers) work by repeatedly recording the state of the call stack. The stack can be sampled at a fixed interval, in response to hardware or software events, or some combination of the two. In the end, you get a series of stack traces, each of which is a snapshot of where the program was at one point in time.

Given enough of these snapshots, you can get a pretty good idea of where your program is spending its time by looking at which functions appear in many of the traces. To ease this analysis, we want to “collapse” the stack traces so if a particular trace occurs more than once, we instead just keep it once along with a count of how many times we’ve seen it. This is what the various collapsing tools do! You’ll sometimes see the resulting tuples of stack + count called a “folded stack trace”.
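To make the idea concrete, here is a minimal sketch of collapsing in Rust using only the standard library. It is not Inferno’s implementation: the names collapse_stacks and render_folded are made up for illustration, and the `frame;frame;... count` line layout is assumed to match the folded format used by the original flamegraph tools.

```rust
use std::collections::BTreeMap;

/// Collapse raw stack samples into folded form (hypothetical helper).
///
/// Each sample lists its frames from the outermost caller to the innermost
/// callee. Identical samples are merged into one entry with a count.
fn collapse_stacks(samples: &[Vec<&str>]) -> BTreeMap<String, usize> {
    let mut folded = BTreeMap::new();
    for frames in samples {
        // Join the frames with ';' to form the key for this stack.
        let key = frames.join(";");
        *folded.entry(key).or_insert(0) += 1;
    }
    folded
}

/// Render the folded stacks as text, one `stack count` line per unique stack.
/// Lines come out sorted by stack because BTreeMap iterates in key order.
fn render_folded(folded: &BTreeMap<String, usize>) -> String {
    folded
        .iter()
        .map(|(stack, count)| format!("{stack} {count}\n"))
        .collect()
}
```

Calling collapse_stacks on three samples, two of which are the identical stack main, parse, read, yields two entries: one with a count of 2 and one with a count of 1.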

Since profiling tools produce stack traces in a myriad of different formats, and the flame graph plotter expects input in a particular folded stack trace format, each profiler needs a separate collapse implementation. While the original Perl implementation supports lots of profilers, Inferno currently only supports four: the widely used perf tool (specifically the output from perf script), DTrace, sample, and VTune. Support for xdebug is hopefully coming soon, and bpftrace should get native support before too long.

Inferno supports profiles from applications written in any language, but we’ll walk through an example with a Rust program. To profile a Rust application, you would first set

[profile.release]
debug = true

in your Cargo.toml so that the profile includes useful function names and other debug information. Then compile with --release and run your favorite performance profiler:

perf (Linux)

$ perf record --call-graph dwarf target/release/mybin
$ perf script | inferno-collapse-perf > stacks.folded

For more advanced uses, see Brendan Gregg’s excellent perf examples page.

DTrace (macOS)

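$ target/release/mybin &
$ pid=$!
$ dtrace -x ustackframes=100 -n "profile-97 /pid == $pid/ { @[ustack()] = count(); } tick-60s { exit(0); }" -o out.user_stacks
$ cat out.user_stacks | inferno-collapse-dtrace > stacks.folded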

For more advanced uses, see also upstream FlameGraph’s DTrace examples. You may also be interested in something like NodeJS’s ustack helper.

sample (macOS)

$ target/release/mybin &
$ pid=$!
$ sample $pid 30 -file sample.txt
$ inferno-collapse-sample sample.txt > stacks.folded

VTune (Windows and Linux)

$ amplxe-cl -collect hotspots -r resultdir -- target/release/mybin
$ amplxe-cl -R top-down -call-stack-mode all -column="CPU Time:Self","Module" -report-out result.csv -filter "Function Stack" -format csv -csv-delimiter comma -r resultdir
$ inferno-collapse-vtune result.csv > stacks.folded

Producing a flame graph

Once you have a folded stack file, you’re ready to produce the flame graph SVG image. To do so, simply provide the folded stack file to inferno-flamegraph, and it will print the resulting SVG. Following on from the example above:

$ cat stacks.folded | inferno-flamegraph > profile.svg

And then open profile.svg in your viewer of choice.
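Before plotting, it can be handy to sanity-check a folded file. The sketch below is a standalone illustration, not part of Inferno’s API: parse_folded_line and summarize are hypothetical names, and the code assumes each line has the form `frame1;frame2;...;frameN count`. It reports the total number of samples and, for each function, how many samples had it as the innermost frame.

```rust
use std::collections::HashMap;

/// Parse one folded line of the assumed form `frame1;frame2;...;frameN count`.
/// Returns `None` for blank or malformed lines.
fn parse_folded_line(line: &str) -> Option<(Vec<&str>, u64)> {
    // The count is the last space-separated field. Frame names may contain
    // spaces themselves, so split from the right.
    let (stack, count) = line.trim_end().rsplit_once(' ')?;
    let count = count.parse::<u64>().ok()?;
    Some((stack.split(';').collect(), count))
}

/// Return the total sample count and the per-function "self" samples,
/// i.e. samples in which the function was the innermost (last) frame.
fn summarize(folded: &str) -> (u64, HashMap<String, u64>) {
    let mut total = 0;
    let mut self_samples = HashMap::new();
    for (frames, count) in folded.lines().filter_map(parse_folded_line) {
        total += count;
        if let Some(leaf) = frames.last() {
            *self_samples.entry(leaf.to_string()).or_insert(0) += count;
        }
    }
    (total, self_samples)
}
```

Reading stacks.folded with std::fs::read_to_string and passing the contents to summarize gives a quick check that the profile contains the number of samples you expect.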

Differential flame graphs

You can debug CPU performance regressions with the help of differential flame graphs. They let you easily visualize the differences between two profiles taken before and after a code change. See Brendan Gregg’s differential flame graphs blog post for a great writeup. To create one, first pass the two folded stack files to inferno-diff-folded, then send the output to inferno-flamegraph. Example:

$ inferno-diff-folded folded1 folded2 | inferno-flamegraph > diff2.svg

The flame graph is colored by the change in sample counts: red where a frame gained samples and blue where it lost samples. Frame widths are based on the second folded profile. This can be confusing if stack frames disappear entirely, so it makes the most sense to also create a differential based on the first profile’s widths, with the hues switched. To do this, reverse the order of the input files and pass the --negate flag to inferno-flamegraph like this:

$ inferno-diff-folded folded2 folded1 | inferno-flamegraph --negate > diff1.svg
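The differencing step can be sketched in a few lines of standard-library Rust. This is an illustration under assumptions rather than Inferno’s code: parse_folded, diff_folded, and hue are hypothetical names, and the `stack before after` output shape is assumed to mirror what the original Perl difffolded.pl emits. Stacks missing from one profile get a count of 0 on that side.

```rust
use std::collections::BTreeMap;

/// Parse folded text into (stack, count) pairs, skipping malformed lines.
fn parse_folded(text: &str) -> Vec<(&str, u64)> {
    text.lines()
        .filter_map(|line| {
            let (stack, n) = line.trim_end().rsplit_once(' ')?;
            Some((stack, n.parse::<u64>().ok()?))
        })
        .collect()
}

/// Pair up the counts from two folded profiles, producing one
/// `stack before after` line per stack seen in either profile.
fn diff_folded(before: &str, after: &str) -> Vec<String> {
    let mut merged: BTreeMap<&str, (u64, u64)> = BTreeMap::new();
    for (stack, n) in parse_folded(before) {
        merged.entry(stack).or_default().0 += n;
    }
    for (stack, n) in parse_folded(after) {
        merged.entry(stack).or_default().1 += n;
    }
    merged
        .iter()
        .map(|(stack, (b, a))| format!("{stack} {b} {a}"))
        .collect()
}

/// Hue a frame would get: red if it gained samples, blue if it lost some.
fn hue(before: u64, after: u64) -> &'static str {
    if after > before {
        "red"
    } else if after < before {
        "blue"
    } else {
        "neutral"
    }
}
```

Swapping the two arguments to diff_folded corresponds to reversing the input files on the command line. That reversal swaps which frames count as gained and which as lost, which is why the --negate flag is used to switch the hues back.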

Development

This crate was initially developed through a series of live coding sessions. If you want to contribute to the code, that may be a good way to learn why it’s all designed the way it is!

Modules

collapse: Stack collapsing for various input formats.
differential: Tool for creating the output required to generate differential flame graphs.
flamegraph: Tools for producing flame graphs from folded stack traces.