Crate dhat
This crate provides heap profiling and ad hoc profiling capabilities to Rust programs, similar to those provided by DHAT.
The heap profiling works by using a global allocator that wraps the system
allocator, tracks all heap allocations, and on program exit writes data to
file so it can be viewed with DHAT’s viewer. This corresponds to DHAT’s --mode=heap mode.
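To make the wrapping idea concrete, here is a minimal sketch of a global allocator that forwards every request to the system allocator and counts what passes through it. It is an illustration only, not this crate's implementation: the names CountingAlloc, TOTAL_BYTES, TOTAL_BLOCKS and totals are hypothetical, and a real profiler would also record backtraces, deallocations and lifetimes.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical stand-in for a profiling allocator (not the dhat implementation).
struct CountingAlloc;

// Running totals of everything allocated so far.
static TOTAL_BYTES: AtomicUsize = AtomicUsize::new(0);
static TOTAL_BLOCKS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Forward the request to the system allocator, then record it.
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            TOTAL_BYTES.fetch_add(layout.size(), Ordering::Relaxed);
            TOTAL_BLOCKS.fetch_add(1, Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        // This sketch does not track frees; it only forwards them.
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static ALLOCATOR: CountingAlloc = CountingAlloc;

/// Returns (total bytes, total blocks) allocated so far.
fn totals() -> (usize, usize) {
    (
        TOTAL_BYTES.load(Ordering::Relaxed),
        TOTAL_BLOCKS.load(Ordering::Relaxed),
    )
}

fn main() {
    let (bytes_before, blocks_before) = totals();
    let buf: Vec<u8> = Vec::with_capacity(1024);
    black_box(&buf); // keep the optimizer from eliding the allocation
    let (bytes_after, blocks_after) = totals();
    println!(
        "{} bytes in {} blocks",
        bytes_after - bytes_before,
        blocks_after - blocks_before
    );
}
```

Because every allocation in the program passes through the extra bookkeeping, a wrapper like this is inherently slower than calling the system allocator directly.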
The ad hoc profiling is via a second mode of operation, where ad hoc events
can be manually inserted into a Rust program for aggregation and viewing.
This corresponds to DHAT’s --mode=ad-hoc mode.
Motivation
DHAT is a powerful heap profiler that comes with Valgrind. This crate is a related but alternative choice for heap profiling Rust programs. DHAT and this crate have the following differences.
- This crate works on any platform, while DHAT only works on some platforms (Linux, mostly). (Note that DHAT’s viewer is just HTML+JS+CSS and should work in any modern web browser on any platform.)
- This crate causes a much smaller slowdown than DHAT.
- This crate requires some modifications to a program’s source code and recompilation, while DHAT does not.
- This crate cannot track memory accesses the way DHAT does, because it does not instrument all memory loads and stores.
- This crate does not provide profiling of copy functions such as memcpy and strcpy, unlike DHAT.
- The backtraces produced by this crate may be better than those produced by DHAT.
- DHAT measures a program’s entire execution, but this crate only measures what happens within the scope of main. It will miss the small number of allocations that occur before or after main, within the Rust runtime.
Configuration
In your Cargo.toml file, as well as specifying dhat as a dependency, you should enable source line debug info:
[profile.release]
debug = 1
Usage (heap profiling)
For heap profiling, enable the global allocator by adding this code to your program:
use dhat::{Dhat, DhatAlloc};
#[global_allocator]
static ALLOCATOR: DhatAlloc = DhatAlloc;
Then add the following code to the very start of your main function:
let _dhat = Dhat::start_heap_profiling();
DhatAlloc is slower than the system allocator, so it should only be enabled while profiling.
Usage (ad hoc profiling)
Ad hoc profiling involves manually annotating hot code points and then aggregating the executed annotations in some fashion.
To do this, add the following code to the very start of your main function:
let _dhat = Dhat::start_ad_hoc_profiling();
Then insert calls like this at points of interest:
dhat::ad_hoc_event(100);
For example, imagine you have a hot function that is called from many call
sites. You might want to know how often it is called and which other
functions called it the most. In that case, you would add an ad_hoc_event
call to that function, and the data collected by this crate and viewed with
DHAT’s viewer would show you exactly what you want to know.
The meaning of the integer argument to ad_hoc_event will depend on exactly what you are measuring. If there is no meaningful weight to give to an event, you can just use 1.
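As a rough illustration of what "aggregating the executed annotations" means, the following sketch sums weighted events per call site of a hot function using only the standard library. It does not use this crate and does not reflect how it is implemented (the crate aggregates by backtrace and writes a file for DHAT's viewer); the names EVENTS, record_event, hot_function, totals and call_sites are hypothetical.

```rust
use std::collections::BTreeMap;
use std::panic::Location;
use std::sync::Mutex;

// Hypothetical aggregator: (file, line) of a call site -> (units, events).
static EVENTS: Mutex<BTreeMap<(&'static str, u32), (usize, usize)>> =
    Mutex::new(BTreeMap::new());

/// Records one event with the given weight against the caller's location.
#[track_caller]
fn record_event(weight: usize) {
    let loc = Location::caller();
    let mut events = EVENTS.lock().unwrap();
    let entry = events.entry((loc.file(), loc.line())).or_insert((0, 0));
    entry.0 += weight; // total weight, in user-defined units
    entry.1 += 1; // number of events
}

/// A hot function called from many places. Because it is also marked
/// `#[track_caller]`, the event is attributed to *its* caller.
#[track_caller]
fn hot_function(input: &[u8]) -> usize {
    record_event(input.len()); // weight = bytes processed
    input.iter().map(|&b| b as usize).sum()
}

/// Returns (total units, total events) over all call sites.
fn totals() -> (usize, usize) {
    let events = EVENTS.lock().unwrap();
    events
        .values()
        .fold((0, 0), |acc, v| (acc.0 + v.0, acc.1 + v.1))
}

/// Returns the number of distinct call sites seen so far.
fn call_sites() -> usize {
    EVENTS.lock().unwrap().len()
}

fn main() {
    for _ in 0..3 {
        hot_function(b"abcd"); // first call site: weight 4, three times
    }
    hot_function(b"xy"); // second call site: weight 2, once

    let (units, events) = totals();
    println!(
        "Total: {} units in {} events from {} call sites",
        units,
        events,
        call_sites()
    );
}
```

Here the weight is the number of bytes processed, which is one possible choice; passing 1 instead would turn the totals into plain call counts.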
Running
For both heap profiling and ad hoc profiling, the program will run more slowly than normal. (Unfortunately, on Windows, it may run much more slowly. This is because backtrace gathering can be drastically slower on Windows than on other platforms.)
When the Dhat value is dropped at the end of main, some basic information will be printed to stderr. For heap profiling it will look like the following.
dhat: Total: 1,256 bytes in 6 blocks
dhat: At t-gmax: 1,256 bytes in 6 blocks
dhat: At t-end: 1,256 bytes in 6 blocks
dhat: The data in dhat-heap.json is viewable with dhat/dh_view.html
For ad hoc profiling it will look like the following.
dhat: Total: 141 units in 11 events
dhat: The data in dhat-ad-hoc.json is viewable with dhat/dh_view.html
A file called dhat-heap.json (for heap profiling) or dhat-ad-hoc.json (for ad hoc profiling) will be written. It can be viewed in DHAT’s viewer.
If you don’t see this output, it may be because your program called std::process::exit, which terminates a program without running any destructors. To work around this, explicitly call drop on the Dhat value just before the call to std::process::exit.
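The workaround can be sketched with a stand-in guard type, so that the example runs without this crate. ProfilerGuard, REPORT_WRITTEN and run are hypothetical names; in a real program the guard would be the Dhat value returned by one of the start functions.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Set by the guard's destructor so the effect can be observed.
static REPORT_WRITTEN: AtomicBool = AtomicBool::new(false);

// Hypothetical stand-in for the `Dhat` value: its work happens in `Drop`.
struct ProfilerGuard;

impl Drop for ProfilerGuard {
    fn drop(&mut self) {
        REPORT_WRITTEN.store(true, Ordering::SeqCst);
        eprintln!("guard: report written");
    }
}

// Hypothetical workload; a real program might return a non-zero code on failure.
fn run() -> i32 {
    0
}

fn main() {
    let guard = ProfilerGuard;
    let code = run();

    // `std::process::exit` never returns and runs no destructors, so the
    // guard must be dropped by hand before calling it.
    drop(guard);
    std::process::exit(code);
}
```

Removing the explicit drop in this sketch means the "report written" line is never printed, which mirrors the missing output described above.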
Viewing
Open a copy of DHAT’s viewer, version 3.17 or later. There are two ways to do this.
- Easier: Use the online version.
- Harder: Clone the Valgrind repository with git clone git://sourceware.org/git/valgrind.git and open dhat/dh_view.html. (There is no need to build any code in this repository.)
Then click on the “Load…” button to load dhat-heap.json or dhat-ad-hoc.json.
DHAT’s viewer shows a tree with nodes that look like this.
PP 1.1/6 {
  Total: 1,024 bytes (81.53%, 3,335,504.89/s) in 1 blocks (16.67%, 3,257.33/s), avg size 1,024 bytes, avg lifetime 61 µs (19.87% of program duration)
  Max: 1,024 bytes in 1 blocks, avg size 1,024 bytes
  At t-gmax: 1,024 bytes (81.53%) in 1 blocks (16.67%), avg size 1,024 bytes
  At t-end: 1,024 bytes (81.53%) in 1 blocks (16.67%), avg size 1,024 bytes
  Allocated at {
    #1: 0x10c1e4108: <alloc::alloc::Global as core::alloc::AllocRef>::alloc (alloc.rs:203:9)
    #2: 0x10c1e4108: alloc::raw_vec::RawVec<T,A>::allocate_in (raw_vec.rs:186:45)
    #3: 0x10c1e4108: alloc::raw_vec::RawVec<T,A>::with_capacity_in (raw_vec.rs:161:9)
    #4: 0x10c1e4108: alloc::raw_vec::RawVec<T>::with_capacity (raw_vec.rs:92:9)
    #5: 0x10c1e4108: alloc::vec::Vec<T>::with_capacity (vec.rs:355:20)
    #6: 0x10c1e4108: std::io::buffered::BufWriter<W>::with_capacity (buffered.rs:517:46)
    #7: 0x10c1e4108: std::io::buffered::LineWriter<W>::with_capacity (buffered.rs:925:29)
    #8: 0x10c1e4108: std::io::buffered::LineWriter<W>::new (buffered.rs:905:9)
    #9: 0x10c1e4108: std::io::stdio::stdout::stdout_init (stdio.rs:543:65)
    #10: 0x10c1e4108: std::io::lazy::Lazy<T>::init (lazy.rs:57:19)
    #11: 0x10c1e4108: std::io::lazy::Lazy<T>::get (lazy.rs:33:18)
    #12: 0x10c1e4108: std::io::stdio::stdout (stdio.rs:536:25)
    #13: 0x10c1e4ccb: std::io::stdio::print_to::{{closure}} (stdio.rs:890:13)
    #14: 0x10c1e4ccb: std::thread::local::LocalKey<T>::try_with (local.rs:265:16)
    #15: 0x10c1e4ccb: std::io::stdio::print_to (stdio.rs:879:18)
    #16: 0x10c1e4ccb: std::io::stdio::_print (stdio.rs:907:5)
    #17: 0x10c0d6826: heap::main (heap.rs:9:5)
  }
}
Full details about the output are in the DHAT documentation.
Note that DHAT uses the word “block” rather than “allocation” to refer to the memory allocated by a single heap allocation operation.
When heap profiling, this crate doesn’t track memory accesses (unlike DHAT) and so the “reads” and “writes” measurements are not shown within DHAT’s viewer, and “sort metric” views involving reads, writes, or accesses are not available.
The backtraces produced by this crate are trimmed to reduce output file sizes and improve readability in DHAT’s viewer.
- Only one allocation-related frame will be shown at the top of the backtrace. That frame may be a function within alloc::alloc, a function within this crate, or a global allocation function like __rg_alloc.
- Common frames at the bottom of backtraces, below main, are omitted.
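The two trimming rules can be sketched over a list of frame names ordered from innermost to outermost. This is a simplified illustration under stated assumptions, not the crate's actual logic: is_alloc_frame and trim_backtrace are hypothetical helpers, and matching frames by substring is an assumption made for the example.

```rust
/// Hypothetical predicate: is this frame part of the allocation machinery?
fn is_alloc_frame(name: &str) -> bool {
    name.contains("alloc::alloc") || name.contains("__rg_alloc") || name.starts_with("dhat::")
}

/// Keeps a single allocation-related frame at the top and drops everything
/// below the first frame whose name ends in `::main`.
fn trim_backtrace<'a>(frames: &[&'a str]) -> Vec<&'a str> {
    // Rule 1: skip all but the last of the leading allocation-related frames.
    let leading = frames.iter().take_while(|f| is_alloc_frame(f)).count();
    let start = leading.saturating_sub(1);

    // Rule 2: cut the backtrace just after `main`, if it is present.
    let end = frames
        .iter()
        .position(|f| f.ends_with("::main"))
        .map_or(frames.len(), |i| i + 1);

    frames[start.min(end)..end].to_vec()
}

fn main() {
    let frames = [
        "__rg_alloc",
        "alloc::alloc::alloc",
        "<alloc::alloc::Global as core::alloc::AllocRef>::alloc",
        "alloc::vec::Vec<T>::with_capacity",
        "heap::main",
        "std::rt::lang_start::{{closure}}",
    ];
    for (i, frame) in trim_backtrace(&frames).iter().enumerate() {
        println!("#{}: {}", i + 1, frame);
    }
}
```

With the sample input, three frames remain: the Global alloc frame, Vec::with_capacity, and heap::main.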
Structs
Dhat: A type whose scope dictates the start and end of profiling.
DhatAlloc: A global allocator that tracks allocations and deallocations on behalf of the Dhat type.
Some heap stats about execution. For testing purposes, subject to change.
Some stats about execution. For testing purposes, subject to change.
Functions
ad_hoc_event: Register an event during ad hoc profiling. Has no effect unless a Dhat value that was created with Dhat::start_ad_hoc_profiling is in scope. The meaning of the weight argument is determined by the user.
Get current stats. Returns None if called before Dhat::start_heap_profiling or Dhat::start_ad_hoc_profiling is called.