pub struct Callgrind(/* private fields */);
Available on crate feature default only.
The configuration for Callgrind
Can be specified in crate::LibraryBenchmarkConfig::tool or crate::BinaryBenchmarkConfig::tool.
§Example
use iai_callgrind::{LibraryBenchmarkConfig, main, Callgrind};

main!(
    config = LibraryBenchmarkConfig::default()
        .tool(Callgrind::default());
    library_benchmark_groups = some_group
);
Implementations§
impl Callgrind
pub fn with_args<I, T>(args: T) -> Self
Create a new Callgrind configuration with initial command-line arguments
See also Callgrind::args
§Examples
use iai_callgrind::Callgrind;
let config = Callgrind::with_args(["collect-bus=yes"]);
pub fn args<I, T>(&mut self, args: T) -> &mut Self
Add command-line arguments to the Callgrind configuration
The command-line arguments are passed directly to the callgrind invocation. Valid arguments are the callgrind command-line options (https://valgrind.org/docs/manual/cl-manual.html#cl-manual.options) and the core valgrind command-line arguments (https://valgrind.org/docs/manual/manual-core.html#manual-core.options). Note that not all command-line arguments are supported, especially the ones which change output paths. Unsupported arguments are ignored and a warning is printed.
The leading dashes of a flag can be omitted (“collect-bus” instead of “--collect-bus”).
§Examples
use iai_callgrind::Callgrind;
let config = Callgrind::default().args(["collect-bus=yes"]);
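The rule about omitted dashes can be pictured with a small, self-contained sketch. The helper `normalize_arg` is hypothetical and not part of iai-callgrind; it assumes that an argument already starting with a dash is passed through unchanged and that anything else gets `--` prepended. The library's actual handling may differ.

```rust
/// Hypothetical helper (not an iai-callgrind API): add the leading dashes
/// to an argument if they were omitted.
fn normalize_arg(arg: &str) -> String {
    if arg.starts_with('-') {
        // Assumption: arguments that already carry dashes are kept as they are.
        arg.to_owned()
    } else {
        // "collect-bus=yes" becomes "--collect-bus=yes".
        format!("--{arg}")
    }
}
```

Under these assumptions, `normalize_arg("collect-bus=yes")` and `normalize_arg("--collect-bus=yes")` produce the same string.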
pub fn enable(&mut self, value: bool) -> &mut Self
Enable this tool. This is the default.
This is mostly useful to disable a tool which has been enabled in a crate::LibraryBenchmarkConfig (or crate::BinaryBenchmarkConfig) at a higher level.
However, the default tool (usually callgrind) cannot be disabled.
use iai_callgrind::Callgrind;
let config = Callgrind::default().enable(false);
pub fn entry_point(&mut self, entry_point: EntryPoint) -> &mut Self
Set or unset the entry point for a benchmark
Iai-Callgrind sets the --toggle-collect argument of callgrind to the benchmark function, which we call EntryPoint::Default. Specifying a --toggle-collect argument automatically sets --collect-at-start=no. This ensures that only the metrics from the benchmark itself are collected and not the setup or teardown or anything before/after the benchmark function.
However, there are cases when the default toggle is not enough (EntryPoint::Custom) or in the way (EntryPoint::None).
Setting EntryPoint::Custom is a convenience for disabling the entry point with EntryPoint::None and setting --toggle-collect=CUSTOM_ENTRY_POINT in Callgrind::args. EntryPoint::Custom can be useful if you want to benchmark a private function and the function called in the benchmark function serves only as an access point to it. EntryPoint::Custom accepts glob patterns the same way as --toggle-collect does.
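As a rough mental model of the three entry point variants described above, the following self-contained sketch maps each variant to the callgrind arguments it implies. The enum `EntryPointSketch`, the function `implied_args` and the `bench_fn` pattern are hypothetical stand-ins for illustration; they are not the iai-callgrind types, and the real argument handling may differ in detail.

```rust
/// Local stand-in for the three entry point variants (not the library's enum).
#[derive(Debug, Clone, PartialEq)]
enum EntryPointSketch {
    None,
    Default,
    Custom(String),
}

/// Return the callgrind arguments implied by an entry point.
/// `bench_fn` is a hypothetical pattern naming the benchmark function.
fn implied_args(entry_point: &EntryPointSketch, bench_fn: &str) -> Vec<String> {
    let toggle = match entry_point {
        // No toggle at all, so `--collect-at-start=no` is not set either.
        EntryPointSketch::None => return Vec::new(),
        // The default toggle is the benchmark function itself.
        EntryPointSketch::Default => bench_fn,
        // A custom toggle, which may be a glob pattern.
        EntryPointSketch::Custom(pattern) => pattern.as_str(),
    };
    vec![
        format!("--toggle-collect={toggle}"),
        // A toggle implies that collection does not start at program start.
        "--collect-at-start=no".to_owned(),
    ]
}
```

In this model the `None` variant yields an empty argument list, which is why `--collect-at-start=no` has to be supplied separately when it is still wanted.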
§Examples
If you’re using callgrind client requests either in the benchmark function itself or in your library, then using EntryPoint::None is presumably required. Consider the following example (DEFAULT ENTRY POINT marks the default entry point):
use iai_callgrind::{main, library_benchmark, library_benchmark_group};
use std::hint::black_box;

fn to_be_benchmarked() -> u64 {
    println!("Some info output");
    iai_callgrind::client_requests::callgrind::start_instrumentation();
    let result = {
        // some heavy calculations
        10 // placeholder value so that the block evaluates to a u64
    };
    iai_callgrind::client_requests::callgrind::stop_instrumentation();
    result
}

#[library_benchmark]
fn some_bench() -> u64 { // <-- DEFAULT ENTRY POINT
    black_box(to_be_benchmarked())
}

library_benchmark_group!(name = some_group; benchmarks = some_bench);
main!(library_benchmark_groups = some_group);
In the example above, EntryPoint::Default is active, so the counting of events starts when the some_bench function is entered. In to_be_benchmarked, the client request start_instrumentation does effectively nothing and stop_instrumentation will stop the event counting as requested. This is most likely not what you intended. The event counting should start with start_instrumentation. To achieve this, you can set EntryPoint::None, which removes the default toggle but also --collect-at-start=no. So, you need to specify --collect-at-start=no in Callgrind::args yourself. The example would then look like this:
use std::hint::black_box;
use iai_callgrind::{library_benchmark, EntryPoint, LibraryBenchmarkConfig, Callgrind};

// ...

#[library_benchmark(
    config = LibraryBenchmarkConfig::default()
        .tool(Callgrind::with_args(["--collect-at-start=no"])
            .entry_point(EntryPoint::None)
        )
)]
fn some_bench() -> u64 {
    black_box(to_be_benchmarked())
}

// ...
pub fn limits<T>(&mut self, limits: T) -> &mut Self
Configure the percentage limits over/below which a performance regression is assumed
A performance regression check consists of an EventKind and a percentage over which a regression is assumed. If the percentage is negative, then a regression is assumed to be below this limit.
§Examples
use iai_callgrind::{Callgrind, EventKind};
let config = Callgrind::default().limits([(EventKind::Ir, 5f64)]);
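The sign convention for limits can be illustrated with a minimal, self-contained sketch. The functions `percent_change` and `is_regression` are hypothetical and only model the description above; in particular, computing the change relative to the old value and using strict comparisons are assumptions, not a statement about how iai-callgrind evaluates limits internally.

```rust
/// Percentage change from `old` to `new` (assumes `old` is non-zero).
fn percent_change(old: u64, new: u64) -> f64 {
    (new as f64 - old as f64) / old as f64 * 100.0
}

/// Hypothetical check: a positive limit flags a change above it,
/// a negative limit flags a change below it.
fn is_regression(old: u64, new: u64, limit: f64) -> bool {
    let change = percent_change(old, new);
    if limit < 0.0 {
        change < limit
    } else {
        change > limit
    }
}
```

With a limit of 5.0, going from 1000 to 1060 instructions (a change of about +6%) counts as a regression in this model, while going to 1040 (+4%) does not. With a limit of -5.0, dropping from 1000 to 900 (-10%) is flagged.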
pub fn fail_fast(&mut self, value: bool) -> &mut Self
If set to true, then the benchmarks fail on the first encountered regression
The default is false, in which case the whole benchmark run fails with a regression error after all benchmarks have been run.
§Examples
use iai_callgrind::Callgrind;
let config = Callgrind::default().fail_fast(true);
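The difference between the two modes can be sketched independently of the library. The function `collect_regressions` and its input format are hypothetical: each tuple pairs a benchmark name with a flag saying whether that benchmark regressed. This models only the control flow described above, not iai-callgrind's runner.

```rust
/// Hypothetical model: walk over benchmark results and return the names of
/// the regressed benchmarks that were seen before the run stopped.
fn collect_regressions(results: &[(&str, bool)], fail_fast: bool) -> Vec<String> {
    let mut regressed = Vec::new();
    for (name, is_regression) in results {
        if *is_regression {
            regressed.push((*name).to_owned());
            if fail_fast {
                // Stop at the first regression; later benchmarks are not run.
                break;
            }
        }
    }
    // In both modes a non-empty result means the run as a whole fails.
    regressed
}
```

With `fail_fast` set to false all benchmarks are visited and every regression is reported at the end; with true only the first one is.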
pub fn flamegraph<T>(&mut self, flamegraph: T) -> &mut Self
where
    T: Into<InternalFlamegraphConfig>,
Option to produce flamegraphs from callgrind output with a crate::FlamegraphConfig
The flamegraphs are usable but still in an experimental stage. Callgrind lacks a tool like cg_diff for cachegrind to compare two different profiles. Flamegraphs, on the other hand, can bridge the gap and be FlamegraphKind::Differential to compare two benchmark runs.
§Examples
use iai_callgrind::{
    LibraryBenchmarkConfig, main, FlamegraphConfig, FlamegraphKind, Callgrind
};

main!(
    config = LibraryBenchmarkConfig::default()
        .tool(Callgrind::default()
            .flamegraph(FlamegraphConfig::default()
                .kind(FlamegraphKind::Differential)
            )
        );
    library_benchmark_groups = some_group
);
pub fn format<I, T>(&mut self, callgrind_metrics: T) -> &mut Self
Customize the format of the callgrind output
This option allows customizing the output format of callgrind metrics. It does not set any flags for the callgrind execution (e.g. --branch-sim=yes) which actually enable the collection of these metrics. Consult the docs of EventKind and CallgrindMetrics to see which flag is necessary to enable the collection of a specific metric. The rules:
- A metric is only printed if specified here
- A metric is not printed if not collected by callgrind
- The order matters
- In case of duplicate specifications of the same metric the first one wins.
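The four rules above can be modelled with a short, self-contained sketch. The function `printed_metrics` is hypothetical and works on plain metric names rather than on the library's EventKind values; it is meant to show how the rules interact, not how iai-callgrind implements them.

```rust
/// Hypothetical model of the rules: return the metrics that end up in the
/// output, in the order in which they are printed.
fn printed_metrics<'a>(requested: &[&'a str], collected: &[&str]) -> Vec<&'a str> {
    let mut printed: Vec<&'a str> = Vec::new();
    // Rules 1 and 3: only requested metrics are considered, in request order.
    for &metric in requested {
        // Rule 2: skip metrics which callgrind did not collect.
        // Rule 4: skip duplicates, so the first specification wins.
        if collected.contains(&metric) && !printed.contains(&metric) {
            printed.push(metric);
        }
    }
    printed
}
```

For instance, requesting `["Ir", "Bc", "Dr", "Ir"]` when only `Dr` and `Ir` were collected would print `Ir` followed by `Dr` in this model: `Bc` is dropped because it was not collected, and the second `Ir` is dropped as a duplicate.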
Callgrind offers a lot of metrics, so the CallgrindMetrics enum contains groups of EventKinds, to avoid having to specify all EventKinds one-by-one (although still possible with CallgrindMetrics::SingleEvent).
All command-line arguments of callgrind and which metric they collect are described in full detail in the callgrind documentation.
§Examples
To enable printing all callgrind metrics, specify CallgrindMetrics::All. All callgrind metrics include the cache misses (EventKind::I1mr, …). For example in a library benchmark:
use iai_callgrind::{main, LibraryBenchmarkConfig, CallgrindMetrics, Callgrind};

main!(
    config = LibraryBenchmarkConfig::default()
        .tool(Callgrind::default()
            .format([CallgrindMetrics::All]));
    library_benchmark_groups = some_group
);
The benchmark is executed with the callgrind arguments set by iai-callgrind, which collect no additional metrics besides the cache misses (--cache-sim=yes), so the output will look like this:
file::some_group::printing cache_misses:
Instructions: 1353|1353 (No change)
Dr: 255|255 (No change)
Dw: 233|233 (No change)
I1mr: 54|54 (No change)
D1mr: 12|12 (No change)
D1mw: 0|0 (No change)
ILmr: 53|53 (No change)
DLmr: 3|3 (No change)
DLmw: 0|0 (No change)
L1 Hits: 1775|1775 (No change)
LL Hits: 10|10 (No change)
RAM Hits: 56|56 (No change)
Total read+write: 1841|1841 (No change)
Estimated Cycles: 3785|3785 (No change)