Struct Callgrind

pub struct Callgrind(/* private fields */);

Available on crate feature default only.

The configuration for Callgrind

Can be specified in crate::LibraryBenchmarkConfig::tool or crate::BinaryBenchmarkConfig::tool.

§Example

use iai_callgrind::{LibraryBenchmarkConfig, main, Callgrind};

main!(
    config = LibraryBenchmarkConfig::default()
        .tool(Callgrind::default());
    library_benchmark_groups = some_group
);

Implementations§

impl Callgrind

pub fn with_args<I, T>(args: T) -> Self
where I: AsRef<str>, T: IntoIterator<Item = I>,

Create a new Callgrind configuration with initial command-line arguments

§Examples

use iai_callgrind::Callgrind;

let config = Callgrind::with_args(["collect-bus=yes"]);

pub fn args<I, T>(&mut self, args: T) -> &mut Self
where I: AsRef<str>, T: IntoIterator<Item = I>,

Add command-line arguments to the Callgrind configuration

The command-line arguments are passed directly to the callgrind invocation. Valid arguments are the callgrind options (https://valgrind.org/docs/manual/cl-manual.html#cl-manual.options) and the core valgrind options (https://valgrind.org/docs/manual/manual-core.html#manual-core.options). Not all command-line arguments are supported, especially those which change output paths. Unsupported arguments are ignored and a warning is printed.

The leading dashes of a flag can be omitted (“collect-bus” instead of “--collect-bus”).

§Examples

use iai_callgrind::Callgrind;

let config = Callgrind::default().args(["collect-bus=yes"]);
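
The optional leading dashes can be pictured as a normalization step applied before the arguments reach callgrind. The following is a minimal sketch of that idea, not the crate's actual implementation; normalize_arg and normalize_args are hypothetical names introduced only for illustration.

```rust
/// Prefix an argument with `--` unless it already starts with a dash.
/// Hypothetical helper: it only illustrates the documented behavior.
fn normalize_arg(arg: &str) -> String {
    if arg.starts_with('-') {
        arg.to_owned()
    } else {
        format!("--{}", arg)
    }
}

/// Normalize a whole list of arguments, accepting the same kind of input
/// as `Callgrind::args` (anything iterable that yields string-like items).
fn normalize_args<I, T>(args: T) -> Vec<String>
where
    I: AsRef<str>,
    T: IntoIterator<Item = I>,
{
    args.into_iter()
        .map(|arg| normalize_arg(arg.as_ref()))
        .collect()
}
```

Under this sketch, “collect-bus=yes” and “--collect-bus=yes” end up as the same argument.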

pub fn enable(&mut self, value: bool) -> &mut Self

Enable this tool. This is the default.

This is mostly useful to disable a tool which has been enabled in a crate::LibraryBenchmarkConfig (or crate::BinaryBenchmarkConfig) at a higher level. However, the default tool (usually callgrind) cannot be disabled.

use iai_callgrind::Callgrind;

let config = Callgrind::default().enable(false);

pub fn entry_point(&mut self, entry_point: EntryPoint) -> &mut Self

Set or unset the entry point for a benchmark

Iai-Callgrind sets the --toggle-collect argument of callgrind to the benchmark function, which we call EntryPoint::Default. Specifying a --toggle-collect argument automatically sets --collect-at-start=no. This ensures that only the metrics from the benchmark itself are collected, and not those from the setup, the teardown, or anything else before or after the benchmark function.

However, there are cases in which the default toggle is not enough (EntryPoint::Custom) or gets in the way (EntryPoint::None).

Setting EntryPoint::Custom is a convenience for disabling the entry point with EntryPoint::None and setting --toggle-collect=CUSTOM_ENTRY_POINT in Callgrind::args. EntryPoint::Custom can be useful if you want to benchmark a private function and the function called in the benchmark function serves only as an access point to it. EntryPoint::Custom accepts glob patterns the same way as --toggle-collect does.
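
To make the three variants easier to compare, here is a small sketch of how each one could translate into a --toggle-collect argument. It is a stand-in written for this explanation: EntryPointSketch, toggle_collect_args and the benchmark_fn parameter are hypothetical, and the pattern Iai-Callgrind really generates for the default entry point is not shown here.

```rust
/// Stand-in for the crate's `EntryPoint`, defined locally so that the
/// sketch is self-contained.
enum EntryPointSketch {
    /// No toggle at all: callgrind collects from program start unless
    /// `--collect-at-start=no` is passed explicitly.
    None,
    /// Toggle on the benchmark function.
    Default,
    /// Toggle on a user-supplied function name or glob pattern.
    Custom(String),
}

/// Return the `--toggle-collect` argument implied by the entry point.
/// `benchmark_fn` is a hypothetical placeholder for whatever name the
/// default toggle refers to.
fn toggle_collect_args(entry: &EntryPointSketch, benchmark_fn: &str) -> Vec<String> {
    match entry {
        EntryPointSketch::None => Vec::new(),
        EntryPointSketch::Default => vec![format!("--toggle-collect={}", benchmark_fn)],
        EntryPointSketch::Custom(pattern) => vec![format!("--toggle-collect={}", pattern)],
    }
}
```

In both the Default and the Custom case a --toggle-collect argument is present, so callgrind implies --collect-at-start=no. With None nothing is implied, which is why that flag has to be passed by hand when it is still wanted.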

§Examples

If you’re using callgrind client requests either in the benchmark function itself or in your library, then using EntryPoint::None is presumably required. Consider the following example (DEFAULT ENTRY POINT marks the default entry point):

use iai_callgrind::{
    main, LibraryBenchmarkConfig, library_benchmark, library_benchmark_group
};
use std::hint::black_box;

fn to_be_benchmarked() -> u64 {
    println!("Some info output");
    iai_callgrind::client_requests::callgrind::start_instrumentation();
    let result = {
        // some heavy calculations
        10 // placeholder result so that the example compiles
    };
    iai_callgrind::client_requests::callgrind::stop_instrumentation();

    result
}

#[library_benchmark]
fn some_bench() -> u64 { // <-- DEFAULT ENTRY POINT
    black_box(to_be_benchmarked())
}

library_benchmark_group!(name = some_group; benchmarks = some_bench);
main!(library_benchmark_groups = some_group);

In the example above, EntryPoint::Default is active, so the counting of events starts when the some_bench function is entered. In to_be_benchmarked, the client request start_instrumentation effectively does nothing, while stop_instrumentation stops the event counting as requested. This is most likely not what you intended: the event counting should start with start_instrumentation. To achieve this, you can set EntryPoint::None, which removes the default toggle and, with it, the implicit --collect-at-start=no. You therefore need to specify --collect-at-start=no yourself in Callgrind::args. The example would then look like this:

use std::hint::black_box;

use iai_callgrind::{library_benchmark, EntryPoint, LibraryBenchmarkConfig, Callgrind};

// ...

#[library_benchmark(
    config = LibraryBenchmarkConfig::default()
        .tool(Callgrind::with_args(["--collect-at-start=no"])
            .entry_point(EntryPoint::None)
        )
)]
fn some_bench() -> u64 {
    black_box(to_be_benchmarked())
}

// ...

pub fn limits<T>(&mut self, limits: T) -> &mut Self
where T: IntoIterator<Item = (EventKind, f64)>,

Configure the percentage limits above or below which a performance regression is assumed

A performance regression check consists of an EventKind and a percentage limit. If the percentage is positive, a regression is assumed when the change of the metric exceeds this limit. If the percentage is negative, a regression is assumed when the change falls below this limit.

§Examples

use iai_callgrind::{Callgrind, EventKind};

let config = Callgrind::default().limits([(EventKind::Ir, 5f64)]);
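
The check described above can be sketched as a plain percentage comparison. This is an illustration under stated assumptions, not the crate's code: percent_change and is_regression are hypothetical helpers, and whether the real comparison is strict or inclusive at exactly the limit is assumed here to be strict.

```rust
/// Relative change from `old` to `new` in percent.
/// Note: `old == 0` yields an infinite or NaN result; this sketch does not
/// treat that case specially.
fn percent_change(old: u64, new: u64) -> f64 {
    (new as f64 - old as f64) / old as f64 * 100.0
}

/// Decide whether a change counts as a regression for the given limit.
/// A non-negative limit flags changes above it, a negative limit flags
/// changes below it.
fn is_regression(old: u64, new: u64, limit: f64) -> bool {
    let change = percent_change(old, new);
    if limit >= 0.0 {
        change > limit
    } else {
        change < limit
    }
}
```

With a limit of 5f64, a rise from 1000 to 1060 counted events (a change of about 6%) would be flagged, while a rise to 1040 would not. A negative limit such as -5f64 flags a drop from 1000 to 900 instead.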

pub fn fail_fast(&mut self, value: bool) -> &mut Self

If set to true, then the benchmarks fail on the first encountered regression

The default is false. In that case all benchmarks are run first, and only afterwards does the whole benchmark run fail with a regression error.

§Examples

use iai_callgrind::Callgrind;

let config = Callgrind::default().fail_fast(true);
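
The difference between the two settings can be modelled with a simple loop over benchmark outcomes. The sketch below is purely illustrative: RunSummary and run_benchmarks are hypothetical, and each benchmark is reduced to a name plus a flag saying whether it regressed.

```rust
/// What a (simulated) benchmark run did.
#[derive(Debug, PartialEq)]
struct RunSummary {
    /// Number of benchmarks that were actually executed.
    executed: usize,
    /// Names of the benchmarks in which a regression was detected.
    regressed: Vec<String>,
}

/// Walk over `(name, has_regression)` pairs. With `fail_fast`, stop at the
/// first regression; otherwise run everything and collect all regressions.
fn run_benchmarks(benches: &[(&str, bool)], fail_fast: bool) -> RunSummary {
    let mut summary = RunSummary {
        executed: 0,
        regressed: Vec::new(),
    };
    for (name, has_regression) in benches {
        summary.executed += 1;
        if *has_regression {
            summary.regressed.push((*name).to_owned());
            if fail_fast {
                break;
            }
        }
    }
    summary
}
```

In both modes the run counts as failed once `regressed` is non-empty. The setting only changes how many benchmarks are executed before that failure is reported.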

pub fn flamegraph<T>(&mut self, flamegraph: T) -> &mut Self
where T: Into<InternalFlamegraphConfig>,

Option to produce flamegraphs from callgrind output with a crate::FlamegraphConfig

The flamegraphs are usable but still in an experimental stage. Callgrind lacks a tool like cachegrind’s cg_diff to compare two different profiles. Flamegraphs can bridge this gap: with FlamegraphKind::Differential, a flamegraph compares two benchmark runs.

§Examples

use iai_callgrind::{
    LibraryBenchmarkConfig, main, FlamegraphConfig, FlamegraphKind, Callgrind
};

main!(
    config = LibraryBenchmarkConfig::default()
        .tool(Callgrind::default()
            .flamegraph(FlamegraphConfig::default()
                .kind(FlamegraphKind::Differential)
            )
        );
    library_benchmark_groups = some_group
);

pub fn format<I, T>(&mut self, callgrind_metrics: T) -> &mut Self
where I: Into<CallgrindMetrics>, T: IntoIterator<Item = I>,

Customize the format of the callgrind output

This option allows customizing the output format of callgrind metrics. It does not set any callgrind flags (e.g. --branch-sim=yes) which actually enable the collection of these metrics. Consult the docs of EventKind and CallgrindMetrics to see which flag is necessary to enable the collection of a specific metric. The rules:

1. A metric is only printed if specified here.
2. A metric is not printed if not collected by callgrind.
3. The order matters.
4. In case of duplicate specifications of the same metric, the first one wins.

Callgrind offers a lot of metrics, so the CallgrindMetrics enum contains groups of EventKinds, to avoid having to specify all EventKinds one-by-one (although still possible with CallgrindMetrics::SingleEvent).

All command-line arguments of callgrind and which metric they collect are described in full detail in the callgrind documentation.
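
The rules above can be pictured as a filter over the requested metrics. The following sketch uses plain metric names instead of the crate's types and is an illustration only; printed_metrics is a hypothetical helper, not part of iai_callgrind.

```rust
/// Return the metrics that would be printed, given the metrics requested
/// in the format (in order) and the metrics callgrind actually collected.
fn printed_metrics<'a>(requested: &[&'a str], collected: &[&str]) -> Vec<&'a str> {
    let mut printed: Vec<&'a str> = Vec::new();
    for &metric in requested {
        // Skip metrics which callgrind did not collect.
        let is_collected = collected.iter().any(|c| *c == metric);
        // On duplicates the first occurrence wins, so skip later ones.
        let is_duplicate = printed.iter().any(|p| *p == metric);
        if is_collected && !is_duplicate {
            // Pushing in request order preserves the specified order.
            printed.push(metric);
        }
    }
    printed
}
```

For instance, requesting Dr, Ir, Dr and Bc while only Ir, Dr and Dw were collected would print Dr followed by Ir: the second Dr is a duplicate and Bc was never collected.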

§Examples

To enable printing all callgrind metrics specify CallgrindMetrics::All. All callgrind metrics include the cache misses (EventKind::I1mr, …). For example in a library benchmark:

use iai_callgrind::{main, LibraryBenchmarkConfig, CallgrindMetrics, Callgrind};

main!(
    config = LibraryBenchmarkConfig::default()
        .tool(Callgrind::default()
            .format([CallgrindMetrics::All])
        );
    library_benchmark_groups = some_group
);

The benchmark is executed with the callgrind arguments set by iai-callgrind, which collect no additional metrics other than the cache misses (--cache-sim=yes), so the output will look like this:

file::some_group::printing cache_misses:
  Instructions:                        1353|1353                 (No change)
  Dr:                                   255|255                  (No change)
  Dw:                                   233|233                  (No change)
  I1mr:                                  54|54                   (No change)
  D1mr:                                  12|12                   (No change)
  D1mw:                                   0|0                    (No change)
  ILmr:                                  53|53                   (No change)
  DLmr:                                   3|3                    (No change)
  DLmw:                                   0|0                    (No change)
  L1 Hits:                             1775|1775                 (No change)
  LL Hits:                               10|10                   (No change)
  RAM Hits:                              56|56                   (No change)
  Total read+write:                    1841|1841                 (No change)
  Estimated Cycles:                    3785|3785                 (No change)
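
The last five lines of this output are derived from the raw counters above them. The sketch below reproduces the numbers shown, assuming the commonly used formulas for these summary lines, including the weights 1, 5 and 35 for the cycle estimate. The formulas were checked only against the sample output above, and CacheCounts, Summary and derive_summary are hypothetical names.

```rust
/// Raw counters as printed in the sample output.
struct CacheCounts {
    ir: u64,
    dr: u64,
    dw: u64,
    i1mr: u64,
    d1mr: u64,
    d1mw: u64,
    ilmr: u64,
    dlmr: u64,
    dlmw: u64,
}

/// Derived summary lines.
#[derive(Debug, PartialEq)]
struct Summary {
    total_read_write: u64,
    l1_hits: u64,
    ll_hits: u64,
    ram_hits: u64,
    estimated_cycles: u64,
}

/// Derive the summary lines. Assumes consistent counters, i.e. misses never
/// exceed accesses; otherwise the subtractions would underflow.
fn derive_summary(c: &CacheCounts) -> Summary {
    let total_read_write = c.ir + c.dr + c.dw;
    let l1_misses = c.i1mr + c.d1mr + c.d1mw;
    // Misses of the last-level cache have to be served from RAM.
    let ram_hits = c.ilmr + c.dlmr + c.dlmw;
    let l1_hits = total_read_write - l1_misses;
    let ll_hits = l1_misses - ram_hits;
    // Assumed weights: 1 per L1 hit, 5 per LL hit, 35 per RAM hit.
    let estimated_cycles = l1_hits + 5 * ll_hits + 35 * ram_hits;
    Summary {
        total_read_write,
        l1_hits,
        ll_hits,
        ram_hits,
        estimated_cycles,
    }
}
```

Feeding in the sample values (Ir 1353, Dr 255, Dw 233, I1mr 54, D1mr 12, D1mw 0, ILmr 53, DLmr 3, DLmw 0) gives 1841 total accesses, 1775 L1 hits, 10 LL hits, 56 RAM hits and 1775 + 5 * 10 + 35 * 56 = 3785 estimated cycles, matching the output.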