canbench is a tool for benchmarking canisters on the Internet Computer.
§Quickstart
This example is also available to tinker with in the examples directory. See the fibonacci example.
§1. Install the canbench binary.
The canbench binary is what runs your canister's benchmarks.
cargo install canbench
§2. Add an optional dependency to Cargo.toml
Typically you do not want your benchmarks to be part of your canister when deploying it to the Internet Computer.
Therefore, we include canbench_rs only as an optional dependency so that it's only included when running benchmarks.
For more information, see the Cargo documentation on optional dependencies.
canbench_rs = { version = "x.y.z", optional = true }
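Cargo only pulls in an optional dependency when a feature activates it. A minimal sketch of how the feature could be wired up in Cargo.toml (assuming Cargo 1.60+ `dep:` syntax; the version is a placeholder):

```toml
[dependencies]
canbench_rs = { version = "x.y.z", optional = true }

[features]
# Enabling this feature pulls in canbench_rs and compiles the benchmarks.
canbench-rs = ["dep:canbench_rs"]
```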
§3. Add a configuration to canbench.yml
The canbench.yml configuration file tells canbench how to build and run your canister.
Below is a typical configuration.
Note that we're compiling the canister with the canbench-rs feature so that the benchmarking logic is included in the Wasm.
build_cmd:
  cargo build --release --target wasm32-unknown-unknown --features canbench-rs
wasm_path:
  ./target/wasm32-unknown-unknown/release/<YOUR_CANISTER>.wasm
§Init Args
Init args can be specified using the init_args key in the configuration file:
init_args:
  hex: 4449444c0001710568656c6c6f
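The hex value is a Candid-encoded argument. A rough standalone sketch (not part of canbench; `decode_hex` is a hypothetical helper) that decodes the example by hand: "4449444c" is the Candid magic "DIDL", followed by an empty type table (00), one value (01) of type text (0x71), a length byte (05), and the UTF-8 bytes of "hello".

```rust
// Hypothetical helper: turn a hex string into raw bytes.
fn decode_hex(hex: &str) -> Vec<u8> {
    (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).unwrap())
        .collect()
}

fn main() {
    let bytes = decode_hex("4449444c0001710568656c6c6f");
    // The first four bytes are the Candid magic header.
    assert_eq!(&bytes[0..4], b"DIDL");
    // The trailing bytes are the UTF-8 text payload.
    let payload = std::str::from_utf8(&bytes[8..]).unwrap();
    println!("init arg text: {payload}");
}
```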
§Stable Memory
A file can be specified to be loaded into the canister's stable memory after initialization.
stable_memory:
  file: stable_memory.bin
§4. Start benching! 🏋🏽
Let's say we have a canister that exposes a query computing the fibonacci sequence of a given number. Here's what that query can look like:
#[ic_cdk::query]
fn fibonacci(n: u32) -> u32 {
    if n == 0 {
        return 0;
    } else if n == 1 {
        return 1;
    }

    let mut a = 0;
    let mut b = 1;
    let mut result = 0;

    for _ in 2..=n {
        result = a + b;
        a = b;
        b = result;
    }

    result
}
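Outside the canister, the same function body is ordinary Rust, so the values it produces can be spot-checked standalone (a quick sketch with the #[ic_cdk::query] attribute removed; not part of canbench):

```rust
// Iterative fibonacci, identical to the query body above.
fn fibonacci(n: u32) -> u32 {
    if n == 0 {
        return 0;
    } else if n == 1 {
        return 1;
    }

    let mut a = 0;
    let mut b = 1;
    let mut result = 0;

    for _ in 2..=n {
        result = a + b;
        a = b;
        b = result;
    }

    result
}

fn main() {
    // Spot-check a few values of the sequence.
    assert_eq!(fibonacci(0), 0);
    assert_eq!(fibonacci(10), 55);
    assert_eq!(fibonacci(20), 6765);
    println!("fibonacci(20) = {}", fibonacci(20));
}
```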
Now, let's add some benchmarks to this query:
#[cfg(feature = "canbench-rs")]
mod benches {
    use super::*;
    use canbench_rs::bench;

    #[bench]
    fn fibonacci_20() {
        // NOTE: the result is printed to prevent the compiler from
        // optimizing the call away.
        println!("{:?}", fibonacci(20));
    }

    #[bench]
    fn fibonacci_45() {
        // NOTE: the result is printed to prevent the compiler from
        // optimizing the call away.
        println!("{:?}", fibonacci(45));
    }
}
Run canbench. You'll see output similar to this:
$ canbench
---------------------------------------------------
Benchmark: fibonacci_20 (new)
  total:
    instructions: 2301 (new)
    heap_increase: 0 pages (new)
    stable_memory_increase: 0 pages (new)
---------------------------------------------------
Benchmark: fibonacci_45 (new)
  total:
    instructions: 3088 (new)
    heap_increase: 0 pages (new)
    stable_memory_increase: 0 pages (new)
---------------------------------------------------
Executed 2 of 2 benchmarks.
Ā§5. Track performance regressions
Notice that canbench reported the above benchmarks as "new". canbench allows you to persist the results of these benchmarks. In subsequent runs, canbench reports the performance relative to the last persisted run.
Let's first persist the results above by running canbench again, but with the --persist flag:
$ canbench --persist
...
---------------------------------------------------
Executed 2 of 2 benchmarks.
Successfully persisted results to canbench_results.yml
Now, if we run canbench again, it will run the benchmarks and additionally report that no changes in performance were detected.
$ canbench
Finished release [optimized] target(s) in 0.34s
---------------------------------------------------
Benchmark: fibonacci_20
  total:
    instructions: 2301 (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)
---------------------------------------------------
Benchmark: fibonacci_45
  total:
    instructions: 3088 (no change)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)
---------------------------------------------------
Executed 2 of 2 benchmarks.
Let's try swapping out our implementation of fibonacci with an implementation that's miserably inefficient. Replace the fibonacci function defined previously with the following:
#[ic_cdk::query]
fn fibonacci(n: u32) -> u32 {
    match n {
        0 => 1,
        1 => 1,
        _ => fibonacci(n - 1) + fibonacci(n - 2),
    }
}
Running canbench again, we see that it detects and reports a regression.
$ canbench
---------------------------------------------------
Benchmark: fibonacci_20
  total:
    instructions: 337.93 K (regressed by 14586.14%)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)
---------------------------------------------------
Benchmark: fibonacci_45
  total:
    instructions: 56.39 B (regressed by 1826095830.76%)
    heap_increase: 0 pages (no change)
    stable_memory_increase: 0 pages (no change)
---------------------------------------------------
Executed 2 of 2 benchmarks.
Apparently, the recursive implementation is many orders of magnitude more expensive than the iterative implementation 😱 Good thing we found out before deploying this implementation to production.
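To see where the blow-up comes from, here is a rough standalone sketch (not canbench code; `fib_counted` is a hypothetical helper) that counts how many calls the naive recursion makes:

```rust
// Counts invocations of the naive recursive fibonacci above.
fn fib_counted(n: u32, calls: &mut u64) -> u64 {
    *calls += 1;
    match n {
        0 => 1,
        1 => 1,
        _ => fib_counted(n - 1, calls) + fib_counted(n - 2, calls),
    }
}

fn main() {
    let mut calls = 0;
    let result = fib_counted(20, &mut calls);
    // Every extra unit of n multiplies the call count by roughly the
    // golden ratio (~1.6x), so the cost grows exponentially.
    println!("result = {result}, calls = {calls}");
}
```

Each of those calls costs instructions, which is why fibonacci(45) lands in the tens of billions.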
Notice that fibonacci_45 took more than 50B instructions, which is substantially more than the instruction limit for a single message execution on the Internet Computer. canbench runs benchmarks in an environment that allows them up to 10T instructions.
§Additional Examples
For the following examples, we'll be using the canister code below, which you can also find in the examples directory.
This canister defines a simple state as well as a pre_upgrade function that stores that state into stable memory.
use candid::{CandidType, Encode};
use ic_cdk_macros::pre_upgrade;
use std::cell::RefCell;

#[derive(CandidType)]
struct User {
    name: String,
}

#[derive(Default, CandidType)]
struct State {
    users: std::collections::BTreeMap<u64, User>,
}

thread_local! {
    static STATE: RefCell<State> = RefCell::new(State::default());
}

#[pre_upgrade]
fn pre_upgrade() {
    // Serialize state.
    let bytes = STATE.with(|s| Encode!(s).unwrap());

    // Write to stable memory.
    ic_cdk::api::stable::StableWriter::default()
        .write(&bytes)
        .unwrap();
}
§Excluding setup code
Let's say we want to benchmark how long it takes to run the pre_upgrade function. We can define the following benchmark:
#[cfg(feature = "canbench-rs")]
mod benches {
    use super::*;
    use canbench_rs::bench;

    #[bench]
    fn pre_upgrade_bench() {
        // Some function that fills the state with lots of data.
        initialize_state();

        pre_upgrade();
    }
}
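The initialize_state helper referenced above isn't shown in these docs. A hypothetical, self-contained sketch of what such a helper could look like (plain Rust, with the CandidType derives and canister hooks omitted so it runs standalone):

```rust
use std::cell::RefCell;
use std::collections::BTreeMap;

struct User {
    name: String,
}

#[derive(Default)]
struct State {
    users: BTreeMap<u64, User>,
}

thread_local! {
    static STATE: RefCell<State> = RefCell::new(State::default());
}

// Hypothetical helper: fills the state with many users so that the
// pre_upgrade serialization has a substantial amount of work to do.
fn initialize_state() {
    STATE.with(|s| {
        let mut state = s.borrow_mut();
        for i in 0..100_000u64 {
            state.users.insert(i, User { name: format!("user_{i}") });
        }
    });
}

fn main() {
    initialize_state();
    let count = STATE.with(|s| s.borrow().users.len());
    println!("{count} users in state");
}
```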
The problem with the above benchmark is that it benchmarks both the pre_upgrade call and the initialization of the state. What if we're only interested in benchmarking the pre_upgrade call?
To address this, we can use the #[bench(raw)] macro to specify exactly which code we'd like to benchmark.
#[cfg(feature = "canbench-rs")]
mod benches {
    use super::*;
    use canbench_rs::bench;

    #[bench(raw)]
    fn pre_upgrade_bench() -> canbench_rs::BenchResult {
        // Some function that fills the state with lots of data.
        initialize_state();

        // Only benchmark the pre_upgrade. Initializing the state isn't
        // included in the results of our benchmark.
        canbench_rs::bench_fn(pre_upgrade)
    }
}
Running canbench on the example above will benchmark only the code wrapped in canbench_rs::bench_fn, which in this case is the call to pre_upgrade.
$ canbench pre_upgrade_bench
---------------------------------------------------
Benchmark: pre_upgrade_bench (new)
  total:
    instructions: 717.10 M (new)
    heap_increase: 519 pages (new)
    stable_memory_increase: 184 pages (new)
---------------------------------------------------
Executed 1 of 1 benchmarks.
§Granular Benchmarking
Building on the example above, the pre_upgrade function does two steps:
- Serialize the state
- Write to stable memory
Suppose we're interested in understanding, within pre_upgrade, the resources spent in each of these steps.
canbench allows you to do more granular benchmarking using the canbench_rs::bench_scope function.
Here's how we can modify our pre_upgrade function:
#[pre_upgrade]
fn pre_upgrade() {
    // Serialize state.
    let bytes = {
        #[cfg(feature = "canbench-rs")]
        let _p = canbench_rs::bench_scope("serialize_state");
        STATE.with(|s| Encode!(s).unwrap())
    };

    // Write to stable memory.
    #[cfg(feature = "canbench-rs")]
    let _p = canbench_rs::bench_scope("writing_to_stable_memory");
    ic_cdk::api::stable::StableWriter::default()
        .write(&bytes)
        .unwrap();
}
In the code above, we've asked canbench to profile each of these steps separately. Running canbench now, each of these steps is reported.
$ canbench pre_upgrade_bench
---------------------------------------------------
Benchmark: pre_upgrade_bench (new)
  total:
    instructions: 717.11 M (new)
    heap_increase: 519 pages (new)
    stable_memory_increase: 184 pages (new)
  serialize_state (profiling):
    instructions: 717.10 M (new)
    heap_increase: 519 pages (new)
    stable_memory_increase: 0 pages (new)
  writing_to_stable_memory (profiling):
    instructions: 502 (new)
    heap_increase: 0 pages (new)
    stable_memory_increase: 184 pages (new)
---------------------------------------------------
Executed 1 of 1 benchmarks.
§Structs
- BenchResult - The results of a benchmark.
- BenchScope - An object used for benchmarking a specific scope.
- Measurement - A benchmark measurement containing various stats.
§Functions
- bench_fn - Benchmarks the given function.
- bench_scope - Benchmarks the scope this function is declared in.
§Attribute Macros
- bench - A macro for declaring a benchmark where only some part of the function is benchmarked.