What it does
Measures performance and produces verdicts the same way the rest of the
dev-* suite does: machine-readable, stored as dev-report results.
Designed for AI agents and CI gating, not for interactive profiling.
For interactive profiling, use criterion or divan.
Quick start
[]
= "0.9"
use ;
let mut b = new;
for _ in 0..1000
let result = b.finish;
let threshold = regression_pct;
// If you have a stored baseline mean from a previous run, pass it in.
// Returns a CheckResult that can be pushed into a dev_report::Report.
let _check = result.compare_against_baseline;
The returned CheckResult carries the bench tag and numeric
Evidence for mean_ns, baseline_ns, p50_ns, p99_ns, cv,
ops_per_sec, samples, and iterations_recorded — no parsing of
free-form detail strings.
Variance-aware verdicts
Noisy machines produce false regressions. dev-bench computes the
coefficient of variation alongside the mean and downgrades regressions
within the noise band from Fail to Warn:
use ;
use Duration;
let mut b = new;
for _ in 0..30
let r = b.finish;
let opts = CompareOptions ;
let _check = r.compare_with_options;
Throughput
For sub-microsecond operations where Instant::now() overhead would
dominate per-iter timing, batch with iter_with_count:
use ;
let mut b = new;
b.iter_with_count;
let r = b.finish;
println!;
Baseline storage
Baselines persist via the BaselineStore trait. A JSON file backend
ships out of the box:
use ;
let store = new;
let baseline = Baseline ;
store.save.unwrap;
if let Some = store.load.unwrap
Saves are atomic (write-temp-rename); loads tolerate missing files.
Producer trait
To plug into dev-report::MultiReport aggregation alongside other
producers, wrap a benchmark in BenchProducer:
use ;
use Producer;
let producer = new;
let report = producer.produce; // dev_report::Report
Allocation tracking (opt-in)
[]
= { = "0.9", = ["alloc-tracking"] }
static ALLOC: Alloc = Alloc;
let _profiler = new_heap;
// ... run benchmarked code ...
let stats = snapshot;
let check = stats.compare_against_baseline;
The dhat allocator changes timing characteristics, so do not combine timing thresholds with allocation thresholds in the same invocation.
Design choices
- Output is a
CheckResult, not a stdout dump. Agents can read it. - Regression-first, not absolute-perf-first. The interesting question is "did this change make it slower," not "is this the fastest in the world."
- Configurable thresholds: percent-based, absolute-nanosecond, or throughput-drop-percent.
- Variance-aware: regressions within the CV noise band are flagged
as
Warn, notFail.
The dev-* suite
dev-bench is one of the producer crates. See
dev-tools for the umbrella
crate and dev-report for
the schema all results conform to.
Status
v0.9.x is the pre-1.0 stabilization line. APIs are expected to be
near-final; minor adjustments may still happen ahead of 1.0. The
statistic definitions (mean, p50, p99) are pinned and will not
change.
Minimum supported Rust version
1.85 — pinned in Cargo.toml via rust-version and verified by
the MSRV job in CI. (Bumped from 1.75 because the alloc-tracking
feature pulls dhat → addr2line, which requires Rust 1.81+, and
sibling crates require edition2024 (1.85+).)
License
Apache-2.0. See LICENSE.