Paired benchmarking statistics for A/B comparison.
This crate provides statistical functions for analyzing paired benchmark data, where each measurement consists of a baseline and current observation from the same experimental unit (e.g., same input, same machine configuration).
Part of the perfgate workspace.
§Overview
The crate provides:
- compute_paired_stats: Compute summary statistics from paired samples
- compare_paired_stats: Compare paired statistics with confidence intervals
- PairedComparison: Result struct with significance testing
- summarize_paired_diffs: Summarize the distribution of differences
§Statistical Methodology
§Paired t-test
The comparison uses a paired t-test approach:
- For n >= 30 samples: uses t-value of 1.96 (normal approximation)
- For n < 30 samples: uses t-value of 2.0 (conservative small-sample estimate)
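The selection rule above can be sketched as a small helper. This is a minimal illustration of the two thresholds listed here, not the crate's actual code; the function name `critical_value` is hypothetical.

```rust
/// Hypothetical helper mirroring the rule described above:
/// 1.96 for n >= 30 (normal approximation), 2.0 otherwise.
/// The name and signature are assumptions for illustration only.
fn critical_value(n: usize) -> f64 {
    if n >= 30 {
        1.96
    } else {
        2.0
    }
}
```

For reference, exact two-sided 95% Student's t critical values are larger than 2.0 for small samples (about 4.30 at n = 3 and about 2.05 at n = 30), so intervals built with the fixed 2.0 are narrower than those of an exact t-test.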
§Confidence Intervals
95% confidence intervals are computed as:
CI = mean ± t_value × (std_dev / sqrt(n))
A result is considered statistically significant if the confidence interval does not span zero (i.e., ci_lower > 0 or ci_upper < 0).
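The formula and the significance rule can be worked through in a few lines of standalone Rust. This is a sketch under stated assumptions, not the crate's implementation: the function names are hypothetical, and it assumes the sample standard deviation (dividing by n - 1), which the text above does not specify.

```rust
/// Hypothetical helper: returns (mean, ci_lower, ci_upper) for a slice of
/// paired differences, or None when fewer than two values are given.
/// Assumption: std_dev is the sample standard deviation (n - 1 divisor).
fn confidence_interval(diffs: &[f64]) -> Option<(f64, f64, f64)> {
    let n = diffs.len();
    if n < 2 {
        return None;
    }
    let n_f = n as f64;
    let mean = diffs.iter().sum::<f64>() / n_f;
    let variance = diffs.iter().map(|d| (d - mean).powi(2)).sum::<f64>() / (n_f - 1.0);
    let std_dev = variance.sqrt();
    // Fixed multipliers as described in the paired t-test section.
    let t_value = if n >= 30 { 1.96 } else { 2.0 };
    let margin = t_value * (std_dev / n_f.sqrt());
    Some((mean, mean - margin, mean + margin))
}

/// Significant when the interval does not span zero.
fn is_significant(ci_lower: f64, ci_upper: f64) -> bool {
    ci_lower > 0.0 || ci_upper < 0.0
}
```

With differences of -5, -5, and -7 ms, the mean is about -5.67, the standard error is about 0.67, and the margin is about 1.33, giving an interval of roughly [-7.00, -4.33]. The upper bound is below zero, so the result counts as significant under the rule above.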
§Example
use perfgate_paired::{compute_paired_stats, compare_paired_stats, PairedError};
use perfgate_types::{PairedSample, PairedSampleHalf};

// Build one half of a pair (baseline or current) with only the wall time set.
fn make_half(wall_ms: u64) -> PairedSampleHalf {
    PairedSampleHalf {
        wall_ms,
        exit_code: 0,
        timed_out: false,
        max_rss_kb: None,
        stdout: None,
        stderr: None,
    }
}

// Build a non-warmup paired sample; wall_diff_ms is current minus baseline.
fn make_sample(idx: u32, baseline_ms: u64, current_ms: u64) -> PairedSample {
    PairedSample {
        pair_index: idx,
        warmup: false,
        baseline: make_half(baseline_ms),
        current: make_half(current_ms),
        wall_diff_ms: current_ms as i64 - baseline_ms as i64,
        rss_diff_kb: None,
    }
}
// The `?` operator needs an enclosing function that returns a Result.
// Assumption: compute_paired_stats returns Result<_, PairedError>.
fn main() -> Result<(), PairedError> {
    let samples = vec![
        make_sample(0, 100, 95),  // 5ms improvement
        make_sample(1, 105, 100), // 5ms improvement
        make_sample(2, 110, 103), // 7ms improvement
    ];
    let stats = compute_paired_stats(&samples, None, None)?;
    let comparison = compare_paired_stats(&stats);
    println!("Mean diff: {:.2}ms", comparison.mean_diff_ms);
    println!("% change: {:.2}%", comparison.pct_change * 100.0);
    println!("Significant: {}", comparison.is_significant);
    Ok(())
}
Structs§
- PairedComparison - Result of comparing paired statistics, including significance testing.
Enums§
Functions§
- compare_paired_stats - Compare paired statistics and compute a confidence interval.
- compute_paired_cv - Compute the coefficient of variation (CV) of the wall-time differences from a set of paired samples (excluding warmups).
- compute_paired_stats - Compute summary statistics from paired benchmark samples.
- summarize_paired_diffs - Summarize the distribution of paired differences.