Lightweight benchmarking toolkit focused on practical performance analysis and report generation.
§benchkit
Practical, Documentation-First Benchmarking for Rust.
benchkit is a lightweight toolkit for performance analysis, born from the hard-learned lessons of optimizing high-performance libraries. It rejects rigid, all-or-nothing frameworks in favor of flexible, composable tools that integrate seamlessly into your existing workflow.
🎯 NEW TO benchkit? Start with recommendations.md - essential guidelines drawn from real-world performance optimization experience.
§The Benchmarking Dilemma
In Rust, developers often face a frustrating choice:
- The Heavy Framework (criterion): Statistically powerful, but it forces a rigid structure (benches/), complex setup, and reports that are difficult to integrate into your project's documentation. You must adapt your project to the framework.
- The Manual Approach (std::time): Simple to start, but statistically naive. It leads to boilerplate, inconsistent measurements, and conclusions that are easily skewed by system noise.
benchkit offers a third way.
Important: For production use and development contributions, see recommendations.md - a comprehensive guide with proven patterns, requirements, and best practices from real-world benchmarking experience.
§A Toolkit, Not a Framework
This is the core philosophy of benchkit. It doesn't impose a workflow; it provides a set of professional, composable tools that you can use however you see fit.
- ✅ Integrate Anywhere: Write benchmarks in your test files, examples, or binaries. No required directory structure.
- ✅ Documentation-First: Treat performance reports as a first-class part of your documentation, with tools to automatically keep them in sync with your code.
- ✅ Practical Focus: Surface the key metrics needed for optimization decisions, hiding deep statistical complexity until you ask for it.
- ✅ Zero Setup: Start measuring performance in minutes with a simple, intuitive API.
§Quick Start: Compare, Analyze, and Document
First time? Review recommendations.md for comprehensive best practices and development guidelines.
This example demonstrates the core benchkit workflow: comparing two algorithms and automatically updating a performance section in your readme.md.
1. Add to dev-dependencies in Cargo.toml:
[dev-dependencies]
benchkit = { version = "0.1", features = [ "full" ] }
2. Create a benchmark in your tests directory:
// In tests/performance_demo.rs
#![ cfg( feature = "enabled" ) ]
use benchkit::prelude::*;
fn generate_data( size : usize ) -> Vec< u32 >
{
( 0..size ).map( | x | x as u32 ).collect()
}
#[ test ]
fn update_readme_performance_docs()
{
let mut comparison = ComparativeAnalysis::new( "Sorting Algorithms" );
let data = generate_data( 1000 );
// Benchmark the first algorithm
comparison = comparison.algorithm
(
"std_stable_sort",
{
let mut d = data.clone();
move ||
{
d.sort();
}
}
);
// Benchmark the second algorithm
comparison = comparison.algorithm
(
"std_unstable_sort",
{
let mut d = data.clone();
move ||
{
d.sort_unstable();
}
}
);
// Run the comparison and update readme.md
let report = comparison.run();
let markdown = report.to_markdown();
let updater = MarkdownUpdater::new( "readme.md", "Benchmark Results" ).unwrap();
updater.update_section( &markdown ).unwrap();
}
3. Run your benchmark and watch readme.md update automatically (the enabled feature gate is covered in the Feature Flag Recommendations section below):
cargo test --test performance_demo --features enabled
§🧰 What's in the Toolkit?
benchkit provides a suite of composable tools. Use only what you need.
§Enhanced Features
NEW: Comprehensive Regression Analysis System
Advanced performance regression detection with statistical analysis and trend identification.
use benchkit::prelude::*;
use std::collections::HashMap;
use std::time::{ Duration, SystemTime };
fn regression_analysis_example() -> Result< (), Box< dyn std::error::Error > > {
// Current benchmark results
let mut current_results = HashMap::new();
let current_times = vec![ Duration::from_micros( 85 ), Duration::from_micros( 88 ), Duration::from_micros( 82 ) ];
current_results.insert( "fast_sort".to_string(), BenchmarkResult::new( "fast_sort", current_times ) );
// Historical baseline data
let mut baseline_data = HashMap::new();
let baseline_times = vec![ Duration::from_micros( 110 ), Duration::from_micros( 115 ), Duration::from_micros( 108 ) ];
baseline_data.insert( "fast_sort".to_string(), BenchmarkResult::new( "fast_sort", baseline_times ) );
let historical = HistoricalResults::new().with_baseline( baseline_data );
// Configure regression analyzer
let analyzer = RegressionAnalyzer::new()
.with_baseline_strategy( BaselineStrategy::FixedBaseline )
.with_significance_threshold( 0.05 ) // 5% significance level
.with_trend_window( 5 );
// Perform regression analysis
let regression_report = analyzer.analyze( &current_results, &historical );
// Check results
if regression_report.has_significant_changes() {
println!( "π Significant performance changes detected!" );
if let Some( trend ) = regression_report.get_trend_for( "fast_sort" ) {
match trend {
PerformanceTrend::Improving => println!( "🟢 Performance improved!" ),
PerformanceTrend::Degrading => println!( "🔴 Performance regression detected!" ),
PerformanceTrend::Stable => println!( "🟡 Performance remains stable" ),
}
}
// Generate professional markdown report
let markdown_report = regression_report.format_markdown();
println!( "{}", markdown_report );
}
Ok(())
}
Key Features:
- Three Baseline Strategies: Fixed baseline, rolling average, and previous run comparison (see the configuration sketch below)
- Statistical Significance: Configurable thresholds with proper statistical testing
- Trend Detection: Automatic identification of improving, degrading, or stable performance
- Professional Reports: Publication-quality markdown with statistical analysis
- CI/CD Integration: Automated regression detection for deployment pipelines
- Historical Data Management: Long-term performance tracking with quality validation
Use Cases:
- Automated performance regression detection in CI/CD pipelines
- Long-term performance monitoring and trend analysis
- Code optimization validation with statistical confidence
- Production deployment gates with zero-regression tolerance
- Performance documentation with automated updates
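The three baseline strategies differ only in which reference point the analyzer compares against. A minimal configuration sketch: only BaselineStrategy::FixedBaseline appears in the example above, so the RollingAverage and PreviousRun variant names are assumptions to be checked against the analysis module documentation.
use benchkit::prelude::*;
// Fixed baseline: compare every run against one pinned reference data set (as in the example above).
let _fixed = RegressionAnalyzer::new()
.with_baseline_strategy( BaselineStrategy::FixedBaseline )
.with_significance_threshold( 0.05 );
// Rolling average: compare against an average of recent historical runs.
// NOTE: the variant name `RollingAverage` is an assumption, not confirmed API.
let _rolling = RegressionAnalyzer::new()
.with_baseline_strategy( BaselineStrategy::RollingAverage )
.with_trend_window( 5 ); // trend detection over the five most recent runs
// Previous run: compare only against the immediately preceding run.
// NOTE: the variant name `PreviousRun` is an assumption as well.
let _previous = RegressionAnalyzer::new()
.with_baseline_strategy( BaselineStrategy::PreviousRun );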
Safe Update Chain Pattern - Atomic Documentation Updates
Coordinate multiple markdown section updates atomically - either all succeed or none are modified.
use benchkit::prelude::*;
fn update_markdown_atomically() -> Result< (), Box< dyn std::error::Error > > {
let performance_markdown = "## Performance Results\n\nFast!";
let memory_markdown = "## Memory Usage\n\nLow!";
let cpu_markdown = "## CPU Usage\n\nOptimal!";
// Update multiple sections atomically
let chain = MarkdownUpdateChain::new("readme.md")?
.add_section("Performance Benchmarks", performance_markdown)
.add_section("Memory Analysis", memory_markdown)
.add_section("CPU Profiling", cpu_markdown);
// Validate all sections before any updates
let conflicts = chain.check_all_conflicts()?;
if !conflicts.is_empty() {
return Err(format!("Section conflicts detected: {:?}", conflicts).into());
}
// Atomic update - either all succeed or all fail
chain.execute()?;
Ok(())
}
Key Features:
- Atomic Operations: Either all sections update successfully or none are modified
- Conflict Detection: Validates all sections exist and are unambiguous before any changes
- Automatic Rollback: Failed operations restore original file state
- Reduced I/O: Single read and write operation instead of multiple file accesses
- Error Recovery: Comprehensive error handling with detailed diagnostics
Use Cases:
- Multi-section benchmark reports that must stay synchronized
- CI/CD pipelines requiring consistent documentation updates
- Coordinated updates across large documentation projects
- Production deployments where partial updates would be problematic
Advanced Example:
use benchkit::prelude::*;
fn complex_update_example() -> Result< (), Box< dyn std::error::Error > > {
let performance_report = "Performance analysis results";
let memory_report = "Memory usage analysis";
let comparison_report = "Algorithm comparison data";
let validation_report = "Quality assessment report";
// Complex coordinated update across multiple report types
let chain = MarkdownUpdateChain::new("PROJECT_BENCHMARKS.md")?
.add_section("Performance Analysis", performance_report)
.add_section("Memory Usage Analysis", memory_report)
.add_section("Algorithm Comparison", comparison_report)
.add_section("Quality Assessment", validation_report);
// Validate everything before committing any changes
match chain.check_all_conflicts() {
Ok(conflicts) if conflicts.is_empty() => {
println!("β
All {} sections validated", chain.len());
chain.execute()?;
},
Ok(conflicts) => {
eprintln!("β οΈ Conflicts: {:?}", conflicts);
// Handle conflicts or use more specific section names
},
Err(e) => eprintln!("❌ Validation failed: {}", e),
}
Ok(())
}
Professional Report Templates - Research-Grade Documentation
Generate standardized, publication-quality reports with full statistical analysis and customizable sections.
use benchkit::prelude::*;
use std::collections::HashMap;
fn generate_reports() -> Result< (), Box< dyn std::error::Error > > {
let results = HashMap::new();
let comparison_results = HashMap::new();
// Comprehensive performance analysis
let performance_template = PerformanceReport::new()
.title("Algorithm Performance Analysis")
.add_context("Comparing sequential vs parallel processing approaches")
.include_statistical_analysis(true)
.include_regression_analysis(true)
.add_custom_section(CustomSection::new(
"Implementation Notes",
"Detailed implementation considerations and optimizations applied"
));
let performance_report = performance_template.generate(&results)?;
// A/B testing comparison with statistical significance
let comparison_template = ComparisonReport::new()
.title("Sequential vs Parallel Processing Comparison")
.baseline("Sequential Processing")
.candidate("Parallel Processing")
.significance_threshold(0.01) // 1% statistical significance
.practical_significance_threshold(0.05); // 5% practical significance
let comparison_report = comparison_template.generate(&comparison_results)?;
Ok(())
}
Performance Report Features:
- Executive Summary: Key metrics and performance indicators
- Statistical Analysis: Confidence intervals, coefficient of variation, reliability assessment
- Performance Tables: Sorted results with throughput, latency, and quality indicators
- Custom Sections: Domain-specific analysis and recommendations
- Professional Formatting: Publication-ready markdown with proper statistical notation
Comparison Report Features:
- Significance Testing: Both statistical and practical significance analysis
- Confidence Intervals: 95% CI analysis with overlap detection
- Performance Ratios: Clear improvement/regression percentages
- Reliability Assessment: Quality validation for both baseline and candidate
- Decision Support: Clear recommendations based on statistical analysis
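For orientation, the improvement or regression percentage behind a performance ratio is the usual relative change between the two mean times. A standalone sketch of the arithmetic with illustrative numbers, not benchkit internals:
// Relative change and speedup between a baseline and a candidate mean time.
let baseline_mean_us = 120.0_f64; // hypothetical baseline mean, in microseconds
let candidate_mean_us = 95.0_f64; // hypothetical candidate mean, in microseconds
let improvement_pct = ( baseline_mean_us - candidate_mean_us ) / baseline_mean_us * 100.0;
let speedup = baseline_mean_us / candidate_mean_us;
println!( "improvement: {improvement_pct:.1}%, speedup: {speedup:.2}x" ); // improvement: 20.8%, speedup: 1.26x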
Advanced Template Composition:
use benchkit::prelude::*;
fn create_enterprise_template() -> PerformanceReport {
// Create domain-specific template with multiple custom sections
let enterprise_template = PerformanceReport::new()
.title("Enterprise Algorithm Performance Audit")
.add_context("Monthly performance review for production trading systems")
.include_statistical_analysis(true)
.add_custom_section(CustomSection::new(
"Risk Assessment",
r#"### Performance Risk Analysis
| Algorithm | Latency Risk | Throughput Risk | Stability | Overall |
|-----------|-------------|-----------------|-----------|----------|
| Current | 🟢 Low | 🟡 Medium | 🟢 Low | 🟡 Medium |
| Proposed | 🟢 Low | 🟢 Low | 🟢 Low | 🟢 Low |"#
))
.add_custom_section(CustomSection::new(
"Business Impact",
r#"### Projected Business Impact
- **Latency Improvement**: 15% faster response times
- **Throughput Increase**: +2,000 req/sec capacity
- **Cost Reduction**: -$50K/month in infrastructure
- **SLA Compliance**: 99.9% → 99.99% uptime"#
));
enterprise_template
}
Benchmark Validation Framework - Quality Assurance
Comprehensive quality assessment system with configurable criteria and automatic reliability analysis.
use benchkit::prelude::*;
use std::collections::HashMap;
use std::time::Duration;
fn validate_benchmark_results() {
let results = HashMap::new();
// Configure validator for your specific requirements
let validator = BenchmarkValidator::new()
.min_samples(20) // Require 20+ measurements
.max_coefficient_variation(0.10) // 10% maximum variability
.require_warmup(true) // Detect warm-up periods
.max_time_ratio(3.0) // 3x max/min ratio
.min_measurement_time(Duration::from_micros(50)); // 50μs minimum duration
// Validate all results with detailed analysis
let validated_results = ValidatedResults::new(results, validator);
println!("Reliability: {:.1}%", validated_results.reliability_rate());
// Get detailed quality warnings
if let Some(warnings) = validated_results.reliability_warnings() {
println!("β οΈ Quality Issues Detected:");
for warning in warnings {
println!(" - {}", warning);
}
}
// Work with only statistically reliable results
let reliable_only = validated_results.reliable_results();
println!("Using {}/{} reliable benchmarks for analysis",
reliable_only.len(), validated_results.results.len());
}
Validation Criteria:
- Sample Size: Ensure sufficient measurements for statistical power
- Variability: Detect high coefficient of variation indicating noise
- Measurement Duration: Flag measurements that may be timing-resolution limited
- Performance Range: Identify outliers and wide performance distributions
- Warm-up Detection: Verify proper system warm-up for consistent results
Warning Types:
- InsufficientSamples: Too few measurements for reliable statistics
- HighVariability: Coefficient of variation exceeds threshold
- ShortMeasurementTime: Measurements may be affected by timer resolution
- WidePerformanceRange: Large ratio between fastest/slowest measurements
- NoWarmup: Missing warm-up period may indicate measurement issues
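For reference, the variability check is driven by the coefficient of variation: the standard deviation of the samples divided by their mean. A standalone sketch of the statistic itself (using the population standard deviation for simplicity; not benchkit's internal code):
use std::time::Duration;
// Coefficient of variation (CV) = standard deviation / mean.
// A CV above the configured threshold (e.g. 0.10 for 10%) indicates noisy measurements.
fn coefficient_of_variation( samples : &[ Duration ] ) -> f64
{
let n = samples.len() as f64;
let mean = samples.iter().map( | d | d.as_secs_f64() ).sum::< f64 >() / n;
let variance = samples.iter().map( | d | { let diff = d.as_secs_f64() - mean; diff * diff } ).sum::< f64 >() / n;
variance.sqrt() / mean
}
fn main()
{
let samples = vec![ Duration::from_micros( 100 ), Duration::from_micros( 110 ), Duration::from_micros( 95 ) ];
println!( "CV = {:.3}", coefficient_of_variation( &samples ) ); // roughly 0.061, i.e. ~6% variability
}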
Domain-Specific Validation:
use benchkit::prelude::*;
use std::collections::HashMap;
fn domain_specific_validation() {
let results = HashMap::new();
// Real-time systems validation (very strict)
let realtime_validator = BenchmarkValidator::new()
.min_samples(50)
.max_coefficient_variation(0.02) // 2% maximum
.max_time_ratio(1.5); // Very tight timing
// Interactive systems validation (balanced)
let interactive_validator = BenchmarkValidator::new()
.min_samples(15)
.max_coefficient_variation(0.15) // 15% acceptable
.require_warmup(false); // Interactive may not show warmup
// Batch processing validation (lenient)
let batch_validator = BenchmarkValidator::new()
.min_samples(10)
.max_coefficient_variation(0.25) // 25% acceptable
.max_time_ratio(5.0); // Allow more variation
// Apply appropriate validator for your domain
let domain_results = ValidatedResults::new(results, realtime_validator);
}
Quality Reporting:
use benchkit::prelude::*;
use std::collections::HashMap;
fn generate_validation_report() {
let results = HashMap::new();
let validator = BenchmarkValidator::new();
// Generate comprehensive validation report
let validation_report = validator.generate_validation_report(&results);
// Validation report includes:
// - Summary statistics and reliability rates
// - Detailed warnings with improvement recommendations
// - Validation criteria documentation
// - Quality assessment for each benchmark
// - Actionable steps to improve measurement quality
println!("{}", validation_report);
}
Complete Integration Examples
Comprehensive examples demonstrating real-world usage patterns and advanced integration scenarios.
Development Workflow Integration:
use benchkit::prelude::*;
// Complete development cycle: benchmark → validate → document → commit
fn development_workflow() -> Result< (), Box< dyn std::error::Error > > {
// Mock implementations for doc test
fn quicksort_implementation() {}
fn mergesort_implementation() {}
// 1. Run benchmarks
let mut suite = BenchmarkSuite::new("Algorithm Performance");
suite.benchmark("quicksort", || quicksort_implementation());
suite.benchmark("mergesort", || mergesort_implementation());
let results = suite.run_all();
// 2. Validate quality
let validator = BenchmarkValidator::new()
.min_samples(15)
.max_coefficient_variation(0.15);
let validated_results = ValidatedResults::new(results.results, validator);
if validated_results.reliability_rate() < 80.0 {
return Err("Benchmark quality insufficient for analysis".into());
}
// 3. Generate professional report
let template = PerformanceReport::new()
.title("Algorithm Performance Analysis")
.include_statistical_analysis(true)
.add_custom_section(CustomSection::new(
"Development Notes",
"Analysis conducted during algorithm optimization phase"
));
let report = template.generate(&validated_results.results)?;
// 4. Update documentation atomically
let chain = MarkdownUpdateChain::new("README.md")?
.add_section("Performance Analysis", report)
.add_section("Quality Assessment", validated_results.validation_report());
chain.execute()?;
println!("β
Development documentation updated successfully");
Ok(())
}
CI/CD Pipeline Integration:
use benchkit::prelude::*;
use std::collections::HashMap;
// Automated performance regression detection
fn cicd_performance_check(baseline_results: HashMap<String, BenchmarkResult>,
pr_results: HashMap<String, BenchmarkResult>) -> Result< bool, Box< dyn std::error::Error > > {
// Validate both result sets
let validator = BenchmarkValidator::new().require_warmup(false);
let baseline_validated = ValidatedResults::new(baseline_results.clone(), validator.clone());
let pr_validated = ValidatedResults::new(pr_results.clone(), validator);
// Require high quality for regression analysis
if baseline_validated.reliability_rate() < 90.0 || pr_validated.reliability_rate() < 90.0 {
println!("β BLOCK: Insufficient benchmark quality for regression analysis");
return Ok(false);
}
// Compare performance for regression detection
let comparison = ComparisonReport::new()
.title("Performance Regression Analysis")
.baseline("baseline_version")
.candidate("pr_version")
.practical_significance_threshold(0.05); // 5% regression threshold
// Create combined results for comparison
let mut combined = HashMap::new();
combined.insert("baseline_version".to_string(),
baseline_results.values().next().unwrap().clone());
combined.insert("pr_version".to_string(),
pr_results.values().next().unwrap().clone());
let regression_report = comparison.generate(&combined)?;
// Check for regressions
let has_regression = regression_report.contains("slower");
if has_regression {
println!("β BLOCK: Performance regression detected");
// Save detailed report for review
std::fs::write("regression_analysis.md", regression_report)?;
Ok(false)
} else {
println!("β
ALLOW: No performance regressions detected");
Ok(true)
}
}
Multi-Project Coordination:
use benchkit::prelude::*;
use std::collections::HashMap;
// Coordinate benchmark updates across multiple related projects
fn coordinate_multi_project_benchmarks() -> Result< (), Box< dyn std::error::Error > > {
let projects = vec!["web-api", "batch-processor", "realtime-analyzer"];
let mut all_results = HashMap::new();
// Collect results from all projects
for project in &projects {
let project_results = run_project_benchmarks(project)?;
all_results.extend(project_results);
}
// Cross-project validation with lenient criteria
let validator = BenchmarkValidator::new()
.max_coefficient_variation(0.25) // Different environments have more noise
.require_warmup(false);
let cross_project_validated = ValidatedResults::new(all_results.clone(), validator);
// Generate consolidated impact analysis
let impact_template = PerformanceReport::new()
.title("Cross-Project Performance Impact Analysis")
.add_context("Shared library upgrade impact across all dependent projects")
.include_statistical_analysis(true)
.add_custom_section(CustomSection::new(
"Project Impact Summary",
format_project_impact_analysis(&projects, &all_results)
));
let impact_report = impact_template.generate(&all_results)?;
// Update shared documentation
let shared_chain = MarkdownUpdateChain::new("SHARED_LIBRARY_IMPACT.md")?
.add_section("Current Impact Analysis", &impact_report)
.add_section("Quality Assessment", &cross_project_validated.validation_report());
shared_chain.execute()?;
// Notify project maintainers
notify_project_teams(&projects, &impact_report)?;
Ok(())
}
// Helper functions for the example
fn run_project_benchmarks(_project: &str) -> Result< HashMap< String, BenchmarkResult >, Box< dyn std::error::Error > > {
// Mock implementation for doc test
Ok(HashMap::new())
}
fn format_project_impact_analysis(_projects: &[&str], _results: &HashMap< String, BenchmarkResult >) -> String {
// Mock implementation for doc test
"Impact analysis summary".to_string()
}
fn notify_project_teams(_projects: &[&str], _report: &str) -> Result< (), Box< dyn std::error::Error > > {
// Mock implementation for doc test
Ok(())
}
Measure: Core Timing and Profiling
At its heart, benchkit provides simple and accurate measurement primitives.
use benchkit::prelude::*;
// A robust measurement with multiple iterations and statistical cleanup.
let result = bench_function
(
"summation_1000",
||
{
( 0..1000 ).fold( 0, | acc, x | acc + x )
}
);
println!( "Avg time: {:.2?}", result.mean_time() );
println!( "Throughput: {:.0} ops/sec", result.operations_per_second() );
// Track memory usage patterns alongside timing.
let memory_benchmark = MemoryBenchmark::new( "allocation_test" );
let ( timing, memory_stats ) = memory_benchmark.run_with_tracking
(
10,
||
{
let data = vec![ 0u8; 1024 ];
memory_benchmark.tracker.record_allocation( 1024 );
std::hint::black_box( data );
}
);
println!( "Peak memory usage: {} bytes", memory_stats.peak_usage );
Analyze: Find Insights and Regressions
Turn raw numbers into actionable insights.
use benchkit::prelude::*;
// Compare multiple implementations to find the best one.
let report = ComparativeAnalysis::new( "Hashing" )
.algorithm( "fnv", || { /* ... */ } )
.algorithm( "siphash", || { /* ... */ } )
.run();
if let Some( ( fastest_name, _ ) ) = report.fastest()
{
println!( "Fastest algorithm: {}", fastest_name );
}
// Example benchmark results
let result_a = bench_function( "test_a", || { /* ... */ } );
let result_b = bench_function( "test_b", || { /* ... */ } );
// Compare two benchmark results
let comparison = result_a.compare( &result_b );
if comparison.is_improvement()
{
println!( "Performance improved!" );
}
Generate: Create Realistic Test Data
Stop writing boilerplate to create test data. benchkit provides generators for common scenarios.
use benchkit::prelude::*;
// Generate a comma-separated list of 100 items.
let list_data = generate_list_data( DataSize::Medium );
// Generate realistic unilang command strings for parser benchmarking.
let command_generator = DataGenerator::new()
.complexity( DataComplexity::Complex );
let commands = command_generator.generate_unilang_commands( 10 );
// Create reproducible data with a specific seed.
let mut seeded_gen = SeededGenerator::new( 42 );
let random_data = seeded_gen.random_string( 1024 );
Document: Automate Your Reports
The "documentation-first" philosophy is enabled by powerful report generation and file updating tools.
use benchkit::prelude::*;
fn main() -> Result< (), Box< dyn std::error::Error > >
{
let mut suite = BenchmarkSuite::new( "api_performance" );
suite.benchmark( "get_user", || { /* ... */ } );
suite.benchmark( "create_user", || { /* ... */ } );
let results = suite.run_analysis();
// Generate a markdown report from the results.
let markdown_report = results.generate_markdown_report().generate();
// Automatically update the "## Performance" section of a file.
let updater = MarkdownUpdater::new( "readme.md", "Performance" )?;
updater.update_section( &markdown_report )?;
Ok( () )
}
§The benchkit Workflow
benchkit is designed to make performance analysis a natural part of your development cycle.
[ 1. Write Code ] -> [ 2. Add a Benchmark ] -> [ 3. Run the Benchmark ]
        ^                                                |
        |                                                v
[ 5. Commit Code + Perf Docs ] <- [ 4. Auto-Update `benchmark_results.md` ] <- [ Analyze Results ]
§Why Not benches/? Standard Directory Integration
The traditional benches/ directory creates artificial separation between ALL your benchmark content and the standard Rust project structure. benchkit encourages you to use standard directories for ALL benchmark-related files:
- ✅ Use tests/: Performance benchmarks alongside unit tests
- ✅ Use examples/: Demonstration benchmarks and showcases
- ✅ Use src/bin/: Dedicated benchmark executables
- ✅ Standard integration: Keep ALL benchmark content in standard Rust directories
- ❌ Avoid benches/: Don't isolate ANY benchmark files in framework-specific directories
§Why This Matters
Workflow Integration: ALL benchmark content should be part of regular development, not isolated in framework-specific directories.
Documentation Proximity: ALL benchmark files are documentation - keep them integrated with your standard project structure for better maintainability.
Testing Philosophy: Performance is part of correctness validation - integrate benchmarks with your existing test suite.
Toolkit vs Framework: Frameworks enforce rigid benches/ isolation; toolkits integrate with your existing project structure.
§Automatic Documentation Updates
benchkit excels at maintaining comprehensive, automatically updated documentation in your project files:
# Benchmark Results
## Algorithm Comparison
| Algorithm | Mean Time | Throughput | Relative |
|-----------|-----------|------------|----------|
| quicksort | 1.23ms | 815 ops/s | baseline |
| mergesort | 1.45ms | 689 ops/s | 1.18x |
| heapsort | 1.67ms | 599 ops/s | 1.36x |
*Last updated: 2024-01-15 14:32:18 UTC*
*Generated by benchkit v0.4.0*
## Performance Trends
- quicksort maintains consistent performance across data sizes
- mergesort shows better cache behavior on large datasets
- heapsort provides predictable O(n log n) guarantees
## Test Configuration
- Hardware: 16-core AMD Ryzen, 32GB RAM
- Rust version: 1.75.0
- Optimization: --release
- Iterations: 1000 per benchmark
This documentation is automatically generated and updated every time you run benchmarks.
§Integration Examples
// ✅ In standard tests/ directory alongside unit tests
// tests/performance_comparison.rs
use benchkit::prelude::*;
#[test]
fn benchmark_algorithms()
{
let mut suite = BenchmarkSuite::new( "Algorithm Comparison" );
suite.benchmark( "quick_sort", ||
{
// Your quicksort implementation
});
suite.benchmark( "merge_sort", ||
{
// Your mergesort implementation
});
let results = suite.run_all();
// Automatically update readme.md with results
let updater = MarkdownUpdater::new( "readme.md", "Performance" ).unwrap();
updater.update_section( &results.generate_markdown_report().generate() ).unwrap();
}
// ✅ In examples/ directory for demonstrations
// examples/comprehensive_benchmark.rs
use benchkit::prelude::*;
fn main()
{
let mut comprehensive = BenchmarkSuite::new( "Comprehensive Performance Analysis" );
// Add multiple benchmarks
comprehensive.benchmark( "data_processing", || { /* code */ } );
comprehensive.benchmark( "memory_operations", || { /* code */ } );
comprehensive.benchmark( "io_operations", || { /* code */ } );
let results = comprehensive.run_all();
// Update readme.md with comprehensive report
let report = results.generate_markdown_report();
let updater = MarkdownUpdater::new( "readme.md", "Performance Analysis" ).unwrap();
updater.update_section( &report.generate() ).unwrap();
println!( "Updated readme.md with latest performance results" );
}
§🔧 Feature Flag Recommendations
For optimal build performance and clean separation, put your benchmark code behind feature flags:
// ✅ In src/bin/ directory for dedicated benchmark executables
// src/bin/comprehensive_benchmark.rs
#[ cfg( feature = "enabled" ) ]
use benchkit::prelude::*;
#[ cfg( feature = "enabled" ) ]
fn main()
{
let mut suite = BenchmarkSuite::new( "Comprehensive Performance Suite" );
suite.benchmark( "algorithm_a", || { /* implementation */ } );
suite.benchmark( "algorithm_b", || { /* implementation */ } );
suite.benchmark( "data_structure_ops", || { /* implementation */ } );
let results = suite.run_all();
// Automatically update readme.md
let updater = MarkdownUpdater::new( "readme.md", "Latest Results" ).unwrap();
updater.update_section( &results.generate_markdown_report().generate() ).unwrap();
println!( "Benchmarks completed - readme.md updated" );
}
#[ cfg( not( feature = "enabled" ) ) ]
fn main()
{
println!( "Run with: cargo run --bin comprehensive_benchmark --features enabled" );
println!( "Results will be automatically saved to readme.md" );
}
Add to your Cargo.toml (the gated dependency must live in [dependencies] so that src/bin/ executables can use it; optional dependencies are not supported in [dev-dependencies]):
[features]
enabled = [ "benchkit" ]
[dependencies]
benchkit = { version = "0.1", features = [ "full" ], optional = true }
Run benchmarks selectively:
# Run only unit tests (fast)
cargo test
# Run specific benchmark binary (updates readme.md)
cargo run --bin comprehensive_benchmark --features enabled
# Run benchmarks from examples/
cargo run --example performance_demo --features enabled
# Run other benchmark binaries the same way
cargo run --bin performance_suite --features enabled
This approach keeps your regular builds fast while making comprehensive performance testing available when needed.
§Comprehensive Examples
benchkit includes extensive examples demonstrating every feature and usage pattern:
§🎯 Feature-Specific Examples
- Update Chain Comprehensive: Complete demonstration of atomic documentation updates
- Single and multi-section updates with conflict detection
- Error handling and recovery patterns
- Advanced conflict resolution strategies
- Performance optimization for bulk updates
- Full integration with validation and templates
- Templates Comprehensive: Professional report generation in all scenarios
- Basic and fully customized Performance Report templates
- A/B testing with Comparison Report templates
- Custom sections with advanced markdown formatting
- Multiple comparison scenarios and batch processing
- Business impact analysis and risk assessment templates
- Comprehensive error handling for edge cases
- Validation Comprehensive: Quality assurance for reliable benchmarking
- Default and custom validator configurations
- Individual warning types with detailed analysis
- Validation report generation and interpretation
- Reliable results filtering for analysis
- Domain-specific validation scenarios (research, development, production, micro)
- Full integration with templates and update chains
- Regression Analysis Comprehensive: Complete regression analysis system demonstration
- All baseline strategies (Fixed, Rolling Average, Previous Run)
- Performance trend detection (Improving, Degrading, Stable)
- Statistical significance testing with configurable thresholds
- Professional markdown report generation with regression insights
- Real-world optimization scenarios and configuration guidance
- Full integration with PerformanceReport templates
- Historical Data Management: Managing long-term performance data
- Incremental historical data building and TimestampedResults creation
- Data quality validation and cleanup procedures
- Performance trend analysis across multiple time windows
- Storage and serialization strategy recommendations
- Data retention and archival best practices
- Integration with RegressionAnalyzer for trend detection
§🔧 Integration Examples
- Integration Workflows: Real-world workflow automation
- Development cycle: benchmark → validate → document → commit
- CI/CD pipeline: regression detection → merge decision → automated reporting
- Multi-project coordination: impact analysis → consolidated reporting → team alignment
- Production monitoring: continuous tracking → alerting → dashboard updates
- Error Handling Patterns: Robust operation under adverse conditions
- Update Chain file system errors (permissions, conflicts, recovery)
- Template generation errors (missing data, invalid parameters)
- Validation framework edge cases (malformed data, extreme variance)
- System errors (resource limits, concurrent access)
- Graceful degradation strategies with automatic fallbacks
- Advanced Usage Patterns: Enterprise-scale benchmarking
- Domain-specific validation criteria (real-time, interactive, batch processing)
- Template composition and inheritance patterns
- Coordinated multi-document updates with consistency guarantees
- Memory-efficient large-scale processing (1000+ algorithms)
- Performance optimization techniques (caching, concurrency, incremental processing)
- CI/CD Regression Detection: Automated performance validation in CI/CD pipelines
- Multi-environment validation (development, staging, production)
- Configurable regression thresholds and statistical significance levels
- Automated performance gate decisions with proper exit codes
- GitHub Actions compatible reporting and documentation updates
- Progressive validation pipeline with halt-on-failure
- Real-world CI/CD integration patterns and best practices
- 🚨 Cargo Bench Integration: CRITICAL - Standard cargo bench integration patterns
- Seamless integration with Rust's standard cargo bench command
- Automatic documentation updates during benchmark execution
- Standard benches/ directory structure support
- Criterion compatibility layer for zero-migration adoption
- CI/CD integration with standard workflows and conventions
- Real-world project structure and configuration examples
- This is the foundation requirement for benchkit adoption
§Running the Examples
# Feature-specific examples
cargo run --example update_chain_comprehensive --all-features
cargo run --example templates_comprehensive --all-features
cargo run --example validation_comprehensive --all-features
# NEW: Regression Analysis Examples
cargo run --example regression_analysis_comprehensive --all-features
cargo run --example historical_data_management --all-features
# Integration examples
cargo run --example integration_workflows --all-features
cargo run --example error_handling_patterns --all-features
cargo run --example advanced_usage_patterns --all-features
# NEW: CI/CD Integration Example
cargo run --example cicd_regression_detection --all-features
# 🚨 CRITICAL: Cargo Bench Integration Example
cargo run --example cargo_bench_integration --all-features
# Original enhanced features demo
cargo run --example enhanced_features_demo --all-features
Each example is fully documented with detailed explanations and demonstrates production-ready patterns you can adapt to your specific needs.
§Installation
Add benchkit to your [dev-dependencies] in Cargo.toml.
[dev-dependencies]
# For core functionality
benchkit = "0.1"
# Or enable all features for the full toolkit
benchkit = { version = "0.1", features = [ "full" ] }
§Development Guidelines & Best Practices
⚠️ IMPORTANT: Before using benchkit in production or contributing to development, carefully review the comprehensive recommendations.md file. It contains essential requirements, best practices, and lessons learned from real-world performance analysis work.
The recommendations cover:
- ✅ Core philosophy and toolkit vs framework principles
- ✅ Technical architecture requirements and feature organization
- ✅ Performance analysis best practices with standardized data patterns
- ✅ Documentation integration requirements for automated reporting
- ✅ Statistical analysis requirements for reliable measurements
Read recommendations.md first - it will save you time and ensure you're following proven patterns.
§Contributing
Contributions are welcome! benchkit aims to be a community-driven toolkit that solves real-world benchmarking problems.
Before contributing:
- Read recommendations.md - Contains all development requirements and design principles
- Review open tasks in the task/ directory
- Check our contribution guidelines
All contributions must align with the principles and requirements outlined in recommendations.md.
§License
This project is licensed under the MIT License.
§Performance
This section is automatically updated by benchkit when you run benchmarks.
§Modules
- analysis: Analysis tools for benchmark results
- comparison: Framework and algorithm comparison utilities
- data_generation: Advanced data generation utilities for benchmarking
- diff: Git-style diff functionality for benchmark results
- documentation: Documentation integration and auto-update utilities
- generators: Data generators for benchmarking
- measurement: Core measurement and timing functionality
- memory_tracking: Memory allocation tracking and analysis for benchmarks
- parser_analysis: Parser-specific analysis utilities
- parser_data_generation: Parser-specific data generation utilities
- plotting: Visualization and plotting utilities for benchmark results
- prelude: Prelude module for convenient imports
- profiling: Memory allocation and performance profiling tools
- reporting: Report generation and markdown integration
- scaling: Scaling analysis tools for performance testing
- statistical: Research-grade statistical analysis for benchmark results
- suite: Benchmark suite management
- templates: Template system for consistent documentation formatting
- throughput: Throughput calculation and analysis utilities
- update_chain: Safe Update Chain Pattern for coordinated markdown section updates
- validation: Benchmark validation and quality assessment framework
§Macros
- bench_block: Measure a block of code (convenience macro)
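A hypothetical usage sketch for bench_block!: the exact invocation form is an assumption modeled on bench_function above, so consult the macro documentation for the real signature.
use benchkit::prelude::*;
// Assumed form: wrap a block of code and get back the same kind of result as bench_function.
let result = bench_block!(
{
( 0..1000 ).fold( 0u64, | acc, x | acc + x )
});
println!( "Avg time: {:.2?}", result.mean_time() );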