# benchkit

[![docs.rs](https://docs.rs/benchkit/badge.svg)](https://docs.rs/benchkit)
[![discord](https://img.shields.io/discord/872391416519647252?color=eee&logo=discord&logoColor=eee&label=ask%20on%20discord)](https://discord.gg/m3YfbXpUUY)

**Practical, Documentation-First Benchmarking for Rust.**

`benchkit` is a lightweight toolkit for performance analysis, born from the hard-learned lessons of optimizing high-performance libraries. It rejects rigid, all-or-nothing frameworks in favor of flexible, composable tools that integrate seamlessly into your existing workflow.

> 🎯 **NEW TO benchkit?** Start with [`usage.md`](usage.md) - Mandatory standards and requirements from production systems.

## The Benchmarking Dilemma

In Rust, developers often face a frustrating choice:

1.  **The Heavy Framework (`criterion`):** Statistically powerful, but forces a rigid structure (`benches/`), complex setup, and produces reports that are difficult to integrate into your project's documentation. You must adapt your project to the framework.
2.  **The Manual Approach (`std::time`):** Simple to start, but statistically naive. It leads to boilerplate, inconsistent measurements, and conclusions that are easily skewed by system noise.

`benchkit` offers a third way.

> **📋 Important**: For production use and development contributions, see [`usage.md`](usage.md) - mandatory standards and proven patterns from production systems.

## A Toolkit, Not a Framework

This is the core philosophy of `benchkit`. It doesn't impose a workflow; it provides a set of professional, composable tools that you can use however you see fit.

*   **Integrate Anywhere:** Write benchmarks in your test files, examples, or binaries. No required directory structure.
*   **Documentation-First:** Treat performance reports as a first-class part of your documentation, with tools to automatically keep them in sync with your code.
*   **Practical Focus:** Surface the key metrics needed for optimization decisions, hiding deep statistical complexity until you ask for it.
*   **Zero Setup:** Start measuring performance in minutes with a simple, intuitive API.

---

## 🚀 Quick Start: Compare, Analyze, and Document

**📖 First time?** Review [`usage.md`](usage.md) for mandatory compliance standards and development requirements.

This example demonstrates the core `benchkit` workflow: comparing two algorithms and automatically updating a performance section in your `readme.md`.

**1. Add to `Cargo.toml` (the `enabled` feature gates benchmark compilation):**
```toml
[dev-dependencies]
benchkit = { version = "0.12.0", features = [ "full" ] }

[features]
enabled = []
```

**2. Create a benchmark in your `benches` directory:**

```rust  
// In benches/performance_demo.rs
#![ cfg( feature = "enabled" ) ]
use benchkit::prelude::*;

fn generate_data( size : usize ) -> Vec< u32 >
{
  ( 0..size ).map( | x | x as u32 ).collect()
}

#[ test ]
fn update_readme_performance_docs()
{
  let mut comparison = ComparativeAnalysis::new( "Sorting Algorithms" );
  let data = generate_data( 1000 );

  // Benchmark the first algorithm
  comparison = comparison.algorithm
  (
    "std_stable_sort",
    {
      let mut d = data.clone();
      move ||
      {
        d.sort();
      }
    }
  );

  // Benchmark the second algorithm
  comparison = comparison.algorithm
  (
    "std_unstable_sort",
    {
      let mut d = data.clone();
      move ||
      {
        d.sort_unstable();
      }
    }
  );

  // Run the comparison and update readme.md
  let report = comparison.run();
  let markdown = report.to_markdown();

  let updater = MarkdownUpdater::new( "readme.md", "Benchmark Results" ).unwrap();
  updater.update_section( &markdown ).unwrap();
}
```

**3. Run the benchmark and watch `readme.md` update automatically:**
```bash
# `#[test]` functions in a bench target run under the standard test harness
cargo test --bench performance_demo --features enabled
```

---

## 🧰 What's in the Toolkit?

`benchkit` provides a suite of composable tools. Use only what you need.

### 🆕 Enhanced Features

<details>
<summary><strong>🔥 NEW: Comprehensive Regression Analysis System</strong></summary>

Advanced performance regression detection with statistical analysis and trend identification.

```rust
use benchkit::prelude::*;
use std::collections::HashMap;
use std::time::Duration;

fn regression_analysis_example() -> Result< (), Box< dyn std::error::Error > > 
{
    // Current benchmark results
    let mut current_results = HashMap::new();
    let current_times = vec![ Duration::from_micros( 85 ), Duration::from_micros( 88 ), Duration::from_micros( 82 ) ];
    current_results.insert( "fast_sort".to_string(), BenchmarkResult::new( "fast_sort", current_times ) );
    
    // Historical baseline data
    let mut baseline_data = HashMap::new();
    let baseline_times = vec![ Duration::from_micros( 110 ), Duration::from_micros( 115 ), Duration::from_micros( 108 ) ];
    baseline_data.insert( "fast_sort".to_string(), BenchmarkResult::new( "fast_sort", baseline_times ) );
    
    let historical = HistoricalResults::new().with_baseline( baseline_data );
    
    // Configure regression analyzer
    let analyzer = RegressionAnalyzer::new()
        .with_baseline_strategy( BaselineStrategy::FixedBaseline )
        .with_significance_threshold( 0.05 )  // 5% significance level
        .with_trend_window( 5 );
    
    // Perform regression analysis
    let regression_report = analyzer.analyze( &current_results, &historical );
    
    // Check results
    if regression_report.has_significant_changes() {
        println!( "📊 Significant performance changes detected!" );
        
        if let Some( trend ) = regression_report.get_trend_for( "fast_sort" ) {
            match trend {
                PerformanceTrend::Improving => println!( "🟢 Performance improved!" ),
                PerformanceTrend::Degrading => println!( "🔴 Performance regression detected!" ),
                PerformanceTrend::Stable => println!( "🟡 Performance remains stable" ),
            }
        }
        
        // Generate professional markdown report
        let markdown_report = regression_report.format_markdown();
        println!( "{}", markdown_report );
    }
    
    Ok(())
}
```

**Key Features:**
- **Three Baseline Strategies**: Fixed baseline, rolling average, and previous run comparison (selection sketched below)
- **Statistical Significance**: Configurable thresholds with proper statistical testing
- **Trend Detection**: Automatic identification of improving, degrading, or stable performance
- **Professional Reports**: Publication-quality markdown with statistical analysis
- **CI/CD Integration**: Automated regression detection for deployment pipelines
- **Historical Data Management**: Long-term performance tracking with quality validation

**Use Cases:**
- Automated performance regression detection in CI/CD pipelines
- Long-term performance monitoring and trend analysis
- Code optimization validation with statistical confidence
- Production deployment gates with zero-regression tolerance
- Performance documentation with automated updates
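
A hedged sketch of strategy selection, using the builder shown above. Only `BaselineStrategy::FixedBaseline` appears in the example; the `RollingAverage` and `PreviousRun` variant names are assumptions inferred from the strategy list:

```rust
use benchkit::prelude::*;

// Only `FixedBaseline` is confirmed by the example above; the other two
// variant names are hypothetical stand-ins for the documented strategies.
let _fixed = RegressionAnalyzer::new()
  .with_baseline_strategy( BaselineStrategy::FixedBaseline );
let _rolling = RegressionAnalyzer::new()
  .with_baseline_strategy( BaselineStrategy::RollingAverage );
let _previous = RegressionAnalyzer::new()
  .with_baseline_strategy( BaselineStrategy::PreviousRun );
```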

</details>

<details>
<summary><strong>Safe Update Chain Pattern - Atomic Documentation Updates</strong></summary>

Coordinate multiple markdown section updates atomically - either all succeed or none are modified.

```rust
use benchkit::prelude::*;

fn update_markdown_atomically() -> Result< (), Box< dyn std::error::Error > > 
{
    let performance_markdown = "## Performance Results\n\nFast!";
    let memory_markdown = "## Memory Usage\n\nLow!";
    let cpu_markdown = "## CPU Usage\n\nOptimal!";
    
    // Update multiple sections atomically
    let chain = MarkdownUpdateChain::new("readme.md")?
        .add_section("Performance Benchmarks", performance_markdown)
        .add_section("Memory Analysis", memory_markdown)
        .add_section("CPU Profiling", cpu_markdown);

    // Validate all sections before any updates
    let conflicts = chain.check_all_conflicts()?;
    if !conflicts.is_empty() {
        return Err(format!("Section conflicts detected: {:?}", conflicts).into());
    }

    // Atomic update - either all succeed or all fail
    chain.execute()?;
    Ok(())
}
```

**Key Features:**
- **Atomic Operations**: Either all sections update successfully or none are modified
- **Conflict Detection**: Validates all sections exist and are unambiguous before any changes
- **Automatic Rollback**: Failed operations restore original file state
- **Reduced I/O**: Single read and write operation instead of multiple file accesses
- **Error Recovery**: Comprehensive error handling with detailed diagnostics

**Use Cases:**
- Multi-section benchmark reports that must stay synchronized
- CI/CD pipelines requiring consistent documentation updates
- Coordinated updates across large documentation projects
- Production deployments where partial updates would be problematic

**Advanced Example:**
```rust
use benchkit::prelude::*;

fn complex_update_example() -> Result< (), Box< dyn std::error::Error > > 
{
    let performance_report = "Performance analysis results";
    let memory_report = "Memory usage analysis";
    let comparison_report = "Algorithm comparison data";
    let validation_report = "Quality assessment report";
    
    // Complex coordinated update across multiple report types
    let chain = MarkdownUpdateChain::new("PROJECT_BENCHMARKS.md")?
        .add_section("Performance Analysis", performance_report)
        .add_section("Memory Usage Analysis", memory_report)
        .add_section("Algorithm Comparison", comparison_report)
        .add_section("Quality Assessment", validation_report);

    // Validate everything before committing any changes
    match chain.check_all_conflicts() {
        Ok(conflicts) if conflicts.is_empty() => {
            println!("✅ All {} sections validated", chain.len());
            chain.execute()?;
        },
        Ok(conflicts) => {
            eprintln!("⚠️ Conflicts: {:?}", conflicts);
            // Handle conflicts or use more specific section names
        },
        Err(e) => eprintln!("❌ Validation failed: {}", e),
    }
    Ok(())
}
```

</details>

<details>
<summary><strong>Professional Report Templates - Research-Grade Documentation</strong></summary>

Generate standardized, publication-quality reports with full statistical analysis and customizable sections.

```rust
use benchkit::prelude::*;
use std::collections::HashMap;

fn generate_reports() -> Result< (), Box< dyn std::error::Error > > 
{
    let results = HashMap::new();
    let comparison_results = HashMap::new();
    
    // Comprehensive performance analysis
    let performance_template = PerformanceReport::new()
        .title("Algorithm Performance Analysis")
        .add_context("Comparing sequential vs parallel processing approaches")
        .include_statistical_analysis(true)
        .include_regression_analysis(true)
        .add_custom_section(CustomSection::new(
            "Implementation Notes", 
            "Detailed implementation considerations and optimizations applied"
        ));

    let performance_report = performance_template.generate(&results)?;

    // A/B testing comparison with statistical significance
    let comparison_template = ComparisonReport::new()
        .title("Sequential vs Parallel Processing Comparison")
        .baseline("Sequential Processing")
        .candidate("Parallel Processing") 
        .significance_threshold(0.01)     // 1% statistical significance
        .practical_significance_threshold(0.05);  // 5% practical significance

    let comparison_report = comparison_template.generate(&comparison_results)?;
    Ok(())
}
```

**Performance Report Features:**
- **Executive Summary**: Key metrics and performance indicators
- **Statistical Analysis**: Confidence intervals, coefficient of variation, reliability assessment
- **Performance Tables**: Sorted results with throughput, latency, and quality indicators
- **Custom Sections**: Domain-specific analysis and recommendations
- **Professional Formatting**: Publication-ready markdown with proper statistical notation

**Comparison Report Features:**
- **Significance Testing**: Both statistical and practical significance analysis
- **Confidence Intervals**: 95% CI analysis with overlap detection
- **Performance Ratios**: Clear improvement/regression percentages
- **Reliability Assessment**: Quality validation for both baseline and candidate
- **Decision Support**: Clear recommendations based on statistical analysis

**Advanced Template Composition:**
```rust
use benchkit::prelude::*;

fn create_enterprise_template() -> PerformanceReport 
{
    // Create domain-specific template with multiple custom sections
    let enterprise_template = PerformanceReport::new()
        .title("Enterprise Algorithm Performance Audit")
        .add_context("Monthly performance review for production trading systems")
        .include_statistical_analysis(true)
        .add_custom_section(CustomSection::new(
            "Risk Assessment",
            r#"### Performance Risk Analysis
            
    | Algorithm | Latency Risk | Throughput Risk | Stability | Overall |
    |-----------|-------------|-----------------|-----------|----------|
    | Current   | 🟢 Low     | 🟡 Medium      | 🟢 Low   | 🟡 Medium |
    | Proposed  | 🟢 Low     | 🟢 Low        | 🟢 Low   | 🟢 Low    |"#
        ))
        .add_custom_section(CustomSection::new(
            "Business Impact",
            r#"### Projected Business Impact

    - **Latency Improvement**: 15% faster response times
    - **Throughput Increase**: +2,000 req/sec capacity
    - **Cost Reduction**: -$50K/month in infrastructure
    - **SLA Compliance**: 99.9% → 99.99% uptime"#
        ));
    enterprise_template
}
```

</details>

<details>
<summary><strong>Benchmark Validation Framework - Quality Assurance</strong></summary>

Comprehensive quality assessment system with configurable criteria and automatic reliability analysis.

```rust
use benchkit::prelude::*;
use std::collections::HashMap;
use std::time::Duration;

fn validate_benchmark_results() 
{
    let results = HashMap::new();
    
    // Configure validator for your specific requirements
    let validator = BenchmarkValidator::new()
        .min_samples(20)                           // Require 20+ measurements
        .max_coefficient_variation(0.10)           // 10% maximum variability
        .require_warmup(true)                      // Detect warm-up periods
        .max_time_ratio(3.0)                       // 3x max/min ratio
        .min_measurement_time(Duration::from_micros(50)); // 50μs minimum duration

    // Validate all results with detailed analysis
    let validated_results = ValidatedResults::new(results, validator);

    println!("Reliability: {:.1}%", validated_results.reliability_rate());

    // Get detailed quality warnings
    if let Some(warnings) = validated_results.reliability_warnings() {
        println!("⚠️ Quality Issues Detected:");
        for warning in warnings {
            println!("  - {}", warning);
        }
    }

    // Work with only statistically reliable results
    let reliable_only = validated_results.reliable_results();
    println!("Using {}/{} reliable benchmarks for analysis", 
             reliable_only.len(), validated_results.results.len());
}
```

**Validation Criteria:**
- **Sample Size**: Ensure sufficient measurements for statistical power
- **Variability**: Detect a high coefficient of variation (stddev/mean; helper sketched below) indicating noise
- **Measurement Duration**: Flag measurements that may be timing-resolution limited
- **Performance Range**: Identify outliers and wide performance distributions  
- **Warm-up Detection**: Verify proper system warm-up for consistent results
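
The coefficient of variation behind the variability check is the standard ratio of standard deviation to mean; a minimal standalone helper (not part of benchkit's API) for computing it over raw samples:

```rust
// Coefficient of variation: stddev / mean. A result of 0.10 corresponds to
// the 10% `max_coefficient_variation` threshold configured above.
fn coefficient_of_variation( samples : &[ f64 ] ) -> f64
{
  let n = samples.len() as f64;
  let mean = samples.iter().sum::< f64 >() / n;
  let variance = samples.iter().map( | x | ( x - mean ).powi( 2 ) ).sum::< f64 >() / n;
  variance.sqrt() / mean
}
```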

**Warning Types:**
- `InsufficientSamples`: Too few measurements for reliable statistics
- `HighVariability`: Coefficient of variation exceeds threshold
- `ShortMeasurementTime`: Measurements may be affected by timer resolution
- `WidePerformanceRange`: Large ratio between fastest/slowest measurements
- `NoWarmup`: Missing warm-up period may indicate measurement issues

**Domain-Specific Validation:**
```rust
use benchkit::prelude::*;
use std::collections::HashMap;

fn domain_specific_validation() 
{
    let results = HashMap::new();
    
    // Real-time systems validation (very strict)
    let realtime_validator = BenchmarkValidator::new()
        .min_samples(50)
        .max_coefficient_variation(0.02)  // 2% maximum
        .max_time_ratio(1.5);             // Very tight timing

    // Interactive systems validation (balanced)  
    let interactive_validator = BenchmarkValidator::new()
        .min_samples(15)
        .max_coefficient_variation(0.15)  // 15% acceptable
        .require_warmup(false);           // Interactive may not show warmup

    // Batch processing validation (lenient)
    let batch_validator = BenchmarkValidator::new()
        .min_samples(10)
        .max_coefficient_variation(0.25)  // 25% acceptable  
        .max_time_ratio(5.0);             // Allow more variation

    // Apply appropriate validator for your domain
    let domain_results = ValidatedResults::new(results, realtime_validator);
}
```

**Quality Reporting:**
```rust
use benchkit::prelude::*;
use std::collections::HashMap;

fn generate_validation_report() 
{
    let results = HashMap::new();
    let validator = BenchmarkValidator::new();
    
    // Generate comprehensive validation report
    let validation_report = validator.generate_validation_report(&results);

    // Validation report includes:
    // - Summary statistics and reliability rates
    // - Detailed warnings with improvement recommendations  
    // - Validation criteria documentation
    // - Quality assessment for each benchmark
    // - Actionable steps to improve measurement quality

    println!("{}", validation_report);
}
```

</details>

<details>
<summary><strong>Complete Integration Examples</strong></summary>

Comprehensive examples demonstrating real-world usage patterns and advanced integration scenarios.

**Development Workflow Integration:**
```rust
use benchkit::prelude::*;

// Complete development cycle: benchmark → validate → document → commit
fn development_workflow() -> Result< (), Box< dyn std::error::Error > > 
{
    // Mock implementations for doc test
    fn quicksort_implementation() {}
    fn mergesort_implementation() {}
    
    // 1. Run benchmarks
    let mut suite = BenchmarkSuite::new("Algorithm Performance");
    suite.benchmark("quicksort", || quicksort_implementation());
    suite.benchmark("mergesort", || mergesort_implementation());
    let results = suite.run_all();
    
    // 2. Validate quality
    let validator = BenchmarkValidator::new()
        .min_samples(15)
        .max_coefficient_variation(0.15);
    let validated_results = ValidatedResults::new(results.results, validator);
    
    if validated_results.reliability_rate() < 80.0 {
        return Err("Benchmark quality insufficient for analysis".into());
    }
    
    // 3. Generate professional report
    let template = PerformanceReport::new()
        .title("Algorithm Performance Analysis")
        .include_statistical_analysis(true)
        .add_custom_section(CustomSection::new(
            "Development Notes",
            "Analysis conducted during algorithm optimization phase"
        ));
    
    let report = template.generate(&validated_results.results)?;
    
    // 4. Update documentation atomically
    let chain = MarkdownUpdateChain::new("README.md")?
        .add_section("Performance Analysis", report)
        .add_section("Quality Assessment", validated_results.validation_report());
    
    chain.execute()?;
    println!("✅ Development documentation updated successfully");
    
    Ok(())
}
```

**CI/CD Pipeline Integration:**
```rust
use benchkit::prelude::*;
use std::collections::HashMap;

// Automated performance regression detection
fn cicd_performance_check(baseline_results: HashMap<String, BenchmarkResult>, 
                          pr_results: HashMap<String, BenchmarkResult>) -> Result< bool, Box< dyn std::error::Error > > {
    // Validate both result sets
    let validator = BenchmarkValidator::new().require_warmup(false);
    let baseline_validated = ValidatedResults::new(baseline_results.clone(), validator.clone());
    let pr_validated = ValidatedResults::new(pr_results.clone(), validator);
    
    // Require high quality for regression analysis
    if baseline_validated.reliability_rate() < 90.0 || pr_validated.reliability_rate() < 90.0 {
        println!("❌ BLOCK: Insufficient benchmark quality for regression analysis");
        return Ok(false);
    }
    
    // Compare performance for regression detection
    let comparison = ComparisonReport::new()
        .title("Performance Regression Analysis")
        .baseline("baseline_version")
        .candidate("pr_version")
        .practical_significance_threshold(0.05);  // 5% regression threshold
    
    // Create combined results for comparison
    let mut combined = HashMap::new();
    combined.insert("baseline_version".to_string(), 
                   baseline_results.values().next().unwrap().clone());
    combined.insert("pr_version".to_string(), 
                   pr_results.values().next().unwrap().clone());
    
    let regression_report = comparison.generate(&combined)?;
    
    // Check for regressions
    let has_regression = regression_report.contains("slower");
    
    if has_regression {
        println!("❌ BLOCK: Performance regression detected");
        // Save detailed report for review
        std::fs::write("regression_analysis.md", regression_report)?;
        Ok(false)
    } else {
        println!("✅ ALLOW: No performance regressions detected");
        Ok(true)
    }
}
```

**Multi-Project Coordination:**
```rust
use benchkit::prelude::*;
use std::collections::HashMap;

// Coordinate benchmark updates across multiple related projects
fn coordinate_multi_project_benchmarks() -> Result< (), Box< dyn std::error::Error > > 
{
    let projects = vec!["web-api", "batch-processor", "realtime-analyzer"];
    let mut all_results = HashMap::new();
    
    // Collect results from all projects  
    for project in &projects {
        let project_results = run_project_benchmarks(project)?;
        all_results.extend(project_results);
    }
    
    // Cross-project validation with lenient criteria
    let validator = BenchmarkValidator::new()
        .max_coefficient_variation(0.25)  // Different environments have more noise
        .require_warmup(false);
    
    let cross_project_validated = ValidatedResults::new(all_results.clone(), validator);
    
    // Generate consolidated impact analysis
    let impact_template = PerformanceReport::new()
        .title("Cross-Project Performance Impact Analysis")
        .add_context("Shared library upgrade impact across all dependent projects")
        .include_statistical_analysis(true)
        .add_custom_section(CustomSection::new(
            "Project Impact Summary",
            format_project_impact_analysis(&projects, &all_results)
        ));
    
    let impact_report = impact_template.generate(&all_results)?;
    
    // Update shared documentation
    let shared_chain = MarkdownUpdateChain::new("SHARED_LIBRARY_IMPACT.md")?
        .add_section("Current Impact Analysis", &impact_report)
        .add_section("Quality Assessment", &cross_project_validated.validation_report());
    
    shared_chain.execute()?;
    
    // Notify project maintainers
    notify_project_teams(&projects, &impact_report)?;
    
    Ok(())
}

// Helper functions for the example
fn run_project_benchmarks(_project: &str) -> Result< HashMap< String, BenchmarkResult >, Box< dyn std::error::Error > > 
{
    // Mock implementation for doc test
    Ok(HashMap::new())
}

fn format_project_impact_analysis(_projects: &[&str], _results: &HashMap< String, BenchmarkResult >) -> String 
{
    // Mock implementation for doc test  
    "Impact analysis summary".to_string()
}

fn notify_project_teams(_projects: &[&str], _report: &str) -> Result< (), Box< dyn std::error::Error > > 
{
    // Mock implementation for doc test
    Ok(())
}
```

</details>

<details>
<summary><strong>Measure: Core Timing and Profiling</strong></summary>

At its heart, `benchkit` provides simple and accurate measurement primitives.

```rust
use benchkit::prelude::*;

// A robust measurement with multiple iterations and statistical cleanup.
let result = bench_function
(
  "summation_1000",
  ||
  {
    ( 0..1000 ).fold( 0, | acc, x | acc + x )
  }
);
println!( "Avg time: {:.2?}", result.mean_time() );
println!( "Throughput: {:.0} ops/sec", result.operations_per_second() );

// Track memory usage patterns alongside timing.
let memory_benchmark = MemoryBenchmark::new( "allocation_test" );
let ( timing, memory_stats ) = memory_benchmark.run_with_tracking
(
  10,
  ||
  {
    let data = vec![ 0u8; 1024 ];
    memory_benchmark.tracker.record_allocation( 1024 );
    std::hint::black_box( data );
  }
);
println!( "Peak memory usage: {} bytes", memory_stats.peak_usage );
```
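
For intuition: assuming one operation per iteration, throughput is simply the reciprocal of the mean time. A standalone sanity check:

```rust
use std::time::Duration;

// 250 µs per operation works out to 4,000 operations per second.
let mean = Duration::from_micros( 250 );
let ops_per_sec = 1.0 / mean.as_secs_f64();
assert_eq!( ops_per_sec.round() as u64, 4_000 );
```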

</details>

<details>
<summary><strong>Analyze: Find Insights and Regressions</strong></summary>

Turn raw numbers into actionable insights.

```rust
use benchkit::prelude::*;

// Compare multiple implementations to find the best one.
let report = ComparativeAnalysis::new( "Hashing" )
.algorithm( "fnv", || { /* ... */ } )
.algorithm( "siphash", || { /* ... */ } )
.run();

if let Some( ( fastest_name, _ ) ) = report.fastest()
{
  println!( "Fastest algorithm: {}", fastest_name );
}

// Example benchmark results
let result_a = bench_function( "test_a", || { /* ... */ } );
let result_b = bench_function( "test_b", || { /* ... */ } );

// Compare two benchmark results
let comparison = result_a.compare( &result_b );
if comparison.is_improvement()
{
  println!( "Performance improved!" );
}
```

</details>

<details>
<summary><strong>Generate: Create Realistic Test Data</strong></summary>

Stop writing boilerplate to create test data. `benchkit` provides generators for common scenarios.

```rust
use benchkit::prelude::*;

// Generate a comma-separated list of 100 items.
let list_data = generate_list_data( DataSize::Medium );

// Generate realistic unilang command strings for parser benchmarking.
let command_generator = DataGenerator::new()
.complexity( DataComplexity::Complex );
let commands = command_generator.generate_unilang_commands( 10 );

// Create reproducible data with a specific seed.
let mut seeded_gen = SeededGenerator::new( 42 );
let random_data = seeded_gen.random_string( 1024 );
```

</details>

<details>
<summary><strong>Document: Automate Your Reports</strong></summary>

The "documentation-first" philosophy is enabled by powerful report generation and file updating tools.

```rust,no_run
use benchkit::prelude::*;

fn main() -> Result< (), Box< dyn std::error::Error > >
{
  let mut suite = BenchmarkSuite::new( "api_performance" );
  suite.benchmark( "get_user", || { /* ... */ } );
  suite.benchmark( "create_user", || { /* ... */ } );
  let results = suite.run_analysis();

  // Generate a markdown report from the results.
  let markdown_report = results.generate_markdown_report().generate();

  // Automatically update the "## Performance" section of a file.
  let updater = MarkdownUpdater::new( "readme.md", "Performance" )?;
  updater.update_section( &markdown_report )?;
  
  Ok( () )
}
```
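
Conceptually, `update_section` replaces only the body under the matching heading and leaves the rest of the file untouched. An illustrative fragment (not verbatim output):

```markdown
## Performance
<!-- only this section's body is rewritten on each run -->
| Benchmark | Mean Time | Throughput |
|-----------|-----------|------------|
| get_user  | ...       | ...        |

## Other Sections (unchanged)
```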

</details>

## The `benchkit` Workflow

`benchkit` is designed to make performance analysis a natural part of your development cycle.

```text
[ 1. Write Code ] -> [ 2. Add Benchmark in `benches/` ] -> [ 3. Run `cargo bench` ]
       ^                                                               |
       |                                                               v
[ 5. Commit Code + Perf Docs ] <- [ 4. Auto-Update `benchmark_results.md` ] <- [ Analyze Results ]
```

## 📁 MANDATORY `benches/` Directory - NO ALTERNATIVES

**ABSOLUTE REQUIREMENT**: ALL benchmark-related files MUST be in the `benches/` directory. This is NON-NEGOTIABLE for proper benchkit functionality:

- 🚫 **NEVER in `tests/`**: Benchmarks are NOT tests and MUST NOT be mixed with unit tests
- 🚫 **NEVER in `examples/`**: Examples are demonstrations, NOT performance measurements  
- 🚫 **NEVER in `src/bin/`**: Source binaries are NOT benchmarks
- ✅ **ONLY in `benches/`**: This is the EXCLUSIVE location for ALL benchmark content

**Why This Requirement Exists:**

- 📦 **Cargo Requirement**: `cargo bench` ONLY works with the `benches/` directory (see the layout sketch below)
- 🏗️ **Ecosystem Standard**: ALL professional Rust projects use `benches/` EXCLUSIVELY
- 🔧 **Tool Compatibility**: IDEs, CI systems, linters expect benchmarks ONLY in `benches/`
- 📊 **Performance Isolation**: Benchmarks require different compilation and execution than tests
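
A minimal standard layout (file names illustrative):

```text
my_crate/
├── Cargo.toml               # optional [[bench]] entries declared here
├── src/
│   └── lib.rs
└── benches/
    └── performance_demo.rs  # ALL benchmark code lives here
```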

### Why This Matters

**Ecosystem Integration**: The `benches/` directory is the official Rust standard, ensuring compatibility with the entire Rust toolchain.

**Zero Configuration**: `cargo bench` automatically discovers and runs benchmarks in the `benches/` directory without additional setup.

**Community Expectations**: Developers expect to find benchmarks in `benches/` - this follows the principle of least surprise.

**Tool Compatibility**: All Rust tooling (IDEs, CI/CD, linters) is designed around the standard `benches/` structure.

### Automatic Documentation Updates

`benchkit` excels at maintaining comprehensive, automatically updated documentation in your project files:

```markdown
# Benchmark Results

## Algorithm Comparison

| Algorithm | Mean Time | Throughput | Relative |
|-----------|-----------|------------|----------|
| quicksort | 1.23ms    | 815 ops/s  | baseline |
| mergesort | 1.45ms    | 689 ops/s  | 1.18x    |
| heapsort  | 1.67ms    | 599 ops/s  | 1.36x    |

*Last updated: 2024-01-15 14:32:18 UTC*
*Generated by benchkit v0.12.0*

## Performance Trends

- quicksort maintains consistent performance across data sizes
- mergesort shows better cache behavior on large datasets
- heapsort provides predictable O(n log n) guarantees

## Test Configuration

- Hardware: 16-core AMD Ryzen, 32GB RAM
- Rust version: 1.75.0
- Optimization: --release
- Iterations: 1000 per benchmark
```

This documentation is automatically generated and updated every time you run benchmarks.

### Integration Examples

```rust,no_run
// ✅ In the `benches/` directory, run via `cargo test --bench performance_comparison`
// benches/performance_comparison.rs
use benchkit::prelude::*;

#[test]
fn benchmark_algorithms()
{
  let mut suite = BenchmarkSuite::new( "Algorithm Comparison" );
  
  suite.benchmark( "quick_sort", ||
  {
    // Your quicksort implementation
  });
  
  suite.benchmark( "merge_sort", ||  
  {
    // Your mergesort implementation
  });
  
  let results = suite.run_all();
  
  // Automatically update readme.md with results
  let updater = MarkdownUpdater::new( "readme.md", "Performance" ).unwrap();
  updater.update_section( &results.generate_markdown_report().generate() ).unwrap();
}
```

```rust,no_run
// ✅ In the `benches/` directory as a standalone executable (needs `harness = false`)
// benches/comprehensive_analysis.rs
use benchkit::prelude::*;

fn main()
{
  let mut comprehensive = BenchmarkSuite::new( "Comprehensive Performance Analysis" );
  
  // Add multiple benchmarks
  comprehensive.benchmark( "data_processing", || { /* code */ } );
  comprehensive.benchmark( "memory_operations", || { /* code */ } );
  comprehensive.benchmark( "io_operations", || { /* code */ } );
  
  let results = comprehensive.run_all();
  
  // Update readme.md with comprehensive report
  let report = results.generate_markdown_report();
  let updater = MarkdownUpdater::new( "readme.md", "Performance Analysis" ).unwrap();
  updater.update_section( &report.generate() ).unwrap();
  
  println!( "Updated readme.md with latest performance results" );
}
```

### 🔧 Feature Flag Recommendations

For optimal build performance and clean separation, put your benchmark code behind feature flags:

```rust,no_run
// ✅ A dedicated benchmark executable in `benches/` (declared with `harness = false`)
// benches/comprehensive_benchmark.rs
#[ cfg( feature = "enabled" ) ]
use benchkit::prelude::*;

#[ cfg( feature = "enabled" ) ]
fn main()
{
  let mut suite = BenchmarkSuite::new( "Comprehensive Performance Suite" );
  
  suite.benchmark( "algorithm_a", || { /* implementation */ } );
  suite.benchmark( "algorithm_b", || { /* implementation */ } );
  suite.benchmark( "data_structure_ops", || { /* implementation */ } );
  
  let results = suite.run_all();
  
  // Automatically update readme.md
  let updater = MarkdownUpdater::new( "readme.md", "Latest Results" ).unwrap();
  updater.update_section( &results.generate_markdown_report().generate() ).unwrap();
  
  println!( "Benchmarks completed - readme.md updated" );
}

#[ cfg( not( feature = "enabled" ) ) ]
fn main()
{
  println!( "Run with: cargo run --bin comprehensive_benchmark --features enabled" );
  println!( "Results will be automatically saved to readme.md" );
}
```

Add to your `Cargo.toml`. Note that optional dependencies must live under `[dependencies]` (Cargo rejects optional dev-dependencies), and a bench with its own `fn main()` needs `harness = false`:

```toml
[features]
enabled = [ "benchkit" ]

[dependencies]
benchkit = { version = "0.12.0", features = [ "full" ], optional = true }

[[bench]]
name = "comprehensive_benchmark"
harness = false
```

Run benchmarks selectively:
```bash
# Run only unit tests (fast)
cargo test

# Run a specific benchmark target (updates readme.md)
cargo bench --bench comprehensive_benchmark --features enabled

# Run all benchmark targets
cargo bench --features enabled
```

This approach keeps your regular builds fast while making comprehensive performance testing available when needed.

## 📚 Comprehensive Examples

`benchkit` includes extensive examples demonstrating every feature and usage pattern:

### 🎯 Feature-Specific Examples

- **[Update Chain Comprehensive](examples/update_chain_comprehensive.rs)**: Complete demonstration of atomic documentation updates
  - Single and multi-section updates with conflict detection  
  - Error handling and recovery patterns
  - Advanced conflict resolution strategies
  - Performance optimization for bulk updates
  - Full integration with validation and templates

- **[Templates Comprehensive](examples/templates_comprehensive.rs)**: Professional report generation in all scenarios
  - Basic and fully customized Performance Report templates
  - A/B testing with Comparison Report templates  
  - Custom sections with advanced markdown formatting
  - Multiple comparison scenarios and batch processing
  - Business impact analysis and risk assessment templates
  - Comprehensive error handling for edge cases

- **[Validation Comprehensive](examples/validation_comprehensive.rs)**: Quality assurance for reliable benchmarking
  - Default and custom validator configurations
  - Individual warning types with detailed analysis
  - Validation report generation and interpretation
  - Reliable results filtering for analysis
  - Domain-specific validation scenarios (research, development, production, micro)
  - Full integration with templates and update chains

- **[Regression Analysis Comprehensive](examples/regression_analysis_comprehensive.rs)**: Complete regression analysis system demonstration
  - All baseline strategies (Fixed, Rolling Average, Previous Run)
  - Performance trend detection (Improving, Degrading, Stable)
  - Statistical significance testing with configurable thresholds
  - Professional markdown report generation with regression insights
  - Real-world optimization scenarios and configuration guidance
  - Full integration with PerformanceReport templates

- **[Historical Data Management](examples/historical_data_management.rs)**: Managing long-term performance data
  - Incremental historical data building and TimestampedResults creation
  - Data quality validation and cleanup procedures
  - Performance trend analysis across multiple time windows
  - Storage and serialization strategy recommendations
  - Data retention and archival best practices
  - Integration with RegressionAnalyzer for trend detection

### 🔧 Integration Examples

- **[Integration Workflows](examples/integration_workflows.rs)**: Real-world workflow automation
  - Development cycle: benchmark → validate → document → commit
  - CI/CD pipeline: regression detection → merge decision → automated reporting
  - Multi-project coordination: impact analysis → consolidated reporting → team alignment
  - Production monitoring: continuous tracking → alerting → dashboard updates

- **[Error Handling Patterns](examples/error_handling_patterns.rs)**: Robust operation under adverse conditions
  - Update Chain file system errors (permissions, conflicts, recovery)
  - Template generation errors (missing data, invalid parameters)
  - Validation framework edge cases (malformed data, extreme variance)
  - System errors (resource limits, concurrent access)
  - Graceful degradation strategies with automatic fallbacks

- **[Advanced Usage Patterns](examples/advanced_usage_patterns.rs)**: Enterprise-scale benchmarking
  - Domain-specific validation criteria (real-time, interactive, batch processing)
  - Template composition and inheritance patterns
  - Coordinated multi-document updates with consistency guarantees
  - Memory-efficient large-scale processing (1000+ algorithms)
  - Performance optimization techniques (caching, concurrency, incremental processing)

- **[CI/CD Regression Detection](examples/cicd_regression_detection.rs)**: Automated performance validation in CI/CD pipelines
  - Multi-environment validation (development, staging, production)
  - Configurable regression thresholds and statistical significance levels
  - Automated performance gate decisions with proper exit codes
  - GitHub Actions compatible reporting and documentation updates
  - Progressive validation pipeline with halt-on-failure
  - Real-world CI/CD integration patterns and best practices

- **🚨 [Cargo Bench Integration](examples/cargo_bench_integration.rs)**: CRITICAL - Standard `cargo bench` integration patterns
  - Seamless integration with Rust's standard `cargo bench` command
  - Automatic documentation updates during benchmark execution
  - Standard `benches/` directory structure support
  - Criterion compatibility layer for zero-migration adoption
  - CI/CD integration with standard workflows and conventions
  - Real-world project structure and configuration examples
  - **This is the foundation requirement for benchkit adoption**

### 🚀 Running the Examples

```bash
# Feature-specific examples
cargo run --example update_chain_comprehensive --all-features
cargo run --example templates_comprehensive --all-features  
cargo run --example validation_comprehensive --all-features

# NEW: Regression Analysis Examples
cargo run --example regression_analysis_comprehensive --all-features
cargo run --example historical_data_management --all-features

# Integration examples
cargo run --example integration_workflows --all-features
cargo run --example error_handling_patterns --all-features
cargo run --example advanced_usage_patterns --all-features

# NEW: CI/CD Integration Example
cargo run --example cicd_regression_detection --all-features

# 🚨 CRITICAL: Cargo Bench Integration Example
cargo run --example cargo_bench_integration --all-features

# Original enhanced features demo
cargo run --example enhanced_features_demo --all-features
```

Each example is fully documented with detailed explanations and demonstrates production-ready patterns you can adapt to your specific needs.

## Installation

Add `benchkit` to your `[dev-dependencies]` in `Cargo.toml`.

```toml
[dev-dependencies]
# For core functionality
benchkit = "0.12"

# Or enable all features for the full toolkit
benchkit = { version = "0.12.0", features = [ "full" ] }
```

## 📋 Development Guidelines & Best Practices

**⚠️ IMPORTANT**: Before using benchkit in production or contributing to development, carefully review the comprehensive [`usage.md`](usage.md) file. It contains essential requirements, best practices, and lessons learned from real-world performance analysis work.

The recommendations cover:
- **Core philosophy** and toolkit vs framework principles
- **Technical architecture** requirements and feature organization
- **Performance analysis** best practices with standardized data patterns
- **Documentation integration** requirements for automated reporting
- **Statistical analysis** requirements for reliable measurements

**📖 Read [`usage.md`](usage.md) first** - it will save you time and ensure you're following proven patterns.

## Contributing

Contributions are welcome! `benchkit` aims to be a community-driven toolkit that solves real-world benchmarking problems. 

**Before contributing:**
1. **📖 Read [`usage.md`](usage.md)** - Contains all development requirements and design principles
2. Review open tasks in the [`task/`](task/) directory
3. Check our contribution guidelines

All contributions must align with the principles and requirements outlined in [`usage.md`](usage.md).

## License

This project is licensed under the **MIT License**.


## Performance

*This section is automatically updated by benchkit when you run benchmarks.*