elara-bench 0.2.0


ELARA Benchmark Suite

Production-grade benchmark suite for the ELARA Protocol using criterion for statistical analysis and performance tracking.

Overview

This crate provides comprehensive benchmarks for all critical ELARA Protocol components:

  • Wire Protocol: Frame encoding/decoding, header parsing, packet serialization
  • Cryptographic Operations: Encryption/decryption, signatures, key derivation
  • State Reconciliation: Version vector operations, causality checking, state merge
  • Time Engine: Time classification, clock operations, drift estimation

Running Benchmarks

Run All Benchmarks

cargo bench --package elara-bench

Run Specific Benchmark Suite

# Wire protocol benchmarks
cargo bench --package elara-bench --bench wire_protocol

# Crypto benchmarks
cargo bench --package elara-bench --bench crypto_operations

# State benchmarks
cargo bench --package elara-bench --bench state_reconciliation

# Time engine benchmarks
cargo bench --package elara-bench --bench time_engine

Run Specific Benchmark

cargo bench --package elara-bench --bench wire_protocol -- frame_encoding

Benchmark Results

Criterion generates detailed HTML reports in target/criterion/. Open target/criterion/report/index.html in a browser to view:

  • Statistical analysis (mean, median, std dev)
  • Performance trends over time
  • Regression detection
  • Detailed plots and charts

Performance Baselines

Wire Protocol

Expected performance on reference hardware (Intel i7-10700K, 32GB RAM):

Operation        | Payload Size | Throughput    | Latency
-----------------|--------------|---------------|--------
Frame Encoding   | 64B          | ~2M ops/sec   | ~500ns
Frame Encoding   | 1KB          | ~500K ops/sec | ~2μs
Frame Encoding   | 16KB         | ~50K ops/sec  | ~20μs
Frame Decoding   | 64B          | ~1.5M ops/sec | ~650ns
Frame Decoding   | 1KB          | ~400K ops/sec | ~2.5μs
Frame Decoding   | 16KB         | ~40K ops/sec  | ~25μs
Header Parse     | -            | ~10M ops/sec  | ~100ns
Header Serialize | -            | ~8M ops/sec   | ~125ns

Cryptographic Operations

Operation           | Payload Size | Throughput    | Latency
--------------------|--------------|---------------|--------
Identity Generation | -            | ~5K ops/sec   | ~200μs
Encryption          | 64B          | ~200K ops/sec | ~5μs
Encryption          | 1KB          | ~100K ops/sec | ~10μs
Encryption          | 16KB         | ~10K ops/sec  | ~100μs
Decryption          | 64B          | ~180K ops/sec | ~5.5μs
Decryption          | 1KB          | ~90K ops/sec  | ~11μs
Decryption          | 16KB         | ~9K ops/sec   | ~110μs
Sign                | -            | ~50K ops/sec  | ~20μs
Verify              | -            | ~20K ops/sec  | ~50μs
Key Derivation      | -            | ~10K ops/sec  | ~100μs

State Reconciliation

Operation      | Event Count | Throughput    | Latency
---------------|-------------|---------------|--------
VV Increment   | -           | ~50M ops/sec  | ~20ns
VV Get         | 100 entries | ~20M ops/sec  | ~50ns
VV Merge       | 10 events   | ~5M ops/sec   | ~200ns
VV Merge       | 100 events  | ~1M ops/sec   | ~1μs
VV Merge       | 1000 events | ~100K ops/sec | ~10μs
Happens-Before | 10 events   | ~10M ops/sec  | ~100ns
Happens-Before | 100 events  | ~2M ops/sec   | ~500ns
Happens-Before | 1000 events | ~200K ops/sec | ~5μs

Time Engine

Operation             | Throughput   | Latency
----------------------|--------------|--------
Engine Tick           | ~20M ops/sec | ~50ns
Time Classify         | ~10M ops/sec | ~100ns
Perceptual Clock Tick | ~50M ops/sec | ~20ns
State Clock Advance   | ~10M ops/sec | ~100ns
Network Model Update  | ~5M ops/sec  | ~200ns
Drift Estimation      | ~2M ops/sec  | ~500ns

Note: These are approximate baseline numbers. Actual performance varies based on hardware, system load, and configuration.

CI Integration

Benchmarks run automatically in CI on:

  • Push to main or develop branches (for performance-critical paths)
  • Pull requests affecting core components
  • Weekly schedule (every Monday at 2 AM UTC)
  • Manual workflow dispatch

Regression Detection

The CI workflow compares benchmark results against the baseline and detects regressions >10%. If a significant regression is detected, the workflow fails and requires investigation.
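For local comparisons, criterion's built-in baseline flags approximate what the workflow does. A rough sketch (the branch and baseline names here are illustrative, not the CI workflow's actual configuration):

```shell
# Record a baseline from the main branch
git checkout main
cargo bench --package elara-bench -- --save-baseline main

# Re-run on your branch; criterion reports changes relative to that baseline
git checkout my-feature
cargo bench --package elara-bench -- --baseline main
```

Arguments after `--` are forwarded to the criterion harness, so this works with any of the bench targets listed above.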

Baseline Updates

When changes are merged to main, the benchmark results become the new baseline for future comparisons.

Configuration

Benchmarks can be configured via BenchmarkConfig:

use elara_bench::BenchmarkConfig;
use std::time::Duration;

// Quick benchmarks for CI
let config = BenchmarkConfig::quick();

// Thorough benchmarks for baseline establishment
let config = BenchmarkConfig::thorough();

// Custom configuration
let config = BenchmarkConfig {
    warmup_iterations: 100,
    warmup_time: Duration::from_secs(3),
    measurement_time: Duration::from_secs(5),
    sample_size: 100,
};

Interpreting Results

Statistical Measures

  • Mean: Average execution time
  • Median: Middle value (50th percentile)
  • Std Dev: Variability in measurements
  • MAD: Median Absolute Deviation (robust measure of variability)

Performance Trends

Criterion tracks performance over time and generates plots showing:

  • Performance changes across commits
  • Regression/improvement detection
  • Statistical confidence intervals

Regression Threshold

A regression is considered significant if:

  • Performance degrades by >10%
  • Change is statistically significant (p < 0.05)
  • Consistent across multiple runs
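The percentage part of that check is straightforward. As a minimal sketch (a hypothetical helper, not the CI workflow's actual code, and it omits the statistical-significance and multi-run criteria above):

```rust
/// Returns true when the new measurement exceeds the regression
/// threshold: more than 10% slower than the baseline latency.
fn is_regression(baseline_ns: f64, current_ns: f64) -> bool {
    (current_ns - baseline_ns) / baseline_ns > 0.10
}

fn main() {
    assert!(is_regression(500.0, 600.0));  // 20% slower: flagged
    assert!(!is_regression(500.0, 540.0)); // 8% slower: within threshold
    assert!(!is_regression(500.0, 450.0)); // improvement: not flagged
    println!("threshold checks passed");
}
```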

Best Practices

Running Benchmarks

  1. Minimize system load: Close unnecessary applications
  2. Consistent environment: Use the same hardware for comparisons
  3. Multiple runs: Run benchmarks multiple times for reliability
  4. Warm cache: Criterion handles warmup automatically

Investigating Regressions

  1. Verify reproducibility: Run benchmarks multiple times
  2. Check recent changes: Review commits since last baseline
  3. Profile hot paths: Use profiling tools (perf, flamegraph)
  4. Compare assembly: Check for unexpected code generation changes

Adding New Benchmarks

  1. Add benchmark function to appropriate file
  2. Use black_box() to keep the compiler from optimizing away the code under measurement
  3. Set appropriate throughput for size-based benchmarks
  4. Document expected performance in this README
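Criterion's harness handles the timing loop itself, but the role of black_box is easy to see in a self-contained sketch. Here, checksum is a hypothetical stand-in for a real operation like frame encoding, and std::hint::black_box (Rust 1.66+) plays the same role as criterion's black_box:

```rust
use std::hint::black_box;
use std::time::Instant;

// Hypothetical function under test: XOR checksum of a payload.
// A real benchmark would call into elara-wire instead.
fn checksum(payload: &[u8]) -> u8 {
    payload.iter().fold(0, |acc, b| acc ^ b)
}

// Time `iterations` calls and return approximate ops/sec.
fn bench_checksum(payload_size: usize, iterations: u32) -> f64 {
    let payload = vec![0xABu8; payload_size];
    let start = Instant::now();
    for _ in 0..iterations {
        // black_box on input and output stops the compiler from
        // hoisting or deleting the call as dead code.
        black_box(checksum(black_box(&payload)));
    }
    iterations as f64 / start.elapsed().as_secs_f64()
}

fn main() {
    for &size in &[64, 1024, 16 * 1024] {
        let ops = bench_checksum(size, 10_000);
        println!("{size}B: {ops:.0} ops/sec");
    }
}
```

In an actual criterion bench, setting `Throughput::Bytes` on the benchmark group makes the HTML reports express size-based results as bytes/sec rather than raw iteration time.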

Profiling

For detailed profiling, use:

# Generate a flamegraph (requires the cargo-flamegraph subcommand)
cargo flamegraph --bench wire_protocol

# Use perf for detailed analysis
perf record --call-graph dwarf cargo bench --bench crypto_operations
perf report

Contributing

When adding new features:

  1. Add corresponding benchmarks
  2. Run benchmarks before and after changes
  3. Document expected performance
  4. Update baseline if intentional performance changes

License

MIT OR Apache-2.0