aprender 0.31.2

Next-generation ML framework in pure Rust — `cargo install aprender` for the `apr` CLI
Documentation
<!-- PCU: examples-bench-comparison | contract: contracts/apr-page-examples-bench-comparison-v1.yaml -->
<!-- Example: cargo run -p aprender-core --example bench_comparison -->
<!-- Status: enforced -->

# Benchmark Comparison

This example demonstrates how to compare performance across different implementations and configurations in the aprender ecosystem.

## Overview

The `bench_comparison` example provides a standardized way to measure and compare:

- CPU vs GPU performance
- Different quantization levels (Q4_K, Q8, F16, F32)
- Inference throughput (tokens per second)
- Memory bandwidth utilization

## Running the Benchmark

```bash
cargo run --release --example bench_comparison
```

## Key Metrics

| Metric | Description |
|--------|-------------|
| **tok/s** | Tokens generated per second |
| **Bandwidth** | Memory throughput (GB/s) |
| **Latency** | Time per token (ms) |
| **Efficiency** | % of theoretical peak |

## See Also

<!-- Performance Profiling and Quantization Guide chapters not yet written -->