# Benchmark Results
Early benchmarking (January 2026, on an Apple M4 MacBook Air) suggested that
the Python wrapper was at least 4x faster than pymarc single-threaded, and
that the pure Rust library read on the order of 1M records/second. Those
benchmarks need to be updated and reconsidered: the comparison harness they
used is no longer in the repository, the parser has been substantially
optimized since, and the pymarc version measured against was not recorded.
Treat the multipliers as historical indications, not current measurements.
This page describes what the benchmark infrastructure measures today and the
procedure for producing numbers worth citing.
## What CI measures
Two CodSpeed jobs run on every pull request, both in simulation mode:
- **Rust**: criterion benches (`benches/marc_benchmarks.rs`,
`benches/error_handling_benchmarks.rs`) via
`.github/workflows/benchmark-rust.yml`
- **Python**: pytest benchmarks (`tests/python/test_benchmark_reading.py`,
`test_benchmark_writing.py`, `test_memory_benchmarks.py`) via
`.github/workflows/benchmark-python.yml`
Simulation mode executes each benchmark once under Valgrind and models its
cost from instruction counts and cache behavior. The result is deterministic:
the same code produces the same number regardless of runner speed, which makes
it reliable for detecting regressions between commits. It is **not wall-clock
time** — simulation results cannot be quoted as records/second.
Parallel-throughput benchmarks are excluded from CI because Valgrind
serializes threads, so multi-threaded speedup cannot be measured under
simulation. Measure parallel speedup locally instead, as described under
"Parallel throughput" below.
## Measuring locally
Local runs use real wall-clock time. For stable numbers: run on AC power, on
a quiet machine, and keep the warm-up and repeated rounds that criterion and
pytest-benchmark perform rather than shortening them.
### Single-threaded
```bash
# Rust (criterion)
cargo bench --bench marc_benchmarks
cargo bench --bench error_handling_benchmarks
# Python (pytest-benchmark)
uv run maturin develop --release
uv run pytest tests/python/ -m "benchmark and not slow" --benchmark-only -v
```
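A wall-clock timing becomes a records/second figure by dividing the record
count by the elapsed time. The following is a minimal sketch of that
arithmetic, not the project's harness: `read_all` is a hypothetical
zero-argument callable that reads one fixture and returns the number of
records it saw, and the round counts are arbitrary defaults.

```python
import statistics
import time
from typing import Callable


def records_per_second(read_all: Callable[[], int], rounds: int = 5, warmup: int = 1) -> dict:
    """Time `read_all` over several rounds and convert to records/second.

    `read_all` is a hypothetical callable supplied by the caller; it must
    read a whole fixture and return the number of records read.
    """
    for _ in range(warmup):
        read_all()  # warm caches; timing discarded

    timings = []
    count = 0
    for _ in range(rounds):
        start = time.perf_counter()
        count = read_all()
        timings.append(time.perf_counter() - start)

    # The median is less sensitive to a single noisy round than the mean.
    median = statistics.median(timings)
    return {
        "records": count,
        "rounds": rounds,
        "median_seconds": median,
        "records_per_second": count / median if median > 0 else float("inf"),
    }
```

criterion and pytest-benchmark already perform warm-up and repeated rounds, so
a helper like this is only relevant for ad hoc checks outside those frameworks.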
### Parallel throughput
```bash
# Rust (criterion, rayon)
cargo bench --bench parallel_benchmarks
# Python (ThreadPoolExecutor and ProducerConsumerPipeline)
uv run pytest tests/python/test_benchmark_parallel.py \
tests/python/test_benchmark_pipeline_parallel.py --benchmark-only -v
```
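Parallel speedup is the serial wall-clock time divided by the parallel
wall-clock time for the same work. A minimal sketch of that calculation with
`ThreadPoolExecutor` follows; `task` and `inputs` are hypothetical stand-ins
for whatever unit of work is being measured, and the sketch assumes the task
releases the GIL (otherwise threads will show little or no speedup).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def measure_speedup(task, inputs, workers):
    """Run `task` over `inputs` serially, then in a thread pool, and compare.

    `task` and `inputs` are supplied by the caller (hypothetical here).
    """
    inputs = list(inputs)

    start = time.perf_counter()
    for item in inputs:
        task(item)
    serial = time.perf_counter() - start

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(task, inputs))  # consume the iterator so all work finishes
    parallel = time.perf_counter() - start

    speedup = serial / parallel
    return {
        "serial_seconds": serial,
        "parallel_seconds": parallel,
        "speedup": speedup,
        "efficiency": speedup / workers,  # 1.0 would be perfect scaling
    }
```

Reporting efficiency alongside speedup makes it clear how far a result is from
linear scaling for the worker count used.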
## Producing a citable comparison
Any published figure — especially a comparison against pymarc — must come
from a run that records:
- the date of the run
- hardware: CPU model, core count, memory
- OS name and version
- Rust toolchain version and Python version
- the exact, pinned version of every library measured (including pymarc)
- the harness used, committed to this repository
- the fixture data and its size
A multiplier without this context is not reproducible and does not belong in
the documentation.
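
Most of the items above can be captured automatically at the start of a run.
The following is a minimal sketch using only the Python standard library; the
function name and output layout are illustrative assumptions, not part of this
repository. Memory size has no portable standard-library call, so it is left
to be filled in by hand, and the current git commit stands in for "the harness
used".

```python
import json
import os
import platform
import subprocess
from datetime import datetime, timezone
from importlib import metadata
from pathlib import Path


def _command_output(args):
    """Return a command's stdout, or None if the command is unavailable or fails."""
    try:
        result = subprocess.run(args, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


def _package_version(name):
    """Return the installed version of a distribution, or None if not installed."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def collect_run_context(packages, fixture_paths):
    """Gather the context a citable benchmark run should record."""
    return {
        "date": datetime.now(timezone.utc).isoformat(),
        "cpu": platform.processor() or platform.machine(),
        "logical_cores": os.cpu_count(),
        "memory": None,  # not available portably from the stdlib; record manually
        "os": f"{platform.system()} {platform.release()}",
        "python": platform.python_version(),
        "rustc": _command_output(["rustc", "--version"]),
        "harness_commit": _command_output(["git", "rev-parse", "HEAD"]),
        "packages": {name: _package_version(name) for name in packages},
        # Size in bytes, or None when the fixture file is missing.
        "fixtures": {
            str(p): (Path(p).stat().st_size if Path(p).exists() else None)
            for p in fixture_paths
        },
    }


if __name__ == "__main__":
    context = collect_run_context(
        packages=["pymarc"],  # add the distribution name of the wrapper under test
        fixture_paths=["tests/data/fixtures/10k_records.mrc"],
    )
    print(json.dumps(context, indent=2))
```

Saving this JSON next to the benchmark output keeps the figures and their
context together.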
## Test fixtures
Benchmark fixtures are synthetic MARC records generated by
`scripts/generate_benchmark_fixtures.py`, stored in `tests/data/fixtures/`:
- `1k_records.mrc` (~257 KB) — quick tests
- `10k_records.mrc` (~2.5 MB) — standard benchmarks
Synthetic fixtures are adequate for regression detection. Figures intended to
describe real-world performance should also be measured against
representative library data.
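
When reporting fixture data, a record count is a useful companion to the byte
size. The following is a minimal sketch that counts records without any MARC
library, relying on the ISO 2709 convention that the first five bytes of each
record's leader hold the total record length as ASCII digits. The function
name is an illustrative assumption, and the sketch does not validate record
contents.

```python
import io


def count_marc_records(stream):
    """Count ISO 2709 records in a binary stream by walking leader lengths."""
    count = 0
    while True:
        head = stream.read(5)
        if not head:
            return count  # clean end of file
        if len(head) < 5 or not head.isdigit():
            raise ValueError(f"bad record length field after {count} records")
        length = int(head)
        if length < 24:
            # A record can never be shorter than its 24-byte leader.
            raise ValueError(f"implausible record length {length}")
        body = stream.read(length - 5)
        if len(body) != length - 5:
            raise ValueError(f"truncated record after {count} records")
        count += 1


if __name__ == "__main__":
    with open("tests/data/fixtures/1k_records.mrc", "rb") as fixture:
        print(count_marc_records(fixture))
```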
## References
- Rust benchmarks: `benches/marc_benchmarks.rs`,
`benches/error_handling_benchmarks.rs`, `benches/parallel_benchmarks.rs`
- Python benchmarks: `tests/python/test_benchmark_*.py`
- Memory benchmarks: `tests/python/test_memory_benchmarks.py`
- Fixture generator: `scripts/generate_benchmark_fixtures.py`
- CI workflows: `.github/workflows/benchmark-rust.yml`,
`.github/workflows/benchmark-python.yml`