# Python Wrapper Benchmark Tests
This directory contains performance benchmarks for the MRRC Python wrapper (PyO3/Maturin).
## Structure
- `conftest.py` - Pytest configuration and fixtures for test data loading
- `test_benchmark_reading.py` - Benchmarks for record reading and field access
- `test_benchmark_writing.py` - Benchmarks for record writing and roundtrip operations
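The data-loading fixtures can be sketched roughly like this (fixture and variable names here are assumptions for illustration, not the actual contents of `conftest.py`):

```python
# Sketch of a conftest.py fixture for locating benchmark data.
# `fixture_1k` and FIXTURE_DIR are illustrative names, not the real API.
from pathlib import Path

import pytest

FIXTURE_DIR = Path(__file__).parent / ".." / "data" / "fixtures"

@pytest.fixture(scope="session")
def fixture_1k() -> Path:
    """Path to the 1k-record fixture; skip if it hasn't been generated."""
    path = FIXTURE_DIR / "1k_records.mrc"
    if not path.exists():
        pytest.skip("run scripts/generate_benchmark_fixtures.py first")
    return path
```

Session scope means the path is resolved once per test run rather than per test.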
## Test Fixtures
Large test fixtures are stored in `../data/fixtures/`:
- `1k_records.mrc` (0.25 MB) - Small fixture for quick tests
- `10k_records.mrc` (2.52 MB) - Medium fixture for standard benchmarks
Fixtures are automatically generated by `scripts/generate_benchmark_fixtures.py`.
## Running Benchmarks
### Quick benchmarks (1k records only)
```bash
pytest tests/python/ -v --benchmark-only -m "not slow"
```
### All benchmarks (includes 10k)
```bash
pytest tests/python/ -v --benchmark-only
```
### Detailed benchmark output with comparison
```bash
pytest tests/python/ -v --benchmark-only --benchmark-autosave --benchmark-compare
```
`--benchmark-compare` compares against the most recent saved run, so it needs at least one prior run saved with `--benchmark-autosave`.
### Save runs and compare them
```bash
pytest tests/python/ --benchmark-only --benchmark-autosave
pytest-benchmark compare --group-by=func --sort=name
```
Use `--benchmark-json=.benchmarks/results.json` instead of `--benchmark-autosave` to export a single machine-readable results file.
### Run specific benchmark
```bash
pytest tests/python/test_benchmark_reading.py::TestReadingBenchmarks::test_read_1k_records -v --benchmark-only
```
## Benchmark Types
### Reading Benchmarks (`test_benchmark_reading.py`)
- **test_read_Nk_records**: Pure reading performance - how fast can we parse records?
- **test_read_and_extract_titles_Nk**: Realistic workload - read records and extract data
### Writing Benchmarks (`test_benchmark_writing.py`)
- **test_write_only_Nk**: Pure writing performance - how fast can we serialize records?
- **test_roundtrip_Nk**: Read + write combined - tests the full pipeline
- **test_stream_write_Nk**: Streaming pattern - process records one at a time
## Interpreting Results
Pytest-benchmark provides these metrics:
- **min/max/mean**: Timing statistics across runs
- **rounds**: Number of times the test was executed
- **stddev**: Standard deviation of timing measurements
- **iterations**: How many times the operation was repeated per round
Example output:
```
test_read_1k_records     1,000 records/call    50.43 ms/call
test_read_10k_records    10,000 records/call   506.2 ms/call
```
This means:
- Each call reads an entire fixture (~1,000 or ~10,000 records)
- Reading 1,000 records takes ~50 ms; 10,000 records take ~500 ms
- Throughput: ~20,000 records/second in both cases
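The throughput figure falls out of the per-call numbers directly; a quick sanity check in Python:

```python
# Records per second = records handled per call / seconds per call.
records_per_call = 1_000
ms_per_call = 50.43

throughput = records_per_call / (ms_per_call / 1_000)
print(round(throughput))  # ~19,829 records/second, i.e. roughly 20k
```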
## Performance Goals
Based on the pymarc baseline:
- **Reading**: Target 5-10x faster than pure Python pymarc
- **Writing**: Target 3-5x faster than pure Python pymarc
- **Memory**: Similar or better memory efficiency (Rust backing)
## Generating New Fixtures
To generate test fixtures:
```bash
python scripts/generate_benchmark_fixtures.py
```
This creates:
- 1,000 records (quick testing)
- 10,000 records (standard benchmarks)
Fixtures include:
- Diverse record types (books, authority records)
- Varying field counts and subfield patterns
- Realistic MARC data structure
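The fixtures are ordinary binary MARC (ISO 2709) records. As a rough sketch of the structure the generator emits (this is not the actual script; `build_record` and its leader defaults are illustrative):

```python
FT, RT = b"\x1e", b"\x1d"  # ISO 2709 field and record terminators

def build_record(fields):
    """Assemble one binary MARC record from (tag, field_bytes) pairs."""
    directory = b""
    data = b""
    for tag, value in fields:
        entry = value + FT
        # Directory entry: 3-char tag, 4-digit length, 5-digit offset.
        directory += f"{tag:>3}{len(entry):04d}{len(data):05d}".encode()
        data += entry
    base_address = 24 + len(directory) + 1   # leader + directory + FT
    body = directory + FT + data + RT
    record_length = 24 + len(body)
    # 24-byte leader: record length (pos 0-4), flags, base address (12-16).
    leader = f"{record_length:05d}nam a22{base_address:05d} a 4500".encode()
    return leader + body

record = build_record([
    ("001", b"000000001"),
    ("245", b"10\x1faAn example title"),  # indicators + $a subfield
])
```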
## CI Integration
Benchmarks run in GitHub Actions via:
- `.github/workflows/python-build.yml` - Builds wheels and runs tests
- `.github/workflows/python-release.yml` - Optional CodSpeed integration (future)
To prevent slow benchmarks from timing out in CI, they're marked with `@pytest.mark.slow`.
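The marker has to be registered so pytest doesn't warn about unknown marks; a sketch of the relevant `pyproject.toml` fragment (the exact section contents are an assumption):

```toml
[tool.pytest.ini_options]
markers = [
    "slow: benchmarks too slow for the default CI run (deselect with -m 'not slow')",
]
```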
## Three-Way Comparison
For performance comparison with pymarc and raw Rust:
```bash
# Install pymarc for comparison (optional)
pip install pymarc
# Run comparison script (future: mrrc-9ic.12.3)
python scripts/benchmark_comparison.py
```
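Until the dedicated comparison script lands, a minimal timing harness gives a rough three-way picture (a sketch; `time_workload` and the best-of-N approach are illustrative, not the future script's actual design):

```python
import time

def time_workload(fn, repeats=5):
    """Best-of-N wall-clock time for fn(); taking the minimum reduces
    noise from the scheduler and other processes."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best

# Usage idea: time the same read via mrrc and pymarc, then compare, e.g.
# ratio = time_workload(read_with_pymarc) / time_workload(read_with_mrrc)
```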
## Troubleshooting
### "Fixture not found" error
Ensure fixtures are generated:
```bash
python scripts/generate_benchmark_fixtures.py
```
### Benchmarks are slow/timing out
- Adjust `benchmark_min_rounds` in `pyproject.toml` for faster iteration
- Build with `maturin develop` instead of a full wheel build
### Results vary significantly between runs
This is normal for benchmarks on shared/busy systems:
- Close other applications
- Use `--benchmark-compare-fail=mean:10%` to set tolerance
- Run the suite several times with `--benchmark-autosave` and compare the saved runs