pgn-reader
A fast, non-allocating, streaming reader for chess games in PGN notation. Experimental.
Design priorities
In this order:
- Safe to run on untrusted inputs. No panics, no denial of service (linear time complexity).
- Correct on valid PGNs.
- Performance. One of the fastest PGN parsers in any language.
- Reasonable behavior on invalid PGNs. Common quirks may be supported, but only if there is essentially no performance cost.
- Usability. Basic operation can be quite verbose and users need to bring their own in-memory representation for games.
Introduction
Reader parses games and calls methods of a user-provided Visitor.
Implementing custom visitors allows for maximum flexibility:
- The reader itself does not allocate (besides a single fixed-size buffer). The visitor can decide if and how to represent games in memory.
- The reader does not validate move legality. This allows implementing support for custom chess variants, or delaying move validation.
- The visitor can short-circuit and let the reader use a fast path for skipping games or variations.
Example
A visitor that counts the number of syntactically valid moves in the mainline of each game.
use ;
use ;
;
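The visitor idea can be illustrated without the crate itself. The following is a minimal, self-contained sketch and not pgn-reader's actual API: the `Visitor` trait, `read_game`, `emit`, and `MoveCounter` are hypothetical names defined only for this illustration, and the toy tokenizer treats any token that starts with a letter (after stripping a leading move number) as a move. It shows the two points described above: the reader side builds no game representation, and the visitor can ask the reader to skip variations.

```rust
/// Hypothetical visitor trait for this sketch (not the crate's real trait).
trait Visitor {
    type Output;
    /// Called for each move-like token that is not skipped.
    fn san(&mut self, san: &str);
    /// Called at "(". Return true to skip the whole variation.
    fn begin_variation(&mut self) -> bool {
        false
    }
    /// Called at ")" for variations that were not skipped.
    fn end_variation(&mut self) {}
    /// Called once at the end of the movetext.
    fn end_game(&mut self) -> Self::Output;
}

/// Strips a leading move number ("2." or "2...") and forwards move-like tokens.
fn emit<V: Visitor>(token: &str, visitor: &mut V) {
    let rest = token.trim_start_matches(|c: char| c.is_ascii_digit() || c == '.');
    if rest.starts_with(|c: char| c.is_ascii_alphabetic()) {
        visitor.san(rest);
    }
}

/// Toy single-pass reader: borrows slices of the input, allocates nothing.
fn read_game<V: Visitor>(movetext: &str, visitor: &mut V) -> V::Output {
    let mut skip_depth = 0usize; // > 0 while inside a skipped variation
    let mut in_comment = false; // true between '{' and '}'
    let mut start: Option<usize> = None; // byte offset of the current token

    for (i, c) in movetext.char_indices() {
        if in_comment {
            in_comment = c != '}';
            continue;
        }
        let is_delim = c.is_whitespace() || matches!(c, '{' | '(' | ')');
        if !is_delim {
            if start.is_none() {
                start = Some(i);
            }
            continue;
        }
        // A delimiter ends the current token, if any.
        if let Some(s) = start.take() {
            if skip_depth == 0 {
                emit(&movetext[s..i], visitor);
            }
        }
        match c {
            '{' => in_comment = true,
            '(' => {
                // Inside a skipped variation the visitor is not consulted again.
                if skip_depth > 0 || visitor.begin_variation() {
                    skip_depth += 1;
                }
            }
            ')' => {
                if skip_depth > 0 {
                    skip_depth -= 1;
                } else {
                    visitor.end_variation();
                }
            }
            _ => {}
        }
    }
    if let Some(s) = start {
        if skip_depth == 0 {
            emit(&movetext[s..], visitor);
        }
    }
    visitor.end_game()
}

/// Counts mainline moves by skipping every variation.
struct MoveCounter {
    moves: usize,
}

impl Visitor for MoveCounter {
    type Output = usize;

    fn san(&mut self, _san: &str) {
        self.moves += 1;
    }

    fn begin_variation(&mut self) -> bool {
        true // stay in the mainline
    }

    fn end_game(&mut self) -> usize {
        std::mem::take(&mut self.moves)
    }
}

fn main() {
    let pgn = "1. e4 e5 2. Nf3 (2. f4) { game paused due to bad weather } 2... Nf6 *";
    let moves = read_game(pgn, &mut MoveCounter { moves: 0 });
    println!("mainline moves: {moves}"); // prints "mainline moves: 4"
}
```

Because the visitor owns all state, a different implementation could collect moves into a vector, validate them against a board, or ignore them entirely, without any change to the reading loop.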
Documentation
Benchmarks (v0.28.0)
Run with lichess_db_standard_rated_2018-10.pgn, a very orderly PGN file with additional headers and many small comments for evaluations and clock times. It contains 24,784,600 games (50,307 MiB uncompressed) and was read from tmpfs on an AMD Ryzen 9 9950X @ 4.3 GHz, with the benchmarks compiled with Rust 1.88.0:
| Benchmark | Time | Throughput (games) | Throughput (data) |
|---|---|---|---|
| examples/stats.rs | 50.6 s | 489,814 /s | 994 MiB/s |
| examples/validate.rs | 116.8 s | 212,197 /s | 431 MiB/s |
| examples/parallel_validate.rs (1 + 3 threads) | 62.5 s | 396,554 /s | 805 MiB/s |
| grep -F "[Event " -c | 24.0 s | 1,032,691 /s | 2,096 MiB/s |
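The two throughput columns appear to follow directly from the game count, the file size, and the measured time. As a sanity check, here is a small sketch that recomputes them for the three example programs. The helper `per_second` is a hypothetical name, and because the listed times are rounded to 0.1 s, the recomputed values can differ from the table in the last digit (as they would for the grep row).

```rust
/// Rate per second, rounded to the nearest integer.
fn per_second(total: f64, seconds: f64) -> u64 {
    (total / seconds).round() as u64
}

fn main() {
    const GAMES: f64 = 24_784_600.0; // games in the benchmark file
    const MIB: f64 = 50_307.0; // uncompressed size in MiB

    // (benchmark, wall-clock seconds) as listed in the table
    let runs = [
        ("examples/stats.rs", 50.6),
        ("examples/validate.rs", 116.8),
        ("examples/parallel_validate.rs", 62.5),
    ];

    for (name, secs) in runs {
        println!(
            "{name}: {} games/s, {} MiB/s",
            per_second(GAMES, secs),
            per_second(MIB, secs)
        );
    }
}
```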
License
pgn-reader is licensed under the GPL-3.0 (or any later version at your option).