jja 0.6.0

Swiss army knife for chess file formats

This is jja, a command line utility to interact with various chess file formats. It is still in its early stages of development. The initial intention of the author was to convert their opening books, which were saved in ChessBase's proprietary CTG format, to the free and open PolyGlot format. Over time they intend to add support for other chess file formats (cbh, epd, pgn, si4, si5 and so on).

Formats

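As of version 0.6.0, jja supports reading/querying PolyGlot (bin), Arena (abk), ChessBase (ctg), ChessMaster (obk), and BrainLearn (exp) files, whereas it supports writing/converting to PolyGlot (bin) and Arena (abk) opening book files.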

As of version 0.5.0, jja supports exporting all the supported opening book formats to PGN. To use this functionality, specify an output file with pgn extension as an argument to jja edit.

During opening book conversion, jja uses the information provided in various input opening book formats to come up with a move weight which accompanies the move in the PolyGlot opening file. jja also writes some custom numbers in the learn field, such as NAGs during ctg conversion or priority during abk conversion. You may disable this custom usage using --no-learn as it may confuse other software making use of this field.

Note: writing Arena (abk) opening books is currently only supported when converting from ChessBase (ctg) books. Use the command line flags --author, --comment, --probability-priority, --probability-games, --probability-win-percent to configure ABK header metadata. Game statistics (minimum number of games/wins, win percentages for both sides) are managed automatically by jja.

In-place editing for Arena opening books is also possible using -i, --in-place=SUFFIX command line option. Conversion from PolyGlot, aka bin, and ChessMaster, aka obk opening books to Arena, aka abk opening book files is planned for a future release.

PGN Book Making

Since version 0.4.0, jja can make PolyGlot books out of PGN files. This feature is similar to polyglot make-book with the following differences:

  1. jja may directly read compressed PGN files .pgn.{bz2,gz,lz4,zst}
  2. jja can process PGN files bigger than your system's available memory, by persisting statistics in a temporary RocksDB database.
  3. jja scales move weights by default to prevent potential overflows with huge PGN files; use --no-scale to disable.
  4. jja can filter moves using Filter Expressions, allowing the user to filter-out unwanted games and create specialised opening books.
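
The source does not state the exact scaling formula, so the following is only a minimal sketch of the general idea behind item 3: when the largest accumulated weight no longer fits into the 16-bit PolyGlot weight field, all weights are scaled down proportionally. The function name scale_weights and the choice to keep non-zero weights at least 1 are assumptions made for illustration, not jja's actual code.

```rust
/// Hypothetical helper: scale raw move weights so that the largest one
/// fits into a 16-bit PolyGlot weight. Not taken from jja's source.
fn scale_weights(raw: &[u64]) -> Vec<u16> {
    let max = raw.iter().copied().max().unwrap_or(0);
    if max <= u64::from(u16::MAX) {
        // Nothing would overflow, so the weights are kept unchanged.
        return raw.iter().map(|&w| w as u16).collect();
    }
    raw.iter()
        .map(|&w| {
            // Widen to u128 so the multiplication itself cannot overflow.
            let scaled = u128::from(w) * u128::from(u16::MAX) / u128::from(max);
            // Assumption: a non-zero weight never becomes zero, so that
            // rare moves are not silently dropped from the book.
            if w > 0 && scaled == 0 {
                1
            } else {
                scaled as u16
            }
        })
        .collect()
}
```

With this sketch, weights that already fit are returned as they are, while a set such as 131070, 65535, 1 is scaled relative to the largest value.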

Filter Expressions

The filter expression string should contain filter conditions, which consist of a tag name, a comparison operator, and a value. The following operators are supported:

  • > (greater than)
  • >= (greater than or equal to)
  • < (less than)
  • <= (less than or equal to)
  • = (equal to)
  • != (not equal to)
  • =~ (regex match, case insensitive)
  • !~ (negated regex match, case insensitive)

Filter conditions can be combined using the following logical operators:

  • AND (logical AND)
  • OR (logical OR)

Example:

--filter="Event =~ World AND White =~ Carlsen AND ( Result = 1-0 OR ECO = B33 )"

Supported tags are Event, Site, Date, UTCDate, Round, Black, White, Result, BlackElo, WhiteElo, BlackRatingDiff, WhiteRatingDiff, BlackTitle, WhiteTitle, ECO, Opening, TimeControl, Termination, and ScidFlags.

In addition to these, there are four special variables, namely Player, Elo, Title, and RatingDiff. These variables may be used to match the relevant header from either one of the sides. For example, the filter:

--filter="Player =~ Carlsen"

is functionally equivalent to

--filter="( White =~ Carlsen OR Black =~ Carlsen )"
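
The equivalence above can be sketched as a simple rewrite step. This is an illustration only: the function expand_special is hypothetical, and how jja handles the negated operators (!= and !~) for special variables is not stated in this document, so the sketch simply joins both sides with OR as in the example.

```rust
/// Hypothetical helper: rewrite a condition on a special variable into
/// the equivalent two-sided condition. Returns None for ordinary tags.
fn expand_special(var: &str, op: &str, value: &str) -> Option<String> {
    let (white, black) = match var {
        "Player" => ("White", "Black"),
        "Elo" => ("WhiteElo", "BlackElo"),
        "Title" => ("WhiteTitle", "BlackTitle"),
        "RatingDiff" => ("WhiteRatingDiff", "BlackRatingDiff"),
        _ => return None,
    };
    // Tokens are separated by single spaces, matching the filter syntax.
    Some(format!("( {white} {op} {value} OR {black} {op} {value} )"))
}
```

Calling expand_special("Player", "=~", "Carlsen") yields the expanded expression shown above.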

Note: The filtering is designed to be simple and fast. The tokens, including parentheses, are split by whitespace. Quoting values is not allowed. For more sophisticated filtering needs, use pgn-extract.
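
To make the whitespace rule concrete, here is a minimal tokenizer sketch. The Token type and tokenize function are assumptions for illustration and do not mirror jja's internal parser; they only show why a parenthesis that is not surrounded by spaces would not be recognised as its own token.

```rust
/// Hypothetical token type for a filter expression.
#[derive(Debug, PartialEq, Eq)]
enum Token {
    Open,
    Close,
    And,
    Or,
    /// A tag name, a comparison operator, or a value.
    Word(String),
}

/// Split a filter expression on whitespace and classify each piece.
fn tokenize(expr: &str) -> Vec<Token> {
    expr.split_whitespace()
        .map(|piece| match piece {
            "(" => Token::Open,
            ")" => Token::Close,
            "AND" => Token::And,
            "OR" => Token::Or,
            other => Token::Word(other.to_string()),
        })
        .collect()
}
```

In this sketch "( Result = 1-0 )" produces five tokens, whereas "(Result = 1-0)" produces three plain words, the first of which is "(Result".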

Scid Flags

Scid uses one-character flags, DWBMENPTKQ!?U123456, for each field where:

  • D - Deleted
  • W - White opening
  • B - Black opening
  • M - Middlegame
  • E - Endgame
  • N - Novelty
  • P - Pawn structure
  • T - Tactics
  • K - Kingside play
  • Q - Queenside play
  • ! - Brilliancy
  • ? - Blunder
  • U - User-defined
  • 1..6 - Custom flags

It is ill-advised to rely on the order of the character flags.

Use a regex match if/when you can.
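
As an illustration of why order should not matter, the sketch below checks for flags by membership rather than by position. It is not how jja evaluates ScidFlags; has_flags is a hypothetical helper that only demonstrates an order-independent test.

```rust
/// Hypothetical helper: true if every flag in `wanted` occurs somewhere
/// in `flags`, regardless of the order in which the flags are stored.
fn has_flags(flags: &str, wanted: &str) -> bool {
    wanted.chars().all(|flag| flags.contains(flag))
}
```

For example, has_flags("WT!", "!W") holds even though the characters appear in a different order.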

Tips and Tricks about PGN Book Making

  1. The defaults run best on my laptop and in my personal benchmarks on the SourceHut build server; they're not universal truth.
  2. jja processes input PGN files in parallel. You can use this to your advantage by giving many split PGNs as input to increase parallelism and performance.
  3. Increasing batch size is good as long as you have constant memory usage. When threads can't keep up, you'll get increased memory usage so that's the point you really know your limit.
  4. Try increasing max open files to the point you don't get too many open files error from your operating system. You may specify --max-open-files=-1 to keep files always open.
  5. Try different compression algorithms for the temporary RocksDB database, or try disabling compression completely if you have enough space. jja writes the temporary RocksDB database in the same directory as the first pgn file argument. The default compression algorithm, Lz4, and the default compression level, 4, are aimed at speedy conversion with relatively moderate space usage. If you run out of space during conversion, try to use an algorithm like Zstd with an "ultra" level, i.e. a level greater than or equal to 20.
  6. Use filters which are processed during PGN traversal when possible. Because these filters are matched before writing the game data to the temporary database, when used wisely, they may have a vast impact on space and memory costs and therefore improve overall performance. These filters are --filter=<expr>, --max-ply=<ply>, --min-pieces=<piece-count>, --only-white, and --only-black.

Install

To compile from source, use cargo install jja. This requires the Rust Toolchain to be installed. In addition, you are going to need OpenSSL libraries on UNIX systems. Moreover, you need liburing on Linux. If you're on a Linux kernel older than 5.1 or you are unable to install liburing for another reason, you may disable the feature by building jja with cargo install jja --no-default-features.

As an alternative, release builds of jja are hosted on chesswob.org for 64-bit Linux and Windows. These versions are signed by GnuPG, using key D076A377FB27DE70. To install, acquire the latest version from chesswob.org, verify the checksum and the GnuPG signature:

$> export JJA_VERSION=0.5.0
$> export JJA_FLAVOUR=glibc
$> curl https://keybase.io/alip/pgp_keys.asc | gpg --import
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 13292  100 13292    0     0  13535      0 --:--:-- --:--:-- --:--:-- 26584
gpg: key D076A377FB27DE70: public key "Ali Polatel (Caissa AI) <alip@caissa.ai>" imported
gpg: Total number processed: 1
gpg:               imported: 1
$> for f in jja-${JJA_VERSION}-${JJA_FLAVOUR}.bin{,.sha512sum,.sha512sum.asc}; do wget -q https://chesswob.org/jja/${f}; done
$> gpg --verify jja-${JJA_VERSION}-${JJA_FLAVOUR}.bin.sha512sum.asc jja-${JJA_VERSION}-${JJA_FLAVOUR}.bin.sha512sum
gpg: Signature made Sun Mar 19 20:52:41 2023 CET
gpg:                using RSA key 5DF763560390A149AC6C14C7D076A377FB27DE70
gpg: Good signature from "Ali Polatel (Caissa AI) ...
$> sha512sum -c jja-${JJA_VERSION}-${JJA_FLAVOUR}.bin.sha512sum
jja: OK
$> sudo install -m755 jja-${JJA_VERSION}-${JJA_FLAVOUR}.bin /usr/local/bin

Finally, you may download the builds of the latest git version via the SourceHut build server. There are three flavours: windows, linux-glibc, and linux-musl. Simply browse to the latest build and download the artifact listed on the left. Note: these artifacts are kept for 90 days.

Usage

  • Use jja info to get brief information about the chess file.
  • Use jja find to search for a position in a chess file.
  • Use jja edit to edit opening book files and convert to PolyGlot files or Arena files.
  • Use jja make to compile PGN files into PolyGlot opening books.
  • Use jja dump to dump a PolyGlot or BrainLearn file as a stream of JSON arrays.
  • Use jja restore to restore a PolyGlot or BrainLearn file from a stream of JSON arrays.
  • Use jja merge to merge two PolyGlot opening books.
  • Use jja match to arrange book matches using random playouts.
  • Use jja play to make random playouts, optionally using books.
  • Use jja hash to calculate Zobrist hash of a given chess position.
  • Use jja open to browse ECO classification.
  • Use jja quote to print a chess quote.

jja determines the type of the file using its file extension. Files with the extension .bin are considered PolyGlot books. Files with the extension .ctg are considered ChessBase books. Files with the extension .abk are considered Arena books. Files with extension .obk are considered ChessMaster books. Files with extension .exp are considered BrainLearn experience files.
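
The mapping above can be sketched as a small lookup. The names BookKind and book_kind are hypothetical, and the sketch assumes extensions are compared exactly as written in lowercase; whether jja itself is case-sensitive is not stated here.

```rust
use std::path::Path;

/// Hypothetical enumeration of the file types listed above.
#[derive(Debug, PartialEq, Eq)]
enum BookKind {
    PolyGlot,
    ChessBase,
    Arena,
    ChessMaster,
    BrainLearn,
}

/// Guess the file type from the extension; None if it is not recognised.
fn book_kind(path: &Path) -> Option<BookKind> {
    // `extension()` yields the part after the last dot, without the dot.
    let ext = path.extension()?.to_str()?;
    match ext {
        "bin" => Some(BookKind::PolyGlot),
        "ctg" => Some(BookKind::ChessBase),
        "abk" => Some(BookKind::Arena),
        "obk" => Some(BookKind::ChessMaster),
        "exp" => Some(BookKind::BrainLearn),
        _ => None,
    }
}
```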

By default if the standard output is a TTY, jja will display information using fancy tables. Use --porcelain command line option to get the output in CSV (comma-separated values) format instead.
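
A minimal sketch of this decision, using the standard IsTerminal trait, could look as follows. The names OutputMode and choose_output are hypothetical. The document only describes the TTY case and the --porcelain flag; falling back to CSV when the output is not a TTY is an assumption of this sketch.

```rust
use std::io::{self, IsTerminal};

/// Hypothetical output modes.
#[derive(Debug, PartialEq, Eq)]
enum OutputMode {
    Table,
    Csv,
}

/// Report whether standard output is attached to a terminal.
fn stdout_is_tty() -> bool {
    io::stdout().is_terminal()
}

/// Pick the output mode from the TTY state and the --porcelain flag.
fn choose_output(is_tty: bool, porcelain: bool) -> OutputMode {
    if porcelain || !is_tty {
        // Assumption: non-TTY output is treated like --porcelain.
        OutputMode::Csv
    } else {
        OutputMode::Table
    }
}
```

A caller would combine the two as choose_output(stdout_is_tty(), porcelain).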

Acknowledgements

  • Thanks to Steinar H. Gunderson, for publishing the CTG Specification, and authoring the remoteglot tool: The CTG probing code in jja is very directly ported from their C probing code, and the specification has been an enormous help in clearing up various rough edges.
  • Thanks to Fabien Letouzey, the author of the original PolyGlot software: The PolyGlot probing, book making and merging code in jja is mostly ported from their respective C code.
  • Thanks to Michel Van den Bergh, the author of pg_utils, a collection of tools to interact with PolyGlot opening books: The PolyGlot book editing code of jja uses many ideas and code excerpts from pg_utils.
  • Thanks to Peter Österlund, the author of DroidFish: The ABK opening book interface code in jja makes use of ideas and code excerpts from DroidFish.
  • Thanks to Jens Nissen, the author of ChessX: The CTG and ABK probing code in jja uses ideas and code excerpts from ChessX.
  • Thanks to LiChess, the best chess website on the planet: The quote command of jja has a selection of quotes imported from the LiChess codebase.
  • Thanks to Shane Hudson, the author of Scid vs. PC: The jja open command uses the ECO classification which has been done by the Scid project. In addition, the PolyGlot editing code of jja uses ideas and code from Scid.
  • Thanks to Marcus Buffett, the author of chess-tactics-cli: The chessboard displaying code in PolyGlot and ABK edit screens is borrowed from chess-tactics-cli.

License

jja is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.

Bugs

Hey you, out there beyond the wall,
Breaking bottles in the hall,
Can you help me?

Report bugs to jja's bug tracker at https://todo.sr.ht/~alip/jja/:

  1. Always be polite, respectful, and kind: https://css-tricks.com/open-source-etiquette-guidebook/
  2. Keep your final change as small and neat as possible: https://tirania.org/blog/archive/2010/Dec-31.html
  3. Attaching poems with the bug report encourages consideration tremendously.

Jin, Jîyan, Azadî

I started hacking on this on International Women's Day 2023, a day to honor the achievements of women and advocate for their rights worldwide. As a person of Kurdish heritage, I am particularly moved by the slogan "Woman, Life, Freedom", which has become a symbol of resistance against oppression and a call for equality. In the spirit of free software and free speech, I strive to contribute to the creation of a more just and inclusive society, where every human being is granted the freedom to express themselves and pursue their dreams. I also honor the memory of Mahsa Amini, whose tragic death reminds us of the urgent need to fight for women's freedom and safety.

More on Wikipedia: Woman, Life, Freedom

ChangeLog

0.6.0

  • jja::pgnfilt::Operator and jja::pgnfilt::LogicalOperator implement Eq as well as PartialEq now.
  • new restore command to accompany the dump command which restores JSON serialized PolyGlot or BrainLearn file entries into the given output file.
  • fix SIGPIPE handling on UNIX systems so that jja does not panic when the output is piped to another program such as a pager.
  • find learned -z <HASH>, --hash=<HASH> to query PolyGlot opening books and BrainLearn experience files by Zobrist hash.
  • readonly support for BrainLearn experience file format. The subcommands info, dump, and find are able to handle files in BrainLearn experience file format with the extension .exp.
  • improve polyglot key lookup by reading only the key rather than the whole entry from file. The new public function polyglotbook::PolyGlotBook::read_book_key is used for that.
  • new dump command to dump the full contents of a PolyGlot opening book or a PGN file. The dump format of the PolyGlot opening book is JSON, whereas for PGN files this is CSV.
  • breaking change: polyglotbook::PolyglotBook::lookup_moves has been changed to take a zobrist hash of a chess position as an argument rather than the shakmaty::Chess position itself.
  • hash learned --signed to print Zobrist hashes as signed decimal numbers.
  • new module jja::file which exports utilities for binary file i/o.
  • breaking change: jja::polyglot::entry_{from,to}_file have been renamed to jja::polyglot::bin_entry_{from,to}_file.
  • use the XorShift random number generator to randomly pick moves during book matches using the match command. This algorithm is cryptographically insecure but is very fast.
  • fix match command panicking in certain cases when there is a book lookup miss.
  • Print more detailed build information on long version, --version output.
  • Memory map CTG files to speed up random access in return for increased memory usage. This brings a new dependency upon the crate memmap.
  • use buffered read/write in interacting with PolyGlot books which reduces the read/write system calls by a huge margin and thereby improves performance. The book member of PolyGlotBook is now a BufReader<File> rather than a File which is a breaking change.
  • important fix for calculating PolyGlot compatible Zobrist hashes wrt. en-passant legality. In PolyGlot format, en-passant moves are only pseudo-legal whereas previously the jja::hash::zobrist_hash function mistakenly checked for full legality.
  • breaking change: new type ctgbook::CtgTree which holds the new return value of the functions CtgBook::extract_all, and CtgBook::extract_all2. The tree element of CtgEntry which was used by these functions has also been dropped and the functions have been implemented in a much more performant way using considerably less memory. As a result, most ctg to abk/polyglot conversions are almost twice as fast.
  • breaking change: ctgbook::CtgEntry member uci's type has been changed from String to shakmaty::uci::Uci, and the nags member has been renamed to nag and its type has been changed from Option<String> to Option<Nag>.
  • ctg move comment entries were previously parsed and silently discarded; they're no longer parsed. Moreover, ctgbook::CtgEntry no longer has a comment member which is a breaking change.
  • breaking change: ctg::colored_uci function now accepts a shakmaty::uci::Uci rather than a UCI string. The order of the function arguments is also changed.
  • ctg: new type Ctg::Nag to abstract CTG NAG (Numeric Annotation Glyph) entries
  • obk entries with zero weight, are now assigned weight 1 during PolyGlot conversion to prevent skipping these entries. We plan to make this user-configurable in the future.
  • use the standard IsTerminal trait, and drop the dependency on is_terminal crate
  • use the standard to_le_bytes(), to_be_bytes(), and drop the dependency on byteorder crate
  • hash learned -e=<MODE>, --enpassant-mode=<MODE> to select when to include the en-passant square in Zobrist hash calculation.
  • important fix for encoding castling moves during book conversions to the PolyGlot format. See details in the respective issue
  • breaking change: polyglot.from_uci now expects a bool argument to correctly encode castling positions by determining whether the king to move is on their starting square.
  • bump minimum supported Rust version (MSRV) from 1.64 to 1.70 due to shakmaty bump
  • drop dependency on the unmaintained and insecure chrono crate
  • PolyGlotBook has two new public functions: find_book_key, and read_book_entry
  • hash learned -x, --hex to print hash as hexadecimal rather than decimal
  • upgrade pgn-reader crate from 0.24 to 0.25
  • upgrade shakmaty crate from 0.25 to 0.26
  • optimize various ctg functions
  • info learned to print the total number of positions for ctg opening books
  • breaking change: CtgBook::num_entries has been renamed to total_positions.
  • breaking change: CtgBook::total_pages function accepts a reference to self, rather than consuming self. The return type is usize now.
  • breaking change: Drop close functions of CtgBook and PolyGlotBook, improve CtgBook to close the cto file immediately after open, and keep a File, rather than an Option<File> in CtgBook. The now unused path element of CtgBook is also dropped.
  • find: simplify & optimize polyglot entry lookup
  • display license, version and author information in --help output
  • set minimum supported Rust version (MSRV) to 1.64 as determined by cargo-msrv
  • fix a bug which caused key and epd to be displayed incorrectly in editor screen
  • edit: implement --rescale for PolyGlot books. When specified, edit will rescale weights of all entries in the book, rather than a single entry. This is useful to quickly correct/optimize PolyGlot books which were generated without weight scaling.
  • fix build with i18n feature disabled
  • upgrade tempfile crate from 3.5 to 3.6
  • upgrade rust-embed crate from 6.6 to 6.7
  • upgrade once_cell crate from 1.17 to 1.18
  • upgrade ctrlc crate from 3.3 to 3.4
  • important fix CTG to Polyglot weight conversion which caused all entries in CTG books with missing performance information to be skipped from the output PolyGlot book. Read more about it in the respective issue.
  • make learned -H, --hashcode argument to skip duplicate games based on the HashCode PGN tag. PGN files may be tagged using pgn-extract --addhashcode
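
The --signed and --hex options of the hash command mentioned in this list only change how the same 64-bit Zobrist hash is printed. The sketch below illustrates the three representations; format_hash is a hypothetical helper, and both the zero-padded 16-digit lowercase hex form and the precedence of hex over signed are assumptions, not documented jja behaviour.

```rust
/// Hypothetical helper: render a 64-bit Zobrist hash as unsigned
/// decimal (default), signed decimal, or hexadecimal.
fn format_hash(hash: u64, signed: bool, hex: bool) -> String {
    if hex {
        // Assumption: lowercase, zero-padded to 16 hex digits.
        format!("{hash:016x}")
    } else if signed {
        // Reinterpret the same 64 bits as a two's complement integer.
        (hash as i64).to_string()
    } else {
        hash.to_string()
    }
}
```

For instance, the all-ones hash prints as -1 in signed mode, and the well-known PolyGlot key of the starting position, 0x463b96181691fc9c, prints as 463b96181691fc9c in hex mode.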

0.5.0

  • important fix, for encoding of castling in polyglot books. Previously, we encoded castling as e1g1, e1c1, e8g8, and e8c8, whereas the correct encoding is e1h1, e1a1, e8h8, and e8a8.
  • when printing version with --version, prefer git version over package version for git builds
  • upgrade clap crate from 4.2 to 4.3
  • upgrade ctrlc crate from 3.2 to 3.3
  • upgrade rocksdb crate to 0.21.0 which bundles RocksDB-8.1.1
  • document all the public code and enable the lint #[deny(missing_docs)]
  • edit learned to export opening books in PGN format, use an output file with pgn extension to export an opening book to a PGN
  • add back the build dependency upon the built crate
  • find learned -l <max-ply>, --line=<max-ply> to display lines from the opening book as a table of opening variations reverse-sorted by cumulative weight
  • make --min-score now accepts floating point values as argument rather than an unsigned 64-bit integer
  • make learned --win-factor, --draw-factor, and --loss-factor to specify respective factors during score calculation; the defaults, 2, 1, and 0 respectively, resemble the original polyglot tool
  • find --tree=<max-ply> no longer panics on broken pipe, so it's more convenient to use with a pager
  • make learned -p, --min-pieces to specify the minimum number of pieces on the board for a position to be included in the book, defaults to 8
  • merge learned -c, --weight-cutoff to specify the minimum weight of entries to be included in the book
  • merge learned about merge strategies avg and wavg to merge using average weight or weighted average weight respectively; the weighted averages should be specified for wavg using -w, --weight1, and -W, --weight2
  • in-place editing of polyglot files allows editing empty books which makes it practical to create polyglot book from scratch
  • merge learned about merge strategies max, min, ours, and sum, default is sum which adds together move weights, max picks the one with the maximum weight, min picks up the one with the minimum weight, and ours always picks the entries from the first book
  • translate the README to German language
  • turn i18n support into a feature which defaults to on. For static linking this must be disabled as embed makes use of proc macros.
  • drop dependency on unused libc crate
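
The merge strategies named in this list (sum, max, min, ours, avg, wavg) can be sketched for the case where the same move occurs in both books. This is an illustration under stated assumptions: the Strategy type and merge_weight function are hypothetical, and the saturating addition, the rounding, and the normalisation of the two wavg weights are choices made for the sketch rather than documented jja behaviour.

```rust
/// Hypothetical representation of the merge strategies.
#[derive(Clone, Copy)]
enum Strategy {
    Sum,
    Max,
    Min,
    Ours,
    Avg,
    /// Weighted average with one weight per input book.
    Wavg { w1: f64, w2: f64 },
}

/// Combine the weights of one move that is present in both books.
fn merge_weight(ours: u16, theirs: u16, strategy: Strategy) -> u16 {
    match strategy {
        // Assumption: clamp at the 16-bit maximum instead of wrapping.
        Strategy::Sum => ours.saturating_add(theirs),
        Strategy::Max => ours.max(theirs),
        Strategy::Min => ours.min(theirs),
        Strategy::Ours => ours,
        Strategy::Avg => ((u32::from(ours) + u32::from(theirs)) / 2) as u16,
        Strategy::Wavg { w1, w2 } => {
            let total = w1 + w2;
            if total <= 0.0 {
                // Degenerate weights: fall back to the first book.
                return ours;
            }
            ((f64::from(ours) * w1 + f64::from(theirs) * w2) / total).round() as u16
        }
    }
}
```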

0.4.1

  • fix docs.rs build
  • update shakmaty, and pgn-reader crates
  • add initial translation to Turkish language
  • add initial translation to German language
  • use gettext for i18n

0.4.0

  • replace atty with the better maintained is-terminal crate.
  • replace default-editor with the more advanced dialoguer crate.
  • replace colored with the more portable console crate.
  • match learned -i, --irreversible to prefer irreversible moves during random playouts. Pawn moves, captures, moves that destroy castling rights, and moves that cede en-passant are irreversible.
  • replace the progress_bar crate with the more portable and advanced indicatif crate.
  • make learned -B <games>, --batch-size=<games> to determine write batch size in number of games
  • Linux builds require liburing to be installed by default. This may be disabled using --no-default-features on installation.
  • make learned about ScidFlags, a set of character flags used by the Scid software, which may be used in filter expressions
  • make learned --debug flag to print information on matching filter expressions
  • quote learned many more chess quotes which were imported from goodreads.com
  • make learned --max-open-files=<int> to specify the maximum number of open files per thread opened by the temporary RocksDB database.
  • match command is now multithreaded and uses as many threads as there're cpus on the machine by default. Use -T=<threads>, --threads=<threads> or the JJA_NPROC environment variable to override
  • fix match to alternate colour between books during random playouts
  • quote accepts an optional index argument to print the quote at the specified index, rather than a random quote
  • quote learned many more chess quotes which were imported from archive.org, goodreads.com and nitter.net
  • make filter learned special variables Player, Elo, Title, and RatingDiff to match the relevant field from either colour
  • make learned --filter=<expression> to filter PGN games by headers. The filtering is designed to be simple and fast. The tokens, including parentheses, are split by whitespace. Quoting variables is not allowed. See jja make --help for more information on Filter Expressions
  • find learned -t <max-ply>, --tree=<max-ply> to display lines from the opening book as a tree using the nice termtree crate
  • drop the build dependency upon the built crate
  • drop the unused dependency on the hyphenation crate
  • replace csv crate with the more lightweight quick-csv crate
  • replace cli-table crate with the more lightweight prettytable-rs crate
  • use pgn-reader crate instead of pgnparse for parsing the --pgn commandline option
  • strip off the unneeded cli-table-derive dependency
  • fix hash subcommand option parsing causing panic
  • ctg find prioritises wins & draws over performance
  • make learned to preserve moves with null moves in the book with -0, --null
  • make learned to avoid scaling weights using --no-scale
  • make learned to configure compression for the rocksdb database using --compression={none,bzip2,lz4,lz4hc,snappy,zlib,zstd},--compression-level=<level>, defaults to lz4, level 4
  • use a temporary RocksDB database during pgn book make so as to better make use of memory, this reduces the performance a little, however in return makes importing huge PGN files possible.
  • new make command to make polyglot books out of pgn files, runs multithreaded with as many threads as the cpu number of the system by default, use -T, --threads to override, works transparently with compressed PGN files (zstd, bzip2, gunzip, lz4)

0.3.2

  • edit writes the name of the user and jja's version as comment to Arena opening book metadata on ChessBase to Arena opening book conversion, override with --author, --comment
  • edit displays a unicode chess board in edit tempfile
  • fix book traversal on Arena opening book to PolyGlot conversion
  • enable ansi colors when running on windows terminal
  • drop unixisms, cross-compiles for windows
  • fix yet another bug with castle decoding on polyglot read/query
  • edit no longer tries to spawn the default editor if standard output is not a TTY
  • fix find and edit for Arena opening book reading, move selection is on par with the Arena GUI
  • new hash command to calculate the Zobrist hash of the given position
  • fix infinite loop while converting some big Arena opening book files
  • improve hashing performance by avoiding double hashing using a hasher builder for Zobrist hashes
  • improve hashing performance using shakmaty crate's Zobrist Hashing implementation rather than the internal one.

0.3.1

  • edit learned to calculate & write ABK header game statistics fields
  • edit learned to convert CTG book files to ABK book files
  • edit learned --author and --comment to specify metadata for Arena opening books
  • edit can edit Arena opening book (abk) files in-place with -i, --in-place=SUFFIX
  • support for writing Arena (abk) opening books
  • open learned to wrap long ECO opening lines into multiple lines
  • find no longer panics on some abk books with entries having invalid uci
  • edit takes move priority into account for weight on abk to bin conversion
  • match learned --move-selection={best_move,uniform_random,weighted_random} to pick move selection algorithm for book moves
  • fix castle decoding on polyglot read/query
  • fix error return when no positions found in abk, obk and ctg find
  • fix promotion handling in ctg edit
  • many improvements to ctg find (move coloring & sorting, average statistics)
  • new merge command to merge two PolyGlot opening books
  • new match command to arrange book matches with random playouts
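
The three move selection modes named in this list (best_move, uniform_random, weighted_random) can be sketched as follows. Everything here is illustrative: XorShift64, Selection and pick are hypothetical names, the generator is the classic 64-bit xorshift with shifts 13, 7 and 17 (the variant jja uses is not stated in this document), and the simple modulo reduction introduces a small bias that a careful implementation might avoid.

```rust
/// Minimal 64-bit xorshift generator; fast but not cryptographically secure.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // The state must never be zero, so substitute a fixed constant.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Hypothetical counterpart of the --move-selection values.
enum Selection {
    BestMove,
    UniformRandom,
    WeightedRandom,
}

/// Return the index of the chosen (move, weight) pair, or None if empty.
fn pick(moves: &[(&str, u16)], how: Selection, rng: &mut XorShift64) -> Option<usize> {
    if moves.is_empty() {
        return None;
    }
    let uniform = |rng: &mut XorShift64| (rng.next_u64() % moves.len() as u64) as usize;
    match how {
        Selection::BestMove => moves
            .iter()
            .enumerate()
            .max_by_key(|(_, m)| m.1)
            .map(|(i, _)| i),
        Selection::UniformRandom => Some(uniform(rng)),
        Selection::WeightedRandom => {
            let total: u64 = moves.iter().map(|m| u64::from(m.1)).sum();
            if total == 0 {
                // Assumption: with no usable weights, pick uniformly.
                return Some(uniform(rng));
            }
            // Draw a ticket in [0, total) and walk the cumulative weights.
            let mut ticket = rng.next_u64() % total;
            for (i, m) in moves.iter().enumerate() {
                let w = u64::from(m.1);
                if ticket < w {
                    return Some(i);
                }
                ticket -= w;
            }
            None
        }
    }
}
```

In the weighted mode a move with weight zero is never selected as long as some other move has a positive weight.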

0.3.0

  • refactor code to unify various opening book reader interfaces
  • support obk version 1 as well as 2 (ChessMaster books with and without notes)
  • support for reading obk (ChessMaster) books and converting them to polyglot books
  • do not display progress bar if standard output is not a TTY
  • support for reading abk (Arena) books and converting them to polyglot books
  • new play command can be used to make random playouts using opening books

0.2.1

  • weight conversion in ctg to polyglot edit can be tuned with --nag-weight-{good,mistake,hard,blunder,interesting,dubious,forced}=
  • edit learned --no-scale to avoid scaling weights globally to fit into 16 bits
  • the code is now relatively well documented
  • edit --in-place now properly deletes the output temp file on interrupt
  • edit filters out moves with zero weights, use -0, --null to preserve them

0.2.0

  • edit window lists position info (key, epd, legal moves) as comment
  • edit no longer silently discards illegal moves
  • edit can edit PolyGlot files in-place with -i, --in-place=SUFFIX
  • edit can convert CTG opening books into PolyGlot opening books
  • default to start position when no --fen or --pgn is given for edit and find
  • info prints number of total pages in CTG books

0.1.1

  • new positions can be added to polyglot files
  • many bugs fixed with polyglot edit
  • quote command added to print a random chess quote
  • open command added to query ECO classification

0.1.0

  • edit polyglot files, only editing present positions work
  • read polyglot files
  • read ctg files