Swiss army knife for chess file formats
This is jja, a command line utility to interact with various chess file formats. It is still in its early stages of development. The author's initial intention was to convert their opening books, which were saved in ChessBase's proprietary CTG format, to the free and open PolyGlot format. Over time they intend to add support for other chess file formats (cbh, epd, pgn, si4, si5 and so on).
Formats
As of version 0.6.0, jja supports reading/querying:
- PolyGlot, aka bin
- Arena, aka abk
- ChessBase, aka ctg
- ChessMaster, aka obk (version 1 and 2, with or without text notes)
- BrainLearn, aka exp
opening book files, whereas it supports writing/converting to:
- PolyGlot, aka bin
- Arena, aka abk
opening book files.
As of version 0.5.0, jja supports exporting all the supported opening book
formats to PGN. To
use this functionality, specify an output file with pgn extension as an argument
to jja edit.
During opening book conversion, jja uses the information provided in various
input opening book formats to come up with a move weight which accompanies the
move in the PolyGlot opening file.
jja also writes some custom numbers in the learn field, such as
NAGs during ctg
conversion or priority during abk conversion. You may disable this custom
usage using --no-learn as it may confuse other software making use of this field.
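As a rough illustration of where the weight and the learn field live, here is a minimal Python sketch of the 16-byte big-endian PolyGlot entry layout (key, move, weight, learn). The helper names `pack_entry` and `unpack_entry` and the sample values are assumptions made for this example; they are not part of jja.

```python
import struct

# One PolyGlot book entry is 16 bytes, big-endian:
#   key    (u64)  Zobrist hash of the position
#   move   (u16)  encoded move
#   weight (u16)  relative move weight
#   learn  (u32)  free-form field; other software may interpret it differently
ENTRY = struct.Struct(">QHHI")

def pack_entry(key, move, weight, learn=0):
    """Serialise one entry; struct.error is raised if a field overflows its width."""
    return ENTRY.pack(key, move, weight, learn)

def unpack_entry(raw):
    """Return the tuple (key, move, weight, learn) from 16 raw bytes."""
    return ENTRY.unpack(raw)

# Illustrative values only.
raw = pack_entry(0x463B96181691FC9C, 0x031C, 100, 0)
print(len(raw), unpack_entry(raw))
```

Because the weight is only 16 bits wide and the learn field has no fixed meaning across tools, both the scaling and the --no-learn option matter when a book is shared with other programs.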
Note that writing Arena, aka abk, opening book files
is only supported when converting from
ChessBase, aka ctg, books. Use the command
line flags --author, --comment, --probability-priority, --probability-games,
--probability-win-percent to configure ABK
header metadata. Game statistics (minimum number of games/wins, win percentages for
both sides) are managed automatically by jja.
In-place editing for Arena opening books is
also possible using -i, --in-place=SUFFIX command line option. Conversion from
PolyGlot, aka bin, and
ChessMaster, aka obk opening books
to Arena, aka abk opening book files is
planned for a future release.
PGN Book Making
Since version 0.4.0, jja can make
PolyGlot books out of
PGN
files. This feature is similar to polyglot make-book with the following
differences:
- jja may directly read compressed PGN files .pgn.{bz2,gz,lz4,zst}
- jja can process PGN files bigger than your system's available memory, by persisting statistics in a temporary RocksDB database.
- jja scales move weights by default to prevent potential overflows with huge PGN files; use --no-scale to disable.
- jja can filter moves using Filter Expressions, allowing the user to filter out unwanted games and create specialised opening books.
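To make the idea of weight scaling concrete, the following Python sketch computes a move score from game results and then scales a list of scores so that they fit into a 16-bit weight. The factors 2, 1 and 0 are the defaults named in the ChangeLog for --win-factor, --draw-factor and --loss-factor; the proportional scaling rule and the function names are assumptions for illustration and may differ from what jja actually does.

```python
def score(wins, draws, losses, win_factor=2.0, draw_factor=1.0, loss_factor=0.0):
    """Combine game results into a single score for a move."""
    return win_factor * wins + draw_factor * draws + loss_factor * losses

def scale_weights(scores, max_weight=0xFFFF):
    """Scale scores proportionally so that the largest one fits into 16 bits.

    Assumption of this sketch: scores that already fit are left unchanged.
    """
    top = max(scores, default=0)
    if top <= max_weight:
        return [int(s) for s in scores]
    return [int(s * max_weight / top) for s in scores]

print(score(3, 2, 5))
print(scale_weights([131070, 65535]))
```

Without some form of scaling, a score above 65535 cannot be stored in the weight field of a PolyGlot entry.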
Filter Expressions
The filter expression string should contain filter conditions, which consist of a tag name, a comparison operator, and a value. The following operators are supported:
- > (greater than)
- >= (greater than or equal to)
- < (less than)
- <= (less than or equal to)
- = (equal to)
- != (not equal to)
- =~ (regex match, case insensitive)
- !~ (negated regex match, case insensitive)
Filter conditions can be combined using the following logical operators:
- AND (logical AND)
- OR (logical OR)
Example:
--filter="Event =~ World AND White =~ Carlsen AND ( Result = 1-0 OR ECO = B33 )"
Supported tags are Event, Site, Date, UTCDate, Round, Black, White, Result, BlackElo, WhiteElo, BlackRatingDiff, WhiteRatingDiff, BlackTitle, WhiteTitle, ECO, Opening, TimeControl, Termination, and ScidFlags.
In addition to these, there are four special variables, namely Player, Elo, Title, and RatingDiff. These variables may be used to match the relevant header from either one of the sides. E.g., the filter:
--filter="Player =~ Carlsen"
is functionally equivalent to
--filter="( White =~ Carlsen OR Black =~ Carlsen )"
Note: The filtering is designed to be simple and fast. The tokens, including parentheses, are split by whitespace. Quoting values is not allowed. For more sophisticated filtering needs, use pgn-extract.
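The grammar described above is small enough to sketch in a few lines of Python. This is not jja's implementation; it only illustrates whitespace tokenisation, the comparison operators, the special variables and the logical operators. Three assumptions are made here: AND and OR are evaluated left to right without precedence, values are compared as integers when both sides parse as integers, and a special variable matches when either side's header satisfies the condition.

```python
import re

OPS = {
    ">": lambda a, b: a > b, ">=": lambda a, b: a >= b,
    "<": lambda a, b: a < b, "<=": lambda a, b: a <= b,
    "=": lambda a, b: a == b, "!=": lambda a, b: a != b,
}
# Special variables expand to the headers of both sides.
EITHER = {
    "Player": ("White", "Black"), "Elo": ("WhiteElo", "BlackElo"),
    "Title": ("WhiteTitle", "BlackTitle"),
    "RatingDiff": ("WhiteRatingDiff", "BlackRatingDiff"),
}

def _compare(actual, op, value):
    if op == "=~":
        return re.search(value, actual, re.IGNORECASE) is not None
    if op == "!~":
        return re.search(value, actual, re.IGNORECASE) is None
    try:  # numeric comparison when both sides look like integers
        return OPS[op](int(actual), int(value))
    except ValueError:  # otherwise compare as plain text
        return OPS[op](actual, value)

def _condition(tokens, headers):
    tag, op, value = tokens.pop(0), tokens.pop(0), tokens.pop(0)
    names = EITHER.get(tag, (tag,))
    return any(_compare(headers.get(n, ""), op, value) for n in names)

def _term(tokens, headers):
    if tokens[0] == "(":
        tokens.pop(0)              # opening parenthesis
        result = _expr(tokens, headers)
        tokens.pop(0)              # closing parenthesis
        return result
    return _condition(tokens, headers)

def _expr(tokens, headers):
    result = _term(tokens, headers)
    while tokens and tokens[0] in ("AND", "OR"):
        op = tokens.pop(0)
        rhs = _term(tokens, headers)
        result = (result and rhs) if op == "AND" else (result or rhs)
    return result

def matches(expression, headers):
    """Evaluate a filter expression against a dict of PGN headers."""
    return _expr(expression.split(), headers)

game = {"Event": "World Championship", "White": "Carlsen, Magnus",
        "Black": "Caruana, Fabiano", "Result": "1/2-1/2", "ECO": "B33"}
print(matches("Event =~ World AND White =~ Carlsen AND ( Result = 1-0 OR ECO = B33 )", game))
```

Splitting on whitespace is what keeps this simple, and it is also why the parentheses must be surrounded by spaces and why values cannot be quoted.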
Scid Flags
Scid uses one-character flags, DWBMENPTKQ!?U123456, for each field, where:
- D: Deleted
- W: White opening
- B: Black opening
- M: Middlegame
- E: Endgame
- N: Novelty
- P: Pawn structure
- T: Tactics
- K: Kingside play
- Q: Queenside play
- !: Brilliancy
- ?: Blunder
- U: User-defined
- 1..6: Custom flags
It is ill-advised to rely on the order of the character flags.
Use a regex match if/when you can.
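The reason for preferring a regex match can be shown with a short Python sketch: an equality test depends on the order in which the flags happen to be stored, whereas a match on a single flag does not. The helper `has_flag` is hypothetical and only demonstrates the principle with Python's `re` module; how jja's own regex engine treats characters such as "!" and "?" is not covered here.

```python
import re

def has_flag(scid_flags, flag):
    """True when the one-character flag occurs anywhere in the flags string."""
    # re.escape matters for flags such as "?", which is a regex metacharacter.
    return re.search(re.escape(flag), scid_flags) is not None

# The same two flags stored in different orders.
first, second = "WT", "TW"
print(first == second)                              # equality depends on the order
print(has_flag(first, "T"), has_flag(second, "T"))  # a regex match does not
```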
Tips and Tricks about PGN Book Making
- The defaults run best on my laptop and in my personal benchmarks on the SourceHut build server; they're not universal truth.
- jja processes input PGN files in parallel. You can use this to your advantage by giving many split PGNs as input to increase parallelism and performance.
- Increasing batch size is good as long as you have constant memory usage. When threads can't keep up, you'll get increased memory usage so that's the point you really know your limit.
- Try increasing max open files to the point you don't get the "too many open files" error from your operating system. You may specify --max-open-files=-1 to keep files always open.
- Try different compression algorithms for the temporary RocksDB database, or try disabling compression completely if you have enough space. jja writes the temporary RocksDB database in the same directory as the first pgn file argument. The default compression algorithm, Lz4, and the default compression level, 4, are aimed at speedy conversion with relatively moderate space usage. If you run out of space during conversion, try to use an algorithm like Zstd with an "ultra" level, i.e. a level greater than or equal to 20.
- Use filters which are processed during PGN traversal when possible. Because these filters are matched before writing the game data to the temporary database, when used wisely, they may have a vast impact on space and memory costs and therefore improve overall performance. These filters are --filter=<expr>, --max-ply=<ply>, --min-pieces=<piece-count>, --only-white, and --only-black.
Install
To compile from source, use cargo install jja. This requires the Rust
Toolchain to be installed. In addition you are going to need
OpenSSL libraries on
UNIX systems. Moreover you need
liburing on Linux. If
you're on a Linux system older than
5.1 or you are unable to install
liburing for another reason, you may disable
the feature by building jja with cargo install jja --no-default-features.
As an alternative, release builds of jja are hosted on chesswob.org for 64-bit Linux and Windows. These versions are signed by GnuPG, using key D076A377FB27DE70. To install, acquire the latest version from chesswob.org, verify the checksum and the GnuPG signature:
$> export JJA_VERSION=0.5.0
$> export JJA_FLAVOUR=glibc
$> curl https://keybase.io/alip/pgp_keys.asc | gpg --import
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 13292 100 13292 0 0 13535 0 --:--:-- --:--:-- --:--:-- 26584
gpg: key D076A377FB27DE70: public key "Ali Polatel (Caissa AI) <alip@caissa.ai>" imported
gpg: Total number processed: 1
gpg: imported: 1
$> for f in jja-${JJA_VERSION}-${JJA_FLAVOUR}.bin{,.sha512sum,.sha512sum.asc}; do wget -q https://chesswob.org/jja/${f}; done
$> gpg --verify jja-${JJA_VERSION}-${JJA_FLAVOUR}.bin.sha512sum.asc jja-${JJA_VERSION}-${JJA_FLAVOUR}.bin.sha512sum
gpg: Signature made Sun Mar 19 20:52:41 2023 CET
gpg: using RSA key 5DF763560390A149AC6C14C7D076A377FB27DE70
gpg: Good signature from "Ali Polatel (Caissa AI) ...
$> sha512sum -c jja-${JJA_VERSION}-${JJA_FLAVOUR}.bin.sha512sum
jja: OK
$> sudo install -m755 jja-${JJA_VERSION}-${JJA_FLAVOUR}.bin /usr/local/bin
Finally you may download the builds of the latest git version via the
SourceHut build server. There are three flavours,
windows, linux-glibc, and linux-musl. Simply browse to the latest build and
download the artifact listed on the left. Note: these artifacts are kept for 90
days.
Usage
- Use jja info to get brief information about the chess file.
- Use jja find to search for a position in a chess file.
- Use jja edit to edit opening book files and convert to PolyGlot files or Arena files.
- Use jja make to compile PGN files into PolyGlot opening books.
- Use jja dump to dump a PolyGlot or BrainLearn file as a stream of JSON arrays.
- Use jja restore to restore a PolyGlot or BrainLearn file from a stream of JSON arrays.
- Use jja merge to merge two PolyGlot opening books.
- Use jja match to arrange book matches using random playouts.
- Use jja play to make random playouts, optionally using books.
- Use jja hash to calculate Zobrist hash of a given chess position.
- Use jja open to browse ECO classification.
- Use jja quote to print a chess quote.
jja determines the type of the file using its file extension. Files with the
extension .bin are considered PolyGlot
books. Files with the extension .ctg are considered
ChessBase books. Files with the extension
.abk are considered Arena books. Files
with extension .obk are considered
ChessMaster books. Files with
extension .exp are considered BrainLearn
experience files.
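The extension mapping described in this paragraph amounts to a simple lookup, sketched below in Python. The function name `detect_format` is hypothetical, and treating the extension case-insensitively is an assumption of this sketch rather than documented jja behaviour.

```python
from pathlib import Path

# Extension to format name, as listed in the paragraph above.
FORMATS = {
    ".bin": "PolyGlot",
    ".ctg": "ChessBase",
    ".abk": "Arena",
    ".obk": "ChessMaster",
    ".exp": "BrainLearn",
}

def detect_format(path):
    """Return the format name for a known extension, or None otherwise."""
    return FORMATS.get(Path(path).suffix.lower())

print(detect_format("performance.bin"), detect_format("notes.txt"))
```

In practice this means a book with a wrong or missing extension has to be renamed before it can be used.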
By default if the standard output is a
TTY, jja will display
information using fancy tables. Use --porcelain command line option to get the
output in CSV
(comma-separated values) format instead.
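For the merge command listed above, the ChangeLog names the strategies sum (the default), max, min, ours, avg and wavg. The Python sketch below shows how two weights for the same move could be combined under each of them. The function name, the integer rounding and the exact weighted-average formula are assumptions for illustration; how jja treats moves present in only one of the books is not shown.

```python
def merge_weight(w1, w2, strategy="sum", weight1=1.0, weight2=1.0):
    """Combine the weights of the same move taken from two books."""
    if strategy == "sum":
        return w1 + w2
    if strategy == "max":
        return max(w1, w2)
    if strategy == "min":
        return min(w1, w2)
    if strategy == "ours":
        return w1                      # always prefer the first book
    if strategy == "avg":
        return (w1 + w2) // 2
    if strategy == "wavg":
        # weight1 and weight2 stand in for the values given to the
        # --weight1 and --weight2 options
        return int((w1 * weight1 + w2 * weight2) / (weight1 + weight2))
    raise ValueError(f"unknown merge strategy: {strategy}")

for name in ("sum", "max", "min", "ours", "avg"):
    print(name, merge_weight(10, 30, name))
print("wavg", merge_weight(10, 30, "wavg", weight1=3.0, weight2=1.0))
```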
Acknowledgements
Thanks to Steinar H. Gunderson, for publishing the CTG
Specification,
and authoring the remoteglot tool:
The CTG probing code in jja is very
directly ported from their
C probing code,
and the specification has been an enormous help in clearing up various rough edges.
Thanks to Fabien Letouzey, the author of the original
PolyGlot software: The
PolyGlot probing, book making and merging
code in jja is mostly ported from their respective
C code. Thanks to
Michel Van den Bergh, the author of pg_utils, a collection of tools to
interact with PolyGlot opening books: The
PolyGlot book editing code of jja
uses many ideas and code excerpts from pg_utils. Thanks to Peter Österlund,
the author of DroidFish: The
ABK opening book interface code in jja
makes use of ideas and code excerpts from
DroidFish. Thanks to Jens
Nissen, the author of ChessX: The
CTG and
ABK probing codes in jja use ideas and
code excerpts from ChessX. Thanks to
LiChess, the best chess website on the planet. The quote
command of jja has a selection of quotes imported from the
LiChess codebase. Thanks to Shane Hudson, the author of
Scid vs. PC: The jja eco command uses the
ECO classification which has been done by
the Scid project. In addition, the
PolyGlot editing code of jja uses
ideas and code from Scid. Thanks to Marcus
Bufett, the author of
chess-tactics-cli: The
chessboard displaying code in PolyGlot and
ABK edit screens is borrowed from
chess-tactics-cli.
License
jja is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
Bugs
Hey you, out there beyond the wall,
Breaking bottles in the hall,
Can you help me?
Report bugs to jja's bug tracker at https://todo.sr.ht/~alip/jja/:
- Always be polite, respectful, and kind: https://css-tricks.com/open-source-etiquette-guidebook/
- Keep your final change as small and neat as possible: https://tirania.org/blog/archive/2010/Dec-31.html
- Attaching poems with the bug report encourages consideration tremendously.
Jin, Jîyan, Azadî
I've started hacking this on International Women's Day 2023, a day to honor the achievements of women and advocate for their rights worldwide. As a person of Kurdish heritage, I am particularly moved by the slogan "Woman, Life, Freedom", which has become a symbol of resistance against oppression and a call for equality. In the spirit of free software and free speech, I strive to contribute to the creation of a more just and inclusive society, where every human being is granted the freedom to express themselves and pursue their dreams. I also honor the memory of Mahsa Amini, whose tragic death reminds us of the urgent need to fight for women's freedom and safety.
More on Wikipedia: Woman, Life, Freedom
ChangeLog
0.6.0
- jja::pgnfilt::Operator and jja::pgnfilt::LogicalOperator implement Eq as well as PartialEq now.
- new restore command to accompany the dump command which restores JSON serialized PolyGlot or BrainLearn file entries into the given output file.
- fix SIGPIPE handling on UNIX systems so that jja does not panic when the output is piped to another program such as a pager.
- find learned -z <HASH>, --hash=<HASH> to query PolyGlot opening books and BrainLearn experience files by Zobrist hash.
- readonly support for BrainLearn experience file format. The subcommands info, dump, and find are able to handle files in BrainLearn experience file format with the extension .exp.
- improve polyglot key lookup by reading only the key rather than the whole entry from file. The new public function polyglotbook::PolyGlotBook::read_book_key is used for that.
- new dump command to dump the full contents of a PolyGlot opening book or a PGN file. The dump format of the PolyGlot opening book is JSON, whereas for PGN files this is CSV.
- breaking change: polyglotbook::PolyglotBook::lookup_moves has been changed to take a zobrist hash of a chess position as an argument rather than the shakmaty::Chess position itself.
- hash learned --signed to print Zobrist hashes as signed decimal numbers.
- new module jja::file which exports utilities for binary file i/o.
- breaking change: jja::polyglot::entry_{from,to}_file have been renamed to jja::polyglot::bin_entry_{from,to}_file.
- use the XorShift random number generator to randomly pick moves during book matches using the match command. This algorithm is cryptographically insecure but is very fast.
- fix match command panicking in certain cases when there is a book lookup miss.
- Print more detailed build information on long version, --version output.
- Memory map CTG files to speed up random access in return for increased memory usage. This brings a new dependency upon the crate memmap.
- use buffered read/write in interacting with PolyGlot books which reduces the read/write system calls by a huge margin and thereby improves performance. The book member of PolyGlotBook is now a BufReader<File> rather than a File which is a breaking change.
- important fix for calculating PolyGlot compatible Zobrist hashes wrt. en-passant legality. In PolyGlot format, en-passant moves are only pseudo-legal whereas previously the jja::hash::zobrist_hash function mistakenly checked for full legality.
- breaking change: new type ctgbook::CtgTree which holds the new return value of the functions CtgBook::extract_all, and CtgBook::extract_all2. The tree element of CtgEntry which was used by these functions has also been dropped and the functions have been implemented in a much more performant way using considerably less memory. As a result, most ctg to abk/polyglot conversions are almost twice as fast.
- breaking change: ctgbook::CtgEntry member uci's type has been changed from String to shakmaty::uci::Uci, and the nags member has been renamed to nag and its type has been changed from Option<String> to Option<Nag>.
- ctg move comment entries were parsed and silently discarded; they're no longer parsed. Moreover, ctgbook::CtgEntry no longer has a comment member which is a breaking change.
- breaking change: ctg::colored_uci function now accepts a shakmaty::uci::Uci rather than a UCI string. The order of the function arguments is also changed.
- ctg: new type Ctg::Nag to abstract CTG NAG (Numeric Annotation Glyph) entries
- obk entries with zero weight are now assigned weight 1 during PolyGlot conversion to prevent skipping these entries. We plan to make this user-configurable in the future.
- use the standard IsTerminal trait, and drop the dependency on is_terminal crate
- use the standard to_le_bytes(), to_be_bytes(), and drop the dependency on byteorder crate
- hash learned -e=<MODE>, --enpassant-mode=<MODE> to select when to include the en-passant square in Zobrist hash calculation.
- important fix for encoding castling moves during book conversions to the PolyGlot format. See details in the respective issue
- breaking change: polyglot.from_uci now expects a bool argument to correctly encode castling positions by determining whether the king to move is on their starting square.
- bump minimum supported Rust version (MSRV) from 1.64 to 1.70 due to shakmaty bump
- drop dependency on the unmaintained and insecure chrono crate
- PolyGlotBook has two new public functions: find_book_key, and read_book_entry
- hash learned -x, --hex to print hash as hexadecimal rather than decimal
- upgrade pgn-reader crate from 0.24 to 0.25
- upgrade shakmaty crate from 0.25 to 0.26
- optimize various ctg functions
- info learned to print the total number of positions for ctg opening books
- breaking change: CtgBook::num_entries has been renamed to total_positions.
- breaking change: CtgBook::total_pages function accepts a reference to self, rather than consuming self. The return type is usize now.
- breaking change: Drop close functions of CtgBook and PolyGlotBook, improve CtgBook to close the cto file immediately after open, and keep a File, rather than an Option<File> in CtgBook. The now unused path element of CtgBook is also dropped.
- find: simplify & optimize polyglot entry lookup
- display license, version and author information in --help output
- set minimum supported Rust version (MSRV) to 1.64 as determined by cargo-msrv
- fix a bug which caused key and epd to be displayed incorrectly in editor screen
- edit: implement --rescale for PolyGlot books. When specified, edit will rescale weights of all entries in the book, rather than a single entry. This is useful to quickly correct/optimize PolyGlot books which were generated without weight scaling.
- fix build with i18n feature disabled
- upgrade tempfile crate from 3.5 to 3.6
- upgrade rust-embed crate from 6.6 to 6.7
- upgrade once_cell crate from 1.17 to 1.18
- upgrade ctrlc crate from 3.3 to 3.4
- important fix for CTG to PolyGlot weight conversion which caused all entries in CTG books with missing performance information to be skipped from the output PolyGlot book. Read more about it in the respective issue.
- make learned -H, --hashcode argument to skip duplicate games based on the HashCode PGN tag. PGN files may be tagged using pgn-extract --addhashcode
0.5.0
- important fix, for encoding of castling in polyglot books. Previously, we encoded castling as e1g1, e1c1, e8g8, and e8c8, whereas the correct encoding is e1h1, e1a1, e8h8, and e8a8.
- when printing version with --version, prefer git version over package version for git builds
- upgrade clap crate from 4.2 to 4.3
- upgrade ctrlc crate from 3.2 to 3.3
- upgrade rocksdb crate to 0.21.0 which bundles RocksDB-8.1.1
- document all the public code and enable the lint #[deny(missing_docs)]
- edit learned to export opening books in PGN format, use an output file with pgn extension to export an opening book to a PGN
- add back the build dependency upon the built crate
- find learned -l <max-ply>, --line=<max-ply> to display lines from the opening book as a table of opening variations reverse-sorted by cumulative weight
- make --min-score now accepts floating point values as argument rather than an unsigned 64-bit integer
- make learned --win-factor, --draw-factor, and --loss-factor to specify respective factors during score calculation, the defaults, 2, 1, and 0 respectively, resemble the original polyglot tool
- find --tree=<max-ply> no longer panics on broken pipe, so it's more convenient to use with a pager
- make learned -p, --min-pieces to specify the minimum number of pieces on the board for a position to be included in the book, defaults to 8
- merge learned -c, --weight-cutoff to specify the minimum weight of entries to be included in the book
- merge learned about merge strategies avg and wavg to merge using average weight or weighted average weight respectively; the weighted averages should be specified for wavg using -w, --weight1, and -W, --weight2
- in-place editing of polyglot files allows editing empty books which makes it practical to create polyglot book from scratch
- merge learned about merge strategies max, min, ours, and sum, default is sum which adds together move weights, max picks the one with the maximum weight, min picks up the one with the minimum weight, and ours always picks the entries from the first book
- translate the README to German language
- turn i18n support into a feature which defaults to on. For static linking this must be disabled as embed makes use of proc macros.
- drop dependency on unused libc crate
0.4.1
- fix docs.rs build
- update shakmaty, and pgn-reader crates
- add initial translation to Turkish language
- add initial translation to German language
- use gettext for i18n
0.4.0
- replace atty with the better maintained is-terminal crate.
- replace default-editor with the more advanced dialoguer crate.
- replace colored with the more portable console crate.
- match learned -i, --irreversible to prefer irreversible moves during random playouts. Pawn moves, captures, moves that destroy castling rights, and moves that cede en-passant are irreversible.
- replace the progress_bar crate with the more portable and advanced indicatif crate.
- make learned -B <games>, --batch-size=<games> to determine write batch size in number of games
- Linux builds require liburing to be installed by default. This may be disabled using --no-default-features on installation.
- make learned about ScidFlags, a set of character flags used by the Scid software, which may be used in filter expressions
- make learned --debug flag to print information on matching filter expressions
- quote learned many more chess quotes which were imported from goodreads.com
- make learned --max-open-files=<int> to specify the maximum number of open files per thread opened by the temporary RocksDB database.
- match command is now multithreaded and uses as many threads as there are cpus on the machine by default. Use -T=<threads>, --threads=<threads> or the JJA_NPROC environment variable to override
- fix match to alternate colour between books during random playouts
- quote accepts an optional index argument to print the quote at the specified index, rather than a random quote
- quote learned many more chess quotes which were imported from archive.org, goodreads.com and nitter.net
- make filter learned special variables Player, Elo, Title, and RatingDiff to match the relevant field from either colour
- make learned --filter=<expression> to filter PGN games by headers. The filtering is designed to be simple and fast. The tokens, including parentheses, are split by whitespace. Quoting variables is not allowed. See jja make --help for more information on Filter Expressions
- find learned -t <max-ply>, --tree=<max-ply> to display lines from the opening book as a tree using the nice termtree crate
- drop the build dependency upon the built crate
- drop the unused dependency on the ħyphenation crate
- replace csv crate with the more lightweight quick-csv crate
- replace cli-table crate with the more lightweight prettytable-rs crate
- use pgn-reader crate instead of pgnparse for parsing the --pgn commandline option
- strip off the unneeded cli-table-derive dependency
- fix hash subcommand option parsing causing panic
- ctg find prioritises wins & draws over performance
- make learned to preserve moves with null moves in the book with -0, --null
- make learned to avoid scaling weights using --no-scale
- make learned to configure compression for the rocksdb database using --compression={none,bzip2,lz4,lz4hc,snappy,zlib,zstd}, --compression-level=<level>, defaults to lz4, level 4
- use a temporary RocksDB database during pgn book make so as to better make use of memory, this reduces the performance a little, however in return makes importing huge PGN files possible.
- new make command to make polyglot books out of pgn files, runs multithreaded with as many threads as the cpu number of the system by default, use -T, --threads to override, works transparently with compressed PGN files (zstd, bzip2, gunzip, lz4)
0.3.2
- edit writes the name of the user and jja's version as comment to Arena opening book metadata on ChessBase to Arena opening book conversion, override with --author, --comment
- edit displays a unicode chess board in edit tempfile
- fix book traversal on Arena opening book to PolyGlot conversion
- enable ansi colors when running on windows terminal
- drop unixisms, cross-compiles for windows
- fix yet another bug with castle decoding on polyglot read/query
- edit no longer tries to spawn the default editor if standard output is not a TTY
- fix find and edit for Arena opening book reading, move selection is on par with the Arena GUI
- new hash command to calculate the Zobrist hash of the given position
- fix infinite loop while converting some big Arena opening book files
- improve hashing performance by avoiding double hashing using a hasher builder for Zobrist hashes
- improve hashing performance using shakmaty crate's Zobrist Hashing implementation rather than the internal one.
0.3.1
- edit learned to calculate & write ABK header game statistics fields
- edit learned to convert CTG book files to ABK book files
- edit learned --author and --comment to specify metadata for Arena opening books
- edit can edit Arena opening book (abk) files in-place with -i, --in-place=SUFFIX
- support for writing Arena (abk) opening books
- open learned to wrap long ECO opening lines into multiple lines
- find no longer panics on some abk books with entries having invalid uci
- edit takes move priority into account for weight on abk to bin conversion
- match learned --move-selection={best_move,uniform_random,weighted_random} to pick move selection algorithm for book moves
- fix castle decoding on polyglot read/query
- fix error return when no positions found in abk, obk and ctg find
- fix promotion handling in ctg edit
- many improvements to ctg find (move coloring & sorting, average statistics)
- new merge command to merge two PolyGlot opening books
- new match command to arrange book matches with random playouts
0.3.0
- refactor code to unify various opening book reader interfaces
- support obk version 1 as well as 2 (ChessMaster books with and without notes)
- support for reading obk (ChessMaster) books and converting them to polyglot books
- do not display progress bar if standard output is not a TTY
- support for reading abk (Arena) books and converting them to polyglot books
- new play command can be used to make random playouts using opening books
0.2.1
- weight conversion in ctg to polyglot edit can be tuned with --nag-weight-{good,mistake,hard,blunder,interesting,dubious,forced}=
- edit learned --no-scale to avoid scaling weights globally to fit into 16 bits
- the code is now relatively well documented
- edit --in-place now properly deletes the output temp file on interrupt
- edit filters out moves with zero weights, use -0, --null to preserve them
0.2.0
- edit window lists position info (key, epd, legal moves) as comment
- edit no longer silently discards illegal moves
- edit can edit PolyGlot files in-place with -i, --in-place=SUFFIX
- edit can convert CTG opening books into PolyGlot opening books
- default to start position when no --fen or --pgn is given for edit and find
- info prints number of total pages in CTG books
0.1.1
- new positions can be added to polyglot files
- many bugs fixed with polyglot edit
- quote command added to print a random chess quote
- open command added to query ECO classification
0.1.0
- edit polyglot files, only editing present positions work
- read polyglot files
- read ctg files
