pcompress 0.1.0

Experimental, efficient, and performant binary representation of districting plans

pcompress

Currently, it is hard to store the state of every step of a Markov chain Monte Carlo (MCMC) run from GerryChain Python or GerryChain Julia. This repo aims to produce an efficient binary representation of partitions/districting assignments that allows generated plans to be saved on the fly. Each step is represented as the diff from the previous step, which significantly reduces disk usage per step.

Note that if a step repeats, it will be omitted.
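
The diff-per-step idea can be sketched in a few lines of Python. This is only an illustration of the concept, not pcompress's actual binary format (which is still undocumented): the names `encode_steps` and `decode_steps` are hypothetical, an assignment is assumed to be a list mapping node index to district id, and "a step repeats" is assumed to mean that a step is identical to the one before it.

```python
from typing import Dict, Iterable, Iterator, List

Assignment = List[int]  # assignment[node] = district id (assumed layout)
Diff = Dict[int, int]   # node index -> new district id


def encode_steps(steps: Iterable[Assignment]) -> Iterator[Diff]:
    """Yield one diff per step, holding only the nodes whose district changed."""
    previous = None
    for step in steps:
        if previous is None:
            # The first step has no predecessor, so every node is recorded.
            diff = dict(enumerate(step))
        else:
            diff = {
                node: new
                for node, (old, new) in enumerate(zip(previous, step))
                if old != new
            }
        previous = list(step)
        # Assumption: a step identical to its predecessor yields an empty
        # diff and is omitted from the output entirely.
        if diff:
            yield diff


def decode_steps(diffs: Iterable[Diff]) -> Iterator[Assignment]:
    """Replay diffs in order, yielding the full assignment after each one."""
    current: Dict[int, int] = {}
    for diff in diffs:
        current.update(diff)
        # Relies on the first diff covering every node index from 0 upward.
        yield [current[node] for node in range(len(current))]


steps = [[0, 0, 1, 1], [0, 1, 1, 1], [0, 1, 1, 1], [0, 1, 1, 0]]
diffs = list(encode_steps(steps))
decoded = list(decode_steps(diffs))
```

Here the four input steps produce three diffs, because the third step is identical to the second. After the initial full assignment, each diff holds a single changed node (`{1: 1}` and `{3: 0}`), which is where the per-step savings come from in a flip-style chain.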

Usage

See chain_flip and chain.sh.

To decode, simply pipe the compressed output into pcompress --decode.

Binary Representation

TODO: document this.

Further compression

To compress the output file further, xz is recommended. Combining pcompress with xz can reduce the output size by several orders of magnitude.

E.g.:

xz -9 -k chain.output
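
If the chain is driven from Python, the same step can be done without shelling out, since the standard-library `lzma` module writes the `.xz` container by default. The following is a minimal sketch; the helper name `xz_compress` is hypothetical and the file name `chain.output` is taken from the command above.

```python
import lzma
import shutil
from pathlib import Path


def xz_compress(path: str, preset: int = 9) -> Path:
    """Write `<path>.xz` next to the input and keep the original file,
    roughly what `xz -9 -k <path>` does."""
    source = Path(path)
    target = source.with_name(source.name + ".xz")
    # lzma.open defaults to the .xz container format when writing.
    with source.open("rb") as raw, lzma.open(target, "wb", preset=preset) as packed:
        shutil.copyfileobj(raw, packed)
    return target
```

Calling `xz_compress("chain.output")` would then leave `chain.output` in place and write `chain.output.xz` beside it.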

TODOs

  • better checking/guarding against overflows
  • variable sizes
  • header format?
  • rewind functionality
  • proof of concept (PoC) of rewind/replay for GerryChain Python and Julia