lzd-rs
This library provides a Rust implementation of LZ double-factor factorization (LZD), an efficient grammar-based compression algorithm proposed in the following paper:
K Goto, H Bannai, S Inenaga, and M Takeda. LZD Factorization: Simple and Practical Online Grammar Compression with Variable-to-Fixed Encoding. In CPM, 2015.
Examples
Factorization
use Compressor;
The output will be
factors: [97, 98, 97, 97, 256, 256, 256, 257, 98, 98, 258]
defined_factors: 261
NOTE: In this implementation, all 256 single characters are predefined as factors, so the number of defined factors is 261: the 256 predefined ones plus the 5 created during factorization (ids 256 to 260).
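To make the factor list above easier to follow, here is a minimal, naive sketch of LZD factorization written from the description in the paper. It is not this library's implementation or API: the names `lzd_factorize` and `longest_match` are hypothetical, and a plain `HashMap` stands in for whatever data structure the library uses internally. At each step it takes the longest already-defined factor twice in a row, emits both ids, and defines their concatenation as a new factor.

```rust
use std::collections::HashMap;

/// Returns (id, length) of the longest known factor that is a prefix of `rest`.
/// Single bytes are implicitly predefined with ids 0..=255.
fn longest_match(dict: &HashMap<Vec<u8>, usize>, max_len: usize, rest: &[u8]) -> (usize, usize) {
    let upper = max_len.min(rest.len());
    // Try the longest candidates first; only lengths >= 2 are stored in `dict`.
    for len in (2..=upper).rev() {
        if let Some(&id) = dict.get(&rest[..len]) {
            return (id, len);
        }
    }
    (rest[0] as usize, 1)
}

/// Naive LZD factorization (hypothetical helper, not the lzd-rs API).
/// Returns the emitted factor ids and the total number of defined factors.
fn lzd_factorize(text: &[u8]) -> (Vec<usize>, usize) {
    let mut dict: HashMap<Vec<u8>, usize> = HashMap::new();
    let mut max_len = 1; // length of the longest factor defined so far
    let mut next_id = 256; // ids below 256 are the predefined single bytes
    let mut factors = Vec::new();
    let mut i = 0;
    while i < text.len() {
        // First half of the pair.
        let (id1, len1) = longest_match(&dict, max_len, &text[i..]);
        factors.push(id1);
        let j = i + len1;
        if j == text.len() {
            break; // the text ended after the first half; nothing new is defined
        }
        // Second half of the pair.
        let (id2, len2) = longest_match(&dict, max_len, &text[j..]);
        factors.push(id2);
        let end = j + len2;
        // The concatenation of both halves becomes a new factor.
        dict.insert(text[i..end].to_vec(), next_id);
        max_len = max_len.max(end - i);
        next_id += 1;
        i = end;
    }
    (factors, next_id)
}
```

Calling `lzd_factorize(b"abaaabababaabbabab")` yields the factor list shown above together with 261 defined factors. The longest-match search here is quadratic in the worst case; the sketch favors readability over speed.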
Defactorization
use Decompressor;
The output will be
text: "abaaabababaabbabab"
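The reverse direction can be sketched in the same spirit. The function below is a hypothetical illustration, not the library's `Decompressor`: it reads the factor ids two at a time, expands each id (a single byte if below 256, otherwise a previously defined factor), and registers the text produced by each complete pair as the next new factor. It assumes a well-formed factor sequence and panics on an id that has not been defined yet.

```rust
/// Naive LZD defactorization (hypothetical helper, not the lzd-rs API).
fn lzd_defactorize(factors: &[usize]) -> Vec<u8> {
    // defined[k] holds the expansion of factor id 256 + k.
    let mut defined: Vec<Vec<u8>> = Vec::new();
    let mut text: Vec<u8> = Vec::new();
    for pair in factors.chunks(2) {
        let start = text.len();
        for &id in pair {
            if id < 256 {
                // Predefined single-byte factor.
                text.push(id as u8);
            } else {
                // Previously defined factor; indexing panics on an unknown id.
                text.extend_from_slice(&defined[id - 256]);
            }
        }
        // Only a complete pair defines a new factor; a trailing single id does not.
        if pair.len() == 2 {
            defined.push(text[start..].to_vec());
        }
    }
    text
}
```

Applied to `[97, 98, 97, 97, 256, 256, 256, 257, 98, 98, 258]`, it reconstructs the bytes of `abaaabababaabbabab`.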
Command line tools
This library provides two command line tools, one for compression and one for decompression. Each tool prints its command line options when run with the parameter -h. The tools serialize LZ factors into a binary stream in the same manner as tdc::BitCoder of tudocomp.
lzd command
It compresses input data and writes the result to a file with the extension lzd. In the following case, english.50MB.lzd will be written as the compressed file.
$ ./target/release/lzd english.50MB -l
Compression ratio in factors: 0.121
Compression ratio in filesize: 0.313
Number of defined LZD-factors: 3177320
Number of written LZD-factors: 6354129
unlzd command
It decompresses a compressed file and writes the original data to a file with the extension lzd removed. In the following case, english.50MB will be written as the decompressed file.
$ ./target/release/unlzd english.50MB.lzd
Licensing
This library is free software provided under the MIT License.