bzip2_os/compression/mod.rs
1//! The compress module manages the compression side of the the Rust version of the standard BZIP2 library.
2//!
3//! BZIP2 compression happens in the following steps:
4//! - Run Length Encoding 1: Compress all runs of 4-255 identical bytes.
5//! - Burrow Wheeler Transform: Sort the data to increase the probability of runs of identical bytes.
6//! - Move To Front transform: Increase the frequency of lower byte values, especially the zero byte.
7//! - Run Length Encoding 2: Compress all runs of the zero byte.
8//! - Huffman coding: Encode frequent byte values using smaller bit codes and less frequent byte values with longer bit codes.
9//!
10//! While the initial RLE1 compression is probably not necessary, it is a legacy of the original implemention and must be preserved.
11//!
12//! The BZIP2 huffman stage actually makes four passes over the data to improve the compression ratio. Additionally, six different
13//! huffman tables are generated for each block of data, and every chunk of 50 bytes within that block is analyzed to determine which
14//! huffman table will result in the best compression ratio for that chunk.
15//!
16//! Decompression is single threaded (except for the generation of a frequency table within each block). It follows the inverse of the compression process.
17//! - Huffman decoding.
18//! - RLE 2: Expand all runs of the zero byte.
19//! - MTF transform: Convert from the Move-To-Front indecies to the symbols represented by the indecies.
20//! - BWT reversal: Restore the original data from the BWT transform.
21//! - RLE 1: Expand all runs of 4+ identical bytes.
22//!
23
24pub mod compress;
25pub mod compress_block;
26pub mod decompress;