Module xz

Available on crate feature xz only.

xz container around LZMA2.

Reference: https://tukaani.org/xz/xz-file-format.txt.

Wire format (decoder view):

 Stream Header (12 B)
 Block Header (variable, multiple of 4, 8..=1024 B)
 LZMA2 payload
   (chunk: 01 hi lo <1..=65536 bytes>; hi lo = size - 1, big-endian)+
   00 end marker
 Block Padding (0..=3 zero bytes, rounding the Block up to a multiple of 4 B)
 Check (4 B CRC32 of uncompressed Block data)
 Index (Index Indicator 00 | NumRecords varint | (UnpaddedSize varint,
        UncompressedSize varint)+ | 0..=3 zero pad | CRC32 of all the above)
 Stream Footer (12 B)
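
The size fields in the layout above can be sketched as follows. This is a minimal illustration, not this module's code: the names decode_varint, block_header_size, and block_padding are hypothetical. It assumes the xz multibyte-integer rules (little-endian base-128, at most 9 bytes, no non-minimal encodings) and the Block Header size byte encoding (real size = (byte + 1) * 4, with 0x00 reserved for the Index Indicator).

```rust
/// Decode an xz multibyte integer ("varint") from the start of `buf`.
/// Returns the value and the number of bytes consumed, or None if the
/// encoding is truncated, longer than 9 bytes, or non-minimal.
/// (Hypothetical helper, not part of this crate's API.)
fn decode_varint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in buf.iter().take(9).enumerate() {
        // Each byte carries 7 payload bits, least significant group first.
        value |= u64::from(byte & 0x7F) << (7 * i);
        if byte & 0x80 == 0 {
            // A trailing 0x00 after the first byte is a non-minimal encoding.
            if byte == 0 && i != 0 {
                return None;
            }
            return Some((value, i + 1));
        }
    }
    None
}

/// Real Block Header size from its first byte. 0x00 is not a size:
/// it is the Index Indicator, so the valid range is 8..=1024 bytes.
fn block_header_size(size_byte: u8) -> Option<usize> {
    if size_byte == 0 {
        None
    } else {
        Some((usize::from(size_byte) + 1) * 4)
    }
}

/// Number of zero bytes (0..=3) needed to round `len` up to a multiple of 4.
fn block_padding(len: usize) -> usize {
    (4 - len % 4) % 4
}
```

A decoder would typically read the first byte after the Stream Header, treat 0x00 as the start of the Index, and otherwise pass it to something like block_header_size.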

This implementation supports a single filter (LZMA2, filter ID 0x21). Decoding handles every LZMA2 chunk type: 0x00 (end marker), 0x01 (uncompressed + dictionary reset), 0x02 (uncompressed, no reset), and 0x80..=0xFF (LZMA-compressed, with the spec’s full matrix of state/properties/dictionary reset flags). The compressed-chunk decoder is in the private lzma2_decoder submodule and is adapted from this crate’s lzma module.
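
The chunk types listed above can be told apart from the control byte alone. The sketch below is illustrative only: the Chunk enum and classify function are hypothetical names, not items of the private lzma2_decoder submodule. It assumes the LZMA2 layout in which bits 5..=6 of a compressed control byte select the reset level and bits 0..=4 hold the high bits of (uncompressed size - 1).

```rust
/// Decoded meaning of an LZMA2 control byte (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
enum Chunk {
    /// 0x00: end of the LZMA2 payload.
    End,
    /// 0x01 / 0x02: stored data, with or without a dictionary reset.
    Uncompressed { reset_dict: bool },
    /// 0x80..=0xFF: LZMA-compressed data plus reset flags.
    Lzma {
        reset_state: bool,
        new_props: bool,
        reset_dict: bool,
        /// Bits 16..=20 of (uncompressed size - 1).
        size_high: u8,
    },
}

/// Classify a control byte; 0x03..=0x7F are invalid and yield None.
fn classify(control: u8) -> Option<Chunk> {
    match control {
        0x00 => Some(Chunk::End),
        0x01 => Some(Chunk::Uncompressed { reset_dict: true }),
        0x02 => Some(Chunk::Uncompressed { reset_dict: false }),
        0x03..=0x7F => None,
        _ => {
            // Reset level: 0 = none, 1 = state, 2 = state + props,
            // 3 = state + props + dictionary.
            let level = (control >> 5) & 0x03;
            Some(Chunk::Lzma {
                reset_state: level >= 1,
                new_props: level >= 2,
                reset_dict: level == 3,
                size_high: control & 0x1F,
            })
        }
    }
}
```

Under this reading, 0xE0 is the full-reset case with zero high size bits, and 0x80 is a compressed chunk that resets nothing.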

The encoder emits LZMA-compressed LZMA2 chunks with control byte 0xE0: compressed, with a dictionary, properties, and state reset on every chunk, so each chunk is independently decodable. When compression would expand a chunk relative to its uncompressed input, which happens on random or already-compressed data, the encoder falls back to an uncompressed chunk (control byte 0x01). The LZMA-compressed output is produced by the private lzma2_encoder submodule, a port of the .lzma encoder in src/lzma/ adapted to emit a raw range-coded body (no header, no EOS marker).
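
The fallback described above might look roughly like the following. This is a sketch under stated assumptions, not the encoder's actual code: write_chunk is a hypothetical name, the range-coded body is passed in already compressed, the raw input is assumed to be at most 65536 bytes so that the uncompressed form is always available, and the size comparison (including the 6-byte and 3-byte chunk headers) is one plausible way to decide that compression "would expand" the chunk.

```rust
/// Header length of a compressed chunk that carries a properties byte:
/// control + 2 B uncompressed size + 2 B compressed size + props.
const LZMA_HEADER_LEN: usize = 6;
/// Header length of an uncompressed chunk: control + 2 B size.
const RAW_HEADER_LEN: usize = 3;

/// Append one LZMA2 chunk to `out`, choosing the smaller framing.
/// Returns true if the compressed form was used. (Hypothetical helper.)
fn write_chunk(out: &mut Vec<u8>, raw: &[u8], compressed: &[u8], props: u8) -> bool {
    assert!(!raw.is_empty() && raw.len() <= 65536, "raw chunk must be 1..=65536 bytes");

    // The compressed size field is 16 bits and stores (size - 1).
    let fits = !compressed.is_empty() && compressed.len() <= 65536;
    let smaller = LZMA_HEADER_LEN + compressed.len() < RAW_HEADER_LEN + raw.len();

    if fits && smaller {
        let unpacked_m1 = (raw.len() - 1) as u32;
        let packed_m1 = (compressed.len() - 1) as u16;
        // 0xE0 = compressed + state, properties, and dictionary reset;
        // the low 5 bits carry bits 16..=20 of (uncompressed size - 1).
        out.push(0xE0 | ((unpacked_m1 >> 16) as u8 & 0x1F));
        out.extend_from_slice(&(unpacked_m1 as u16).to_be_bytes());
        out.extend_from_slice(&packed_m1.to_be_bytes());
        out.push(props);
        out.extend_from_slice(compressed);
        true
    } else {
        // 0x01 = uncompressed chunk with dictionary reset.
        let size_m1 = (raw.len() - 1) as u16;
        out.push(0x01);
        out.extend_from_slice(&size_m1.to_be_bytes());
        out.extend_from_slice(raw);
        false
    }
}
```

Because both chunk forms reset the dictionary in this scheme, a decoder can start at any chunk boundary without earlier history.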

This module has no dependency on the sibling lzma module: the small amount of LZMA2 framing it needs (control byte, big-endian 16-bit size) and the CRC-32 used for the Block Check, Stream Header CRC, and Index CRC are all defined inline in this module.
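
For orientation, an inline CRC-32 of the kind mentioned above can be quite small. The sketch below is not this module's implementation; make_table, crc32, and stream_header are hypothetical names. It assumes the standard reflected CRC-32 (polynomial 0xEDB88320, as used by zlib and xz) and the xz Stream Header layout of a 6-byte magic, 2 bytes of Stream Flags, and a little-endian CRC-32 of those flags.

```rust
/// Build the 256-entry lookup table for the reflected CRC-32 polynomial.
const fn make_table() -> [u32; 256] {
    let mut table = [0u32; 256];
    let mut i = 0;
    while i < 256 {
        let mut c = i as u32;
        let mut k = 0;
        while k < 8 {
            c = if c & 1 != 0 { 0xEDB8_8320 ^ (c >> 1) } else { c >> 1 };
            k += 1;
        }
        table[i] = c;
        i += 1;
    }
    table
}

static TABLE: [u32; 256] = make_table();

/// CRC-32 of `data` (initial value and final XOR are both 0xFFFF_FFFF).
fn crc32(data: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &b in data {
        crc = TABLE[((crc ^ u32::from(b)) & 0xFF) as usize] ^ (crc >> 8);
    }
    !crc
}

/// Build a 12-byte Stream Header that declares CRC32 as the Check type.
fn stream_header() -> [u8; 12] {
    let flags = [0x00, 0x01]; // second byte: Check ID 0x01 = CRC32
    let mut header = [0u8; 12];
    header[..6].copy_from_slice(&[0xFD, b'7', b'z', b'X', b'Z', 0x00]);
    header[6..8].copy_from_slice(&flags);
    header[8..].copy_from_slice(&crc32(&flags).to_le_bytes());
    header
}
```

The same crc32 routine would cover the Block Check and the Index CRC, since all three use the same polynomial and little-endian storage.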

Structs

Decoder
Encoder
EncoderConfig
Tunables for the xz encoder.
Xz
Zero-sized marker type implementing Algorithm for xz.