lbzip2-rs
🧙‍♂️ With the Allfather's wisdom: compression and corruption protection. ⚡ With his gaze over every bit.
Pure Rust parallel bzip2 decompressor. No C dependencies. Usable as a library or as a CLI tool:
# Decompress any bzip2 file (including pbzip2 concatenated streams)
What makes this crate unique: 100% Rust (no C/FFI), in-process, zero-copy, with parallel block-boundary scanning. Scanning a chunk of n raw bytes (e.g. 200 MB) for block boundaries is split across N cores, so each core does O(n/N) work instead of one thread doing O(n). Each core scans only its own 200 MB / N slice for the 48-bit block magic, so with 12 cores each core walks ~17 MB rather than a single thread walking all 200 MB.
Part of the znippy group of software, designed for fast zero-copy integration with osm-katana, the parallel OSM-to-GeoParquet pipeline.
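The per-core scan described above can be sketched as follows. This is a minimal illustration that uses only `std::thread` rather than the crate's rayon pool; the names `scan_range` and `find_block_starts` are hypothetical and are not part of the lbzip2-rs API. Each worker scans its own bit range plus a 47-bit overlap, so a magic that straddles a slice boundary is still reported exactly once.

```rust
use std::thread;

/// 48-bit bzip2 block-header magic; it may start at any bit offset.
const BLOCK_MAGIC: u64 = 0x3141_5926_5359;
const MASK_48: u64 = (1 << 48) - 1;

/// Report every magic whose first bit lies in `[start_bit, end_bit)`.
/// (Hypothetical helper, not the crate's API.)
fn scan_range(data: &[u8], start_bit: usize, end_bit: usize) -> Vec<usize> {
    let total_bits = data.len() * 8;
    // Read up to 47 bits past the slice so a straddling magic is not missed.
    let stop = (end_bit + 47).min(total_bits);
    let mut hits = Vec::new();
    let mut window: u64 = 0;
    for bit in start_bit..stop {
        let b = (data[bit / 8] >> (7 - bit % 8)) & 1; // MSB-first bit order
        window = ((window << 1) | u64::from(b)) & MASK_48;
        // Only compare once 48 bits from this slice have been shifted in.
        if bit + 1 >= start_bit + 48 && window == BLOCK_MAGIC {
            hits.push(bit + 1 - 48);
        }
    }
    hits
}

/// Split the chunk into `workers` bit ranges and scan them concurrently.
/// Returns candidate block starts (bit offsets) in ascending order.
fn find_block_starts(data: &[u8], workers: usize) -> Vec<usize> {
    let total_bits = data.len() * 8;
    let workers = workers.max(1);
    let per_worker = (total_bits + workers - 1) / workers;
    let mut all = Vec::new();
    thread::scope(|s| {
        let handles: Vec<_> = (0..workers)
            .map(|i| {
                let start = (i * per_worker).min(total_bits);
                let end = ((i + 1) * per_worker).min(total_bits);
                s.spawn(move || scan_range(data, start, end))
            })
            .collect();
        // Joining in spawn order keeps the offsets sorted.
        for handle in handles {
            all.extend(handle.join().expect("scanner thread panicked"));
        }
    });
    all
}
```

The offsets returned are only candidates: the 48-bit pattern can occur by chance inside compressed data, so a decoder would still need to validate each one, for example by decoding the block and checking its CRC.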
Why
- In-process: no pipe, no process spawn. Decompressed segments go straight into the caller's memory.
- Shared thread pool: the rayon pool is shared with the host application (e.g. VTD XML parse + PBF encode). No thread contention.
- Zero dependency on C libbz2: builds anywhere rustc does.
Performance
| Test file | C lbzip2 | lbzip2-rs | Gap |
|---|---|---|---|
| Planet 1 GB slice (→ ~10 GB) | 40.6 s | 42.4 s | 4% slower |
| Liechtenstein 3 MB (→ 60 MB) | 0.15 s | 0.22 s | startup |

Within about 4% of C lbzip2 on real workloads (8-core / 16-thread machine).
Handles pbzip2 concatenated streams natively.
Current state: the block-level decompression (Huffman → MTF → inverse BWT → RLE) is heavily inspired by (AI-cloned from) Julian Seward's original C bzip2 library. This crate therefore includes the bzip2 license (BSD-style) alongside MIT / Apache-2.0.
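To give a feel for one stage of that pipeline, here is a minimal sketch of the inverse BWT step. It follows the same counting-sort idea as the C reference, but it is a simplified illustration rather than this crate's code: the function name `inverse_bwt` is hypothetical, and the reference implementation packs the byte and the link into a single `u32` array instead of using a separate `next` vector.

```rust
/// Invert a Burrows-Wheeler transform.
/// `last` is the last column of the sorted rotation matrix and `orig_ptr`
/// is the row index of the original string (bzip2 stores it per block).
/// Hypothetical helper for illustration only.
fn inverse_bwt(last: &[u8], orig_ptr: usize) -> Option<Vec<u8>> {
    let n = last.len();
    if n == 0 {
        return Some(Vec::new());
    }
    if orig_ptr >= n {
        return None; // corrupt block: pointer outside the block
    }

    // Counting sort: starts[b] = first row whose first byte is b.
    let mut starts = [0usize; 256];
    for &b in last {
        starts[b as usize] += 1;
    }
    let mut sum = 0;
    for slot in starts.iter_mut() {
        let count = *slot;
        *slot = sum;
        sum += count;
    }

    // next[row] links a row of the first column to the matching
    // occurrence of the same byte in the last column.
    let mut next = vec![0usize; n];
    for (i, &b) in last.iter().enumerate() {
        next[starts[b as usize]] = i;
        starts[b as usize] += 1;
    }

    // Follow the links from the original row to emit bytes in order.
    let mut out = Vec::with_capacity(n);
    let mut pos = next[orig_ptr];
    for _ in 0..n {
        out.push(last[pos]);
        pos = next[pos];
    }
    Some(out)
}
```

For the rotations of `banana`, the sorted last column is `nnbaaa` and the original string sits in row 3, so `inverse_bwt(b"nnbaaa", 3)` yields `banana`.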
Usage
use ChunkDecoder;
let data: &[u8] = /* compressed chunk including BZhN header */;
let decoder = ChunkDecoder::from_header(data)?;
// Returns segments separately, avoiding one giant memcpy
let segments = decoder.decode_chunk_segments(data)?;
for seg in &segments {
    // consume each decompressed segment here
}
Single-stream sequential API also available:
let output = decompress(data)?;
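For readers unfamiliar with the `BZhN` header mentioned in the usage example: a bzip2 stream starts with the bytes `BZh` followed by an ASCII digit `1` to `9`, which gives the nominal block size in units of 100,000 bytes. The sketch below shows that parse in isolation. `parse_stream_header` is a hypothetical helper for illustration and is not claimed to be what `ChunkDecoder::from_header` does internally.

```rust
/// Parse the 4-byte bzip2 stream header and return the nominal block
/// size in bytes (N * 100_000), or None if the header is not valid.
/// Hypothetical helper, not part of the lbzip2-rs API.
fn parse_stream_header(data: &[u8]) -> Option<u32> {
    if !data.starts_with(b"BZh") {
        return None;
    }
    // Fourth byte is the compression level as an ASCII digit '1'..='9'.
    let level = *data.get(3)?;
    if !(b'1'..=b'9').contains(&level) {
        return None;
    }
    Some(u32::from(level - b'0') * 100_000)
}
```

A decoder can use this value as an upper bound when sizing per-block buffers, which is presumably the role of the `max_blocksize` parameter in the backlog proposal.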
Backlog
Questions and wishes for the bzip2-rs crate author. These API changes
would have made parallel decode possible without reimplementing the decoder:
1. pub fn decode_block(data: &[u8], bit_offset: usize, max_blocksize: u32)
       -> Result<(Vec<u8>, usize), Error>
   → Expose single-block decode from arbitrary bit offset.
   → Return (decompressed_bytes, bits_consumed).
2. Zero-copy input: accept &[u8] + bit_offset, not impl Write.
   → For mmap / ring-buffer use cases, borrowing is essential.
3. Expose block boundary scanning or document the 48-bit bit-aligned
   magic (0x314159265359) so callers can split the stream themselves.
4. Optional: fn decode_block_into(data: &[u8], bit_offset: usize,
       out: &mut [u8]) -> Result<usize, Error>
   → Write directly into caller-provided buffer.
Without (1) and (2), parallel decode requires reimplementing the full Huffman → MTF → BWT → RLE pipeline from scratch (which is what this crate does).
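The "borrowed `&[u8]` plus bit offset" shape requested in (1) and (2) can be illustrated with a small bit cursor. This is a sketch under assumptions, not code from this crate or from bzip2-rs: `BitCursor`, `BlockHeader` and `read_block_header` are hypothetical names. The field layout it reads (48-bit magic, 32-bit block CRC, 1 randomised bit, 24-bit original pointer) is the standard bzip2 block header.

```rust
const BLOCK_MAGIC: u64 = 0x3141_5926_5359;

/// Borrowing, MSB-first bit reader that can start at any bit offset.
struct BitCursor<'a> {
    data: &'a [u8],
    pos: usize, // absolute bit position within `data`
}

impl<'a> BitCursor<'a> {
    fn new(data: &'a [u8], bit_offset: usize) -> Self {
        Self { data, pos: bit_offset }
    }

    /// Read `count` (at most 64) bits, or None if the input runs out.
    fn read(&mut self, count: u32) -> Option<u64> {
        if count > 64 {
            return None;
        }
        let end = self.pos.checked_add(count as usize)?;
        if end > self.data.len().checked_mul(8)? {
            return None;
        }
        let mut value = 0u64;
        for bit in self.pos..end {
            let b = (self.data[bit / 8] >> (7 - bit % 8)) & 1;
            value = (value << 1) | u64::from(b);
        }
        self.pos = end;
        Some(value)
    }
}

/// Fixed-size fields at the start of every bzip2 block.
struct BlockHeader {
    crc: u32,
    randomised: bool,
    orig_ptr: u32,
    bits_consumed: usize,
}

/// Parse a block header at an arbitrary bit offset without copying input.
fn read_block_header(data: &[u8], bit_offset: usize) -> Option<BlockHeader> {
    let mut cur = BitCursor::new(data, bit_offset);
    if cur.read(48)? != BLOCK_MAGIC {
        return None;
    }
    let crc = cur.read(32)? as u32;
    let randomised = cur.read(1)? == 1;
    let orig_ptr = cur.read(24)? as u32;
    Some(BlockHeader {
        crc,
        randomised,
        orig_ptr,
        bits_consumed: cur.pos - bit_offset,
    })
}
```

Returning `bits_consumed` alongside the parsed data mirrors the `(decompressed_bytes, bits_consumed)` return value proposed in (1): the caller can resume at `bit_offset + bits_consumed` without the decoder owning the input.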
License
MIT OR Apache-2.0, plus the original bzip2 license (BSD-style, Julian Seward) for the block-decode routines derived from the C reference implementation. See LICENSE-BZIP2 for the full text.