lz4r
Pure-Rust port of the LZ4 compression library (v1.10.0), providing the full LZ4 C API surface — block compression, high-compression mode, streaming frame format, and file I/O — with no C dependencies.
Compressed output is bit-for-bit identical to the reference C implementation across all compression modes and levels.
Related Projects
| Project | Description |
|---|---|
| lz4/lz4 | Original C implementation — the upstream reference this crate ports |
| inikep/lzbench | In-memory benchmark harness used for apples-to-apples throughput comparison |
| jafreck/AAMF | Automated Architecture Migration Framework — the AI-assisted toolchain that performed the primary migration work |
Features
- Block API — one-shot
compress_default,compress_fast,decompress_safe, and partial decompression - High-Compression (HC) —
compress_hcwith configurable compression levels 1–12 - Frame API —
LZ4F-prefixed streaming compress/decompress with content checksums, dictionary support, and auto-flush - File I/O —
Lz4ReadFile/Lz4WriteFilewrappers forstd::io::{Read, Write} - C ABI shim — optional
c-abifeature exportsLZ4_compress_default,LZ4_compress_fast,LZ4_decompress_safe, andLZ4_compress_HCas astaticlibfor drop-in use with C consumers (e.g. lzbench) - Multi-threaded I/O — optional
multithreadfeature mirrors theLZ4IO_MULTITHREADpath from the C programs
Usage
[]
= "1.10.0"
Block compression
use ;
let input = b"hello world hello world hello world";
let bound = compress_bound as usize;
let mut compressed = vec!;
let compressed_size = compress_default.unwrap;
compressed.truncate;
let mut output = vec!;
let n = decompress_safe.unwrap;
assert_eq!;
High-compression block
use compress_hc;
let input = b"highly compressible text content …";
let bound = compress_bound as usize;
let mut compressed = vec!;
// Levels 1–12; 9 is a good balance of ratio vs speed
let n = compress_hc.unwrap;
Frame streaming
use ;
let prefs = default;
let mut ctx = new?;
// … write chunks via ctx.compress_update(…)
Building
# Debug build
# Optimised release build
# With multi-threaded I/O support
# As a C-compatible static library (for lzbench integration)
RUSTFLAGS="-C panic=abort"
# → target/release/liblz4.a
Testing
All 856 tests across 23 integration test suites and 2 doc-tests are expected to pass.
# Fuzz targets (requires cargo-fuzz + nightly)
Benchmarks
In-crate microbenchmarks (Criterion)
Results are written to target/criterion/. HTML reports are available at
target/criterion/report/index.html.
Apples-to-apples: lzbench (Rust vs C)
The definitive throughput comparison uses the lzbench harness to run both the C reference and this Rust port through the identical timing loop, eliminating harness artefacts.
Methodology
liblz4.a is built with --features c-abi and linked into a patched lzbench binary
(lzbench-rust) that replaces lz4.o / lz4hc.o with the Rust archive. The four
C-ABI symbols (LZ4_compress_default, LZ4_compress_fast, LZ4_decompress_safe,
LZ4_compress_HC) are exported as #[no_mangle] pub unsafe extern "C" shims that
forward to the native Rust block and HC APIs.
Environment: lzbench 2.2.1 | Clang 17 | 64-bit macOS (Apple Silicon) | Silesia corpus
Compression throughput summary (MB/s) — lz4 default
| File | C | Rust | Δ |
|---|---|---|---|
| webster | 613 | 569 | −7% |
| mozilla | 861 | 803 | −7% |
| mr | 865 | 815 | −6% |
| dickens | 534 | 509 | −5% |
| x-ray | 2706 | 2313 | −15% |
| ooffice | 814 | 729 | −10% |
| xml | 1131 | 1117 | −1% |
Typical gap: 0–15% slower on compression, 0–27% slower on decompression.
At lz4hc -1 many files are within measurement noise (0–2%).
Full per-file tables for all five codec variants (
lz4,lz4hc -1/-4/-8/-12), correctness verification (byte-for-byte size identity across all 12 Silesia files), and lzbench integration instructions are in docs/benchmark-results.md.
Migration Methodology
This crate was ported from LZ4 v1.10.0 (≈13,000 lines across 11 C source / header files) using AAMF (Automated Architecture Migration Framework), an AI-assisted toolchain powered by Claude Sonnet 4.6.
Strategy
Migration followed a bottom-up, dependency-ordered approach across 21 tasks in 6 serial phases so that each module could be built and tested in isolation before any downstream consumer depended on it:
lz4.c (block) → xxhash (crate) → lz4hc.c → lz4frame.c → lz4file.c → lib.rs
Source → target file map (abbreviated)
| C source | Rust target |
|---|---|
lz4.c / lz4.h |
src/block/{types,compress,stream,decompress_core,decompress_api}.rs |
lz4hc.c / lz4hc.h |
src/hc/{types,encode,lz4mid,search,compress_hc,dispatch,api}.rs |
lz4frame.c / headers |
src/frame/{types,header,cdict,compress,decompress}.rs |
lz4file.c / lz4file.h |
src/file.rs |
xxhash.c / xxhash.h |
xxhash-rust crate (intentional substitution) |
| All public headers | src/lib.rs |
Key decisions
xxhashreplaced by crate —xxhash.c(~1,000 lines of endianness / SIMD#ifdefchains) was substituted withxxhash-rust = "0.8", verified to produce identical wire output.- Large C files split into focused modules —
lz4.c(2,829 lines) andlz4hc.c(2,192 lines) were decomposed into Rust modules of 150–400 lines each. unsafeconfined to hot paths — pointer arithmetic in the decompressor core and C-ABI shims is kept in dedicated files; the rest of the API surface is safe.malloc/free→ RAII — all state objects becomeBox<T>withDropimpls; no explicit memory management at call sites.
See docs/migration-summary.md and docs/decision-log.md for the complete record.
Documentation
| Document | Description |
|---|---|
| docs/architecture-guide.md | Module structure, layer diagram, and design rationale |
| docs/api-reference.md | Public API surface — functions, types, and error codes |
| docs/developer-guide.md | Build, test, lint, and contribution workflow |
| docs/benchmark-results.md | Full lzbench Rust-vs-C throughput tables and methodology |
| docs/migration-summary.md | Source-to-target file map and pattern mapping reference |
| docs/decision-log.md | Architectural decisions with rationale and rejected alternatives |
| docs/known-issues.md | Known warnings, limitations, and open items |
License
GPL-2.0-only — same as the upstream LZ4 programs. The LZ4 library itself (which this crate ports) is BSD-2-Clause.