minlz
A high-performance Rust implementation of the S2 compression format, providing binary compatibility with the Go implementation at github.com/klauspost/compress/s2.
Features
- Binary Compatible: Produces output 100% compatible with the Go S2 implementation
- High Performance: 5–8× faster encoding in Standard and Better modes (16× in Best) and 60–270× faster decoding than the Go reference; see BENCHMARKS.md
- Multiple Compression Levels: Standard, Better, and Best modes
- Stateful Encoder: Encoder struct that reuses hash-table buffers across calls for hot-loop workloads
- Stream Format: Full Reader/Writer support with CRC32 validation
- Block Format: Simple block-based compression for known-size data
- Command-Line Tools: Full-featured s2c and s2d tools compatible with Go implementation
- Dictionary Compression: Full support for dictionary-based compression
- Concurrent Compression: Optional parallel compression with Rayon
- Index Support: Seeking within compressed streams
- Mostly Safe Rust: A few well-documented unsafe blocks in hot paths (uninitialised Vec allocation); covered by 86 unit, 10 proptest, and integration tests
S2 Format
S2 is an extension of the Snappy compression format that provides:
- Better compression ratios than Snappy
- Faster decompression than Snappy
- Extended copy operations for better compression
- Repeat offset optimization (S2 extension)
- Compatible with Snappy-compressed data (for decompression)
Note: S2-compressed data cannot be decompressed by Snappy decoders.
More Information: S2 Design & Improvements - Overview of S2's design and improvements
Installation
Add this to your Cargo.toml:
[dependencies]
minlz = "1"
Optional Features
Enable concurrent compression for improved performance on multi-core systems:
[dependencies]
minlz = { version = "1", features = ["concurrent"] }
Usage
Block Format (Simple Compression)
use ;
Stream Format (With CRC Validation)
use ;
use ;
Multiple Compression Levels
use minlz::{encode, encode_best, encode_better};
let data = b"Some data to compress...";
// Fast compression (default)
let compressed = encode(data);
// Better compression (slower)
let compressed_better = encode_better(data);
// Best compression (slowest)
let compressed_best = encode_best(data);
Buffer Reuse with Encoder
For hot loops compressing many small/medium blocks, the stateful
Encoder keeps its internal hash tables across calls — eliminating
the per-call allocation cost. Output is bit-for-bit identical to the
corresponding free function.
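use minlz::Encoder;
let inputs: &[u8] = b"";
let mut enc = Encoder::new();
let mut outputs: Vec<Vec<u8>> = Vec::new();
for chunk in inputs.chunks(/* chunk size */) {
    // compress each chunk with enc and push the result onto outputs
}
let _ = outputs;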
Buffer reuse is up to 30% faster than the free function for 1 KB inputs
with encode_better, and matches free-function performance for larger inputs.
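To illustrate the buffer-reuse idea in isolation, here is a minimal sketch using only the standard library. It is not minlz's implementation: ReusableMatcher, count_matches_fresh, and count_with_table are hypothetical names, and the "work" is a toy that counts repeated 4-byte sequences. The point is only that a table owned by a struct can be reset in place instead of being allocated on every call, while producing the same result.

```rust
const TABLE_BITS: u32 = 12;

/// Hypothetical stand-in for a stateful encoder: it owns a hash table that
/// survives across calls instead of being allocated per call.
struct ReusableMatcher {
    /// For each hash bucket: position + 1 of the last 4-byte window seen (0 = empty).
    table: Vec<u32>,
}

impl ReusableMatcher {
    fn new() -> Self {
        Self { table: vec![0; 1 << TABLE_BITS] }
    }

    /// Counts 4-byte windows that repeat the previous window in the same bucket.
    fn count_matches(&mut self, input: &[u8]) -> usize {
        self.table.fill(0); // reset in place, no new allocation
        count_with_table(&mut self.table, input)
    }
}

/// Free-function equivalent: allocates a fresh table on every call.
fn count_matches_fresh(input: &[u8]) -> usize {
    let mut table = vec![0u32; 1 << TABLE_BITS];
    count_with_table(&mut table, input)
}

/// Shared toy logic so both paths give identical results.
fn count_with_table(table: &mut [u32], input: &[u8]) -> usize {
    let mut matches = 0;
    for (pos, w) in input.windows(4).enumerate() {
        let word = u32::from_le_bytes([w[0], w[1], w[2], w[3]]);
        // Multiplicative hash into TABLE_BITS bits (constant chosen for illustration).
        let slot = (word.wrapping_mul(0x1E35_A7BD) >> (32 - TABLE_BITS)) as usize;
        let prev = table[slot] as usize;
        if prev != 0 && &input[prev - 1..prev + 3] == w {
            matches += 1;
        }
        table[slot] = (pos + 1) as u32;
    }
    matches
}
```

Calling count_matches repeatedly on one ReusableMatcher returns the same counts as count_matches_fresh, which mirrors the claim that the reused Encoder's output is identical to the free function's.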
Concurrent Compression (Optional Feature)
Enable the concurrent feature for parallel compression on multi-core systems:
use minlz::ConcurrentWriter;
use std::io::Write;
let mut compressed = Vec::new();
Dictionary Compression
Dictionaries can improve compression of similar data by pre-seeding the compressor with common patterns:
use minlz::{decode_with_dict, encode_with_dict, make_dict};
// Create a dictionary from sample data
let samples = b"Common patterns that appear frequently in your data...";
let dict = make_dict(samples /* see the crate docs for any further arguments */).unwrap();
// Encode with dictionary
let data = b"Data to compress...";
let compressed = encode_with_dict(data, &dict);
// Decode with dictionary
let decompressed = decode_with_dict(&compressed, &dict)?;
assert_eq!(&data[..], &decompressed[..]);
// Serialize dictionary for storage/transmission
let dict_bytes = dict.to_bytes();
Command-Line Tools
The minlz-tools package provides s2c (compression) and s2d (decompression) command-line tools that are fully compatible with the Go s2 tools.
# Install from source
# Compress a file
# Decompress a file
The tools are cross-compatible with Go's s2c/s2d and run 12–98× faster than them, depending on the operation.
See minlz-tools/README.md for complete documentation.
Performance
Benchmark Results (Intel i9-14900K, rustc 1.95, target-cpu=native)
Encoding Performance
| Mode | Data Size | Pattern | Rust | Go | Speedup |
|---|---|---|---|---|---|
| Standard | 1KB | Random | 4.51 GiB/s | 734 MB/s | 6.5× |
| Standard | 10KB | Random | 8.13 GiB/s | 1280 MB/s | 6.8× |
| Standard | 100KB | Text | 9.63 GiB/s | 1545 MB/s | 6.7× |
| Better | 10KB | Repeated | 10.91 GiB/s | 1430 MB/s | 8.2× |
| Better | 10KB | Text | 10.73 GiB/s | 2232 MB/s | 5.2× |
| Best | 10KB | Repeated | 106.9 MiB/s | 7 MB/s | 16× |
| Best | 10KB | Text | 109.6 MiB/s | 7 MB/s | 16× |
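The Rust and Go columns use different units, so the Speedup column is not a plain ratio of the printed numbers. The sketch below shows the conversion, assuming GiB means 2^30 bytes and MB means 10^6 bytes; that assumption reproduces most rows to within rounding, but it is an inference from the figures rather than something the benchmark notes state. The function name speedup is illustrative.

```rust
const GIB: f64 = 1_073_741_824.0; // 2^30 bytes (assumed meaning of GiB)
const MB: f64 = 1_000_000.0; // 10^6 bytes (assumed meaning of MB)

/// Ratio of a throughput given in GiB/s to a throughput given in MB/s.
fn speedup(rust_gib_per_s: f64, go_mb_per_s: f64) -> f64 {
    (rust_gib_per_s * GIB) / (go_mb_per_s * MB)
}
```

For example, speedup(8.13, 1280.0) is roughly 6.8, matching the Standard 10KB Random row. Rows reported in MiB/s would need the analogous 2^20 factor.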
Decoding Performance
| Data Size | Pattern | Rust | Go | Speedup |
|---|---|---|---|---|
| 1KB | Random | 40.5 GiB/s | 672 MB/s | 65× |
| 10KB | Random | 110.2 GiB/s | 538 MB/s | 220× |
| 10KB | Repeated | 134.9 GiB/s | 537 MB/s | 270× |
| 10KB | Text | 94.1 GiB/s | 509 MB/s | 198× |
| 100KB | Random | 70.1 GiB/s | 654 MB/s | 115× |
| 100KB | Repeated | 79.5 GiB/s | 685 MB/s | 125× |
Key Takeaways:
- Decode-heavy workloads: Rust is 59–270× faster.
- All encode modes: Faster than Go on every measured case (5–8× standard/better, 16× best).
- Binary-identical encoder output: encode, encode_better, and encode_best all produce byte-for-byte identical output to Go's s2.Encode, s2.EncodeBetter, and s2.EncodeBest. Verified by dedicated compat tests (tests/go_compatibility.rs, tests/better_compatibility.rs, tests/best_compatibility.rs).
See BENCHMARKS.md for the full table, per-version
changelog of optimisations, and reused-Encoder numbers.
Binary Compatibility
This implementation is binary compatible with the Go version in both directions:
- Decode: any S2 (or Snappy) stream produced by Go is accepted byte-for-byte.
- Encode: every encode mode (encode, encode_better, encode_best, encode_snappy) produces byte-for-byte identical output to the corresponding Go function on the test inputs in tests/go_compatibility.rs, tests/better_compatibility.rs, tests/best_compatibility.rs, and tests/snappy_compat.rs.
You can therefore compress data with this Rust library and decompress it with the Go library, and vice versa.
Example: Interoperability with Go
Rust side:
use minlz::encode;
use std::fs::File;
use std::io::Write;
let data = b"Hello from Rust!";
let compressed = encode(data);
let mut file = File::create("compressed.s2")?; // file name is illustrative
file.write_all(&compressed)?;
Go side:
package main
import (
"os"
"github.com/klauspost/compress/s2"
)
func main()
Examples
Run the included examples:
# Basic compression example
# Debug/testing example
Block vs Stream Format
This library implements both formats:
Block Format
Suitable for:
- Data of known size
- In-memory compression
- Simple use cases
- Maximum compression speed
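One reason the block format suits data of known size: in the Snappy block layout that S2 extends, a block begins with the uncompressed length stored as a little-endian base-128 varint, so a decoder can size its output buffer up front. The sketch below shows that prefix encoding only. The function names are illustrative and are not minlz APIs.

```rust
/// Appends `value` to `out` as a little-endian base-128 varint.
fn encode_uvarint(mut value: u64, out: &mut Vec<u8>) {
    while value >= 0x80 {
        out.push((value as u8 & 0x7F) | 0x80); // low 7 bits, continuation bit set
        value >>= 7;
    }
    out.push(value as u8); // final byte, continuation bit clear
}

/// Reads a varint from the start of `buf`.
/// Returns the value and the number of bytes consumed, or None if the
/// input ends before a terminating byte (or runs past 10 bytes).
/// Bits beyond 64 in the tenth byte are silently dropped in this sketch.
fn decode_uvarint(buf: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &b) in buf.iter().enumerate().take(10) {
        value |= u64::from(b & 0x7F) << (7 * i);
        if b & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```

A length of 64 encodes as the single byte 0x40, and 2097150 encodes as 0xFE 0xFF 0x7F, which are the examples given in the Snappy format description.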
Stream Format
Includes:
- ✓ CRC32 validation (Castagnoli polynomial)
- ✓ Chunk framing with magic headers
- ✓ Full streaming support via Reader/Writer
- ✓ Incremental reading/writing
- ✓ Compatible with Go s2.Reader/Writer
Use stream format for file I/O, network streaming, or when you need data integrity validation.
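For readers unfamiliar with the Castagnoli variant of CRC32, here is a minimal, unoptimised sketch of the checksum. It is not minlz's code, and the names crc32c and mask_crc are illustrative. The masking step is the one defined by the Snappy framing format, which the S2 stream format builds on; real implementations normally use table-driven or hardware-accelerated CRC instead of this bit-by-bit loop.

```rust
/// Bitwise CRC-32C (Castagnoli), using the reflected polynomial 0x82F63B78.
fn crc32c(data: &[u8]) -> u32 {
    let mut crc = !0u32; // initial value: all ones
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // mask is all ones when the low bit is set, otherwise zero
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0x82F6_3B78 & mask);
        }
    }
    !crc // final XOR with all ones
}

/// Masking applied to stored checksums in the Snappy framing format:
/// rotate right by 15 bits, then add a constant (wrapping).
fn mask_crc(crc: u32) -> u32 {
    crc.rotate_right(15).wrapping_add(0xA282_EAD8)
}
```

The standard check input b"123456789" yields 0xE3069283 from crc32c, which is a quick way to confirm an implementation uses the Castagnoli polynomial and not the more common IEEE one.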
Testing
This implementation includes comprehensive testing infrastructure:
Run Tests
# Unit and integration tests
cargo test
# Property-based tests (proptest), stress with 2000 cases each
PROPTEST_CASES=2000 cargo test --test proptest
# Benchmarks
RUSTFLAGS="-C target-cpu=native" cargo bench
# Fuzz testing
cargo fuzz run fuzz_roundtrip
Test Coverage
- 86 unit tests in src/: core functionality, edge cases, encoder regressions
- 10 property-based tests (tests/proptest.rs): roundtrip for every compression level, stream format, decoder robustness, empty/all-same-byte edges
- Go binary-compat integration tests: tests/go_compatibility.rs, tests/better_compatibility.rs, tests/best_compatibility.rs
- Snappy round-trip tests: tests/snappy_compat.rs
- 3 libfuzzer targets: fuzz_roundtrip, fuzz_decode, fuzz_stream
- Concurrent compression tests (with concurrent feature)
- Benchmark suite: encode/decode/roundtrip + Encoder-reuse group
License
BSD-3-Clause
References
- S2 Design & Improvements - Overview of S2's design and improvements over Snappy
- Go S2 Implementation - Reference implementation
- Snappy Format Specification - Base Snappy format
Contributing
Contributions are welcome! Please ensure:
- All tests pass (cargo test)
- Code is formatted (cargo fmt)
- No clippy warnings (cargo clippy)
- Binary compatibility with Go implementation is maintained
The current implementation passes all unit, integration, proptest, and
compatibility tests, is formatted with rustfmt, and has zero clippy
warnings under -D warnings.