pklib
A pure Rust implementation of the PKWare Data Compression Library (DCL) format from the 1980s DOS era. It provides high-performance compression and decompression compatible with the original PKLib by Ladislav Zezula.
Overview
PKLib implements the PKWare DCL format used in archive formats such as MPQ and in other legacy applications. It provides both compression ("implode") and decompression ("explode") and is fully compatible with the original PKLib implementation. The format combines Huffman coding with sliding-dictionary compression and was covered by U.S. Patent No. 5,051,745, which has since expired.
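To make the sliding-dictionary idea concrete, here is a minimal sketch of how a decoder expands literals and back-references once the bit stream has been decoded into tokens. The `Token` type and `expand` function are hypothetical names for this illustration and are not part of the pklib API; a real DCL decoder also bounds the distance by the dictionary size.

```rust
/// One decoded unit of a sliding-dictionary stream (hypothetical type for this sketch).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Token {
    /// A single byte emitted as-is.
    Literal(u8),
    /// Copy `length` bytes starting `distance` bytes back from the end of the output.
    Match { distance: usize, length: usize },
}

/// Expands tokens into bytes.
/// Returns `None` if a match has distance 0 or points before the start of the output.
pub fn expand(tokens: &[Token]) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    for &token in tokens {
        match token {
            Token::Literal(byte) => out.push(byte),
            Token::Match { distance, length } => {
                if distance == 0 || distance > out.len() {
                    return None;
                }
                let start = out.len() - distance;
                // Copy byte by byte: the source range may overlap the bytes being
                // written, which is how a short distance repeats a pattern many times.
                for i in 0..length {
                    let byte = out[start + i];
                    out.push(byte);
                }
            }
        }
    }
    Some(out)
}
```

Two literals `a`, `b` followed by a match with distance 2 and length 4 expand to `ababab`, because the copy reads bytes it has just written.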
Key Features
- 🔄 Full PKLib Compatibility - Bit-for-bit compatible with the original PKLib
- 🚀 High Performance - Optimized Rust implementation with zero-copy where possible
- 🛡️ Memory Safe - Written in safe Rust with comprehensive error handling
- 📦 Multiple Formats - Support for Binary and ASCII compression modes
- 🎯 Flexible Dictionary Sizes - 1KB, 2KB, and 4KB dictionary support
- 📏 Extended Length Support - Maximum repetition length of 516 bytes
- 🔌 Streaming API - Implements standard Read/Write traits
- 📚 Well Documented - Comprehensive documentation and examples
Quick Start
Add to your Cargo.toml:
[dependencies]
pklib = "0.2"
Basic Usage
use ;
// Decompress PKLib-compressed data
let compressed_data = read?;
let decompressed = explode_bytes?;
// Compress data using PKLib format
let data = b"Hello, World! This is a test of the PKLib compression.";
let compressed = implode_bytes?;
Streaming API
use ;
use ;
// Decompress using streaming API
let compressed_data = read?;
let mut decompressor = new?;
let mut decompressed = Vec::new();
decompressor.read_to_end(&mut decompressed)?;
// Compress using streaming API
let mut output = Vec::new();
let mut compressor = new?;
compressor.write_all?;
let compressed_output = compressor.finish?;
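Because the streaming types implement the standard Read and Write traits, they can be handed to any code written against those traits. The sketch below is a generic helper that moves bytes from any reader to any writer through a fixed-size buffer. The name `pump` is hypothetical, and the helper uses only the standard library; no pklib types are assumed here.

```rust
use std::io::{self, Read, Write};

/// Moves every byte from `source` into `sink` through a fixed 4 KB buffer
/// and returns the number of bytes moved. Works with any `Read`/`Write` pair,
/// so a streaming decompressor could be passed as `source` without buffering
/// the whole file in memory.
pub fn pump<R: Read, W: Write>(mut source: R, mut sink: W) -> io::Result<u64> {
    let mut buffer = [0u8; 4096];
    let mut total = 0u64;
    loop {
        let n = match source.read(&mut buffer) {
            Ok(0) => break, // end of stream
            Ok(n) => n,
            // Retry if the read was interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        sink.write_all(&buffer[..n])?;
        total += n as u64;
    }
    sink.flush()?;
    Ok(total)
}
```

For whole-stream copies without custom logic, `std::io::copy` does the same job; the explicit loop is shown only to make the chunked flow visible.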
Command Line Interface
pklib includes a CLI tool called blast-cli for compressing, decompressing, and inspecting files:
Installation
# Install from source
# Or run directly
CLI Usage
Compress a file
# Basic compression with ASCII mode (good for text)
# Binary mode with 4KB dictionary (good for binary data)
# Force overwrite existing files
Decompress a file
# Basic decompression
# With verbose output
Analyze compressed files
# Get information about a compressed file
# Verbose output shows additional details
CLI Options
- --mode: Choose binary (default) or ascii compression mode
- --dict-size: Dictionary size - size1-k, size2-k (default), or size4-k
- --force: Overwrite existing output files
- --verbose: Show detailed progress and statistics
- --quiet: Suppress non-error output
Compression Modes
PKLib supports two compression modes optimized for different data types:
- Binary Mode - Optimized for binary data (executables, images, etc.)
- ASCII Mode - Optimized for text data, where it typically achieves better compression ratios than Binary mode
Dictionary Sizes
Choose the dictionary size based on your data characteristics:
- 1KB (1024 bytes) - Fastest compression, smaller memory usage
- 2KB (2048 bytes) - Balanced performance and compression ratio
- 4KB (4096 bytes) - Best compression ratio, higher memory usage
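The mode and dictionary size are recorded at the start of a compressed stream, which is what lets a decoder (or an info command) identify them. The sketch below assumes the commonly documented DCL layout: the first byte is 0 for binary or 1 for ASCII, and the second byte is 4, 5, or 6, giving a dictionary of `64 << n` bytes. This layout is not stated in this README, and the `Mode`, `Header`, and `parse_header` names are hypothetical, not pklib's API.

```rust
/// Compression mode as stored in the stream header (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Binary,
    Ascii,
}

/// Parsed two-byte header (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Header {
    pub mode: Mode,
    pub dictionary_bytes: usize,
}

/// Parses the first two bytes of a stream.
/// Assumed layout: byte 0 = mode (0 binary, 1 ASCII), byte 1 = dictionary bits (4..=6).
/// Returns `None` for short input or out-of-range values.
pub fn parse_header(data: &[u8]) -> Option<Header> {
    let mode = match *data.first()? {
        0 => Mode::Binary,
        1 => Mode::Ascii,
        _ => return None,
    };
    let dict_bits = *data.get(1)?;
    if !(4..=6).contains(&dict_bits) {
        return None;
    }
    // 64 << 4 = 1024, 64 << 5 = 2048, 64 << 6 = 4096.
    Some(Header {
        mode,
        dictionary_bytes: 64usize << dict_bits,
    })
}
```

Rejecting out-of-range header values up front gives a cheap sanity check before any bit-level decoding starts.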
Implementation Status
| Feature | Status | Notes |
|---|---|---|
| Core Infrastructure | ✅ Complete | Types, errors, constants |
| Static Lookup Tables | ✅ Complete | All PKLib tables ported |
| CRC32 Implementation | ✅ Complete | PKLib-compatible checksums |
| Decompression (Explode) | ✅ Complete | Full PKLib compatibility verified |
| Compression (Implode) | ✅ Complete | Full PKLib compatibility verified |
| Testing & Validation | ✅ Complete | Comprehensive test suite with property testing |
| CLI Tool | ✅ Complete | Command-line interface with compress/decompress/info commands |
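For readers unfamiliar with the CRC32 row, the following is a minimal bitwise sketch of the standard reflected CRC-32 (polynomial 0xEDB88320). It assumes PKLib-compatible checksums use this standard variant, which this README does not spell out, and it is not the crate's table-driven implementation.

```rust
/// Computes the standard reflected CRC-32 (polynomial 0xEDB88320) one bit at a time.
/// Assumption: this matches the "PKLib-compatible" checksum; verify against the crate.
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // `mask` is all ones when the low bit is set, otherwise zero.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}
```

The well-known check value for this CRC variant is `crc32(b"123456789") == 0xCBF43926`.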
Performance
PKLib is designed for high performance with several optimizations:
- Compile-time lookup table generation
- Efficient bit manipulation routines
- Zero-copy operations where possible
- Minimal memory allocations in hot paths
- Streaming API for processing large files without loading them fully into memory
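As an illustration of compile-time lookup table generation, the sketch below derives a length-base table from per-code extra-bit counts in a `const fn`, so the table is computed by the compiler and stored as static data. The extra-bit counts and the minimum length of 2 follow commonly documented DCL values and are assumptions here; the names are hypothetical and the crate's actual tables may be organized differently.

```rust
/// Extra bits read after each of the 16 length codes
/// (assumed values, as commonly documented for DCL).
pub const EXTRA_BITS: [u8; 16] = [0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 5, 6, 7, 8];

/// Builds the base length for each code at compile time.
/// Each code covers `1 << extra_bits` consecutive lengths, starting at 2.
const fn build_length_base() -> [u16; 16] {
    let mut base = [0u16; 16];
    let mut next = 2u16; // shortest match length (assumption)
    let mut i = 0;
    // `for` loops are not allowed in `const fn`, so use `while`.
    while i < 16 {
        base[i] = next;
        next += 1u16 << EXTRA_BITS[i];
        i += 1;
    }
    base
}

/// Evaluated by the compiler; no runtime initialization is needed.
pub const LENGTH_BASE: [u16; 16] = build_length_base();
```

A decoder would then compute a match length as `LENGTH_BASE[code]` plus the value of the extra bits read from the stream.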
Performance characteristics:
- Decompression: Fast single-pass algorithm with bit-level decoding
- Compression: Hash-based pattern matching with 4-tier length encoding
- Memory Usage: Configurable dictionary sizes (1KB-4KB) for different memory constraints
- Throughput: Competitive with the original PKLib C implementation
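Bit-level decoding depends on a reader that hands out a requested number of bits at a time. Below is a minimal sketch of a least-significant-bit-first reader, the bit order DCL streams are generally described as using. `BitReader` is a hypothetical type for illustration, not the crate's internal reader.

```rust
/// Reads bits from a byte slice, least significant bit first (hypothetical type).
pub struct BitReader<'a> {
    data: &'a [u8],
    pos: usize,     // index of the next unread byte
    bit_buf: u32,   // pending bits; bit 0 is the next bit to return
    bit_count: u32, // number of valid bits in `bit_buf`
}

impl<'a> BitReader<'a> {
    pub fn new(data: &'a [u8]) -> Self {
        Self { data, pos: 0, bit_buf: 0, bit_count: 0 }
    }

    /// Reads `count` bits (at most 16) and returns them as an integer.
    /// Returns `None` if the input ends before `count` bits are available.
    pub fn read_bits(&mut self, count: u32) -> Option<u32> {
        debug_assert!(count <= 16);
        // Refill one byte at a time until enough bits are buffered.
        while self.bit_count < count {
            let byte = *self.data.get(self.pos)?;
            self.pos += 1;
            self.bit_buf |= u32::from(byte) << self.bit_count;
            self.bit_count += 8;
        }
        let value = self.bit_buf & ((1u32 << count) - 1);
        self.bit_buf >>= count;
        self.bit_count -= count;
        Some(value)
    }
}
```

Keeping the pending bits in a single integer means each read is a mask and a shift, with a byte fetch only when the buffer runs low.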
Benchmarks
PKLib includes a comprehensive benchmark suite to measure performance across various scenarios:
# Run all benchmarks
# Run specific benchmark suite
# Save baseline for comparison
# Compare against baseline
The benchmark suite covers:
- Throughput: MB/s for different file sizes and data patterns
- Compression Ratios: Effectiveness across various data types
- Memory Usage: Peak allocation tracking with custom allocator
- Round-trip Performance: Complete compress/decompress cycles
- Concurrent Processing: Multi-threaded performance scaling
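For a rough idea of how a throughput figure is derived, the sketch below times a closure and converts bytes processed into MB/s (using 1 MB = 1024 × 1024 bytes). It is a standard-library stand-in with hypothetical function names, not the project's benchmark harness, and it lacks the warm-up and statistical handling a real suite provides.

```rust
use std::time::{Duration, Instant};

/// Converts a byte count and elapsed time into MB/s (1 MB = 1024 * 1024 bytes).
/// Returns infinity when the elapsed time is zero.
pub fn megabytes_per_second(bytes: usize, elapsed: Duration) -> f64 {
    let seconds = elapsed.as_secs_f64();
    if seconds == 0.0 {
        return f64::INFINITY;
    }
    bytes as f64 / (1024.0 * 1024.0) / seconds
}

/// Runs `work` `iterations` times and returns the mean time per iteration.
pub fn time_it<F: FnMut()>(mut work: F, iterations: u32) -> Duration {
    let start = Instant::now();
    for _ in 0..iterations {
        work();
    }
    // `max(1)` avoids dividing by zero when `iterations` is 0.
    start.elapsed() / iterations.max(1)
}
```

A caller would wrap one compress or decompress call in the closure, then pass the input size and the returned duration to `megabytes_per_second`.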
Compatibility
This implementation has been verified for compatibility across the following:
- ✅ Original PKLib by Ladislav Zezula (StormLib) - Full bit-for-bit compatibility verified
- ✅ PKWare DCL Format - Complete specification implementation with all edge cases
- ✅ Game Archives - Successfully processes files from MPQ archives used by games such as Diablo I
- ✅ Round-trip Testing - Compression/decompression cycles preserve data integrity
- ✅ All Compression Modes - Binary and ASCII modes with 1KB/2KB/4KB dictionaries
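A round-trip check of the kind listed above can be expressed as a small generic helper that takes the compress and decompress steps as closures. The helper name `round_trips` is hypothetical; in practice the closures would wrap the crate's implode and explode entry points with a chosen mode and dictionary size.

```rust
/// Compresses `input`, decompresses the result, and reports whether the
/// output matches the original bytes. Errors from either step are passed through.
pub fn round_trips<C, D, E>(input: &[u8], compress: C, decompress: D) -> Result<bool, E>
where
    C: Fn(&[u8]) -> Result<Vec<u8>, E>,
    D: Fn(&[u8]) -> Result<Vec<u8>, E>,
{
    let packed = compress(input)?;
    let unpacked = decompress(&packed)?;
    Ok(unpacked == input)
}
```

Passing the codec as closures keeps the check independent of any particular API, so the same helper can cover every mode and dictionary-size combination.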
Contributing
Contributions are welcome!
Development Setup
# Clone the repository
# Run tests
# Format code
# Run linter
# Run benchmarks
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Ladislav Zezula - Original PKLib implementation (reverse-engineered from Diablo I)
- PKWare Inc. - Original DCL format specification (Version 1.11, Patent No. 5,051,745)
- StormLib Project - Reference implementation and test cases
Status: All 4 implementation phases are complete! PKLib provides a fully functional, production-ready implementation of the PKWare DCL format with comprehensive testing and validation.