rust-par2
Pure Rust implementation of PAR2 (Parity Archive 2.0) verification and repair.
Parses PAR2 parity files, verifies data file integrity, and repairs damaged or missing files using Reed-Solomon error correction -- all without any C dependencies. Competitive with par2cmdline-turbo on repair performance, and faster on verification.
Installation
Add to your Cargo.toml:
[]
= "0.1"
Features
- Full PAR2 verify and repair -- detects damaged/missing files and reconstructs them from parity data
- Pure Rust -- no C/C++ dependencies, no FFI, no unsafe outside of SIMD intrinsics
- SIMD-accelerated -- AVX2 and SSSE3 Galois field multiplication with automatic runtime detection and scalar fallback
- Parallel -- rayon-based multithreading for verification (per-file) and repair (per-block)
- Streaming I/O -- double-buffered reader thread overlaps disk reads with GF compute during repair
- Low memory -- streams source blocks instead of loading entire files into memory; repair allocates only O(damaged_blocks * slice_size)
- Structured logging -- uses
tracingfor optional debug/trace output with zero overhead in release builds
Quick start
use Path;
let par2_path = new;
let dir = new;
// Parse the PAR2 index file
let file_set = parse.unwrap;
// Verify all files
let result = verify;
if result.all_correct else if result.repair_possible
Skipping redundant verification
If you already have a VerifyResult (e.g., from displaying status to the user),
pass it directly to repair_from_verify to avoid re-verifying:
# use Path;
# let par2_path = new;
# let dir = new;
# let file_set = parse.unwrap;
let result = verify;
if !result.all_correct && result.repair_possible
Performance
Benchmarked against par2cmdline-turbo on 1 GB random data with 3% damage, 8% redundancy, 768 KB block size (16-thread AMD/Intel system):
| Operation | rust-par2 | par2cmdline-turbo | Ratio |
|---|---|---|---|
| Verify (intact) | 1.4s | 1.3s | ~1.0x |
| Verify (damaged) | 2.8s | 4.1s | 1.5x faster |
| Repair | 3.4s | 3.7s | 1.1x faster |
Repair time uses repair_from_verify (typical workflow where verify has already
been called). The standalone repair() function includes an internal verify pass
and takes ~6.5s.
Design
PAR2 format
PAR2 files are a container of typed packets. This crate parses five packet types:
| Packet | Purpose |
|---|---|
| Main | Slice size, file count, recovery block count |
| FileDesc | Per-file: filename, size, MD5, 16 KiB hash |
| IFSC | Per-block checksums (MD5 + CRC32) for each file |
| RecoverySlice | Reed-Solomon recovery data + exponent |
| Creator | Software identification string |
Packets can appear in any order and across multiple .par2 files (an index file
plus .volNNN+NNN.par2 volume files). All multi-byte fields are little-endian.
Packet bodies are verified with MD5 checksums.
Galois field arithmetic
PAR2 mandates GF(2^16) with irreducible polynomial
x^16 + x^12 + x^3 + x + 1 (0x1100B) and generator element 2.
Scalar operations use log/antilog tables (two 65536-entry u16 arrays,
initialized once via OnceLock) giving O(1) multiply, divide, and inverse.
For bulk operations (the repair inner loop), the crate uses the PSHUFB nibble-decomposition technique:
- Decompose each 16-bit GF element into four 4-bit nibbles
- For each nibble position, precompute a 16-entry lookup table:
table[n] = constant * (n << shift)in GF(2^16) - Use VPSHUFB/PSHUFB as a parallel 16-way table lookup
- XOR the four partial products to get the final result
This gives 8 VPSHUFB lookups + 7 XORs per 32 bytes (16 GF elements) with AVX2, or the same at 16 bytes with SSSE3. Runtime feature detection selects the best available path, with a pure scalar fallback for non-x86 platforms.
The dispatch hierarchy:
| Path | Width | Throughput (single core) |
|---|---|---|
| AVX2 | 256-bit | ~15 GB/s |
| SSSE3 | 128-bit | ~8 GB/s |
| Scalar | 16-bit | ~0.3 GB/s |
Reed-Solomon decoding
Repair works by constructing and inverting a Vandermonde matrix over GF(2^16):
-
Encoding matrix construction: Rows 0..N are the identity (data blocks pass through). Rows N..N+K encode recovery blocks using the PAR2 input constants
c[i] = 2^n(where n skips multiples of 3, 5, 17, 257). -
Submatrix selection: Select rows corresponding to available data: identity rows for intact blocks, recovery rows for damaged positions.
-
Gaussian elimination: Invert the submatrix over GF(2^16). If the matrix is singular (linearly dependent recovery blocks), repair fails.
-
Block reconstruction: Multiply the inverse matrix by the available data (intact blocks + recovery data) using SIMD multiply-accumulate.
Repair architecture
The repair pipeline uses a source-major streaming design:
Reader thread Compute threads (rayon)
| |
| read block[0] ──sync_channel──> | dst[0..D] ^= coeff * block[0]
| read block[1] ──sync_channel──> | dst[0..D] ^= coeff * block[1]
| read block[2] ──sync_channel──> | dst[0..D] ^= coeff * block[2]
| ... | ...
v v
- A dedicated reader thread reads source blocks sequentially, reusing file handles (one open per file, not per block)
- A
sync_channel(2)provides double-buffering: the reader can be two blocks ahead of compute - For each source block, rayon applies the GF multiply-accumulate to all damaged output buffers in parallel
- The source block (~768 KB) stays hot in shared L3 cache across rayon threads
- Each output buffer (~768 KB) fits in L2 cache per thread
This design reduces memory usage from O(total_blocks * slice_size) to O(damaged_blocks * slice_size + 2 * slice_size), and overlaps disk I/O with computation.
Verification
Verification is parallelized per-file using rayon. For each file:
- Compare file size against expected
- Compute full-file MD5 using double-buffered reads (2 MB buffers)
- If MD5 mismatches, identify specific damaged blocks via per-slice CRC32 + MD5
The per-slice identification allows repair to target only damaged blocks rather than treating the entire file as lost.
Scope and limitations
- Verify and repair only -- PAR2 file creation is not supported. Use
par2cmdline or par2cmdline-turbo to create
.par2files. - x86_64 optimized -- SIMD acceleration requires AVX2 or SSSE3 (available on essentially all x86_64 CPUs from 2013+). Other architectures fall back to scalar arithmetic.
- Blocking I/O -- all operations are synchronous. For async contexts, wrap
calls in
spawn_blocking. - Single recovery set -- the crate assumes one recovery set per directory, which is the standard PAR2 usage pattern.
API reference
Core functions
| Function | Description |
|---|---|
parse(path) |
Parse a PAR2 index file, returns Par2FileSet |
parse_par2_reader(reader, size) |
Parse from any Read + Seek source |
verify(file_set, dir) |
Verify files on disk, returns VerifyResult |
repair(file_set, dir) |
Verify + repair in one call |
repair_from_verify(file_set, dir, verify_result) |
Repair with pre-computed verify |
compute_hash_16k(path) |
MD5 of first 16 KiB (for file identification) |
Key types
| Type | Description |
|---|---|
Par2FileSet |
Parsed PAR2 metadata: files, slice size, recovery set ID |
Par2File |
Per-file metadata: hash, size, filename, per-slice checksums |
VerifyResult |
Verification outcome: intact, damaged, and missing files |
RepairResult |
Repair outcome: success flag, blocks/files repaired |
DamagedFile |
File with damage: includes specific damaged block indices |
MissingFile |
File that is entirely absent |
Error types
| Error | Cause |
|---|---|
ParseError |
Invalid/corrupt PAR2 file |
RepairError::Io |
Filesystem error during repair |
RepairError::InsufficientRecovery |
Not enough recovery blocks for repair |
RepairError::SingularMatrix |
Recovery blocks are linearly dependent |
RepairError::NoDamage |
Nothing to repair (all files intact) |
RepairError::VerifyFailed |
Post-repair verification failed |
Dependencies
| Crate | Purpose |
|---|---|
md-5 |
MD5 hashing (ASM-accelerated) |
crc32fast |
CRC32 checksums |
rayon |
Data parallelism |
thiserror |
Error type derivation |
tracing |
Structured logging (zero-cost) |
No system dependencies. No build scripts. No proc macros beyond thiserror.
License
Licensed under either of
- Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0)
- MIT License (LICENSE-MIT or http://opensource.org/licenses/MIT)