# fpzip-rs
Lossless and lossy compression for multi-dimensional floating-point arrays, implemented in pure Rust.
A faithful port of Peter Lindstrom's [FPZip](https://computing.llnl.gov/projects/fpzip) algorithm, producing byte-identical output with the C++ reference implementation. Compresses `f32` and `f64` arrays with 1D, 2D, 3D, and 4D support. Designed for scientific and numerical data with high spatial correlation.
[](LICENSE)
## Features
- Lossless and lossy compression for `f32` and `f64` arrays
- Configurable bit precision (2-32 bits for float, 4-64 bits for double)
- Multi-dimensional support (1D, 2D, 3D, 4D)
- Byte-identical output with C++ fpzip (verified via checksums at all precisions)
- Pure Rust with no unsafe in the core library
- `no_std` compatible (with `alloc`)
- C FFI layer for calling from C/C++/Python/etc.
- Async support via tokio (`fpzip-async` crate)
- Optional parallel compression via rayon
## Quick Start
Add to your `Cargo.toml`:
```toml
[dependencies]
fpzip-rs = "0.1"
```
### Compress and decompress (lossless)
```rust
use fpzip_rs::{compress_f32, decompress_f32};
// 10x10x10 grid of float data
let compressed = compress_f32(&data, 10, 10, 10, 1).unwrap();
let decompressed = decompress_f32(&compressed).unwrap();
assert_eq!(data, decompressed); // lossless!
```
### Lossy compression with reduced precision
```rust
use fpzip_rs::{FpZipCompressor, decompress_f32};
// Compress at 16-bit precision (lossy, better compression ratio)
let compressed = FpZipCompressor::new(10)
.ny(10)
.nz(10)
.prec(16)
.compress_f32(&data)
.unwrap();
let decompressed = decompress_f32(&compressed).unwrap();
// Values are close but not identical due to reduced precision
```
### Builder API
```rust
use fpzip_rs::{FpZipCompressor, decompress_f64};
let data: Vec<f64> = vec![3.14; 64];
let compressed = FpZipCompressor::new(4)
.ny(4)
.nz(4)
.compress_f64(&data)
.unwrap();
let decompressed = decompress_f64(&compressed).unwrap();
assert_eq!(data, decompressed);
```
### Stream-based I/O
```rust
use fpzip_rs::{compress_f32_to_writer, decompress_f32_from_reader};
use std::io::Cursor;
let data = vec![1.0f32, 2.0, 3.0, 4.0];
let mut buf = Vec::new();
compress_f32_to_writer(&data, &mut buf, 4, 1, 1, 1).unwrap();
let mut reader = Cursor::new(&buf);
let (header, decompressed) = decompress_f32_from_reader(&mut reader).unwrap();
assert_eq!(data, decompressed);
```
### Pre-allocated buffers
```rust
use fpzip_rs::{compress_f32_into, decompress_f32_into, max_compressed_size, FpZipType};
let data = vec![0.0f32; 1000];
let max_size = max_compressed_size(1000, FpZipType::Float);
let mut buf = vec![0u8; max_size];
let written = compress_f32_into(&data, &mut buf, 10, 10, 10, 1).unwrap();
let compressed = &buf[..written];
let mut output = vec![0.0f32; 1000];
let header = decompress_f32_into(compressed, &mut output).unwrap();
```
## API Reference
### Free Functions
| `compress_f32(data, nx, ny, nz, nf)` | Compress `&[f32]` to `Vec<u8>` (lossless) |
| `compress_f64(data, nx, ny, nz, nf)` | Compress `&[f64]` to `Vec<u8>` (lossless) |
| `decompress_f32(data)` | Decompress `&[u8]` to `Vec<f32>` |
| `decompress_f64(data)` | Decompress `&[u8]` to `Vec<f64>` |
| `compress_f32_into(data, buf, ...)` | Compress into pre-allocated `&mut [u8]` |
| `decompress_f32_into(data, buf)` | Decompress into pre-allocated `&mut [f32]` |
| `compress_f32_to_writer(data, w, ...)` | Compress to `impl Write` |
| `decompress_f32_from_reader(r)` | Decompress from `impl Read` |
| `read_header(data)` | Read header without decompressing |
| `max_compressed_size(count, type)` | Upper bound on compressed size |
Double variants (`f64`) are available for all functions.
### Builder
```rust
FpZipCompressor::new(nx)
.ny(ny) // default: 1
.nz(nz) // default: 1
.nf(nf) // default: 1
.prec(prec) // default: full precision (lossless)
.compress_f32(data)
```
The `prec` parameter controls bit precision:
- **Float**: 2-32 (32 = lossless)
- **Double**: 4-64 (64 = lossless)
- Lower precision gives better compression ratios at the cost of accuracy.
## Workspace Crates
| `fpzip-rs` | Core compression library |
| `fpzip-ffi` | C FFI layer (`cdylib` + `staticlib`) with `fpzip.h` header |
| `fpzip-async` | Async wrappers using tokio `AsyncRead`/`AsyncWrite` |
## Feature Flags
| `std` | yes | `std::io::Read/Write` streaming APIs |
| `alloc` | yes | `Vec`-based return types (implied by `std`) |
| `rayon` | no | Parallel 4D field compression |
## C FFI
The `fpzip-ffi` crate builds a C-compatible shared/static library. A header file is provided at `fpzip-ffi/include/fpzip.h`.
```c
#include "fpzip.h"
float data[1000] = { /* ... */ };
uint8_t buf[8192];
size_t compressed_len;
int rc = fpzip_compress_float(data, 1000, 10, 10, 10, 1, buf, sizeof(buf), &compressed_len);
if (rc != 0) {
printf("Error: %s\n", fpzip_error_message(rc));
}
```
## Algorithm
FPZip combines three techniques:
1. **Lorenzo predictor** -- predicts each value from 7 neighbors in a 3D wavefront
2. **Integer mapping** -- bijectively maps IEEE 754 floats to unsigned integers preserving ordering, with configurable bit precision
3. **Adaptive arithmetic coding** -- range coder with quasi-static probability model
The predictor formula in integer domain:
```
p = f[1,0,0] - f[0,1,1] + f[0,1,0] - f[1,0,1] + f[0,0,1] - f[1,1,0] + f[1,1,1]
```
Only the residual (actual - predicted) is entropy coded. For wide alphabets (precision > 8 bits), residuals are split into an exponent symbol and verbatim mantissa bits. For narrow alphabets (precision <= 8 bits), the residual is encoded as a single symbol.
## Compressed Format
The entire compressed stream (header + data) is encoded through an arithmetic range coder, matching the C++ fpzip wire format. The header fields are:
| Magic | 32 | `'f'`, `'p'`, `'z'`, `'\0'` (8 bits each) |
| Major version | 16 | `0x0110` |
| Minor version | 8 | `4` (FPZIP_FP_INT mode) |
| Type | 1 | `0` = float, `1` = double |
| Precision | 7 | Bit precision (0 = full) |
| nx | 32 | X dimension |
| ny | 32 | Y dimension |
| nz | 32 | Z dimension |
| nf | 32 | Number of fields |
Followed immediately by the arithmetic-coded prediction residuals.
## C++ Compatibility
This implementation produces byte-identical output with the C++ fpzip library (FPZIP_FP_INT mode). Compatibility is verified by 18 Jenkins checksum tests covering:
- Float at precision 8, 16, and 32 (lossless)
- Double at precision 16, 32, and 64 (lossless)
- 1D, 2D, and 3D dimension layouts
Data compressed by this library can be decompressed by the C++ fpzip library and vice versa.
## License
MIT -- see [LICENSE](LICENSE).
## Acknowledgments
Based on [FPZip](https://computing.llnl.gov/projects/fpzip) by Peter Lindstrom, Lawrence Livermore National Laboratory.