# rust-hdf5
Pure Rust HDF5 library — no C dependencies.
Read and write HDF5 files with contiguous, chunked, and compressed datasets, hierarchical groups, attributes, SWMR streaming, and hyperslab I/O.
## Why rust-hdf5?

- Zero C dependencies — no `libhdf5`, no `h5cc`, no system packages. Works anywhere Rust compiles.
- Memory safe — Rust's type system prevents buffer overflows, use-after-free, and data races. Minimal `unsafe`, used only for type reinterpretation.
- Simple API — fluent builder pattern instead of C-style opaque handles (`h5d_*`, `h5g_*`, ...).
- Batteries included — compression codecs (deflate, LZ4, Zstandard) bundled as Rust crates. No plugin system needed.
- Easy cross-compilation — all dependencies are pure Rust. No cross-compile toolchain for C libraries required.
## Features

- Read & write — create new files, open existing files, append datasets
- Chunked storage — extensible array, fixed array, and B-tree v2 chunk indices
- Compression — deflate (gzip), shuffle, Fletcher-32, LZ4, and Zstandard filters
- Parallel compression — per-chunk compression/decompression via rayon
- Groups — hierarchical group structure with nested object headers
- Attributes — string and numeric attributes on datasets and the root group
- SWMR — Single Writer / Multiple Reader streaming protocol
- Hyperslab I/O — `read_slice`/`write_slice` for partial N-dimensional access
- Buffered I/O — `BufWriter`/`BufReader` with automatic mode switching
- Memory-mapped I/O — optional zero-copy read-only access via the `mmap` feature
- Thread safety — optional `threadsafe` feature (`Arc<Mutex>` instead of `Rc<RefCell>`)
- Legacy format support — reads v0/v1 superblock and v1 object header files (h5py, HDF5 C library)
- Variable-length strings — reads h5py-style vlen string datasets via the global heap
- Compound types — user-defined struct types and complex numbers (`Complex32`, `Complex64`)
## Quick start

```toml
[dependencies]
rust-hdf5 = "0.1"
```
### Write

```rust
use rust_hdf5::H5File;

let file = H5File::create("data.h5")?;
let ds = file
    .new_dataset::<f64>("temperature")
    .shape(&[100])
    .create()?;
ds.write_raw(&vec![20.0f64; 100])?;
file.close()?;
```
### Read

```rust
use rust_hdf5::H5File;

let file = H5File::open("data.h5")?;
let ds = file.dataset("temperature")?;
let data = ds.read_raw::<f64>()?;
assert_eq!(data.len(), 100);
```
## Chunked + compressed streaming

```rust
use rust_hdf5::H5File;

let file = H5File::create("frames.h5")?;
let ds = file
    .new_dataset::<f32>("frames")
    .shape(&[0, 512])
    .chunk(&[1, 512])
    .max_shape(&[u64::MAX, 512]) // unlimited along the first axis
    .deflate(6)
    .create()?;
for frame in 0..1000u64 {
    let row = vec![frame as f32; 512];
    ds.extend(&row)?;
}
file.close()?;
```
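Chunked storage divides the logical array into fixed-size tiles; each element lands in the chunk whose coordinates are the element offset integer-divided by the chunk dimensions. A minimal sketch of that arithmetic (illustrative only, not the library's internals):

```rust
/// Map a logical element offset to (chunk coordinate, offset within chunk).
/// Chunk dimensions must be nonzero.
fn locate(offset: &[u64], chunk: &[u64]) -> (Vec<u64>, Vec<u64>) {
    // Which chunk holds the element along each axis.
    let idx: Vec<u64> = offset.iter().zip(chunk).map(|(o, c)| o / c).collect();
    // Position of the element inside that chunk.
    let within: Vec<u64> = offset.iter().zip(chunk).map(|(o, c)| o % c).collect();
    (idx, within)
}
```

With the `[1, 512]` chunk shape above, element `[5, 700]` lives in chunk `[5, 1]` at in-chunk offset `[0, 188]`.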
## Groups

```rust
use rust_hdf5::H5File;

let file = H5File::create("nested.h5")?;
let det = file.create_group("detector")?;
let raw = det.create_group("raw")?;
let ds = raw
    .new_dataset::<i32>("counts")
    .shape(&[4])
    .create()?;
ds.write_raw(&[1, 2, 3, 4])?;
file.close()?;

// Read back
let file = H5File::open("nested.h5")?;
let ds = file.dataset("detector/raw/counts")?;
assert_eq!(ds.read_raw::<i32>()?, vec![1, 2, 3, 4]);
```
## Hyperslab (slice) I/O

```rust
use rust_hdf5::H5File;

let file = H5File::create("grid.h5")?;
let ds = file
    .new_dataset::<f64>("grid")
    .shape(&[8, 8])
    .create()?;
ds.write_raw(&vec![0.0; 64])?;
// Overwrite a 2x3 sub-region starting at (1, 2)
ds.write_slice(&[1, 2], &[2, 3], &vec![9.0; 6])?;
file.close()?;

// Read a sub-region
let file = H5File::open("grid.h5")?;
let ds = file.dataset("grid")?;
let region = ds.read_slice::<f64>(&[1, 2], &[2, 3])?;
assert_eq!(region, vec![9.0; 6]);
```
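On a contiguous row-major dataset, a hyperslab selection reduces to a series of strided row copies. A 2-D sketch of just the indexing arithmetic (for illustration; this is not the library's implementation):

```rust
/// Copy a `count[0] x count[1]` region starting at `start` out of a
/// row-major 2-D buffer with dimensions `dims`.
fn slice_2d(data: &[f64], dims: [usize; 2], start: [usize; 2], count: [usize; 2]) -> Vec<f64> {
    let mut out = Vec::with_capacity(count[0] * count[1]);
    for r in 0..count[0] {
        // Linear offset of the first selected element in this row.
        let begin = (start[0] + r) * dims[1] + start[1];
        out.extend_from_slice(&data[begin..begin + count[1]]);
    }
    out
}
```

For chunked datasets the same selection is additionally intersected with each overlapping chunk, but the per-row copy is the same idea.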
## Attributes

```rust
use rust_hdf5::H5File;

let file = H5File::create("attrs.h5")?;
let ds = file.new_dataset::<f64>("temperature").shape(&[3]).create()?;
ds.write_raw(&[20.0, 21.5, 19.8])?;
let attr = ds.new_attr("units").shape(&[1]).create()?;
attr.write_string("celsius")?;
file.close()?;

// Read back
let file = H5File::open("attrs.h5")?;
let ds = file.dataset("temperature")?;
let units = ds.attr("units")?.read_string()?;
assert_eq!(units, "celsius");
```
## Append mode

```rust
use rust_hdf5::H5File;

// Add datasets to an existing file
let file = H5File::open_rw("data.h5")?;
let ds = file.new_dataset::<f64>("pressure").shape(&[100]).create()?;
ds.write_raw(&vec![101.3; 100])?;
file.close()?;
```
## SWMR streaming

```rust
use rust_hdf5::H5File;

// Writer process
let mut writer = H5File::create("live.h5")?;
let ds = writer
    .new_dataset::<f32>("frames")
    .shape(&[0, 512])
    .chunk(&[1, 512])
    .create()?;
writer.start_swmr()?;
writer.append_frame("frames", &vec![0.0f32; 512])?;
writer.flush()?;
writer.close()?;

// Reader process (concurrent)
let mut reader = H5File::open("live.h5")?;
reader.refresh()?;
let data = reader.dataset("frames")?.read_raw::<f32>()?;
```
## Supported types

| Rust type | HDF5 type |
|---|---|
| `u8`, `i8` | 8-bit integer |
| `u16`, `i16` | 16-bit integer |
| `u32`, `i32` | 32-bit integer |
| `u64`, `i64` | 64-bit integer |
| `f32` | IEEE 754 single |
| `f64` | IEEE 754 double |
| `HBool` | Boolean (enum over `u8`) |
| `Complex32` | Compound `{re: f32, im: f32}` |
| `Complex64` | Compound `{re: f64, im: f64}` |
| `CompoundType` | User-defined compound |
| `VarLenUnicode` | Variable-length UTF-8 string (read) |
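`Complex32`/`Complex64` are stored as two-member compounds, and HDF5 records each member at an explicit byte offset from the start of the record — the same layout a `#[repr(C)]` Rust struct gives you. A quick check of that correspondence:

```rust
// HDF5 compound members live at fixed byte offsets; #[repr(C)] guarantees
// the matching in-memory layout on the Rust side (no field reordering).
#[repr(C)]
struct Complex64 {
    re: f64,
    im: f64,
}

fn main() {
    assert_eq!(std::mem::size_of::<Complex64>(), 16);
    assert_eq!(std::mem::offset_of!(Complex64, im), 8); // `im` follows `re`
}
```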
## Compression filters

| Filter | Feature flag | Dependency |
|---|---|---|
| Deflate (gzip) | `deflate` (default) | `flate2` |
| Shuffle | built-in | — |
| Fletcher-32 | built-in | — |
| LZ4 | `lz4` | `lz4_flex` |
| Zstandard | `zstd` | `rust-zstd` |
```toml
# Enable LZ4 + Zstandard
[dependencies]
rust-hdf5 = { version = "0.1", features = ["lz4", "zstd"] }
```
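The shuffle filter is not itself a compressor: it transposes the byte stream so that byte 0 of every element comes first, then byte 1, and so on. Grouping the (often similar) high-order bytes together helps deflate/LZ4 find longer matches. A sketch of the forward transform:

```rust
/// Shuffle `data`, interpreted as elements of `elem_size` bytes:
/// output = all first bytes, then all second bytes, ...
fn shuffle(data: &[u8], elem_size: usize) -> Vec<u8> {
    let n = data.len() / elem_size;
    let mut out = vec![0u8; data.len()];
    for i in 0..n {
        for b in 0..elem_size {
            // Byte `b` of element `i` goes into plane `b` of the output.
            out[b * n + i] = data[i * elem_size + b];
        }
    }
    out
}
```

The inverse transform runs the same loop with the indices swapped, which is why shuffle is cheap in both directions.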
## Feature flags

| Feature | Description |
|---|---|
| `deflate` | Deflate compression (default) |
| `lz4` | LZ4 compression |
| `zstd` | Zstandard compression |
| `bzip2` | BZIP2 compression |
| `blosc` | Blosc meta-compressor |
| `all_filters` | All compression filters |
| `parallel` | Parallel chunk compression via rayon |
| `threadsafe` | `Send + Sync` file handles (`Arc<Mutex>`) |
| `mmap` | Memory-mapped read-only file access |
## HDF5 format support
| Feature | Read | Write |
|---|---|---|
| Superblock v0/v1 (legacy) | Yes | — |
| Superblock v2/v3 | Yes | Yes |
| Object header v1 | Yes | — |
| Object header v2 | Yes | Yes |
| Contiguous storage | Yes | Yes |
| Chunked storage (EA) | Yes | Yes |
| Chunked storage (FA) | Yes | Yes |
| Chunked storage (BT2) | Yes | Yes |
| Compressed chunks | Yes | Yes |
| Hierarchical groups | Yes | Yes |
| Attributes | Yes | Yes |
| SWMR protocol | Yes | Yes |
| Hyperslab selection | Yes | Yes |
| Variable-length strings | Yes | — |
| Compound types | Yes | Yes |
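All superblock versions share the same 8-byte file signature; in legacy v0/v1 files it may also sit at byte offset 512, 1024, 2048, ... (doubling), rather than 0. A minimal detection sketch:

```rust
/// The HDF5 format signature: \x89 'H' 'D' 'F' \r \n \x1a \n
const HDF5_SIGNATURE: [u8; 8] = [0x89, b'H', b'D', b'F', b'\r', b'\n', 0x1a, b'\n'];

/// True if `buf` begins with the HDF5 file signature.
fn is_hdf5(buf: &[u8]) -> bool {
    buf.len() >= 8 && buf[..8] == HDF5_SIGNATURE
}
```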
## Benchmarks
Benchmarks cover contiguous read/write, chunked write, and compressed write throughput using criterion.
## Testing
Tests cover format codec, I/O roundtrips, compression, groups, attributes, SWMR, slice I/O, append mode with dataset resize, scalar datasets, and h5dump validation.
## License
MIT
This project is a Rust port inspired by the HDF5 C library, which is licensed under the BSD-3-Clause license. See LICENSE-HDF5 for the original HDF5 copyright notice and license terms.