Crate hidefix

§HIDEFIX

A fast and concurrent reader for HDF5 and NetCDF (v4) files.

This library allows an HDF5 file to be read in a multi-threaded or concurrent (async) way. The chunks of a dataset need to be indexed in advance; indexing is fast with newer versions of HDF5 (see below). The index can be efficiently deserialized with zero-copy through serde.

This allows multiple datasets (variables) to be read at the same time, or even different domains of the same dataset to be read at the same time.
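
To illustrate why indexing the chunks in advance makes concurrent reads possible, here is a minimal sketch that uses only the standard library. It does not use hidefix or HDF5: `ChunkEntry`, `build_index`, `read_chunk` and `read_all_parallel` are hypothetical names, and the "dataset" is assumed to be a flat byte file split into fixed-size chunks. Once the offset and size of every chunk are known, each thread can open its own file handle and read its chunk without coordinating with the others.

```rust
use std::fs::File;
use std::io::{Read, Seek, SeekFrom, Write};
use std::path::{Path, PathBuf};
use std::thread;

/// Hypothetical index entry: where one chunk lives in the file.
#[derive(Debug, Clone, Copy, PartialEq)]
struct ChunkEntry {
    offset: u64,
    size: u64,
}

/// Build an index for a file of `len` bytes split into chunks of `chunk_size` bytes.
/// (A real HDF5 index comes from the file's metadata; this is only an illustration.)
fn build_index(len: u64, chunk_size: u64) -> Vec<ChunkEntry> {
    (0..len)
        .step_by(chunk_size as usize)
        .map(|offset| ChunkEntry { offset, size: chunk_size.min(len - offset) })
        .collect()
}

/// Read one chunk through a private file handle, so no locking is needed.
fn read_chunk(path: &Path, entry: ChunkEntry) -> std::io::Result<Vec<u8>> {
    let mut f = File::open(path)?;
    f.seek(SeekFrom::Start(entry.offset))?;
    let mut buf = vec![0u8; entry.size as usize];
    f.read_exact(&mut buf)?;
    Ok(buf)
}

/// Read all chunks concurrently, one thread per chunk, and reassemble them in index order.
fn read_all_parallel(path: &Path, index: &[ChunkEntry]) -> std::io::Result<Vec<u8>> {
    let handles: Vec<_> = index
        .iter()
        .map(|&entry| {
            let path: PathBuf = path.to_path_buf();
            thread::spawn(move || read_chunk(&path, entry))
        })
        .collect();

    let mut out = Vec::new();
    for h in handles {
        out.extend(h.join().expect("reader thread panicked")?);
    }
    Ok(out)
}

fn main() -> std::io::Result<()> {
    // Write 10 bytes of sample data to a temporary file.
    let path = std::env::temp_dir().join("chunk_index_sketch.bin");
    let data: Vec<u8> = (0..10).collect();
    File::create(&path)?.write_all(&data)?;

    // Index first, then read every chunk in parallel.
    let index = build_index(data.len() as u64, 4);
    let read_back = read_all_parallel(&path, &index)?;
    assert_eq!(read_back, data);
    println!("read {} chunks, {} bytes", index.len(), read_back.len());
    Ok(())
}
```

The same separation applies here: building the index is a one-time step, and the reads that follow only need the index and a way to fetch bytes at a given offset.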

The library is meant to be used in conjunction with the bindings to the official HDF5 library.

§Usage

Create an index, then read the values:

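use hidefix::prelude::*;

// Build the chunk index for the file.
let idx = Index::index("tests/data/coads_climatology.nc4").unwrap();
// Create a reader for the SST variable.
let mut r = idx.reader("SST").unwrap();

// Read the full extent of the variable as f32.
let values = r.values::<f32, _>(..).unwrap();

println!("SST: {:?}", values);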

Alternatively, convert an hdf5::File or hdf5::Dataset into an index by using try_from, or by calling the index method of the IntoIndex trait:

use hidefix::prelude::*;

let i = hdf5::File::open("tests/data/coads_climatology.nc4").unwrap().index().unwrap();
let iv = i.reader("SST").unwrap().values::<f32, _>(..).unwrap();

§NetCDF4 files

NetCDF4 uses HDF5 as its underlying data format. Hidefix can be used to read NetCDF variables, though extra decoding might be necessary. In the Python bindings, hidefix-xarray does that for you.

use std::convert::TryInto;
use hidefix::prelude::*;

// Read the values with the netcdf crate for comparison.
let f = netcdf::open("tests/data/coads_climatology.nc4").unwrap();
let nv = f.variable("SST").unwrap().get_values::<f32, _>(..).unwrap();

// Build an index from the open NetCDF file and read the same variable through hidefix.
let i: Index = (&f).try_into().unwrap();
let iv = i.reader("SST").unwrap().values::<f32, _>(..).unwrap();

assert_eq!(iv, nv);

Or use the IntoIndex trait:

use hidefix::prelude::*;

let i = netcdf::open("tests/data/coads_climatology.nc4").unwrap().index().unwrap();
let iv = i.reader("SST").unwrap().values::<f32, _>(..).unwrap();

It is also possible to stream the values. The streamer is currently optimized for streaming bytes.

§Fast indexing

Indexing can be sped up considerably (about 200x) by using the new HDF5 interface for iterating over chunks. The fast-index feature flag currently requires a patched version of hdf5-rust. For now, you therefore have to use a patch section in your Cargo.toml to point the hdf5, hdf5-sys and hdf5-src dependencies to the patched versions:

[patch.crates-io]
hdf5 = { git = "https://github.com/magnusuMET/hdf5-rust", branch = "hidefix_jul_2023" }
hdf5-sys = { git = "https://github.com/magnusuMET/hdf5-rust", branch = "hidefix_jul_2023" }
hdf5-src = { git = "https://github.com/magnusuMET/hdf5-rust", branch = "hidefix_jul_2023" }

Modules§