Crate zarrs

zarrs is a Rust library for the Zarr storage format for multidimensional arrays and metadata. It supports Zarr V3 and a V3-compatible subset of Zarr V2.

A changelog can be found here. Correctness issues with past versions are detailed here.

Developed at the Department of Materials Physics, Australian National University, Canberra, Australia.

§Getting Started

  • Review the implementation status.
  • View the examples.
  • Read the documentation. array::Array, storage, and metadata are good places to start.
  • Check out zarrs_tools for various tools built upon this crate, including:
    • A reencoder that can change codecs, change chunk shape, convert Zarr V2 to V3, etc.
    • A tool that creates a Zarr V3 OME-Zarr hierarchy from a Zarr array.
    • Array transformations: crop, rescale, downsample, gradient magnitude, gaussian, noise filtering, etc.
    • Benchmarking tools and performance benchmarks of zarrs.

§Implementation Status

Zarr Enhancement Proposal             | Status                   | Zarrs
--------------------------------------|--------------------------|-------------
ZEP0001: Zarr specification version 3 | Accepted                 | Full support
ZEP0002: Sharding codec               | Accepted                 | Full support
Draft ZEP0003: Variable chunking      | zarr-developers #52      | Full support
Draft ZEP0007: Strings                | zarr-developers/zeps #47 | Prototype

zarrs has first-class Zarr V3 support and additionally supports a compatible subset of Zarr V2 data that:

  • can be converted to V3 with only a metadata change, and
  • uses array metadata that is recognised and supported for encoding/decoding.

An existing V2 or V3 array can be opened with Array::open. A new array can be created from V2 or V3 metadata with Array::new_with_metadata. The ArrayBuilder only supports V3 array creation.

zarrs supports forward conversion of Zarr V2 data to V3. See “Metadata Convert Version” and “Metadata Erase Version” for information about manipulating the version of array/group metadata.

§Array Support

Data Types

† Experimental data types are recommended for evaluation only.

Codecs
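Codec Type     | Codec            | ZEP             | V3 | V2 | Feature Flag*
---------------|------------------|-----------------|----|----|--------------
Array to Array | transpose        | ZEP0001         |    |    | transpose
Array to Bytes | bytes            | ZEP0001         |    |    |
               | sharding_indexed | ZEP0002         |    |    | sharding
Bytes to Bytes | blosc            | ZEP0001         |    |    | blosc
               | gzip             | ZEP0001         |    |    | gzip
               | crc32c           | ZEP0002         |    |    | crc32c
               | zstd             | zarr-specs #256 |    |    | zstd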

* Bolded feature flags are part of the default set of features.

Codecs (Experimental)

Experimental codecs are recommended for evaluation only.

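Codec Type     | Codec                 | ZEP | V3 | V2 | Feature Flag
---------------|-----------------------|-----|----|----|-------------
Array to Array | bitround              |     |    |    | bitround
Array to Bytes | zfp / zfpy (V2)       |     |    |    | zfp
               | pcodec                |     |    |    | pcodec
               | vlen                  |     |    |    |
               | vlen_v2 / vlen-* (V2) |     |    |    |
Bytes to Bytes | bz2                   |     |    |    | bz2
               | gdeflate              |     |    |    | gdeflate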

By default, the "name" of experimental codecs in array metadata links to the codec documentation in this crate. This is configurable with Config::experimental_codec_names_mut.

Chunk Grids

Chunk Grid  | ZEP     | V3 | V2 | Feature Flag
------------|---------|----|----|-------------
regular     | ZEP0001 |    |    |
rectangular | ZEP0003 |    |    |

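A regular chunk grid splits every dimension into equally sized chunks. As a rough illustration of the index arithmetic this implies, here is a minimal sketch in plain Rust. It does not use the zarrs API: the function names `grid_shape`, `chunk_of`, and `chunk_bounds` are hypothetical, and chunk shapes are assumed to contain no zeros.

```rust
/// Number of chunks along each dimension of a regular grid (ceiling division).
/// Assumes every entry of `chunk_shape` is non-zero.
fn grid_shape(array_shape: &[u64], chunk_shape: &[u64]) -> Vec<u64> {
    array_shape
        .iter()
        .zip(chunk_shape)
        .map(|(&a, &c)| (a + c - 1) / c)
        .collect()
}

/// Indices of the chunk that contains the element at `element`.
fn chunk_of(element: &[u64], chunk_shape: &[u64]) -> Vec<u64> {
    element
        .iter()
        .zip(chunk_shape)
        .map(|(&i, &c)| i / c)
        .collect()
}

/// Half-open element range `(start, end)` per dimension covered by a chunk,
/// clipped to the array bounds so that edge chunks report only the part
/// that lies inside the array.
fn chunk_bounds(chunk: &[u64], chunk_shape: &[u64], array_shape: &[u64]) -> Vec<(u64, u64)> {
    chunk
        .iter()
        .zip(chunk_shape)
        .zip(array_shape)
        .map(|((&k, &c), &a)| (k * c, ((k + 1) * c).min(a)))
        .collect()
}
```

For a 3×4 array with 2×2 chunks, `grid_shape(&[3, 4], &[2, 2])` gives `[2, 2]`, and chunk `[1, 1]` covers rows 2..3 and columns 2..4 of the array.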
Chunk Key Encodings

Chunk Key Encoding | ZEP     | V3 | V2 | Feature Flag
-------------------|---------|----|----|-------------
default            | ZEP0001 |    |    |
v2                 | ZEP0001 |    |    |

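A chunk key encoding maps chunk grid indices to the key under which a chunk is stored. The sketch below follows the Zarr V3 specification as understood here: the `default` encoding prefixes keys with "c" and normally uses "/" as the separator, while the `v2` encoding joins the indices directly and normally uses ".". It is plain Rust rather than the zarrs API, and the function names are hypothetical.

```rust
/// Sketch of the Zarr V3 `default` chunk key encoding:
/// a "c" prefix followed by each chunk index, joined by `separator`.
/// A zero-dimensional array has the single key "c".
fn default_chunk_key(chunk: &[u64], separator: char) -> String {
    let mut key = String::from("c");
    for index in chunk {
        key.push(separator);
        key.push_str(&index.to_string());
    }
    key
}

/// Sketch of the `v2` chunk key encoding:
/// chunk indices joined by `separator`, with no prefix.
/// A zero-dimensional array has the single key "0".
fn v2_chunk_key(chunk: &[u64], separator: char) -> String {
    if chunk.is_empty() {
        return String::from("0");
    }
    chunk
        .iter()
        .map(|index| index.to_string())
        .collect::<Vec<_>>()
        .join(&separator.to_string())
}
```

With these helpers, chunk `[0, 1]` maps to `c/0/1` under the `default` encoding with a "/" separator and to `0.1` under the `v2` encoding with a "." separator.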
Storage Transformers

Zarr V3 does not currently define any storage transformers.

zarrs supports two internal storage transformers for debugging: usage log and performance metrics.

§Storage Support

zarrs supports a wide range of storage backends through the opendal and object_store crates.

Stores and Storage Adapters
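Store/Storage Adapter | ZEP     | Read | Write | List | Sync | Async | Feature Flag
----------------------|---------|------|-------|------|------|-------|-------------
FilesystemStore       | ZEP0001 |      |       |      |      |       |
MemoryStore           |         |      |       |      |      |       |
HTTPStore             |         |      |       |      |      |       | http
OpendalStore          |         | ✓*   | ✓*    | ✓*   |      |       | opendal
AsyncOpendalStore     |         | ✓*   | ✓*    | ✓*   |      |       | opendal
AsyncObjectStore      |         | ✓*   | ✓*    | ✓*   |      |       | object_store
ZipStorageAdapter     |         |      |       |      |      |       | zip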

* Support depends on the opendal BlockingOperator/Operator or object_store store.

§Examples

use std::path::PathBuf;
use std::sync::Arc;

use zarrs::array::{ArrayBuilder, DataType, FillValue, ZARR_NAN_F32};
use zarrs::array::codec::GzipCodec; // requires gzip feature
use zarrs::array_subset::ArraySubset;
use zarrs::storage::{ReadableWritableListableStorage, store::FilesystemStore};

// Create a filesystem store
let store_path: PathBuf = "/path/to/store".into();
let store: ReadableWritableListableStorage =
    Arc::new(FilesystemStore::new(&store_path)?);

// Create a new V3 array using the array builder
let array = ArrayBuilder::new(
    vec![3, 4], // array shape
    DataType::Float32,
    vec![2, 2].try_into()?, // regular chunk shape (non-zero elements)
    FillValue::from(ZARR_NAN_F32),
)
.bytes_to_bytes_codecs(vec![
    Box::new(GzipCodec::new(5)?),
])
.dimension_names(["y", "x"].into())
.attributes(serde_json::json!({"Zarr V3": "is great"}).as_object().unwrap().clone())
.build(store.clone(), "/group/array")?; // /path/to/store/group/array

// Store the array metadata
array.store_metadata()?;
println!("{}", serde_json::to_string_pretty(array.metadata())?);
// {
//     "zarr_format": 3,
//     "node_type": "array",
//     ...
// }

// Perform some operations on the chunks
array.store_chunk_elements::<f32>(
    &[0, 1], // chunk index
    &[0.2, 0.3, 1.2, 1.3]
)?;
array.store_array_subset_ndarray::<f32, _>(
    &[1, 1], // array index
    ndarray::array![[-1.1, -1.2], [-2.1, -2.2]]
)?;
array.erase_chunk(&[1, 1])?;

// Retrieve all array elements as an ndarray
let array_subset_all = ArraySubset::new_with_shape(array.shape().to_vec());
let array_ndarray = array.retrieve_array_subset_ndarray::<f32>(&array_subset_all)?;
println!("{array_ndarray:4}");
// [[ NaN,  NaN,  0.2,  0.3],
//  [ NaN, -1.1, -1.2,  1.3],
//  [ NaN, -2.1,  NaN,  NaN]]
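To see why the retrieved array looks the way it does, the three operations above can be replayed on a plain in-memory grid. This is only a sketch of the semantics, assuming a 3×4 array with 2×2 chunks and a NaN fill value as in the example. It does not use zarrs, and `Grid`, `write_block`, and `replay_example` are hypothetical names introduced for illustration.

```rust
/// A dense row-major 2D array standing in for the chunked Zarr array.
struct Grid {
    rows: usize,
    cols: usize,
    data: Vec<f32>,
}

impl Grid {
    /// Create a grid in which every element holds the fill value.
    fn new(rows: usize, cols: usize, fill: f32) -> Self {
        Grid { rows, cols, data: vec![fill; rows * cols] }
    }

    /// Write a row-major block whose top-left corner is (row0, col0).
    /// Elements falling outside the grid are ignored, which mimics an
    /// edge chunk that extends past the array bounds.
    fn write_block(&mut self, row0: usize, col0: usize, block_cols: usize, values: &[f32]) {
        for (n, &value) in values.iter().enumerate() {
            let (r, c) = (row0 + n / block_cols, col0 + n % block_cols);
            if r < self.rows && c < self.cols {
                self.data[r * self.cols + c] = value;
            }
        }
    }

    /// Read the element at (r, c).
    fn get(&self, r: usize, c: usize) -> f32 {
        self.data[r * self.cols + c]
    }
}

/// Replay the store, subset-store, and erase operations from the example.
fn replay_example() -> Grid {
    const CHUNK: usize = 2; // chunk edge length in both dimensions
    let mut grid = Grid::new(3, 4, f32::NAN);

    // Store chunk [0, 1]: its origin is (0 * 2, 1 * 2) = (0, 2).
    let (chunk_row, chunk_col) = (0, 1);
    grid.write_block(chunk_row * CHUNK, chunk_col * CHUNK, CHUNK, &[0.2, 0.3, 1.2, 1.3]);

    // Store a 2x2 subset starting at array index [1, 1]; it spans several chunks.
    grid.write_block(1, 1, 2, &[-1.1, -1.2, -2.1, -2.2]);

    // Erase chunk [1, 1]: its elements revert to the fill value.
    let (chunk_row, chunk_col) = (1, 1);
    grid.write_block(chunk_row * CHUNK, chunk_col * CHUNK, CHUNK, &[f32::NAN; CHUNK * CHUNK]);

    grid
}
```

The subset write overwrites the 1.2 stored at (1, 2), and erasing chunk [1, 1] resets (2, 2) and (2, 3) to NaN, which matches the printed result.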

Examples can be run with cargo run --example <EXAMPLE_NAME>.

  • Add -- --usage-log to see storage API calls during example execution.
  • Some examples require non-default features, which can be enabled with --all-features or --features <FEATURES>.

§Sync API Examples

array_write_read, array_write_read_ndarray, sharded_array_write_read, rectangular_array_write_read, zip_array_write_read, http_array_read.

§Async API Examples

async_array_write_read, async_http_array_read_object_store, async_http_array_read_opendal.

§Crate Features

§Default
  • ndarray: ndarray utility functions for Array.
  • Codecs: blosc, gzip, transpose, zstd, sharding, crc32c.

§Non-Default
  • async: an experimental asynchronous API for stores, Array, and Group.
    • The async API is runtime-agnostic. This has some limitations that are detailed in the Array docs.
    • The async API is not as performant as the sync API.
  • Codecs: bitround, bz2, gdeflate, pcodec, zfp.
  • Stores: http, object_store, opendal, zip.

§zarrs Ecosystem

  • zarrs_tools: Various tools for creating and manipulating Zarr V3 data.
  • zarrs_ffi: A subset of zarrs exposed as a C API.

§Licence

zarrs is licensed under either of the Apache License, Version 2.0 or the MIT license, at your option.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Modules§