zarrs is a Rust library for the Zarr storage format for multidimensional arrays and metadata. It supports:
- Zarr V3, and
- A V3 compatible subset of Zarr V2.
A changelog can be found here. Correctness issues with past versions are detailed here.
Developed at the Department of Materials Physics, Australian National University, Canberra, Australia.
§Getting Started
- Review the implementation status.
- View the examples.
- Read the documentation. array::Array, storage, and metadata are good places to start.
- Check out zarrs_tools for various tools built upon this crate. Includes:
  - A reencoder that can change codecs, chunk shape, convert Zarr V2 to V3, etc.
  - Create a Zarr V3 OME-Zarr hierarchy from a Zarr array.
  - Transform arrays: crop, rescale, downsample, gradient magnitude, gaussian, noise filtering, etc.
  - Benchmarking tools and performance benchmarks of zarrs.
§Implementation Status
Zarr Enhancement Proposal | Status | Zarrs |
---|---|---|
ZEP0001: Zarr specification version 3 | Accepted | Full support |
ZEP0002: Sharding codec | Accepted | Full support |
Draft ZEP0003: Variable chunking | zarr-developers #52 | Full support |
Draft ZEP0007: Strings | zarr-developers/zeps #47 | Prototype |
zarrs has first-class Zarr V3 support and additionally supports a compatible subset of Zarr V2 data that:
- can be converted to V3 with only a metadata change, and
- uses array metadata that is recognised and supported for encoding/decoding.
An existing V2 or V3 array can be opened with Array::open.
A new array can be created from V2 or V3 metadata with Array::new_with_metadata.
The ArrayBuilder only supports V3 array creation.
zarrs supports forward conversion of Zarr V2 data to V3.
See “Metadata Convert Version” and “Metadata Erase Version” for information about manipulating the version of array/group metadata.
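To make "converted to V3 with only a metadata change" concrete, here is a minimal sketch of one part of such a conversion: translating a V2 (NumPy-style) dtype string into a V3 data type name plus the byte order that a V3 `bytes` codec would need to record. The helper `v2_dtype_to_v3` and the `Endianness` enum are hypothetical names for illustration and are not part of the zarrs API; only the fixed-size types from the data type table are covered.

```rust
/// Byte order implied by a Zarr V2 (NumPy-style) dtype string.
#[derive(Debug, PartialEq, Clone, Copy)]
pub enum Endianness {
    Little,
    Big,
    /// The "|" prefix: byte order does not apply (e.g. single-byte types).
    NotApplicable,
}

/// Hypothetical helper (not a zarrs function): map a V2 dtype such as "<f4"
/// to a V3 data type name and the byte order of the stored elements.
/// Returns `None` for dtypes outside this illustrative subset.
pub fn v2_dtype_to_v3(dtype: &str) -> Option<(&'static str, Endianness)> {
    let mut chars = dtype.chars();
    let endianness = match chars.next()? {
        '<' => Endianness::Little,
        '>' => Endianness::Big,
        '|' => Endianness::NotApplicable,
        _ => return None,
    };
    // The remainder is a kind character followed by the size in bytes.
    let name = match chars.as_str() {
        "b1" => "bool",
        "i1" => "int8",
        "i2" => "int16",
        "i4" => "int32",
        "i8" => "int64",
        "u1" => "uint8",
        "u2" => "uint16",
        "u4" => "uint32",
        "u8" => "uint64",
        "f2" => "float16",
        "f4" => "float32",
        "f8" => "float64",
        "c8" => "complex64",
        "c16" => "complex128",
        _ => return None,
    };
    Some((name, endianness))
}
```

Because the chunk bytes are left untouched, a V2 array whose dtype falls outside a mapping like this one (for example a fixed-length unicode dtype) cannot be handled by a metadata-only conversion.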
§Array Support
Data Types
Data Type† | ZEP | V3 | V2 | Feature Flag |
---|---|---|---|---|
bool, int8, int16, int32, int64, uint8, uint16, uint32, uint64, float16, float32, float64, complex64, complex128 | ZEP0001 | ✓ | ✓ | |
r* (raw bits) | ZEP0001 | ✓ | | |
bfloat16 | zarr-specs #130 | ✓ | | |
string (experimental) | ZEP0007 (draft) | ✓ | | |
binary (experimental) | ZEP0007 (draft) | ✓ | | |
† Experimental data types are recommended for evaluation only.
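The `r*` entry stands for a family of names in which `*` is a bit count, such as `r8` or `r16`. The sketch below parses such a name and returns the element size in bytes, assuming the bit count must be a positive multiple of 8. `raw_bits_size_bytes` is a hypothetical helper for illustration, not a zarrs function.

```rust
/// Hypothetical helper (not a zarrs function): parse a raw-bits data type
/// name such as "r16" and return the element size in bytes.
/// Assumption: the bit count must be a positive multiple of 8.
pub fn raw_bits_size_bytes(name: &str) -> Option<usize> {
    let digits = name.strip_prefix('r')?;
    // Reject signs, whitespace, and anything else that is not a plain digit.
    if !digits.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    let bits: usize = digits.parse().ok()?;
    if bits > 0 && bits % 8 == 0 {
        Some(bits / 8)
    } else {
        None
    }
}
```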
Codecs
Codec Type | Codec | ZEP | V3 | V2 | Feature Flag* |
---|---|---|---|---|---|
Array to Array | transpose | ZEP0001 | ✓ | | **transpose** |
Array to Bytes | bytes | ZEP0001 | ✓ | | |
Array to Bytes | sharding_indexed | ZEP0002 | ✓ | | **sharding** |
Bytes to Bytes | blosc | ZEP0001 | ✓ | ✓ | **blosc** |
Bytes to Bytes | gzip | ZEP0001 | ✓ | ✓ | **gzip** |
Bytes to Bytes | crc32c | ZEP0002 | ✓ | | **crc32c** |
Bytes to Bytes | zstd | zarr-specs #256 | ✓ | ✓ | **zstd** |
* Bolded feature flags are part of the default set of features.
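As an illustration of what a bytes-to-bytes codec does, the following is a minimal, dependency-free sketch of a checksum codec in the spirit of `crc32c`. It is not the zarrs implementation: the function names are hypothetical, the bitwise loop trades speed for brevity, and it assumes the checksum is appended to the payload as four little-endian bytes.

```rust
/// Reflected form of the Castagnoli polynomial used by CRC32C.
const CRC32C_POLY_REFLECTED: u32 = 0x82F6_3B78;

/// Bitwise CRC32C; the standard check value for b"123456789" is 0xE3069283.
pub fn crc32c(bytes: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &byte in bytes {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All ones if the low bit is set, otherwise zero.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & mask);
        }
    }
    !crc
}

/// "Encode": append the checksum of the payload.
/// Assumption: four bytes, little-endian.
pub fn append_crc32c(mut payload: Vec<u8>) -> Vec<u8> {
    let checksum = crc32c(&payload);
    payload.extend_from_slice(&checksum.to_le_bytes());
    payload
}

/// "Decode": verify the trailing checksum and return the payload without it.
/// Returns `None` if the input is too short or the checksum does not match.
pub fn verify_and_strip_crc32c(bytes: &[u8]) -> Option<&[u8]> {
    if bytes.len() < 4 {
        return None;
    }
    let (payload, tail) = bytes.split_at(bytes.len() - 4);
    let stored = u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]);
    if crc32c(payload) == stored {
        Some(payload)
    } else {
        None
    }
}
```

A production codec would normally use a table-driven or hardware-accelerated CRC rather than this bit-at-a-time loop.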
Codecs (Experimental)
Experimental codecs are recommended for evaluation only.
Codec Type | Codec | ZEP | V3 | V2 | Feature Flag |
---|---|---|---|---|---|
Array to Array | bitround | | ✓ | ✓ | bitround |
Array to Bytes | zfp, zfpy (V2) | | ✓ | ✓ | zfp |
Array to Bytes | pcodec | | ✓ | ✓ | pcodec |
Array to Bytes | vlen | | ✓ | | |
Array to Bytes | vlen_v2, vlen-* (V2) | | ✓ | ✓ | |
Bytes to Bytes | bz2 | | ✓ | ✓ | bz2 |
Bytes to Bytes | gdeflate | | ✓ | | gdeflate |
By default, the "name" of experimental codecs in array metadata links to the codec documentation in this crate.
This is configurable with Config::experimental_codec_names_mut.
Chunk Grids
Chunk Grid | ZEP | V3 | V2 | Feature Flag |
---|---|---|---|---|
regular | ZEP0001 | ✓ | ✓ | |
rectangular | ZEP0003 | ✓ | | |
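The difference between the two grids can be sketched along a single dimension. In a regular grid every chunk has the same size, so locating an element is a division. In a rectangular (variable chunking) grid the chunk sizes are listed explicitly, so the lookup walks the chunk boundaries. Both functions below are hypothetical illustrations and do not reflect how zarrs represents chunk grids internally.

```rust
use std::num::NonZeroU64;

/// Regular grid, one dimension: returns (chunk index, offset within chunk).
pub fn locate_in_regular(chunk_size: NonZeroU64, index: u64) -> (u64, u64) {
    (index / chunk_size.get(), index % chunk_size.get())
}

/// Rectangular grid, one dimension: `chunk_sizes` lists the size of each
/// chunk in order. Returns (chunk index, offset within chunk), or `None`
/// if `index` lies beyond the last chunk.
pub fn locate_in_rectangular(chunk_sizes: &[u64], index: u64) -> Option<(usize, u64)> {
    let mut start = 0u64;
    for (chunk, &size) in chunk_sizes.iter().enumerate() {
        let end = start.saturating_add(size);
        if index < end {
            return Some((chunk, index - start));
        }
        start = end;
    }
    None
}
```

For example, with chunk sizes `[2, 3, 1]` the element at index 4 falls in chunk 1 at offset 2. For large grids, precomputed cumulative offsets with a binary search would avoid the linear scan.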
Storage Transformers
Zarr V3 does not currently define any storage transformers.
zarrs supports two internal storage transformers for debugging: usage log and performance metrics.
§Storage Support
zarrs supports a wide range of storage backends through the opendal and object_store crates.
Stores and Storage Adapters
Store/Storage Adapter | ZEP | Read | Write | List | Sync | Async | Feature Flag |
---|---|---|---|---|---|---|---|
FilesystemStore | ZEP0001 | ✓ | ✓ | ✓ | ✓ | | |
MemoryStore | | ✓ | ✓ | ✓ | ✓ | | |
HTTPStore | | ✓ | | | ✓ | | http |
OpendalStore | | ✓* | ✓* | ✓* | ✓ | | opendal |
AsyncOpendalStore | | ✓* | ✓* | ✓* | | ✓ | opendal |
AsyncObjectStore | | ✓* | ✓* | ✓* | | ✓ | object_store |
ZipStorageAdapter | | ✓ | | ✓ | ✓ | | zip |
* Support depends on the opendal BlockingOperator/Operator or object_store store.
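Whatever the backend, a store is addressed by string keys. The sketch below shows how keys for an array's metadata and chunks could be formed, assuming the Zarr V3 `zarr.json` metadata document and the `default` chunk key encoding with a `/` separator. The helper names are hypothetical, and the actual chunk keys depend on the chunk key encoding recorded in the array metadata.

```rust
/// Hypothetical helper: key of the metadata document for the node at `node_path`.
pub fn metadata_key(node_path: &str) -> String {
    let prefix = node_path.trim_matches('/');
    if prefix.is_empty() {
        "zarr.json".to_string()
    } else {
        format!("{}/zarr.json", prefix)
    }
}

/// Hypothetical helper: key of a chunk, assuming the V3 `default` chunk key
/// encoding, i.e. a "c" prefix followed by the chunk indices joined with "/".
pub fn chunk_key(array_path: &str, chunk_indices: &[u64]) -> String {
    let mut key = String::new();
    let prefix = array_path.trim_matches('/');
    if !prefix.is_empty() {
        key.push_str(prefix);
        key.push('/');
    }
    key.push('c');
    for index in chunk_indices {
        key.push('/');
        key.push_str(&index.to_string());
    }
    key
}
```

With a filesystem-backed store rooted at `/path/to/store`, the key `group/array/c/0/1` would correspond to a file below `/path/to/store/group/array`.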
§Examples
use std::path::PathBuf;
use std::sync::Arc;

use zarrs::array::{ArrayBuilder, DataType, FillValue, ZARR_NAN_F32};
use zarrs::array::codec::GzipCodec; // requires gzip feature
use zarrs::array_subset::ArraySubset;
use zarrs::storage::{ReadableWritableListableStorage, store::FilesystemStore};

// Create a filesystem store
let store_path: PathBuf = "/path/to/store".into();
let store: ReadableWritableListableStorage =
    Arc::new(FilesystemStore::new(&store_path)?);

// Create a new V3 array using the array builder
let array = ArrayBuilder::new(
    vec![3, 4], // array shape
    DataType::Float32,
    vec![2, 2].try_into()?, // regular chunk shape (non-zero elements)
    FillValue::from(ZARR_NAN_F32),
)
.bytes_to_bytes_codecs(vec![
    Box::new(GzipCodec::new(5)?),
])
.dimension_names(["y", "x"].into())
.attributes(serde_json::json!({"Zarr V3": "is great"}).as_object().unwrap().clone())
.build(store.clone(), "/group/array")?; // /path/to/store/group/array

// Store the array metadata
array.store_metadata()?;
println!("{}", serde_json::to_string_pretty(array.metadata())?);
// {
//     "zarr_format": 3,
//     "node_type": "array",
//     ...
// }

// Perform some operations on the chunks
array.store_chunk_elements::<f32>(
    &[0, 1], // chunk index
    &[0.2, 0.3, 1.2, 1.3]
)?;
array.store_array_subset_ndarray::<f32, _>(
    &[1, 1], // array index
    ndarray::array![[-1.1, -1.2], [-2.1, -2.2]]
)?;
array.erase_chunk(&[1, 1])?;

// Retrieve all array elements as an ndarray
let array_subset_all = ArraySubset::new_with_shape(array.shape().to_vec());
let array_ndarray = array.retrieve_array_subset_ndarray::<f32>(&array_subset_all)?;
println!("{array_ndarray:4}");
// [[ NaN,  NaN,  0.2,  0.3],
//  [ NaN, -1.1, -1.2,  1.3],
//  [ NaN, -2.1,  NaN,  NaN]]
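The printed result follows from how chunk indices map to array regions. As a sketch, the hypothetical helper below (not part of zarrs) computes the region covered by a chunk of a regular grid, clipped to the array bounds. For the `[3, 4]` array with `[2, 2]` chunks, chunk `[0, 1]` covers rows 0..2 and columns 2..4, while chunk `[1, 1]` covers only row 2 of columns 2..4, which is why erasing it resets the last two values of the final row to the fill value.

```rust
use std::ops::Range;

/// Hypothetical helper (not a zarrs function): the array region covered by a
/// chunk of a regular grid, clipped to the array bounds. Returns `None` if
/// the dimensionalities differ, a chunk size is zero, or the chunk lies
/// entirely outside the array.
pub fn chunk_region(
    array_shape: &[u64],
    chunk_shape: &[u64],
    chunk_index: &[u64],
) -> Option<Vec<Range<u64>>> {
    if array_shape.len() != chunk_shape.len() || array_shape.len() != chunk_index.len() {
        return None;
    }
    let mut region = Vec::with_capacity(array_shape.len());
    for ((&extent, &size), &index) in array_shape.iter().zip(chunk_shape).zip(chunk_index) {
        let start = index.checked_mul(size)?;
        if size == 0 || start >= extent {
            return None;
        }
        // Chunks on the upper edge of the array may extend past its bounds.
        let end = start.saturating_add(size).min(extent);
        region.push(start..end);
    }
    Some(region)
}
```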
Examples can be run with cargo run --example <EXAMPLE_NAME>.
- Add -- --usage-log to see storage API calls during example execution.
- Some examples require non-default features, which can be enabled with --all-features or --features <FEATURES>.
§Sync API Examples
array_write_read, array_write_read_ndarray, sharded_array_write_read, rectangular_array_write_read, zip_array_write_read, http_array_read.
§Async API Examples
async_array_write_read, async_http_array_read_object_store, async_http_array_read_opendal.
§Crate Features
§Default
- ndarray: ndarray utility functions for Array.
- Codecs: blosc, gzip, transpose, zstd, sharding, crc32c.
§Non-Default
- async: an experimental asynchronous API for stores, Array, and Group.
  - The async API is runtime-agnostic. This has some limitations that are detailed in the Array docs.
  - The async API is not as performant as the sync API.
- Codecs: bitround, bz2, pcodec, zfp.
- Stores: http, object_store, opendal, zip.
§zarrs Ecosystem
- zarrs_tools: Various tools for creating and manipulating Zarr V3 data.
- zarrs_ffi: A subset of zarrs exposed as a C API.
§Licence
zarrs is licensed under either of
- the Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0) or
- the MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT),
at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
Modules§
- Zarr arrays.
- Array subsets.
- Byte ranges.
- zarrs global configuration options.
- Zarr groups.
- Zarr metadata.
- Zarr nodes.
- Zarr V3 extension point utilities.
- Zarr storage (stores and storage transformers).
- zarrs version information.