zarrs is a Rust library for the Zarr storage format for multidimensional arrays and metadata.
If you are a Python user, check out zarrs-python.
It includes a high-performance codec pipeline for the reference zarr-python implementation.
zarrs supports Zarr V3 and a V3 compatible subset of Zarr V2.
It is fully up-to-date and conformant with the Zarr 3.1 specification with support for:
- all core extensions (data types, codecs, chunk grids, chunk key encodings, storage transformers),
- all accepted Zarr Enhancement Proposals (ZEPs) and several draft ZEPs:
  - ZEP 0003: Variable chunking
  - ZEP 0007: Strings
  - ZEP 0009: Zarr Extension Naming
- various registered extensions from zarr-developers/zarr-extensions/,
- experimental codecs intended for future registration, and
- user-defined custom extensions and stores.
A changelog can be found here. Correctness issues with past versions are detailed here.
Developed at the Department of Materials Physics, Australian National University, Canberra, Australia.
§Getting Started
- Review the Zarr version support, array extension support (codecs, data types, etc.), storage support, and the zarrs ecosystem.
- View the examples below.
- Read the documentation and The zarrs Book.
§Zarr Version Support
zarrs has first-class Zarr V3 support and additionally supports a compatible subset of Zarr V2 data that:
- can be converted to V3 with only a metadata change, and
- uses array metadata that is recognised and supported for encoding/decoding.
zarrs supports forward conversion from Zarr V2 to V3. See Converting Zarr V2 to V3 in The zarrs Book, or try the zarrs_reencode CLI tool.
§Array Extension Support
Extensions are grouped into the following categories:
- Core: defined in the Zarr V3 specification and are fully supported.
- Registered: specified at https://github.com/zarr-developers/zarr-extensions/
  - Registered extensions listed in the tables below are fully supported unless otherwise indicated.
- Experimental: indicated by 🚧 in the tables below and recommended for evaluation only.
  - Experimental extensions are either pending registration or have no formal specification outside of the zarrs docs.
  - Experimental extensions may be unrecognised or incompatible with other Zarr implementations.
  - Experimental extensions may change in future releases without maintaining backwards compatibility.
- Deprecated: indicated by strikethrough in the tables below.
  - Deprecated aliases will not be removed, but are not recommended for use in new arrays.
  - Deprecated extensions may be removed in future releases.
zarrs will persist extension names if opening an existing array or creating an array from metadata.
§Data Types
† Additional features (e.g. microfloat / float8) may be required to parse floating point fill values. All subfloat types support hex string fill values.
§Codecs
| Codec Type | V3 name | V2 id | Feature Flag* |
|---|---|---|---|
| Array to Array | transpose | (implicit with "order": "F") | **transpose** |
| Array to Array | 🚧 reshape | - | |
| Array to Array | 🚧 numcodecs.fixedscaleoffset | fixedscaleoffset | |
| Array to Array | bitround | bitround | bitround |
| Array to Array | 🚧 zarrs.squeeze | - | |
| Array to Bytes | bytes | (implicit array-to-bytes) | |
| Array to Bytes | 🚧 zarrs.optional | - | |
| Array to Bytes | sharding_indexed | - | **sharding** |
| Array to Bytes | 🚧 vlen-array | vlen-array | |
| Array to Bytes | vlen-bytes | vlen-bytes | |
| Array to Bytes | vlen-utf8 | vlen-utf8 | |
| Array to Bytes | packbits | - | |
| Array to Bytes | 🚧 numcodecs.pcodec | pcodec | pcodec |
| Array to Bytes | 🚧 numcodecs.zfpy | zfpy | zfp |
| Array to Bytes | 🚧 zarrs.vlen | - | |
| Array to Bytes | 🚧 zarrs.vlen_v2 | - | |
| Array to Bytes | zfp | - | zfp |
| Bytes to Bytes | blosc | blosc | **blosc** |
| Bytes to Bytes | crc32c | crc32c | **crc32c** |
| Bytes to Bytes | gzip | gzip | **gzip** |
| Bytes to Bytes | zstd | zstd | **zstd** |
| Bytes to Bytes | 🚧 numcodecs.adler32 | adler32 | adler32 |
| Bytes to Bytes | 🚧 numcodecs.bz2 | bz2 | bz2 |
| Bytes to Bytes | 🚧 numcodecs.fletcher32 | fletcher32 | fletcher32 |
| Bytes to Bytes | 🚧 numcodecs.shuffle | shuffle | |
| Bytes to Bytes | 🚧 numcodecs.zlib | zlib | zlib |
| Bytes to Bytes | 🚧 zarrs.gdeflate | - | gdeflate |
* Bolded feature flags are part of the default set of features.
zarrs supports arrays created with zarr-python 3.0.0+ and numcodecs 0.15.1+ with various numcodecs.zarr3 codecs.
§Chunk Grids
| Chunk Grid | ZEP | V3 | V2 | Feature Flag |
|---|---|---|---|---|
| regular | ZEP0001 | ✓ | ✓ | |
| rectilinear | | ✓ | | |
| 🚧 rectangular | ZEP0003 (draft) | ✓ | | |
| 🚧 zarrs.regular_bounded | | ✓ | | |
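As a rough illustration of what a regular chunk grid implies, the sketch below maps element indices to chunk indices using only the standard library. It is not the zarrs API: `grid_shape`, `chunk_index`, and `chunk_ranges` are hypothetical helper names introduced here, and the clipping of edge chunks to the array bounds is an assumption about how the valid region of a chunk is usually described.

```rust
use std::ops::Range;

/// Number of chunks along each dimension (ceiling division).
/// Hypothetical helper, not part of zarrs.
fn grid_shape(array_shape: &[u64], chunk_shape: &[u64]) -> Vec<u64> {
    array_shape
        .iter()
        .zip(chunk_shape)
        .map(|(&a, &c)| (a + c - 1) / c)
        .collect()
}

/// Index of the chunk that contains the element at `element`.
fn chunk_index(element: &[u64], chunk_shape: &[u64]) -> Vec<u64> {
    element
        .iter()
        .zip(chunk_shape)
        .map(|(&e, &c)| e / c)
        .collect()
}

/// Half-open element ranges covered by chunk `index`,
/// clipped to the array bounds for chunks on the edge of the array.
fn chunk_ranges(index: &[u64], chunk_shape: &[u64], array_shape: &[u64]) -> Vec<Range<u64>> {
    index
        .iter()
        .zip(chunk_shape)
        .zip(array_shape)
        .map(|((&i, &c), &a)| (i * c)..((i + 1) * c).min(a))
        .collect()
}
```

For an array of shape `[3, 4]` with chunk shape `[2, 2]`, this gives a 2 x 2 grid; chunk `[0, 1]` covers rows `0..2` and columns `2..4`, while chunk `[1, 1]` covers only row `2..3` within the array bounds.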
§Chunk Key Encodings
| Chunk Key Encoding | ZEP | V3 | V2 | Feature Flag |
|---|---|---|---|---|
| default | ZEP0001 | ✓ | | |
| v2 | ZEP0001 | ✓ | ✓ | |
| 🚧 zarrs.default_suffix | | ✓ | | |
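To make the difference between the `default` and `v2` encodings concrete, here is a minimal standard-library sketch of how a chunk index could be turned into a store key, following the key layout described in the Zarr V3 specification. The function names `default_key` and `v2_key` are hypothetical and are not the zarrs API; the separator is passed explicitly rather than read from array metadata.

```rust
/// `default` encoding: the prefix "c" followed by each index, joined by the separator.
/// A zero-dimensional array maps to the key "c".
/// Hypothetical helper, not part of zarrs.
fn default_key(chunk_index: &[u64], separator: char) -> String {
    let mut key = String::from("c");
    for i in chunk_index {
        key.push(separator);
        key.push_str(&i.to_string());
    }
    key
}

/// `v2` encoding: the indices joined by the separator, with no prefix.
/// A zero-dimensional array maps to the key "0".
fn v2_key(chunk_index: &[u64], separator: char) -> String {
    if chunk_index.is_empty() {
        return String::from("0");
    }
    chunk_index
        .iter()
        .map(|i| i.to_string())
        .collect::<Vec<_>>()
        .join(&separator.to_string())
}
```

With the usual separators, chunk `[0, 1]` becomes `c/0/1` under `default` (separator `/`) and `0.1` under `v2` (separator `.`).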
§Storage Transformers
Zarr V3 does not currently define any storage transformers.
§Storage Support
zarrs supports a huge range of stores (including custom stores) via the zarrs_storage API.
| Store/Storage Adapter | ZEP | Read | Write | List | Sync | Async | Crate |
|---|---|---|---|---|---|---|---|
| MemoryStore | | ✓ | ✓ | ✓ | ✓ | | zarrs_storage† |
| FilesystemStore | 0001 | ✓ | ✓ | ✓ | ✓ | | zarrs_filesystem‡ |
| AsyncOpendalStore | | ✓* | ✓* | ✓* | | ✓ | zarrs_opendal |
| AsyncObjectStore | | ✓* | ✓* | ✓* | | ✓ | zarrs_object_store |
| AsyncIcechunkStore | | ✓* | ✓* | ✓* | | ✓ | zarrs_icechunk |
| HTTPStore | | ✓ | | | ✓ | | zarrs_http |
| AsyncToSyncStorageAdapter | | ✓ | ✓ | ✓ | ✓ | | zarrs_storage† |
| SyncToAsyncStorageAdapter | | ✓ | ✓ | ✓ | | ✓ | zarrs_storage† |
| UsageLogStorageAdapter | | ✓ | ✓ | ✓ | ✓ | ✓ | zarrs_storage† |
| PerformanceMetricsStorageAdapter | | ✓ | ✓ | ✓ | ✓ | ✓ | zarrs_storage† |
| ZipStorageAdapter | | ✓ | | ✓ | ✓ | | zarrs_zip |

† Re-exported in the zarrs::storage module.
‡ Re-exported as the zarrs::filesystem module.
* Support depends on the underlying store.
The opendal and object_store crates are popular Rust storage backends that are fully supported via zarrs_opendal and zarrs_object_store.
These backends provide more feature complete HTTP stores than zarrs_http.
zarrs_icechunk implements the Icechunk transactional storage engine, a storage specification for Zarr that supports object_store stores.
The AsyncToSyncStorageAdapter enables some async stores to be used in a sync context.
§Logging
zarrs logs information and warnings using the log crate.
A logging implementation must be enabled to capture logs.
See the log crate documentation for more details.
§Examples
§Create and Read a Zarr Hierarchy
use std::path::PathBuf;
use std::sync::Arc;

// Create a filesystem store
let store_path: PathBuf = "/path/to/hierarchy.zarr".into();
let store: zarrs::storage::ReadableWritableListableStorage = Arc::new(
    // zarrs::filesystem requires the filesystem feature
    zarrs::filesystem::FilesystemStore::new(&store_path)?
);
// Write the root group metadata
zarrs::group::GroupBuilder::new()
    .build(store.clone(), "/")?
    // .attributes(...)
    .store_metadata()?;
// Create a new sharded V3 array using the array builder
let array = zarrs::array::ArrayBuilder::new(
    vec![3, 4], // array shape
    vec![2, 2], // regular chunk (shard) shape
    zarrs::array::data_type::float32(),
    f32::NAN, // fill value (unwritten elements read back as NaN)
)
.subchunk_shape(vec![2, 1]) // subchunk (inner chunk) shape
.bytes_to_bytes_codecs(vec![
    // GzipCodec requires the gzip feature
    Arc::new(zarrs::array::codec::GzipCodec::new(5)?),
])
.dimension_names(["y", "x"].into())
.attributes(serde_json::json!({"Zarr V3": "is great"}).as_object().unwrap().clone())
.build(store.clone(), "/array")?; // /path/to/hierarchy.zarr/array
// Store the array metadata
array.store_metadata()?;
println!("{}", serde_json::to_string_pretty(array.metadata())?);
// {
// "zarr_format": 3,
// "node_type": "array",
// ...
// }
// Perform some write operations on the chunks
array.store_chunk(
    &[0, 1], // chunk index
    &[0.2f32, 0.3, 1.2, 1.3]
)?;
array.store_array_subset(
    &[1..3, 1..3], // array indices
    &ndarray::array![[-1.1f32, -1.2], [-2.1, -2.2]]
)?;
array.erase_chunk(&[1, 1])?;
// Retrieve all array elements as an ndarray
let array_all: ndarray::Array2<f32> = array.retrieve_array_subset(&[0..3, 0..4])?;
println!("{array_all:4}");
// [[ NaN, NaN, 0.2, 0.3],
// [ NaN, -1.1, -1.2, 1.3],
// [ NaN, -2.1, NaN, NaN]]
// Retrieve a chunk directly
let array_chunk: ndarray::Array2<f32> = array.retrieve_chunk(
    &[0, 1], // chunk index
)?;
println!("{array_chunk:4}");
// [[ 0.2, 0.3],
// [ -1.2, 1.3]]
// Retrieve a subchunk
use zarrs::array::ArrayShardedReadableExt;
let shard_index_cache = zarrs::array::ArrayShardedReadableExtCache::new(&array);
let array_subchunk: ndarray::Array2<f32> = array.retrieve_subchunk_opt(
    &shard_index_cache,
    &[0, 3], // subchunk index
    &zarrs::array::CodecOptions::default(),
)?;
println!("{array_subchunk:4}");
// [[ 0.3],
//  [ 1.3]]
§Additional Examples
Various examples can be found in the examples/ directory of the zarrs repository that demonstrate:
- creating and manipulating Zarr hierarchies with various stores (sync and async), codecs, etc.,
- converting between Zarr V2 and V3, and
- creating custom data types.
Examples can be run with cargo run --example <EXAMPLE_NAME>.
- Some examples require non-default features, which can be enabled with --all-features or --features <FEATURES>.
- Some examples support a -- --usage-log argument to print storage API calls during execution.
§Crate Features
§Default
- filesystem: Re-export zarrs_filesystem as zarrs::filesystem.
- ndarray: ndarray utility functions for Array.
- Codecs: blosc, crc32c, gzip, sharding, transpose, zstd.
§Non-Default
- async: an experimental asynchronous API for stores, Array, and Group.
  - The async API is runtime-agnostic. This has some limitations that are detailed in the Array docs.
  - The async API is not as performant as the sync API.
- Codecs: adler32, bitround, bz2, fletcher32, gdeflate, pcodec, zfp, zlib.
- dlpack: adds convenience methods for DLPack tensor interop to Array.
- Additional Element/ElementOwned implementations:
  - float8: add support for float8 subfloat data types.
  - microfloat: add support for microfloat subfloat data types.
  - jiff: add support for jiff time data types.
  - chrono: add support for chrono time data types.
§zarrs Ecosystem
The Zarr specification is inherently unstable. It is under active development and new extensions are continually being introduced.
The zarrs crate has been split into multiple crates to:
- allow external implementations of stores and extension points to target a relatively stable API compatible with a range of zarrs versions,
- enable automatic backporting of metadata compatibility fixes and changes due to standardisation,
- stay up-to-date with unstable public dependencies (e.g. opendal, object_store, icechunk, etc.) without impacting the release cycle of zarrs, and
- improve compilation times.
A hierarchical overview of these crates can be found in The zarrs Book.
§Core
- zarrs: The core library for manipulating Zarr hierarchies.
- zarrs_metadata: Zarr metadata support.
  - Re-exported as zarrs::metadata.
- zarrs_metadata_ext: Zarr extensions metadata support.
  - Re-exported as zarrs::metadata_ext.
- zarrs_plugin: The plugin API.
  - Re-exported as zarrs::plugin.
- zarrs_storage: The storage API.
  - Re-exported as zarrs::storage.
- zarrs_chunk_grid: The chunk grid extension API.
  - Items are re-exported under zarrs::array and zarrs::array::chunk_grid.
- zarrs_chunk_key_encoding: The chunk key encoding extension API.
  - Items are re-exported under zarrs::array and zarrs::array::chunk_key_encoding.
- zarrs_codec: The codec extension API.
  - Items are re-exported under zarrs::array and zarrs::array::codec.
- zarrs_data_type: The data type extension API.
  - Items are re-exported under zarrs::array and zarrs::array::data_type.
§Stores
- zarrs_filesystem: A filesystem store.
  - Re-exported as zarrs::filesystem.
- zarrs_object_store: object_store store support.
- zarrs_opendal: opendal store support.
- zarrs_http: A synchronous HTTP store.
- zarrs_zip: A storage adapter for zip files.
- zarrs_icechunk: icechunk store support.
  - git-like version control for Zarr hierarchies.
  - Read “virtual Zarr datacubes” of archival formats (e.g., netCDF4, HDF5, etc.) created by VirtualiZarr and backed by icechunk.
§Bindings
- zarrs-python: A high-performance codec pipeline for zarr-python.
- zarrs_ffi: A subset of zarrs exposed as a C/C++ API.
§Zarr Metadata Conventions
ome_zarr_metadata: A library for OME-Zarr (previously OME-NGFF) metadata.
§Tools
- zarrs_tools: Various tools for creating and manipulating Zarr V3 data with the zarrs Rust crate.
  - A reencoder that can change codecs, chunk shape, convert Zarr V2 to V3, etc.
  - Create an OME-Zarr hierarchy from a Zarr array.
  - Transform arrays: crop, rescale, downsample, gradient magnitude, gaussian, noise filtering, etc.
§Benchmarks
- zarr_benchmarks: Benchmarks of various Zarr V3 implementations: zarrs, zarr-python, tensorstore.
§Licence
zarrs is licensed under either of
- the Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0), or
- the MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT),

at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
§Re-exports
pub use zarrs_filesystem as filesystem; (requires the filesystem feature)
pub use zarrs_metadata as metadata;
pub use zarrs_metadata_ext as metadata_ext;
pub use zarrs_plugin as plugin;
pub use zarrs_storage as storage;