zarrs is a Rust library for the Zarr storage format for multidimensional arrays and metadata.
If you are a Python user, check out zarrs-python. It includes a high-performance codec pipeline for the reference zarr-python implementation.
zarrs supports Zarr V3 and a V3-compatible subset of Zarr V2.
It is fully up-to-date and conformant with the Zarr 3.1 specification with support for:
- all core extensions (data types, codecs, chunk grids, chunk key encodings, storage transformers),
- all accepted Zarr Enhancement Proposals (ZEPs) and several draft ZEPs:
  - ZEP 0003: Variable chunking
  - ZEP 0007: Strings
  - ZEP 0009: Zarr Extension Naming
- various registered extensions from zarr-developers/zarr-extensions/,
- experimental codecs and data types intended for future registration, and
- user-defined custom extensions and stores.
A changelog can be found here. Correctness issues with past versions are detailed here.
Developed at the Department of Materials Physics, Australian National University, Canberra, Australia.
§Getting Started
- Review the implementation status, which summarises Zarr version support, array support (codecs, data types, etc.) and storage support.
- Read The zarrs Book.
- View the examples and the example below.
- Read the documentation.
- Check out the zarrs ecosystem.
§Implementation Status
§Zarr Version Support
zarrs has first-class Zarr V3 support and additionally supports a compatible subset of Zarr V2 data that:
- can be converted to V3 with only a metadata change, and
- uses array metadata that is recognised and supported for encoding/decoding.
zarrs supports forward conversion from Zarr V2 to V3. See “Converting Zarr V2 to V3” in The zarrs Book, or try the zarrs_reencode CLI tool.
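One ingredient that helps a metadata-only conversion work is that, as far as the Zarr V3 specification goes, V3 defines a `v2` chunk key encoding alongside its `default` one, so chunk keys written by a V2 implementation do not need to be renamed. The sketch below is a standalone illustration of the two key layouts; the function names are hypothetical and are not part of the zarrs API.

```rust
/// Chunk key under the V3 `default` encoding: the prefix "c" followed by
/// each chunk index, all joined by the separator ("/" by default).
fn default_chunk_key(chunk_indices: &[u64], separator: char) -> String {
    let mut key = String::from("c");
    for index in chunk_indices {
        key.push(separator);
        key.push_str(&index.to_string());
    }
    key
}

/// Chunk key under the `v2` encoding: the chunk indices joined by the
/// separator ("." by default), or "0" for a zero-dimensional array.
fn v2_chunk_key(chunk_indices: &[u64], separator: char) -> String {
    if chunk_indices.is_empty() {
        return String::from("0");
    }
    chunk_indices
        .iter()
        .map(|index| index.to_string())
        .collect::<Vec<_>>()
        .join(&separator.to_string())
}

fn main() {
    // The same chunk is addressed by different keys under each encoding.
    println!("{}", default_chunk_key(&[0, 1], '/'));
    println!("{}", v2_chunk_key(&[0, 1], '.'));
}
```

Because only the key layout differs, an array whose codecs and data type are also expressible in V3 can keep its chunk data in place while its metadata document is rewritten.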
§Array Support
Data Types
Codecs
Codec Type | Default codec name | Specification | Feature Flag* |
---|---|---|---|
Array to Array | transpose | Zarr V3.0 Transpose | **transpose** |
Array to Array | numcodecs.fixedscaleoffset | Experimental | |
Array to Array | numcodecs.bitround † | Experimental | bitround |
Array to Array | zarrs.squeeze | Experimental | |
Array to Bytes | bytes | Zarr V3.0 Bytes | |
Array to Bytes | sharding_indexed | Zarr V3.0 Sharding | **sharding** |
Array to Bytes | vlen-array | Experimental | |
Array to Bytes | vlen-bytes | zarr-extensions/codecs/vlen-bytes | |
Array to Bytes | vlen-utf8 | zarr-extensions/codecs/vlen-utf8 | |
Array to Bytes | numcodecs.pcodec | Experimental | pcodec |
Array to Bytes | numcodecs.zfpy | Experimental | zfp |
Array to Bytes | packbits | zarr-extensions/codecs/packbits | |
Array to Bytes | zarrs.vlen | Experimental | |
Array to Bytes | zarrs.vlen_v2 | Experimental | |
Array to Bytes | zfp | zarr-extensions/codecs/zfp | zfp |
Bytes to Bytes | blosc | Zarr V3.0 Blosc | **blosc** |
Bytes to Bytes | crc32c | Zarr V3.0 CRC32C | **crc32c** |
Bytes to Bytes | gzip | Zarr V3.0 Gzip | **gzip** |
Bytes to Bytes | zstd | zarr-extensions/codecs/zstd | **zstd** |
Bytes to Bytes | numcodecs.bz2 | Experimental | bz2 |
Bytes to Bytes | numcodecs.fletcher32 | Experimental | fletcher32 |
Bytes to Bytes | numcodecs.shuffle | Experimental | |
Bytes to Bytes | numcodecs.zlib | Experimental | zlib |
Bytes to Bytes | zarrs.gdeflate | Experimental | gdeflate |
* Bolded feature flags are part of the default set of features.
† numcodecs.bitround supports additional data types not supported by zarr-python/numcodecs.
Codecs have three potential statuses:
- Core: These are defined in the Zarr V3 specification and are fully supported.
- Registered: These are specified at https://github.com/zarr-developers/zarr-extensions/ and are fully supported unless otherwise indicated.
- Experimental: These are recommended for evaluation only.
  - These codecs may have no formal specification or are pending registration at https://github.com/zarr-developers/zarr-extensions/.
  - These codecs may change in future releases without maintaining backwards compatibility.
Codec names and aliases are configurable with Config::codec_aliases_v3_mut and Config::codec_aliases_v2_mut.
zarrs will persist codec names if opening an existing array or creating an array from metadata.
zarrs supports arrays created with zarr-python 3.x.x with various numcodecs.zarr3 codecs.
However, arrays must be written with numcodecs 0.15.1+.
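The three codec types in the table above compose in a fixed order: any array-to-array codecs run first, then exactly one array-to-bytes codec, then any bytes-to-bytes codecs, and decoding applies the inverses in reverse. The following std-only sketch illustrates that ordering with a transpose, a little-endian serialisation, and a byte shuffle. All function names are hypothetical, the stages are simplified stand-ins, and none of this is the zarrs codec implementation.

```rust
/// Array-to-array stage: transpose a row-major `rows x cols` array.
fn transpose(data: &[f32], rows: usize, cols: usize) -> Vec<f32> {
    let mut out = Vec::with_capacity(data.len());
    for c in 0..cols {
        for r in 0..rows {
            out.push(data[r * cols + c]);
        }
    }
    out
}

/// Array-to-bytes stage: serialise each element as little-endian bytes.
fn to_le_bytes(data: &[f32]) -> Vec<u8> {
    data.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// Inverse of `to_le_bytes`.
fn from_le_bytes(bytes: &[u8]) -> Vec<f32> {
    bytes
        .chunks_exact(4)
        .map(|b| f32::from_le_bytes([b[0], b[1], b[2], b[3]]))
        .collect()
}

/// Bytes-to-bytes stage: group the j-th byte of every element together.
/// Assumes `bytes.len()` is a multiple of `element_size`.
fn shuffle(bytes: &[u8], element_size: usize) -> Vec<u8> {
    let count = bytes.len() / element_size;
    let mut out = vec![0u8; bytes.len()];
    for i in 0..count {
        for j in 0..element_size {
            out[j * count + i] = bytes[i * element_size + j];
        }
    }
    out
}

/// Inverse of `shuffle`.
fn unshuffle(bytes: &[u8], element_size: usize) -> Vec<u8> {
    let count = bytes.len() / element_size;
    let mut out = vec![0u8; bytes.len()];
    for i in 0..count {
        for j in 0..element_size {
            out[i * element_size + j] = bytes[j * count + i];
        }
    }
    out
}

/// Encode: array-to-array, then array-to-bytes, then bytes-to-bytes.
fn encode(chunk: &[f32], rows: usize, cols: usize) -> Vec<u8> {
    shuffle(&to_le_bytes(&transpose(chunk, rows, cols)), 4)
}

/// Decode: undo each stage in reverse order.
fn decode(bytes: &[u8], rows: usize, cols: usize) -> Vec<f32> {
    // Transposing the stored `cols x rows` array restores the original layout.
    transpose(&from_le_bytes(&unshuffle(bytes, 4)), cols, rows)
}

fn main() {
    let chunk = [1.0_f32, 2.0, 3.0, 4.0, 5.0, 6.0]; // 2 x 3, row-major
    let encoded = encode(&chunk, 2, 3);
    assert_eq!(decode(&encoded, 2, 3), chunk);
}
```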
Chunk Grids
Chunk Grid | ZEP | V3 | V2 | Feature Flag |
---|---|---|---|---|
regular | ZEP0001 | ✓ | ✓ | |
rectangular (experimental) | ZEP0003 (draft) | ✓ | | |
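The difference between the two grids can be illustrated along a single dimension: a regular grid uses one chunk size everywhere, whereas a rectangular (variable) grid lists a size per chunk. The sketch below shows how an element index might be mapped to a chunk in each case. It is an illustration of the idea only; the function names are hypothetical, and it reflects neither the zarrs API nor the draft ZEP0003 metadata layout.

```rust
/// Start offset of every chunk along one dimension (running sum of sizes).
fn chunk_offsets(chunk_sizes: &[u64]) -> Vec<u64> {
    let mut offsets = Vec::with_capacity(chunk_sizes.len());
    let mut start = 0u64;
    for size in chunk_sizes {
        offsets.push(start);
        start += *size;
    }
    offsets
}

/// Variable chunking: find the chunk containing `element` and the
/// element's offset inside that chunk, or `None` if it is out of bounds.
fn locate(chunk_sizes: &[u64], element: u64) -> Option<(usize, u64)> {
    let mut start = 0u64;
    for (chunk, size) in chunk_sizes.iter().enumerate() {
        if element < start + *size {
            return Some((chunk, element - start));
        }
        start += *size;
    }
    None
}

/// Regular chunking: every chunk has the same size, so a division suffices.
fn locate_regular(chunk_size: u64, element: u64) -> (u64, u64) {
    (element / chunk_size, element % chunk_size)
}

fn main() {
    // Three chunks of sizes 2, 3 and 5 cover ten elements.
    println!("{:?}", chunk_offsets(&[2, 3, 5]));
    println!("{:?}", locate(&[2, 3, 5], 4));
    println!("{:?}", locate_regular(2, 3));
}
```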
Storage Transformers
Zarr V3 does not currently define any storage transformers.
§Storage Support
zarrs supports a huge range of stores (including custom stores) via the zarrs_storage API.
Stores
Store/Storage Adapter | ZEP | Read | Write | List | Sync | Async | Crate |
---|---|---|---|---|---|---|---|
MemoryStore | | ✓ | ✓ | ✓ | ✓ | | zarrs_storage† |
FilesystemStore | 0001 | ✓ | ✓ | ✓ | ✓ | | zarrs_filesystem‡ |
OpendalStore | | ✓* | ✓* | ✓* | ✓ | | zarrs_opendal |
AsyncOpendalStore | | ✓* | ✓* | ✓* | | ✓ | zarrs_opendal |
AsyncObjectStore | | ✓* | ✓* | ✓* | | ✓ | zarrs_object_store |
AsyncIcechunkStore | | ✓* | ✓* | ✓* | | ✓ | zarrs_icechunk |
HTTPStore | | ✓ | | | ✓ | | zarrs_http |
AsyncToSyncStorageAdapter | | ✓ | ✓ | ✓ | ✓ | ✓ | zarrs_storage† |
UsageLogStorageAdapter | | ✓ | ✓ | ✓ | ✓ | ✓ | zarrs_storage† |
PerformanceMetricsStorageAdapter | | ✓ | ✓ | ✓ | ✓ | ✓ | zarrs_storage† |
ZipStorageAdapter | | ✓ | | ✓ | ✓ | | zarrs_zip |
† Re-exported in the zarrs::storage module.
‡ Re-exported as the zarrs::filesystem module.
* Support depends on the underlying store.
The opendal and object_store crates are popular Rust storage backends that are fully supported via zarrs_opendal and zarrs_object_store.
These backends provide more feature-complete HTTP stores than zarrs_http.
zarrs_icechunk implements the Icechunk transactional storage engine, a storage specification for Zarr that supports object_store stores.
The AsyncToSyncStorageAdapter enables some async stores to be used in a sync context.
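The Read, Write and List columns above correspond to key-value operations, and a storage adapter is a store that wraps another store. The sketch below illustrates that pattern with a toy in-memory store and a call-logging wrapper. The `KeyValueStore` trait and both types are hypothetical simplifications written for this illustration; they are not the traits or types exported by zarrs_storage.

```rust
use std::cell::RefCell;
use std::collections::BTreeMap;

/// Hypothetical, simplified store interface (not the zarrs_storage traits).
trait KeyValueStore {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
    fn set(&mut self, key: &str, value: &[u8]);
    fn list_prefix(&self, prefix: &str) -> Vec<String>;
}

/// Toy in-memory store backed by a sorted map.
#[derive(Default)]
struct ToyMemoryStore {
    data: BTreeMap<String, Vec<u8>>,
}

impl KeyValueStore for ToyMemoryStore {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.data.get(key).cloned()
    }
    fn set(&mut self, key: &str, value: &[u8]) {
        self.data.insert(key.to_string(), value.to_vec());
    }
    fn list_prefix(&self, prefix: &str) -> Vec<String> {
        self.data.keys().filter(|k| k.starts_with(prefix)).cloned().collect()
    }
}

/// Toy adapter: records each call, then forwards it to the wrapped store.
struct ToyUsageLog<S: KeyValueStore> {
    inner: S,
    log: RefCell<Vec<String>>,
}

impl<S: KeyValueStore> ToyUsageLog<S> {
    fn new(inner: S) -> Self {
        Self { inner, log: RefCell::new(Vec::new()) }
    }
    /// Snapshot of the calls recorded so far.
    fn calls(&self) -> Vec<String> {
        self.log.borrow().clone()
    }
}

impl<S: KeyValueStore> KeyValueStore for ToyUsageLog<S> {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.log.borrow_mut().push(format!("get({key})"));
        self.inner.get(key)
    }
    fn set(&mut self, key: &str, value: &[u8]) {
        self.log.borrow_mut().push(format!("set({key}, {} bytes)", value.len()));
        self.inner.set(key, value);
    }
    fn list_prefix(&self, prefix: &str) -> Vec<String> {
        self.log.borrow_mut().push(format!("list_prefix({prefix})"));
        self.inner.list_prefix(prefix)
    }
}

fn main() {
    let mut store = ToyUsageLog::new(ToyMemoryStore::default());
    store.set("array/zarr.json", b"{}");
    store.set("array/c/0/1", &[1, 2, 3, 4]);
    let _ = store.list_prefix("array/c/");
    for call in store.calls() {
        println!("{call}");
    }
}
```

Because the adapter implements the same interface as the store it wraps, code written against the interface does not need to know whether logging is enabled.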
§Examples
§Create and Read a Zarr Hierarchy
use std::path::PathBuf;
use std::sync::Arc;

use zarrs::group::GroupBuilder;
use zarrs::array::{ArrayBuilder, DataType, FillValue, ZARR_NAN_F32};
use zarrs::array::codec::GzipCodec; // requires gzip feature
use zarrs::array_subset::ArraySubset;
use zarrs::storage::ReadableWritableListableStorage;
use zarrs::filesystem::FilesystemStore; // requires filesystem feature

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a filesystem store
    let store_path: PathBuf = "/path/to/hierarchy.zarr".into();
    let store: ReadableWritableListableStorage =
        Arc::new(FilesystemStore::new(&store_path)?);

    // Write the root group metadata
    GroupBuilder::new()
        .build(store.clone(), "/")?
        // .attributes(...)
        .store_metadata()?;

    // Create a new V3 array using the array builder
    let array = ArrayBuilder::new(
        vec![3, 4], // array shape
        DataType::Float32,
        vec![2, 2].try_into()?, // regular chunk shape (non-zero elements)
        FillValue::from(ZARR_NAN_F32),
    )
    .bytes_to_bytes_codecs(vec![
        Arc::new(GzipCodec::new(5)?),
    ])
    .dimension_names(["y", "x"].into())
    .attributes(serde_json::json!({"Zarr V3": "is great"}).as_object().unwrap().clone())
    .build(store.clone(), "/array")?; // /path/to/hierarchy.zarr/array

    // Store the array metadata
    array.store_metadata()?;
    println!("{}", serde_json::to_string_pretty(array.metadata())?);
    // {
    //     "zarr_format": 3,
    //     "node_type": "array",
    //     ...
    // }

    // Perform some operations on the chunks
    array.store_chunk_elements::<f32>(
        &[0, 1], // chunk index
        &[0.2, 0.3, 1.2, 1.3]
    )?;
    array.store_array_subset_ndarray::<f32, _>(
        &[1, 1], // array index (start of subset)
        ndarray::array![[-1.1, -1.2], [-2.1, -2.2]]
    )?;
    array.erase_chunk(&[1, 1])?;

    // Retrieve all array elements as an ndarray
    let array_ndarray = array.retrieve_array_subset_ndarray::<f32>(&array.subset_all())?;
    println!("{array_ndarray:4}");
    // [[ NaN,  NaN,  0.2,  0.3],
    //  [ NaN, -1.1, -1.2,  1.3],
    //  [ NaN, -2.1,  NaN,  NaN]]

    Ok(())
}
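The printed result can be checked by hand with regular-grid arithmetic: chunk index i along a dimension with chunk size s covers elements i*s up to (i+1)*s, clipped to the array shape. The std-only sketch below replays the three operations of the example on a flat buffer. It does not use zarrs; `chunk_region`, `write_region` and `replay_example` are hypothetical helpers written for this illustration.

```rust
use std::ops::Range;

/// Array region covered by a chunk of a regular grid, clipped to the array.
fn chunk_region(
    chunk_indices: &[usize],
    chunk_shape: &[usize],
    array_shape: &[usize],
) -> Vec<Range<usize>> {
    chunk_indices
        .iter()
        .zip(chunk_shape)
        .zip(array_shape)
        .map(|((&index, &size), &extent)| {
            let start = index * size;
            start.min(extent)..(start + size).min(extent)
        })
        .collect()
}

/// Overwrite a 2D region of a row-major array; `value_at` receives the
/// row and column relative to the start of the region.
fn write_region(
    array: &mut [f32],
    num_cols: usize,
    region: &[Range<usize>],
    value_at: impl Fn(usize, usize) -> f32,
) {
    for row in region[0].clone() {
        for col in region[1].clone() {
            array[row * num_cols + col] =
                value_at(row - region[0].start, col - region[1].start);
        }
    }
}

/// Replay the example: 3 x 4 array, 2 x 2 chunks, NaN fill value.
fn replay_example() -> Vec<f32> {
    let (array_shape, chunk_shape) = ([3, 4], [2, 2]);
    let mut array = vec![f32::NAN; 12]; // every element starts as the fill value

    // Store chunk [0, 1]: rows 0..2, columns 2..4.
    let chunk = [0.2_f32, 0.3, 1.2, 1.3];
    let region = chunk_region(&[0, 1], &chunk_shape, &array_shape);
    write_region(&mut array, 4, &region, |r, c| chunk[r * 2 + c]);

    // Store a 2 x 2 subset starting at [1, 1]: rows 1..3, columns 1..3.
    let subset = [-1.1_f32, -1.2, -2.1, -2.2];
    write_region(&mut array, 4, &[1..3, 1..3], |r, c| subset[r * 2 + c]);

    // Erase chunk [1, 1]: its (clipped) region reverts to the fill value.
    let region = chunk_region(&[1, 1], &chunk_shape, &array_shape);
    write_region(&mut array, 4, &region, |_, _| f32::NAN);

    array
}

fn main() {
    for row in replay_example().chunks(4) {
        println!("{row:?}");
    }
}
```

Note that chunk [1, 1] nominally spans rows 2..4, but the array has only three rows, so erasing it resets only the two elements in row 2.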
§More examples
Various examples can be found in the examples directory that demonstrate:
- creating and manipulating Zarr hierarchies with various stores (sync and async), codecs, etc.,
- converting between Zarr V2 and V3, and
- creating custom data types.
Examples can be run with cargo run --example <EXAMPLE_NAME>.
- Some examples require non-default features, which can be enabled with --all-features or --features <FEATURES>.
- Some examples support a -- --usage-log argument to print storage API calls during execution.
§Crate Features
§Default
- filesystem: Re-export zarrs_filesystem as zarrs::filesystem.
- ndarray: ndarray utility functions for Array.
- Codecs: blosc, crc32c, gzip, sharding, transpose, zstd.
§Non-Default
- async: an experimental asynchronous API for stores, Array, and Group.
  - The async API is runtime-agnostic. This has some limitations that are detailed in the Array docs.
  - The async API is not as performant as the sync API.
- dlpack: adds convenience methods for DLPack tensor interop to Array.
- Codecs: bitround, bz2, fletcher32, gdeflate, pcodec, zfp, zlib.
§zarrs Ecosystem
The Zarr specification is inherently unstable. It is under active development and new extensions are continually being introduced.
The zarrs crate has been split into multiple crates to:
- allow external implementations of stores and extension points to target a relatively stable API compatible with a range of zarrs versions,
- enable automatic backporting of metadata compatibility fixes and changes due to standardisation,
- stay up-to-date with unstable public dependencies (e.g. opendal, object_store, icechunk, etc.) without impacting the release cycle of zarrs, and
- improve compilation times.
A hierarchical overview of these crates can be found in The zarrs Book.
§Core
- zarrs: The core library for manipulating Zarr hierarchies.
- zarrs_metadata: Zarr metadata support (re-exported as zarrs::metadata).
- zarrs_metadata_ext: Zarr extensions metadata support (re-exported as zarrs::metadata_ext).
- zarrs_data_type: The data type extension API for zarrs (re-exported in zarrs::array::data_type).
- zarrs_storage: The storage API for zarrs (re-exported as zarrs::storage).
- zarrs_plugin: The plugin API for zarrs (re-exported as zarrs::plugin).
- zarrs_registry: The Zarr extension point registry for zarrs (re-exported as zarrs::registry).
§Stores
- zarrs_filesystem: A filesystem store (re-exported as zarrs::filesystem).
- zarrs_object_store: object_store store support.
- zarrs_opendal: opendal store support.
- zarrs_http: A synchronous HTTP store.
- zarrs_zip: A storage adapter for zip files.
- zarrs_icechunk: icechunk store support.
  - git-like version control for Zarr hierarchies.
  - Read “virtual Zarr datacubes” of archival formats (e.g., netCDF4, HDF5, etc.) created by VirtualiZarr and backed by icechunk.
§Bindings
- zarrs-python: A high-performance codec pipeline for zarr-python.
- zarrs_ffi: A subset of zarrs exposed as a C/C++ API.
§Zarr Metadata Conventions
- ome_zarr_metadata: A library for OME-Zarr (previously OME-NGFF) metadata.
§Tools
- zarrs_tools: Various tools for creating and manipulating Zarr V3 data with the zarrs Rust crate.
  - A reencoder that can change codecs, chunk shape, convert Zarr V2 to V3, etc.
  - Create an OME-Zarr hierarchy from a Zarr array.
  - Transform arrays: crop, rescale, downsample, gradient magnitude, gaussian, noise filtering, etc.
§Benchmarks
- zarr_benchmarks: Benchmarks of various Zarr V3 implementations: zarrs, zarr-python, tensorstore.
§Licence
zarrs is licensed under either of
- the Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0) or
- the MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT),
at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
§Re-exports
pub use zarrs_metadata as metadata;
pub use zarrs_metadata_ext as metadata_ext;
pub use zarrs_plugin as plugin;
pub use zarrs_registry as registry;
pub use zarrs_storage as storage;
pub use zarrs_filesystem as filesystem; // requires filesystem feature
§Modules
- array: Zarr arrays.
- array_subset: Array subsets.
- byte_range: Byte ranges.
- config: zarrs global configuration options.
- group: Zarr groups.
- node: Zarr nodes.
- version: zarrs version information.