zarrs is a Rust library for the Zarr storage format for multidimensional arrays and metadata.
If you are a Python user, check out zarrs-python.
It includes a high-performance codec pipeline for the reference zarr-python implementation.
zarrs supports Zarr V3 and a V3 compatible subset of Zarr V2.
It is fully up-to-date and conformant with the Zarr 3.1 specification with support for:
- all core extensions (data types, codecs, chunk grids, chunk key encodings, storage transformers),
- all accepted Zarr Enhancement Proposals (ZEPs) and several draft ZEPs:
  - ZEP 0003: Variable chunking
  - ZEP 0007: Strings
  - ZEP 0009: Zarr Extension Naming
- various registered extensions from zarr-developers/zarr-extensions/,
- experimental codecs intended for future registration, and
- user-defined custom extensions and stores.
A changelog can be found here. Correctness issues with past versions are detailed here.
Developed at the Department of Materials Physics, Australian National University, Canberra, Australia.
§Getting Started
- Review the Zarr version support, array extension support (codecs, data types, etc.), storage support, and the zarrs ecosystem.
- View the examples below.
- Read the documentation and The zarrs Book.
§Zarr Version Support
zarrs has first-class Zarr V3 support and additionally supports a compatible subset of Zarr V2 data that:
- can be converted to V3 with only a metadata change, and
- uses array metadata that is recognised and supported for encoding/decoding.
zarrs supports forward conversion from Zarr V2 to V3. See Converting Zarr V2 to V3 in The zarrs Book, or try the zarrs_reencode CLI tool.
§Array Extension Support
Extensions are grouped into three categories:
- Core: defined in the Zarr V3 specification and are fully supported.
- Registered: specified at https://github.com/zarr-developers/zarr-extensions/
  - Registered extensions listed in the below tables are fully supported unless otherwise indicated.
- Experimental: indicated by 🚧 in the tables below and recommended for evaluation only.
  - Experimental extensions are either pending registration or have no formal specification outside of the zarrs docs.
  - Experimental extensions may be unrecognised or incompatible with other Zarr implementations.
  - Experimental extensions may change in future releases without maintaining backwards compatibility.
- Deprecated: indicated by strikethrough in the tables below
  - Deprecated aliases will not be removed, but are not recommended for use in new arrays.
  - Deprecated extensions may be removed in future releases.
Extension names and aliases are configurable with Config::codec_aliases_v3_mut and similar methods for data types and Zarr V2.
zarrs will persist extension names when opening an existing array or creating an array from metadata.
§Data Types
| DataType | V3 data_type name | V2 dtype | ElementOwned / Element (Feature Flag) |
|---|---|---|---|
| Bool | bool | \|b1 | bool |
| Int2 | int2 | | i8 |
| Int4 | int4 | | i8 |
| Int8 | int8 | \|i1 | i8 |
| Int16 | int16 | >i2 <i2 | i16 |
| Int32 | int32 | >i4 <i4 | i32 |
| Int64 | int64 | >i8 <i8 | i64 |
| UInt2 | uint2 | | u8 |
| UInt4 | uint4 | | u8 |
| UInt8 | uint8 | \|u1 | u8 |
| UInt16 | uint16 | >u2 <u2 | u16 |
| UInt32 | uint32 | >u4 <u4 | u32 |
| UInt64 | uint64 | >u8 <u8 | u64 |
| Float4E2M1FN† | float4_e2m1fn | | |
| Float6E2M3FN† | float6_e2m3fn | | |
| Float6E3M2FN† | float6_e3m2fn | | |
| Float8E3M4† | float8_e3m4 | | |
| Float8E4M3† | float8_e4m3 | | float8::F8E4M3 (float8) |
| Float8E4M3B11FNUZ† | float8_e4m3b11fnuz | | |
| Float8E4M3FNUZ† | float8_e4m3fnuz | | |
| Float8E5M2† | float8_e5m2 | | float8::F8E5M2 (float8) |
| Float8E5M2FNUZ† | float8_e5m2fnuz | | |
| Float8E8M0FNU† | float8_e8m0fnu | | |
| BFloat16 | bfloat16 | | half::bf16 |
| Float16 | float16 | >f2 <f2 | half::f16 |
| Float32 | float32 | >f4 <f4 | f32 |
| Float64 | float64 | >f8 <f8 | f64 |
| Complex64 | complex64 | >c8 <c8 | Complex<f32> |
| Complex128 | complex128 | >c16 <c16 | Complex<f64> |
| ComplexBFloat16 | complex_bfloat16 | | Complex<half::bf16> |
| ComplexFloat16 | complex_float16 | | Complex<half::f16> |
| ComplexFloat32 | complex_float32 | | Complex<f32> |
| ComplexFloat64 | complex_float64 | | Complex<f64> |
| ComplexFloat4E2M1FN† | complex_float4_e2m1fn | | |
| ComplexFloat6E2M3FN† | complex_float6_e2m3fn | | |
| ComplexFloat6E3M2FN† | complex_float6_e3m2fn | | |
| ComplexFloat8E3M4† | complex_float8_e3m4 | | |
| ComplexFloat8E4M3† | complex_float8_e4m3 | | Complex<float8::F8E4M3> (float8) |
| ComplexFloat8E4M3B11FNUZ† | complex_float8_e4m3b11fnuz | | |
| ComplexFloat8E4M3FNUZ† | complex_float8_e4m3fnuz | | |
| ComplexFloat8E5M2† | complex_float8_e5m2 | | Complex<float8::F8E5M2> (float8) |
| ComplexFloat8E5M2FNUZ† | complex_float8_e5m2fnuz | | |
| ComplexFloat8E8M0FNU† | complex_float8_e8m0fnu | | |
| RawBits | r* | | [u8; N] / &[u8; N] |
| String | string | \|O | String / &str |
| Bytes | bytes, binary, 🚧 variable_length_bytes | \|VX | Vec<u8> / &[u8] |
| NumpyDateTime64 | numpy.datetime64 | | i64, chrono::DateTime<Utc> (chrono), jiff::Timestamp (jiff) |
| NumpyTimeDelta64 | numpy.timedelta64 | | i64, chrono::TimeDelta (chrono), jiff::SignedDuration (jiff) |
† Additional features (e.g. float8) may be required to parse floating point fill values. All subfloat types support hex string fill values.
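The V2 dtype column encodes byte order in its first character (`<` little-endian, `>` big-endian, `|` not applicable). As a rough illustration of how the fixed-size rows of the table line up, the sketch below maps a V2 dtype string to its V3 name. The names `v2_dtype_to_v3` and `Endianness` are hypothetical and are not part of the zarrs API; variable-length types (`|O`, `|VX`) are deliberately left out.

```rust
/// Byte order given by the first character of a NumPy-style V2 dtype string.
#[derive(Debug, PartialEq)]
enum Endianness {
    Little,        // `<`
    Big,           // `>`
    NotApplicable, // `|`
}

/// Hypothetical helper (not a zarrs function): map a fixed-size Zarr V2 dtype
/// string to the V3 data type name listed in the table.
fn v2_dtype_to_v3(dtype: &str) -> Option<(&'static str, Endianness)> {
    let mut chars = dtype.chars();
    let endianness = match chars.next()? {
        '<' => Endianness::Little,
        '>' => Endianness::Big,
        '|' => Endianness::NotApplicable,
        _ => return None,
    };
    // The remainder is a type code followed by a size in bytes.
    let name = match chars.as_str() {
        "b1" => "bool",
        "i1" => "int8",
        "i2" => "int16",
        "i4" => "int32",
        "i8" => "int64",
        "u1" => "uint8",
        "u2" => "uint16",
        "u4" => "uint32",
        "u8" => "uint64",
        "f2" => "float16",
        "f4" => "float32",
        "f8" => "float64",
        "c8" => "complex64",
        "c16" => "complex128",
        _ => return None, // variable-length and unknown types are not handled
    };
    Some((name, endianness))
}

fn main() {
    for dtype in ["|b1", "<i2", ">f4", "<c16", "|O"] {
        println!("{dtype} -> {:?}", v2_dtype_to_v3(dtype));
    }
}
```

The byte order is returned separately because the V3 name alone does not carry it.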
§Codecs
| Codec Type | V3 name | V2 id | Feature Flag* |
|---|---|---|---|
| Array to Array | transpose | transpose | **transpose** |
| | 🚧reshape | - | |
| | 🚧numcodecs.fixedscaleoffset | fixedscaleoffset | |
| | bitround | bitround | bitround |
| | 🚧zarrs.squeeze | - | |
| Array to Bytes | bytes | - | |
| | sharding_indexed | - | **sharding** |
| | 🚧vlen-array | vlen-array | |
| | vlen-bytes | vlen-bytes | |
| | vlen-utf8 | vlen-utf8 | |
| | packbits | packbits | |
| | 🚧numcodecs.pcodec | pcodec | pcodec |
| | 🚧numcodecs.zfpy | zfpy | zfp |
| | 🚧zarrs.vlen | - | |
| | 🚧zarrs.vlen_v2 | - | |
| | zfp | - | zfp |
| Bytes to Bytes | blosc | blosc | **blosc** |
| | crc32c | crc32c | **crc32c** |
| | gzip | gzip | **gzip** |
| | zstd | zstd | **zstd** |
| | 🚧numcodecs.adler32 | adler32 | adler32 |
| | 🚧numcodecs.bz2 | bz2 | bz2 |
| | 🚧numcodecs.fletcher32 | fletcher32 | fletcher32 |
| | 🚧numcodecs.shuffle | shuffle | |
| | 🚧numcodecs.zlib | zlib | zlib |
| | 🚧zarrs.gdeflate | - | gdeflate |
* Bolded feature flags are part of the default set of features.
zarrs supports arrays created with zarr-python 3.0.0+ and numcodecs 0.15.1+ with various numcodecs.zarr3 codecs.
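The three codec types in the table form a pipeline: array-to-array codecs run first, exactly one array-to-bytes codec serialises the result, and bytes-to-bytes codecs run last; decoding applies them in reverse. The following is a minimal standard-library sketch of that ordering, not the zarrs codec API. All function names are hypothetical, and storing the CRC32C checksum as four trailing little-endian bytes is an assumption of this sketch.

```rust
/// Array to array: transpose a row-major `rows x cols` array.
fn transpose(data: &[u16], rows: usize, cols: usize) -> Vec<u16> {
    let mut out = Vec::with_capacity(data.len());
    for c in 0..cols {
        for r in 0..rows {
            out.push(data[r * cols + c]);
        }
    }
    out
}

/// Array to bytes: little-endian serialisation.
fn to_le_bytes(data: &[u16]) -> Vec<u8> {
    let mut out = Vec::with_capacity(data.len() * 2);
    for v in data {
        out.extend_from_slice(&v.to_le_bytes());
    }
    out
}

/// Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78.
fn crc32c(bytes: &[u8]) -> u32 {
    let mut crc = !0u32;
    for &b in bytes {
        crc ^= u32::from(b);
        for _ in 0..8 {
            crc = if crc & 1 != 0 { (crc >> 1) ^ 0x82F6_3B78 } else { crc >> 1 };
        }
    }
    !crc
}

/// Encode: array to array, then array to bytes, then bytes to bytes.
fn encode(data: &[u16], rows: usize, cols: usize) -> Vec<u8> {
    let mut bytes = to_le_bytes(&transpose(data, rows, cols));
    let checksum = crc32c(&bytes);
    bytes.extend_from_slice(&checksum.to_le_bytes());
    bytes
}

/// Decode: the same codecs in reverse order. Returns `None` on corruption.
fn decode(encoded: &[u8], rows: usize, cols: usize) -> Option<Vec<u16>> {
    let split = encoded.len().checked_sub(4)?;
    let (payload, stored) = encoded.split_at(split);
    let stored = u32::from_le_bytes([stored[0], stored[1], stored[2], stored[3]]);
    if crc32c(payload) != stored {
        return None;
    }
    let transposed: Vec<u16> = payload
        .chunks_exact(2)
        .map(|b| u16::from_le_bytes([b[0], b[1]]))
        .collect();
    if transposed.len() != rows * cols {
        return None;
    }
    // The transposed array has shape `cols x rows`; transposing again restores it.
    Some(transpose(&transposed, cols, rows))
}

fn main() {
    let chunk = [1u16, 2, 3, 4, 5, 6]; // a 2 x 3 chunk
    let encoded = encode(&chunk, 2, 3);
    assert_eq!(decode(&encoded, 2, 3).as_deref(), Some(&chunk[..]));
    println!("encoded {} elements into {} bytes", chunk.len(), encoded.len());
}
```

Flipping any byte of the encoded chunk makes `decode` return `None`, which is the role a checksum codec plays at the end of a pipeline.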
§Chunk Grids
| Chunk Grid | ZEP | V3 | V2 | Feature Flag |
|---|---|---|---|---|
| regular | ZEP0001 | ✓ | ✓ | |
| 🚧rectangular | ZEP0003 (draft) | ✓ | | |
| 🚧zarrs.regular_bounded | | ✓ | | |
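For the regular chunk grid, every chunk has the same shape, so locating an element is plain integer arithmetic. The sketch below assumes only that definition; `grid_shape` and `locate` are hypothetical helper names, not zarrs functions.

```rust
/// Number of chunks along each dimension: ceil(array_shape / chunk_shape).
fn grid_shape(array_shape: &[u64], chunk_shape: &[u64]) -> Vec<u64> {
    array_shape
        .iter()
        .zip(chunk_shape)
        .map(|(&a, &c)| (a + c - 1) / c)
        .collect()
}

/// The chunk index holding an element, and the element's offset inside that chunk.
fn locate(element: &[u64], chunk_shape: &[u64]) -> (Vec<u64>, Vec<u64>) {
    let chunk = element.iter().zip(chunk_shape).map(|(&i, &c)| i / c).collect();
    let offset = element.iter().zip(chunk_shape).map(|(&i, &c)| i % c).collect();
    (chunk, offset)
}

fn main() {
    // A 3 x 4 array with 2 x 2 chunks has a 2 x 2 chunk grid;
    // the last row of chunks is only partially covered by the array.
    println!("{:?}", grid_shape(&[3, 4], &[2, 2]));
    println!("{:?}", locate(&[2, 3], &[2, 2]));
}
```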
§Chunk Key Encodings
| Chunk Key Encoding | ZEP | V3 | V2 | Feature Flag |
|---|---|---|---|---|
| default | ZEP0001 | ✓ | | |
| v2 | ZEP0001 | ✓ | ✓ | |
| 🚧zarrs.default_suffix | | ✓ | | |
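A chunk key encoding turns chunk grid indices into a store key. As described in the Zarr V3 specification, the default encoding prefixes the joined indices with `c`, and the v2 encoding joins the indices alone. The sketch below follows that description; the function names are hypothetical and the separator is passed explicitly rather than read from array metadata.

```rust
/// `default` encoding: "c" followed by each index, each preceded by the separator.
fn default_chunk_key(indices: &[u64], separator: char) -> String {
    let mut key = String::from("c");
    for index in indices {
        key.push(separator);
        key.push_str(&index.to_string());
    }
    key
}

/// `v2` encoding: indices joined by the separator, or "0" for a zero-dimensional array.
fn v2_chunk_key(indices: &[u64], separator: char) -> String {
    if indices.is_empty() {
        return String::from("0");
    }
    indices
        .iter()
        .map(|index| index.to_string())
        .collect::<Vec<_>>()
        .join(&separator.to_string())
}

fn main() {
    println!("{}", default_chunk_key(&[0, 1], '/'));
    println!("{}", v2_chunk_key(&[0, 1], '.'));
}
```

The key is relative to the array's path in the store.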
§Storage Transformers
Zarr V3 does not currently define any storage transformers.
§Storage Support
zarrs supports a huge range of stores (including custom stores) via the zarrs_storage API.
| Store/Storage Adapter | ZEP | Read | Write | List | Sync | Async | Crate |
|---|---|---|---|---|---|---|---|
| MemoryStore | | ✓ | ✓ | ✓ | ✓ | | zarrs_storage† |
| FilesystemStore | 0001 | ✓ | ✓ | ✓ | ✓ | | zarrs_filesystem‡ |
| AsyncOpendalStore | | ✓* | ✓* | ✓* | | ✓ | zarrs_opendal |
| AsyncObjectStore | | ✓* | ✓* | ✓* | | ✓ | zarrs_object_store |
| AsyncIcechunkStore | | ✓* | ✓* | ✓* | | ✓ | zarrs_icechunk |
| HTTPStore | | ✓ | | | ✓ | | zarrs_http |
| AsyncToSyncStorageAdapter | | ✓ | ✓ | ✓ | ✓ | | zarrs_storage† |
| SyncToAsyncStorageAdapter | | ✓ | ✓ | ✓ | | ✓ | zarrs_storage† |
| UsageLogStorageAdapter | | ✓ | ✓ | ✓ | ✓ | ✓ | zarrs_storage† |
| PerformanceMetricsStorageAdapter | | ✓ | ✓ | ✓ | ✓ | ✓ | zarrs_storage† |
| ZipStorageAdapter | | ✓ | | ✓ | ✓ | | zarrs_zip |
† Re-exported in the zarrs::storage module.
‡ Re-exported as the zarrs::filesystem module.
* Support depends on the underlying store.
The opendal and object_store crates are popular Rust storage backends that are fully supported via zarrs_opendal and zarrs_object_store.
These backends provide more feature complete HTTP stores than zarrs_http.
zarrs_icechunk implements the Icechunk transactional storage engine, a storage specification for Zarr that supports object_store stores.
The AsyncToSyncStorageAdapter enables some async stores to be used in a sync context.
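The Read, Write, and List columns above describe independent capabilities of a key-value store. The sketch below models them as three small traits over an in-memory map to show why a store can offer one capability without the others. The traits and `ToyMemoryStore` are hypothetical illustrations; they are not the zarrs_storage traits, whose actual signatures are documented in that crate.

```rust
use std::collections::BTreeMap;

// Hypothetical capability traits for illustration only (not the zarrs_storage API).
trait Readable {
    fn get(&self, key: &str) -> Option<Vec<u8>>;
}
trait Writable {
    fn set(&mut self, key: &str, value: &[u8]);
    fn erase(&mut self, key: &str);
}
trait Listable {
    fn list_prefix(&self, prefix: &str) -> Vec<String>;
}

/// A toy in-memory store; keys are kept sorted by the `BTreeMap`.
#[derive(Default)]
struct ToyMemoryStore {
    data: BTreeMap<String, Vec<u8>>,
}

impl Readable for ToyMemoryStore {
    fn get(&self, key: &str) -> Option<Vec<u8>> {
        self.data.get(key).cloned()
    }
}

impl Writable for ToyMemoryStore {
    fn set(&mut self, key: &str, value: &[u8]) {
        self.data.insert(key.to_string(), value.to_vec());
    }
    fn erase(&mut self, key: &str) {
        self.data.remove(key);
    }
}

impl Listable for ToyMemoryStore {
    fn list_prefix(&self, prefix: &str) -> Vec<String> {
        self.data
            .keys()
            .filter(|key| key.starts_with(prefix))
            .cloned()
            .collect()
    }
}

fn main() {
    let mut store = ToyMemoryStore::default();
    store.set("array/zarr.json", b"{}");
    store.set("array/c/0/1", &[1, 2, 3]);
    println!("{:?}", store.list_prefix("array/"));
    store.erase("array/c/0/1");
    println!("{:?}", store.get("array/c/0/1"));
}
```

A read-only store in this model would implement only `Readable`, and possibly `Listable`.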
§Logging
zarrs logs information and warnings using the log crate.
A logging implementation must be enabled to capture logs.
See the log crate documentation for more details.
§Examples
§Create and Read a Zarr Hierarchy
use std::{path::PathBuf, sync::Arc};
// Create a filesystem store
let store_path: PathBuf = "/path/to/hierarchy.zarr".into();
let store: zarrs::storage::ReadableWritableListableStorage = Arc::new(
    // zarrs::filesystem requires the filesystem feature
    zarrs::filesystem::FilesystemStore::new(&store_path)?
);
// Write the root group metadata
zarrs::group::GroupBuilder::new()
    .build(store.clone(), "/")?
    // .attributes(...)
    .store_metadata()?;
// Create a new sharded V3 array using the array builder
let array = zarrs::array::ArrayBuilder::new(
    vec![3, 4], // array shape
    vec![2, 2], // regular chunk (shard) shape
    zarrs::array::DataType::Float32,
    0.0f32, // fill value
)
.array_to_bytes_codec(Arc::new(
    // The sharding codec requires the sharding feature
    zarrs::array::codec::ShardingCodecBuilder::new(
        [2, 1].try_into()? // inner chunk shape
    )
    .bytes_to_bytes_codecs(vec![
        // GzipCodec requires the gzip feature
        Arc::new(zarrs::array::codec::GzipCodec::new(5)?),
    ])
    .build()
))
.dimension_names(["y", "x"].into())
.attributes(serde_json::json!({"Zarr V3": "is great"}).as_object().unwrap().clone())
.build(store.clone(), "/array")?; // /path/to/hierarchy.zarr/array
// Store the array metadata
array.store_metadata()?;
println!("{}", serde_json::to_string_pretty(array.metadata())?);
// {
//     "zarr_format": 3,
//     "node_type": "array",
//     ...
// }
// Perform some write operations on the chunks
array.store_chunk_elements::<f32>(
    &[0, 1], // chunk index
    &[0.2, 0.3, 1.2, 1.3]
)?;
array.store_array_subset_ndarray::<f32, _>(
    &[1, 1], // array index (start of subset)
    ndarray::array![[-1.1, -1.2], [-2.1, -2.2]]
)?;
array.erase_chunk(&[1, 1])?;
// Retrieve all array elements as an ndarray
// (elements that were never written, or were erased, hold the fill value 0.0)
let array_all = array.retrieve_array_subset_ndarray::<f32>(&array.subset_all())?;
println!("{array_all:4}");
// [[ 0.0,  0.0,  0.2,  0.3],
//  [ 0.0, -1.1, -1.2,  1.3],
//  [ 0.0, -2.1,  0.0,  0.0]]
// Retrieve a chunk directly
let array_chunk = array.retrieve_chunk_ndarray::<f32>(
    &[0, 1], // chunk index
)?;
println!("{array_chunk:4}");
// [[ 0.2,  0.3],
//  [-1.2,  1.3]]
// Retrieve an inner chunk
use zarrs::array::ArrayShardedReadableExt;
let shard_index_cache = zarrs::array::ArrayShardedReadableExtCache::new(&array);
let array_inner_chunk = array.retrieve_inner_chunk_ndarray_opt::<f32>(
    &shard_index_cache,
    &[0, 3], // inner chunk index
    &zarrs::array::codec::CodecOptions::default(),
)?;
println!("{array_inner_chunk:4}");
// [[ 0.3],
//  [ 1.3]]
§Additional Examples
Various examples can be found in the examples/ directory of the zarrs repository that demonstrate:
- creating and manipulating Zarr hierarchies with various stores (sync and async), codecs, etc.,
- converting between Zarr V2 and V3, and
- creating custom data types.
Examples can be run with cargo run --example <EXAMPLE_NAME>.
- Some examples require non-default features, which can be enabled with --all-features or --features <FEATURES>.
- Some examples support a -- --usage-log argument to print storage API calls during execution.
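The inner chunk lookup at the end of the Create and Read a Zarr Hierarchy example can be checked by hand. With a shard shape of [2, 2] and an inner chunk shape of [2, 1], inner chunk index [0, 3] addresses the second inner chunk of shard [0, 1], which begins at array element [0, 3]. The sketch below reproduces that arithmetic, assuming the shard shape is an exact multiple of the inner chunk shape; `locate_inner_chunk` is a hypothetical helper, not a zarrs function.

```rust
/// For an array-wide inner chunk index, return:
/// (shard index, inner chunk index within that shard, first array element covered).
fn locate_inner_chunk(
    inner_index: &[u64],
    shard_shape: &[u64],
    inner_shape: &[u64],
) -> (Vec<u64>, Vec<u64>, Vec<u64>) {
    let mut shard = Vec::new();
    let mut within = Vec::new();
    let mut start = Vec::new();
    for ((&i, &s), &c) in inner_index.iter().zip(shard_shape).zip(inner_shape) {
        let per_shard = s / c; // inner chunks per shard along this dimension
        shard.push(i / per_shard);
        within.push(i % per_shard);
        start.push(i * c);
    }
    (shard, within, start)
}

fn main() {
    let (shard, within, start) = locate_inner_chunk(&[0, 3], &[2, 2], &[2, 1]);
    println!("shard {shard:?}, inner chunk {within:?} within shard, starts at element {start:?}");
}
```

The region starting at element [0, 3] with shape [2, 1] is the last column of the first two rows, which is why the example prints 0.3 and 1.3.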
§Crate Features
§Default
- filesystem: Re-export zarrs_filesystem as zarrs::filesystem.
- ndarray: ndarray utility functions for Array.
- Codecs: blosc, crc32c, gzip, sharding, transpose, zstd.
§Non-Default
- async: an experimental asynchronous API for stores, Array, and Group.
  - The async API is runtime-agnostic. This has some limitations that are detailed in the Array docs.
  - The async API is not as performant as the sync API.
- Codecs: adler32, bitround, bz2, fletcher32, gdeflate, pcodec, zfp, zlib.
- dlpack: adds convenience methods for DLPack tensor interop to Array.
- Additional Element/ElementOwned implementations: see the feature flags in the Data Types table.
§zarrs Ecosystem
The Zarr specification is inherently unstable. It is under active development and new extensions are continually being introduced.
The zarrs crate has been split into multiple crates to:
- allow external implementations of stores and extension points to target a relatively stable API compatible with a range of zarrs versions,
- enable automatic backporting of metadata compatibility fixes and changes due to standardisation,
- stay up-to-date with unstable public dependencies (e.g. opendal, object_store, icechunk) without impacting the release cycle of zarrs, and
- improve compilation times.
A hierarchical overview of these crates can be found in The zarrs Book.
§Core
- zarrs: The core library for manipulating Zarr hierarchies.
- zarrs_metadata: Zarr metadata support (re-exported as zarrs::metadata).
- zarrs_metadata_ext: Zarr extensions metadata support (re-exported as zarrs::metadata_ext).
- zarrs_data_type: The data type extension API for zarrs (re-exported in zarrs::array::data_type).
- zarrs_storage: The storage API for zarrs (re-exported as zarrs::storage).
- zarrs_plugin: The plugin API for zarrs (re-exported as zarrs::plugin).
- zarrs_registry: The Zarr extension point registry for zarrs (re-exported as zarrs::registry).
§Stores
- zarrs_filesystem: A filesystem store (re-exported as zarrs::filesystem).
- zarrs_object_store: object_store store support.
- zarrs_opendal: opendal store support.
- zarrs_http: A synchronous HTTP store.
- zarrs_zip: A storage adapter for zip files.
- zarrs_icechunk: icechunk store support.
  - git-like version control for Zarr hierarchies.
  - Read “virtual Zarr datacubes” of archival formats (e.g., netCDF4, HDF5) created by VirtualiZarr and backed by icechunk.
§Bindings
- zarrs-python: A high-performance codec pipeline for zarr-python.
- zarrs_ffi: A subset of zarrs exposed as a C/C++ API.
§Zarr Metadata Conventions
- ome_zarr_metadata: A library for OME-Zarr (previously OME-NGFF) metadata.
§Tools
- zarrs_tools: Various tools for creating and manipulating Zarr V3 data with the zarrs Rust crate
  - A reencoder that can change codecs, chunk shape, convert Zarr V2 to V3, etc.
  - Create an OME-Zarr hierarchy from a Zarr array.
  - Transform arrays: crop, rescale, downsample, gradient magnitude, gaussian, noise filtering, etc.
§Benchmarks
- zarr_benchmarks: Benchmarks of various Zarr V3 implementations: zarrs, zarr-python, tensorstore
§Licence
zarrs is licensed under either of
- the Apache License, Version 2.0 (LICENSE-APACHE or http://www.apache.org/licenses/LICENSE-2.0) or
- the MIT license (LICENSE-MIT or http://opensource.org/licenses/MIT),
at your option.
Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.
§Re-exports
- pub use zarrs_metadata as metadata;
- pub use zarrs_metadata_ext as metadata_ext;
- pub use zarrs_plugin as plugin;
- pub use zarrs_registry as registry;
- pub use zarrs_storage as storage;
- pub use zarrs_filesystem as filesystem; (requires the filesystem feature)