OxiGDAL Zarr
Pure Rust implementation of the Zarr v2/v3 storage specification for cloud-optimized, chunked, N-dimensional arrays. Support for multiple storage backends (filesystem, S3, HTTP, memory) and compression codecs (Zstd, Gzip, LZ4). Part of the OxiGDAL geospatial data access library.
Features
- Zarr Versions: Full support for Zarr v2 and v3 specifications
- Storage Backends: Filesystem, S3-compatible, HTTP, and in-memory storage
- Compression Codecs: Zstd, Gzip, LZ4, and Blosc filters
- Data Filters: Shuffle, Delta, and Scale-offset filters for preprocessing
- Async I/O: Async support for cloud storage backends with Tokio runtime
- Parallel Operations: Multi-threaded chunk processing with Rayon
- LRU Caching: Optional chunk caching for frequent access patterns
- Consolidation: Metadata consolidation for optimized access
- Sharding: Zarr v3 sharding extension for improved access patterns
- Pure Rust: 100% Pure Rust with no C/Fortran dependencies
- No Unwrap Policy: All fallible operations return
Result<T, E>with descriptive errors
Installation
Add to your Cargo.toml:
[]
= "0.1.0"
Features
Enable optional features for specific capabilities:
[]
# Filesystem support
= { = "0.1.0", = ["filesystem"] }
# Cloud storage with S3
= { = "0.1.0", = ["s3", "async"] }
# HTTP remote access
= { = "0.1.0", = ["http"] }
# Compression codecs
= { = "0.1.0", = ["zstd", "gzip", "lz4"] }
# Parallel processing
= { = "0.1.0", = ["parallel"] }
# Chunk caching
= { = "0.1.0", = ["cache"] }
# All features
= { = "0.1.0", = ["filesystem", "s3", "http", "async", "zstd", "gzip", "lz4", "shuffle", "delta", "scale-offset", "parallel", "cache", "v2", "v3"] }
Quick Start
Reading a Zarr Array
use ;
Writing a Zarr Array
use ;
use ;
use Compressor;
Usage
Storage Backends
Filesystem Storage
Access Zarr arrays stored on local or network filesystems:
use FilesystemStore;
// Open existing array
let store = open?;
// Create new array
let store = create?;
S3-Compatible Storage
Access arrays stored in cloud object storage (requires s3 feature):
use S3Storage;
async
HTTP Access
Read arrays from HTTP/HTTPS URLs (requires http feature):
use HttpStorage;
let store = new?;
let reader = open_v3?;
In-Memory Storage
Perfect for testing and temporary data:
use MemoryStore;
let store = new;
// Use like any other store
Compression Codecs
The library supports multiple compression algorithms:
Zstd (Default)
Fast, high-compression ratio. Recommended for most use cases:
use Compressor;
let compressor = Zstd ;
Gzip
Standard deflate compression, widely compatible:
let compressor = Gzip ;
LZ4
Very fast compression, lower ratio:
let compressor = LZ4 ;
Chunk Management
Reading Chunks
use ZarrReader;
let reader = /* ... */;
// Read specific chunk by coordinates
let chunk = reader.read_chunk?;
// Read slice across multiple chunks
let slice = reader.read_slice?;
Chunk Grid Configuration
use ChunkGrid;
// Regular grid (same chunk size everywhere)
let grid = regular?;
// Irregular grid with different chunk sizes
let grid = irregular?;
Advanced Usage - Parallel Processing
Enable parallel chunk processing for bulk operations (requires parallel feature):
use ;
let store = create?;
let mut writer = create_v2?;
// Parallel chunk writing (internally uses Rayon)
// Multiple chunks can be written in parallel
Advanced Usage - Caching
Enable LRU caching to improve performance for repeated chunk access:
use ;
let inner_store = open?;
let cached_store = new?; // Cache 100 chunks
Advanced Usage - Metadata Consolidation
For datasets with many arrays, consolidate metadata for faster access:
use consolidate_metadata;
let store = open?;
let consolidated = consolidate_metadata?;
Advanced Usage - Zarr v3
Use the latest Zarr v3 specification with enhanced features:
use ZarrV3Reader;
let store = open?;
let reader = open?;
Error Handling
All operations return Result<T, ZarrError> following the "no unwrap" policy:
use ;
API Overview
| Module | Description |
|---|---|
reader |
Reading Zarr arrays (v2, v3) |
writer |
Writing Zarr arrays (v2, v3) |
storage |
Backend storage implementations |
metadata |
Array and group metadata structures |
codecs |
Compression and encoding implementations |
filters |
Data filtering (shuffle, delta, scale-offset) |
chunk |
Chunk management and grid definitions |
dimension |
Dimension and shape utilities |
consolidation |
Metadata consolidation for groups |
sharding |
Zarr v3 sharding extension |
error |
Error types and conversions |
Examples
See the examples directory for complete working examples:
- create_test_zarr_samples: Generate realistic geospatial Zarr datasets for demonstration
Run examples with:
# Create sample Zarr datasets
Performance Characteristics
OxiGDAL Zarr is optimized for cloud-native geospatial data access:
- Chunk Size: Optimal performance with 64-256 MB chunks
- Compression: Zstd provides best compression/speed ratio
- Parallel I/O: ~N× speedup with N parallel chunk operations
- Caching: LRU cache provides 10-100× speedup for repeated access
Benchmark on modern hardware (RTX 4090, 1TB NVMe):
- Sequential read: ~5 GB/s
- Random chunk access: ~200K chunks/s
- Compression overhead: ~10-20% for Zstd level 3
Pure Rust
This library is 100% Pure Rust with no C/Fortran dependencies:
- All compression algorithms use pure Rust implementations (flate2, zstd-rs)
- No external system libraries required
- Cross-platform: Linux, macOS, Windows, WASM, embedded systems
- Safe by default: No unsafe code except where explicitly required for performance
Zarr Format Support
Zarr v2
Full compliance with Zarr v2 specification:
- Array metadata (
.zarray) - Group metadata (
.zgroup) - All standard codecs
- Attributes and custom metadata
Zarr v3
Early support for Zarr v3 specification:
- Enhanced metadata format
- Sharding extension
- Flexible codec pipelines
- Improved interoperability
Documentation
- API Docs: Full API documentation at docs.rs
- Zarr Spec: Official Zarr specification
- OxiGDAL Docs: OxiGDAL documentation
- Examples: See examples directory
Integration with OxiGDAL Ecosystem
OxiGDAL Zarr is fully integrated with the OxiGDAL geospatial library:
use Dataset;
use FilesystemStore;
// Convert between formats
let dataset = open?;
// ... process ...
// Export to Zarr
Contributing
Contributions are welcome! Please follow these guidelines:
- Fork the repository
- Create a feature branch
- Follow COOLJAPAN policies:
- No
unwrap()calls in production code - 100% Pure Rust (use feature gates for C dependencies)
- Comprehensive error handling
- File size limit of 2000 lines (use
splitrsfor refactoring)
- No
- Write tests and examples
- Run
cargo test --all-featuresandcargo clippy - Submit a pull request
License
Licensed under the Apache License, Version 2.0 (LICENSE or http://www.apache.org/licenses/LICENSE-2.0).
Related Projects
- OxiGDAL - Geospatial data access library
- OxiGDAL GeoTIFF - Cloud-optimized GeoTIFF driver
- OxiGDAL NetCDF - NetCDF data format support
- OxiGDAL GeoParquet - GeoParquet columnar format
- GDAL - Geospatial Data Abstraction Library (reference implementation)
- Zarr - Official Zarr specification
COOLJAPAN Ecosystem
OxiGDAL Zarr is part of the COOLJAPAN ecosystem of pure Rust libraries:
- OxiGDAL: Geospatial data access
- OxiBLAS: Pure Rust BLAS operations
- OxiFFT: Fast Fourier Transform (replaces rustfft)
- OxiCode: Serialization (bincode alternative)
- SciRS2: Scientific computing primitives
- NumRS2: Numerical computing (NumPy-like)
Maintained by: COOLJAPAN OU (Team Kitasan) Repository: https://github.com/cool-japan/oxigdal Issue Tracker: https://github.com/cool-japan/oxigdal/issues Discussions: https://github.com/cool-japan/oxigdal/discussions