General design:
- Most things are async, even where they don’t need to be. Unfortunately, async propagates: if something can sometimes be async, it has to be async always. In our case, the source is fetching from storage.
- There is a high-level interface that knows about arrays, groups, user attributes, etc. This is the repository::Repository type.
- There is a low-level interface that speaks zarr keys and values, and is used to provide the zarr store that will be used from Python. This is the [zarr::Store] type.
- There is a translation layer between the low and high levels. When a user writes to a zarr key, we need to convert that key to the language of arrays and groups. This is implemented in the [zarr] module.
- There is an abstract type for loading and saving the Arrow data structures. This is the Storage trait. It knows how to fetch and write Arrow. We have:
  - an in-memory implementation
  - an S3 implementation that writes to Parquet
  - a caching wrapper implementation
- The data structures are represented by concrete types in the format modules. These data structures use Arrow RecordBatches for representation.
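
To make the key translation described above more concrete, here is a minimal sketch of turning a zarr key into the language of arrays and groups. It is not the crate's implementation: `ParsedKey` and `parse_key` are hypothetical names invented for illustration, and the key layout (`zarr.json` for node metadata, `c/<i>/<j>/...` for chunks) assumes the Zarr v3 default chunk key encoding. Zero-dimensional arrays are ignored for brevity.

```rust
/// Hypothetical high-level view of a zarr key (not the crate's actual type).
#[derive(Debug, PartialEq, Eq)]
pub enum ParsedKey {
    /// The metadata document of the array or group at `node_path`.
    Metadata { node_path: String },
    /// One chunk of the array at `node_path`, addressed by grid coordinates.
    Chunk { node_path: String, coords: Vec<u32> },
}

/// Translate a low-level zarr key into a high-level description.
/// Returns `None` for keys this sketch does not recognize.
pub fn parse_key(key: &str) -> Option<ParsedKey> {
    // Metadata keys: "zarr.json" at the root, "<path>/zarr.json" elsewhere.
    if key == "zarr.json" {
        return Some(ParsedKey::Metadata { node_path: "/".to_string() });
    }
    if let Some(prefix) = key.strip_suffix("/zarr.json") {
        return Some(ParsedKey::Metadata { node_path: format!("/{prefix}") });
    }

    // Chunk keys: "<path>/c/<i>/<j>/...". Search for the last "c" segment so
    // that a group or array that is itself named "c" is still handled.
    let segments: Vec<&str> = key.split('/').collect();
    let c_pos = segments.iter().rposition(|s| *s == "c")?;

    // Every segment after "c" must be a chunk grid coordinate.
    let coords = segments[c_pos + 1..]
        .iter()
        .map(|s| s.parse::<u32>().ok())
        .collect::<Option<Vec<u32>>>()?;
    if coords.is_empty() {
        return None; // zero-dimensional arrays are out of scope here
    }

    let node_path = format!("/{}", segments[..c_pos].join("/"));
    Some(ParsedKey::Chunk { node_path, coords })
}
```

With this sketch, `parse_key("group/array/c/0/1/2")` describes chunk `[0, 1, 2]` of the array at `/group/array`, while `parse_key("group/array/zarr.json")` refers to that node's metadata. A real translation layer would also need to handle writes, deletes, listing, and other chunk key encodings.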
Re-exports
pub use config::ObjectStoreConfig;
pub use config::RepositoryConfig;
pub use repository::Repository;
pub use storage::ObjectStorage;
pub use storage::Storage;
pub use storage::StorageError;
pub use storage::new_in_memory_storage;
pub use storage::new_local_filesystem_storage;
pub use storage::new_s3_storage;
pub use store::Store;