Crate atlas

ATLAS (Aggregated Tensor Large Array Store) is a directory-based store for thousands of named datasets.

Each dataset is a virtual collection of named N-dimensional arrays with per-dataset and per-array attributes, backed by the array-format crate. Datasets sharing an array name are co-located in the same physical file, keyed by dataset name.

§Layout

my_store/
├── atlas.json          <- dataset registry + per-dataset attributes
├── temperature/
│   └── data.af         <- ArrayFile: one named array per dataset
└── latitude/
    └── data.af
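
The tree above implies a simple mapping from names to paths. As a rough sketch only (the helpers `registry_path` and `array_file_path` are hypothetical and not part of this crate's API), the mapping could be written as:

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: location of the dataset registry inside a store directory.
fn registry_path(root: &Path) -> PathBuf {
    root.join("atlas.json")
}

/// Hypothetical helper: location of the array file shared by every dataset
/// that defines an array called `array_name`.
fn array_file_path(root: &Path, array_name: &str) -> PathBuf {
    root.join(array_name).join("data.af")
}
```

Under this assumption, `array_file_path(Path::new("my_store"), "temperature")` yields `my_store/temperature/data.af`, so the number of files grows with the number of distinct array names rather than with the number of datasets.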

§Quick start

use atlas::{Atlas, Attr, StoreConfig};
use ndarray::Array2;

let tmp = tempfile::tempdir().unwrap();

// Create — codec persists in atlas.json so `open_path` doesn't need it.
let mut s = Atlas::create_path(tmp.path(), StoreConfig::default()).await.unwrap();
{
    let mut ds = s.create_dataset("jan_2024").await.unwrap();
    ds.define_array::<f32>(
        "temperature",
        vec!["lat".into(), "lon".into()],
        vec![4, 8],
        None,        // chunk_shape — defaults to full shape (one chunk)
        None,        // fill_value
    ).await.unwrap();
    let data = Array2::<f32>::from_elem([4, 8], 20.0).into_dyn();
    ds.write_array("temperature", vec![0, 0], data.view()).await.unwrap();
    ds.set_attribute("month", Attr::Int64(1));
}
s.flush().await.unwrap();   // single durability boundary

// Reopen — no config needed.
let s2 = Atlas::open_path(tmp.path()).await.unwrap();
let ds2 = s2.open_dataset("jan_2024").await.unwrap();
let temp = ds2.read_array::<f32>("temperature", vec![], vec![]).await.unwrap().unwrap();
assert_eq!(temp.shape(), &[4, 8]);
assert_eq!(temp[[0, 0]], 20.0);

§Thread safety

Atlas and DatasetView are Send + Sync. Each physical array file is guarded by a tokio::sync::RwLock: concurrent reads (read_array, array_stats) proceed in parallel without contention, while writes (write_array, define_array, flush, compact, …) take an exclusive lock. The cache map uses a parking_lot::RwLock that is never held across an await point.
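
The two-level locking described above can be illustrated with a small standalone sketch. It is not the crate's implementation: it swaps in `std::sync::RwLock` for both `tokio::sync::RwLock` and `parking_lot::RwLock` so that it needs no external crates, and `ArrayFile`, `FileCache`, `handle`, `write` and `sum` are hypothetical names. The point it shows is that the map lock is held only long enough to clone an `Arc` handle, and all file work happens under the per-file lock.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Stand-in for one physical array file (hypothetical, not the crate's type).
#[derive(Default)]
struct ArrayFile {
    values: Vec<f32>,
}

/// Cache of open files keyed by array name. The outer lock guards only the map.
#[derive(Default)]
struct FileCache {
    files: RwLock<HashMap<String, Arc<RwLock<ArrayFile>>>>,
}

impl FileCache {
    /// Fetch or create the handle for `name`. The map lock is released before
    /// this function returns, so it is never held while a file is in use.
    fn handle(&self, name: &str) -> Arc<RwLock<ArrayFile>> {
        // Fast path: shared lock on the map, dropped at the end of the statement.
        let existing = self.files.read().unwrap().get(name).cloned();
        if let Some(h) = existing {
            return h;
        }
        // Slow path: exclusive lock on the map, only to insert a new entry.
        let mut map = self.files.write().unwrap();
        map.entry(name.to_string()).or_default().clone()
    }

    /// Writers take the per-file lock exclusively.
    fn write(&self, name: &str, values: &[f32]) {
        let file = self.handle(name);
        file.write().unwrap().values.extend_from_slice(values);
    }

    /// Readers share the per-file lock, so several can run at once.
    fn sum(&self, name: &str) -> f32 {
        let file = self.handle(name);
        let guard = file.read().unwrap();
        let total: f32 = guard.values.iter().sum();
        total
    }
}
```

In an async setting the same shape applies: the short-lived map guard is dropped before any `.await`, and only the per-file lock is held across the I/O.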

§Durability

atlas.json is read once, when the store is opened or created. After that, every mutation (create_dataset, define_array, set_attribute, …) touches only the in-memory StoreMeta, and nothing is persisted until Atlas::flush is called. Dropping an Atlas without flushing discards every pending in-memory write.
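
The load-once, flush-explicitly contract can be illustrated with a toy model. This is a sketch under stated assumptions, not the crate's code: `MiniStore` is a hypothetical type, it is synchronous, and it stores `key=value` lines in a plain text file instead of the real atlas.json format.

```rust
use std::collections::BTreeMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical stand-in for a store whose metadata lives in memory.
struct MiniStore {
    path: PathBuf,
    attrs: BTreeMap<String, i64>,
}

impl MiniStore {
    /// Read the metadata file once. A missing file means an empty store.
    fn open(path: &Path) -> io::Result<Self> {
        let mut attrs: BTreeMap<String, i64> = BTreeMap::new();
        match fs::read_to_string(path) {
            Ok(text) => {
                for line in text.lines() {
                    if let Some((k, v)) = line.split_once('=') {
                        if let Ok(n) = v.parse::<i64>() {
                            attrs.insert(k.to_string(), n);
                        }
                    }
                }
            }
            Err(e) if e.kind() == io::ErrorKind::NotFound => {}
            Err(e) => return Err(e),
        }
        Ok(Self { path: path.to_path_buf(), attrs })
    }

    /// Mutations change the in-memory map only.
    fn set_attribute(&mut self, key: &str, value: i64) {
        self.attrs.insert(key.to_string(), value);
    }

    /// The only point at which in-memory state reaches disk.
    fn flush(&self) -> io::Result<()> {
        let text: String = self
            .attrs
            .iter()
            .map(|(k, v)| format!("{k}={v}\n"))
            .collect();
        fs::write(&self.path, text)
    }
}
```

In this model, a `MiniStore` that is mutated and then dropped leaves the file untouched, and reopening it shows none of the changes. Only a call to `flush` makes them visible to a later `open`.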

Structs§

ArraySchema
Schema for a single named array within a dataset.
ArrayStats
Aggregate statistics for a single array covering all its chunks.
Atlas
Handle to an opened or newly created atlas store.
DatasetMeta
Metadata for a single dataset: array schemas and per-dataset attributes. Both maps preserve insertion order (via IndexMap) so on-disk layouts and Python-side dict iteration mirror the order arrays/attributes were added.
DatasetView
A borrowed handle to one dataset within an Atlas.
DeltaCache
Two-level cache shared across all delta layers in an ArrayFile.
MergedArrayMeta
Array metadata visible to the caller after merging all delta layers.
StoreConfig
Configuration for opening or creating an Atlas.
TimestampNs
Nanoseconds since the Unix epoch (1970-01-01 00:00:00 UTC).

Enums§

Attr
A per-dataset attribute value stored in atlas.json.
Codec
Compression codec applied when writing new array blocks.
DType
Describes the element type of an array.
Error
Every error returned by this crate. Each variant carries enough context to identify what failed; Display (via thiserror) renders the same message shown in the /// line above each variant.
FillValue
A scalar fill value for an array.
MetaFormat
On-disk encoding for the store’s metadata file.
StatValue
A typed min or max value.

Traits§

ArrayElement
Unified element type for all array operations.

Type Aliases§

Result
Convenience alias for Result<T, atlas::Error> returned by every fallible operation in the crate.