Atlas

Struct Atlas 

pub struct Atlas { /* private fields */ }
Handle to an opened or newly created atlas store.

Owns the object_store backend, the in-memory store metadata, a per-array file cache, and the chosen array and metadata codecs. All mutations (create_dataset, delete_dataset, and everything that flows through a DatasetView) update in-memory state only; nothing is persisted to the backend until Atlas::flush is called.

Atlas is Send + Sync and safe to share across tasks; each array file is independently guarded by a tokio::sync::RwLock.
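The deferred-persistence contract described above can be pictured with a small standalone model. This is a sketch only, not the atlas implementation: `ToyStore`, its fields, and `reopen` are hypothetical names, and a second in-memory set stands in for the object_store backend.

```rust
use std::collections::BTreeSet;

/// Toy stand-in for a store handle (hypothetical, not the atlas type).
/// `persisted` plays the role of the backend, `in_memory` the live metadata.
#[derive(Default)]
pub struct ToyStore {
    in_memory: BTreeSet<String>,
    persisted: BTreeSet<String>,
}

impl ToyStore {
    /// Mutations touch only the in-memory state.
    /// Returns false if the name was already registered.
    pub fn create_dataset(&mut self, name: &str) -> bool {
        self.in_memory.insert(name.to_string())
    }

    /// The single point at which in-memory state becomes durable.
    pub fn flush(&mut self) {
        self.persisted = self.in_memory.clone();
    }

    /// What a fresh open would observe: flushed state only.
    pub fn reopen(&self) -> ToyStore {
        ToyStore {
            in_memory: self.persisted.clone(),
            persisted: self.persisted.clone(),
        }
    }

    pub fn dataset_exists(&self, name: &str) -> bool {
        self.in_memory.contains(name)
    }
}
```

In this model a dataset created but not flushed is visible on the live handle and absent from a reopened one, which is the behavior to expect if a process exits before calling flush.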

Implementations§

impl Atlas

pub async fn open(store: Arc<dyn ObjectStore>, prefix: Path) -> Result<Self>

Open an existing store at prefix within store.

Reads atlas.json exactly once, at open time. Subsequent mutations touch only the in-memory metadata until Atlas::flush is called.

pub async fn create( store: Arc<dyn ObjectStore>, prefix: Path, config: StoreConfig, ) -> Result<Self>

Create a new store at prefix within store.

pub async fn open_path(path: impl AsRef<Path>) -> Result<Self>

Open an existing store at the given local filesystem path.

The metadata format (atlas.json / atlas.msgpack / …zst / …lz4) and array codec are auto-detected from the on-disk files — no StoreConfig needed on reopen.

§Examples
use atlas::{Atlas, StoreConfig};

// `.await` needs an async context; a tokio runtime is assumed here.
#[tokio::main]
async fn main() {
    let tmp = tempfile::tempdir().unwrap();
    // Create and flush a store so there is something to open.
    {
        let mut s = Atlas::create_path(tmp.path(), StoreConfig::default()).await.unwrap();
        s.create_dataset("ds1").await.unwrap();
        s.flush().await.unwrap();
    }
    // Reopen without a StoreConfig: format and codec are auto-detected.
    let s = Atlas::open_path(tmp.path()).await.unwrap();
    assert!(s.dataset_exists("ds1"));
}

pub async fn create_path( path: impl AsRef<Path>, config: StoreConfig, ) -> Result<Self>

Create a new store at the given local filesystem path. The directory is created (recursively, like mkdir -p) if it does not already exist.

§Examples
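use atlas::{Atlas, StoreConfig};
#[tokio::main] // assumes a tokio runtime, since `.await` needs an async context
async fn main() {
    let tmp = tempfile::tempdir().unwrap();
    let s = Atlas::create_path(tmp.path(), StoreConfig::default()).await.unwrap();
    assert!(s.list_datasets().is_empty()); // a new store has no datasets
}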

pub async fn create_dataset(&mut self, name: &str) -> Result<DatasetView>

Create a new dataset in this store and return a DatasetView for populating it. Errors with Error::DatasetAlreadyExists if a dataset with this name is already registered, or with Error::InvalidName if name violates the naming rules: a name must be non-empty, must not contain /, must not start with _, and must not be . or .. (the current- and parent-directory names).
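The naming rules listed above translate directly into a predicate. The function below is a hypothetical helper written from that list alone; the crate's own validation may enforce further constraints that this page does not mention.

```rust
/// Hypothetical helper mirroring the documented naming rules:
/// non-empty, no '/', no leading '_', and not "." or "..".
pub fn is_valid_dataset_name(name: &str) -> bool {
    !name.is_empty()
        && !name.contains('/')
        && !name.starts_with('_')
        && name != "."
        && name != ".."
}
```

Checking a name up front with such a predicate lets a caller report a bad name before any store handle is involved.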

pub async fn open_dataset(&self, name: &str) -> Result<DatasetView>

Return a DatasetView for an existing dataset. Errors with Error::DatasetNotFound if no dataset with this name exists. This is cheap: it reads the in-memory metadata and never touches disk.

pub async fn delete_dataset(&mut self, name: &str) -> Result<()>

Remove a dataset from this store. Tombstones the dataset’s entries inside every shared array file but does not flush — call Atlas::flush to persist the deletion, and optionally Atlas::compact afterwards to reclaim the storage. Errors with Error::DatasetNotFound if no dataset with this name exists.
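The split between tombstoning on delete and reclaiming on compact can be sketched with a toy model of one shared array file. `ToyArrayFile` and its slot layout are assumptions made for illustration; the real .af format is not described on this page.

```rust
/// Toy model of a shared array file (hypothetical layout):
/// one slot per dataset entry, where `None` marks a tombstone.
#[derive(Default)]
pub struct ToyArrayFile {
    slots: Vec<Option<(String, Vec<u8>)>>,
}

impl ToyArrayFile {
    pub fn write(&mut self, dataset: &str, bytes: Vec<u8>) {
        self.slots.push(Some((dataset.to_string(), bytes)));
    }

    /// Mark every entry of `dataset` as dead. The slots stay in place,
    /// so no space is reclaimed yet. Returns the number of entries hit.
    pub fn tombstone(&mut self, dataset: &str) -> usize {
        let mut hit_count = 0;
        for slot in self.slots.iter_mut() {
            let hit = match &*slot {
                Some((name, _)) => name.as_str() == dataset,
                None => false,
            };
            if hit {
                *slot = None;
                hit_count += 1;
            }
        }
        hit_count
    }

    /// Drop tombstoned slots, reclaiming their space.
    pub fn compact(&mut self) {
        self.slots.retain(|slot| slot.is_some());
    }

    pub fn slot_count(&self) -> usize {
        self.slots.len()
    }

    pub fn live_count(&self) -> usize {
        self.slots.iter().filter(|slot| slot.is_some()).count()
    }
}
```

In this model, tombstoning lowers `live_count` while `slot_count` stays unchanged until `compact` runs, which is why compaction is a separate, optional step after a delete.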

pub fn list_datasets(&self) -> Vec<String>

All dataset names currently registered in this store, in insertion order. Reads from the in-memory store metadata — no disk I/O.

pub fn dataset_exists(&self, name: &str) -> bool

true if a dataset with this name is registered. O(1) hash lookup in the in-memory store metadata.
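One common way to get both insertion-ordered listing and O(1) membership checks is to pair a Vec with a HashSet. The sketch below shows that pairing under assumed names (`ToyRegistry`, `register`); it does not claim to match how the store metadata is actually laid out.

```rust
use std::collections::HashSet;

/// Hypothetical registry: `order` preserves insertion order for listing,
/// `index` gives constant-time membership checks.
#[derive(Default)]
pub struct ToyRegistry {
    order: Vec<String>,
    index: HashSet<String>,
}

impl ToyRegistry {
    /// Returns false (and changes nothing) if the name is already present.
    pub fn register(&mut self, name: &str) -> bool {
        if !self.index.insert(name.to_string()) {
            return false;
        }
        self.order.push(name.to_string());
        true
    }

    /// Names in the order they were registered.
    pub fn list(&self) -> Vec<String> {
        self.order.clone()
    }

    /// Hash lookup; does not scan `order`.
    pub fn exists(&self, name: &str) -> bool {
        self.index.contains(name)
    }
}
```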

pub fn list_arrays(&self) -> Vec<String>

Distinct array names across all datasets in this store, sorted. One entry per physical .af file — datasets sharing an array name (the common case) collapse to a single entry here.
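The collapse from per-dataset array lists to one sorted, distinct list can be sketched with a BTreeSet, which deduplicates and orders in one step. The function name and the `(dataset, arrays)` input shape are assumptions for illustration only.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: given (dataset name, declared array names) pairs,
/// return each array name once, in sorted order.
pub fn distinct_array_names(datasets: &[(&str, Vec<&str>)]) -> Vec<String> {
    let unique: BTreeSet<&str> = datasets
        .iter()
        .flat_map(|(_, arrays)| arrays.iter().copied())
        .collect();
    unique.into_iter().map(String::from).collect()
}
```

Three datasets that all declare an array with the same name contribute a single entry, matching the one-entry-per-physical-file behavior described above.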

pub fn array_dtype(&self, array: &str) -> Option<DType>

Returns the dtype of array if any dataset in this store declares it, or None otherwise. Used by read_array_across’s Python binding to pick the generic instantiation without round-tripping through a DatasetView.

pub async fn read_array_across<T: ArrayElement + Send + Sync + 'static>( &self, array: &str, dataset_names: &[String], start: Vec<usize>, shape: Vec<usize>, ) -> Result<Vec<Option<ArcArray<T, IxDyn>>>>

Bulk read the same slice of array from many datasets that share its physical file. Runs at most num_cpus reads concurrently — matching what a well-tuned dask threadpool would do — to keep tokio::task::spawn_blocking’s decompression pool from oversubscribing the actual CPU cores.

This exists because open_as_many_xarray_dataset over N datasets used to incur N separate Python → Rust → tokio::block_on transitions plus Python-side dask graph overhead. One call here replaces all of that and gets the same parallelism dask was providing — but in pure Rust, with no GIL involvement until the results return.

start and shape follow the same conventions as DatasetView::read_array: empty start + empty shape mean the full array. Per-dataset entries that don’t declare array are returned as None.
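The two behaviors described above, a cap on concurrent reads and a positional None for datasets that lack the array, can be sketched with the standard library alone. This stand-in uses scoped OS threads rather than tokio tasks, and `read_across`, `default_workers`, and the `read` callback are hypothetical names, not part of the atlas API.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Run `read` once per name on at most `max_workers` threads.
/// Results keep the order of `names`; a `None` from `read` stays `None`.
pub fn read_across<T, F>(names: &[String], max_workers: usize, read: F) -> Vec<Option<T>>
where
    T: Send,
    F: Fn(&str) -> Option<T> + Sync,
{
    // Never spawn more workers than there are items, and always at least one.
    let workers = max_workers.max(1).min(names.len().max(1));
    let next = AtomicUsize::new(0);
    let results: Mutex<Vec<Option<T>>> =
        Mutex::new((0..names.len()).map(|_| None).collect());

    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                // Each worker claims the next unread index until none remain.
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= names.len() {
                    break;
                }
                let value = read(names[i].as_str());
                results.lock().unwrap()[i] = value;
            });
        }
    });

    results.into_inner().unwrap()
}

/// Worker cap based on the number of available cores (falls back to 1).
pub fn default_workers() -> usize {
    thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
}
```

Because each worker pulls the next index from a shared counter, no more than `workers` reads are ever in flight, and a slow read does not hold up the others.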

pub async fn read_array_across_stacked<T: ArrayElement + Send + Sync + Clone + 'static>( &self, array: &str, dataset_names: &[String], start: Vec<usize>, shape: Vec<usize>, ) -> Result<Array<T, IxDyn>>

Like Atlas::read_array_across but returns one stacked (len(dataset_names), *per_dataset_shape) ndarray::Array instead of a Vec of per-dataset arrays.

The output buffer is pre-allocated once; each parallel read writes its row in as the task completes, overlapping the serial copy with the remaining in-flight reads. Saves the ~5.7 GiB of memory copies that the Python-side np.stack on the per-dataset list would do on a 1000-dataset gridded workload.

Errors if any listed dataset doesn’t declare array — the stacked representation has no positional “missing” sentinel.
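A reduced sketch of the stacking step follows: allocate the output once, copy each row into its slot, and fail if any row is missing. It is sequential and works on flat Vec buffers, so it does not show the overlap with in-flight reads or the ndarray types; `stack_rows` and its String error are assumptions for illustration.

```rust
/// Hypothetical helper: stack equally sized rows into one row-major buffer.
/// Fails if any row is missing, since the stacked form has no "missing" slot.
pub fn stack_rows<T: Clone + Default>(
    rows: &[Option<Vec<T>>],
    row_len: usize,
) -> Result<Vec<T>, String> {
    // Single allocation for the whole (rows.len(), row_len) output.
    let mut out = vec![T::default(); rows.len() * row_len];
    for (i, row) in rows.iter().enumerate() {
        let row = row
            .as_ref()
            .ok_or_else(|| format!("dataset at position {} does not declare the array", i))?;
        if row.len() != row_len {
            return Err(format!("row {} has length {}, expected {}", i, row.len(), row_len));
        }
        // Copy the row straight into its slot; no intermediate list of arrays.
        out[i * row_len..(i + 1) * row_len].clone_from_slice(row);
    }
    Ok(out)
}
```

Writing rows directly into the final buffer avoids the extra full copy that stacking a list of separately allocated arrays would require.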

pub async fn flush(&mut self) -> Result<()>

Flush every known array file’s pending writes and persist the in-memory atlas.json. This is the single durability boundary for the store.

Force-initializes every array referenced in the metadata, even ones never touched by a DatasetView: lazy initialization pays off on the read path only, not on flush.

pub async fn compact(&mut self) -> Result<()>

Compact every known array file in place (reclaims tombstoned space). Force-initializes every array referenced in meta.

Auto Trait Implementations§

§

impl !RefUnwindSafe for Atlas

§

impl !UnwindSafe for Atlas

§

impl Freeze for Atlas

§

impl Send for Atlas

§

impl Sync for Atlas

§

impl Unpin for Atlas

§

impl UnsafeUnpin for Atlas

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> ArchivePointee for T

Source§

type ArchivedMetadata = ()

The archived version of the pointer metadata for this type.
Source§

fn pointer_metadata( _: &<T as ArchivePointee>::ArchivedMetadata, ) -> <T as Pointee>::Metadata

Converts some archived metadata to the pointer metadata for itself.
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> LayoutRaw for T

Source§

fn layout_raw(_: <T as Pointee>::Metadata) -> Result<Layout, LayoutError>

Returns the layout of the type.
Source§

impl<T, N1, N2> Niching<NichedOption<T, N1>> for N2
where T: SharedNiching<N1, N2>, N1: Niching<T>, N2: Niching<T>,

Source§

unsafe fn is_niched(niched: *const NichedOption<T, N1>) -> bool

Returns whether the given value has been niched. Read more
Source§

fn resolve_niched(out: Place<NichedOption<T, N1>>)

Writes data to out indicating that a T is niched.
Source§

impl<T> Pointable for T

Source§

const ALIGN: usize

The alignment of pointer.
Source§

type Init = T

The type for initializers.
Source§

unsafe fn init(init: <T as Pointable>::Init) -> usize

Initializes a with the given initializer. Read more
Source§

unsafe fn deref<'a>(ptr: usize) -> &'a T

Dereferences the given pointer. Read more
Source§

unsafe fn deref_mut<'a>(ptr: usize) -> &'a mut T

Mutably dereferences the given pointer. Read more
Source§

unsafe fn drop(ptr: usize)

Drops the object pointed to by the given pointer. Read more
Source§

impl<T> Pointee for T

Source§

type Metadata = ()

The metadata type for pointers and references to this type.
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more