Crate array_format

§array-format

A block-backed, footer-indexed container for storing many n-dimensional arrays in a single file.

The format uses a delta/overlay architecture: each flush produces a self-describing sidecar file that stacks on top of the base, recording only the chunks that changed. Reads fall through to older layers for unchanged chunks, and layers can be merged back into a single file with compact.
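
The fall-through and merge behavior described above can be sketched with plain standard-library types. This is a minimal illustration, not the crate's implementation: `ChunkKey`, `Layer`, and `LayerStack` are hypothetical names, and real layers live in files rather than in memory.

```rust
use std::collections::HashMap;

/// Hypothetical chunk key: (array name, chunk index). Not a type from the crate.
type ChunkKey = (String, u64);

/// One layer (the base or a delta sidecar), holding only the chunks it wrote.
#[derive(Default)]
struct Layer {
    chunks: HashMap<ChunkKey, Vec<u8>>,
}

/// Layers ordered oldest (base) first, newest last.
#[derive(Default)]
struct LayerStack {
    layers: Vec<Layer>,
}

impl LayerStack {
    /// A flush pushes a new layer containing only the changed chunks.
    fn flush(&mut self, changed: Vec<(ChunkKey, Vec<u8>)>) {
        let mut layer = Layer::default();
        layer.chunks.extend(changed);
        self.layers.push(layer);
    }

    /// Reads check the newest layer first and fall through to older ones.
    fn read(&self, key: &ChunkKey) -> Option<&[u8]> {
        self.layers
            .iter()
            .rev()
            .find_map(|layer| layer.chunks.get(key).map(|v| v.as_slice()))
    }

    /// Compaction folds every layer into one; newer chunks overwrite older ones.
    fn compact(&mut self) {
        let mut merged = Layer::default();
        for layer in self.layers.drain(..) {
            merged.chunks.extend(layer.chunks);
        }
        self.layers.push(merged);
    }
}

fn main() {
    let key = |name: &str, idx: u64| (name.to_string(), idx);
    let mut stack = LayerStack::default();
    // Base layer writes two chunks; the second flush rewrites only chunk 1.
    stack.flush(vec![(key("signal", 0), vec![1, 2]), (key("signal", 1), vec![3, 4])]);
    stack.flush(vec![(key("signal", 1), vec![9, 9])]);

    // Chunk 0 falls through to the base; chunk 1 comes from the newer layer.
    assert_eq!(stack.read(&key("signal", 0)), Some(&[1u8, 2][..]));
    assert_eq!(stack.read(&key("signal", 1)), Some(&[9u8, 9][..]));

    stack.compact();
    assert_eq!(stack.layers.len(), 1);
    assert_eq!(stack.read(&key("signal", 1)), Some(&[9u8, 9][..]));
}
```

Because each layer records only what changed, the cost of a flush scales with the size of the update, while read cost grows with the number of layers until they are compacted.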

§Features

  • Store many arrays in one object (or a small set of related sidecar files).
  • Append arrays and update individual chunks without rewriting the whole file.
  • Per-block compression (LZ4, Zstd, or none) recorded in the block table, so readers need no configuration to decode a file.
  • Chunked or single-chunk layouts with coordinate-addressed reads.
  • Logical deletes with periodic compaction to reclaim space.
  • Works with any object_store-compatible backend (local filesystem, S3, GCS, Azure) via ObjectStoreBackend, plus an InMemoryStorage backend for tests.
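
To illustrate why recording the codec per block lets readers decode without configuration, here is a small sketch. Everything in it is hypothetical: `CodecTag` and `BlockEntry` are not the crate's types, and a toy run-length encoding stands in for LZ4 or Zstd so the example needs no dependencies.

```rust
/// Hypothetical codec tag stored per block; the crate's real block table may differ.
#[derive(Clone, Copy, Debug, PartialEq)]
enum CodecTag {
    None,
    /// Toy run-length encoding, standing in for a real codec such as LZ4 or Zstd.
    Rle,
}

/// Hypothetical block table entry: the payload plus the codec that produced it.
struct BlockEntry {
    codec: CodecTag,
    payload: Vec<u8>,
}

/// Encode as (run length, byte) pairs, with runs capped at 255.
fn rle_encode(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    let mut i = 0;
    while i < data.len() {
        let byte = data[i];
        let mut run = 1usize;
        while i + run < data.len() && data[i + run] == byte && run < 255 {
            run += 1;
        }
        out.push(run as u8);
        out.push(byte);
        i += run;
    }
    out
}

/// Expand (run length, byte) pairs back into the original bytes.
fn rle_decode(data: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    for pair in data.chunks_exact(2) {
        out.extend(std::iter::repeat(pair[1]).take(pair[0] as usize));
    }
    out
}

/// Writer side: choose a codec and record its tag next to the payload.
fn write_block(data: &[u8], codec: CodecTag) -> BlockEntry {
    let payload = match codec {
        CodecTag::None => data.to_vec(),
        CodecTag::Rle => rle_encode(data),
    };
    BlockEntry { codec, payload }
}

/// Reader side: takes no codec argument and dispatches on the recorded tag.
fn read_block(entry: &BlockEntry) -> Vec<u8> {
    match entry.codec {
        CodecTag::None => entry.payload.clone(),
        CodecTag::Rle => rle_decode(&entry.payload),
    }
}

fn main() {
    let data = vec![7u8; 10];
    let raw = write_block(&data, CodecTag::None);
    let packed = write_block(&data, CodecTag::Rle);
    assert_eq!(packed.payload, vec![10, 7]);
    // Blocks written with different codecs decode through the same reader call.
    assert_eq!(read_block(&raw), data);
    assert_eq!(read_block(&packed), data);
}
```

A side effect of this design is that blocks written with different codecs can coexist in one file, since the reader never relies on a file-wide setting.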

§Quick start

use array_format::{ArrayFile, FileConfig, Lz4Codec};
use ndarray::Array;

// An in-memory file; use `ArrayFile::create(store, path, config)` for on-disk.
let mut file = ArrayFile::create_memory(FileConfig::new(Lz4Codec)).await?;

// Define and write a 1-D f32 array.
file.define_array::<f32>("signal", vec!["t".into()], vec![4], None, None)?;
let data = Array::from_vec(vec![1.0f32, 2.0, 3.0, 4.0]).into_dyn();
file.write_array("signal", vec![0], data.view()).await?;

// Read it back — `vec![], vec![]` means "the whole array".
let out = file.read_array::<f32>("signal", vec![], vec![]).await?;
assert_eq!(out.len(), 4);
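
Reading less than the whole array requires mapping the requested region onto chunks. The sketch below shows one common way to do that on a regular chunk grid. It assumes the region is given as a start offset and an extent per axis, which this page does not confirm for `read_array`; `chunks_for_range` is a hypothetical helper, not part of the crate.

```rust
/// Chunk coordinates touched by the half-open box [start, start + shape)
/// on a regular grid. Assumes every entry of `chunk_shape` is non-zero.
fn chunks_for_range(start: &[usize], shape: &[usize], chunk_shape: &[usize]) -> Vec<Vec<usize>> {
    assert_eq!(start.len(), shape.len());
    assert_eq!(start.len(), chunk_shape.len());

    // An empty request along any axis touches no chunks.
    if shape.iter().any(|&n| n == 0) {
        return Vec::new();
    }

    // Inclusive range of chunk indices along each axis.
    let ranges: Vec<(usize, usize)> = (0..start.len())
        .map(|d| (start[d] / chunk_shape[d], (start[d] + shape[d] - 1) / chunk_shape[d]))
        .collect();

    // Cartesian product of the per-axis ranges, last axis varying fastest.
    let mut out: Vec<Vec<usize>> = vec![Vec::new()];
    for &(lo, hi) in &ranges {
        let mut next = Vec::new();
        for prefix in &out {
            for c in lo..=hi {
                let mut coord = prefix.clone();
                coord.push(c);
                next.push(coord);
            }
        }
        out = next;
    }
    out
}

fn main() {
    // A 1-D array of length 4 stored as a single chunk: one chunk is touched.
    assert_eq!(chunks_for_range(&[0], &[4], &[4]), vec![vec![0]]);

    // A 2-D request starting at (3, 0) with extent (4, 2) on 2x2 chunks.
    let hit = chunks_for_range(&[3, 0], &[4, 2], &[2, 2]);
    assert_eq!(hit, vec![vec![1, 0], vec![2, 0], vec![3, 0]]);
}
```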

§Architecture

The crate is organized in four layers:

| Layer | Purpose | Key types |
|---|---|---|
| 0: Core | Primitives | DType, ChunkAddress, BlockId, Error |
| 1: Metadata | Footer model | BlockMeta, Footer |
| 2: Traits | Extension points | CompressionCodec, Storage |
| 3: Runtime | Read / write / compact | ArrayFile |

The CompressionCodec and Storage traits are the extension points: implement them to plug in custom compression algorithms or storage backends.
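
The general shape of a trait-based extension point can be sketched as follows. This page does not show the method signatures of `CompressionCodec` or `Storage`, so the example defines its own hypothetical `ToyStorage` trait with synchronous `put` and `get` methods. The real traits are likely to differ (for example, by being async), so treat this as an illustration of the pattern rather than a template for implementing them.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Hypothetical stand-in for a storage extension point (not the crate's `Storage`).
trait ToyStorage {
    fn put(&mut self, path: &str, bytes: Vec<u8>);
    fn get(&self, path: &str) -> Option<Vec<u8>>;
}

/// A simple in-memory backend.
#[derive(Default)]
struct MemStore {
    objects: HashMap<String, Vec<u8>>,
}

impl ToyStorage for MemStore {
    fn put(&mut self, path: &str, bytes: Vec<u8>) {
        self.objects.insert(path.to_string(), bytes);
    }
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        self.objects.get(path).cloned()
    }
}

/// A custom backend that wraps another one and counts reads.
struct CountingStore<S: ToyStorage> {
    inner: S,
    reads: Cell<usize>,
}

impl<S: ToyStorage> ToyStorage for CountingStore<S> {
    fn put(&mut self, path: &str, bytes: Vec<u8>) {
        self.inner.put(path, bytes);
    }
    fn get(&self, path: &str) -> Option<Vec<u8>> {
        self.reads.set(self.reads.get() + 1);
        self.inner.get(path)
    }
}

/// Runtime code is generic over the trait, so any backend can be plugged in.
struct ToyFile<S: ToyStorage> {
    store: S,
}

impl<S: ToyStorage> ToyFile<S> {
    fn save(&mut self, name: &str, data: &[u8]) {
        self.store.put(name, data.to_vec());
    }
    fn load(&self, name: &str) -> Option<Vec<u8>> {
        self.store.get(name)
    }
}

fn main() {
    let store = CountingStore { inner: MemStore::default(), reads: Cell::new(0) };
    let mut file = ToyFile { store };
    file.save("signal", &[1, 2, 3]);
    assert_eq!(file.load("signal"), Some(vec![1, 2, 3]));
    assert_eq!(file.load("missing"), None);
    // Both lookups went through the custom backend.
    assert_eq!(file.store.reads.get(), 2);
}
```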

Re-exports§

pub use array::ArrayElement;
pub use codec::CompressionCodec;
pub use codec::Lz4Codec;
pub use codec::NoCompression;
pub use codec::ZstdCodec;
pub use delta::DeltaCache;
pub use dtype::DType;
pub use error::Error;
pub use error::Result;
pub use file::ArrayFile;
pub use file::DEFAULT_BLOCK_TARGET_SIZE;
pub use file::DEFAULT_CACHE_CAPACITY;
pub use file::DEFAULT_IO_CACHE_CAPACITY;
pub use file::FileConfig;
pub use file::MergedArrayMeta;
pub use layout::AttributeValue;
pub use layout::FillValue;
pub use stats::ArrayStats;
pub use stats::StatValue;
pub use stats::StatsFile;
pub use storage::InMemoryStorage;
pub use timestamp::TimestampNs;

Modules§

address: Block identifiers and chunk address types.
array
block: Block metadata stored in the footer.
codec: Compression codec trait and built-in implementations.
delta
dtype: Data type definitions for array elements.
error: Error types for the array-format crate.
file
footer: File footer: the index that maps array names to block addresses.
layout: Array layout definitions and array metadata.
ndarray_ext: Optional integration with the ndarray crate.
stats
storage: Storage backend trait and implementations.
timestamp: Nanosecond-precision timestamp wrapper.