pub struct HDF5Dataset { /* private fields */ }
An HDF5 dataset — a typed, shaped, chunked (or contiguous) array of data.
Provides synchronous access to parsed metadata (shape, dtype, chunk shape, filters, fill value) and async access to the chunk index (byte offsets of all chunks in the file).
Implementations
impl HDF5Dataset
pub fn new(
    name: String,
    header: ObjectHeader,
    reader: Arc<dyn AsyncFileReader>,
    raw_reader: Arc<dyn AsyncFileReader>,
    superblock: Arc<Superblock>,
) -> Result<Self>
Create a new dataset by parsing metadata messages from its object header.
pub fn element_size(&self) -> u32
Element size in bytes.
pub fn chunk_shape(&self) -> Option<&[u64]>
Chunk shape (None for contiguous or compact storage).
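As a rough illustration of how a chunk shape relates to the dataset shape, the sketch below computes how many chunks lie along each dimension. The helper `chunk_grid` is hypothetical and not part of this crate; it assumes the chunk shape has the same rank as the dataset shape and that partial chunks at the edges still count as whole chunks.

```rust
/// Number of chunks along each dimension, rounding up so that a partial
/// chunk at the edge of the dataset is still counted.
/// Returns `None` if the ranks differ or any chunk extent is zero.
/// (Hypothetical helper for illustration; not part of the crate's API.)
fn chunk_grid(shape: &[u64], chunk_shape: &[u64]) -> Option<Vec<u64>> {
    if shape.len() != chunk_shape.len() || chunk_shape.contains(&0) {
        return None;
    }
    Some(
        shape
            .iter()
            .zip(chunk_shape)
            // Ceiling division written without `dim + chunk - 1`, which could overflow.
            .map(|(&dim, &chunk)| dim / chunk + u64::from(dim % chunk != 0))
            .collect(),
    )
}
```

For a 10 x 7 dataset stored in 4 x 4 chunks, this yields a 3 x 2 grid of chunks.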
pub fn filters(&self) -> &FilterPipeline
Filter pipeline.
pub fn fill_value(&self) -> Option<&[u8]>
Fill value bytes (interpretation depends on dtype).
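Because the fill value is returned as raw bytes, the caller has to decode it according to the dataset's dtype. A minimal sketch, assuming the dtype is a 4-byte little-endian float (the helper name `fill_value_as_f32_le` is hypothetical and not part of this crate):

```rust
use std::convert::TryFrom;

/// Decode fill-value bytes as a little-endian `f32`.
/// Returns `None` when the slice is not exactly 4 bytes long.
/// (Hypothetical helper; a real caller would first check the dataset's
/// dtype and byte order before picking a decoder.)
fn fill_value_as_f32_le(bytes: &[u8]) -> Option<f32> {
    let raw = <[u8; 4]>::try_from(bytes).ok()?;
    Some(f32::from_le_bytes(raw))
}
```

Other dtypes would follow the same pattern with the matching `from_le_bytes` or `from_be_bytes` constructor.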
pub fn layout(&self) -> &StorageLayout
Storage layout.
pub fn is_null_dataspace(&self) -> bool
Whether this dataset has a null dataspace (dataspace type 2, which holds no data).
pub fn has_external_storage(&self) -> bool
Whether this dataset uses external data files (header message type 0x0007).
pub fn header(&self) -> &ObjectHeader
Access the object header.
pub async fn attributes(&self) -> Vec<Attribute>
Get all attributes attached to this dataset, resolving variable-length (vlen) data.
pub async fn chunk_index(&self) -> Result<&ChunkIndex>
Extract the chunk index, caching the result after first resolution.
For chunked datasets, traverses the B-tree to enumerate all chunks. For contiguous datasets, returns a single-entry index. For compact datasets, returns an empty index (data is inline in the header).
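The three cases above can be sketched with stand-in types. `Layout`, `ChunkEntry`, and `build_index` below are hypothetical and do not mirror the crate's `StorageLayout` or `ChunkIndex`; the B-tree traversal is replaced by a pre-built list of entries, and placing the single contiguous entry at the origin is an assumption.

```rust
/// Hypothetical index entry: chunk coordinates plus a byte range in the file.
#[derive(Debug, Clone, PartialEq)]
struct ChunkEntry {
    coords: Vec<u64>,
    offset: u64,
    length: u64,
}

/// Hypothetical stand-in for a storage layout (not the crate's `StorageLayout`).
enum Layout {
    /// Entries as they might come out of a B-tree walk.
    Chunked(Vec<ChunkEntry>),
    /// One contiguous block of `length` bytes starting at `offset`.
    Contiguous { offset: u64, length: u64 },
    /// Data stored inline in the object header.
    Compact,
}

/// Build an index following the three cases described above.
fn build_index(layout: &Layout, rank: usize) -> Vec<ChunkEntry> {
    match layout {
        // Chunked: every allocated chunk gets its own entry.
        Layout::Chunked(entries) => entries.clone(),
        // Contiguous: a single entry covering the whole data block
        // (placed at the origin here, which is an assumption).
        Layout::Contiguous { offset, length } => vec![ChunkEntry {
            coords: vec![0; rank],
            offset: *offset,
            length: *length,
        }],
        // Compact: nothing to index, the data lives in the header.
        Layout::Compact => Vec::new(),
    }
}
```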
pub async fn batch_get_chunks(
    &self,
    chunk_indices: &[Vec<u64>],
) -> Result<Vec<Option<Bytes>>>
Fetch multiple chunks in a single batched I/O call.
Looks up byte ranges from the chunk index and fetches them all via
raw_reader.get_byte_ranges(), bypassing the BlockCache.
Returns one entry per input index, in the same order. Chunks that
are not present in the index (unallocated) are returned as None.
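The ordering and `None` behaviour described above can be illustrated with a synchronous, in-memory stand-in. Everything here is hypothetical: `Index` is a plain `HashMap` rather than the crate's `ChunkIndex`, a byte slice stands in for the file, and no real I/O or batching takes place.

```rust
use std::collections::HashMap;
use std::convert::TryFrom;

/// Hypothetical index: chunk coordinates -> (offset, length) in the file.
type Index = HashMap<Vec<u64>, (u64, u64)>;

/// Look up each requested chunk and slice it out of `file`.
/// The output has one entry per requested index, in request order;
/// chunks missing from the index come back as `None`.
/// For brevity this sketch also maps an out-of-bounds range to `None`,
/// which a real implementation would more likely report as an error.
fn get_chunks<'a>(
    file: &'a [u8],
    index: &Index,
    wanted: &[Vec<u64>],
) -> Vec<Option<&'a [u8]>> {
    wanted
        .iter()
        .map(|coords| {
            let &(offset, length) = index.get(coords)?;
            let start = usize::try_from(offset).ok()?;
            let end = start.checked_add(usize::try_from(length).ok()?)?;
            file.get(start..end)
        })
        .collect()
}
```

Requesting chunks `[0, 1]`, `[1, 1]`, `[0, 0]` against an index that only contains `[0, 0]` and `[0, 1]` returns `Some`, `None`, `Some` in that order.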
pub async fn batch_fetch_ranges(
    &self,
    ranges: &[(u64, u64)],
) -> Result<Vec<Bytes>>
Fetch multiple byte ranges in a single batched I/O call.
Unlike batch_get_chunks, this takes pre-resolved (offset, length)
pairs — no chunk index lookup is performed. Use this when the caller
has already resolved the chunk index and wants to avoid re-parsing.
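To make the (offset, length) convention concrete, here is a small in-memory sketch. The function `fetch_ranges` is hypothetical and synchronous; it slices a byte buffer instead of issuing I/O, and it uses `String` errors rather than the crate's `Result` type.

```rust
use std::convert::TryFrom;

/// Copy each `(offset, length)` range out of `file`, preserving input order.
/// Fails on the first range that overflows or runs past the end of the buffer.
/// (Hypothetical stand-in; the real method is async and fetches from a reader.)
fn fetch_ranges(file: &[u8], ranges: &[(u64, u64)]) -> Result<Vec<Vec<u8>>, String> {
    ranges
        .iter()
        .map(|&(offset, length)| {
            let start = usize::try_from(offset).map_err(|e| e.to_string())?;
            let len = usize::try_from(length).map_err(|e| e.to_string())?;
            let end = start
                .checked_add(len)
                .ok_or_else(|| "range overflows usize".to_string())?;
            file.get(start..end)
                .map(|bytes| bytes.to_vec())
                .ok_or_else(|| format!("range {}+{} is out of bounds", offset, length))
        })
        .collect()
}
```

Note that the second element of each pair is a length, not an end offset, so `(2, 3)` covers bytes 2, 3, and 4.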