Alimentar Dataset Format (.ald)
A binary format for secure, verifiable dataset distribution.
See docs/specifications/dataset-format-spec.md for full specification.
§Format Structure
┌─────────────────────────────────────────┐
│ Header (32 bytes, fixed)                │
├─────────────────────────────────────────┤
│ Metadata (variable, MessagePack)        │
├─────────────────────────────────────────┤
│ Schema (variable, Arrow IPC)            │
├─────────────────────────────────────────┤
│ Payload (variable, Arrow IPC + zstd)    │
├─────────────────────────────────────────┤
│ Checksum (4 bytes, CRC32)               │
└─────────────────────────────────────────┘
§Example
use alimentar::format::{save, load, SaveOptions, DatasetType};
// Save dataset
save(&dataset, DatasetType::Tabular, "data.ald", SaveOptions::default())?;
// Load dataset
let dataset = load("data.ald")?;
Modules§
- encryption
- Encryption support for .ald format (§5.1)
- flags
- Header flags (bit positions)
- license
- Commercial licensing support for .ald format (§9)
- piracy
- Piracy detection and watermarking for .ald format (§9.3)
- signing
- Digital signing support for .ald format (§5.2)
- streaming
- Streaming dataset format with lazy chunk loading
Structs§
- Header
- File header (32 bytes, fixed)
- LoadOptions
- Options for loading datasets
- LoadedDataset
- Loaded dataset from .ald format
- Metadata
- Dataset metadata (MessagePack-encoded)
- SaveOptions
- Options for saving datasets
Enums§
- Compression
- Compression algorithm identifiers (§3.3)
- DatasetType
- Dataset type identifiers (§3.1)
Constants§
- FORMAT_VERSION_MAJOR
- Current format version major number
- FORMAT_VERSION_MINOR
- Current format version minor number
- HEADER_SIZE
- Header size in bytes (fixed)
- MAGIC
- Magic bytes: “ALDF” (0x414C4446)
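The constants above are enough for a cheap sanity check before attempting a full load. The following is a minimal sketch, not the crate's own code: `looks_like_ald` is a hypothetical helper, the constants are redefined locally rather than imported, and it assumes the magic bytes occupy the first four bytes of the header, which this page does not state.

```rust
/// Magic bytes as listed above: "ALDF" (0x414C4446).
/// Redefined locally for illustration; the crate exports its own `MAGIC`.
const MAGIC: [u8; 4] = *b"ALDF";

/// Fixed header size in bytes, as listed above.
const HEADER_SIZE: usize = 32;

/// Hypothetical helper: returns true when `bytes` is long enough to hold
/// a full header and starts with the magic bytes.
/// ASSUMPTION: the magic sits at offset 0 of the header.
fn looks_like_ald(bytes: &[u8]) -> bool {
    bytes.len() >= HEADER_SIZE && bytes.starts_with(&MAGIC)
}
```

Read as a big-endian `u32`, the four ASCII bytes of "ALDF" give the hex value 0x414C4446 quoted in the constant's description.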
Functions§
- crc32
- CRC32 checksum calculation (IEEE polynomial)
- load
- Load an Arrow dataset from the .ald format (unencrypted only)
- load_from_file
- Load dataset from a file path
- load_from_file_with_options
- Load dataset from a file path with decryption and verification options
- load_with_options
- Load an Arrow dataset with decryption and verification options
- save
- Save an Arrow dataset to the .ald format
- save_to_file
- Save dataset to a file path
- sha256_hex
- Computes SHA-256 hash of data and returns it as a hex string.
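To illustrate the 4-byte CRC32 trailer from the format diagram, here is a small standalone sketch of an IEEE CRC-32 and a trailer check. It does not reproduce the crate's `crc32` function, whose signature is not shown on this page. `crc32_ieee` and `trailer_matches` are hypothetical names, and the trailer check assumes the checksum covers every preceding byte and is stored little-endian; consult the specification for the actual rules.

```rust
/// Bitwise CRC-32 with the reflected IEEE polynomial 0xEDB88320.
/// Illustrative only; a table-driven version would be faster.
fn crc32_ieee(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All ones when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical trailer check.
/// ASSUMPTIONS: the last 4 bytes hold the checksum, it covers all
/// preceding bytes, and it is stored little-endian.
fn trailer_matches(file: &[u8]) -> bool {
    if file.len() < 4 {
        return false;
    }
    let (body, trailer) = file.split_at(file.len() - 4);
    let stored = u32::from_le_bytes([trailer[0], trailer[1], trailer[2], trailer[3]]);
    crc32_ieee(body) == stored
}
```

The standard check value for this CRC variant is 0xCBF43926 for the ASCII input "123456789", which is a convenient way to confirm that an implementation uses the IEEE polynomial.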