Core snapshot engine providing block-level compressed storage with random access.
§Overview
hexz-core implements the core logic for creating and reading Hexz snapshots—
compressed, block-indexed archives that support random access, remote streaming,
and incremental updates. This crate contains no UI code; all user interfaces
(CLI, Python, FUSE) are in separate crates.
§Architecture
The crate is organized into several independent modules:
- format: On-disk structures (headers, indices) defining the file format
- store: Storage backend abstraction (local files, HTTP, S3)
- algo: Compression, encryption, hashing, and deduplication algorithms
- cache: LRU caching for decompressed blocks and index pages
- api: Public API (File) for reading snapshots
- ops: High-level operations for packing and manipulating snapshots
§Quick Start
use hexz_core::{File, SnapshotStream};
use hexz_core::store::local::FileBackend;
use hexz_core::algo::compression::lz4::Lz4Compressor;
use std::sync::Arc;
// Open a local snapshot file
let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
let compressor = Box::new(Lz4Compressor::new());
let snapshot = File::new(backend, compressor, None)?;
// Read 4 KiB from the disk stream at offset 1 MiB
let data = snapshot.read_at(SnapshotStream::Disk, 1024 * 1024, 4096)?;
assert_eq!(data.len(), 4096);
§File Format
Hexz snapshots consist of:
- A fixed-size header (512 bytes) with metadata
- Compressed data blocks (variable size)
- Hierarchical index pages (serialized with bincode)
- Master index at the end (location stored in header)
The format supports:
- Block-level compression (LZ4, Zstandard)
- Optional AES-256-GCM encryption
- Thin snapshots (parent references)
- Dual streams (separate disk and memory data)
- Content-defined chunking for deduplication
See format module for detailed specification.
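To illustrate how a block index makes random access possible, here is a minimal sketch of mapping a logical offset to the block that contains it. The `IndexEntry` type, its fields, and the flat sorted slice are assumptions made for illustration; the real layout (hierarchical, bincode-serialized index pages) is defined in the format module and may differ.

```rust
/// Hypothetical index entry, not the actual hexz-core on-disk structure.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct IndexEntry {
    /// Offset of the block in the uncompressed (logical) stream.
    pub logical_start: u64,
    /// Uncompressed length of the block.
    pub logical_len: u64,
    /// Where the compressed block begins in the snapshot file.
    pub file_offset: u64,
}

/// Finds the block covering `offset` in an index sorted by `logical_start`.
/// Returns the entry position and the offset within that block.
pub fn find_block(index: &[IndexEntry], offset: u64) -> Option<(usize, u64)> {
    // Count the entries that start at or before `offset` (binary search).
    let count = index.partition_point(|entry| entry.logical_start <= offset);
    // The candidate is the last of those entries, if there is one.
    let position = count.checked_sub(1)?;
    let entry = &index[position];
    let within = offset - entry.logical_start;
    if within < entry.logical_len {
        Some((position, within))
    } else {
        // `offset` falls in a gap or past the end of the stream.
        None
    }
}
```

A reader built this way only needs to fetch and decompress the block at `file_offset` and then skip `within` bytes, which is why a small read does not require touching the rest of the archive.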
§Storage Backends
Storage backends implement the store::StorageBackend trait, enabling reads from:
- Local files (store::local::FileBackend)
- Memory-mapped files (store::local::MmapBackend)
- HTTP/HTTPS URLs (store::http)
- S3 buckets (store::s3)
All backends provide the same interface—higher layers don’t know where data comes from.
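The idea of a uniform backend interface can be sketched with a small stand-in trait. `ByteSource`, `MemorySource`, and `read_header` below are hypothetical names chosen for this example; they are not the actual `store::StorageBackend` trait, whose methods are documented in the store module. Only the 512-byte header size is taken from the format description.

```rust
use std::convert::TryFrom;
use std::io;
use std::sync::Arc;

/// Hypothetical stand-in for a storage trait (not `store::StorageBackend`).
pub trait ByteSource: Send + Sync {
    /// Reads exactly `len` bytes starting at `offset`.
    fn read_exact_at(&self, offset: u64, len: usize) -> io::Result<Vec<u8>>;
    /// Total size of the underlying object in bytes.
    fn size(&self) -> u64;
}

/// In-memory implementation, standing in for a file, HTTP, or S3 source.
pub struct MemorySource {
    bytes: Vec<u8>,
}

impl MemorySource {
    pub fn new(bytes: Vec<u8>) -> Self {
        Self { bytes }
    }
}

impl ByteSource for MemorySource {
    fn read_exact_at(&self, offset: u64, len: usize) -> io::Result<Vec<u8>> {
        let start = usize::try_from(offset)
            .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "offset too large"))?;
        let end = start
            .checked_add(len)
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "range overflows"))?;
        // `get` returns None when the range runs past the end of the buffer.
        self.bytes
            .get(start..end)
            .map(|slice| slice.to_vec())
            .ok_or_else(|| io::Error::new(io::ErrorKind::UnexpectedEof, "read past end"))
    }

    fn size(&self) -> u64 {
        self.bytes.len() as u64
    }
}

/// A higher layer that sees only the trait object, never the concrete source.
pub fn read_header(source: &Arc<dyn ByteSource>) -> io::Result<Vec<u8>> {
    source.read_exact_at(0, 512)
}
```

Because `read_header` takes a trait object, swapping the in-memory source for a network-backed one would not change the calling code.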
§Compression & Encryption
Compression and encryption are pluggable via traits:
- algo::compression::Compressor: LZ4 (algo::compression::lz4) or Zstandard (algo::compression::zstd)
- algo::encryption::Encryptor: AES-256-GCM (algo::encryption::aes_gcm)
Each block is compressed independently, then optionally encrypted. This enables:
- Parallel decompression (each block is self-contained)
- Random access (only decompress blocks you need)
- Block-level integrity (CRC32 checksums)
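As a rough illustration of block-level integrity, the sketch below attaches a CRC32 to each stored block and checks it on read. It assumes the common IEEE 802.3 CRC-32 variant and assumes the checksum covers the stored payload; which variant Hexz uses and which bytes it covers are not stated here. `StoredBlock`, `seal`, and `verify` are hypothetical names.

```rust
/// Bitwise CRC-32 (IEEE 802.3, reflected polynomial 0xEDB88320).
pub fn crc32(data: &[u8]) -> u32 {
    let mut crc = 0xFFFF_FFFFu32;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // All ones when the low bit is set, zero otherwise.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}

/// Hypothetical stored block: payload bytes plus their checksum.
pub struct StoredBlock {
    pub payload: Vec<u8>,
    pub checksum: u32,
}

/// Computes the checksum once, when the block is written.
pub fn seal(payload: Vec<u8>) -> StoredBlock {
    let checksum = crc32(&payload);
    StoredBlock { payload, checksum }
}

/// Recomputes the checksum on read and rejects a mismatching block.
pub fn verify(block: &StoredBlock) -> Result<&[u8], String> {
    let actual = crc32(&block.payload);
    if actual == block.checksum {
        Ok(block.payload.as_slice())
    } else {
        Err(format!(
            "checksum mismatch: stored {:#010x}, computed {:#010x}",
            block.checksum, actual
        ))
    }
}
```

Since each block carries its own checksum, corruption is detected for the block being read without scanning the rest of the file. A production implementation would normally use a table-driven or hardware-accelerated CRC rather than this bit-by-bit loop.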
§Performance
- Compression: LZ4 ~2 GB/s, Zstd ~500 MB/s (single-threaded)
- Random Access: ~1 ms latency (cold cache), ~0.08 ms (warm cache)
- Sequential Read: ~2-3 GB/s (NVMe storage, LZ4 decompression)
- Memory: <150 MB typical (configurable block cache)
§Thread Safety
File is Send + Sync and can be safely shared across threads via Arc.
Internal caches use Mutex for synchronization. Multiple threads can read
concurrently from the same snapshot with independent cache hits.
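The sharing pattern described above can be sketched with standard-library types only. `SharedReader` is a hypothetical stand-in, not `hexz_core::File`: it slices an in-memory buffer into fixed-size blocks and keeps them in a `Mutex`-guarded map, which is only an approximation of an LRU block cache.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical reader with a mutex-guarded block cache (not hexz_core::File).
pub struct SharedReader {
    data: Vec<u8>,
    block_size: usize,
    cache: Mutex<HashMap<usize, Arc<Vec<u8>>>>,
}

impl SharedReader {
    pub fn new(data: Vec<u8>, block_size: usize) -> Self {
        Self { data, block_size, cache: Mutex::new(HashMap::new()) }
    }

    /// Returns the block at `index`, filling the cache on a miss.
    pub fn read_block(&self, index: usize) -> Option<Arc<Vec<u8>>> {
        let mut cache = self.cache.lock().unwrap();
        if let Some(hit) = cache.get(&index) {
            return Some(Arc::clone(hit));
        }
        let start = index.checked_mul(self.block_size)?;
        if start >= self.data.len() {
            return None;
        }
        let end = (start + self.block_size).min(self.data.len());
        let block = Arc::new(self.data[start..end].to_vec());
        cache.insert(index, Arc::clone(&block));
        Some(block)
    }

    pub fn cached_blocks(&self) -> usize {
        self.cache.lock().unwrap().len()
    }
}

/// Spawns one thread per block index; each reports the length it read.
pub fn parallel_block_lengths(reader: Arc<SharedReader>, threads: usize) -> Vec<usize> {
    let handles: Vec<_> = (0..threads)
        .map(|index| {
            let reader = Arc::clone(&reader);
            thread::spawn(move || reader.read_block(index).map_or(0, |block| block.len()))
        })
        .collect();
    handles.into_iter().map(|handle| handle.join().unwrap()).collect()
}
```

Every method takes `&self`, so a single `Arc<SharedReader>` can be cloned into any number of threads. In this sketch the lock is held for the whole lookup, which keeps the code short but serializes cache misses.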
§Examples
§Reading from HTTP
use hexz_core::File;
use hexz_core::store::http::HttpBackend;
use hexz_core::algo::compression::lz4::Lz4Compressor;
use std::sync::Arc;
let backend = Arc::new(HttpBackend::new(
"https://example.com/dataset.hxz".to_string(),
false // don't allow restricted IPs
)?);
let compressor = Box::new(Lz4Compressor::new());
let snapshot = File::new(backend, compressor, None)?;
// Stream data without downloading entire file
let data = snapshot.read_at(hexz_core::SnapshotStream::Disk, 0, 1024)?;
§Thin Snapshots (Parent References)
Thin snapshots store a path to a base snapshot in their header; opening the thin file automatically loads the parent when needed.
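Conceptually, the parent fallback behaves like a chain of layers in which a block missing from the thin layer is looked up in its base. The sketch below models that with an in-memory map; `Layer` and its fields are hypothetical and say nothing about how hexz-core stores or resolves parent references internally.

```rust
use std::collections::HashMap;

/// Hypothetical layered block store (illustrative only, not hexz-core API).
pub struct Layer {
    /// Blocks stored in this layer, keyed by block index.
    pub blocks: HashMap<u64, Vec<u8>>,
    /// Base layer consulted when a block is absent here.
    pub parent: Option<Box<Layer>>,
}

impl Layer {
    /// Returns the block from the nearest layer that contains it.
    pub fn read_block(&self, index: u64) -> Option<&[u8]> {
        let mut layer = Some(self);
        while let Some(current) = layer {
            if let Some(block) = current.blocks.get(&index) {
                return Some(block.as_slice());
            }
            // Not changed in this layer: fall back to the parent, if any.
            layer = current.parent.as_deref();
        }
        None
    }
}
```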
use hexz_core::File;
use hexz_core::store::local::FileBackend;
use hexz_core::algo::compression::lz4::Lz4Compressor;
use std::sync::Arc;
// Open thin snapshot (parent is loaded from header parent_path when present)
let thin_backend = Arc::new(FileBackend::new("incremental.hxz".as_ref())?);
let thin_compressor = Box::new(Lz4Compressor::new());
let thin = File::new(thin_backend, thin_compressor, None)?;
// Reading from thin automatically falls back to base for unchanged blocks
let data = thin.read_at(hexz_core::SnapshotStream::Disk, 0, 4096)?;
Re-exports§
pub use api::file::File;
pub use api::file::SnapshotStream;
Modules§
- algo
- Algorithms for compression, encryption, hashing, and deduplication.
- api
- Public API surface for reading snapshot files.
- cache
- In-memory caching for decompressed blocks and deserialized index pages.
- format
- On-disk format structures: headers, indices, and serialization.
- ops
- High-level operations for creating, modifying, and analyzing snapshots.
- store
- Storage backend abstraction and implementations.