Hexz
High-performance, deduplicated archive format for large-scale data distribution.
Hexz is a single-file archive format (.hxz) designed for storing and distributing massive datasets, disk images, and binary blobs. Unlike traditional archive formats (tar, zip) or deduplication tools (Git LFS), Hexz archives are natively seekable, allowing you to mount a multi-terabyte archive and access any byte instantly without decompressing the entire file.
Key Features
- Block-Level Deduplication: Uses Content-Defined Chunking (CDC) to identify shared blocks across different versions of the same data, even if offsets shift.
- Random Access (Seekable): O(1) seek time. Mount archives via FUSE to treat them as standard block devices or filesystems.
- Thin Archives (Deltas): Create archives that only store changed blocks, referencing a "base" archive for common data. Ideal for versioning large datasets.
- Git-like Workspaces: Checkout archives into writable workspaces. Hexz creates a standard, transparent directory structure using advanced FUSE passthrough while keeping state securely isolated in global storage (
~/.hexz/workspaces/). - Transparent Compression & Encryption: Supports LZ4 and Zstandard compression, and AES-256-GCM encryption.
- Self-Contained: A single
.hxzfile contains the data, the index, and the metadata. No complex repository structure required. - Cloud Native: Supports byte-range requests for on-demand fetching from S3-compatible object storage.
Why Hexz?
| Feature | Git LFS | Hexz |
|---|---|---|
| Deduplication | File-level | Block-level (CDC) |
| Access Pattern | Full Download | Random Access (Seekable) |
| Update Efficiency | Re-upload whole file | Only upload changed blocks |
| Mountable | No | Yes (FUSE) |
Quick Start
Install
# Pre-built binary (Linux/macOS)
|
# From crates.io
See the tutorial for a full walkthrough.
Pack an archive
Mount and inspect
# Access contents via /mnt/data/disk
Create a thin delta
# Create a new version that dedups against the base
Workspaces (Git-like Workflow)
# Checkout an archive into a writable workspace
# Make changes to the mounted files
# View changes
# Commit changes to a new thin archive (delta)
License
Licensed under Apache 2.0 or MIT.