hexz-cli
Command-line tool for managing Hexz snapshots, datasets, and virtual machines.
Overview
The hexz CLI creates, analyzes, and manages Hexz snapshots. It supports dataset packing for AI/ML workflows, VM snapshot management, and diagnostic tools for inspecting snapshot internals.
This is the primary tool for developers and data engineers working with the Hexz format.
Installation
From Source
# Clone the repository
# Install the CLI
# Or install directly with cargo
After installation, the hexz command will be available in your PATH.
Quick Examples
Pack a Dataset
Convert raw files into a compressed, deduplicated Hexz snapshot:
# Pack a directory of images for ML training
# Pack with custom compression and encryption
Inspect a Snapshot
# Show snapshot metadata
# Get JSON output for programmatic access
Boot a VM from Snapshot
# Boot a VM with 4GB RAM (requires FUSE feature)
# Boot without KVM acceleration
Command Reference
The CLI is organized into three main command groups:
Data Commands (hexz data)
Work with datasets and snapshots for ML/AI workflows:
| Command | Description |
|---|---|
| pack | Create a snapshot from raw files/directories |
| build | Build a snapshot with specific profiles (optimized for different workloads) |
| info | Display snapshot metadata and statistics |
| diff | Compare two snapshots (diagnostics feature) |
| analyze | Analyze snapshot structure and compression efficiency (diagnostics feature) |
Example:
# Pack with deduplication
# View detailed info
VM Commands (hexz vm)
Manage virtual machines using Hexz snapshots (requires the fuse feature):
| Command | Description |
|---|---|
| boot | Boot a VM from a snapshot |
| install | Install an OS from ISO to create a new snapshot |
| snapshot | Capture running VM state (disk + memory) |
| mount | Mount a snapshot as a FUSE filesystem |
Example:
# Install Ubuntu from ISO
# Boot the installed system
# Snapshot a running VM
System Commands (hexz sys)
Server and infrastructure operations:
| Command | Description |
|---|---|
| serve | Start an HTTP server for streaming snapshots |
Example:
# Serve snapshots over HTTP
Common Options
Compression
Choose compression algorithm with --compression:
- lz4: Fast compression (~2GB/s), lower ratio (default)
- zstd: Better compression (~500MB/s), higher ratio
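To make the speed side of this trade-off concrete, here is a rough back-of-the-envelope estimate. It assumes the approximate throughput figures quoted above hold on your hardware (they may not), uses decimal units, and ignores I/O limits and the difference in output size. The function name and the 100 GB dataset are illustrative only.

```rust
/// Estimated seconds to pack `dataset_bytes` at a sustained
/// `throughput_bytes_per_sec`. Illustrative helper, not part of hexz.
fn estimated_pack_seconds(dataset_bytes: u64, throughput_bytes_per_sec: u64) -> f64 {
    dataset_bytes as f64 / throughput_bytes_per_sec as f64
}

fn main() {
    const GB: u64 = 1_000_000_000;
    const MB: u64 = 1_000_000;
    let dataset = 100 * GB; // hypothetical dataset size

    // Throughput figures are the approximate ones quoted in this README.
    let lz4 = estimated_pack_seconds(dataset, 2 * GB);
    let zstd = estimated_pack_seconds(dataset, 500 * MB);

    println!("lz4:  ~{lz4:.0} s");
    println!("zstd: ~{zstd:.0} s");
}
```

Under these assumptions the lz4 run takes about a quarter of the time of the zstd run; whether the smaller zstd output is worth the extra time depends on how often the snapshot is read or transferred.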
Content-Defined Chunking (CDC)
Enable deduplication with --cdc:
# Use default CDC settings (FastCDC)
# Custom chunk sizes
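For readers unfamiliar with content-defined chunking, the following is a minimal sketch of the general idea, not hexz's FastCDC implementation. It uses a simple gear-style rolling hash; the table generator, mask, and chunk-size bounds are all assumptions chosen for illustration. Because boundaries depend on nearby bytes rather than on fixed offsets, inserting data near the start of a file tends to leave most later chunks unchanged, which is what makes deduplication effective.

```rust
use std::collections::HashSet;

/// Build a 256-entry table of pseudo-random values for the rolling hash.
/// Assumption: a xorshift generator stands in for the fixed table a real
/// implementation would ship.
fn gear_table() -> [u64; 256] {
    let mut table = [0u64; 256];
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    for entry in table.iter_mut() {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        *entry = state;
    }
    table
}

/// Split `data` into chunks. A boundary is declared when the masked rolling
/// hash is zero (once the chunk has at least `min` bytes) or when the chunk
/// reaches `max` bytes.
fn chunk<'a>(data: &'a [u8], min: usize, max: usize, mask: u64) -> Vec<&'a [u8]> {
    let table = gear_table();
    let mut chunks = Vec::new();
    let mut start = 0;
    let mut hash: u64 = 0;
    for (i, &byte) in data.iter().enumerate() {
        // Each step shifts older bytes toward the top of the 64-bit hash and
        // eventually out of it, so the hash reflects only recent content.
        hash = (hash << 1).wrapping_add(table[byte as usize]);
        let len = i + 1 - start;
        if (len >= min && (hash & mask) == 0) || len >= max {
            chunks.push(&data[start..=i]);
            start = i + 1;
            hash = 0;
        }
    }
    if start < data.len() {
        chunks.push(&data[start..]); // trailing partial chunk
    }
    chunks
}

fn main() {
    // Hypothetical input: 8 KiB of pseudo-random bytes.
    let mut state: u32 = 12345;
    let base: Vec<u8> = (0..8192)
        .map(|_| {
            state = state.wrapping_mul(1_103_515_245).wrapping_add(12345);
            (state >> 16) as u8
        })
        .collect();

    // The same content with three bytes inserted at the front.
    let mut shifted = vec![1u8, 2, 3];
    shifted.extend_from_slice(&base);

    let (min, max, mask) = (64, 1024, 0xFF00_0000_0000_0000u64);
    let a = chunk(&base, min, max, mask);
    let b = chunk(&shifted, min, max, mask);

    // Count chunks of the shifted input that already exist in the original.
    let seen: HashSet<&[u8]> = a.iter().copied().collect();
    let shared = b.iter().filter(|c| seen.contains(*c)).count();
    println!("{} chunks, {} after insert, {} shared", a.len(), b.len(), shared);
}
```

With fixed-size blocks, the three inserted bytes would shift every block and nothing would match. Here the chunker can resynchronize at the first boundary both inputs agree on, and chunks after that point are byte-identical.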
Encryption
Encrypt snapshots with --encrypt:
# You'll be prompted for a password
Architecture
hexz-cli/
├── src/
│ ├── main.rs # Entry point
│ ├── args.rs # CLI argument parsing (clap)
│ ├── cmd/ # Command implementations
│ │ ├── data/ # Dataset commands (pack, info, etc.)
│ │ ├── vm/ # VM commands (boot, snapshot, etc.)
│ │ └── sys/ # System commands (serve)
│ └── ui/ # User interface (progress bars, formatters)
└── benches/ # Performance benchmarks
├── macro/ # Macro benchmarks (end-to-end)
├── micro/ # Micro benchmarks (component-level)
└── ai/ # AI/ML-focused benchmarks
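The split between args.rs and the cmd/ subdirectories mirrors the three command groups. As a rough, standard-library-only sketch of that dispatch shape (the real crate uses clap, and the type and function names here are hypothetical, not taken from the source):

```rust
/// The three command groups; each carries the chosen subcommand name.
/// Hypothetical types for illustration only.
#[derive(Debug, PartialEq)]
enum Command {
    Data(String),
    Vm(String),
    Sys(String),
}

// Subcommand names as listed in the Command Reference tables.
const DATA_CMDS: [&str; 5] = ["pack", "build", "info", "diff", "analyze"];
const VM_CMDS: [&str; 4] = ["boot", "install", "snapshot", "mount"];
const SYS_CMDS: [&str; 1] = ["serve"];

/// Map `<group> <subcommand> ...` to a Command, or None if unrecognized.
fn parse(args: &[&str]) -> Option<Command> {
    let (group, sub) = match args {
        [group, sub, ..] => (*group, *sub),
        _ => return None,
    };
    match group {
        "data" if DATA_CMDS.contains(&sub) => Some(Command::Data(sub.to_string())),
        "vm" if VM_CMDS.contains(&sub) => Some(Command::Vm(sub.to_string())),
        "sys" if SYS_CMDS.contains(&sub) => Some(Command::Sys(sub.to_string())),
        _ => None,
    }
}

fn main() {
    // Skip the program name and dispatch on the remaining arguments.
    let args: Vec<String> = std::env::args().skip(1).collect();
    let refs: Vec<&str> = args.iter().map(String::as_str).collect();
    match parse(&refs) {
        Some(cmd) => println!("dispatching {cmd:?}"),
        None => eprintln!("usage: <data|vm|sys> <subcommand>"),
    }
}
```

In the actual layout, each enum variant would correspond to a module under cmd/ (data, vm, sys), with option parsing handled by clap rather than by hand.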
Development
All development commands use the project Makefile from the repository root.
Building
# Build CLI (release mode)
# Build and install locally
# Run CLI directly
Testing
# Run all tests
# Run only Rust tests
# Run tests with filter
Benchmarks
The CLI crate includes comprehensive benchmarks:
# Run all benchmarks
# Run specific benchmark category
# Compare against baseline
Benchmark categories:
- macro: End-to-end workflows (read throughput, sparse access, concurrency)
- micro: Component-level (cache, decompression, API comparison)
- ai: ML-specific (dataloader, shuffle, prefetch, multi-worker)
Linting & Formatting
# Format all code
# Check formatting + clippy
# Run clippy with strict lints
See make help for all available commands.
Features
The CLI supports compile-time feature flags:
- default: ["fuse", "server", "compression-zstd", "encryption", "diagnostics", "signing"]
- fuse: VM mounting and FUSE filesystem support
- server: HTTP server for snapshot streaming
- compression-zstd: Zstandard compression
- encryption: AES-256-GCM encryption
- diagnostics: Advanced analysis commands (diff, analyze)
- signing: Cryptographic signing for snapshots
- firecracker: Firecracker microVM support (experimental)
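As a general illustration of how Cargo feature flags gate code at compile time, here is a minimal sketch. The feature names match the list above, but the functions are hypothetical and do not reflect how hexz-cli is actually organized. In a real crate the features must also be declared in Cargo.toml; compiled on its own with no features enabled, this sketch takes the "disabled" paths.

```rust
/// Report whether VM support was compiled in (hypothetical helper).
fn vm_support() -> &'static str {
    // `cfg!` evaluates to a compile-time boolean.
    if cfg!(feature = "fuse") {
        "enabled"
    } else {
        "disabled (built without the fuse feature)"
    }
}

/// Only compiled when the `diagnostics` feature is on.
#[cfg(feature = "diagnostics")]
fn analyze() {
    println!("analyzing snapshot");
}

/// Fallback compiled when the `diagnostics` feature is off.
#[cfg(not(feature = "diagnostics"))]
fn analyze() {
    println!("analyze requires the diagnostics feature");
}

fn main() {
    println!("VM support: {}", vm_support());
    analyze();
}
```

Code behind a disabled feature is excluded from the binary entirely, so a build without a given feature also leaves out that feature's dependencies.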
Build without optional features:
# Minimal build (no VM support)
Performance
The CLI is optimized for high-throughput operations:
- Pack throughput: ~2GB/s (LZ4), ~500MB/s (Zstd)
- Deduplication: FastCDC with parallel processing
- Progress tracking: Real-time progress bars with indicatif
- Zero-copy: Direct memory mapping where possible
See Also
- User Documentation - Tutorials and how-to guides
- CLI Reference - Complete command documentation
- hexz-core - Core engine library
- Python Bindings - PyTorch integration
- Project README - Main project overview