Module write

Low-level write operations for Hexz snapshots.

This module provides the foundational building blocks for writing compressed, encrypted, and deduplicated blocks to snapshot files. These functions implement the core write semantics used by higher-level pack operations while remaining independent of the packing workflow.

§Module Purpose

The write operations module serves as the bridge between the high-level packing pipeline and the raw file I/O layer. It encapsulates the logic for:

  • Block Writing: Transform raw chunks into compressed, encrypted blocks
  • Deduplication: Detect and eliminate redundant blocks via content hashing
  • Zero Optimization: Handle sparse data efficiently without storage
  • Metadata Generation: Create BlockInfo descriptors for index building

§Design Philosophy

These functions are designed to be composable, stateless, and easily testable. They operate on raw byte buffers and writers without knowledge of the broader packing context (progress reporting, stream management, index organization).

This separation enables:

  • Unit testing of write logic in isolation
  • Reuse in different packing strategies (single-stream, multi-threaded, streaming)
  • Clear separation of concerns (write vs. orchestration)

§Write Operation Semantics

§Block Transformation Pipeline

Each block undergoes a multi-stage transformation before being written:

Raw Chunk (input)
     ↓
┌────────────────┐
│ Compression    │ → Compress using LZ4 or Zstd
└────────────────┘   (reduces size, increases CPU)
     ↓
┌────────────────┐
│ Encryption     │ → Optional AES-256-GCM with block_idx nonce
└────────────────┘   (confidentiality + integrity)
     ↓
┌────────────────┐
│ Checksum       │ → CRC32 of final data (fast integrity check)
└────────────────┘
     ↓
┌────────────────┐
│ Deduplication  │ → BLAKE3 hash lookup (skip write if duplicate)
└────────────────┘   (disabled for encrypted data)
     ↓
┌────────────────┐
│ Write          │ → Append to output file at current offset
└────────────────┘
     ↓
BlockInfo (metadata: offset, length, checksum)
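The stage order above can be sketched in plain Rust. This is a hedged, self-contained illustration, not the crate's actual code: an identity "compressor" stands in for LZ4/Zstd, a wrapping byte sum stands in for CRC32, `DefaultHasher` stands in for BLAKE3, encryption is omitted, and `BlockMeta`/`write_block_sketch` are hypothetical names mirroring `BlockInfo`/`write_block`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};
use std::io::Write;

/// Minimal metadata record, mirroring the role of `BlockInfo`.
#[derive(Debug, Clone, Copy, PartialEq)]
struct BlockMeta {
    offset: u64,
    length: u64,
    checksum: u32,
}

/// Sketch of the pipeline order: compress -> checksum -> dedup -> write.
fn write_block_sketch<W: Write>(
    chunk: &[u8],
    writer: &mut W,
    current_offset: &mut u64,
    dedup: &mut HashMap<u64, BlockMeta>,
) -> std::io::Result<BlockMeta> {
    // Stage 1: compression (identity placeholder for LZ4/Zstd).
    let compressed = chunk.to_vec();

    // Stage 2: checksum of the final bytes (wrapping sum as a CRC32 stand-in).
    let checksum = compressed
        .iter()
        .fold(0u32, |acc, &b| acc.wrapping_add(b as u32));

    // Stage 3: deduplication lookup; on a hit, reuse the existing metadata.
    let mut hasher = DefaultHasher::new();
    compressed.hash(&mut hasher);
    let digest = hasher.finish();
    if let Some(&meta) = dedup.get(&digest) {
        return Ok(meta); // duplicate: no physical write
    }

    // Stage 4: append at the current offset, then advance it.
    writer.write_all(&compressed)?;
    let meta = BlockMeta {
        offset: *current_offset,
        length: compressed.len() as u64,
        checksum,
    };
    *current_offset += meta.length;
    dedup.insert(digest, meta);
    Ok(meta)
}

fn main() -> std::io::Result<()> {
    let mut out = Vec::new();
    let mut offset = 512u64; // start past the header
    let mut dedup = HashMap::new();

    let a = write_block_sketch(b"hello", &mut out, &mut offset, &mut dedup)?;
    let b = write_block_sketch(b"hello", &mut out, &mut offset, &mut dedup)?;
    assert_eq!(a, b);         // duplicate reuses the first block's metadata
    assert_eq!(out.len(), 5); // only one physical copy was written
    Ok(())
}
```

Note how the dedup map is passed in by the caller, matching the externally maintained deduplication state described below.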

§Write Behavior and Atomicity

§Single Block Writes

Individual block writes via write_block are only as atomic as the underlying file system’s write guarantees allow:

  • Buffered writes: Data passes through OS page cache
  • No fsync: Writes are not flushed to disk until the writer is closed
  • Partial write handling: Writer’s write_all ensures complete writes or error
  • Crash behavior: Partial blocks may be written if process crashes mid-write

§Deduplication State

The deduplication map is maintained externally (by the caller). This design allows:

  • Flexibility: Caller controls when/if to enable deduplication
  • Memory control: Map lifetime and size managed by orchestration layer
  • Consistency: Map updates are immediately visible to subsequent writes

§Offset Management

The current_offset parameter is advanced only after each successful write. This ensures:

  • Sequential allocation: Blocks are laid out contiguously in file
  • No gaps: Every byte between header and master index is utilized
  • Predictable layout: Physical offset increases monotonically

§Block Allocation Strategy

Blocks are allocated sequentially in the order they are written:

File Layout:
┌──────────────┬──────────┬──────────┬──────────┬─────────────┐
│ Header (512B)│ Block 0  │ Block 1  │ Block 2  │ Index Pages │
└──────────────┴──────────┴──────────┴──────────┴─────────────┘
 ↑             ↑          ↑          ↑
 0             512        512+len0   512+len0+len1

current_offset advances after each write:
- Initial: 512 (after header)
- After Block 0: 512 + len0
- After Block 1: 512 + len0 + len1
- After Block 2: 512 + len0 + len1 + len2
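The offset arithmetic above is simple to verify directly. A minimal sketch, using the 512-byte header from the layout diagram and hypothetical block lengths:

```rust
fn main() {
    const HEADER: u64 = 512; // fixed header size from the file layout
    let block_lens = [4096u64, 1024, 2048]; // hypothetical stored lengths

    let mut current_offset = HEADER;
    let mut offsets = Vec::new();
    for len in block_lens {
        offsets.push(current_offset); // where this block starts
        current_offset += len;        // advance past it: no gaps, monotonic
    }

    assert_eq!(offsets, vec![512, 512 + 4096, 512 + 4096 + 1024]);
    assert_eq!(current_offset, 512 + 4096 + 1024 + 2048); // next free byte
}
```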

§Deduplication Impact

When deduplication detects a duplicate block:

  • No physical write: Block is not written to disk
  • Offset reuse: BlockInfo references the existing block’s offset
  • Space savings: Multiple logical blocks share one physical block
  • Transparency: Readers cannot distinguish between deduplicated and unique blocks

Example with deduplication:

Logical Blocks: [A, B, A, C, B]
Physical Blocks: [A, B, C]
                  ↑  ↑     ↑
                  │  │     └─ Block 3 (unique)
                  │  └─ Block 1 (unique)
                  └─ Block 0 (unique)

BlockInfo for logical block 2: offset = offset_of(A), length = len(A)
BlockInfo for logical block 4: offset = offset_of(B), length = len(B)

§Buffer Management

This module does not perform explicit buffer management. All buffers are:

  • Caller-allocated: Input chunks are provided by caller
  • Temporary allocations: Compression/encryption output is allocated, then consumed
  • No pooling: Each operation allocates fresh buffers, which are dropped (and returned to the allocator) after use

For high-performance scenarios, callers should consider:

  • Reusing chunk buffers across iterations
  • Using buffer pools for compression output (requires refactoring)
  • Batch writes to amortize allocation overhead
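The first suggestion, reusing a chunk buffer across iterations, can be sketched as follows. `Vec::clear` keeps the allocated capacity, so only the first iteration pays for allocation; `extend_from_slice` stands in for a compressor writing into the scratch buffer.

```rust
fn main() {
    let chunks: [&[u8]; 3] = [b"first", b"second", b"third"];
    // One scratch buffer, sized for a typical 64 KiB chunk, reused throughout.
    let mut scratch: Vec<u8> = Vec::with_capacity(64 * 1024);

    let mut total = 0usize;
    for chunk in chunks {
        scratch.clear();                  // drop contents, keep the allocation
        scratch.extend_from_slice(chunk); // "compress" into the scratch buffer
        total += scratch.len();
    }

    assert_eq!(total, 5 + 6 + 5);
    assert!(scratch.capacity() >= 64 * 1024); // capacity survived every iteration
}
```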

§Flush Behavior

Functions in this module do NOT flush data to disk. Flushing is the caller’s responsibility and typically occurs:

  • After writing all blocks and indices (in pack_snapshot)
  • Before closing the output file
  • Never during block writing (to maximize write batching)

This design allows the OS to batch writes for optimal I/O performance.
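The caller-driven flush boundary is easy to observe with a std BufWriter over an in-memory sink (a stand-in for the output file; durable on-disk flushing would additionally involve File::sync_all):

```rust
use std::io::{BufWriter, Write};

fn main() -> std::io::Result<()> {
    let sink: Vec<u8> = Vec::new();
    let mut writer = BufWriter::with_capacity(8192, sink);

    writer.write_all(b"block data")?;      // buffered; not yet in the sink
    assert_eq!(writer.get_ref().len(), 0); // underlying sink is still empty

    writer.flush()?;                        // caller decides when to flush
    assert_eq!(writer.get_ref().len(), 10); // now visible in the sink
    Ok(())
}
```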

§Error Handling and Recovery

§Error Categories

Write operations can fail for several reasons:

§I/O Errors

  • Disk full: No space for compressed block (ENOSPC)
  • Permission denied: Writer lacks write permission (EACCES)
  • Device error: Hardware failure, I/O timeout (EIO)

These surface as Error::Io wrapping the underlying std::io::Error.

§Compression Errors

  • Compression failure: Compressor returns error (rare, usually indicates bug)
  • Incompressible data: Not an error; stored with expansion

These surface as Error::Compression.

§Encryption Errors

  • Cipher initialization failure: Invalid state (should not occur in practice)
  • Encryption failure: Crypto operation fails (indicates library bug)

These surface as Error::Encryption.

§Error Recovery

Write operations provide no automatic recovery. On error:

  • Function returns immediately: No cleanup or rollback
  • File state undefined: Partial data may be written
  • Caller responsibility: Must handle error and clean up

Typical error handling pattern in pack operations:

match write_block_simple(...) {
    Ok(info) => {
        // Success: Add info to index, continue
    }
    Err(e) => {
        // Failure: Log error, delete partial output file, return error to caller
        std::fs::remove_file(output)?;
        return Err(e);
    }
}

§Partial Write Handling

The underlying Write::write_all method either writes the complete block or returns an error:

  • Success: Entire block written, offset updated
  • Failure: Partial write may occur, but error is returned
  • No retry: Caller must handle retries if desired
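These write_all semantics can be demonstrated with a writer that simulates running out of space after a few bytes (the CappedWriter type is hypothetical, for illustration only):

```rust
use std::io::{self, Write};

/// Writer that accepts at most `cap` bytes, then fails (simulated disk-full).
struct CappedWriter {
    buf: Vec<u8>,
    cap: usize,
}

impl Write for CappedWriter {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        let room = self.cap.saturating_sub(self.buf.len());
        if room == 0 {
            return Err(io::Error::new(io::ErrorKind::Other, "disk full"));
        }
        let n = room.min(data.len());
        self.buf.extend_from_slice(&data[..n]);
        Ok(n) // short write: write_all will loop and call write again
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

fn main() {
    let mut w = CappedWriter { buf: Vec::new(), cap: 4 };
    // write_all loops until done or the first error; here it errors mid-block.
    let result = w.write_all(b"abcdef");
    assert!(result.is_err());
    assert_eq!(w.buf, b"abcd"); // partial bytes remain in the destination
}
```

This is exactly the "failure" case above: the error is surfaced, but the bytes already accepted are not rolled back.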

§Performance Characteristics

§Write Throughput

Block write performance is dominated by compression:

  • LZ4: ~2 GB/s (minimal overhead)
  • Zstd level 3: ~200-500 MB/s (depends on data)
  • Encryption: ~1-2 GB/s (hardware AES-NI)
  • BLAKE3 hashing: ~3200 MB/s (for deduplication)

Typical bottleneck: Compression CPU time.

§Deduplication Overhead

BLAKE3 hashing adds ~5-10% overhead to write operations:

  • Hash computation: ~3200 MB/s throughput (BLAKE3 tree-hashed)
  • Hash table lookup: O(1) average, ~50-100 ns per lookup
  • Memory usage: ~48 bytes per unique block

For datasets with <10% duplication, deduplication overhead may exceed the space savings. Consider disabling deduplication for data known to be mostly unique.

§Zero Block Detection

is_zero_chunk uses SIMD-optimized comparison on modern CPUs:

  • Throughput: ~10-20 GB/s (memory bandwidth limited)
  • Overhead: Negligible (~5-10 cycles per 64-byte cache line)

Zero detection is always worth enabling for sparse data.
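A plausible shape for is_zero_chunk is a straight byte scan, which the compiler auto-vectorizes on modern targets; the crate's actual implementation may differ, so treat this as a sketch:

```rust
/// Returns true if every byte in the chunk is zero.
fn is_zero_chunk(chunk: &[u8]) -> bool {
    chunk.iter().all(|&b| b == 0)
}

fn main() {
    assert!(is_zero_chunk(&[0u8; 64 * 1024])); // all-zero chunk: nothing to store

    let mut chunk = vec![0u8; 64 * 1024];
    chunk[64 * 1024 - 1] = 1;
    assert!(!is_zero_chunk(&chunk)); // a single nonzero byte disqualifies it
}
```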

§Memory Usage

Per-block memory allocation:

  • Input chunk: Caller-provided (typically 64 KiB)
  • Compression output: ~1.5× chunk size worst case (incompressible data)
  • Encryption output: compression_size + 28 bytes (AES-GCM overhead)
  • Dedup hash: 32 bytes (BLAKE3 digest)

Total temporary allocation per write: ~100-150 KiB (released immediately after write).

§Examples

See individual function documentation for usage examples.

§Future Enhancements

Potential improvements to write operations:

  • Buffer pooling: Reuse compression/encryption buffers to reduce allocation overhead
  • Async I/O: Use tokio or io_uring for overlapped writes
  • Parallel writes: Write multiple blocks concurrently (requires coordination)
  • Write-ahead logging: Enable atomic commits for crash safety

Functions§

create_zero_block
Creates a zero-block descriptor without writing data to disk.
is_zero_chunk
Checks if a chunk consists entirely of zero bytes.
write_block
Writes a compressed and optionally encrypted block to the output stream.