pub fn create_zero_block(logical_len: u32) -> BlockInfo
Creates a zero-block descriptor without writing data to disk.
Zero blocks (all-zero chunks) are a special case optimized for space efficiency. Instead of compressing and storing zeros, we create a metadata-only descriptor that signals to the reader to return zeros without performing any I/O.
§Sparse Data Optimization
Many VM disk images and memory dumps contain large regions of zeros:
- Unallocated disk space: File systems often zero-initialize blocks
- Memory pages: Unused or zero-initialized memory
- Sparse files: Holes in sparse file systems
Storing these zeros (even compressed) wastes space:
- LZ4-compressed zeros: ~100 bytes per 64 KiB block (~0.15% of original)
- Uncompressed zeros: 64 KiB per block (100%)
- Metadata-only: 20 bytes per block (~0.03%)
The metadata approach saves 99.97% of space for zero blocks relative to storing them uncompressed.
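The percentages above follow from simple ratios. A minimal sketch that reproduces them, assuming a 64 KiB block, the ~100-byte LZ4 estimate, and the 20-byte descriptor quoted above. Both helper names are illustrative and not part of hexz_core:

```rust
/// Storage cost as a percentage of the original block size.
/// Illustrative helper, not a hexz_core API.
fn overhead_percent(stored_bytes: u64, block_bytes: u64) -> f64 {
    stored_bytes as f64 / block_bytes as f64 * 100.0
}

/// Percentage of space saved relative to a baseline cost.
/// Illustrative helper, not a hexz_core API.
fn savings_percent(stored_bytes: u64, baseline_bytes: u64) -> f64 {
    (1.0 - stored_bytes as f64 / baseline_bytes as f64) * 100.0
}
```

With these inputs, `overhead_percent(20, 65536)` is about 0.03, `savings_percent(20, 65536)` is about 99.97, and `savings_percent(20, 100)` is about 80. The 99.97% figure is therefore measured against uncompressed zeros, not against the compressed form.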
§Descriptor Format
Zero blocks are identified by a special BlockInfo signature:
- offset = 0: Invalid physical offset (data region starts at ≥512)
- length = 0: No physical storage
- logical_len = N: Original zero block size in bytes
- checksum = 0: No checksum needed (zeros are deterministic)
Readers recognize this pattern and synthesize zeros without I/O.
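A reader-side check for this signature might look like the following sketch. The `BlockInfo` struct here is a stand-in with the four fields named above; the field types are assumptions chosen to add up to the 20-byte figure, and `is_zero_block` is a hypothetical helper, not a documented hexz_core function.

```rust
/// Stand-in for hexz_core's BlockInfo. Field names come from the
/// documentation; the field types are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlockInfo {
    pub offset: u64,      // physical offset of the stored block
    pub length: u32,      // stored size in bytes
    pub logical_len: u32, // original (uncompressed) size in bytes
    pub checksum: u32,    // checksum of the stored data
}

/// Hypothetical helper: true when the descriptor carries the zero-block
/// signature (no physical location and no stored bytes).
pub fn is_zero_block(info: &BlockInfo) -> bool {
    info.offset == 0 && info.length == 0
}
```

The check relies on offset 0 never being a valid data location, which holds because the data region starts at offset 512 or later.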
§Parameters
- logical_len: Size of the zero block in bytes
  - Typically matches block_size (e.g., 65536 for 64 KiB blocks)
  - Can vary with content-defined chunking
  - Must be > 0 (zero-length blocks are invalid)
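Because `logical_len` is a `u32` and must be non-zero, a caller converting from `usize` may want to check both conditions instead of casting with `as u32`, which truncates silently on overflow. A minimal sketch; `checked_logical_len` is a hypothetical helper, not part of hexz_core:

```rust
use std::convert::TryFrom;

/// Hypothetical helper: converts a chunk length to a valid `logical_len`.
/// Returns `None` for empty chunks and for lengths that do not fit in a u32.
fn checked_logical_len(chunk_len: usize) -> Option<u32> {
    match u32::try_from(chunk_len) {
        Ok(0) => None,        // zero-length blocks are invalid
        Ok(len) => Some(len), // fits and is non-zero
        Err(_) => None,       // `as u32` would have truncated this value
    }
}
```

A caller could then pass the returned value to `create_zero_block` and treat `None` as a reason to reject or split the chunk.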
§Returns
BlockInfo descriptor with zero-block semantics:
- offset = 0
- length = 0
- logical_len = logical_len
- checksum = 0
§Examples
§Detecting and Creating Zero Blocks
use hexz_core::ops::write::{is_zero_chunk, create_zero_block};
use hexz_core::format::index::BlockInfo;

let chunk = vec![0u8; 65536]; // 64 KiB of zeros
if is_zero_chunk(&chunk) {
    let info = create_zero_block(chunk.len() as u32);
    assert_eq!(info.offset, 0);
    assert_eq!(info.length, 0);
    assert_eq!(info.logical_len, 65536);
    println!("Zero block: No storage required!");
}
§Usage in Packing Loop
for (idx, chunk) in chunks.iter().enumerate() {
    let info = if is_zero_chunk(chunk) {
        // Optimize: No compression, no write
        create_zero_block(chunk.len() as u32)
    } else {
        // Normal path: Compress and write
        write_block(
            &mut out,
            chunk,
            idx as u64,
            &mut offset,
            None::<&mut StandardHashTable>,
            &compressor,
            None,
            &hasher,
            &mut hash_buf,
            &mut compress_buf,
            &mut encrypt_buf,
        )?
    };
    // Add info to index page...
}
§Performance
- Time complexity: O(1) (no I/O, no computation)
- Space complexity: O(1) (fixed-size struct)
- Typical savings: 99.97% vs. uncompressed zeros (about 80% vs. LZ4-compressed zeros, using the 20-byte and ~100-byte figures above)
§Reader Behavior
When a reader encounters a zero block (offset=0, length=0):
- Recognize zero-block pattern from metadata
- Allocate buffer of size logical_len
- Fill buffer with zeros (optimized memset)
- Return buffer to caller
No decompression, decryption, or checksum verification is performed.
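The reader steps above can be sketched as follows. This is an illustration under assumptions: the `BlockInfo` stand-in mirrors the documented fields with assumed types, `read_block` is a hypothetical name, and the non-zero path is left as a caller-supplied closure because the real decode pipeline is not shown here.

```rust
/// Stand-in for hexz_core's BlockInfo; field types are assumptions.
#[derive(Debug, Clone, Copy)]
pub struct BlockInfo {
    pub offset: u64,
    pub length: u32,
    pub logical_len: u32,
    pub checksum: u32,
}

/// Hypothetical read path. `read_stored` stands in for the normal
/// read/decrypt/decompress/verify pipeline and is only invoked for
/// blocks that have physical storage.
pub fn read_block<F>(info: &BlockInfo, read_stored: F) -> std::io::Result<Vec<u8>>
where
    F: FnOnce(&BlockInfo) -> std::io::Result<Vec<u8>>,
{
    if info.offset == 0 && info.length == 0 {
        // Zero block: synthesize the data from metadata alone.
        // `vec![0u8; n]` typically maps to a zeroed allocation,
        // so no explicit fill loop is needed.
        return Ok(vec![0u8; info.logical_len as usize]);
    }
    read_stored(info)
}
```

Keeping the zero-block test ahead of any I/O means the closure is never called for zero blocks, which matches the "no decompression, decryption, or checksum verification" behavior described above.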
§Interaction with Deduplication
Zero blocks do not participate in deduplication:
- They are never written to disk → no physical offset → no dedup entry
- Each zero block gets its own metadata descriptor
- This is fine: Metadata is cheap (20 bytes), and all zero blocks have the same content
§Interaction with Encryption
Zero blocks work correctly with encryption:
- They are detected before compression/encryption
- Encrypted snapshots still use zero-block optimization
- Readers synthesize zeros without decryption
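This is safe in the sense that the contents of an all-zero block carry no secret, so no confidentiality is lost by skipping encryption. The descriptor does reveal which logical ranges are zero to anyone who can read the index.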
§Validation
IMPORTANT: This function does NOT validate that the original chunk was actually
all zeros. The caller is responsible for calling is_zero_chunk first.
If a non-zero chunk is incorrectly marked as a zero block, readers will return zeros instead of the original data (silent data corruption).
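One way to make the misuse described above harder is to pair detection and construction in a single call. The following is a sketch, not hexz_core's implementation: `BlockInfo` is a stand-in with assumed field types, `all_zero` is an illustrative substitute for `is_zero_chunk`, and `zero_block_for` is a hypothetical wrapper.

```rust
use std::convert::TryFrom;

/// Stand-in for hexz_core's BlockInfo; field types are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct BlockInfo {
    pub offset: u64,
    pub length: u32,
    pub logical_len: u32,
    pub checksum: u32,
}

/// Illustrative substitute for `is_zero_chunk`: a plain byte scan.
fn all_zero(chunk: &[u8]) -> bool {
    chunk.iter().all(|&b| b == 0)
}

/// Hypothetical wrapper: returns a zero-block descriptor only when the
/// chunk is non-empty, its length fits in a u32, and every byte is zero.
fn zero_block_for(chunk: &[u8]) -> Option<BlockInfo> {
    let logical_len = u32::try_from(chunk.len()).ok()?;
    if logical_len == 0 || !all_zero(chunk) {
        return None;
    }
    Some(BlockInfo { offset: 0, length: 0, logical_len, checksum: 0 })
}
```

A packing loop built on such a wrapper would fall through to the normal compress-and-write path whenever it returns `None`, so a descriptor can never be created for a chunk that was not checked.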