pub fn is_zero_chunk(chunk: &[u8]) -> bool
Checks if a chunk consists entirely of zero bytes.
This function efficiently detects all-zero chunks to enable sparse block optimization. Zero chunks are common in VM images (unallocated space), memory dumps (zero-initialized pages), and sparse files.
§Algorithm
Uses Rust’s iterator `all()` combinator, which:
- Short-circuits on the first non-zero byte (early exit)
- Compiles to SIMD instructions on modern CPUs (autovectorization)
- Typically processes 16-64 bytes per instruction (SSE2: 16, AVX2: 32, AVX-512: 64)
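A minimal sketch of such an implementation (presumably equivalent to the crate's body, though not copied from its source):

```rust
/// Sketch: `all()` short-circuits on the first non-zero byte and
/// autovectorizes well under optimization.
pub fn is_zero_chunk(chunk: &[u8]) -> bool {
    chunk.iter().all(|&b| b == 0)
}
```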
§Parameters
- `chunk`: Byte slice to check
  - Empty slices return `true` (vacuous truth)
  - Typical size: 16 KiB - 256 KiB (configurable block size)
§Returns
- `true`: All bytes are zero (sparse block, use `create_zero_block`)
- `false`: At least one non-zero byte (normal block, compress and write)
§Performance
Modern CPUs with SIMD support achieve excellent throughput:
- SIMD-optimized: ~10-20 GB/s (memory bandwidth limited)
- Scalar fallback: ~1-2 GB/s (without SIMD)
- Typical overhead: <1% of total packing time
The check is always worth performing given the massive space savings for zero blocks.
§Examples
§Basic Usage
```rust
use hexz_core::ops::write::is_zero_chunk;

let zeros = vec![0u8; 65536];
assert!(is_zero_chunk(&zeros));

let data = vec![0u8, 1u8, 0u8];
assert!(!is_zero_chunk(&data));

let empty: &[u8] = &[];
assert!(is_zero_chunk(empty)); // Empty is considered "all zeros"
```
§Packing Loop Integration
```rust
for (idx, chunk) in chunks.iter().enumerate() {
    let info = if is_zero_chunk(chunk) {
        // Fast path: No compression, no write, just metadata
        create_zero_block(chunk.len() as u32)
    } else {
        // Slow path: Compress, write, create metadata
        write_block(
            &mut out,
            chunk,
            idx as u64,
            &mut offset,
            None::<&mut StandardHashTable>,
            &compressor,
            None,
            &hasher,
            &mut hash_buf,
            &mut compress_buf,
            &mut encrypt_buf,
        )?
    };
    index_blocks.push(info);
}
```
§Benchmarking Zero Detection
```rust
use hexz_core::ops::write::is_zero_chunk;
use std::hint::black_box;
use std::time::Instant;

let chunk = vec![0u8; 64 * 1024 * 1024]; // 64 MiB

let start = Instant::now();
for _ in 0..100 {
    // black_box keeps the optimizer from eliding the pure call
    black_box(is_zero_chunk(black_box(&chunk)));
}
let elapsed = start.elapsed();

let throughput = (64.0 * 100.0) / elapsed.as_secs_f64(); // MiB/s
println!("Zero detection: {:.1} GiB/s", throughput / 1024.0);
```
§SIMD Optimization
On x86-64 with AVX2, the compiler typically generates code like:
```asm
        vpxor     ymm0, ymm0, ymm0   ; Zero register
loop:
        vmovdqu   ymm1, [rsi]        ; Load 32 bytes
        vpcmpeqb  ymm2, ymm1, ymm0   ; Compare each byte with zero
        vpmovmskb eax, ymm2          ; Extract comparison mask
        cmp       eax, 0xFFFFFFFF    ; All 32 bytes zero?
        jne       found_nonzero      ; Early exit if not
        add       rsi, 32            ; Advance pointer
        jmp       loop
```
This processes 32 bytes per iteration (~1-2 cycles on modern CPUs).
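For comparison, here is a hypothetical explicit-SIMD variant written with `std::arch` intrinsics, roughly mirroring the assembly above (a sketch only; the crate relies on autovectorization instead, and `is_zero_chunk_avx` is an illustrative name):

```rust
#[cfg(target_arch = "x86_64")]
#[target_feature(enable = "avx")]
unsafe fn is_zero_chunk_avx(chunk: &[u8]) -> bool {
    use std::arch::x86_64::*;
    let mut i = 0;
    // Scan 32-byte lanes; vptest sets ZF when (v & v) == 0.
    while i + 32 <= chunk.len() {
        // SAFETY: i + 32 <= len, so the unaligned 32-byte load is in bounds.
        let v = unsafe { _mm256_loadu_si256(chunk.as_ptr().add(i) as *const __m256i) };
        // _mm256_testz_si256 returns 1 iff (v & v) == 0, i.e. all bytes zero.
        if unsafe { _mm256_testz_si256(v, v) } == 0 {
            return false; // Non-zero byte somewhere in this lane
        }
        i += 32;
    }
    // Scalar tail for the last < 32 bytes
    chunk[i..].iter().all(|&b| b == 0)
}
```

A caller would gate this behind `is_x86_feature_detected!("avx")` and fall back to the portable version otherwise.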
§Edge Cases
- Empty chunks: Return `true` (vacuous truth, no non-zero bytes)
- Single byte: Works correctly, no special handling needed
- Unaligned chunks: SIMD code handles unaligned loads transparently
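These behaviors are easy to verify directly:

```rust
use hexz_core::ops::write::is_zero_chunk;

assert!(is_zero_chunk(&[]));    // empty: vacuously true
assert!(is_zero_chunk(&[0]));   // single zero byte
assert!(!is_zero_chunk(&[7]));  // single non-zero byte

// An unaligned sub-slice of an aligned buffer works the same.
let buf = vec![0u8; 4096];
assert!(is_zero_chunk(&buf[1..4095]));
```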
§Alternative Implementations
Other possible implementations (not currently used):
- Manual SIMD: Use `std::arch` for explicit SIMD (faster but less portable; cf. the sketch in the SIMD section above)
- Chunked comparison: Process in 8-byte chunks with `u64` casts (faster scalar; sketched below)
- Bitmap scan: Use the CPU’s `bsf`/`tzcnt` to skip zero regions (complex)
Current implementation relies on compiler autovectorization, which works well in practice and maintains portability.
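To make the chunked-comparison option concrete, here is a hypothetical sketch (`is_zero_chunk_u64` is an illustrative name, not part of the crate):

```rust
/// Hypothetical scalar alternative: check 8 bytes at a time via u64 views.
fn is_zero_chunk_u64(chunk: &[u8]) -> bool {
    // SAFETY: every byte pattern is a valid u64, so reinterpreting the
    // aligned middle of the slice is sound.
    let (head, middle, tail) = unsafe { chunk.align_to::<u64>() };
    head.iter().all(|&b| b == 0)
        && middle.iter().all(|&w| w == 0)
        && tail.iter().all(|&b| b == 0)
}
```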
§Correctness
This function is pure and infallible:
- No side effects (read-only operation)
- No panics (iterator `all()` is safe for all inputs)
- No undefined behavior (all byte patterns are valid)
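One cheap way to gain confidence in these properties is to compare against a naive reference loop (a test sketch under that assumption, not taken from the crate's test suite):

```rust
use hexz_core::ops::write::is_zero_chunk;

/// Reference implementation for differential testing.
fn naive_is_zero(chunk: &[u8]) -> bool {
    for &b in chunk {
        if b != 0 {
            return false;
        }
    }
    true
}

#[test]
fn zero_check_matches_naive_loop() {
    let cases: Vec<Vec<u8>> = vec![
        vec![],
        vec![0],
        vec![1],
        vec![0; 4097],
        {
            let mut v = vec![0u8; 4097];
            v[4096] = 1; // non-zero byte in the final position
            v
        },
    ];
    for case in &cases {
        assert_eq!(is_zero_chunk(case), naive_is_zero(case));
    }
}
```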