Module prefetch

Background and anticipatory prefetch logic.

Implements a fixed-window heuristic for reading future blocks ahead of demand, hiding storage latency for sequential I/O workloads.

This module implements a simple yet effective prefetching strategy that reduces read latency for sequential workloads by speculatively loading future blocks into the cache before they are explicitly requested. The prefetcher operates on a fixed-window readahead model, making it predictable and suitable for streaming workloads such as virtual machine disk images, database scans, and media files.

§Architecture

The prefetcher is intentionally minimalist: it maintains a single atomic window size parameter and computes prefetch targets as a contiguous range of block indices following the current access. There is no pattern detection, adaptive resizing, or history tracking—this simplicity ensures low overhead and predictable behavior.

Key design decisions:

  • Fixed window size: The prefetch distance is constant (typically 4-16 blocks) and does not adjust based on observed access patterns
  • No sequential detection: The prefetcher assumes sequential access; the caller (typically File::read_at_into_uninit) is responsible for deciding when to trigger prefetching based on observed workload patterns
  • Atomic configuration: Window size can be updated at runtime without locks, enabling dynamic tuning in response to workload changes
  • Stateless operation: Each call to get_prefetch_targets is independent, with no persistent state about past accesses
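
As a rough illustration, the core of this design fits in a few lines. The sketch below mirrors the documented names (Prefetcher, get_prefetch_targets), but the bodies are assumptions about the implementation, not the crate's actual source:

use std::sync::atomic::{AtomicU32, Ordering};

pub struct Prefetcher {
    // The only state: the readahead window, updatable without locks.
    window_size: AtomicU32,
}

impl Prefetcher {
    pub fn new(window_size: u32) -> Self {
        Self { window_size: AtomicU32::new(window_size) }
    }

    // Stateless: returns the contiguous block indices that follow
    // `block_index`, independent of any past accesses.
    pub fn get_prefetch_targets(&self, block_index: u64) -> Vec<u64> {
        let window = self.window_size.load(Ordering::Relaxed) as u64;
        (block_index + 1..=block_index + window).collect()
    }
}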

§Performance Characteristics

§When Prefetching Helps

Prefetching is highly effective for:

  • Sequential reads: Reading large files or VM disk images linearly
  • Streaming workloads: Media playback, log file processing, database scans
  • High-latency backends: S3, HTTP, or network-attached storage where I/O latency dominates (>10ms per block)
  • Large block sizes: With 64 KiB or larger blocks, prefetch overhead is amortized over substantial data volumes

Measured benefits (sequential reads on S3 backend, 64 KiB blocks):

  • 4-block prefetch: ~40% latency reduction (30ms → 18ms per read)
  • 8-block prefetch: ~60% latency reduction (30ms → 12ms per read)
  • 16-block prefetch: ~70% latency reduction (30ms → 9ms per read)

§When Prefetching Hurts

Prefetching imposes costs and should be avoided for:

  • Random access patterns: Database index lookups, sparse file reads, or non-sequential I/O waste bandwidth and cache capacity on unneeded blocks
  • Low-latency backends: Local SSD or NVMe storage (< 1ms latency) where prefetch overhead exceeds the latency benefit
  • Small block sizes: With 4 KiB blocks, prefetch overhead (thread spawning, cache contention) can exceed the data transfer time
  • Memory-constrained systems: Prefetched blocks evict useful cached data, potentially reducing overall cache hit rate

Measured overheads (random reads on local SSD, 64 KiB blocks):

  • 4-block prefetch: ~15% throughput loss (wasted reads evict useful cache entries)
  • 8-block prefetch: ~30% throughput loss
  • Recommendation: Disable prefetching (window_size = 0) for random workloads

§Resource Usage

§Memory Overhead

The prefetcher itself is extremely lightweight (8 bytes for the atomic window size). However, prefetched blocks consume cache capacity:

  • Per-block overhead: Typically block_size + ~100 bytes for cache metadata
  • Worst-case prefetch buffer: window_size × block_size bytes
  • Example: 8-block window with 64 KiB blocks = 512 KiB prefetch buffer

§CPU Overhead

  • Per-access cost: O(window_size) to compute prefetch targets (typically < 1 µs)
  • Background I/O: Prefetch requests spawn lightweight threads or async tasks (implementation-dependent, managed by File)

§Bandwidth Overhead

  • Sequential access: Near-zero waste (prefetched blocks are used)
  • Random access: Up to 100% waste (prefetched blocks are never accessed)
  • Mixed workloads: Proportional to sequential vs. random ratio

§Configuration Guidelines

§Choosing Window Size

Backend Type        Latency     Recommended Window   Rationale
Local SSD/NVMe      < 1ms       0-2 blocks           Prefetch overhead exceeds benefit
Local HDD           5-10ms      2-4 blocks           Moderate latency hiding
S3/Cloud Storage    20-100ms    8-16 blocks          High latency hiding critical
HTTP/Remote         50-200ms    16-32 blocks         Maximum latency hiding needed
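
In code, these defaults might look like the following. BackendKind and the specific value chosen from each recommended range are illustrative, not part of the crate:

// Hypothetical mapping from backend type to a default window.
enum BackendKind { LocalSsd, LocalHdd, CloudStorage, HttpRemote }

fn default_window(kind: BackendKind) -> u32 {
    match kind {
        BackendKind::LocalSsd => 0,      // prefetch overhead exceeds benefit
        BackendKind::LocalHdd => 4,      // moderate latency hiding
        BackendKind::CloudStorage => 16, // high latency hiding critical
        BackendKind::HttpRemote => 32,   // maximum latency hiding needed
    }
}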

§Tuning Heuristics

A reasonable starting formula:

window_size = ceil(backend_latency_ms / (block_size_kb / throughput_mbps))

Example: S3 backend with 50ms latency, 64 KiB blocks, 100 MB/s throughput:

window_size = ceil(50 / (64 / 100)) = ceil(78.125) = 79 blocks

(In practice, 16-32 blocks is sufficient due to concurrent I/O)
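
The formula is straightforward to encode as a helper; suggested_window below is a hypothetical convenience function, not part of this module:

// 1 MB/s transfers 1 KB per ms, so block_size_kb / throughput_mbps is
// the per-block transfer time in milliseconds.
fn suggested_window(backend_latency_ms: f64, block_size_kb: f64, throughput_mbps: f64) -> u32 {
    let transfer_ms = block_size_kb / throughput_mbps;
    (backend_latency_ms / transfer_ms).ceil() as u32
}

// S3 example from above: ceil(50.0 / 0.64) = 79, capped to reflect the
// practical 16-32 block guidance.
let window = suggested_window(50.0, 64.0, 100.0).min(32);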

§Disabling Prefetch

To disable prefetching entirely:

  • Construct the prefetcher with a zero window: Prefetcher::new(0)
  • Or pass prefetch_window_size = None to File::with_options()
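
With a zero window, get_prefetch_targets should yield no targets, so no speculative reads are issued. A minimal check, assuming that behavior:

use hexz_core::cache::prefetch::Prefetcher;

let prefetcher = Prefetcher::new(0);
assert!(prefetcher.get_prefetch_targets(100).is_empty());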

§Examples

§Basic Usage

use hexz_core::cache::prefetch::Prefetcher;

// Configure 8-block readahead for streaming workloads
let prefetcher = Prefetcher::new(8);

// Application reads block 100
let targets = prefetcher.get_prefetch_targets(100);
assert_eq!(targets, vec![101, 102, 103, 104, 105, 106, 107, 108]);

// Background task: fetch blocks 101-108 into cache
// (actual prefetch execution is handled by File)

§Adaptive Tuning

use hexz_core::cache::prefetch::Prefetcher;
use std::sync::Arc;

let prefetcher = Arc::new(Prefetcher::new(4));

// Detect high-latency backend (e.g., S3) and increase window
// (Note: window_size update method would need to be added)
// prefetcher.set_window_size(16);

// Detect random access pattern and disable prefetching
// prefetcher.set_window_size(0);

§Integration with File

use std::sync::Arc;

// File, FileBackend, and Lz4Compressor are assumed to be in scope
// from hexz_core (exact import paths omitted here).
let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
let compressor = Box::new(Lz4Compressor::new());

// Enable prefetching at File creation
let snapshot = File::with_cache(
    backend,
    compressor,
    None, // encryptor
    Some(512 * 1024 * 1024), // cache_capacity_bytes (512 MiB)
    Some(8), // prefetch_window_size (8 blocks)
)?;

// Prefetching happens automatically during sequential reads
// (triggered internally by File::read_at_into_uninit)

§Implementation Notes

§Why No Pattern Detection?

This implementation deliberately omits sequential pattern detection (unlike Linux kernel readahead or database buffer managers) for several reasons:

  • Caller context: File has full visibility into access patterns and can make better decisions about when to prefetch
  • Simplicity: No history tracking, no stride detection, no state machine overhead
  • Predictability: Fixed behavior is easier to reason about and debug
  • Composability: Higher-level policies (in File or applications) can layer sophisticated heuristics atop this simple primitive, as sketched below
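
For instance, a caller could layer a minimal sequential-run detector on top of the stateless primitive. SequentialGate is purely hypothetical and only illustrates where such a policy would live:

// Hypothetical caller-side policy: trigger readahead only once two
// consecutive block accesses look sequential.
struct SequentialGate {
    last_block: Option<u64>,
}

impl SequentialGate {
    fn should_prefetch(&mut self, block: u64) -> bool {
        let sequential = block > 0 && self.last_block == Some(block - 1);
        self.last_block = Some(block);
        sequential
    }
}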

§Thread Safety

The prefetcher is fully thread-safe via atomic operations. Multiple threads can concurrently call get_prefetch_targets without contention. If runtime window size adjustment is needed, AtomicU32 supports lock-free updates via store(Ordering::Relaxed).
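
A small sketch of concurrent use, assuming the Prefetcher from the examples above:

use hexz_core::cache::prefetch::Prefetcher;
use std::sync::Arc;
use std::thread;

let prefetcher = Arc::new(Prefetcher::new(8));
let handles: Vec<_> = (0..4u64)
    .map(|i| {
        let p = Arc::clone(&prefetcher);
        // Each thread computes its own targets; no locks are taken.
        thread::spawn(move || p.get_prefetch_targets(i * 100))
    })
    .collect();
for handle in handles {
    handle.join().unwrap();
}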

§Future Extensions

Potential enhancements (not currently implemented):

  • Adaptive window sizing: Adjust window based on cache hit rate or latency
  • Stride detection: Support strided access patterns (e.g., reading every Nth block)
  • Multi-stream prefetch: Prefetch from multiple streams (Disk + Memory) simultaneously
  • Priority hints: Mark prefetch requests as low-priority to avoid evicting hot data

§Structs

Prefetcher
Thread-safe prefetch manager with configurable lookahead window.