Background and anticipatory prefetch logic.
Implements heuristics for reading future blocks ahead of demand, hiding storage latency for sequential and patterned workloads.
This module implements a simple yet effective prefetching strategy that reduces read latency for sequential workloads by speculatively loading future blocks into the cache before they are explicitly requested. The prefetcher operates on a fixed-window readahead model, making it predictable and suitable for streaming workloads such as virtual machine disk images, database scans, and media files.
§Architecture
The prefetcher is intentionally minimalist: it maintains a single atomic window size parameter and computes prefetch targets as a contiguous range of block indices following the current access. There is no pattern detection, adaptive resizing, or history tracking—this simplicity ensures low overhead and predictable behavior.
Key design decisions:
- Fixed window size: The prefetch distance is constant (typically 4-16 blocks) and does not adjust based on observed access patterns
- No sequential detection: The prefetcher assumes sequential access; the caller (typically `File::read_at_into_uninit`) is responsible for deciding when to trigger prefetching based on observed workload patterns
- Atomic configuration: Window size can be updated at runtime without locks, enabling dynamic tuning in response to workload changes
- Stateless operation: Each call to `get_prefetch_targets` is independent, with no persistent state about past accesses (see the sketch below)
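A minimal sketch of this design, assuming the `AtomicU32` window field mentioned under Thread Safety below (names and types are illustrative, not the definitive implementation):

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Sketch of the fixed-window prefetcher described above.
pub struct Prefetcher {
    /// Number of blocks to read ahead; 0 disables prefetching.
    window_size: AtomicU32,
}

impl Prefetcher {
    pub fn new(window_size: u32) -> Self {
        Self { window_size: AtomicU32::new(window_size) }
    }

    /// Computes the contiguous range of block indices following `block`.
    /// Stateless: the result depends only on the argument and the
    /// current window size.
    pub fn get_prefetch_targets(&self, block: u64) -> Vec<u64> {
        let window = self.window_size.load(Ordering::Relaxed) as u64;
        (block + 1..=block + window).collect()
    }
}
```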
§Performance Characteristics
§When Prefetching Helps
Prefetching is highly effective for:
- Sequential reads: Reading large files or VM disk images linearly
- Streaming workloads: Media playback, log file processing, database scans
- High-latency backends: S3, HTTP, or network-attached storage where I/O latency dominates (>10ms per block)
- Large block sizes: With 64 KiB or larger blocks, prefetch overhead is amortized over substantial data volumes
Measured benefits (sequential reads on S3 backend, 64 KiB blocks):
- 4-block prefetch: ~40% latency reduction (30ms → 18ms per read)
- 8-block prefetch: ~60% latency reduction (30ms → 12ms per read)
- 16-block prefetch: ~70% latency reduction (30ms → 9ms per read)
§When Prefetching Hurts
Prefetching imposes costs and should be avoided for:
- Random access patterns: Database index lookups, sparse file reads, or non-sequential I/O waste bandwidth and cache capacity on unneeded blocks
- Low-latency backends: Local SSD or NVMe storage (< 1ms latency) where prefetch overhead exceeds the latency benefit
- Small block sizes: With 4 KiB blocks, prefetch overhead (thread spawning, cache contention) can exceed the data transfer time
- Memory-constrained systems: Prefetched blocks evict useful cached data, potentially reducing overall cache hit rate
Measured overheads (random reads on local SSD, 64 KiB blocks):
- 4-block prefetch: ~15% throughput loss (wasted reads evict useful cache entries)
- 8-block prefetch: ~30% throughput loss
- Recommendation: Disable prefetching (window_size = 0) for random workloads
§Resource Usage
§Memory Overhead
The prefetcher itself is extremely lightweight (a single atomic integer for the window size). However, prefetched blocks consume cache capacity:
- Per-block overhead: Typically `block_size` + ~100 bytes for cache metadata
- Worst-case prefetch buffer: `window_size × block_size` bytes
- Example: 8-block window with 64 KiB blocks = 512 KiB prefetch buffer
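A quick check of the example arithmetic:

```rust
// Worst-case prefetch buffer for an 8-block window of 64 KiB blocks.
const WINDOW_SIZE: usize = 8;
const BLOCK_SIZE: usize = 64 * 1024; // 64 KiB
assert_eq!(WINDOW_SIZE * BLOCK_SIZE, 512 * 1024); // 512 KiB
```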
§CPU Overhead
- Per-access cost: O(window_size) to compute prefetch targets (typically < 1 µs)
- Background I/O: Prefetch requests spawn lightweight threads or async tasks
(implementation-dependent, managed by
File)
§Bandwidth Overhead
- Sequential access: Near-zero waste (prefetched blocks are used)
- Random access: Up to 100% waste (prefetched blocks are never accessed)
- Mixed workloads: Proportional to sequential vs. random ratio
§Configuration Guidelines
§Choosing Window Size
| Backend Type | Latency | Recommended Window | Rationale |
|---|---|---|---|
| Local SSD/NVMe | < 1ms | 0-2 blocks | Prefetch overhead exceeds benefit |
| Local HDD | 5-10ms | 2-4 blocks | Moderate latency hiding |
| S3/Cloud Storage | 20-100ms | 8-16 blocks | High latency hiding critical |
| HTTP/Remote | 50-200ms | 16-32 blocks | Maximum latency hiding needed |
§Tuning Heuristics
A reasonable starting formula:
```text
window_size = ceil(backend_latency_ms / (block_size_kb / throughput_mbps))
```

Example: an S3 backend with 50 ms latency, 64 KiB blocks, and 100 MB/s throughput:

```text
window_size = ceil(50 / (64 / 100)) = ceil(78.125) = 79 blocks
```

(In practice, 16-32 blocks is sufficient because concurrent I/O overlaps transfers.)
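This heuristic could be codified as a small helper (a hypothetical sketch, not part of the module's API):

```rust
/// Hypothetical helper implementing the starting formula above.
fn suggest_window_size(backend_latency_ms: f64, block_size_kb: f64, throughput_mbps: f64) -> u32 {
    // Approximate time to transfer one block, in milliseconds.
    let block_transfer_ms = block_size_kb / throughput_mbps;
    let raw = (backend_latency_ms / block_transfer_ms).ceil() as u32;
    // Concurrent I/O makes very large windows redundant in practice.
    raw.min(32)
}

// The S3 example above: ceil(78.125) = 79, capped at 32 blocks.
assert_eq!(suggest_window_size(50.0, 64.0, 100.0), 32);
```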
§Disabling Prefetch
To disable prefetching entirely:
- Set `window_size = 0` when constructing: `Prefetcher::new(0)`
- Or pass `prefetch_window_size = None` to `File::with_options()`
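Under the fixed-window semantics sketched earlier, a zero window yields no targets:

```rust
use hexz_core::cache::prefetch::Prefetcher;

let prefetcher = Prefetcher::new(0);
// An empty window produces an empty target list: nothing is prefetched.
assert!(prefetcher.get_prefetch_targets(100).is_empty());
```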
§Examples
§Basic Usage
```rust
use hexz_core::cache::prefetch::Prefetcher;

// Configure 8-block readahead for streaming workloads.
let prefetcher = Prefetcher::new(8);

// Application reads block 100.
let targets = prefetcher.get_prefetch_targets(100);
assert_eq!(targets, vec![101, 102, 103, 104, 105, 106, 107, 108]);

// Background task: fetch blocks 101-108 into cache.
// (Actual prefetch execution is handled by File.)
```
§Adaptive Tuning

```rust
use hexz_core::cache::prefetch::Prefetcher;
use std::sync::Arc;

let prefetcher = Arc::new(Prefetcher::new(4));

// Detect a high-latency backend (e.g., S3) and widen the window.
// (Note: a window-size update method would need to be added.)
// prefetcher.set_window_size(16);

// Detect a random access pattern and disable prefetching.
// prefetcher.set_window_size(0);
```
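If such an update method were added, the atomic window field would make it a one-line, lock-free operation (a sketch extending the hypothetical struct from §Architecture):

```rust
use std::sync::atomic::Ordering;

impl Prefetcher {
    /// Hypothetical runtime update; lock-free via the atomic window field.
    pub fn set_window_size(&self, window_size: u32) {
        self.window_size.store(window_size, Ordering::Relaxed);
    }
}
```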
§Integration with File

```rust
use std::sync::Arc;

// Crate-item imports (FileBackend, Lz4Compressor, File) are omitted;
// their module paths are not shown in this documentation.
let backend = Arc::new(FileBackend::new("snapshot.hxz".as_ref())?);
let compressor = Box::new(Lz4Compressor::new());

// Enable prefetching at File creation.
let snapshot = File::with_cache(
    backend,
    compressor,
    None,                    // encryptor
    Some(512 * 1024 * 1024), // cache_capacity_bytes (512 MiB)
    Some(8),                 // prefetch_window_size (8 blocks)
)?;

// Prefetching happens automatically during sequential reads
// (triggered internally by File::read_at_into_uninit).
```

§Implementation Notes
§Why No Pattern Detection?
This implementation deliberately omits sequential pattern detection (unlike Linux kernel readahead or database buffer managers) for several reasons:
- Caller context: `File` has full visibility into access patterns and can make better decisions about when to prefetch
- Simplicity: No history tracking, no stride detection, no state machine overhead
- Predictability: Fixed behavior is easier to reason about and debug
- Composability: Higher-level policies (in `File` or applications) can layer sophisticated heuristics atop this simple primitive, as sketched below
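For instance, a caller-side policy might trigger prefetch only after observing two consecutive block accesses (a hypothetical sketch; `SequentialPolicy` is not part of this module):

```rust
use hexz_core::cache::prefetch::Prefetcher;

/// Hypothetical caller-side policy layered atop the stateless primitive.
struct SequentialPolicy {
    last_block: Option<u64>,
}

impl SequentialPolicy {
    /// Returns prefetch targets only when the access continues a run.
    fn on_access(&mut self, block: u64, prefetcher: &Prefetcher) -> Vec<u64> {
        let sequential = block > 0 && self.last_block == Some(block - 1);
        self.last_block = Some(block);
        if sequential {
            prefetcher.get_prefetch_targets(block)
        } else {
            Vec::new()
        }
    }
}
```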
§Thread Safety
The prefetcher is fully thread-safe via atomic operations. Multiple threads can concurrently call `get_prefetch_targets` without contention. If runtime window size adjustment is needed, `AtomicU32` supports lock-free updates via `store(Ordering::Relaxed)`.
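A small demonstration of concurrent use, assuming the signature shown in the examples above:

```rust
use hexz_core::cache::prefetch::Prefetcher;
use std::sync::Arc;
use std::thread;

let prefetcher = Arc::new(Prefetcher::new(8));

// Four threads compute targets concurrently; no locks are involved.
let handles: Vec<_> = (0u64..4)
    .map(|i| {
        let p = Arc::clone(&prefetcher);
        thread::spawn(move || p.get_prefetch_targets(i * 100))
    })
    .collect();

for handle in handles {
    assert_eq!(handle.join().unwrap().len(), 8);
}
```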
§Future Extensions
Potential enhancements (not currently implemented):
- Adaptive window sizing: Adjust window based on cache hit rate or latency
- Stride detection: Support strided access patterns (e.g., reading every Nth block)
- Multi-stream prefetch: Prefetch from multiple streams (Disk + Memory) simultaneously
- Priority hints: Mark prefetch requests as low-priority to avoid evicting hot data
§Structs

- `Prefetcher`: Thread-safe prefetch manager with configurable lookahead window.