Async userspace read-ahead adapter.
PrefetchReader is a drop-in replacement for std::io::BufReader<File>
that performs asynchronous read-ahead on a dedicated OS thread. It exists to
decouple the blocking read() wait on a file descriptor from the pipeline
worker threads that consume the bytes.
§Motivation
On Linux, the kernel’s per-device read-ahead window (read_ahead_kb) is
128 KB by default. A plain BufReader<File> is synchronous: when its
internal buffer drains, the next refill blocks in the kernel until pages
arrive from disk. During that stall the calling thread is parked and
cannot make progress on other work. In the fgumi unified pipeline, the
reader thread is also a pipeline worker, so a blocked read translates
directly into lost downstream throughput.
PrefetchReader moves the blocking read() onto a dedicated producer
thread that pushes fixed-size chunks through a bounded
crossbeam_channel. Consumers see a normal std::io::Read interface;
internally, Read::read serves bytes out of the currently-held chunk and
only blocks when the producer has not yet delivered the next one. The
upshot is that stalls on the disk become independent of stalls in the
pipeline: the producer overlaps its disk wait with the consumer’s CPU
work.
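The producer/consumer split described above can be sketched roughly as follows. This is a minimal illustration, not the crate's implementation: `SketchPrefetch` and its fields are hypothetical names, and it uses the standard library's `std::sync::mpsc::sync_channel` as a stand-in for the bounded `crossbeam_channel` so that the example has no external dependencies.

```rust
use std::io::{self, Read};
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

/// Hypothetical stand-in for the adapter described above (not the real type).
struct SketchPrefetch {
    rx: Receiver<io::Result<Vec<u8>>>,
    chunk: Vec<u8>, // chunk currently being served to the consumer
    pos: usize,     // read offset into `chunk`
    done: bool,     // set once EOF or an error has been observed
}

impl SketchPrefetch {
    fn new<R: Read + Send + 'static>(mut inner: R, chunk_size: usize, depth: usize) -> Self {
        // Bounded channel: the producer blocks once `depth` chunks are queued.
        let (tx, rx) = sync_channel(depth);
        thread::spawn(move || loop {
            let mut buf = vec![0u8; chunk_size];
            match inner.read(&mut buf) {
                // EOF: leaving the loop drops `tx`, which the consumer sees as end of stream.
                Ok(0) => break,
                Ok(n) => {
                    buf.truncate(n);
                    // A failed send means the consumer went away; stop producing.
                    if tx.send(Ok(buf)).is_err() {
                        break;
                    }
                }
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => {
                    // Forward the error to the consumer, then exit.
                    let _ = tx.send(Err(e));
                    break;
                }
            }
        });
        Self { rx, chunk: Vec::new(), pos: 0, done: false }
    }
}

impl Read for SketchPrefetch {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        // Only block on the channel when the current chunk is exhausted.
        while self.pos == self.chunk.len() {
            if self.done {
                return Ok(0);
            }
            match self.rx.recv() {
                Ok(Ok(c)) => {
                    self.chunk = c;
                    self.pos = 0;
                }
                Ok(Err(e)) => {
                    self.done = true;
                    return Err(e);
                }
                // Producer hung up without an error: treat as EOF.
                Err(_) => {
                    self.done = true;
                    return Ok(0);
                }
            }
        }
        let n = out.len().min(self.chunk.len() - self.pos);
        out[..n].copy_from_slice(&self.chunk[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}
```

In this sketch the consumer blocks only in `recv`, and only when the producer has fallen behind; while the consumer is busy with a chunk, the producer is free to wait on the next `read`.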
This is essentially what the kernel’s block-layer read-ahead does, but in
userspace — so it works without root, on any OS, and without having to
tune /sys/block/*/queue/read_ahead_kb.
§Lifecycle
Constructing a PrefetchReader spawns exactly one OS thread, named
fgumi-prefetch. The thread owns the inner reader and exits when any of
the following happens:
- The inner reader signals EOF (Ok(0) from read).
- The inner reader returns an error (the error is sent through the channel, then the thread exits).
- The PrefetchReader is dropped (the consumer-side receiver is destroyed, the producer’s next send returns Disconnected, and the loop exits).
Drop joins the producer thread, so the thread cannot leak as long as the
inner reader's read calls eventually return. If the inner reader is
currently parked in a long read syscall, Drop blocks until that call returns.
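The shutdown ordering can be illustrated with a small sketch. It is an assumption about how such a `Drop` might be structured, not the crate's code: `SketchHandle` is a hypothetical type, the producer sends counters instead of file chunks, and `std::sync::mpsc::sync_channel` again stands in for `crossbeam_channel`. The point it shows is that the receiver has to be destroyed before the join, otherwise a producer blocked in `send` on a full channel would never wake up.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread::{self, JoinHandle};

/// Hypothetical handle that owns the consumer side and the producer thread.
struct SketchHandle {
    // Both fields are `Option` so `drop` can take them in a chosen order.
    rx: Option<Receiver<u64>>,
    producer: Option<JoinHandle<()>>,
}

impl SketchHandle {
    fn spawn(depth: usize) -> Self {
        let (tx, rx) = sync_channel(depth);
        let producer = thread::Builder::new()
            // Thread name taken from the description above.
            .name("fgumi-prefetch".into())
            .spawn(move || {
                let mut i = 0u64;
                // `send` fails once the receiver is gone; that ends the loop.
                while tx.send(i).is_ok() {
                    i += 1;
                }
            })
            .expect("failed to spawn producer thread");
        Self { rx: Some(rx), producer: Some(producer) }
    }
}

impl Drop for SketchHandle {
    fn drop(&mut self) {
        // 1. Destroy the receiver so a producer blocked in `send` is woken with an error.
        drop(self.rx.take());
        // 2. Join the producer; this waits for any in-flight work to finish.
        if let Some(handle) = self.producer.take() {
            let _ = handle.join();
        }
    }
}
```

Here the producer never blocks in a syscall, so the join returns promptly; with a real file-backed reader the join would wait for the in-flight `read` to complete, as noted above.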
Structs§
- PrefetchReader - A Read adapter that performs asynchronous userspace prefetch on a dedicated background thread.
Constants§
- DEFAULT_CHUNK_SIZE - Default chunk size used by PrefetchReader::new.
- DEFAULT_PREFETCH_DEPTH - Default channel depth used by PrefetchReader::new. The producer will keep up to this many filled chunks buffered ahead of the consumer.