pub struct AsyncUringStream<'a> { /* private fields */ }
Streaming reader that pushes chunked reads into an io_uring SQ and advances an atomic tail pointer the megakernel observes.
Implementations§
impl<'a> AsyncUringStream<'a>
pub fn new(
    ring_state: IoUringState,
    gpu_buffer: GpuMappedBuffer<'a>,
    megakernel_tail: &'a AtomicU32,
) -> Self
Create a stream bound to the given ring state, GPU-mapped buffer, and megakernel tail pointer.
pub fn replace_buffer(&mut self, gpu_buffer: GpuMappedBuffer<'a>)
Rebind the target mapped buffer for future submissions.
pub unsafe fn submit_read_to_gpu(
    &mut self,
    fd: i32,
    offset: u64,
    len: u32,
    chunk_idx: usize,
    iovs_storage: &mut [Iovec],
) -> Result<(), PipelineError>
Submit a scattered read of len bytes at file offset offset
into the slot at chunk_idx * len within the GPU buffer.
§Errors
- PipelineError::QueueFull if the SQ is full or the destination slot exceeds buffer bounds.
- Range errors surface later as PipelineError::IoUringSyscall on poll if the kernel rejects the SQE.
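The destination arithmetic can be sketched in isolation. The helper below is hypothetical: `slot_range` and `buffer_len` are illustrative names, not part of this crate. It assumes the slot for `chunk_idx` starts at `chunk_idx * len` and must end inside the mapped buffer, and shows the kind of overflow-safe check a caller might run before submitting, so that a bounds problem is not mistaken for a full SQ.

```rust
use std::ops::Range;

/// Hypothetical helper (not part of the crate): byte range of the slot for
/// `chunk_idx` when every chunk is `len` bytes. Returns `None` if the
/// arithmetic overflows or the slot ends past `buffer_len`.
fn slot_range(chunk_idx: usize, len: u32, buffer_len: u64) -> Option<Range<u64>> {
    let len = u64::from(len);
    // usize is at most 64 bits on supported targets, so this cast is lossless.
    let start = (chunk_idx as u64).checked_mul(len)?;
    let end = start.checked_add(len)?;
    if end <= buffer_len {
        Some(start..end)
    } else {
        None
    }
}
```

With an 8 KiB buffer and 4 KiB chunks, indices 0 and 1 fit and index 2 does not.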
§Safety
iovs_storage must live until this SQE’s completion is reaped;
the kernel dereferences iov_base at I/O time, not submit time.
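One way a caller might uphold this contract is to park each iovec array in heap storage under a caller-chosen key (for example the chunk index) and drop it only after the matching completion has been reaped. The sketch below is an illustration under stated assumptions: `Iovec` here is a stand-in struct, and `InFlight`, `park`, and `release` are hypothetical names, not part of this crate.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Stand-in for the crate's `Iovec`; the real definition is not shown here.
#[allow(dead_code)]
struct Iovec {
    base: *mut u8,
    len: usize,
}

/// Hypothetical owner of iovec arrays for reads that are still in flight.
#[derive(Default)]
struct InFlight {
    storage: HashMap<u64, Box<[Iovec]>>,
}

impl InFlight {
    /// Park `iovs` under `key` and lend the slice out for the submit call.
    /// Returns `None` if `key` is already in flight, so live storage is
    /// never replaced or dropped early.
    fn park(&mut self, key: u64, iovs: Box<[Iovec]>) -> Option<&mut [Iovec]> {
        match self.storage.entry(key) {
            Entry::Occupied(_) => None,
            Entry::Vacant(slot) => Some(&mut **slot.insert(iovs)),
        }
    }

    /// Call only after the completion for `key` has been reaped.
    fn release(&mut self, key: u64) -> Option<Box<[Iovec]>> {
        self.storage.remove(&key)
    }
}
```

Boxing the slice keeps the iovec array at a stable heap address even when the map itself reallocates, which matters because the kernel holds the raw pointer until the read completes.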
pub unsafe fn submit_read_to_gpu_at(
    &mut self,
    fd: i32,
    offset: u64,
    len: u32,
    target_offset: u64,
    iovs_storage: &mut [Iovec],
) -> Result<(), PipelineError>
Submit a read directly into a byte offset inside the mapped buffer.
Unlike AsyncUringStream::submit_read_to_gpu, this method does not
derive the destination from a fixed chunk index. Wrappers that stream
variable-sized shards can place each read contiguously in a staging
buffer without being forced into chunk_idx * len layout.
§Errors
- PipelineError::QueueFull if the SQ is full or the target range exceeds the mapped buffer bounds.
§Safety
iovs_storage must live until this SQE’s completion is reaped.
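The contiguous placement of variable-sized shards can be sketched as a running sum over shard lengths. `plan_contiguous` is a hypothetical planner, not part of this crate; it assumes shards are packed back to back from offset 0 and that each returned value would be passed as `target_offset`.

```rust
/// Hypothetical planner (not part of the crate): assigns each shard a
/// contiguous destination offset. Returns `None` if the shards do not fit
/// in `buffer_len` bytes or the running total overflows.
fn plan_contiguous(shard_lens: &[u32], buffer_len: u64) -> Option<Vec<u64>> {
    let mut next = 0u64;
    let mut offsets = Vec::with_capacity(shard_lens.len());
    for &len in shard_lens {
        offsets.push(next);
        next = next.checked_add(u64::from(len))?;
        if next > buffer_len {
            return None;
        }
    }
    Some(offsets)
}
```

Checking the whole plan up front lets a wrapper reject an oversized batch before any SQE is queued.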
pub unsafe fn submit_read_to_gpu_at_with_user_data(
    &mut self,
    fd: i32,
    offset: u64,
    len: u32,
    target_offset: u64,
    user_data: u64,
    iovs_storage: &mut [Iovec],
) -> Result<(), PipelineError>
Submit a read into an arbitrary byte offset and preserve caller-defined
user_data for completion correlation.
§Errors
Returns PipelineError::QueueFull when the SQ is full, the iovec
storage is empty, or the target range exceeds the mapped GPU buffer.
§Safety
iovs_storage must live until this SQE’s completion is reaped.
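Because `user_data` is an opaque 64-bit value, a caller is free to pack its own identifiers into it. The encoding below is purely illustrative (slot id in the high half, a generation counter in the low half); neither the layout nor the function names come from this crate.

```rust
/// Hypothetical encoding (not part of the crate): slot id in the high
/// 32 bits, generation counter in the low 32 bits.
fn pack_user_data(slot: u32, generation: u32) -> u64 {
    (u64::from(slot) << 32) | u64::from(generation)
}

/// Inverse of `pack_user_data`: returns `(slot, generation)`.
fn unpack_user_data(user_data: u64) -> (u32, u32) {
    ((user_data >> 32) as u32, user_data as u32)
}
```

A generation counter of this kind lets the caller detect a completion that belongs to an earlier use of the same slot.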
pub fn flush_submissions(&mut self) -> Result<(), PipelineError>
Submit any queued SQEs to the kernel.
SQPOLL can pick up tail updates on its own, but wrappers that rely on bounded latency should not depend on the polling thread waking promptly. Flushing pending submissions makes progress explicit.
pub fn poll(&mut self) -> Result<u32, PipelineError>
Reap available completions, advancing the megakernel tail pointer once per success. Returns the number of completions reaped.
§Errors
Returns PipelineError::IoUringSyscall on the first CQE
reporting res < 0. Remaining CQEs are still drained so the
ring does not overflow, but only the first failure is
returned; the caller re-polls to pick up subsequent errors or
successes.
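The drain behaviour described here can be modelled without a real ring. The sketch below is a simplified stand-in, not the crate's implementation: `Cqe` and `drain` are hypothetical, errors are represented by the raw negative `res` value instead of `PipelineError`, and the choice of `Ordering::Release` for the tail update is an assumption.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Stand-in for a completion queue entry; `res < 0` is a negated errno.
struct Cqe {
    res: i32,
}

/// Hypothetical model of the documented loop: every CQE is consumed, the
/// tail advances once per successful CQE, and only the first failure is
/// reported to the caller.
fn drain(cqes: &[Cqe], tail: &AtomicU32) -> Result<u32, i32> {
    let mut reaped = 0u32;
    let mut first_err: Option<i32> = None;
    for cqe in cqes {
        reaped += 1;
        if cqe.res < 0 {
            if first_err.is_none() {
                first_err = Some(cqe.res);
            }
        } else {
            // Publish one more completed chunk to the observer.
            tail.fetch_add(1, Ordering::Release);
        }
    }
    match first_err {
        Some(err) => Err(err),
        None => Ok(reaped),
    }
}
```

In this model a batch containing two successes and two failures advances the tail by two and reports only the first failure.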
pub fn wait_for_completion(&mut self) -> Result<(), PipelineError>
Flush pending submissions, then wait for at least one completion.
§Errors
Returns PipelineError::IoUringSyscall if io_uring_enter
fails.
pub unsafe fn submit_nvme_passthrough(
    &mut self,
    fd: i32,
    user_data: u64,
    nvme_sqe_bytes: &[u8],
) -> Result<(), PipelineError>
Submit an NVMe passthrough command via IORING_OP_URING_CMD.
Requires the uring-cmd-nvme feature and Linux kernel 6.0+.
The NVMe SQE layout is encoded by the caller in nvme_sqe_bytes
(64 bytes); the SQE is memcpy’d into the addr3+addr slots that
the kernel forwards to the NVMe driver. user_data is returned
on the matching CQE so the caller can correlate completions.
§Errors
- PipelineError::NvmePassthroughDisabled if the uring-cmd-nvme feature is not enabled at compile time. This variant is unreachable in this cfg-gated method but remains part of the public error contract shared with the feature-gated implementation.
- PipelineError::QueueFull if the SQ is full or the NVMe command buffer is malformed (must be exactly 64 bytes).
§Safety
- fd must be an open character device the caller has IORING_SETUP_CQE32-compatible access to (e.g. /dev/ng0n1).
- nvme_sqe_bytes must encode a valid NVMe command. Kernel rejection returns an errno on the CQE, but a forged payload can still trigger device-level misbehavior.
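The exact-length requirement on `nvme_sqe_bytes` can be checked by the caller before submitting. `as_nvme_sqe` below is a hypothetical pre-check, not a function from this crate; it only validates the size and says nothing about whether the bytes form a valid command.

```rust
use std::convert::TryFrom;

/// Hypothetical pre-check (not part of the crate): view `bytes` as a
/// fixed 64-byte command, or `None` if the length is anything else.
fn as_nvme_sqe(bytes: &[u8]) -> Option<&[u8; 64]> {
    <&[u8; 64]>::try_from(bytes).ok()
}
```

Working with `&[u8; 64]` at the call site moves the length check into the type, so a wrong-sized buffer is caught before it can be reported as QueueFull.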
pub unsafe fn submit_read_fixed(
    &mut self,
    fd: i32,
    offset: u64,
    len: u32,
    chunk_idx: usize,
    buf_index: u16,
) -> Result<(), PipelineError>
Submit an IORING_OP_READ_FIXED into a pre-registered buffer.
Requires the caller to have previously called
super::ring::IoUringState::register_buffers with an iovec
slice whose entry buf_index covers the target range. Because
the kernel skips per-SQE iovec validation, this path has 20-40%
lower latency than submit_read_to_gpu in hot loops.
§Errors
- PipelineError::QueueFull if the SQ is full or the destination range exceeds the GPU buffer bounds.
§Safety
The buf_index must reference a still-registered iovec whose
region covers chunk_idx * len .. (chunk_idx + 1) * len
inside the GpuMappedBuffer. Mis-indexing produces a kernel
DMA into the wrong region, which causes silent data corruption.
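A caller-side guard for this condition might look like the sketch below. It reads the requirement as full containment of the target range in the registered region, which is an interpretation, not something this page states in code. `Registered` and `covers` are hypothetical names, and the region is described by offsets into the mapped buffer instead of raw addresses.

```rust
/// Hypothetical description of one registered region, expressed as a byte
/// offset and length inside the mapped buffer.
struct Registered {
    start: u64,
    len: u64,
}

/// Returns true only if the slot `chunk_idx * len .. (chunk_idx + 1) * len`
/// lies entirely inside `reg`. Overflow is treated as "not covered".
fn covers(reg: &Registered, chunk_idx: usize, len: u32) -> bool {
    let len = u64::from(len);
    // usize is at most 64 bits on supported targets, so this cast is lossless.
    let target = (chunk_idx as u64)
        .checked_mul(len)
        .and_then(|start| start.checked_add(len).map(|end| (start, end)));
    let reg_end = reg.start.checked_add(reg.len);
    match (target, reg_end) {
        (Some((start, end)), Some(reg_end)) => reg.start <= start && end <= reg_end,
        _ => false,
    }
}
```

A region that merely overlaps the slot fails this check, which is the conservative choice when the consequence of a mismatch is a DMA into the wrong memory.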
pub unsafe fn submit_read_fixed_at(
    &mut self,
    fd: i32,
    offset: u64,
    len: u32,
    target_offset: u64,
    buf_index: u16,
    user_data: u64,
) -> Result<(), PipelineError>
Submit an IORING_OP_READ_FIXED into a registered buffer at an
explicit destination offset inside the mapped buffer.
Unlike AsyncUringStream::submit_read_fixed, this variant decouples
the CQE user_data from the destination layout so higher-level
drivers can publish their own slot ids while still using a fixed slot
stride.
§Errors
- PipelineError::QueueFull if the SQ is full or the destination range exceeds the GPU buffer bounds.
§Safety
buf_index must reference a still-registered iovec covering the
target range, and user_data must remain meaningful to the caller
until the CQE is reaped.
pub unsafe fn submit_read_to_gpu_fixed_file(
    &mut self,
    file_index: i32,
    offset: u64,
    len: u32,
    chunk_idx: usize,
    iovs_storage: &mut [Iovec],
) -> Result<(), PipelineError>
Submit a read using a registered-file-table index instead of a
live fd. Use with
super::ring::IoUringState::register_files to avoid the
per-SQE file refcount bump.
§Errors
Same surface as AsyncUringStream::submit_read_to_gpu.
§Safety
file_index must name a still-registered fd.
iovs_storage must outlive the completion. All other
conditions match submit_read_to_gpu.