Struct AsyncUringStream

Source

pub struct AsyncUringStream<'a> { /* private fields */ }

Expand description

Streaming reader that pushes chunked reads into an io_uring SQ and advances an atomic tail pointer the megakernel observes.

Implementations§

Source §

impl<'a> AsyncUringStream<'a>

Source

pub fn new( ring_state: IoUringState, gpu_buffer: GpuMappedBuffer<'a>, megakernel_tail: &'a AtomicU32, ) -> Self

Create a stream bound to the given ring state, GPU-mapped buffer, and megakernel tail pointer.

Source

pub fn replace_buffer(&mut self, gpu_buffer: GpuMappedBuffer<'a>)

Rebind the target mapped buffer for future submissions.

Source

pub unsafe fn submit_read_to_gpu( &mut self, fd: i32, offset: u64, len: u32, chunk_idx: usize, iovs_storage: &mut [Iovec], ) -> Result<(), PipelineError>

Submit a scattered read of len bytes at file offset offset into the slot at chunk_idx * len within the GPU buffer.

§Errors

PipelineError::QueueFull if the SQ is full OR the destination slot exceeds buffer bounds.
Range errors surface later as PipelineError::IoUringSyscall on poll if the kernel rejects the SQE.

§Safety

iovs_storage must live until this SQE’s completion is reaped; the kernel dereferences iov_base at I/O time, not submit time.

Source

pub unsafe fn submit_read_to_gpu_at( &mut self, fd: i32, offset: u64, len: u32, target_offset: u64, iovs_storage: &mut [Iovec], ) -> Result<(), PipelineError>

Submit a read directly into a byte offset inside the mapped buffer.

Unlike AsyncUringStream::submit_read_to_gpu, this method does not derive the destination from a fixed chunk index. Wrappers that stream variable-sized shards can place each read contiguously in a staging buffer without being forced into chunk_idx * len layout.

§Errors

PipelineError::QueueFull if the SQ is full OR the target range exceeds the mapped buffer bounds.

§Safety

iovs_storage must live until this SQE’s completion is reaped.

Source

pub unsafe fn submit_read_to_gpu_at_with_user_data( &mut self, fd: i32, offset: u64, len: u32, target_offset: u64, user_data: u64, iovs_storage: &mut [Iovec], ) -> Result<(), PipelineError>

Submit a read into an arbitrary byte offset and preserve caller-defined user_data for completion correlation.

§Errors

Returns PipelineError::QueueFull when the SQ is full, the iovec storage is empty, or the target range exceeds the mapped GPU buffer.

§Safety

iovs_storage must live until this SQE’s completion is reaped.

Source

pub fn flush_submissions(&mut self) -> Result<(), PipelineError>

Submit any queued SQEs to the kernel.

SQPOLL can pick up tail updates on its own, but wrappers that rely on bounded latency should not depend on the polling thread waking promptly. Flushing pending submissions makes progress explicit.

Source

pub fn poll(&mut self) -> Result<u32, PipelineError>

Reap available completions, advancing the megakernel tail pointer once per success. Returns completions reaped.

§Errors

Returns PipelineError::IoUringSyscall on the first CQE reporting res < 0. Remaining CQEs are still drained so the ring does not overflow, but only the first failure is returned - caller re-polls to pick up subsequent errors or successes.

Source

pub fn wait_for_completion(&mut self) -> Result<(), PipelineError>

Flush pending submissions + wait for at least one completion.

§Errors

Returns PipelineError::IoUringSyscall if io_uring_enter fails.

Source

pub fn inflight(&self) -> u32

Number of submissions still awaiting completion.

Source

pub unsafe fn submit_nvme_passthrough( &mut self, fd: i32, user_data: u64, nvme_sqe_bytes: &[u8], ) -> Result<(), PipelineError>

Submit an NVMe passthrough command via IORING_OP_URING_CMD. Requires the uring-cmd-nvme feature and Linux kernel 6.0+.

The NVMe SQE layout is encoded by the caller in nvme_sqe_bytes (64 bytes) - the SQE is memcpy’d into the addr3+addr slots the kernel forwards to the NVMe driver. user_data is returned on the matching CQE so the caller can correlate completions.

§Errors

PipelineError::NvmePassthroughDisabled if the uring-cmd-nvme feature is not enabled at compile time. This variant is unreachable in this cfg-gated method but remains part of the public error contract shared with the feature-gated implementation.
PipelineError::QueueFull if the SQ is full or the NVMe command buffer is malformed (must be exactly 64 bytes).

§Safety

fd must be an open character device the caller has IORING_SETUP_CQE32-compatible access to (e.g. /dev/ng0n1).
nvme_sqe_bytes must encode a valid NVMe command - kernel rejection returns an errno on the CQE, but a forged payload can still trigger device-level misbehavior.

Source

pub unsafe fn submit_read_fixed( &mut self, fd: i32, offset: u64, len: u32, chunk_idx: usize, buf_index: u16, ) -> Result<(), PipelineError>

Submit an IORING_OP_READ_FIXED into a pre-registered buffer.

Requires the caller to have previously called super::ring::IoUringState::register_buffers with an iovec slice whose entry buf_index covers the target range. Because the kernel skips per-SQE iovec validation, this path is 20-40% lower latency than submit_read_to_gpu on hot loops.

§Errors

PipelineError::QueueFull if the SQ is full or the destination range exceeds the GPU buffer bounds.

§Safety

The buf_index must reference a still-registered iovec whose region overlaps chunk_idx * len .. (chunk_idx + 1) * len inside the GpuMappedBuffer. Mis-indexing produces a kernel DMA into the wrong region - silent data corruption.

Source

pub unsafe fn submit_read_fixed_at( &mut self, fd: i32, offset: u64, len: u32, target_offset: u64, buf_index: u16, user_data: u64, ) -> Result<(), PipelineError>

Submit an IORING_OP_READ_FIXED into a registered buffer at an explicit destination offset inside the mapped buffer.

Unlike AsyncUringStream::submit_read_fixed, this variant decouples the CQE user_data from the destination layout so higher-level drivers can publish their own slot ids while still using a fixed slot stride.

§Errors

PipelineError::QueueFull if the SQ is full or the destination range exceeds the GPU buffer bounds.

§Safety

buf_index must reference a still-registered iovec covering the target range, and user_data must remain meaningful to the caller until the CQE is reaped.

Source

pub unsafe fn submit_read_to_gpu_fixed_file( &mut self, file_index: i32, offset: u64, len: u32, chunk_idx: usize, iovs_storage: &mut [Iovec], ) -> Result<(), PipelineError>

Submit a read using a registered-file-table index instead of a live fd. Use with super::ring::IoUringState::register_files - avoids the per-SQE file refcount bump.