Skip to main content

Megakernel

Struct Megakernel 

Source
pub struct Megakernel { /* private fields */ }
Expand description

Orchestrated persistent-megakernel handle.

Construct with Megakernel::bootstrap (default 256 lanes x 1 workgroup) or Megakernel::bootstrap_sharded for multi-tenant fan-in. Feed bytecode with Megakernel::dispatch.

Implementations§

Source§

impl Megakernel

Source

pub fn dispatch_persistent_handles( &self, handles: MegakernelResidentHandles, ) -> Result<Vec<Vec<u8>>, PipelineError>

Dispatch using backend-resident handles for all megakernel ABI buffers.

This path never falls back to host byte buffers. If the compiled backend pipeline does not implement resident handles, the backend’s structured unsupported-feature error is returned.

§Errors

Returns PipelineError when the backend rejects persistent handles, dispatch fails, or device-loss recovery cannot rebuild the pipeline.

Source

pub fn dispatch_persistent_handles_observed( &self, handles: MegakernelResidentHandles, ) -> Result<MegakernelDispatchOutput, PipelineError>

Dispatch using backend-resident handles and return instrumentation.

§Errors

See Megakernel::dispatch_persistent_handles.

Source

pub fn dispatch_persistent_handles_into( &self, handles: MegakernelResidentHandles, outputs: &mut OutputBuffers, ) -> Result<MegakernelDispatchStats, PipelineError>

Dispatch using backend-resident handles into caller-owned output storage.

This keeps the persistent ABI buffers resident and lets callers retain host readback allocation across repeated megakernel launches.

§Errors

See Megakernel::dispatch_persistent_handles.

Source

pub fn dispatch_persistent_handles_many_observed( &self, handles: &[MegakernelResidentHandles], ) -> Result<MegakernelBatchDispatchOutput, PipelineError>

Dispatch several resident megakernel submissions through the compiled backend batch contract.

This is the many-small-launch path: callers keep every ABI buffer resident, then submit a slice of handle tuples so native backends can record one command buffer or replay one graph batch instead of paying a host submission per item.

§Errors

Returns PipelineError when the backend rejects persistent handles, any item fails, or device-loss recovery cannot rebuild the pipeline.

Source

pub fn dispatch_persistent_handles_many_into( &self, handles: &[MegakernelResidentHandles], batches: &mut Vec<OutputBuffers>, ) -> Result<MegakernelDispatchStats, PipelineError>

Dispatch several resident megakernel submissions into caller-owned nested output storage.

Existing batch rows and output slots are reused when the backend returns the same shape, avoiding nested result-vector churn in many-small-launch hot paths.

§Errors

See Megakernel::dispatch_persistent_handles_many_observed.

Source

pub fn dispatch_persistent_handles_many_with_scratch( &self, handles: &[MegakernelResidentHandles], scratch: &mut MegakernelResidentBatchScratch, ) -> Result<MegakernelDispatchStats, PipelineError>

Dispatch several resident megakernel submissions through reusable resident-batch scratch.

This is the allocation-stable many-small-launch path: resource rows and returned output batches stay owned by scratch across calls.

§Errors

See Megakernel::dispatch_persistent_handles_many_observed.

Source§

impl Megakernel

Source

pub fn dispatch_with_io_queue_readback( &self, control_bytes: Vec<u8>, ring_bytes: Vec<u8>, debug_log_bytes: Vec<u8>, io_queue_bytes: Vec<u8>, ) -> Result<MegakernelReadback, PipelineError>

Dispatch with a caller-supplied IO queue and decode the strict megakernel readback ABI.

§Errors

Returns PipelineError when dispatch fails or returned buffers do not match the compiled megakernel ABI.

Source

pub fn dispatch_with_io_queue_readback_into( &self, control_bytes: Vec<u8>, ring_bytes: Vec<u8>, debug_log_bytes: Vec<u8>, io_queue_bytes: Vec<u8>, readback: &mut MegakernelReadback, outputs: &mut OutputBuffers, ) -> Result<MegakernelDispatchStats, PipelineError>

Dispatch owned buffers with a caller-supplied IO queue and decode the strict megakernel readback ABI into caller-owned storage.

§Errors

Returns PipelineError when dispatch or readback validation fails.

Source

pub fn dispatch_with_io_queue_readback_borrowed( &self, control_bytes: &[u8], ring_bytes: &[u8], debug_log_bytes: &[u8], io_queue_bytes: &[u8], ) -> Result<MegakernelReadback, PipelineError>

Dispatch borrowed buffers with a caller-supplied IO queue and decode the strict megakernel readback ABI.

§Errors

See Megakernel::dispatch_with_io_queue_readback.

Source

pub fn dispatch_with_io_queue_readback_borrowed_observed( &self, control_bytes: &[u8], ring_bytes: &[u8], debug_log_bytes: &[u8], io_queue_bytes: &[u8], ) -> Result<(MegakernelReadback, MegakernelDispatchStats), PipelineError>

Dispatch borrowed buffers with a caller-supplied IO queue, decode the strict megakernel readback ABI, and return dispatch instrumentation.

§Errors

See Megakernel::dispatch_with_io_queue_readback_borrowed.

Source

pub fn dispatch_with_io_queue_readback_borrowed_into( &self, control_bytes: &[u8], ring_bytes: &[u8], debug_log_bytes: &[u8], io_queue_bytes: &[u8], readback: &mut MegakernelReadback, outputs: &mut OutputBuffers, ) -> Result<MegakernelDispatchStats, PipelineError>

Dispatch borrowed buffers with a caller-supplied IO queue, decode the strict megakernel readback ABI into caller-owned storage, and return dispatch instrumentation.

§Errors

Returns PipelineError when dispatch or readback validation fails.

Source§

impl Megakernel

Source

pub fn bootstrap(backend: Arc<dyn VyreBackend>) -> Result<Self, PipelineError>

Default bootstrap: 256 lanes x 1 workgroup, no custom opcodes.

§Errors

Returns PipelineError::Backend if the backend rejects the program.

Source

pub fn bootstrap_with_opcodes( backend: Arc<dyn VyreBackend>, opcodes: Vec<OpcodeHandler>, ) -> Result<Self, PipelineError>

Bootstrap with custom opcodes but default sharding.

§Errors

See Megakernel::bootstrap.

Source

pub fn worker_groups_for_geometry( slot_count: u32, workgroup_size_x: u32, ) -> Result<u32, PipelineError>

Compute worker groups for a megakernel slot geometry without compiling.

§Errors

Returns PipelineError::QueueFull when the geometry cannot map slots to whole workgroups.

Source

pub fn bootstrap_sharded( backend: Arc<dyn VyreBackend>, slot_count: u32, workgroup_size_x: u32, opcodes: Vec<OpcodeHandler>, ) -> Result<Self, PipelineError>

Full bootstrap with sharding and custom opcodes.

§Errors

Returns PipelineError::QueueFull when geometry is invalid or PipelineError::Backend from the underlying compile.

Source

pub fn bootstrap_jit( backend: Arc<dyn VyreBackend>, slot_count: u32, workgroup_size_x: u32, payload_processor: &[Node], ) -> Result<Self, PipelineError>

JIT compiler bootstrap for fused payload processors.

§Errors

See Megakernel::bootstrap_sharded.

Source

pub fn dispatch( &self, control_bytes: Vec<u8>, ring_bytes: Vec<u8>, debug_log_bytes: Vec<u8>, ) -> Result<Vec<Vec<u8>>, PipelineError>

Dispatch a full storage buffer set with an empty IO queue.

§Errors

Returns PipelineError when protocol buffers are malformed, dispatch fails, or device-loss recovery cannot rebuild the compiled pipeline.

Source

pub fn dispatch_borrowed( &self, control_bytes: &[u8], ring_bytes: &[u8], debug_log_bytes: &[u8], ) -> Result<Vec<Vec<u8>>, PipelineError>

Dispatch a borrowed storage buffer set with an empty IO queue.

§Errors

Returns PipelineError when protocol buffers are malformed, dispatch fails, or device-loss recovery cannot rebuild the compiled pipeline.

Source

pub fn dispatch_observed( &self, control_bytes: Vec<u8>, ring_bytes: Vec<u8>, debug_log_bytes: Vec<u8>, ) -> Result<MegakernelDispatchOutput, PipelineError>

Dispatch a full storage buffer set and return runtime instrumentation.

§Errors

See Megakernel::dispatch.

Source

pub fn dispatch_borrowed_observed( &self, control_bytes: &[u8], ring_bytes: &[u8], debug_log_bytes: &[u8], ) -> Result<MegakernelDispatchOutput, PipelineError>

Dispatch borrowed buffers with an empty IO queue and return runtime instrumentation.

§Errors

See Megakernel::dispatch_borrowed.

Source

pub fn dispatch_with_io_queue( &self, control_bytes: Vec<u8>, ring_bytes: Vec<u8>, debug_log_bytes: Vec<u8>, io_queue_bytes: Vec<u8>, ) -> Result<Vec<Vec<u8>>, PipelineError>

Dispatch a full storage buffer set with a caller-supplied io_queue.

§Errors

Returns PipelineError when any protocol buffer is malformed, backend dispatch fails, or device-loss recovery cannot rebuild the pipeline.

Source

pub fn dispatch_with_io_queue_borrowed( &self, control_bytes: &[u8], ring_bytes: &[u8], debug_log_bytes: &[u8], io_queue_bytes: &[u8], ) -> Result<Vec<Vec<u8>>, PipelineError>

Dispatch borrowed buffers with a caller-supplied io_queue.

§Errors

See Megakernel::dispatch_with_io_queue.

Source

pub fn dispatch_with_io_queue_observed( &self, control_bytes: Vec<u8>, ring_bytes: Vec<u8>, debug_log_bytes: Vec<u8>, io_queue_bytes: Vec<u8>, ) -> Result<MegakernelDispatchOutput, PipelineError>

Dispatch with a caller-supplied io_queue and return instrumentation.

§Errors

See Megakernel::dispatch_with_io_queue.

Source

pub fn dispatch_with_io_queue_borrowed_observed( &self, control_bytes: &[u8], ring_bytes: &[u8], debug_log_bytes: &[u8], io_queue_bytes: &[u8], ) -> Result<MegakernelDispatchOutput, PipelineError>

Dispatch borrowed buffers with a caller-supplied io_queue and return instrumentation.

§Errors

See Megakernel::dispatch_with_io_queue.

Source

pub fn dispatch_with_io_queue_borrowed_into( &self, control_bytes: &[u8], ring_bytes: &[u8], debug_log_bytes: &[u8], io_queue_bytes: &[u8], outputs: &mut OutputBuffers, ) -> Result<MegakernelDispatchStats, PipelineError>

Dispatch borrowed buffers with a caller-supplied IO queue, writing backend outputs into caller-owned storage.

§Errors

See Megakernel::dispatch_with_io_queue_borrowed.

Source

pub fn recover_after_device_loss( &self, ) -> Result<MegakernelRecoveryDecision, PipelineError>

Rebuild the compiled pipeline after device-loss symptoms.

This does not mask the failure: if recompilation fails, the structured backend error is returned with the original remediation text intact.

§Errors

Returns PipelineError::Backend when the backend cannot recompile.

Source

pub fn pipeline_id(&self) -> &str

Pipeline id from the backend.

Source

pub const fn slot_count(&self) -> u32

Slot count this kernel was sharded for.

Source

pub const fn workgroup_size_x(&self) -> u32

Workgroup size this kernel was compiled for.

Source

pub fn worker_groups(&self) -> u32

Workgroup count needed to cover every ring slot.

Source§

impl Megakernel

Source

pub fn publish_slot( ring_bytes: &mut [u8], slot_idx: u32, tenant_id: u32, opcode: u32, args: &[u32], ) -> Result<(), PipelineError>

Publish one opcode into ring_bytes[slot_idx].

§Errors

PipelineError::QueueFull when out of bounds, too many args, or the slot is still in flight.

Source

pub fn encode_work_items_ring_into( slot_count: u32, tenant_id: u32, items: &[MegakernelWorkItem], ring_bytes: &mut Vec<u8>, ) -> Result<(), PipelineError>

Reset ring_bytes to an empty ring and publish a contiguous MegakernelWorkItem queue into slots 0..items.len().

This is the hot-path publisher for one-shot megakernel launches. It validates the full batch before mutating ring_bytes, encodes an empty ring once, writes the fixed MegakernelWorkItem ABI directly, and stores slot::PUBLISHED last for each slot.

§Errors

Returns PipelineError::QueueFull when slot_count cannot encode, the queue does not fit in the ring, the slot ABI cannot hold a MegakernelWorkItem, or an item opcode is not publishable.

Source

pub fn publish_work_items( ring_bytes: &mut [u8], start_slot: u32, tenant_id: u32, items: &[MegakernelWorkItem], ) -> Result<u32, PipelineError>

Publish a contiguous fixed-ABI work-item window into an existing ring without resetting unrelated slots.

This is the resident hot path for repeated megakernel queue updates: validate the whole target window first, then write each slot once and store slot::PUBLISHED last. Unlike Megakernel::encode_work_items_ring_into, this does not clear the full ring, so sparse updates scale with items.len() rather than slot_count.

§Errors

Returns PipelineError::QueueFull when the target window is outside the ring, any slot is still in flight, or an item opcode is not publishable.

Source

pub fn encode_work_items_ring_words_into( slot_count: u32, tenant_id: u32, items: &[MegakernelWorkItem], ring_words: &mut Vec<u32>, ) -> Result<(), PipelineError>

Reset ring_words to an empty ring and publish a contiguous MegakernelWorkItem queue as native little-endian u32 words.

This is equivalent to Megakernel::encode_work_items_ring_into but avoids thousands of tiny byte-slice stores on hot dispatch paths. Callers can pass the result to backends as bytes with bytemuck::cast_slice.

§Errors

Returns PipelineError::QueueFull when slot_count cannot encode, the queue does not fit in the ring, the slot ABI cannot hold a MegakernelWorkItem, or an item opcode is not publishable.

Source

pub fn publish_packed_slot<A>( ring_bytes: &mut [u8], slot_idx: u32, tenant_id: u32, ops: &[(u8, A)], ) -> Result<(), PipelineError>
where A: AsRef<[u32]>,

Publish one packed slot containing multiple inner ops.

The inner opcode id is stored as u8; args are packed into the slot’s 12-word payload tail and addressed by per-op arg_offset values.

§Errors

Returns PipelineError::QueueFull when the packed payload exceeds the slot capacity or when the target slot is not publishable.

Source

pub fn batch_publish<A>( ring_bytes: &mut [u8], start_slot: u32, tenant_id: u32, items: &[(u32, A)], batch_tag: u32, ) -> Result<u32, PipelineError>
where A: AsRef<[u32]>,

Publish multiple slots atomically - the final slot is a BATCH_FENCE that signals completion to the host. This is the high-throughput entry point for scanner pipelines: publish N work items + 1 fence in one call.

§Errors

PipelineError::QueueFull if any slot rejects.

Source§

impl Megakernel

Source

pub fn encode_control( shutdown: bool, tenant_count: u32, observable_slots: u32, ) -> Result<Vec<u8>, PipelineError>

Encode a control-buffer payload.

§Errors

Returns PipelineError::QueueFull when the requested observable region cannot fit in process address space.

Source

pub fn try_encode_control( shutdown: bool, tenant_count: u32, observable_slots: u32, ) -> Result<Vec<u8>, PipelineError>

Fallible control-buffer encoder for callers accepting untrusted sizing.

§Errors

Returns PipelineError::QueueFull when the requested observable region cannot fit in process address space.

Source

pub fn try_encode_control_into( shutdown: bool, tenant_count: u32, observable_slots: u32, dst: &mut Vec<u8>, ) -> Result<(), PipelineError>

Fallible control-buffer encoder into caller-owned storage.

§Errors

Returns PipelineError::QueueFull when the requested observable region cannot fit in process address space.

Source

pub fn encode_empty_ring(slot_count: u32) -> Result<Vec<u8>, PipelineError>

Encode an empty ring buffer with slot_count slots.

§Errors

Returns PipelineError::QueueFull when slot_count * SLOT_WORDS * 4 overflows.

Source

pub fn try_encode_empty_ring(slot_count: u32) -> Result<Vec<u8>, PipelineError>

Fallible ring-buffer encoder for callers accepting untrusted slot counts.

§Errors

Returns PipelineError::QueueFull when slot_count * SLOT_WORDS * 4 overflows.

Source

pub fn try_encode_empty_ring_into( slot_count: u32, dst: &mut Vec<u8>, ) -> Result<(), PipelineError>

Fallible ring-buffer encoder into caller-owned storage.

§Errors

Returns PipelineError::QueueFull when slot_count * SLOT_WORDS * 4 overflows.

Source

pub fn encode_empty_debug_log( record_capacity: u32, ) -> Result<Vec<u8>, PipelineError>

Encode an empty PRINTF channel buffer.

§Errors

Returns PipelineError::QueueFull when the record capacity overflows.

Source

pub fn try_encode_empty_debug_log( record_capacity: u32, ) -> Result<Vec<u8>, PipelineError>

Fallible debug-log encoder for callers accepting untrusted capacities.

§Errors

Returns PipelineError::QueueFull when the record capacity overflows.

Source

pub fn try_encode_empty_debug_log_into( record_capacity: u32, dst: &mut Vec<u8>, ) -> Result<(), PipelineError>

Fallible debug-log encoder into caller-owned storage.

§Errors

Returns PipelineError::QueueFull when the record capacity overflows.

Source

pub fn read_done_count(control_bytes: &[u8]) -> u32

Decode the kernel’s done_count from a control buffer.

Source

pub fn try_read_done_count(control_bytes: &[u8]) -> Result<u32, PipelineError>

Strictly decode the kernel’s done_count from a control buffer.

§Errors

Returns PipelineError when the control buffer is malformed or too short to contain the done counter.

Source

pub fn try_count_done_ring_slots( ring_bytes: &[u8], item_count: usize, ) -> Result<u64, PipelineError>

Strictly count DONE slots in a ring-buffer readback.

§Errors

Returns PipelineError when the ring readback is malformed or too short for item_count complete protocol slots.

Source

pub fn read_debug_log(debug_bytes: &[u8]) -> Vec<DebugRecord>

Decode PRINTF records out of the debug-log buffer.

Source

pub fn read_debug_log_into(debug_bytes: &[u8], out: &mut Vec<DebugRecord>)

Decode PRINTF records into caller-owned storage.

Source

pub fn try_read_debug_log( debug_bytes: &[u8], ) -> Result<Vec<DebugRecord>, PipelineError>

Strictly decode PRINTF records out of the debug-log buffer.

§Errors

Returns PipelineError when the debug-log buffer is malformed or the cursor points at a partial record.

Source

pub fn try_read_debug_log_into( debug_bytes: &[u8], out: &mut Vec<DebugRecord>, ) -> Result<(), PipelineError>

Strictly decode PRINTF records into caller-owned storage.

§Errors

Returns PipelineError when the debug-log buffer is malformed or the cursor points at a partial record.

Source

pub fn read_epoch(control_bytes: &[u8]) -> u32

Read the epoch counter from a control buffer. The epoch increments on each BATCH_FENCE execution - the host polls this to detect batch completion without scanning the ring.

Source

pub fn try_read_epoch(control_bytes: &[u8]) -> Result<u32, PipelineError>

Strictly read the epoch counter from a control buffer.

§Errors

Returns PipelineError when the control buffer is malformed or too short to contain the epoch counter.

Source

pub fn read_observable(control_bytes: &[u8], index: u32) -> u32

Read an observable result word from a control buffer. Opcodes like LOAD_U32, COMPARE_SWAP, and BATCH_FENCE write results here.

Source

pub fn try_read_observable( control_bytes: &[u8], index: u32, ) -> Result<u32, PipelineError>

Strictly read an observable result word from a control buffer.

§Errors

Returns PipelineError when the buffer is malformed or the observable index is outside the supplied readback.

Source

pub fn read_metrics(control_bytes: &[u8]) -> Vec<(u32, u32)>

Read per-opcode metrics counters from a control buffer. Returns a map of opcode_id → execution_count for any non-zero counters.

Source

pub fn read_metrics_into(control_bytes: &[u8], out: &mut Vec<(u32, u32)>)

Read per-opcode metrics counters into caller-owned storage.

Source

pub fn try_read_metrics( control_bytes: &[u8], ) -> Result<Vec<(u32, u32)>, PipelineError>

Strictly read per-opcode metrics counters from a control buffer.

§Errors

Returns PipelineError when the buffer is malformed or too short for the fixed metrics window.

Source

pub fn try_read_metrics_into( control_bytes: &[u8], out: &mut Vec<(u32, u32)>, ) -> Result<(), PipelineError>

Strictly read per-opcode metrics counters into caller-owned storage.

§Errors

Returns PipelineError when the buffer is malformed or too short for the fixed metrics window.

Auto Trait Implementations§

Blanket Implementations§

Source§

impl<T> Any for T
where T: 'static + ?Sized,

Source§

fn type_id(&self) -> TypeId

Gets the TypeId of self. Read more
Source§

impl<T> Borrow<T> for T
where T: ?Sized,

Source§

fn borrow(&self) -> &T

Immutably borrows from an owned value. Read more
Source§

impl<T> BorrowMut<T> for T
where T: ?Sized,

Source§

fn borrow_mut(&mut self) -> &mut T

Mutably borrows from an owned value. Read more
Source§

impl<T> From<T> for T

Source§

fn from(t: T) -> T

Returns the argument unchanged.

Source§

impl<T> Instrument for T

Source§

fn instrument(self, span: Span) -> Instrumented<Self>

Instruments this type with the provided Span, returning an Instrumented wrapper. Read more
Source§

fn in_current_span(self) -> Instrumented<Self>

Instruments this type with the current Span, returning an Instrumented wrapper. Read more
Source§

impl<T, U> Into<U> for T
where U: From<T>,

Source§

fn into(self) -> U

Calls U::from(self).

That is, this conversion is whatever the implementation of From<T> for U chooses to do.

Source§

impl<T> Same for T

Source§

type Output = T

Should always be Self
Source§

impl<T, U> TryFrom<U> for T
where U: Into<T>,

Source§

type Error = Infallible

The type returned in the event of a conversion error.
Source§

fn try_from(value: U) -> Result<T, <T as TryFrom<U>>::Error>

Performs the conversion.
Source§

impl<T, U> TryInto<U> for T
where U: TryFrom<T>,

Source§

type Error = <U as TryFrom<T>>::Error

The type returned in the event of a conversion error.
Source§

fn try_into(self) -> Result<U, <U as TryFrom<T>>::Error>

Performs the conversion.
Source§

impl<T> WithSubscriber for T

Source§

fn with_subscriber<S>(self, subscriber: S) -> WithDispatch<Self>
where S: Into<Dispatch>,

Attaches the provided Subscriber to this type, returning a WithDispatch wrapper. Read more
Source§

fn with_current_subscriber(self) -> WithDispatch<Self>

Attaches the current default Subscriber to this type, returning a WithDispatch wrapper. Read more