Module builder

IR program builders - construct the megakernel Program from vyre IR.

Two flavours:

  • Interpreted (build_program_sharded) - If-tree opcode dispatch.
  • JIT (build_program_jit) - payload processor logic fused directly into the body stream.
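
The difference between the two flavours can be pictured with a small host-side sketch. This is not vyre code: the opcode constants, `dispatch_interpreted`, and `body_fused` are hypothetical names invented for illustration, and the real builders emit IR rather than run Rust closures. It only shows the general shape of an if-tree keyed on an opcode versus a body with the processor baked in.

```rust
// Hypothetical opcodes; the real vyre opcode set is not listed on this page.
const OP_NOP: u32 = 0;
const OP_ADD: u32 = 1;
const OP_MUL: u32 = 2;

/// Interpreted flavour (illustrative): each slot walks an if-tree keyed on its
/// opcode, so every opcode costs a comparison before the payload work starts.
fn dispatch_interpreted(opcode: u32, a: u32, b: u32) -> Option<u32> {
    if opcode == OP_NOP {
        Some(a)
    } else if opcode == OP_ADD {
        Some(a.wrapping_add(b))
    } else if opcode == OP_MUL {
        Some(a.wrapping_mul(b))
    } else {
        None // unknown opcode: nothing in the tree matched
    }
}

/// Fused flavour (illustrative): the payload processor is part of the body
/// itself, so no opcode test is executed per slot.
fn body_fused<F: Fn(u32, u32) -> u32>(processor: F, a: u32, b: u32) -> u32 {
    processor(a, b)
}
```

Under these assumptions, `dispatch_interpreted(OP_MUL, 4, 5)` pays for two failed comparisons before reaching the multiply, while `body_fused(|x, y| x.wrapping_mul(y), 4, 5)` goes straight to it. The trade-off is that the fused body handles only the one processor it was built with.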

Functions

build_program
Build the default megakernel IR (256 lanes × 1 workgroup, no custom opcodes).
build_program_jit
Build the JIT megakernel IR where payload processor logic is fused into the body stream.
build_program_jit_slots
Build the JIT megakernel IR for an explicit number of ring slots.
build_program_priority
Build a priority-aware megakernel IR.
build_program_priority_slots
Build a priority-aware megakernel IR for an explicit ring slot count.
build_program_sharded
Build the megakernel IR with a custom workgroup size and optional custom opcodes.
build_program_sharded_no_io
Build the megakernel IR without the IO polling sidecar.
build_program_sharded_once_slots
Build a finite one-pass sharded megakernel IR for host-submitted batches.
build_program_sharded_once_slots_control_report_shared
Build a finite one-pass megakernel that reports completion through the control buffer only.
build_program_sharded_once_slots_shared
Shared-Arc variant of build_program_sharded_once_slots for hot runtime dispatchers that must not clone the megakernel template every launch.
build_program_sharded_slots
Build the megakernel IR for an explicit number of ring slots.
build_program_sharded_slots_shared
Build the sharded megakernel IR as a shared immutable template.
build_program_sharded_with_io_polling
Build the megakernel IR with the experimental IO polling sidecar.
build_program_sharded_with_workspace_adapter
Build the sharded megakernel IR with a consumer-owned resident workspace.
build_program_with_self_loading_miss_handler
Build the megakernel IR with a self-loading load-miss handler.
persistent_body
The body that runs once per iteration per lane. Exposed for tests and downstream crates that splice additional opcodes.
persistent_body_jit
The JIT body that runs once per iteration per lane.
persistent_body_priority
Priority-aware loop body. Replaces the per-lane 1:1 slot mapping with the scheduler’s priority scan.
persistent_body_priority_slots
Priority-aware loop body for an explicit ring slot count.
try_build_program_with_self_loading_miss_handler
Fallible variant of build_program_with_self_loading_miss_handler.
try_persistent_body
Fallible persistent body builder with explicit staging-allocation reporting.
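
The summary of persistent_body_priority contrasts a per-lane 1:1 slot mapping with a priority scan. A minimal CPU-side sketch of that contrast follows. Everything in it is an assumption made for illustration: the `Slot` struct, the convention that priority 0 marks an empty slot, and the lowest-index tie-break are not taken from vyre, whose scheduler operates on GPU IR.

```rust
/// Hypothetical ring slot; `priority == 0` is assumed to mean "empty".
#[derive(Clone, Copy, Debug, PartialEq)]
struct Slot {
    priority: u32,
    payload: u32,
}

/// 1:1 mapping (illustrative): lane `i` only ever inspects slot `i`.
fn pick_one_to_one(slots: &[Slot], lane: usize) -> Option<usize> {
    slots.get(lane).filter(|s| s.priority > 0).map(|_| lane)
}

/// Priority scan (illustrative): pick the pending slot with the highest
/// priority, preferring the lowest index when priorities tie.
fn pick_by_priority(slots: &[Slot]) -> Option<usize> {
    let mut best: Option<usize> = None;
    for (i, s) in slots.iter().enumerate() {
        if s.priority == 0 {
            continue; // empty slot, nothing to schedule
        }
        match best {
            // Keep the current best when it is at least as urgent.
            Some(b) if slots[b].priority >= s.priority => {}
            _ => best = Some(i),
        }
    }
    best
}
```

With a 1:1 mapping, a lane whose own slot is empty stays idle even if other slots hold urgent work. The scan lets any lane pick up the most urgent pending slot, at the cost of reading the whole ring.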