IR program builders - construct the megakernel Program from vyre IR.
Two flavours:
- Interpreted (build_program_sharded) - if-tree opcode dispatch.
- JIT (build_program_jit) - payload processor fused directly into the body stream.
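The difference between the two flavours can be sketched in plain Rust. This is only an illustration of the dispatch idea, not vyre's IR or API: `Slot`, the `OP_*` constants, `run_interpreted`, and `run_fused` are all hypothetical names invented for this example.

```rust
// Hypothetical stand-in for a ring slot; not a vyre type.
#[derive(Clone, Copy)]
pub struct Slot {
    pub opcode: u32,
    pub arg: u32,
}

// Hypothetical opcode numbering, chosen for this sketch only.
pub const OP_NOP: u32 = 0;
pub const OP_ADD: u32 = 1;
pub const OP_MUL: u32 = 2;

/// Interpreted flavour: each slot is routed through an if-tree on its opcode.
pub fn run_interpreted(slots: &[Slot], start: u32) -> u32 {
    let mut acc = start;
    for s in slots {
        if s.opcode == OP_ADD {
            acc = acc.wrapping_add(s.arg);
        } else if s.opcode == OP_MUL {
            acc = acc.wrapping_mul(s.arg);
        }
        // OP_NOP and unknown opcodes leave the accumulator unchanged.
    }
    acc
}

/// Fused flavour: the payload processor is supplied when the body is built,
/// so the loop applies it directly and never branches on an opcode.
pub fn run_fused<F>(slots: &[Slot], start: u32, process: F) -> u32
where
    F: Fn(u32, u32) -> u32,
{
    slots.iter().fold(start, |acc, s| process(acc, s.arg))
}
```

With slots that all carry `OP_ADD`, `run_interpreted(&slots, 0)` and `run_fused(&slots, 0, |a, x| a.wrapping_add(x))` produce the same sum; the fused version simply has no per-slot opcode test.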
Functions
- build_program - Build the default megakernel IR (256 lanes × 1 workgroup, no custom opcodes).
- build_program_jit - Build the JIT megakernel IR where payload processor logic is fused into the body stream.
- build_program_jit_slots - Build the JIT megakernel IR for an explicit number of ring slots.
- build_program_priority - Build a priority-aware megakernel IR.
- build_program_priority_slots - Build a priority-aware megakernel IR for an explicit ring slot count.
- build_program_sharded - Build the megakernel IR with a custom workgroup size and optional custom opcodes.
- build_program_sharded_no_io - Build the megakernel IR without the IO polling sidecar.
- build_program_sharded_once_slots - Build a finite one-pass sharded megakernel IR for host-submitted batches.
- build_program_sharded_once_slots_control_report_shared - Build a finite one-pass megakernel that reports completion through the control buffer only.
- build_program_sharded_once_slots_shared - Shared-Arc variant of build_program_sharded_once_slots for hot runtime dispatchers that must not clone the megakernel template every launch.
- build_program_sharded_slots - Build the megakernel IR for an explicit number of ring slots.
- build_program_sharded_slots_shared - Build the sharded megakernel IR as a shared immutable template.
- build_program_sharded_with_io_polling - Build the megakernel IR with the experimental IO polling sidecar.
- build_program_sharded_with_workspace_adapter - Build the sharded megakernel IR with a consumer-owned resident workspace.
- build_program_with_self_loading_miss_handler - Build the megakernel IR with a self-loading load-miss handler.
- persistent_body - The body that runs once per iteration per lane. Exposed for tests and downstream crates that splice additional opcodes.
- persistent_body_jit - The JIT body that runs once per iteration per lane.
- persistent_body_priority - Priority-aware loop body. Replaces the per-lane 1:1 slot mapping with the scheduler’s priority scan.
- persistent_body_priority_slots - Priority-aware loop body for an explicit ring slot count.
- try_build_program_with_self_loading_miss_handler - Fallible variant of build_program_with_self_loading_miss_handler.
- try_persistent_body - Fallible persistent body builder with explicit staging-allocation reporting.
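
Two naming conventions recur in the list above: `_shared` builders that return a shared immutable template, and `try_` builders that report failure instead of panicking. The sketch below shows how such a family is commonly laid out in Rust. It does not reproduce the real signatures, which this page does not show: `Template`, `BuildError`, and the three `*_template` functions are hypothetical names made up for the illustration.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a built program template; not a vyre type.
#[derive(Debug, Clone, PartialEq)]
pub struct Template {
    pub slots: u32,
    pub body: Vec<u32>,
}

// Hypothetical error type for the fallible builder.
#[derive(Debug, PartialEq)]
pub enum BuildError {
    ZeroSlots,
}

/// Fallible builder: invalid input is returned as an error.
pub fn try_build_template(slots: u32) -> Result<Template, BuildError> {
    if slots == 0 {
        return Err(BuildError::ZeroSlots);
    }
    Ok(Template { slots, body: (0..slots).collect() })
}

/// Infallible wrapper: same logic, but panics on invalid input.
pub fn build_template(slots: u32) -> Template {
    try_build_template(slots).expect("slot count must be non-zero")
}

/// Shared variant: build once, then hand out cheap reference-counted handles
/// so a hot dispatch path never deep-copies the template body.
pub fn build_template_shared(slots: u32) -> Arc<Template> {
    Arc::new(build_template(slots))
}
```

In this layout a dispatcher would call `build_template_shared` once and use `Arc::clone` per launch, which bumps a reference count rather than copying `body`.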