PTX text emitter for vyre KernelDescriptor.
Consumes a substrate-neutral vyre_lower::KernelDescriptor and
produces NVRTC-compatible PTX assembly text. The emitter owns only
PTX construction; descriptor shaping and substrate-neutral
analyses stay in vyre-lower.
§Op coverage
Mirrors vyre-emit-naga for parity:
- Literal (U32, I32, F32, Bool)
- LocalInvocationId / GlobalInvocationId / WorkgroupId (axis 0/1/2)
- LoadGlobal / StoreGlobal (scalar U32/I32/F32/Bool, plus packed v2/v4 U32/I32/F32 chains when the descriptor presents unit-stride adjacent accesses)
- BinOpKind for the common arithmetic/logic set
- UnOpKind for Negate / LogicalNot / BitNot
- Cast between scalar types
- Select, Fma
- StructuredIfThen, StructuredIfThenElse, StructuredBlock, Region, Return, workgroup-scope Barrier
Out of scope (returns EmitError::UnsupportedOp or
EmitError::InvalidDescriptor): indirect-dispatch (host concern),
MemoryOrdering::GridSync until a native cooperative-grid lowering is
wired, and descriptor forms without a PTX-safe lowering.
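As an illustration of how the built-in ID ops in the coverage list can map onto PTX, the sketch below emits the standard special-register reads: `%tid` for the local invocation, `%ctaid` for the workgroup, and `%ctaid * %ntid + %tid` for the global invocation. The `BuiltinId` enum, the `emit_builtin_id` function, and the register-numbering scheme are hypothetical stand-ins for this example only; they are not the crate's actual types or its real lowering.

```rust
/// Hypothetical stand-in for the descriptor's built-in ID ops
/// (not the real vyre-lower type).
#[derive(Clone, Copy, Debug)]
pub enum BuiltinId {
    LocalInvocationId,
    WorkgroupId,
    GlobalInvocationId,
}

/// Maps axis 0/1/2 to the PTX component suffix x/y/z.
fn axis_suffix(axis: u8) -> Option<char> {
    match axis {
        0 => Some('x'),
        1 => Some('y'),
        2 => Some('z'),
        _ => None,
    }
}

/// Emits PTX that leaves the requested ID in `%r{dst}`.
/// Assumption: `%r{dst+1}`..`%r{dst+3}` are free scratch registers.
/// Returns `None` for an axis outside 0..=2.
pub fn emit_builtin_id(id: BuiltinId, axis: u8, dst: u32) -> Option<String> {
    let c = axis_suffix(axis)?;
    let text = match id {
        // Thread index within the block.
        BuiltinId::LocalInvocationId => format!("mov.u32 %r{dst}, %tid.{c};\n"),
        // Block index within the grid.
        BuiltinId::WorkgroupId => format!("mov.u32 %r{dst}, %ctaid.{c};\n"),
        // global = ctaid * ntid + tid, computed with a single mad.lo.
        BuiltinId::GlobalInvocationId => {
            let (blk, dim, tid) = (dst + 1, dst + 2, dst + 3);
            format!(
                "mov.u32 %r{blk}, %ctaid.{c};\n\
                 mov.u32 %r{dim}, %ntid.{c};\n\
                 mov.u32 %r{tid}, %tid.{c};\n\
                 mad.lo.u32 %r{dst}, %r{blk}, %r{dim}, %r{tid};\n"
            )
        }
    };
    Some(text)
}
```

Returning `None` for an out-of-range axis is a simplification; an emitter following this page would more plausibly report such a case through its error type.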
§PTX output shape
//
// Generated by vyre-emit-ptx (target sm_70)
//
.version 7.0
.target sm_70
.address_size 64
.visible .entry main(
    .param .u64 _arg_<binding_name>
)
{
    .reg .pred %p<N>;
    .reg .u32 %r<N>;
    .reg .s32 %s<N>;
    .reg .f32 %f<N>;
    .reg .u64 %rd<N>;
    <body>
    ret;
}
Modules§
- patterns
- PTX-specific emit-time patterns.
Structs§
- ComputeCapability
- Target compute capability for PTX emit. Defaults to sm_70 (Volta), the broad-compatibility floor for the shipped PTX op set.
- PtxEmitOptions
- CUDA PTX emission knobs that affect instruction selection but not descriptor semantics.
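To make the sm_70 default concrete, here is a minimal sketch of how a compute-capability value could drive the module header shown under "PTX output shape". The `Capability` struct, its `major`/`minor` fields, and `module_preamble` are hypothetical names invented for this example; the real `ComputeCapability` layout is not shown on this page. The `.version 7.0` line is copied from the documented output shape and kept fixed here, which is an assumption: newer targets may require a higher PTX version.

```rust
use std::fmt;

/// Hypothetical stand-in for a compute-capability type
/// (field names are assumptions, not the crate's API).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Capability {
    pub major: u32,
    pub minor: u32,
}

impl Default for Capability {
    /// Mirrors the documented default of sm_70 (Volta).
    fn default() -> Self {
        Capability { major: 7, minor: 0 }
    }
}

impl fmt::Display for Capability {
    /// Renders the PTX target name, e.g. `sm_70`.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "sm_{}{}", self.major, self.minor)
    }
}

/// Builds the comment banner and module directives that precede the
/// kernel entry. The PTX version is fixed to 7.0 as in the documented
/// output shape (an assumption for targets newer than the default).
pub fn module_preamble(target: Capability) -> String {
    format!(
        "//\n\
         // Generated by vyre-emit-ptx (target {target})\n\
         //\n\
         .version 7.0\n\
         .target {target}\n\
         .address_size 64\n"
    )
}
```

Calling `module_preamble(Capability::default())` yields the six header lines of the documented shape with `sm_70` filled in.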
Enums§
Functions§
- emit
- emit_optimized
- Emit PTX text from a KernelDescriptor after running the full vyre_lower::rewrites::run_all optimization pipeline. Recommended over emit for production use: fewer dead instructions, fewer redundant loads, lower register pressure.
- emit_optimized_with_stats
- Like emit_optimized but also returns vyre_lower::rewrites::OptimizationStats.
- emit_optimized_with_target
- Same as emit_with_target but runs the optimization pipeline first.
- emit_optimized_with_target_with_stats
- The full-power variant: optimize first AND target a specific compute capability AND surface OptimizationStats. Combines emit_optimized_with_target and emit_optimized_with_stats.
- emit_with_options
- emit_with_target