Crate vyre_emit_ptx

PTX text emitter for vyre KernelDescriptor.

Consumes a substrate-neutral vyre_lower::KernelDescriptor and produces NVRTC-compatible PTX assembly text. The emitter owns only PTX construction; descriptor shaping and substrate-neutral analyses stay in vyre-lower.

§Op coverage

Mirrors vyre-emit-naga for parity:

  • Literal (U32, I32, F32, Bool)
  • LocalInvocationId / GlobalInvocationId / WorkgroupId (axis 0/1/2)
  • LoadGlobal / StoreGlobal (scalar U32/I32/F32/Bool, plus packed v2/v4 U32/I32/F32 chains when the descriptor presents unit-stride adjacent accesses)
  • BinOpKind for the common arithmetic/logic set
  • UnOpKind for Negate / LogicalNot / BitNot
  • Cast between scalar types
  • Select, Fma
  • StructuredIfThen, StructuredIfThenElse, StructuredBlock, Region, Return, workgroup-scope Barrier
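The packed v2/v4 case in the list above depends on the descriptor presenting unit-stride adjacent accesses. A minimal sketch of how such a chain could be classified follows. The helpers `packed_width` and `load_opcode` are hypothetical and not part of this crate, and the sketch assumes the buffer base is aligned to the vector size, so only the starting element index needs checking.

```rust
/// Hypothetical helper (not the crate's API): pick the widest PTX vector
/// access (4, 2, or 1 elements) for the leading run of `indices`.
/// `indices` are element indices into a single buffer, in access order.
/// Returns 0 for an empty chain.
fn packed_width(indices: &[u32]) -> u32 {
    if indices.is_empty() {
        return 0;
    }
    // Length of the leading run of unit-stride, adjacent indices.
    let mut run = 1usize;
    while run < indices.len() && indices[run - 1].checked_add(1) == Some(indices[run]) {
        run += 1;
    }
    // PTX vector accesses must be aligned to the full vector size, so the
    // first element index has to be a multiple of the chosen width
    // (assuming the buffer base itself is suitably aligned).
    let first = indices[0];
    for width in [4u32, 2] {
        if run >= width as usize && first % width == 0 {
            return width;
        }
    }
    1
}

/// Hypothetical helper: spell the global-load opcode for a width and a
/// PTX scalar type suffix such as "u32", "s32", or "f32".
fn load_opcode(width: u32, ty: &str) -> String {
    match width {
        2 | 4 => format!("ld.global.v{width}.{ty}"),
        _ => format!("ld.global.{ty}"),
    }
}
```

With these definitions, a chain over elements 0 through 3 would select a v4 access, while a chain starting at element 1 would fall back to scalar loads because no vector width is aligned there.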

Out of scope (returns EmitError::UnsupportedOp or EmitError::InvalidDescriptor): indirect dispatch (a host-side concern), MemoryOrdering::GridSync (until a native cooperative-grid lowering is wired), and any descriptor form without a PTX-safe lowering.
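To illustrate the split between supported and out-of-scope forms, here is a hedged sketch of a lowering step that either yields a PTX fragment or reports an error. Every type here is a local stand-in: the real `EmitError` variant payloads are not documented on this page, so the `String` payloads are an assumption, and `BarrierScope`, `lower_barrier`, and `local_id_register` are hypothetical names.

```rust
/// Stand-in for the crate's EmitError. The variant names come from the
/// docs above; the String payloads are an assumption for illustration.
#[derive(Debug, PartialEq)]
enum EmitError {
    UnsupportedOp(String),
    InvalidDescriptor(String),
}

/// Hypothetical barrier scope. `Grid` plays the role of the grid-sync
/// ordering that has no lowering yet.
#[derive(Debug, Clone, Copy)]
enum BarrierScope {
    Workgroup,
    Grid,
}

/// Lower a barrier to PTX text, or refuse if the scope is out of coverage.
fn lower_barrier(scope: BarrierScope) -> Result<&'static str, EmitError> {
    match scope {
        // `bar.sync 0;` synchronizes all threads of the CTA on barrier 0.
        BarrierScope::Workgroup => Ok("bar.sync 0;"),
        BarrierScope::Grid => Err(EmitError::UnsupportedOp(
            "grid-scope barrier has no lowering in this sketch".to_string(),
        )),
    }
}

/// Map a local-invocation axis (0/1/2) to the PTX special register that
/// holds the thread index; any other axis is a malformed descriptor.
fn local_id_register(axis: u32) -> Result<&'static str, EmitError> {
    match axis {
        0 => Ok("%tid.x"),
        1 => Ok("%tid.y"),
        2 => Ok("%tid.z"),
        other => Err(EmitError::InvalidDescriptor(format!(
            "axis {other} is outside 0..=2"
          ))),
    }
}
```

Returning an error instead of emitting a best-effort fallback keeps unsupported forms visible to the caller, which matches the out-of-scope behaviour described above.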

§PTX output shape

//
// Generated by vyre-emit-ptx (target sm_70)
//
.version 7.0
.target sm_70
.address_size 64

.visible .entry main(
    .param .u64 _arg_<binding_name>
)
{
    .reg .pred  %p<N>;
    .reg .u32   %r<N>;
    .reg .s32   %s<N>;
    .reg .f32   %f<N>;
    .reg .u64   %rd<N>;

    <body>

    ret;
}
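The output shape above can be reproduced with plain string building. The following is a sketch under stated assumptions, not the emitter's actual implementation: `ptx_skeleton` and `REG_BANKS` are hypothetical names, the `.version 7.0` line is copied from the template, and skipping register banks with a zero count is a choice made for this example.

```rust
/// Register banks in the order shown in the template: (PTX type, name prefix).
const REG_BANKS: [(&str, &str); 5] = [
    (".pred", "%p"),
    (".u32", "%r"),
    (".s32", "%s"),
    (".f32", "%f"),
    (".u64", "%rd"),
];

/// Hypothetical helper: assemble a PTX module with one `main` entry.
/// `sm` is the compute capability number (70 for sm_70), `bindings` are
/// buffer names passed as 64-bit pointers, `reg_counts` gives the number
/// of registers per bank in REG_BANKS order, and `body` holds instruction
/// lines without indentation.
fn ptx_skeleton(sm: u32, bindings: &[&str], reg_counts: [u32; 5], body: &[&str]) -> String {
    let mut out = String::new();
    out.push_str(&format!("//\n// Generated by vyre-emit-ptx (target sm_{sm})\n//\n"));
    out.push_str(&format!(".version 7.0\n.target sm_{sm}\n.address_size 64\n\n"));

    // One `.param .u64` per binding, comma-separated.
    let params: Vec<String> = bindings
        .iter()
        .map(|name| format!("    .param .u64 _arg_{name}"))
        .collect();
    out.push_str(&format!(".visible .entry main(\n{}\n)\n{{\n", params.join(",\n")));

    // `%r<N>` declares registers %r0 through %r(N-1); skip empty banks.
    for ((ty, prefix), count) in REG_BANKS.iter().zip(reg_counts) {
        if count == 0 {
            continue;
        }
        out.push_str(&format!("    .reg {ty:<6} {prefix}<{count}>;\n"));
    }
    out.push('\n');

    for line in body {
        out.push_str(&format!("    {line}\n"));
    }
    out.push_str("\n    ret;\n}\n");
    out
}
```

For example, calling `ptx_skeleton(70, &["input", "output"], [2, 8, 0, 4, 6], &["mov.u32 %r0, %tid.x;"])` would produce a module with two pointer parameters and no `.s32` register declaration.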

Modules§

patterns
PTX-specific emit-time patterns.

Structs§

ComputeCapability
Target compute capability for PTX emit. Defaults to sm_70 (Volta), the broad-compatibility floor for the shipped PTX op set.
PtxEmitOptions
CUDA PTX emission knobs that affect instruction selection but not descriptor semantics.

Enums§

EmitError

Functions§

emit
emit_optimized
Emit PTX text from a KernelDescriptor after running the full vyre_lower::rewrites::run_all optimization pipeline. Recommended over emit for production use: the optimized output has fewer dead instructions, fewer redundant loads, and lower register pressure.
emit_optimized_with_stats
Like emit_optimized but also returns vyre_lower::rewrites::OptimizationStats.
emit_optimized_with_target
Same as emit_with_target but runs the optimization pipeline first.
emit_optimized_with_target_with_stats
The most complete variant: runs the optimization pipeline first, targets a specific compute capability, and returns OptimizationStats. Combines emit_optimized_with_target and emit_optimized_with_stats.
emit_with_options
emit_with_target