pub struct PtxKernel {
pub name: String,
pub params: Vec<PtxParam>,
pub body: Vec<PtxInstruction>,
pub registers: Vec<Register>,
pub shared_decls: Vec<SharedDecl>,
}
A PTX kernel function (.visible .entry).
Built by constructing parameters, allocating registers, and pushing
instructions. Call set_registers with the
allocator’s output before emission so the kernel knows which .reg
declarations to emit.
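The build order described above can be sketched as follows. This is a minimal illustration, not the crate's implementation: `Register` and `PtxInstruction` are simplified stand-ins that hold plain strings, `build_demo_kernel` is a hypothetical helper, and the sketch assumes `set_registers` replaces the list rather than appending to it. Parameters and shared declarations are left out to keep it short.

```rust
// Stand-in types for illustration only; the crate's real Register and
// PtxInstruction carry more structure than a plain string.
#[derive(Debug, Clone, PartialEq)]
pub struct Register(pub String); // hypothetical: just a register name

#[derive(Debug, Clone, PartialEq)]
pub struct PtxInstruction(pub String); // hypothetical: pre-rendered PTX text

// Simplified mirror of the documented struct (params and shared_decls omitted).
#[derive(Debug, Default)]
pub struct PtxKernel {
    pub name: String,
    pub body: Vec<PtxInstruction>,
    pub registers: Vec<Register>,
}

impl PtxKernel {
    /// Append an instruction to the kernel body.
    pub fn push(&mut self, instr: PtxInstruction) {
        self.body.push(instr);
    }

    /// Assumption: setting the register list replaces any previous list.
    pub fn set_registers(&mut self, regs: Vec<Register>) {
        self.registers = regs;
    }
}

/// Hypothetical helper: push instructions first, then hand over the
/// register list so `.reg` declarations can be emitted later.
pub fn build_demo_kernel() -> PtxKernel {
    let mut kernel = PtxKernel {
        name: "demo".to_string(),
        ..Default::default()
    };
    kernel.push(PtxInstruction("mov.u32 %r0, %tid.x;".to_string()));
    kernel.push(PtxInstruction("ret;".to_string()));
    // In the real crate this list would come from the register allocator.
    kernel.set_registers(vec![Register("%r0".to_string())]);
    kernel
}
```

The point of the ordering is that the register list is only complete once every instruction has been pushed, so it is set last and before emission.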
Fields
name: String
Kernel entry point name.
params: Vec<PtxParam>
Declared parameters (in signature order).
body: Vec<PtxInstruction>
Instruction body.
registers: Vec<Register>
All registers used, for .reg declaration emission.
shared_decls: Vec<SharedDecl>
Shared memory declarations (emitted after register declarations).
Implementations
impl PtxKernel
pub fn push(&mut self, instr: PtxInstruction)
Append an instruction to the kernel body.
pub fn set_registers(&mut self, regs: Vec<Register>)
Set the register list (from super::register::RegisterAllocator::into_allocated).
Add a shared memory declaration to the kernel preamble.
pub fn stats(&self) -> KernelStats
Compute structural statistics about this kernel’s emitted PTX.
Walks the instruction body and counts instruction types, registers by kind, and declared shared memory. Useful for inspection and comparison between kernel variants.
These are not runtime profiling data — final hardware register allocation and occupancy may differ after CUDA driver compilation.
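To make the idea of structural statistics concrete, here is a rough sketch of the kind of counting described above. Everything in it is an assumption for illustration: `InstrKind`, `RegKind`, `SketchStats`, and `sketch_stats` are hypothetical names, and the real `KernelStats` fields are not shown on this page and may look quite different.

```rust
use std::collections::BTreeMap;

// Hypothetical instruction categories; the crate's own classification may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum InstrKind {
    Arith,
    Memory,
    Control,
}

// Hypothetical register kinds.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum RegKind {
    R32,
    F32,
    Pred,
}

// Hypothetical stand-in for KernelStats.
#[derive(Debug, Default, PartialEq)]
pub struct SketchStats {
    pub instrs_by_kind: BTreeMap<InstrKind, usize>,
    pub regs_by_kind: BTreeMap<RegKind, usize>,
    pub shared_bytes: usize,
}

/// Walk the body once, tally registers by kind, and sum the declared
/// shared memory sizes (given here directly in bytes).
pub fn sketch_stats(
    body: &[InstrKind],
    regs: &[RegKind],
    shared_decl_sizes: &[usize],
) -> SketchStats {
    let mut stats = SketchStats::default();
    for kind in body {
        *stats.instrs_by_kind.entry(*kind).or_insert(0) += 1;
    }
    for kind in regs {
        *stats.regs_by_kind.entry(*kind).or_insert(0) += 1;
    }
    stats.shared_bytes = shared_decl_sizes.iter().sum();
    stats
}
```

Because the counts are derived purely from the in-memory kernel description, two kernel variants can be compared without compiling or launching either one.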