Function compile_ptx_arch 

pub fn compile_ptx_arch<S: AsRef<str>>(source: S) -> Result<Ptx, GpuError>
Compile a kernel source string to PTX using the same device-keyed NVRTC options that PtxModuleCache::get_or_compile uses. The crucial option is the --gpu-architecture pin (#1551): without it, NVRTC defaults to a target below sm_60 and rejects atomicAdd(double*, double). Call sites that compile via the bare cudarc::nvrtc::compile_ptx (which passes no options) must route through this function instead when their kernel uses double atomics; otherwise the kernel fails to compile and the device path silently falls back to the CPU.
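
To make the architecture pin concrete, here is a minimal sketch of how such an option string could be derived from a device's compute capability. The helper names `arch_option` and `supports_double_atomic_add` are hypothetical and are not part of this crate; the actual option construction inside compile_ptx_arch may differ. The sketch assumes the `compute_XY` spelling of the NVRTC flag and relies on the CUDA rule that the native `atomicAdd(double*, double)` overload requires compute capability 6.0 or higher.

```rust
/// Hypothetical helper (not this crate's API): build the NVRTC architecture
/// option for a device with compute capability `major.minor`.
/// For example, capability 8.6 yields "--gpu-architecture=compute_86".
fn arch_option(major: u32, minor: u32) -> String {
    format!("--gpu-architecture=compute_{}{}", major, minor)
}

/// Hypothetical helper: the native `atomicAdd(double*, double)` overload is
/// only available when compiling for compute capability 6.0 or higher.
fn supports_double_atomic_add(major: u32) -> bool {
    major >= 6
}
```

A caller could use `supports_double_atomic_add` to decide up front whether a double-atomics kernel is worth compiling for the current device, rather than discovering the problem through a compile error.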