Function batchnorm_ptx 

pub fn batchnorm_ptx() -> &'static str

PTX assembly for BatchNorm kernel (training mode).

The kernel uses one thread block per channel. Each block reduces across the batch dimension to compute that channel's mean and variance, then normalizes the channel's values.
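The PTX itself is not reproduced on this page. As a rough aid for reasoning about what each block computes, here is a minimal CPU sketch of the same per-channel computation. It is not the crate's implementation: the function name `batchnorm_reference`, the row-major `[batch, channels]` layout, the biased (divide-by-batch) variance, the `eps` parameter, and the absence of a learned scale and shift are all assumptions made for illustration.

```rust
/// Hypothetical CPU reference for training-mode batch normalization.
/// Assumes `x` is row-major with shape [batch, channels]; this layout is an
/// assumption and may differ from what the PTX kernel expects.
fn batchnorm_reference(x: &[f32], batch: usize, channels: usize, eps: f32) -> Vec<f32> {
    assert!(batch > 0, "batch must be non-zero");
    assert_eq!(x.len(), batch * channels, "input length must equal batch * channels");

    let mut y = vec![0.0f32; x.len()];

    // The outer loop plays the role of "one block per channel".
    for c in 0..channels {
        // First reduction across the batch dimension: per-channel mean.
        let mut sum = 0.0f32;
        for n in 0..batch {
            sum += x[n * channels + c];
        }
        let mean = sum / batch as f32;

        // Second reduction: per-channel biased variance (assumed; divides by batch).
        let mut sq_sum = 0.0f32;
        for n in 0..batch {
            let d = x[n * channels + c] - mean;
            sq_sum += d * d;
        }
        let var = sq_sum / batch as f32;

        // Normalize every element of this channel.
        let inv_std = 1.0 / (var + eps).sqrt();
        for n in 0..batch {
            y[n * channels + c] = (x[n * channels + c] - mean) * inv_std;
        }
    }
    y
}
```

A reference like this can be useful for checking GPU output against known values on small inputs. On the GPU, the two inner reduction loops would typically be carried out cooperatively by the threads of a block rather than sequentially.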