Expand description
Batch normalization kernel.
Matches batchnorm-kernel-v1.yaml.
Training mode: computes per-channel mean/variance from the batch, normalizes, applies affine transform, and updates running statistics via EMA.
Inference mode: uses running mean/variance directly for normalization.
Input layout: N*C flattened, where N = batch size, C = channels.
Element (n, c) is at index n * c_count + c.
Functions§
- batchnorm_
avx2 ⚠ - AVX2 BatchNorm – delegates to scalar.
- batchnorm_
ptx - PTX assembly for BatchNorm kernel (training mode).
- batchnorm_
scalar - Scalar reference implementation of BatchNorm.