Crate caffe2op_groupnorm
Structs
- Warning: mu and rsig are for backward usage or reference. They should NOT be used as forward activations, as they have no direct gradients computed.
- Group Normalization (GN) operation: https://arxiv.org/abs/1803.08494
Functions
- Math:
    Y = gamma * (X - mu) * rsig + beta
    let s = gamma * rsig
    let b = beta - gamma * mu * rsig
    Y = s * X + b
    let n = K * HxW
    dL/dX = dL/dY * dY/dX = dL/dY * (d(s * X)/dX + db/dX)
    d(s * X)/dX = s + X * ds/dX = s + gamma * X * drsig/dX
    db/dX = -gamma * mu * drsig/dX - gamma * rsig * dmu/dX
    drsig/dX = -rsig^3 * (X - mu) / n
    dmu/dX = 1 / n
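
The derivation above can be checked numerically with a small sketch. This is not the crate's API: the function names `group_stats`, `group_norm_forward`, and `group_norm_backward`, the `eps` stabilizer, and the use of a single scalar `gamma` and `beta` for the whole group are all assumptions made for illustration. It treats one group as a flat slice of `n` values and sums the per-element terms of the derivation over the group to obtain dL/dX.

```rust
/// Mean `mu` and reciprocal standard deviation `rsig` of one group.
/// `eps` is a hypothetical stabilizer; the page does not name one.
fn group_stats(x: &[f64], eps: f64) -> (f64, f64) {
    let n = x.len() as f64;
    let mu = x.iter().sum::<f64>() / n;
    let var = x.iter().map(|v| (v - mu) * (v - mu)).sum::<f64>() / n;
    (mu, 1.0 / (var + eps).sqrt())
}

/// Forward pass for one group in the fused form Y = s * X + b,
/// with s = gamma * rsig and b = beta - gamma * mu * rsig.
/// Assumption: one scalar gamma/beta for the whole group.
fn group_norm_forward(x: &[f64], gamma: f64, beta: f64, eps: f64) -> Vec<f64> {
    let (mu, rsig) = group_stats(x, eps);
    let s = gamma * rsig;
    let b = beta - gamma * mu * rsig;
    x.iter().map(|v| s * v + b).collect()
}

/// Gradient of the loss with respect to X for one group, given dL/dY.
/// Uses drsig/dX = -rsig^3 * (X - mu) / n and dmu/dX = 1 / n.
fn group_norm_backward(x: &[f64], dy: &[f64], gamma: f64, eps: f64) -> Vec<f64> {
    let n = x.len() as f64;
    let (mu, rsig) = group_stats(x, eps);
    // Group-wide reductions of dL/dY and dL/dY * (X - mu).
    let sum_dy: f64 = dy.iter().sum();
    let sum_dy_xc: f64 = dy.iter().zip(x).map(|(g, v)| g * (v - mu)).sum();
    x.iter()
        .zip(dy)
        .map(|(v, g)| {
            // Direct term and the dmu/dX contribution.
            let through_mu = gamma * rsig * (g - sum_dy / n);
            // Contribution through drsig/dX.
            let through_rsig = gamma * rsig.powi(3) * (v - mu) * sum_dy_xc / n;
            through_mu - through_rsig
        })
        .collect()
}
```

A central finite difference on `group_norm_forward` should agree with `group_norm_backward` to within rounding error. The gradients over a group should also sum to approximately zero, since adding a constant to every X in the group leaves Y unchanged.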