Structs

  • Spatial batch normalization’s gradient, depending on the various input sizes, is a bit more complex than usual gradient operators.
  • Applies spatial batch normalization to the input tensor as described in the original paper, [Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift](https://arxiv.org/abs/1502.03167).

    Be aware that this operator has two different output sets, depending on the value of is_test. According to the paper, the primary operation of spatial batch normalization is:

    $$Y = \frac{X - \mu_x}{\sqrt{\sigma^2_{x} + \epsilon}} * \gamma + b$$

    In the equation, $\mu_x$ is the mean, $X$ is the input data, $\sigma^2_{x}$ is the variance, $\epsilon$ is epsilon, $\gamma$ is the scale, $b$ is the bias, and $Y$ is the output data.

    The momentum arg also affects this calculation, through the computation of the running mean and variance. The influence of momentum is as follows:

    $$running\_mean = running\_mean * momentum + mean * (1 - momentum)$$

    $$running\_var = running\_var * momentum + var * (1 - momentum)$$

    Output when is_test = 0 (train mode): Y, mean, var, saved_mean, saved_var

    Output when is_test = 1 (test mode): Y

    Github Links:

    - https://github.com/pytorch/pytorch/blob/master/caffe2/operators/spatial_batch_norm_op.cc
    - https://github.com/pytorch/pytorch/blob/master/caffe2/operators/spatial_batch_norm_op.h
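
The formulas above can be illustrated with a minimal pure-Python sketch. It is not the operator's actual implementation: the function names `batch_norm_train` and `batch_norm_test`, the nested-list NCHW layout, the use of the biased (population) variance, and the default values for `momentum` and `epsilon` are all assumptions made for illustration. The sketch also assumes that test mode normalizes with the running statistics rather than with batch statistics.

```python
import math


def batch_norm_train(x, gamma, b, running_mean, running_var,
                     momentum=0.9, epsilon=1e-5):
    """Train-mode sketch. x is a nested list shaped [N][C][H][W] (assumed layout).

    Returns (y, new_running_mean, new_running_var, batch_mean, batch_var).
    """
    n_channels = len(x[0])
    # Output has the same shape as the input.
    y = [[[[0.0 for _ in row] for row in chan] for chan in sample] for sample in x]
    new_mean, new_var, batch_mean, batch_var = [], [], [], []
    for c in range(n_channels):
        # Statistics are taken per channel, over the batch and spatial axes.
        values = [v for sample in x for row in sample[c] for v in row]
        mu = sum(values) / len(values)
        var = sum((v - mu) ** 2 for v in values) / len(values)  # biased variance (assumption)
        inv_std = 1.0 / math.sqrt(var + epsilon)
        for i, sample in enumerate(x):
            for j, row in enumerate(sample[c]):
                for k, v in enumerate(row):
                    # Y = (X - mu) / sqrt(var + epsilon) * gamma + b
                    y[i][c][j][k] = (v - mu) * inv_std * gamma[c] + b[c]
        # running = running * momentum + batch_statistic * (1 - momentum)
        new_mean.append(running_mean[c] * momentum + mu * (1 - momentum))
        new_var.append(running_var[c] * momentum + var * (1 - momentum))
        batch_mean.append(mu)
        batch_var.append(var)
    return y, new_mean, new_var, batch_mean, batch_var


def batch_norm_test(x, gamma, b, running_mean, running_var, epsilon=1e-5):
    """Test-mode sketch: normalize with the running statistics and return only Y."""
    return [
        [
            [
                [(v - running_mean[c]) / math.sqrt(running_var[c] + epsilon)
                 * gamma[c] + b[c] for v in row]
                for row in chan
            ]
            for c, chan in enumerate(sample)
        ]
        for sample in x
    ]


# One sample, one channel, 2x2 spatial grid.
x = [[[[1.0, 2.0], [3.0, 4.0]]]]
y, run_mean, run_var, mean, var = batch_norm_train(
    x, gamma=[1.0], b=[0.0], running_mean=[0.0], running_var=[1.0])
```

In this sketch the five values returned in train mode loosely mirror the five train-mode outputs listed above, while the test-mode function returns only the normalized tensor.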

Functions