| `Dropout` takes one input data tensor
| (`X`) and produces two tensor outputs,
| `Y` and `mask`.
|
| If the `is_test` argument is zero
| (default=0), the output `Y` will be the
| input with random elements zeroed.
|
| The probability that a given element
| is zeroed is determined by the `ratio`
| argument.
|
| If the `is_test` argument is set to
| non-zero, the output `Y` is exactly the
| same as the input `X`.
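|
| The two modes can be illustrated with a minimal NumPy sketch. This is not the Caffe2 implementation: the function name `dropout`, the `rng` parameter, the `ratio=0.5` default, and the all-true mask returned in test mode are assumptions made for illustration. The sketch also applies the $\frac{1}{1-ratio}$ training-time scaling described in the note that follows.

```python
import numpy as np

def dropout(X, ratio=0.5, is_test=0, rng=None):
    """Hypothetical NumPy sketch of the Dropout behavior described here."""
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must be in [0, 1)")
    X = np.asarray(X, dtype=np.float32)
    if is_test:
        # Test mode: Y is exactly X. Returning an all-True mask is a
        # choice made by this sketch, not documented operator behavior.
        return X.copy(), np.ones(X.shape, dtype=bool)
    rng = np.random.default_rng() if rng is None else rng
    # Each element is kept with probability 1 - ratio (zeroed with
    # probability ratio).
    mask = rng.random(X.shape) >= ratio
    # Zero the dropped elements and scale the kept ones by 1 / (1 - ratio).
    Y = X * mask / (1.0 - ratio)
    return Y, mask

X = np.ones((2, 3), dtype=np.float32)
Y_train, mask = dropout(X, ratio=0.5, rng=np.random.default_rng(0))
Y_test, _ = dropout(X, ratio=0.5, is_test=1)
```

| With `ratio=0.5` and an all-ones input, every entry of `Y_train` is either 0 or 2, and `Y_test` equals `X`.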
|
| ----------
| @note
|
| Outputs are scaled by a factor of
| $\frac{1}{1 - \text{ratio}}$ during
| training, so that at test time the
| operator can simply compute the identity
| function. This scaling matters because
| the output at test time should equal the
| expected value of the output at training
| time.
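|
| As a short check of the expected-value claim, write $m_i \in \{0, 1\}$ for the indicator that element $i$ is kept (the symbol $m_i$ is notation introduced here, not taken from the operator), so that $P(m_i = 1) = 1 - \text{ratio}$ and $Y_i = \frac{m_i X_i}{1 - \text{ratio}}$ during training. Then, for $\text{ratio} < 1$:
|
| $$
| \mathbb{E}[Y_i] = (1 - \text{ratio}) \cdot \frac{X_i}{1 - \text{ratio}} + \text{ratio} \cdot 0 = X_i
| $$
|
| which matches the test-time output $Y_i = X_i$.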
|
| Dropout has been shown to be an effective
| regularization technique for reducing
| overfitting during training.
|
| GitHub links:
|
| - https://github.com/pytorch/pytorch/blob/master/caffe2/operators/dropout_op.h
|
| - https://github.com/pytorch/pytorch/blob/master/caffe2/operators/dropout_op.cc
|