pub struct BCEWithLogitsLoss { /* private fields */ }
This loss combines a sigmoid layer and the BCELoss in one single op. This version is more numerically stable than using a plain sigmoid followed by a BCELoss because, by combining the two operations into one layer, it can take advantage of the log-sum-exp trick for numerical stability.
loss(x, y) = -y · ln(σ(x)) - (1 - y) · ln(1 - σ(x)), where σ(x) = 1 / (1 + exp(-x))
The prediction (logits) comes first; the label comes second.
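To make the numerical-stability claim concrete, here is a minimal standalone sketch, not this crate's API, of the fused formulation; the function name is illustrative only. It uses the algebraically equivalent form max(x, 0) - x·y + ln(1 + exp(-|x|)), which never exponentiates a large positive value, whereas a plain sigmoid-then-BCELoss pipeline takes ln of a quantity that underflows to 0 for large |x| and returns inf.

```rust
// Numerically stable BCE-with-logits for a single logit x and label y in {0, 1}:
//   loss = max(x, 0) - x*y + ln(1 + exp(-|x|))
// This equals -y*ln(sigmoid(x)) - (1 - y)*ln(1 - sigmoid(x)), but exp() is
// only ever called on a non-positive argument, so it cannot overflow.
fn bce_with_logits(x: f64, y: f64) -> f64 {
    x.max(0.0) - x * y + (-x.abs()).exp().ln_1p()
}

fn main() {
    // A naive sigmoid-then-ln computation produces inf for these inputs;
    // the fused form stays finite.
    println!("{}", bce_with_logits(100.0, 0.0));  // ~100.0
    println!("{}", bce_with_logits(-100.0, 1.0)); // ~100.0
    println!("{}", bce_with_logits(0.0, 1.0));    // ln(2) ≈ 0.6931
}
```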
Implementations
Trait Implementations
The first input is the prediction (logits); the second input is the label. Order is important: the second argument will not receive a gradient.
Given the forward input values and the backward output_grad, update the weight gradients and return the backward input gradient (a standalone sketch of this gradient follows this list).
Access the weight values.
The number of inputs needed by this op.
The number of outputs produced by this op.
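As a companion to the backward-pass method above, here is a minimal standalone sketch, again not this crate's API, of the input gradient. For this loss, d loss/d x = σ(x) - y, so the backward input gradient is the incoming output_grad scaled by that factor; the label, as noted above, gets no gradient. The function names are illustrative only.

```rust
// Numerically stable sigmoid: never calls exp() on a large positive value.
fn sigmoid(x: f64) -> f64 {
    if x >= 0.0 {
        1.0 / (1.0 + (-x).exp())
    } else {
        let e = x.exp();
        e / (1.0 + e)
    }
}

// Backward input gradient of BCE-with-logits: output_grad * (sigmoid(x) - y).
// Only the logit x receives a gradient; the label y does not.
fn bce_with_logits_backward(x: f64, y: f64, output_grad: f64) -> f64 {
    output_grad * (sigmoid(x) - y)
}

fn main() {
    // The gradient pushes the logit down when the label is 0, up when it is 1.
    println!("{}", bce_with_logits_backward(2.0, 0.0, 1.0)); // ≈ 0.8808
    println!("{}", bce_with_logits_backward(2.0, 1.0, 1.0)); // ≈ -0.1192
}
```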