pub struct BCEWithLogitsLoss {}
This loss combines a Sigmoid layer and the BCELoss in a single layer. It is more numerically stable than a plain Sigmoid followed by a BCELoss: by fusing the two operations, we can take advantage of the log-sum-exp trick for numerical stability.
loss(x, y) = -y * log(sigmoid(x)) - (1 - y) * log(1 - sigmoid(x)),
where sigmoid(x) = 1 / (1 + exp(-x)).
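A minimal standalone sketch of the fused, numerically stable computation (my own illustration, not this crate's actual implementation): the loss above algebraically simplifies to `max(x, 0) - x*y + ln(1 + exp(-|x|))`, in which the exponential is always taken of a non-positive number and cannot overflow.

```rust
/// Numerically stable BCE-with-logits for a single (logit, label) pair.
/// Algebraically equal to
///     -y * ln(sigmoid(x)) - (1 - y) * ln(1 - sigmoid(x)),
/// rewritten as max(x, 0) - x*y + ln(1 + exp(-|x|)) so that exp never
/// overflows for large |x|.
fn bce_with_logits(x: f64, y: f64) -> f64 {
    x.max(0.0) - x * y + (-x.abs()).exp().ln_1p()
}

fn main() {
    // Naively, sigmoid(-1000.0) underflows to 0.0 and ln(0.0) is -inf;
    // the rewritten form returns the exact loss instead.
    println!("{}", bce_with_logits(-1000.0, 1.0)); // ~1000: confident and wrong
    println!("{}", bce_with_logits(1000.0, 1.0));  // ~0: confident and right
}
```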
The prediction (logits) comes first; the label comes second.
Implementations
Trait Implementations
The first input is the prediction (logits); the second input is the label. THE ORDER IS IMPORTANT: THE SECOND ARGUMENT WILL NOT RECEIVE A GRADIENT.
Given the forward input value and the backward output_grad, updates the weight gradient and returns the backward input gradient.
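For reference, the gradient of this loss with respect to the logit has the closed form `sigmoid(x) - y`, which is what a backward pass propagates to the prediction input (the label input, per the note above, receives none). A hedged standalone sketch, not this crate's method:

```rust
/// Gradient of BCE-with-logits with respect to the logit x:
///     d/dx loss(x, y) = sigmoid(x) - y.
/// `output_grad` is the upstream gradient applied via the chain rule.
/// Nothing is propagated to the label, matching the note above.
fn bce_with_logits_backward(x: f64, y: f64, output_grad: f64) -> f64 {
    let sigmoid = 1.0 / (1.0 + (-x).exp());
    (sigmoid - y) * output_grad
}
```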
Accesses the weight values.