| Combined Softmax and Cross-Entropy
| loss operator. The operator first computes
| the softmax-normalized values for each
| example in the batch of the given input,
| then computes the cross-entropy loss.
|
| This operator is numerically more stable
| than separate Softmax and CrossEntropy
| ops. The inputs are a 2-D tensor logits
| of size (batch_size x input_feature_dimensions),
| which represents the unscaled log probabilities,
| and a 1-dimensional integer labels
| tensor for ground truth.
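|
| The stability claim can be illustrated with a minimal NumPy sketch.
| This is not the operator's implementation; the function names
| naive_loss and fused_loss are hypothetical, and the fused version
| assumes the common trick of subtracting the row maximum before
| exponentiating.
|
```python
import numpy as np

def naive_loss(logits, labels):
    # Separate steps: softmax first, then the log of the picked probability.
    exp = np.exp(logits)  # overflows to inf for large logits
    probs = exp / exp.sum(axis=1, keepdims=True)
    return -np.log(probs[np.arange(len(labels)), labels])

def fused_loss(logits, labels):
    # Fused form: -x[class] + log(sum_j exp(x[j])), shifted by the row max.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_sum_exp = np.log(np.exp(shifted).sum(axis=1))
    return -shifted[np.arange(len(labels)), labels] + log_sum_exp

logits = np.array([[1000.0, 0.0], [2.0, 1.0]])
labels = np.array([0, 1])

# Silence the overflow/invalid warnings the naive version triggers.
with np.errstate(over="ignore", invalid="ignore"):
    naive = naive_loss(logits, labels)
fused = fused_loss(logits, labels)
```
|
| With a logit of 1000, the separate steps produce NaN for the first
| example, while the fused form stays finite for both.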
|
| An optional third input blob (weight_tensor)
| can be used to weight the samples for
| the loss, which is useful if the training
| set is unbalanced.
|
| This operator outputs a softmax tensor
| that contains the probability of
| each label for each example (same shape
| as the logits input), and a scalar loss
| value, which is the averaged cross-entropy
| loss between the softmax probabilities
| and the ground truth values. Set the parameter
| label_prob=1 to pass the labels
| as a probability distribution.
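|
| The two outputs and the label_prob switch can be sketched in NumPy
| as follows. The function softmax_with_loss is a hypothetical
| stand-in, not the operator itself. It assumes that probabilistic
| labels have the same (N, D) shape as the logits and that the scalar
| loss is a plain mean over the batch.
|
```python
import numpy as np

def softmax_with_loss(logits, labels, label_prob=False):
    """Return (softmax, averaged loss) for 2-D logits of shape (N, D)."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_softmax = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    softmax = np.exp(log_softmax)
    if label_prob:
        # labels has shape (N, D); each row is a probability distribution.
        per_example = -(labels * log_softmax).sum(axis=1)
    else:
        # labels has shape (N,) and holds integer class indices.
        per_example = -log_softmax[np.arange(logits.shape[0]), labels]
    return softmax, per_example.mean()

logits = np.array([[2.0, 1.0, 0.1], [0.5, 2.5, 0.3]])
hard = np.array([0, 1])
one_hot = np.eye(3)[hard]  # the same labels written as distributions

probs, loss_hard = softmax_with_loss(logits, hard)
_, loss_soft = softmax_with_loss(logits, one_hot, label_prob=True)
```
|
| One-hot distributions reproduce the integer-label loss, so
| loss_hard and loss_soft agree in this example.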
|
| Softmax cross-entropy loss function:
|
| $$loss(x, class) = -\log{\biggl(\frac{\exp(x[class])}{\sum_{j}
| \exp(x[j])}\biggr)} = -x[class] +
| \log{\biggl(\sum_{j} \exp(x[j])\biggr)}$$
|
| or, if the weight_tensor has been passed:
|
| $$loss(x, class) = weight[class]\biggl(-x[class]
| + \log{\biggl(\sum_{j} \exp(x[j])\biggr)}\biggr)$$
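|
| As a rough check of the weighted formula, the sketch below scales
| each example's loss by a weight. It assumes one weight per example,
| following the description of weight_tensor as weighting the samples.
| The helper name weighted_example_losses is hypothetical, and the
| reduction to a scalar is left out because the normalization is not
| spelled out here.
|
```python
import numpy as np

def weighted_example_losses(logits, labels, weights):
    # Per-example loss: w * (-x[class] + log(sum_j exp(x[j]))).
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_sum_exp = np.log(np.exp(shifted).sum(axis=1))
    picked = shifted[np.arange(len(labels)), labels]
    return weights * (-picked + log_sum_exp)

logits = np.array([[2.0, 1.0], [0.0, 3.0]])
labels = np.array([0, 0])

unweighted = weighted_example_losses(logits, labels, np.ones(2))
weighted = weighted_example_losses(logits, labels, np.array([1.0, 0.5]))
```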
|
| The logits input does not need to explicitly
| be a 2D tensor; rather, it will be coerced
| into one. An arbitrary n-dimensional
| tensor X of shape $[a_0, a_1, …, a_{k-1},
| a_k, …, a_{n-1}]$, where k is the axis
| provided, will be coerced into
| a 2-dimensional tensor with dimensions
| $[(a_0 * … * a_{k-1}), (a_k * … * a_{n-1})]$.
| For the default case where axis=1,
| the X tensor will be coerced into a
| 2D tensor of dimensions $[a_0, (a_1
| * … * a_{n-1})]$, where $a_0$ is often
| the batch size. In this situation, we
| must have $a_0 = N$ (the batch size) and
| $a_1 * … * a_{n-1} = D$ (the input feature
| dimension). Each of these dimensions must be
| matched correctly, or else the operator
| will throw errors.
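|
| The coercion rule can be mimicked with a NumPy reshape. This is
| only an illustration of the shape arithmetic: coerce_to_2d is a
| hypothetical helper, not part of the operator's API, and it assumes
| a non-negative axis.
|
```python
import numpy as np

def coerce_to_2d(x, axis=1):
    # Collapse the dims before `axis` into rows and the rest into columns.
    rows = int(np.prod(x.shape[:axis]))
    cols = int(np.prod(x.shape[axis:]))
    return x.reshape(rows, cols)

x = np.zeros((4, 3, 5, 2))
default_view = coerce_to_2d(x)        # axis=1: rows a_0, cols a_1 * a_2 * a_3
axis2_view = coerce_to_2d(x, axis=2)  # rows a_0 * a_1, cols a_2 * a_3
```
|
| For the shape (4, 3, 5, 2), the default axis yields N = 4 and
| D = 30, so the labels tensor would need 4 entries.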
|
| Github Links:
|
| - https://github.com/pytorch/pytorch/blob/master/caffe2/operators/softmax_with_loss_op.cc
|