# Crate caffe2op_crossentropy

## Structs

- This operator computes the cross entropy between an $N \times D$ input data tensor $X$ and an $N \times D$ input label tensor $label$.

  The op produces a single length-$N$ output tensor $Y$. Here, $N$ is the batch size and $D$ is the size of each element in the batch. In practice, it is most commonly used at the end of models as part of the loss computation, after the SoftMax operator and before the AveragedLoss operator. The cross entropy operation is defined as follows:

  $$Y_i = -\sum_j label_{ij} \log(X_{ij})$$

  where $X_{ij}$ is the classifier's prediction for class $j$ on example $i$ of the batch. Each log has a lower limit for numerical stability.

  Github Links:

  - https://github.com/caffe2/caffe2/blob/master/caffe2/operators/cross_entropy_op.h
  - https://github.com/caffe2/caffe2/blob/master/caffe2/operators/cross_entropy_op.cc
- This operator computes the cross entropy between an $N \times D$ input data tensor $X$ and a one-dimensional input label tensor $label$. The op produces a single length-$N$ output tensor $Y$. Here, $N$ is the batch size and $D$ is the size of each element in the batch. In practice, it is most commonly used at the end of models as part of the loss computation, after the SoftMax operator and before the AveragedLoss operator. The cross entropy operation is defined as follows:

  $$Y_i = -\log(X_{ij})$$

  where $X_{ij}$ is the classifier's prediction for the $j$th class (the correct one, $j = label_i$) and $i$ indexes the examples in the batch. Each log has a lower limit for numerical stability.

  The difference between *LabelCrossEntropy* and *CrossEntropy* is how the labels are specified. Here, the labels are a length-$N$ list of integers, whereas in CrossEntropy the labels are an $N \times D$ matrix of one-hot label vectors. However, the results of the computation should be the same.

  Github Links:

  - https://github.com/caffe2/caffe2/blob/master/caffe2/operators/cross_entropy_op.h
  - https://github.com/caffe2/caffe2/blob/master/caffe2/operators/cross_entropy_op.cc
- Given a vector of probabilities, this operator transforms it into a 2-column matrix with complementary probabilities for binary classification. In explicit terms, given the vector X, the output Y is vstack(1 - X, X).

  Hacky: turns a vector of probabilities into a 2-column matrix with complementary probabilities for binary classification.
- Given two matrices, logits and targets, of the same shape (batch_size, num_classes), computes the sigmoid cross entropy between the two.

  Returns a tensor of shape (batch_size,) of losses for each example.

- Given three matrices, logits, targets, and weights, all of the same shape (batch_size, num_classes), computes the weighted sigmoid cross entropy between logits and targets. Specifically, at each position r,c, this computes `weights[r, c] * crossentropy(sigmoid(logits[r, c]), targets[r, c])`, and then averages over each row.

  Returns a tensor of shape (batch_size,) of losses for each example.
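
The formulas described above can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not the crate's implementation: the function names, the `Vec<Vec<f32>>` matrix layout, and the `LOG_THRESHOLD` value are all hypothetical, chosen only to make the definitions concrete. The weighted sigmoid variant uses a common numerically stable rewrite of `-(t * ln(s) + (1 - t) * ln(1 - s))` with `s = sigmoid(z)`, and the two-class helper emits one `[1 - p, p]` row per input probability to match the "2-column matrix" description.

```rust
/// Lower bound applied inside each logarithm (assumed value for this sketch).
const LOG_THRESHOLD: f32 = 1e-20;

/// CrossEntropy: `x` and `label` are both N x D; returns N losses,
/// Y_i = -sum_j label_ij * ln(X_ij).
fn cross_entropy(x: &[Vec<f32>], label: &[Vec<f32>]) -> Vec<f32> {
    x.iter()
        .zip(label)
        .map(|(xi, li)| {
            -xi.iter()
                .zip(li)
                .map(|(&p, &l)| l * p.max(LOG_THRESHOLD).ln())
                .sum::<f32>()
        })
        .collect()
}

/// LabelCrossEntropy: `label[i]` is the index of the correct class for row i;
/// returns Y_i = -ln(X_ij) with j = label[i].
fn label_cross_entropy(x: &[Vec<f32>], label: &[usize]) -> Vec<f32> {
    x.iter()
        .zip(label)
        .map(|(xi, &j)| -xi[j].max(LOG_THRESHOLD).ln())
        .collect()
}

/// Two-class conversion: each probability p becomes the row [1 - p, p].
fn make_two_class(x: &[f32]) -> Vec<[f32; 2]> {
    x.iter().map(|&p| [1.0 - p, p]).collect()
}

/// Weighted sigmoid cross entropy, averaged over each row.
fn weighted_sigmoid_cross_entropy(
    logits: &[Vec<f32>],
    targets: &[Vec<f32>],
    weights: &[Vec<f32>],
) -> Vec<f32> {
    logits
        .iter()
        .zip(targets)
        .zip(weights)
        .map(|((zr, tr), wr)| {
            let total: f32 = zr
                .iter()
                .zip(tr)
                .zip(wr)
                .map(|((&z, &t), &w)| {
                    // Stable form: max(z, 0) - z * t + ln(1 + exp(-|z|)).
                    w * (z.max(0.0) - z * t + (-z.abs()).exp().ln_1p())
                })
                .sum();
            // Average the weighted per-element losses over the row.
            total / zr.len() as f32
        })
        .collect()
}
```

Calling `cross_entropy` with one-hot rows and `label_cross_entropy` with the matching class indices should yield the same losses, which is the equivalence the descriptions above point to. Passing all-ones weights to `weighted_sigmoid_cross_entropy` gives a plain row average; whether that matches the unweighted operator exactly is an assumption of this sketch.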