Classification Loss

Classification loss is used in models that learn to identify one or more classes from a set of possible classes.  For example, determining which digit a handwritten character in the MNIST dataset represents is a classification problem.

Typically, cross-entropy based loss functions are used for classification problems, where cross-entropy measures the difference between two probability distributions.[1]
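
For two distributions p (the target) and q (the model's prediction), the cross-entropy is H(p, q) = -sum over x of p(x) * log q(x).  As a minimal illustration, here is a NumPy sketch (not MyCaffe code; the function name and array values are made-up for this example):

import numpy as np

def cross_entropy(p, q, eps=1e-12):
    # H(p, q) = -sum(p * log(q)); eps guards against log(0)
    return -np.sum(p * np.log(q + eps))

p = np.array([0.0, 1.0, 0.0])   # one-hot target: the true class is class 1
q = np.array([0.1, 0.7, 0.2])   # predicted probabilities
print(cross_entropy(p, q))      # ~0.357, i.e. -log(0.7)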

There are three main types of classification: Binary, Multi-Class and Multi-Label.  Raúl Gómez has a great explanation of classification losses in his blog post, “Understanding Categorical Cross-Entropy Loss, Binary Cross-Entropy Loss, Softmax Loss, Logistic Loss, Focal Loss and all those confusing names”.

Binary Classification

Binary classification models learn to identify which of two classes is presented in the input.  For example, the model may learn whether or not an item appears within an input image.  For a great visual explanation of binary classification, see “Understanding binary cross entropy/log loss: a visual explanation” by Daniel Godoy.

Binary Cross Entropy Loss – the binary cross entropy loss is used to learn whether an input belongs to one of two classes, by penalizing the probability the model assigns to the true class.
MyCaffe layer: SoftmaxCrossEntropyLayer
MyCaffe layer: SigmoidCrossEntropyLayer
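
As an illustration, here is a minimal NumPy sketch of binary cross entropy computed from a raw model output (a logit) passed through a sigmoid; this shows the math only, not the MyCaffe implementation, and the values are made-up:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(logit, y, eps=1e-12):
    # BCE = -[y*log(p) + (1 - y)*log(1 - p)], with p = sigmoid(logit)
    p = sigmoid(logit)
    return -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

print(binary_cross_entropy(2.0, 1))   # ~0.127: confident and correct, small loss
print(binary_cross_entropy(2.0, 0))   # ~2.127: confident and wrong, large loss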

Softmax Loss – the softmax loss uses a softmax to convert the model's raw outputs into a probability of the input falling into one class or another.  A binary problem has two classes.  This loss can also be used in multi-class classification.
MyCaffe layer: SoftmaxLossLayer
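
To make the softmax step concrete, here is a minimal NumPy sketch (illustrative only, not the MyCaffe implementation): the raw class scores are turned into probabilities, and the loss is the negative log probability of the true class.

import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract the max for numerical stability
    return e / e.sum()

def softmax_loss(logits, label):
    # negative log probability assigned to the true class
    return -np.log(softmax(logits)[label])

logits = np.array([1.0, 3.0, 0.5])   # raw scores for 3 classes
print(softmax_loss(logits, 1))       # ~0.197: class 1 gets most of the probability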

Hinge Loss (Hinge) – the hinge loss “incorporates a margin or distance from the classification boundary into the cost function.”[2] This loss can also be used in multi-label classification.
MyCaffe layer: HingeLossLayer
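
Here is a minimal NumPy sketch of the hinge loss for labels y in {-1, +1} (illustrative only, not the MyCaffe implementation); the loss is zero once the score is on the correct side of the boundary by at least the margin of 1:

import numpy as np

def hinge_loss(score, y):
    # max(0, 1 - y * score); y is -1 or +1
    return np.maximum(0.0, 1.0 - y * score)

print(hinge_loss(2.5, +1))    # 0.0: correct side, beyond the margin
print(hinge_loss(0.3, +1))    # 0.7: correct side, inside the margin
print(hinge_loss(-0.5, +1))   # 1.5: wrong side of the boundary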

See the Binary Classification Loss sample on Github for an example of how these loss layers are used to solve a simple binary classification problem.

Multi-Class Classification

Multi-class (also called categorical) classification models learn to identify which single class an item belongs to from a set of multiple classes.  For example, models that recognize the handwritten digits in the MNIST dataset are multi-class classification models.
MyCaffe layer: SoftmaxLossLayer
MyCaffe layer: SoftmaxCrossEntropyLayer
MyCaffe layer: SigmoidCrossEntropyLayer
MyCaffe layer: HingeLossLayer
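
In the multi-class case, the softmax loss shown above generalizes naturally to a batch of inputs; here is a minimal NumPy sketch (illustrative only, not the MyCaffe implementation; the logits and labels are made-up):

import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))   # row-wise, numerically stable
    return e / e.sum(axis=1, keepdims=True)

def categorical_cross_entropy(logits, labels):
    # mean over the batch of -log p(true class)
    probs = softmax(logits)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

logits = np.array([[2.0, 0.5, 0.1],    # batch of 2 inputs, 3 classes
                   [0.2, 0.3, 3.0]])
labels = np.array([0, 2])              # true class index for each input
print(categorical_cross_entropy(logits, labels))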

See the Multi-Class Classification Loss sample on Github for an example of how these loss layers are used to solve a simple multi-class classification problem.

Multi-Label Classification

Multi-label classification models learn to identify more than one class of item within a given input.  For example, such a model may be used to identify both cats and dogs within a set of input images when cats, dogs, cars and trees are the available classes to choose from.
MyCaffe layer: SigmoidCrossEntropyLayer
MyCaffe layer: HingeLossLayer
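
In the multi-label case, each class gets its own independent sigmoid output, and the binary cross-entropy terms are summed across classes.  Here is a minimal NumPy sketch using the cats-and-dogs example above (illustrative only, not the MyCaffe implementation; the scores are made-up):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def multi_label_bce(logits, targets, eps=1e-12):
    # independent sigmoid + binary cross-entropy per class, summed
    p = sigmoid(logits)
    return -np.sum(targets * np.log(p + eps) + (1 - targets) * np.log(1 - p + eps))

logits  = np.array([2.0, 1.5, -3.0, -2.0])   # scores for: cat, dog, car, tree
targets = np.array([1.0, 1.0,  0.0,  0.0])   # the image contains a cat and a dog
print(multi_label_bce(logits, targets))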

See the Multi-Label Classification Loss sample on Github for an example of how these loss layers are used to solve a simple multi-label classification problem.


[1] Programmathically, An Introduction to Neural Network Loss Functions, 2021.
[2] Programmathically, Understanding Hinge Loss and the SVM Cost Function, 2021.
