# Classification Loss

Classification loss is used in models that learn to identify one or more classes from a fixed set of classes.  For example, determining the digit of a handwritten character in the MNIST dataset is a classification problem.

Typically, cross-entropy based loss functions are used for classification problems, where cross-entropy measures the difference between two probability distributions.[1]
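The cross-entropy between a true distribution p and a predicted distribution q can be sketched in a few lines of plain Python (MyCaffe itself is a C# framework; this is a framework-agnostic illustration, not MyCaffe code):

```python
import math

def cross_entropy(p, q):
    """Cross-entropy H(p, q) = -sum_i p[i] * log(q[i]).

    p: true distribution (e.g. a one-hot label vector)
    q: predicted distribution (entries > 0, summing to 1)
    """
    # Skip zero-probability targets to avoid 0 * log(q) issues.
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

# A one-hot target against a confident, correct prediction gives a low loss;
# the same target against a wrong prediction gives a much higher loss.
p = [0.0, 1.0, 0.0]                         # true class is index 1
good = cross_entropy(p, [0.1, 0.8, 0.1])    # ≈ 0.223
bad = cross_entropy(p, [0.8, 0.1, 0.1])     # ≈ 2.303
```

The farther the predicted distribution is from the true distribution, the larger the cross-entropy, which is exactly what a training loss needs to penalize.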

There are three main types of classification: Binary, Multi-Class and Multi-Label.  Raúl Gómez has a great explanation of classification losses in his blog, “Understanding Categorical Cross-Entropy Loss, Binary Cross-Entropy Loss, Softmax Loss, Logistic Loss, Focal Loss and all those confusing names”.

##### Binary Classification

Binary classification models learn to identify which of two classes is presented in the input.  For example, the model may learn whether or not an item appears within an input image.  For a great visual explanation of binary classification, see “Understanding binary cross entropy/log loss: a visual explanation” by Daniel Godoy.

Binary Cross Entropy Loss – the binary cross entropy loss penalizes the difference between the predicted probability and the true binary label, teaching the model which of two classes an input belongs to.
MyCaffe layer: SoftmaxCrossEntropyLayer
MyCaffe layer: SigmoidCrossEntropyLayer
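A minimal sketch of sigmoid cross-entropy for a single example, in plain Python (illustrative only; the function name and shape are assumptions, not the MyCaffe API — the SigmoidCrossEntropyLayer combines the sigmoid and the cross-entropy in one numerically stable step):

```python
import math

def sigmoid(x):
    """Squash a raw score (logit) into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def binary_cross_entropy(logit, y):
    """Sigmoid cross-entropy for one example.

    logit: raw model output; y: true label, 0 or 1.
    """
    p = sigmoid(logit)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# A strongly positive logit paired with a positive label gives a small loss;
# the same logit paired with a negative label gives a large loss.
low = binary_cross_entropy(4.0, 1)
high = binary_cross_entropy(4.0, 0)
```

Fusing the sigmoid with the loss (as the layer does) avoids computing `log(sigmoid(x))` in two steps, which overflows for large negative logits.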

Softmax Loss – the softmax loss uses a softmax to learn the probability of an input falling into one class or another.  A binary classification problem would have two classes.  This loss can also be used in multi-class classification.
MyCaffe layer: SoftmaxLossLayer
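The softmax converts raw per-class scores into a probability distribution, and the loss is the negative log-probability of the true class.  A pure-Python sketch (illustrative, not the MyCaffe implementation):

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def softmax_loss(logits, label):
    """Negative log-likelihood of the true class under the softmax."""
    return -math.log(softmax(logits)[label])

probs = softmax([2.0, 1.0, 0.1])      # highest score -> highest probability
loss = softmax_loss([2.0, 1.0, 0.1], 0)
```

Because the softmax probabilities sum to 1, raising the probability of one class necessarily lowers the others, which is why this loss suits problems where exactly one class is correct.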

Hinge Loss (Hinge) – the hinge loss “incorporates a margin or distance from the classification boundary into the cost function.”[2] This loss can also be used in multi-label classification.
MyCaffe layer: HingeLossLayer
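The margin idea can be seen in the standard binary hinge loss, sketched below in plain Python (the HingeLossLayer itself implements a multi-class variant; this simplified binary form is for illustration only):

```python
def hinge_loss(score, y):
    """Standard binary hinge loss: max(0, 1 - y * score), with y in {-1, +1}.

    Correct predictions beyond the unit margin incur zero loss; predictions
    inside the margin, or on the wrong side, are penalized linearly.
    """
    return max(0.0, 1.0 - y * score)

inside = hinge_loss(0.3, 1)    # correct side but inside the margin
outside = hinge_loss(2.0, 1)   # beyond the margin -> zero loss
wrong = hinge_loss(-0.5, 1)    # wrong side of the boundary -> large loss
```

Unlike cross-entropy, the hinge loss stops penalizing once an example is confidently correct, which encourages a wide separating margin rather than ever-higher confidence.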

See the Binary Classification Loss sample on Github for an example on how these loss layers are used to solve a simple binary classification problem.

##### Multi-Class Classification

Multi-class (also called categorical) classification models learn to identify the single class, out of a set of multiple classes, to which an item belongs.  For example, models used to detect handwritten digits in the MNIST dataset are multi-class classification models.
MyCaffe layer: SoftmaxLossLayer
MyCaffe layer: SoftmaxCrossEntropyLayer
MyCaffe layer: SigmoidCrossEntropyLayer
MyCaffe layer: HingeLossLayer
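For a multi-class batch, the per-example softmax cross-entropy losses are averaged.  A small pure-Python sketch (the batch values and function name are illustrative assumptions, not MyCaffe code):

```python
import math

def softmax_cross_entropy(logits, label):
    """-log softmax(logits)[label], computed in a numerically stable way."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

# A tiny 3-class "batch": each entry is (raw class scores, true class index).
batch = [([3.0, 0.5, 0.2], 0),   # confident and correct -> low loss
         ([0.2, 0.3, 0.1], 2)]   # nearly uniform scores -> loss near log(3)
mean_loss = sum(softmax_cross_entropy(z, y) for z, y in batch) / len(batch)
```

Note that a model with no information (uniform scores over N classes) incurs a loss of about log(N) per example, a useful sanity check when training starts.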

See the Multi-Class Classification Loss sample on Github for an example on how these loss layers are used to solve a simple multi-class classification problem.

##### Multi-Label Classification

Multi-label classification models learn to identify more than one class of items within a given input.  For example, such a model may identify both cats and dogs within input images, where the available classes to choose from include cats, dogs, cars and trees.
MyCaffe layer: SigmoidCrossEntropyLayer
MyCaffe layer: HingeLossLayer
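Because each class is an independent yes/no decision in the multi-label setting, the sigmoid cross-entropy is applied per class and summed, so several classes can be "on" at once.  A plain-Python sketch (class names and values are illustrative assumptions):

```python
import math

def multi_label_loss(logits, targets):
    """Sum of per-class sigmoid cross-entropies.

    Each class gets its own independent sigmoid, so unlike the softmax,
    multiple classes may have high probability simultaneously.
    """
    total = 0.0
    for z, y in zip(logits, targets):
        p = 1.0 / (1.0 + math.exp(-z))
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total

# Classes: [cat, dog, car, tree]; this image contains a cat AND a dog.
targets = [1, 1, 0, 0]
good = multi_label_loss([3.0, 2.5, -3.0, -2.0], targets)   # matches labels
bad = multi_label_loss([-3.0, 2.5, 3.0, -2.0], targets)    # cat/car swapped
```

This is why the SigmoidCrossEntropyLayer, rather than a softmax-based loss, is the natural fit for multi-label problems: a softmax would force the class probabilities to compete for a single winner.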

See the Multi-Label Classification Loss sample on Github for an example on how these loss layers are used to solve a simple multi-label classification problem.

[1] Programmathically, An Introduction to Neural Network Loss Functions, 2021.
[2] Programmathically, Understanding Hinge Loss and the SVM Cost Function, 2021.