Margin-based Losses
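Margin-based loss functions are particularly useful for binary classification. In contrast to distance-based losses, they do not depend on the difference between the true target and the prediction. Instead, they penalize a prediction based on how well it agrees with the sign of the target. This is measured by the agreement a = y ⋅ ŷ between the target y ∈ {-1, 1} and the predicted output ŷ, which is the argument of every formula in this section.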
This section lists all the subtypes of MarginLoss that are implemented in this package.
ZeroOneLoss
LossFunctions.ZeroOneLoss — Type
ZeroOneLoss <: MarginLoss
The classical classification loss. It penalizes every misclassified observation with a loss of 1, while every correctly classified observation has a loss of 0. It is neither convex nor continuous and is thus seldom used directly. Instead, one usually works with a classification-calibrated surrogate loss, such as L1HingeLoss.
\[L(a) = \begin{cases} 1 & \quad \text{if } a < 0 \\ 0 & \quad \text{if } a \ge 0\\ \end{cases}\]
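As a minimal sketch of the definition above, the following Python snippet computes the agreement and the zero-one loss for a handful of made-up observations. It does not use the package's API: the function names `agreement` and `zero_one_loss` and the sample data are hypothetical and chosen only for illustration.

```python
def agreement(target, output):
    """Agreement a = y * y_hat for a target y in {-1, +1} and a real-valued output."""
    return target * output


def zero_one_loss(a):
    """Zero-one loss: 1 if the agreement is negative, 0 otherwise."""
    return 1.0 if a < 0 else 0.0


# Hypothetical targets and predicted outputs, for illustration only.
targets = [1, -1, 1, -1]
outputs = [0.8, 0.3, -1.2, -2.0]

losses = [zero_one_loss(agreement(y, o)) for y, o in zip(targets, outputs)]

# Averaging the zero-one loss gives the misclassification rate.
error_rate = sum(losses) / len(losses)
```

The second and third observations have a negative agreement, so they are the two that count as misclassified.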
[Plots: Lossfunction (left) and Derivative (right), each against y * h(x) on [-2, 2].]
PerceptronLoss
LossFunctions.PerceptronLoss — Type
PerceptronLoss <: MarginLoss
The perceptron loss linearly penalizes every prediction where the resulting agreement <= 0. It is Lipschitz continuous and convex, but not strictly convex.
\[L(a) = \max \{ 0, -a \}\]
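The formula above and its slope can be sketched in Python as follows. This is an illustration, not the package's implementation; the function names are hypothetical, and the value returned for the derivative at a = 0, where the loss has a kink, is a convention assumed here.

```python
def perceptron_loss(a):
    """Perceptron loss max(0, -a) as a function of the agreement a."""
    return max(0.0, -a)


def perceptron_deriv(a):
    """Slope with respect to a: -1 for a < 0 and 0 for a > 0.

    At a == 0 the loss is not differentiable; returning 0 there is an
    assumption made for this sketch.
    """
    return -1.0 if a < 0 else 0.0
```

Because the slope only takes the values -1 and 0, the loss is Lipschitz continuous with constant 1, and the flat segment for a > 0 is why it is not strictly convex.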
[Plots: Lossfunction L (left) and Derivative L' (right), each against y ⋅ ŷ on [-2, 2].]
L1HingeLoss
LossFunctions.L1HingeLoss — Type
L1HingeLoss <: MarginLoss
The hinge loss linearly penalizes every prediction where the resulting agreement < 1. It is Lipschitz continuous and convex, but not strictly convex.
\[L(a) = \max \{ 0, 1 - a \}\]
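A short Python sketch of the hinge loss follows. It also checks, on a grid of agreements, that the hinge loss never falls below the zero-one loss, which is one reason it is a convenient convex surrogate. The function names and the grid are hypothetical and chosen for illustration; this is not the package's code.

```python
def l1_hinge_loss(a):
    """Hinge loss max(0, 1 - a) as a function of the agreement a."""
    return max(0.0, 1.0 - a)


def zero_one_loss(a):
    """Zero-one loss: 1 if the agreement is negative, 0 otherwise."""
    return 1.0 if a < 0 else 0.0


# Agreements from -2.0 to 2.0 in steps of 0.1.
grid = [i / 10 for i in range(-20, 21)]

# For a < 0 the hinge loss exceeds 1; for a >= 0 it is non-negative.
is_upper_bound = all(l1_hinge_loss(a) >= zero_one_loss(a) for a in grid)
```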
[Plots: Lossfunction L (left) and Derivative L' (right), each against y ⋅ ŷ on [-2, 2].]
SmoothedL1HingeLoss
LossFunctions.SmoothedL1HingeLoss — Type
SmoothedL1HingeLoss <: MarginLoss
As the name suggests, a smoothed version of the L1 hinge loss. It is Lipschitz continuous and convex, but not strictly convex.
\[L(a) = \begin{cases} \frac{0.5}{\gamma} \cdot \max \{ 0, 1 - a \} ^2 & \quad \text{if } a \ge 1 - \gamma \\ 1 - \frac{\gamma}{2} - a & \quad \text{otherwise}\\ \end{cases}\]
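The piecewise formula above can be sketched in Python as shown below. The snippet also evaluates both branches at the switch point a = 1 - γ to illustrate that they meet there. The function name and the choice γ = 2 are assumptions made for this example, not the package's implementation.

```python
def smoothed_l1_hinge_loss(a, gamma):
    """Smoothed hinge loss: quadratic near the margin, linear further away."""
    if a >= 1.0 - gamma:
        return 0.5 / gamma * max(0.0, 1.0 - a) ** 2
    return 1.0 - gamma / 2.0 - a


gamma = 2.0
knot = 1.0 - gamma  # agreement at which the two branches meet

# Value of the quadratic branch at the knot: (0.5 / gamma) * gamma**2 = gamma / 2.
quadratic_side = 0.5 / gamma * (1.0 - knot) ** 2
# Value of the linear branch at the knot: 1 - gamma / 2 - (1 - gamma) = gamma / 2.
linear_side = 1.0 - gamma / 2.0 - knot
```

Both branches evaluate to γ/2 at the knot, so the loss is continuous there; the slopes also agree (both are -1), which is what makes this a smooth variant of the hinge loss.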
[Plots: Lossfunction L for γ=2 (left) and Derivative L' (right), each against y ⋅ ŷ on [-2, 2].]
ModifiedHuberLoss
LossFunctions.ModifiedHuberLoss — Type
ModifiedHuberLoss <: MarginLoss
A special (4 times scaled) case of the SmoothedL1HingeLoss with γ=2. It is Lipschitz continuous and convex, but not strictly convex.
\[L(a) = \begin{cases} \max \{ 0, 1 - a \} ^2 & \quad \text{if } a \ge -1 \\ - 4 a & \quad \text{otherwise}\\ \end{cases}\]
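The stated relationship to the SmoothedL1HingeLoss can be checked numerically. The sketch below implements both formulas as plain Python functions (hypothetical names, not the package's API) and compares the modified Huber loss with four times the smoothed hinge loss at γ = 2 on a grid of agreements.

```python
def modified_huber_loss(a):
    """Modified Huber loss: quadratic for a >= -1, linear otherwise."""
    if a >= -1.0:
        return max(0.0, 1.0 - a) ** 2
    return -4.0 * a


def smoothed_l1_hinge_loss(a, gamma):
    """Smoothed hinge loss, repeated here so the example is self-contained."""
    if a >= 1.0 - gamma:
        return 0.5 / gamma * max(0.0, 1.0 - a) ** 2
    return 1.0 - gamma / 2.0 - a


# Agreements from -3.0 to 3.0 in steps of 0.25.
grid = [i / 4 for i in range(-12, 13)]

# Largest absolute difference between the two expressions on the grid.
max_gap = max(
    abs(modified_huber_loss(a) - 4.0 * smoothed_l1_hinge_loss(a, 2.0))
    for a in grid
)
```

With γ = 2 the smoothed hinge loss is 0.25 ⋅ max{0, 1 - a}² for a ≥ -1 and -a otherwise, so multiplying by 4 reproduces both branches of the formula above.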
[Plots: Lossfunction L (left) and Derivative L' (right), each against y ⋅ ŷ on [-2, 2].]
DWDMarginLoss
LossFunctions.DWDMarginLoss — Type
DWDMarginLoss <: MarginLoss
The distance weighted discrimination margin loss. It is a differentiable generalization of the L1HingeLoss that differs from the SmoothedL1HingeLoss. It is Lipschitz continuous and convex, but not strictly convex.
\[L(a) = \begin{cases} 1 - a & \quad \text{if } a \le \frac{q}{q+1} \\ \frac{1}{a^q} \frac{q^q}{(q+1)^{q+1}} & \quad \text{otherwise}\\ \end{cases}\]
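A minimal Python sketch of the formula above is given below, together with a check that the two branches agree at the threshold a = q/(q+1). The function name and the choice q = 1 are assumptions for illustration; the package's own implementation may differ.

```python
def dwd_margin_loss(a, q):
    """DWD margin loss: linear up to q / (q + 1), then decaying like 1 / a**q."""
    threshold = q / (q + 1.0)
    if a <= threshold:
        return 1.0 - a
    return (q ** q / (q + 1.0) ** (q + 1.0)) / a ** q


q = 1.0
threshold = q / (q + 1.0)

# Linear branch at the threshold: 1 - q / (q + 1) = 1 / (q + 1).
linear_side = 1.0 - threshold
# Second branch evaluated at the threshold; algebraically also 1 / (q + 1).
decay_side = (q ** q / (q + 1.0) ** (q + 1.0)) / threshold ** q
```

Unlike the hinge loss, the second branch never reaches exactly zero; it only decays towards zero as the agreement grows.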
[Plots: Lossfunction L for q=1 (left) and Derivative L' (right), each against y ⋅ ŷ on [-2, 2].]
L2MarginLoss
LossFunctions.L2MarginLoss — Type
L2MarginLoss <: MarginLoss
The margin-based least-squares loss for classification, which penalizes every prediction where agreement != 1 quadratically. It is locally Lipschitz continuous and strongly convex.
\[L(a) = {\left( 1 - a \right)}^2\]
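The name "least-squares" can be made concrete with a small Python sketch. For targets y ∈ {-1, 1} we have y² = 1, so (1 - y ⋅ ŷ)² equals the squared difference (y - ŷ)². The function name and the sample pairs below are hypothetical and serve only as an illustration.

```python
def l2_margin_loss(a):
    """Margin-based least-squares loss (1 - a)**2 for the agreement a."""
    return (1.0 - a) ** 2


# Hypothetical (target, output) pairs with targets in {-1, +1}.
pairs = [(1, 0.3), (-1, 0.3), (1, -1.5), (-1, 2.0)]

# Difference between the margin form and the plain squared error (y - y_hat)**2.
gaps = [abs(l2_margin_loss(y * o) - (y - o) ** 2) for y, o in pairs]
```

Note that the loss is symmetric around a = 1, so a confidently correct prediction with a = 3 is penalized exactly as much as a wrong one with a = -1.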
[Plots: Lossfunction L (left) and Derivative L' (right), each against y ⋅ ŷ on [-2, 2].]
L2HingeLoss
LossFunctions.L2HingeLoss — Type
L2HingeLoss <: MarginLoss
The truncated least squares loss quadratically penalizes every prediction where the resulting agreement < 1. It is locally Lipschitz continuous and convex, but not strictly convex.
\[L(a) = \max \{ 0, 1 - a \}^2\]
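The formula above and its derivative can be sketched in Python as follows (hypothetical function names, not the package's code). In contrast to the L2MarginLoss, predictions with an agreement above 1 incur no loss.

```python
def l2_hinge_loss(a):
    """Truncated least-squares loss max(0, 1 - a)**2 for the agreement a."""
    return max(0.0, 1.0 - a) ** 2


def l2_hinge_deriv(a):
    """Derivative with respect to a: -2 * (1 - a) for a < 1, and 0 otherwise."""
    return -2.0 * (1.0 - a) if a < 1.0 else 0.0
```

The derivative grows without bound as the agreement decreases, which is why the loss is only locally Lipschitz continuous.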
[Plots: Lossfunction L (left) and Derivative L' (right), each against y ⋅ ŷ on [-2, 2].]
LogitMarginLoss
LossFunctions.LogitMarginLoss — Type
LogitMarginLoss <: MarginLoss
The margin version of the logistic loss. It is infinitely many times differentiable, strictly convex, and Lipschitz continuous.
\[L(a) = \ln (1 + e^{-a})\]
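Evaluating ln(1 + e^{-a}) literally can overflow for strongly negative agreements. The Python sketch below shows one common way to rearrange the expression so that the exponential is only taken of a non-positive number. The function names and the rearrangement are assumptions made for this example and are not taken from the package.

```python
import math


def logit_margin_loss(a):
    """ln(1 + exp(-a)), arranged so that exp never receives a large positive argument."""
    if a >= 0:
        return math.log1p(math.exp(-a))
    # For a < 0: ln(1 + exp(-a)) = -a + ln(1 + exp(a)).
    return -a + math.log1p(math.exp(a))


def logit_margin_deriv(a):
    """Derivative with respect to a: -1 / (1 + exp(a)), written the same way."""
    if a >= 0:
        e = math.exp(-a)
        return -e / (1.0 + e)
    return -1.0 / (1.0 + math.exp(a))
```

The derivative always lies between -1 and 0, which is consistent with the loss being Lipschitz continuous.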
[Plots: Lossfunction L against y ⋅ ŷ on [-2, 2] (left) and Derivative L' against y ⋅ ŷ on [-4, 4] (right).]
ExpLoss
LossFunctions.ExpLoss — Type
ExpLoss <: MarginLoss
The margin-based exponential loss for classification, which penalizes every prediction exponentially. It is infinitely many times differentiable, locally Lipschitz continuous and strictly convex, but not clipable.
\[L(a) = e^{-a}\]
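As a brief Python sketch (hypothetical function names, not the package's API), the exponential loss and its derivative are:

```python
import math


def exp_loss(a):
    """Exponential loss exp(-a) for the agreement a."""
    return math.exp(-a)


def exp_loss_deriv(a):
    """Derivative with respect to a: -exp(-a), the negative of the loss itself."""
    return -math.exp(-a)
```

Since the derivative is the negative of the loss, its magnitude grows without bound for badly misclassified observations. This is why the loss is only locally Lipschitz continuous, and it makes the loss sensitive to outliers.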
[Plots: Lossfunction L (left) and Derivative L' (right), each against y ⋅ ŷ on [-2, 2].]
SigmoidLoss
LossFunctions.SigmoidLoss — Type
SigmoidLoss <: MarginLoss
Continuous loss which penalizes every prediction with a loss within the range (0, 2). It is infinitely many times differentiable and Lipschitz continuous, but nonconvex.
\[L(a) = 1 - \tanh(a)\]
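The following Python sketch implements the formula above and illustrates the nonconvexity with a single midpoint check. The function names and the three sample agreements are chosen for illustration and are not part of the package.

```python
import math


def sigmoid_loss(a):
    """Sigmoid loss 1 - tanh(a) for the agreement a."""
    return 1.0 - math.tanh(a)


def sigmoid_loss_deriv(a):
    """Derivative with respect to a: -(1 - tanh(a)**2)."""
    return -(1.0 - math.tanh(a) ** 2)


# A convex function satisfies L(midpoint) <= average of the endpoint values.
# For the endpoints -2 and 0 (midpoint -1) this inequality fails.
midpoint_value = sigmoid_loss(-1.0)
endpoint_average = 0.5 * (sigmoid_loss(-2.0) + sigmoid_loss(0.0))
violates_convexity = midpoint_value > endpoint_average
```

The derivative is bounded between -1 and 0, which matches the Lipschitz continuity stated above, and the loss flattens out for strongly negative agreements instead of growing.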
[Plots: Lossfunction L (left) and Derivative L' (right), each against y ⋅ ŷ on [-2, 2].]