REGRESSION LOSSES - L2 / SQUARED ERROR
$$L(y, f(x)) = \frac{1}{2}\left(y - f(x)\right)^2$$
Convex and differentiable (the gradient poses no problem in loss minimization)
Derivative is proportional to the residual:
$$\frac{\partial\, \frac{1}{2}\left(y - f(x)\right)^2}{\partial f(x)} = f(x) - y = -\varepsilon$$
Connection to the Gaussian distribution (see the sketch below)
Tries to reduce large residuals: if a residual doubles, the loss becomes 4 times as large, hence outliers in y can become a problem
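The Gaussian connection can be made precise with a short sketch, under the assumption of additive Gaussian noise $y = f(x \mid \theta) + \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, \sigma^2)$ and fixed $\sigma^2$:

```latex
% Negative log-likelihood of n i.i.d. observations
% y^(i) = f(x^(i) | theta) + eps, with eps ~ N(0, sigma^2)
-\ell(\theta)
  = -\sum_{i=1}^n \log \frac{1}{\sqrt{2\pi\sigma^2}}
      \exp\!\left(-\frac{\big(y^{(i)} - f(x^{(i)} \mid \theta)\big)^2}{2\sigma^2}\right)
  = \frac{1}{\sigma^2} \sum_{i=1}^n
      \frac{1}{2}\big(y^{(i)} - f(x^{(i)} \mid \theta)\big)^2
    + \mathrm{const}
```

Minimizing the negative log-likelihood is therefore, up to constants, the same as minimizing the L2 empirical risk: maximum likelihood under Gaussian errors and least squares yield the same $\theta$.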
[Figure: "Data & Model" (y plotted over x) next to the L2 loss L(f(x), y) plotted over the residuals y − f(x); two highlighted points, marked 6.65 and 1.15, illustrate how a larger residual is penalized disproportionately.]
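A minimal numeric sketch of these two properties in plain Python (the values are made up for illustration): the derivative of the L2 loss equals the negative residual, and doubling the residual quadruples the loss.

```python
def l2_loss(y, fx):
    """L2 / squared error loss: L(y, f(x)) = 1/2 * (y - f(x))^2."""
    return 0.5 * (y - fx) ** 2

def l2_grad(y, fx):
    """Derivative w.r.t. f(x): f(x) - y, i.e. the negative residual -eps."""
    return fx - y

y, fx = 3.0, 1.0          # residual eps = y - f(x) = 2.0
print(l2_loss(y, fx))     # 2.0
print(l2_grad(y, fx))     # -2.0, exactly -eps

# Doubling the residual (2.0 -> 4.0) quadruples the loss (2.0 -> 8.0):
print(l2_loss(5.0, 1.0))  # 8.0
```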
EXAMPLE: REGRESSION WITH L1 VS L2 LOSS
We could also minimize the L1 loss. This changes the risk and
optimization steps:
$$\mathcal{R}_{\mathrm{emp}}(\theta) = \sum_{i=1}^n L\left(y^{(i)}, f\left(x^{(i)} \mid \theta\right)\right) = \sum_{i=1}^n \left|\, y^{(i)} - f\left(x^{(i)} \mid \theta\right) \right|$$
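To see what the changed risk does to the optimum, here is a small sketch in Python with NumPy (the data, including the outlier, are invented for illustration): for a constant model f(x) = θ, the L2 empirical risk is minimized by the mean of the y^(i) and the L1 risk by the median, so a single outlier moves the L2 solution much further.

```python
import numpy as np

y = np.array([1.0, 1.5, 2.0, 2.5, 20.0])  # last point is an outlier

def r_emp(theta, loss):
    """Empirical risk of the constant model f(x) = theta."""
    return sum(loss(yi, theta) for yi in y)

l2 = lambda yi, fx: 0.5 * (yi - fx) ** 2
l1 = lambda yi, fx: abs(yi - fx)

# Minimize both risks over a grid of candidate thetas
thetas = np.linspace(0, 25, 2501)
theta_l2 = thetas[np.argmin([r_emp(t, l2) for t in thetas])]
theta_l1 = thetas[np.argmin([r_emp(t, l1) for t in thetas])]

print(theta_l2, np.mean(y))    # ~5.4: the mean, pulled toward the outlier
print(theta_l1, np.median(y))  # ~2.0: the median, robust to the outlier
```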