The partial derivative of the binary Cross-entropy loss function

TobiasRoeschl · Nov 10, 2020

About This Presentation

Mathematical derivation of how to obtain the partial derivative of the binary Cross-entropy loss function used in logistic regression.


Slide Content

The partial derivative of the binary Cross-entropy loss function

In order to find the partial derivative of the cost function $J$ with respect to a particular weight $w_j$, we apply the chain rule as follows:

$$\frac{\partial J}{\partial w_j} = \frac{1}{N} \sum_{i=1}^{N} \frac{\partial J}{\partial p_i} \frac{\partial p_i}{\partial z_i} \frac{\partial z_i}{\partial w_j}$$

with

$$J = -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \ln(p_i) + (1 - y_i) \ln(1 - p_i) \right]$$

and

$$p_i = \frac{1}{1 + e^{-z_i}}$$

and

$$z_i = X_i w^T + b$$

with

$$X = \begin{bmatrix} x_{1,1} & x_{1,2} & \ldots & x_{1,n} \\ x_{2,1} & x_{2,2} & \ldots & x_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{N,1} & x_{N,2} & \ldots & x_{N,n} \end{bmatrix},$$

where $n$ represents the number of independent variables and $N$ the number of samples, the weight vector $w = \begin{bmatrix} w_0 & \ldots & w_{n-1} \end{bmatrix}$, and a scalar $b$ representing the bias term.

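As a concrete illustration, here is a minimal NumPy sketch of these quantities; the data X, y and the starting values of w and b are made up for the example:

import numpy as np

# Toy data: N = 4 samples, n = 2 independent variables (made-up values)
X = np.array([[ 0.5,  1.2],
              [ 1.0, -0.3],
              [-0.7,  0.8],
              [ 0.2,  0.1]])
y = np.array([1.0, 0.0, 0.0, 1.0])

w = np.zeros(X.shape[1])        # weight vector [w_0, ..., w_{n-1}]
b = 0.0                         # bias term

z = X @ w + b                   # z_i = X_i w^T + b
p = 1.0 / (1.0 + np.exp(-z))    # p_i = sigmoid(z_i)

# Binary cross-entropy loss J
J = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))
print(J)                        # ln(2) ~ 0.6931, since p_i = 0.5 everywhere
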
Note that $p_i$ is a sigmoid function. The derivative of the sigmoid function is given by [1]:

$$\frac{\partial \sigma(x)}{\partial x} = \sigma(x)(1 - \sigma(x))$$

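This identity is easy to verify numerically; a minimal sketch with an arbitrarily chosen evaluation point and step size:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x, h = 0.3, 1e-6                                        # arbitrary point and step
numeric  = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)  # central difference
analytic = sigmoid(x) * (1 - sigmoid(x))
print(numeric, analytic)                                # both ~ 0.2445
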
And since the derivative of the natural logarithm is [2]:

$$\frac{\partial \ln(x)}{\partial x} = \frac{1}{x}$$

and $\frac{\partial z_i}{\partial w_j} = x_{i,j}$, we can begin to solve the equation above:

$$\frac{\partial J}{\partial w_j} = \frac{1}{N} \sum_{i=1}^{N} \frac{\partial J}{\partial p_i} \frac{\partial p_i}{\partial z_i} \frac{\partial z_i}{\partial w_j} = -\frac{1}{N} \sum_{i=1}^{N} \left[ \frac{y_i}{p_i} + \frac{1 - y_i}{1 - p_i} \cdot (-1) \right] \left[ p_i (1 - p_i) \right] x_{i,j} =$$

$$= -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i (1 - p_i) - (1 - y_i) p_i \right] x_{i,j} = -\frac{1}{N} \sum_{i=1}^{N} (y_i - p_i) \, x_{i,j} = \frac{1}{N} \sum_{i=1}^{N} (p_i - y_i) \, x_{i,j}$$

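In vectorized form this result reads $\frac{\partial J}{\partial w} = \frac{1}{N} X^T (p - y)$. A minimal sketch that checks the analytic gradient against central finite differences on made-up data:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(X, y, w, b):
    p = sigmoid(X @ w + b)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

# Made-up data and parameters for the check
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
w = rng.normal(size=3)
b = 0.1

# Analytic gradient: (1/N) * sum_i (p_i - y_i) * x_{i,j} for all j at once
p = sigmoid(X @ w + b)
grad_w = X.T @ (p - y) / len(y)

# Central finite differences, one weight at a time
h = 1e-6
for j in range(len(w)):
    e = np.zeros_like(w)
    e[j] = h
    fd = (loss(X, y, w + e, b) - loss(X, y, w - e, b)) / (2 * h)
    print(grad_w[j], fd)        # the two columns should agree closely
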
The partial derivative of the cost function $J$ with respect to the bias $b$ can be calculated accordingly. Considering that the mathematical derivation of the formula is very similar, except that $\frac{\partial z_i}{\partial b} = 1$, we can simply write:

$$\frac{\partial J}{\partial b} = \frac{1}{N} \sum_{i=1}^{N} (p_i - y_i)$$

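Together, the two gradients give the usual logistic-regression update rule. A minimal sketch of a single gradient-descent step, with made-up data and an arbitrarily chosen learning rate:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up example values
X = np.array([[ 0.5,  1.2],
              [ 1.0, -0.3],
              [-0.7,  0.8]])
y = np.array([1.0, 0.0, 1.0])
w = np.array([0.2, -0.1])
b = 0.0

p = sigmoid(X @ w + b)
grad_w = X.T @ (p - y) / len(y)   # (1/N) * sum_i (p_i - y_i) * x_{i,j}
grad_b = np.mean(p - y)           # (1/N) * sum_i (p_i - y_i)

# One gradient-descent update (learning rate chosen arbitrarily)
lr = 0.1
w -= lr * grad_w
b -= lr * grad_b
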
References:
[1] http://www.ai.mit.edu/courses/6.892/lecture8-html/sld015.htm
[2] https://www.onlinemathlearning.com/derivative-ln.html
November 10, 2020 T. Roeschl