The detailed derivation of the derivatives in Table 2 of
"Marginalized Denoising Auto-encoders for Nonlinear Representations"
by M. Chen, K. Weinberger, F. Sha, and Y. Bengio

Tomonari MASADA @ Nagasaki University

October 14, 2014
The derivative $\partial z_h / \partial \tilde{x}_d$ can be obtained as follows:

\[
z = \sigma(W\tilde{x} + b) = \frac{1}{1 + \exp(-W\tilde{x} - b)} \tag{1}
\]

\begin{align*}
\therefore\ \frac{\partial z_h}{\partial \tilde{x}_d}
&= \frac{\partial}{\partial \tilde{x}_d}\,
   \frac{1}{1 + \exp(-\sum_d w_{hd}\tilde{x}_d - b_h)} \\
&= \frac{w_{hd}\exp(-\sum_d w_{hd}\tilde{x}_d - b_h)}
        {\{1 + \exp(-\sum_d w_{hd}\tilde{x}_d - b_h)\}^2} \\
&= \frac{1}{1 + \exp(-\sum_d w_{hd}\tilde{x}_d - b_h)}
   \cdot \left\{1 - \frac{1}{1 + \exp(-\sum_d w_{hd}\tilde{x}_d - b_h)}\right\}
   \cdot w_{hd} \\
&= z_h(1 - z_h)\, w_{hd}. \tag{2}
\end{align*}
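Eq. (2) can be checked numerically against a central finite difference. The sketch below uses arbitrary toy values for $W$, $b$, and $\tilde{x}$ (illustrative only, not from the paper):

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

# Arbitrary toy parameters: H = 2 hidden units, D = 3 input dimensions.
W = [[0.3, -0.5, 0.2], [0.1, 0.4, -0.7]]   # W[h][d] = w_hd
b = [0.05, -0.1]
xt = [0.6, -0.2, 0.9]                       # corrupted input x~

def z(h, x):
    """Hidden unit z_h = sigma(sum_d w_hd x_d + b_h)."""
    return sigmoid(sum(W[h][d] * x[d] for d in range(3)) + b[h])

eps = 1e-6
max_err = 0.0
for h in range(2):
    for d in range(3):
        analytic = z(h, xt) * (1.0 - z(h, xt)) * W[h][d]   # Eq. (2)
        xp, xm = list(xt), list(xt)
        xp[d] += eps
        xm[d] -= eps
        numeric = (z(h, xp) - z(h, xm)) / (2 * eps)
        max_err = max(max_err, abs(analytic - numeric))
```

The analytic and numeric derivatives agree to well below the finite-difference error floor.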
For the cross-entropy loss, we obtain the following:
\begin{align*}
\ell\bigl(x, f_\theta(\tilde{x})\bigr)
&= -x^\top \log \sigma(W^\top z + b')
   - (1 - x)^\top \log\bigl\{1 - \sigma(W^\top z + b')\bigr\} \\
&= -x^\top \log\left\{\frac{1}{1 + \exp(-W^\top z - b')}\right\}
   - (1 - x)^\top \log\left\{\frac{\exp(-W^\top z - b')}{1 + \exp(-W^\top z - b')}\right\} \\
&= x^\top \log\bigl\{1 + \exp(-W^\top z - b')\bigr\}
   - (1 - x)^\top(-W^\top z - b')
   + (1 - x)^\top \log\bigl\{1 + \exp(-W^\top z - b')\bigr\} \\
&= -(1 - x)^\top(-W^\top z - b')
   + \mathbf{1}^\top \log\bigl\{1 + \exp(-W^\top z - b')\bigr\} \\
&= -\sum_d (1 - x_d)\Bigl(-\sum_h w_{hd} z_h - b'_d\Bigr)
   + \sum_d \log\Bigl\{1 + \exp\Bigl(-\sum_h w_{hd} z_h - b'_d\Bigr)\Bigr\} \tag{3}
\end{align*}
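The rewritten form (3) can be confirmed to equal the defining cross-entropy expression numerically. The toy values for $W$, $b'$, $z$, and $x$ below are arbitrary (chosen for illustration, not from the paper):

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

# Arbitrary toy values: D = 3 outputs, H = 2 hidden units.
W = [[0.3, -0.5, 0.2], [0.1, 0.4, -0.7]]   # W[h][d] = w_hd
bp = [0.2, -0.3, 0.1]                       # b'_d
z = [0.7, 0.4]                              # hidden activations
x = [1.0, 0.0, 1.0]                         # clean input

# Pre-activations a_d = sum_h w_hd z_h + b'_d and reconstructions y_d
a = [sum(W[h][d] * z[h] for h in range(2)) + bp[d] for d in range(3)]
y = [sigmoid(a[d]) for d in range(3)]

# Definition: -x^T log y - (1 - x)^T log(1 - y)
loss_def = -sum(x[d] * math.log(y[d]) + (1 - x[d]) * math.log(1 - y[d])
                for d in range(3))

# Eq. (3): sum_d (1 - x_d) a_d + sum_d log(1 + exp(-a_d))
loss_eq3 = sum((1 - x[d]) * a[d] + math.log(1 + math.exp(-a[d]))
               for d in range(3))

diff = abs(loss_def - loss_eq3)
```

Both expressions evaluate to the same loss up to floating-point rounding.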
\begin{align*}
\therefore\ \frac{\partial \ell}{\partial z_h}
= \sum_d (1 - x_d)\, w_{hd}
- \sum_d \frac{w_{hd}\exp(-\sum_h w_{hd} z_h - b'_d)}
              {1 + \exp(-\sum_h w_{hd} z_h - b'_d)} \tag{4}
\end{align*}
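The gradient (4) can likewise be checked against a finite difference of the loss (3), again with arbitrary toy parameters (not from the paper):

```python
import math

# Arbitrary toy values: D = 3 outputs, H = 2 hidden units.
W = [[0.3, -0.5, 0.2], [0.1, 0.4, -0.7]]   # W[h][d] = w_hd
bp = [0.2, -0.3, 0.1]                       # b'_d
x = [1.0, 0.0, 1.0]                         # clean input

def loss(z):
    """Eq. (3): sum_d (1 - x_d) a_d + log(1 + exp(-a_d))."""
    total = 0.0
    for d in range(3):
        a = sum(W[h][d] * z[h] for h in range(2)) + bp[d]
        total += (1 - x[d]) * a + math.log(1 + math.exp(-a))
    return total

def grad(z, h):
    """Eq. (4): sum_d (1 - x_d) w_hd - w_hd exp(-a_d) / (1 + exp(-a_d))."""
    g = 0.0
    for d in range(3):
        a = sum(W[k][d] * z[k] for k in range(2)) + bp[d]
        g += (1 - x[d]) * W[h][d] - W[h][d] * math.exp(-a) / (1 + math.exp(-a))
    return g

z0 = [0.7, 0.4]
eps = 1e-6
max_err = 0.0
for h in range(2):
    zp, zm = list(z0), list(z0)
    zp[h] += eps
    zm[h] -= eps
    numeric = (loss(zp) - loss(zm)) / (2 * eps)
    max_err = max(max_err, abs(numeric - grad(z0, h)))
```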
\begin{align*}
\therefore\ \frac{\partial^2 \ell}{\partial z_h^2}
&= -\frac{\partial}{\partial z_h}
   \sum_d \frac{w_{hd}\exp(-\sum_h w_{hd} z_h - b'_d)}
               {1 + \exp(-\sum_h w_{hd} z_h - b'_d)} \\
&= \sum_d \frac{w_{hd}^2\exp(-\sum_h w_{hd} z_h - b'_d)}
               {1 + \exp(-\sum_h w_{hd} z_h - b'_d)}
 - \sum_d \frac{w_{hd}^2\bigl\{\exp(-\sum_h w_{hd} z_h - b'_d)\bigr\}^2}
               {\bigl\{1 + \exp(-\sum_h w_{hd} z_h - b'_d)\bigr\}^2} \\
&= \sum_d \frac{w_{hd}^2\exp(-\sum_h w_{hd} z_h - b'_d)}
               {\bigl\{1 + \exp(-\sum_h w_{hd} z_h - b'_d)\bigr\}^2} \\
&= \sum_d \left(\frac{1}{1 + \exp(-\sum_h w_{hd} z_h - b'_d)}\right)
          \left(1 - \frac{1}{1 + \exp(-\sum_h w_{hd} z_h - b'_d)}\right) w_{hd}^2 \\
&= \sum_d y_d(1 - y_d)\, w_{hd}^2, \tag{5}
\end{align*}
where $y_d = \sigma\bigl(\sum_h w_{hd} z_h + b'_d\bigr)$ is the $d$-th component of the reconstruction $f_\theta(\tilde{x}) = \sigma(W^\top z + b')$.
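The diagonal second derivative (5) can be verified with a second-order central difference of the loss; the parameters below are the same arbitrary toy values used above (not from the paper):

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

# Arbitrary toy values: D = 3 outputs, H = 2 hidden units.
W = [[0.3, -0.5, 0.2], [0.1, 0.4, -0.7]]   # W[h][d] = w_hd
bp = [0.2, -0.3, 0.1]                       # b'_d
x = [1.0, 0.0, 1.0]                         # clean input

def loss(z):
    """Cross-entropy loss in the form of Eq. (3)."""
    total = 0.0
    for d in range(3):
        a = sum(W[h][d] * z[h] for h in range(2)) + bp[d]
        total += (1 - x[d]) * a + math.log(1 + math.exp(-a))
    return total

def second(z, h):
    """Eq. (5): sum_d y_d (1 - y_d) w_hd^2 with y_d = sigma(a_d)."""
    s = 0.0
    for d in range(3):
        y = sigmoid(sum(W[k][d] * z[k] for k in range(2)) + bp[d])
        s += y * (1 - y) * W[h][d] ** 2
    return s

z0 = [0.7, 0.4]
eps = 1e-4
max_err = 0.0
for h in range(2):
    zp, zm = list(z0), list(z0)
    zp[h] += eps
    zm[h] -= eps
    numeric = (loss(zp) - 2 * loss(z0) + loss(zm)) / eps**2
    max_err = max(max_err, abs(numeric - second(z0, h)))
```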
For the squared loss, we obtain the following:
\[
\ell\bigl(x, f_\theta(\tilde{x})\bigr) = \bigl\|x - (W^\top z + b')\bigr\|^2
= \sum_d \Bigl\{x_d - \Bigl(\sum_h w_{hd} z_h + b'_d\Bigr)\Bigr\}^2 \tag{6}
\]
\begin{align*}
\therefore\ \frac{\partial \ell}{\partial z_h}
&= \frac{\partial}{\partial z_h}
   \sum_d \Bigl\{x_d - \Bigl(\sum_h w_{hd} z_h + b'_d\Bigr)\Bigr\}^2 \\
&= -2 \sum_d w_{hd}\Bigl\{x_d - \Bigl(\sum_h w_{hd} z_h + b'_d\Bigr)\Bigr\} \tag{7}
\end{align*}
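A finite-difference check of the squared-loss gradient (7), again with arbitrary toy parameters (illustrative only):

```python
# Arbitrary toy values: D = 3 outputs, H = 2 hidden units.
W = [[0.3, -0.5, 0.2], [0.1, 0.4, -0.7]]   # W[h][d] = w_hd
bp = [0.2, -0.3, 0.1]                       # b'_d
x = [0.6, -0.2, 0.9]                        # clean input

def loss(z):
    """Eq. (6): sum_d {x_d - (sum_h w_hd z_h + b'_d)}^2."""
    return sum((x[d] - (sum(W[h][d] * z[h] for h in range(2)) + bp[d])) ** 2
               for d in range(3))

def grad(z, h):
    """Eq. (7): -2 sum_d w_hd {x_d - (sum_h w_hd z_h + b'_d)}."""
    return -2 * sum(
        W[h][d] * (x[d] - (sum(W[k][d] * z[k] for k in range(2)) + bp[d]))
        for d in range(3))

z0 = [0.7, 0.4]
eps = 1e-6
max_err = 0.0
for h in range(2):
    zp, zm = list(z0), list(z0)
    zp[h] += eps
    zm[h] -= eps
    numeric = (loss(zp) - loss(zm)) / (2 * eps)
    max_err = max(max_err, abs(numeric - grad(z0, h)))
```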
\begin{align*}
\therefore\ \frac{\partial^2 \ell}{\partial z_h^2}
= -\frac{\partial}{\partial z_h}\, 2\sum_d w_{hd}\Bigl\{x_d - \Bigl(\sum_h w_{hd} z_h + b'_d\Bigr)\Bigr\}
= 2 \sum_d w_{hd}^2. \tag{8}
\end{align*}
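Since the squared loss is quadratic in $z$, Eq. (8) predicts a second derivative that is constant in $z$; this can be checked at two different points with arbitrary toy parameters (illustrative only):

```python
# Arbitrary toy values: D = 3 outputs, H = 2 hidden units.
W = [[0.3, -0.5, 0.2], [0.1, 0.4, -0.7]]   # W[h][d] = w_hd
bp = [0.2, -0.3, 0.1]                       # b'_d
x = [0.6, -0.2, 0.9]                        # clean input

def loss(z):
    """Eq. (6): sum_d {x_d - (sum_h w_hd z_h + b'_d)}^2."""
    return sum((x[d] - (sum(W[h][d] * z[h] for h in range(2)) + bp[d])) ** 2
               for d in range(3))

# Eq. (8): d^2 l / dz_h^2 = 2 sum_d w_hd^2, independent of z.
second = [2 * sum(W[h][d] ** 2 for d in range(3)) for h in range(2)]

eps = 1e-4
max_err = 0.0
for z0 in ([0.7, 0.4], [-1.2, 2.0]):       # constant in z, so test two points
    for h in range(2):
        zp, zm = list(z0), list(z0)
        zp[h] += eps
        zm[h] -= eps
        numeric = (loss(zp) - 2 * loss(z0) + loss(zm)) / eps**2
        max_err = max(max_err, abs(numeric - second[h]))
```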