Wavelet-Based Denoising Using Hidden Markov Models
M. Jaber Borran and Robert D. Nowak
Rice University
Some Properties of the DWT
• Primary
  – Locality: matches more signals
  – Multiresolution
  – Compression: sparse DWTs
• Secondary
  – Clustering: dependency within scale
  – Persistence: dependency across scale
Probabilistic Model for an Individual Wavelet Coefficient
• Compression: many small coefficients, few large coefficients
• A hidden state S selects which mixture component generated the coefficient W:

$$f_W(w) = \sum_{m=1}^{2} p_S(m)\, f_{W \mid S}(w \mid m)$$
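As a concrete illustration, here is a minimal Python sketch of this two-state mixture pdf, assuming zero-mean Gaussian components; the state probabilities and variances are illustrative values, not taken from the slides.

```python
# Minimal sketch of the two-state Gaussian mixture model for a single
# wavelet coefficient: a hidden state S picks a "small" (low-variance)
# or "large" (high-variance) zero-mean Gaussian component.
# The state probabilities and variances are illustrative assumptions.
import numpy as np
from scipy.stats import norm

p_S = np.array([0.8, 0.2])     # pS(1): "small" state, pS(2): "large" state
sigmas = np.array([0.5, 3.0])  # component standard deviations

def f_W(w):
    """Mixture pdf fW(w) = sum_m pS(m) * fW|S(w|m)."""
    return sum(p * norm.pdf(w, loc=0.0, scale=s) for p, s in zip(p_S, sigmas))

print(f_W(0.0), f_W(4.0))  # density is peaked at 0 and heavy-tailed
```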
Probabilistic Model for a Wavelet Transform
• Ignoring the dependencies: Independent Mixture (IM) Model
• Clustering: Hidden Markov Chain Model
• Persistence: Hidden Markov Tree Model
Parameters of HMT Model
• pmf of the root node: $p_{S_1}(m)$
• transition probabilities: $a^{\,m,r}_{i,\rho(i)} = p_{S_i \mid S_{\rho(i)}}(m \mid r)$
• (parameters of the) conditional pdfs $f_{W_i \mid S_i}(w \mid m)$; e.g., if a Gaussian mixture is used, $\mu_{i,m}$ and $\sigma^2_{i,m}$
• $\boldsymbol{\theta}$: model parameter vector
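For concreteness, one plausible way to hold this parameter vector in code is a small container like the following; the field shapes are an assumption, not a layout prescribed by the slides.

```python
# Hypothetical container for the HMT parameter vector theta: root-state
# pmf, per-node transition matrices, and per-node, per-state Gaussian
# mixture parameters. Shapes below are one plausible layout (N nodes,
# M states), assumed for illustration.
from dataclasses import dataclass
import numpy as np

@dataclass
class HMTParams:
    root_pmf: np.ndarray  # shape (M,): p_{S_1}(m)
    trans: np.ndarray     # shape (N, M, M): a_{i,rho(i)}^{m,r}
    mu: np.ndarray        # shape (N, M): mu_{i,m}
    var: np.ndarray       # shape (N, M): sigma^2_{i,m}
```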
Dependency between Signs of Wavelet Coefficients
[Figure: example signals on $[0, T]$ and their wavelet coefficients $w_1$ (scale $T$) and $w_2$ (scale $T/2$), illustrating the dependency between the signs of coefficients in adjacent scales.]
New Probabilistic Model for Individual Wavelet Coefficients
A hidden state S now selects one of four conditional densities:

$$f_W(w) = \sum_{m=1}^{4} p_S(m)\, f_{W \mid S}(w \mid m)$$
• Use one-sided functions as conditional probability densities
Proposed Mixture PDF
• Use exponential distributions as components of the mixture distribution
$$f_{W_i \mid S_i}(w \mid s) = \begin{cases} \lambda_{i,s}\, e^{-\lambda_{i,s} w}, & w \ge 0 \\ 0, & w < 0 \end{cases} \qquad m \text{ even}$$

$$f_{W_i \mid S_i}(w \mid s) = \begin{cases} \lambda_{i,s}\, e^{\lambda_{i,s} w}, & w \le 0 \\ 0, & w > 0 \end{cases} \qquad m \text{ odd}$$
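A minimal sketch of these one-sided exponential components, with an illustrative rate parameter $\lambda_{i,s}$ (written `lam` below):

```python
# One-sided exponential components: even-indexed states model positive
# coefficients, odd-indexed states negative ones. The rate lam stands
# in for lambda_{i,s}; its value is an illustrative assumption.
import numpy as np

def f_W_given_S(w, lam, m):
    """One-sided exponential density for state m (even: w >= 0, odd: w <= 0)."""
    w = np.asarray(w, dtype=float)
    if m % 2 == 0:   # m even: supported on [0, inf)
        return np.where(w >= 0, lam * np.exp(-lam * w), 0.0)
    else:            # m odd: supported on (-inf, 0]
        return np.where(w <= 0, lam * np.exp(lam * w), 0.0)
```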
PDF of the Noisy Wavelet Coefficients
The wavelet transform is orthonormal; therefore, if the additive noise is a white, zero-mean Gaussian process with variance $\sigma^2$, then the noisy wavelet coefficient is $y = w + n$, where $n \sim \mathcal N(0, \sigma^2)$, and we have

$$f_{Y_i \mid S_i}(y \mid s) = \lambda_{i,s}\, e^{\sigma^2 \lambda_{i,s}^2 / 2}\, e^{-\lambda_{i,s} y}\, Q\!\left(\frac{\sigma^2 \lambda_{i,s} - y}{\sigma}\right) \qquad m \text{ even}$$

$$f_{Y_i \mid S_i}(y \mid s) = \lambda_{i,s}\, e^{\sigma^2 \lambda_{i,s}^2 / 2}\, e^{\lambda_{i,s} y}\, Q\!\left(\frac{\sigma^2 \lambda_{i,s} + y}{\sigma}\right) \qquad m \text{ odd}$$

where $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt$ is the Gaussian tail function.
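This closed form translates directly to code using $Q(x)$ = `scipy.stats.norm.sf(x)`; a quick sanity check is to compare it against numerical integration of the exponential prior against the Gaussian noise kernel. Parameter values here are illustrative.

```python
# Closed-form density of the noisy coefficient y = w + n for a one-sided
# exponential component and Gaussian noise n ~ N(0, sigma^2), using
# Q(x) = norm.sf(x). Parameter values are illustrative assumptions.
import numpy as np
from scipy.stats import norm

def f_Y_given_S(y, lam, sigma, m):
    """f_{Y|S}(y|s): exponential prior convolved with Gaussian noise."""
    sign = 1.0 if m % 2 == 0 else -1.0  # even states: positive support
    return lam * np.exp(sigma**2 * lam**2 / 2 - sign * lam * y) \
               * norm.sf((sigma**2 * lam - sign * y) / sigma)

print(f_Y_given_S(1.5, lam=2.0, sigma=0.5, m=0))
```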
Training the HMT Model
• y: Observed noisy wavelet coefficients
• s: Vector of hidden states
• θ: Model parameter vector
Maximum likelihood parameter estimation:
$$\text{maximize over } \boldsymbol\theta: \quad f_{\mathbf Y}(\mathbf y \mid \boldsymbol\theta)$$
Intractable, because s is unobserved (hidden).
Model Training Using the Expectation Maximization Algorithm
• Define the set of complete data, x = (y, s). Then

$$f_{\mathbf X}(\mathbf x \mid \boldsymbol\theta) = f_{\mathbf Y,\mathbf S}(\mathbf y,\mathbf s \mid \boldsymbol\theta) = f_{\mathbf Y \mid \mathbf S}(\mathbf y \mid \mathbf s,\boldsymbol\theta)\, f_{\mathbf S}(\mathbf s \mid \boldsymbol\theta) = \prod_{i=1}^{N} f_{Y_i \mid S_i}(y_i \mid s_i,\boldsymbol\theta)\; p_{S_1}(s_1) \prod_{i=2}^{N} a^{\,s_i,\,s_{\rho(i)}}_{i,\rho(i)}$$

• and then maximize

$$U(\boldsymbol\theta \mid \boldsymbol\theta^{(l)}) = \mathrm E\!\left[\log f_{\mathbf X}(\mathbf x \mid \boldsymbol\theta) \,\middle|\, \mathbf y,\boldsymbol\theta^{(l)}\right]$$
EM Algorithm (continued)
$$U(\boldsymbol\theta \mid \boldsymbol\theta^{(l)}) = \sum_{\mathbf s} f_{\mathbf S \mid \mathbf Y}(\mathbf s \mid \mathbf y,\boldsymbol\theta^{(l)})\, \log f_{\mathbf Y,\mathbf S}(\mathbf y,\mathbf s \mid \boldsymbol\theta)$$

$$= \sum_{\mathbf s} f_{\mathbf S \mid \mathbf Y}(\mathbf s \mid \mathbf y,\boldsymbol\theta^{(l)}) \left[ \log p_{S_1}(s_1) + \sum_{i=2}^{N} \log a^{\,s_i,\,s_{\rho(i)}}_{i,\rho(i)} + \sum_{i=1}^{N} \log f_{Y_i \mid S_i}(y_i \mid s_i,\boldsymbol\theta) \right]$$
• The state a posteriori probabilities are calculated using the upward-downward algorithm.
• The root-state a priori pmf and the state transition probabilities are calculated using Lagrange multipliers to maximize U.
• The parameters of the conditional pdfs may be calculated analytically or numerically to maximize the function U.
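The full upward-downward recursion on the tree is too long for a short sketch; as a hedged illustration, here is one EM iteration for the independent-mixture (IM) special case with zero-mean Gaussian components, where the E-step posteriors and M-step updates take a simple closed form. This is a simplification, not the authors' exact tree algorithm.

```python
# One EM iteration for the independent mixture (IM) special case:
# the tree's transition probabilities are dropped and all coefficients
# share a single state pmf. Gaussian components are assumed.
import numpy as np
from scipy.stats import norm

def em_step(y, p_S, var):
    """y: (N,) coefficients; p_S: (M,) state pmf; var: (M,) variances."""
    # E-step: posterior state probabilities gamma[i, m] = P(S_i = m | y_i)
    like = norm.pdf(y[:, None], loc=0.0, scale=np.sqrt(var)[None, :])
    gamma = p_S[None, :] * like
    gamma /= gamma.sum(axis=1, keepdims=True)
    # M-step: closed-form maximizers of U for the pmf and the variances
    p_new = gamma.mean(axis=0)
    var_new = (gamma * y[:, None] ** 2).sum(axis=0) / gamma.sum(axis=0)
    return p_new, var_new

# Synthetic test: 90% small-variance, 10% large-variance coefficients.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0, 0.5, 900), rng.normal(0, 3.0, 100)])
p_S, var = np.array([0.5, 0.5]), np.array([1.0, 4.0])
for _ in range(50):
    p_S, var = em_step(y, p_S, var)
print(p_S, var)  # should recover roughly [0.9, 0.1] and [0.25, 9.0]
```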
Denoising
$$f_{W \mid Y,S}(w \mid y, s) = \frac{f_{Y \mid W,S}(y \mid w, s)\, f_{W \mid S}(w \mid s)}{f_{Y \mid S}(y \mid s)} = \frac{\frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\left(w - (y - \sigma^2 \lambda_{i,s})\right)^2 / (2\sigma^2)}}{Q\!\left(\frac{\sigma^2 \lambda_{i,s} - y}{\sigma}\right)}, \qquad w \ge 0 \quad (m \text{ even})$$
• MAP estimate:
$$\hat w_s = \arg\max_{w} f_{W \mid Y,S}(w \mid y, s)$$

$$\hat s = \arg\max_{s}\; p_S(s)\, f_{W \mid Y,S}(\hat w_s \mid y, s), \qquad \hat w = \hat w_{\hat s}$$
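A small sketch of the MAP rule for an even (positive-side) exponential state: the posterior above is a Gaussian with mean $y - \sigma^2 \lambda_{i,s}$ truncated to $w \ge 0$, so its mode is $\max(0, y - \sigma^2 \lambda_{i,s})$. The state-selection step follows the slide's formula; all parameter values are illustrative.

```python
# MAP denoising sketch for even (positive-side) states. The posterior
# is N(y - sigma^2 lam, sigma^2) truncated to w >= 0, so the per-state
# MAP estimate is its mode. Parameter values are illustrative.
import numpy as np
from scipy.stats import norm

def map_w_even(y, lam, sigma):
    """Mode of the truncated-Gaussian posterior for an even state."""
    return np.maximum(0.0, y - sigma**2 * lam)

def posterior_pdf_even(w, y, lam, sigma):
    """f_{W|Y,S}(w | y, s) for w >= 0 (even state)."""
    mu = y - sigma**2 * lam
    return norm.pdf(w, loc=mu, scale=sigma) / norm.sf(-mu / sigma)

y, sigma = 2.0, 1.0
lams, p_S = np.array([4.0, 0.25]), np.array([0.8, 0.2])  # two even states
w_hat_s = map_w_even(y, lams, sigma)
scores = p_S * posterior_pdf_even(w_hat_s, y, lams, sigma)
print(w_hat_s[np.argmax(scores)])  # final MAP estimate w_hat
```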
Denoising (continued)
• Conditional Mean estimate:
$$\hat w_s = \mathrm E[W \mid y, s] = \left(y - \sigma^2 \lambda_{i,s}\right) + \frac{\sigma\, e^{-\left(y - \sigma^2 \lambda_{i,s}\right)^2 / (2\sigma^2)}}{\sqrt{2\pi}\; Q\!\left(\frac{\sigma^2 \lambda_{i,s} - y}{\sigma}\right)} \qquad (m \text{ even})$$

$$\hat w = \sum_{s=1}^{M} p_{S \mid Y}(s \mid y)\, \hat w_s$$
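And a sketch of the conditional-mean estimate: the truncated-Gaussian mean $\mathrm E[W \mid y, s]$ combined across states with the posterior state probabilities. Again, parameter values are illustrative.

```python
# Conditional-mean denoising sketch for even (positive-side) states:
# the per-state estimate is the mean of N(y - sigma^2 lam, sigma^2)
# truncated to w >= 0; the final estimate mixes the per-state means
# with the posterior state probabilities. Values are illustrative.
import numpy as np
from scipy.stats import norm

def cond_mean_even(y, lam, sigma):
    """E[W | y, s]: mean of the truncated-Gaussian posterior."""
    mu = y - sigma**2 * lam
    return mu + sigma * norm.pdf(mu / sigma) / norm.sf(-mu / sigma)

y, sigma = 2.0, 1.0
lams, p_S = np.array([4.0, 0.25]), np.array([0.8, 0.2])  # two even states
# Posterior state probabilities p_{S|Y}(s | y) via the noisy-coefficient pdf:
f_Y = lams * np.exp(sigma**2 * lams**2 / 2 - lams * y) \
           * norm.sf((sigma**2 * lams - y) / sigma)
post = p_S * f_Y / np.sum(p_S * f_Y)
w_hat = np.sum(post * cond_mean_even(y, lams, sigma))  # shrunk estimate
print(w_hat)
```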
[Figure: six panels on [0, 1]: Original, Noisy, 4 Gaussian/Haar, 4 Gaussian/D8, 4 Exponential/Haar, 4 Exponential/D8.]

Init. MSE = 24.639723

                      4 mix, Haar    4 mix, D8
Gaussian Mixture      3.078267       7.020152
Exponential Mixture   2.326472       7.030970
[Figure: six panels on [0, 1]: Original, Noisy, 2 Gaussian/D8, 4 Gaussian/D8, 2 Exponential/D8, 4 Exponential/D8.]

Init. MSE = 2.429741

                      2 mix, D8      4 mix, D8
Gaussian Mixture      0.471568       0.417795
Exponential Mixture   0.426488       0.397808
[Figure: six panels on [0, 1]: Original, Noisy, 2 Gaussian/D4, 4 Gaussian/D8, 2 Exponential/D4, 4 Exponential/D8.]

Init. MSE = 92.907059

                      2 mix, D4      4 mix, D8
Gaussian Mixture      8.442306       7.873508
Exponential Mixture   8.394187       7.862579
Conclusion
• We observed a high correlation between the signs of the wavelet coefficients in adjacent scales.
• We used one-sided distributions as mixture components for individual wavelet coefficients.
• We used a hidden Markov tree model to capture the dependencies.
• The proposed method achieves lower MSE in denoising, and the denoised signals are much smoother.
[Figure: noisy signal and versions denoised using a 4-component Gaussian mixture and a 4-component exponential mixture with the Haar wavelet.]

[Figure: noisy signal and versions denoised using 4-component Gaussian and exponential mixtures with the Daubechies length-8 wavelet.]

[Figure: noisy signal and versions denoised using 4-component Gaussian and exponential mixtures with the Daubechies length-8 wavelet.]