Wavelet-Based Denoising Using Hidden Markov Models
M. Jaber Borran and Robert D. Nowak
Rice University
Some Properties of the DWT
• Primary
  – Locality: matches more signals
  – Multiresolution
  – Compression: sparse DWTs
• Secondary
  – Clustering: dependency within scale
  – Persistence: dependency across scale
Probabilistic Model for an Individual Wavelet Coefficient
• Compression: many small coefficients, few large coefficients
• A hidden state S selects which mixture component generated the coefficient W:

$$f_W(w) = \sum_{m=1}^{2} p_S(m)\, f_{W \mid S}(w \mid m)$$
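As a concrete illustration, here is a minimal Python sketch of this two-state mixture pdf, assuming zero-mean Gaussian components; the state probabilities and variances are illustrative values, not taken from the slides.

```python
# Minimal sketch of the two-state Gaussian mixture model for a single
# wavelet coefficient: a hidden state S picks a "small" (low-variance)
# or "large" (high-variance) zero-mean Gaussian component.
# The state probabilities and variances are illustrative assumptions.
import numpy as np
from scipy.stats import norm

p_S = np.array([0.8, 0.2])     # pS(1): "small" state, pS(2): "large" state
sigmas = np.array([0.5, 3.0])  # component standard deviations

def f_W(w):
    """Mixture pdf fW(w) = sum_m pS(m) * fW|S(w|m)."""
    return sum(p * norm.pdf(w, loc=0.0, scale=s) for p, s in zip(p_S, sigmas))

print(f_W(0.0), f_W(4.0))  # density is peaked at 0 and heavy-tailed
```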
Probabilistic Model for a Wavelet Transform
• Ignoring the dependencies: Independent Mixture (IM) Model
• Clustering: Hidden Markov Chain Model
• Persistence: Hidden Markov Tree Model
Parameters of HMT Model
• pmf of the root node: $p_{S_1}(m)$
• transition probabilities: $a^{\,m,r}_{i,\rho(i)} = p_{S_i \mid S_{\rho(i)}}(m \mid r)$
• (parameters of the) conditional pdfs $f_{W_i \mid S_i}(w \mid m)$; e.g., if a Gaussian mixture is used, $\mu_{i,m}$ and $\sigma^2_{i,m}$
• $\boldsymbol{\theta}$: model parameter vector
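For concreteness, one plausible way to hold this parameter vector in code is a small container like the following; the field shapes are an assumption, not a layout prescribed by the slides.

```python
# Hypothetical container for the HMT parameter vector theta: root-state
# pmf, per-node transition matrices, and per-node, per-state Gaussian
# mixture parameters. Shapes below are one plausible layout (N nodes,
# M states), assumed for illustration.
from dataclasses import dataclass
import numpy as np

@dataclass
class HMTParams:
    root_pmf: np.ndarray  # shape (M,): p_{S_1}(m)
    trans: np.ndarray     # shape (N, M, M): a_{i,rho(i)}^{m,r}
    mu: np.ndarray        # shape (N, M): mu_{i,m}
    var: np.ndarray       # shape (N, M): sigma^2_{i,m}
```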
Dependency between Signs of Wavelet Coefficients
[Figure: example signals on $[0, T]$ and their wavelet coefficients $w_1$ (scale $T$) and $w_2$ (scale $T/2$), illustrating the dependency between the signs of coefficients in adjacent scales.]
New Probabilistic Model for Individual Wavelet Coefficients
A hidden state S now selects one of four conditional densities:

$$f_W(w) = \sum_{m=1}^{4} p_S(m)\, f_{W \mid S}(w \mid m)$$
• Use one-sided functions as conditional probability densities
Proposed Mixture PDF
• Use exponential distributions as components of the mixture distribution
$$f_{W_i \mid S_i}(w \mid s) = \begin{cases} \lambda_{i,s}\, e^{-\lambda_{i,s} w}, & w \ge 0 \\ 0, & w < 0 \end{cases} \qquad m \text{ even}$$

$$f_{W_i \mid S_i}(w \mid s) = \begin{cases} \lambda_{i,s}\, e^{\lambda_{i,s} w}, & w \le 0 \\ 0, & w > 0 \end{cases} \qquad m \text{ odd}$$
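A minimal sketch of these one-sided exponential components, with an illustrative rate parameter $\lambda_{i,s}$ (written `lam` below):

```python
# One-sided exponential components: even-indexed states model positive
# coefficients, odd-indexed states negative ones. The rate lam stands
# in for lambda_{i,s}; its value is an illustrative assumption.
import numpy as np

def f_W_given_S(w, lam, m):
    """One-sided exponential density for state m (even: w >= 0, odd: w <= 0)."""
    w = np.asarray(w, dtype=float)
    if m % 2 == 0:   # m even: supported on [0, inf)
        return np.where(w >= 0, lam * np.exp(-lam * w), 0.0)
    else:            # m odd: supported on (-inf, 0]
        return np.where(w <= 0, lam * np.exp(lam * w), 0.0)
```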
PDF of the Noisy Wavelet Coefficients
The wavelet transform is orthonormal; therefore, if the additive noise is a white, zero-mean Gaussian process with variance $\sigma^2$, then the noisy wavelet coefficient is $y = w + n$, where $n \sim \mathcal N(0, \sigma^2)$, and we have

$$f_{Y_i \mid S_i}(y \mid s) = \lambda_{i,s}\, e^{\sigma^2 \lambda_{i,s}^2 / 2}\, e^{-\lambda_{i,s} y}\, Q\!\left(\frac{\sigma^2 \lambda_{i,s} - y}{\sigma}\right) \qquad m \text{ even}$$

$$f_{Y_i \mid S_i}(y \mid s) = \lambda_{i,s}\, e^{\sigma^2 \lambda_{i,s}^2 / 2}\, e^{\lambda_{i,s} y}\, Q\!\left(\frac{\sigma^2 \lambda_{i,s} + y}{\sigma}\right) \qquad m \text{ odd}$$

where $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt$ is the Gaussian tail function.
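This closed form translates directly to code using $Q(x)$ = `scipy.stats.norm.sf(x)`; a quick sanity check is to compare it against numerical integration of the exponential prior against the Gaussian noise kernel. Parameter values here are illustrative.

```python
# Closed-form density of the noisy coefficient y = w + n for a one-sided
# exponential component and Gaussian noise n ~ N(0, sigma^2), using
# Q(x) = norm.sf(x). Parameter values are illustrative assumptions.
import numpy as np
from scipy.stats import norm

def f_Y_given_S(y, lam, sigma, m):
    """f_{Y|S}(y|s): exponential prior convolved with Gaussian noise."""
    sign = 1.0 if m % 2 == 0 else -1.0  # even states: positive support
    return lam * np.exp(sigma**2 * lam**2 / 2 - sign * lam * y) \
               * norm.sf((sigma**2 * lam - sign * y) / sigma)

print(f_Y_given_S(1.5, lam=2.0, sigma=0.5, m=0))
```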
Training the HMT Model
• y: Observed noisy wavelet coefficients
• s: Vector of hidden states
• θ: Model parameter vector
Maximum likelihood parameter estimation:
$$\text{maximize over } \boldsymbol\theta: \quad f_{\mathbf Y}(\mathbf y \mid \boldsymbol\theta)$$
Intractable, because s is unobserved (hidden).
Model Training Using the Expectation Maximization Algorithm
• Define the set of complete data, x = (y, s). Then

$$f_{\mathbf X}(\mathbf x \mid \boldsymbol\theta) = f_{\mathbf Y,\mathbf S}(\mathbf y,\mathbf s \mid \boldsymbol\theta) = f_{\mathbf Y \mid \mathbf S}(\mathbf y \mid \mathbf s,\boldsymbol\theta)\, f_{\mathbf S}(\mathbf s \mid \boldsymbol\theta) = \prod_{i=1}^{N} f_{Y_i \mid S_i}(y_i \mid s_i,\boldsymbol\theta)\; p_{S_1}(s_1) \prod_{i=2}^{N} a^{\,s_i,\,s_{\rho(i)}}_{i,\rho(i)}$$

• and then maximize

$$U(\boldsymbol\theta \mid \boldsymbol\theta^{(l)}) = \mathrm E\!\left[\log f_{\mathbf X}(\mathbf x \mid \boldsymbol\theta) \,\middle|\, \mathbf y,\boldsymbol\theta^{(l)}\right]$$
EM Algorithm (continued)
$$U(\boldsymbol\theta \mid \boldsymbol\theta^{(l)}) = \sum_{\mathbf s} f_{\mathbf S \mid \mathbf Y}(\mathbf s \mid \mathbf y,\boldsymbol\theta^{(l)})\, \log f_{\mathbf Y,\mathbf S}(\mathbf y,\mathbf s \mid \boldsymbol\theta)$$

$$= \sum_{\mathbf s} f_{\mathbf S \mid \mathbf Y}(\mathbf s \mid \mathbf y,\boldsymbol\theta^{(l)}) \left[ \log p_{S_1}(s_1) + \sum_{i=2}^{N} \log a^{\,s_i,\,s_{\rho(i)}}_{i,\rho(i)} + \sum_{i=1}^{N} \log f_{Y_i \mid S_i}(y_i \mid s_i,\boldsymbol\theta) \right]$$
• The state a posteriori probabilities are calculated using the upward-downward algorithm.
• The root-state a priori pmf and the state transition probabilities are calculated using Lagrange multipliers to maximize U.
• The parameters of the conditional pdfs may be calculated analytically or numerically to maximize the function U.
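The full upward-downward recursion on the tree is too long for a short sketch; as a hedged illustration, here is one EM iteration for the independent-mixture (IM) special case with zero-mean Gaussian components, where the E-step posteriors and M-step updates take a simple closed form. This is a simplification, not the authors' exact tree algorithm.

```python
# One EM iteration for the independent mixture (IM) special case:
# the tree's transition probabilities are dropped and all coefficients
# share a single state pmf. Gaussian components are assumed.
import numpy as np
from scipy.stats import norm

def em_step(y, p_S, var):
    """y: (N,) coefficients; p_S: (M,) state pmf; var: (M,) variances."""
    # E-step: posterior state probabilities gamma[i, m] = P(S_i = m | y_i)
    like = norm.pdf(y[:, None], loc=0.0, scale=np.sqrt(var)[None, :])
    gamma = p_S[None, :] * like
    gamma /= gamma.sum(axis=1, keepdims=True)
    # M-step: closed-form maximizers of U for the pmf and the variances
    p_new = gamma.mean(axis=0)
    var_new = (gamma * y[:, None] ** 2).sum(axis=0) / gamma.sum(axis=0)
    return p_new, var_new

# Synthetic test: 90% small-variance, 10% large-variance coefficients.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0, 0.5, 900), rng.normal(0, 3.0, 100)])
p_S, var = np.array([0.5, 0.5]), np.array([1.0, 4.0])
for _ in range(50):
    p_S, var = em_step(y, p_S, var)
print(p_S, var)  # should recover roughly [0.9, 0.1] and [0.25, 9.0]
```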
Denoising
$$f_{W \mid Y,S}(w \mid y, s) = \frac{f_{Y \mid W,S}(y \mid w, s)\, f_{W \mid S}(w \mid s)}{f_{Y \mid S}(y \mid s)} = \frac{\frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\left(w - (y - \sigma^2 \lambda_{i,s})\right)^2 / (2\sigma^2)}}{Q\!\left(\frac{\sigma^2 \lambda_{i,s} - y}{\sigma}\right)}, \qquad w \ge 0 \quad (m \text{ even})$$
• MAP estimate:
$$\hat w_s = \arg\max_{w} f_{W \mid Y,S}(w \mid y, s)$$

$$\hat s = \arg\max_{s}\; p_S(s)\, f_{W \mid Y,S}(\hat w_s \mid y, s), \qquad \hat w = \hat w_{\hat s}$$
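A small sketch of the MAP rule for an even (positive-side) exponential state: the posterior above is a Gaussian with mean $y - \sigma^2 \lambda_{i,s}$ truncated to $w \ge 0$, so its mode is $\max(0, y - \sigma^2 \lambda_{i,s})$. The state-selection step follows the slide's formula; all parameter values are illustrative.

```python
# MAP denoising sketch for even (positive-side) states. The posterior
# is N(y - sigma^2 lam, sigma^2) truncated to w >= 0, so the per-state
# MAP estimate is its mode. Parameter values are illustrative.
import numpy as np
from scipy.stats import norm

def map_w_even(y, lam, sigma):
    """Mode of the truncated-Gaussian posterior for an even state."""
    return np.maximum(0.0, y - sigma**2 * lam)

def posterior_pdf_even(w, y, lam, sigma):
    """f_{W|Y,S}(w | y, s) for w >= 0 (even state)."""
    mu = y - sigma**2 * lam
    return norm.pdf(w, loc=mu, scale=sigma) / norm.sf(-mu / sigma)

y, sigma = 2.0, 1.0
lams, p_S = np.array([4.0, 0.25]), np.array([0.8, 0.2])  # two even states
w_hat_s = map_w_even(y, lams, sigma)
scores = p_S * posterior_pdf_even(w_hat_s, y, lams, sigma)
print(w_hat_s[np.argmax(scores)])  # final MAP estimate w_hat
```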
Denoising (continued)
• Conditional Mean estimate:
$$\hat w_s = \mathrm E[W \mid y, s] = \left(y - \sigma^2 \lambda_{i,s}\right) + \frac{\sigma\, e^{-\left(y - \sigma^2 \lambda_{i,s}\right)^2 / (2\sigma^2)}}{\sqrt{2\pi}\; Q\!\left(\frac{\sigma^2 \lambda_{i,s} - y}{\sigma}\right)} \qquad (m \text{ even})$$

$$\hat w = \sum_{s=1}^{M} p_{S \mid Y}(s \mid y)\, \hat w_s$$
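And a sketch of the conditional-mean estimate: the truncated-Gaussian mean $\mathrm E[W \mid y, s]$ combined across states with the posterior state probabilities. Again, parameter values are illustrative.

```python
# Conditional-mean denoising sketch for even (positive-side) states:
# the per-state estimate is the mean of N(y - sigma^2 lam, sigma^2)
# truncated to w >= 0; the final estimate mixes the per-state means
# with the posterior state probabilities. Values are illustrative.
import numpy as np
from scipy.stats import norm

def cond_mean_even(y, lam, sigma):
    """E[W | y, s]: mean of the truncated-Gaussian posterior."""
    mu = y - sigma**2 * lam
    return mu + sigma * norm.pdf(mu / sigma) / norm.sf(-mu / sigma)

y, sigma = 2.0, 1.0
lams, p_S = np.array([4.0, 0.25]), np.array([0.8, 0.2])  # two even states
# Posterior state probabilities p_{S|Y}(s | y) via the noisy-coefficient pdf:
f_Y = lams * np.exp(sigma**2 * lams**2 / 2 - lams * y) \
           * norm.sf((sigma**2 * lams - y) / sigma)
post = p_S * f_Y / np.sum(p_S * f_Y)
w_hat = np.sum(post * cond_mean_even(y, lams, sigma))  # shrunk estimate
print(w_hat)
```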
[Figure: six panels on [0, 1]: Original, Noisy, 4 Gaussian/Haar, 4 Gaussian/D8, 4 Exponential/Haar, 4 Exponential/D8.]

Init. MSE = 24.639723

                      4 mix, Haar    4 mix, D8
Gaussian Mixture      3.078267       7.020152
Exponential Mixture   2.326472       7.030970
[Figure: six panels on [0, 1]: Original, Noisy, 2 Gaussian/D8, 4 Gaussian/D8, 2 Exponential/D8, 4 Exponential/D8.]

Init. MSE = 2.429741

                      2 mix, D8      4 mix, D8
Gaussian Mixture      0.471568       0.417795
Exponential Mixture   0.426488       0.397808
[Figure: six panels on [0, 1]: Original, Noisy, 2 Gaussian/D4, 4 Gaussian/D8, 2 Exponential/D4, 4 Exponential/D8.]

Init. MSE = 92.907059

                      2 mix, D4      4 mix, D8
Gaussian Mixture      8.442306       7.873508
Exponential Mixture   8.394187       7.862579
Conclusion
• We observed a high correlation between the signs of the wavelet coefficients in adjacent scales.
• We used one-sided distributions as mixture components for individual wavelet coefficients.
• We used a hidden Markov tree model to capture the dependencies.
• The proposed method achieves lower MSE in denoising, and the denoised signals are much smoother.
[Figure: noisy signal and versions denoised using a 4-component Gaussian mixture and a 4-component exponential mixture with the Haar wavelet.]

[Figure: noisy signal and versions denoised using 4-component Gaussian and exponential mixtures with the Daubechies length-8 wavelet.]

[Figure: noisy signal and versions denoised using 4-component Gaussian and exponential mixtures with the Daubechies length-8 wavelet.]