Speech Enhancement
Uploaded by octavia-stevens
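Wiener Filtering: a linear estimation of the clean signal from the noisy signal, using the MMSE criterion.

For additive noise:

$$z_t = y_t + v_t$$
$$\hat{y}_t = E\{y_t \mid z_t\} = a\,z_t = a\,(y_t + v_t)$$

$y_t$: clean speech ($K \times 1$)
$v_t$: noise ($K \times 1$)
$z_t$: noisy speech ($K \times 1$)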
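Projection Theorem: the mean square error

$$E\{\|y_t - \hat{y}_t\|^2\}$$

is minimum if $a$ is selected such that the error $(y_t - \hat{y}_t)$ is orthogonal to the noisy signal, i.e.:

$$y_t - \hat{y}_t = y_t - a\,(y_t + v_t)$$
$$E\{[\,y_t - a\,(y_t + v_t)\,]\,(y_t + v_t)^H\} = 0$$
$$E\{y_t\,(y_t + v_t)^H\} = a\,E\{(y_t + v_t)(y_t + v_t)^H\}$$

$H$: Hermitian transposition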
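Assuming $y_t$ and $v_t$ to be zero-mean and uncorrelated, i.e.,

$$E\{y_t v_t^H\} = E\{v_t y_t^H\} = 0,$$

then we'll have:

$$E\{y_t y_t^H\} = a\,E\{y_t y_t^H + v_t v_t^H\}$$
$$a = E\{y_t y_t^H\}\,\big(E\{y_t y_t^H + v_t v_t^H\}\big)^{-1}$$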
Since y and v are zero mean:

$$E\{y_t y_t^H\} = \Sigma_{y_t}, \qquad E\{v_t v_t^H\} = \Sigma_{v_t}$$
$$a = \Sigma_{y_t}\,(\Sigma_{y_t} + \Sigma_{v_t})^{-1}$$

This is called the time-domain Wiener filter.
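As a rough illustration of the time-domain Wiener filter $a = \Sigma_y(\Sigma_y + \Sigma_v)^{-1}$, the sketch below covers only the scalar case ($K = 1$), where the filter reduces to the ratio of the speech variance to the total variance. The function names, the Gaussian test signals and the chosen variances are assumptions made for this example; they do not come from the slides.

```python
import random

def wiener_gain_scalar(var_y, var_v):
    """Scalar (K = 1) Wiener gain a = var_y / (var_y + var_v)."""
    return var_y / (var_y + var_v)

def simulate(var_y=4.0, var_v=1.0, n=20000, seed=0):
    """Draw zero-mean, uncorrelated y and v, then filter z = y + v."""
    rng = random.Random(seed)
    a = wiener_gain_scalar(var_y, var_v)
    mse_raw = mse_filtered = cross = 0.0
    for _ in range(n):
        y = rng.gauss(0.0, var_y ** 0.5)   # clean sample
        v = rng.gauss(0.0, var_v ** 0.5)   # noise sample
        z = y + v                          # noisy observation
        err = y - a * z                    # estimation error y - y_hat
        mse_raw += (y - z) ** 2
        mse_filtered += err ** 2
        cross += err * z                   # sample estimate of E{(y - a z) z}
    return a, mse_raw / n, mse_filtered / n, cross / n

a, mse_raw, mse_filtered, cross = simulate()
print("gain:", a)
print("MSE without filtering:", mse_raw)
print("MSE with Wiener gain:", mse_filtered)
print("error/observation correlation:", cross)
```

The last printed value should be close to zero, which is the orthogonality condition of the projection theorem: the residual error is uncorrelated with the noisy input.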
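We are looking for a frequency-domain Wiener filter, called the non-causal Wiener filter, such that:

$$\hat{y}(t) = \int_{-\infty}^{\infty} h(t - \tau)\,z(\tau)\,d\tau = \int_{-\infty}^{\infty} z(t - \tau)\,h(\tau)\,d\tau$$

According to the projection theorem, for the error

$$E\{[\,y(t) - \hat{y}(t)\,]^2\}$$

to be minimum, the difference

$$y(t) - \hat{y}(t)$$

has to be orthogonal to the noisy input:

$$E\{[\,y(t) - \hat{y}(t)\,]\,z(t')\} = 0 \quad \text{for all } t'$$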
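$$E\Big\{\Big[\,y(t) - \int z(t - \tau)\,h(\tau)\,d\tau\Big]\,z(t')\Big\} = 0$$

or

$$E\{y(t)\,z(t')\} = E\Big\{\int z(t - \tau)\,h(\tau)\,d\tau\; z(t')\Big\}$$

or:

$$R_{yz}(t - t') = \int R_{zz}(t - t' - \tau)\,h(\tau)\,d\tau$$

and, renaming the lag $t - t'$ as $t$:

$$R_{yz}(t) = \int R_{zz}(t - \tau)\,h(\tau)\,d\tau$$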
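$$R_{yz}(t) = h(t) * R_{zz}(t) \quad \text{(convolution)}$$
$$S_{yz}(\omega) = H(j\omega)\,S_{zz}(\omega)$$
$$H(j\omega) = \frac{S_{yz}(\omega)}{S_{zz}(\omega)}$$

$S_{zz}$: spectrum of $z(t)$
$S_{yz}$: cross spectrum between $y(t)$ and $z(t)$

$$S_{yz} = S_{yy}$$

(since $R_{yz} = E\{y_t z_t\} = E\{y_t (y_t + v_t)\} = E\{y_t y_t\} + E\{y_t v_t\} = R_{yy}$)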
$$S_{zz} = S_{yy} + S_{vv}$$
$$H(j\omega) = \frac{S_{yy}}{S_{yy} + S_{vv}}$$
$$H(j\omega) = \frac{S_{zz} - S_{vv}}{S_{zz}}$$

Popular form of the Wiener filter
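The popular form $H = (S_{zz} - S_{vv}) / S_{zz}$ can be evaluated bin by bin once power spectra of the noisy signal and of the noise are available. The following is a minimal sketch under stated assumptions: the spectra are plain lists of per-bin powers, the function name is hypothetical, and the clamping of the gain to the range [floor, 1] is an addition for the case where the noise estimate exceeds the noisy power; the slides do not specify how that case is handled.

```python
def wiener_gain(s_zz, s_vv, floor=0.0):
    """Per-bin gain H = (S_zz - S_vv) / S_zz, clamped to [floor, 1].

    s_zz: power spectrum of the noisy signal, one value per bin
    s_vv: power spectrum of the noise, one value per bin
    """
    gains = []
    for zz, vv in zip(s_zz, s_vv):
        if zz <= 0.0:
            gains.append(floor)            # empty bin: nothing to keep
            continue
        h = (zz - vv) / zz
        gains.append(min(1.0, max(floor, h)))
    return gains

noisy_psd = [10.0, 4.0, 1.0, 0.5]
noise_psd = [1.0, 1.0, 1.0, 1.0]
gains = wiener_gain(noisy_psd, noise_psd)
print(gains)
```

Bins where the speech dominates keep a gain near 1, while bins at or below the noise level are driven to the floor.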
Spectral Subtraction

$$z_t = y_t + v_t$$
$$Z_t = Y_t + V_t$$
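$$|\hat{V}_t|^2 = n\,|\hat{V}_{\text{old}}|^2 + (1 - n)\,|Z_t|^2$$
$$\hat{Y}_t = \big(|Z_t|^2 - |\hat{V}_t|^2\big)^{1/2}\,e^{j\angle Z_t}$$
$$H_t = \left(\frac{|Z_t|^2 - |\hat{V}_t|^2}{|Z_t|^2}\right)^{1/2}$$
$$\hat{Y}_t = H_t \cdot Z_t$$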
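Spectral subtraction removes an estimate of the noise power from the noisy power in each frequency bin and reuses the phase of the noisy spectrum. The sketch below shows that step for one frame. It assumes the frame is given as a list of complex DFT coefficients and the noise power as one value per bin; the function name is hypothetical, and setting negative differences to zero is an assumption added here so that the square root stays real.

```python
import cmath
import math

def spectral_subtract(z_bins, noise_power):
    """Return Y_hat(k) = sqrt(max(|Z(k)|^2 - |V(k)|^2, 0)) * exp(j * angle(Z(k)))."""
    enhanced = []
    for z, v2 in zip(z_bins, noise_power):
        power = max(abs(z) ** 2 - v2, 0.0)       # subtract noise power, clip at zero
        magnitude = math.sqrt(power)
        enhanced.append(cmath.rect(magnitude, cmath.phase(z)))  # keep noisy phase
    return enhanced

frame = [3 + 4j, 1 + 0j]
noise = [9.0, 4.0]
enhanced = spectral_subtract(frame, noise)
print(enhanced)
```

In the first bin the magnitude drops from 5 to 4 while the phase is unchanged; in the second bin the noise estimate exceeds the observed power, so the output is zero.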
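1) ML (Maximum Likelihood): $P(\text{Observations} \mid \text{Parameters})$, i.e. $P(z \mid y)$
2) MAP (Maximum A Posteriori): $P(\text{Parameters} \mid \text{Observations})$, i.e. $P(y \mid z)$
3) MMSE: $E(\text{Parameters} \mid \text{Observations})$, i.e. $E(y \mid z)$

$z = \{z_t,\ t = 0, 1, \ldots, T-1\}$, $\quad y = \{y_t,\ t = 0, 1, \ldots, T-1\}$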
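To make the difference between the three criteria concrete, here is a small sketch on a toy discrete problem. It assumes the parameter takes one of a few candidate values with a known prior, and that the likelihood of the single observation is known for each candidate. All names and numbers are illustrative assumptions.

```python
def estimators(values, prior, likelihood):
    """Compare ML, MAP and MMSE estimates on a discrete parameter.

    values:     candidate parameter values y
    prior:      P(y) for each candidate
    likelihood: P(z | y) of the observed z for each candidate
    """
    joint = [p * l for p, l in zip(prior, likelihood)]
    evidence = sum(joint)                              # P(z)
    posterior = [j / evidence for j in joint]          # P(y | z)
    idx = range(len(values))
    ml = values[max(idx, key=lambda i: likelihood[i])]     # maximizes P(z | y)
    map_est = values[max(idx, key=lambda i: posterior[i])] # maximizes P(y | z)
    mmse = sum(v * p for v, p in zip(values, posterior))   # E(y | z)
    return ml, map_est, mmse

ml, map_est, mmse = estimators([0.0, 1.0, 2.0], [0.7, 0.2, 0.1], [0.2, 0.5, 0.9])
print(ml, map_est, mmse)
```

With these numbers the three criteria disagree: ML picks the value with the highest likelihood, MAP is pulled toward the value favored by the prior, and MMSE returns the posterior mean, which need not be one of the candidates.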
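MAP Speech Enhancement

$$\max_{y}\ \ln p_{\lambda_y,\lambda_v}(y \mid z), \qquad \max_{s,m,y}\ \ln p_{\lambda_y,\lambda_v}(s, m, y \mid z)$$

$$P(y \mid z) = \sum_{s=1}^{M}\sum_{m=1}^{L} P(s, m, y \mid z)$$

$y = \{y_t,\ t = 0, 1, \ldots, T-1\}$, $\ y_t \in R^K$
$v = \{v_t,\ t = 0, 1, \ldots, T-1\}$, $\ v_t \in R^K$
$z = \{z_t,\ t = 0, 1, \ldots, T-1\}$, $\ z_t \in R^K$
$s = \{s_t,\ t = 0, 1, \ldots, T-1\}$, $\ s_t \in \{1, \ldots, M\}$
$m = \{m_t,\ t = 0, 1, \ldots, T-1\}$, $\ m_t \in \{1, \ldots, L\}$
Weight sequence: $q_t(\cdot \mid y(k)),\ t = 0, 1, \ldots, T-1$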
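$$\max_{y}\ \ln P(y \mid z)$$

$y(k) = \{y_t(k),\ t = 0, 1, \ldots, T-1\}$, $\ y_t(k) \in R^K$
$q_t(s, m \mid y(k)) = P(s_t = s,\ m_t = m \mid y(k))$

$$y_t(k+1) = \Big[\sum_{s=1}^{M}\sum_{m=1}^{L} q_t(s, m \mid y(k))\,H_{s,m}^{-1}\Big]^{-1} z_t, \qquad t = 0, 1, \ldots, T-1$$
$$P(y(k+1) \mid z) \ge P(y(k) \mid z)$$

$$\max_{s,m}\ \ln p_{\lambda_y,\lambda_v}(s, m, y \mid z), \qquad \max_{y}\ \ln p_{\lambda_y,\lambda_v}(s, m, y \mid z)$$
$$\max_{s,m,y}\ \ln p_{\lambda_y,\lambda_v}(s, m, y \mid z)$$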
MMSE Speech Enhancement

We try to optimize the function:

$$\hat{g}(y_t) = E\{g(y_t) \mid z_0^t\}$$

$g(\cdot)$ is a function on $R^K$ and $z_0^t = \{z_0, \ldots, z_t\}$.
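$$\hat{g}(y_t) = \sum_{s_t=1}^{M}\sum_{m_t=1}^{L}\sum_{n_t=1}^{N}\sum_{p_t=1}^{P} W_t(z_0^t)\,.\,E\{g(y_t) \mid z_t, s_t, m_t, n_t, p_t\}$$

$$W_t(z_0^t) = P(s_t, m_t, n_t, p_t \mid z_0^t) = \frac{G_t(z_0^t)}{\sum_{s_t=1}^{M}\sum_{m_t=1}^{L}\sum_{n_t=1}^{N}\sum_{p_t=1}^{P} G_t(z_0^t)}$$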
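$$G_t(z_0^t) = p(s_t, m_t, n_t, p_t, z_0^t)$$
$$G_0(z_0) = a_{s_0}\,a_{n_0}\,c_{m_0|s_0}\,c_{p_0|n_0}\,b(z_0 \mid s_0, m_0, n_0, p_0)$$

$a$: state probabilities of speech ($s$) and noise ($n$)
$c_{m|s}$, $c_{p|n}$: mixture weights of speech and noise
$b(z_t \mid s_t, m_t, n_t, p_t)$: density of the noisy observation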
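$$b(z_t \mid s_t, m_t, n_t, p_t) = \frac{\exp\big\{-\tfrac{1}{2}\,z_t^{Tr}\,(\Sigma_{y|s_t,m_t} + \Sigma_{v|n_t,p_t})^{-1}\,z_t\big\}}{(2\pi)^{K/2}\,\det^{1/2}(\Sigma_{y|s_t,m_t} + \Sigma_{v|n_t,p_t})}$$

$$E\{g(y_t) \mid z_t, s_t, m_t, n_t, p_t\} = \int g(y_t)\,p(y_t \mid z_t, s_t, m_t, n_t, p_t)\,dy_t$$

The computation of Eqn1 is generally difficult. For some specific functions, Eqn1 has been derived. For instance, when $g(\cdot)$ is defined to be:

$$g(y_t) = Y_t(k), \qquad k = 0, 1, \ldots, K-1$$

where $Y_t(k)$ is the kth coefficient of the DFT of $y_t$, Eqn1 is equivalent to the popular Wiener filter.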
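When $g(y_t)$ is a DFT coefficient, each conditional expectation is a Wiener-filtered version of the noisy coefficient, so the MMSE estimate becomes a weighted sum of Wiener filters, with the weights $W_t$ acting as posterior probabilities. The sketch below illustrates that combination for one frame. It assumes the composite state $(s_t, m_t, n_t, p_t)$ has been flattened into a single index, and that each such index supplies a speech and a noise power spectrum; the function name and data layout are hypothetical.

```python
def mmse_dft_estimate(z_bins, weights, speech_psds, noise_psds):
    """Y_hat(k) = sum_c W[c] * H_c(k) * Z(k), with H_c(k) = S_yy,c(k) / (S_yy,c(k) + S_vv,c(k)).

    z_bins:      complex DFT coefficients Z(k) of the noisy frame
    weights:     posterior weight W[c] of each composite state c (sums to 1)
    speech_psds: speech_psds[c][k], speech power spectrum for composite state c
    noise_psds:  noise_psds[c][k], noise power spectrum for composite state c
    """
    estimate = []
    for k, z in enumerate(z_bins):
        gain = 0.0
        for w, s_yy, s_vv in zip(weights, speech_psds, noise_psds):
            gain += w * s_yy[k] / (s_yy[k] + s_vv[k])   # weighted Wiener gain
        estimate.append(gain * z)
    return estimate

z_frame = [2 + 0j, 1j]
w = [0.75, 0.25]
speech = [[3.0, 1.0], [1.0, 1.0]]
noise = [[1.0, 1.0], [1.0, 3.0]]
y_hat = mmse_dft_estimate(z_frame, w, speech, noise)
print(y_hat)
```

With a single composite state of weight 1 the function reduces to the ordinary Wiener filter $S_{yy} / (S_{yy} + S_{vv})$ applied to each bin.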
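Recursive Formula For G:

$$G_t(z_0^t) = \sum_{s_{t-1}=1}^{M}\sum_{m_{t-1}=1}^{L}\sum_{n_{t-1}=1}^{N}\sum_{p_{t-1}=1}^{P} G_{t-1}(z_0^{t-1})\,a_{s_{t-1}s_t}\,a_{n_{t-1}n_t}\,c_{m_t|s_t}\,c_{p_t|n_t}\,b(z_t \mid s_t, m_t, n_t, p_t)$$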
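The recursion for $G$ has the shape of the HMM forward recursion: the previous values are propagated through the transition probabilities and multiplied by the density of the new observation, and the weights $W_t$ follow by normalization. The following sketch shows this under a simplifying assumption: the composite state $(s_t, m_t, n_t, p_t)$ is flattened into one index $c$, so that `trans[c_prev][c]` stands for the product $a_{s_{t-1}s_t}\,a_{n_{t-1}n_t}\,c_{m_t|s_t}\,c_{p_t|n_t}$ and `emis[t][c]` for $b(z_t \mid s_t, m_t, n_t, p_t)$. The names and toy numbers are hypothetical.

```python
def normalize(g):
    """Scale a list of non-negative values so that it sums to one."""
    total = sum(g)
    return [x / total for x in g]

def forward_weights(init, trans, emis):
    """Return W_t for every frame t.

    init[c]:          initial probability of composite state c
    trans[c_prev][c]: transition weight from c_prev to c
    emis[t][c]:       density of observation z_t given composite state c
    """
    n_states = len(init)
    g = [init[c] * emis[0][c] for c in range(n_states)]        # G_0
    weights = [normalize(g)]
    for t in range(1, len(emis)):
        g = [sum(g[prev] * trans[prev][c] for prev in range(n_states)) * emis[t][c]
             for c in range(n_states)]                          # G_t from G_{t-1}
        weights.append(normalize(g))                            # W_t = G_t / sum G_t
    return weights

weights = forward_weights(
    init=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.2, 0.8]],
    emis=[[0.5, 0.1], [0.4, 0.3]],
)
print(weights)
```

For long signals the raw values of $G$ shrink toward zero. Rescaling `g` at every frame leaves the weights unchanged and is a common way to avoid underflow.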
Automatic Noise Type Selection
Nonstationary State HMM

$$y_t = g_{i,m}(t) + N_t \qquad (K \times 1)$$

$g_{i,m}(t)$: Deterministic function
$N_t$: Stationary residual (assumed to be an iid zero-mean Gaussian source)

NS-HMM Parameters: $a_{ij}$, $c_{i,m}$, $g_{i,m}(t)$, covariance of $N_t$, with $i, j = 1, 2, \ldots, M$ and $m = 1, 2, \ldots, L$
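Nonstationary-State HMM

For example, if the deterministic function is assumed to be polynomial,

$$y_t = \sum_{r=0}^{R} B_{i,m}(r)\,h_r(t - \tau_i) + N_t(\Sigma_{i,m})$$

$\tau_i$: The starting time to visit the ith state
$h_r$: an rth order polynomial (usually orthogonal)

$$b_{j,m}(y_t \mid d) = \frac{1}{(2\pi)^{K/2}\,|\Sigma_{j,m}|^{1/2}}\,\exp\Big\{-\frac{1}{2}\Big[y_t - \sum_{r=0}^{R} B_{j,m}(r)\,h_r(d)\Big]^{Tr}\,\Sigma_{j,m}^{-1}\,\Big[y_t - \sum_{r=0}^{R} B_{j,m}(r)\,h_r(d)\Big]\Big\}$$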
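In the nonstationary-state HMM the mean of the output density is not constant: it follows a polynomial trend in the time $d$ already spent in the state. The sketch below evaluates such a density in the scalar case ($K = 1$). Two simplifications are assumptions of this example: the basis functions are plain monomials $h_r(d) = d^r$ (the slides mention that orthogonal polynomials are usually used), and the covariance is a single variance. Function names are hypothetical.

```python
import math

def trend_mean(coeffs, d):
    """Polynomial trend sum_r B(r) * h_r(d), using monomials h_r(d) = d**r."""
    return sum(b * d ** r for r, b in enumerate(coeffs))

def emission_density(y, d, coeffs, var):
    """Scalar Gaussian density of y around the trend value at duration d."""
    resid = y - trend_mean(coeffs, d)
    return math.exp(-0.5 * resid * resid / var) / math.sqrt(2.0 * math.pi * var)

# A state whose mean starts at 1.0 and rises by 0.5 per frame spent in the state.
coeffs = [1.0, 0.5]
on_trend = emission_density(2.0, 2, coeffs, 1.0)    # observation exactly on the trend
off_trend = emission_density(2.0, 0, coeffs, 1.0)   # same observation, wrong duration
print(on_trend, off_trend)
```

The same observation gets a different score depending on how long the state has been occupied, which is what lets the model follow trajectories inside a state.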
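Segmentation Algorithm in NS-HMM

$s_0, s_1, \ldots, s_{T-1}$: state sequence
$y_0, y_1, \ldots, y_{T-1}$: observation sequence
$d_0, d_1, \ldots, d_{T-1}$: duration sequence

$$\delta_t(j, m, d) = \max_{s_0, s_1, \ldots, s_{t-1}} P(s_0, s_1, \ldots, s_t = j,\ m_t = m,\ d_t = d,\ y_0, y_1, \ldots, y_t)$$
$$\psi_t(j, m, d) = \arg\max_{s_0, s_1, \ldots, s_{t-1}} P(s_0, s_1, \ldots, s_t = j,\ m_t = m,\ d_t = d,\ y_0, y_1, \ldots, y_t)$$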
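Segmentation Algorithm in NS-HMM

1- Initialization:

$$\delta_0(j, m, 0) = \pi_j\,c_{j,m}\,b_{j,m}(y_0 \mid 0), \qquad 1 \le j \le M,\ 1 \le m \le L$$

2- Recursion for d = 0 (entering a new Markov state):

$$\delta_t(j, m, 0) = \max_{i}\ \max_{m'}\ \max_{\tau}\ \big[\delta_{t-1}(i, m', \tau)\,a_{ij}\big]\,c_{j,m}\,b_{j,m}(y_t \mid 0)$$
$$\psi_t(j, m, 0) = \arg\max_{i,\,m',\,\tau}\ \big[\delta_{t-1}(i, m', \tau)\,a_{ij}\big]$$

for $0 < t < T$, $1 \le j \le M$, $1 \le m \le L$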
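3- Recursion step for d > 0 (self looping):

$$\delta_t(j, m, d) = \delta_{t-1}(j, m, d-1)\,a_{jj}\,c_{j,m}\,b_{j,m}(y_t \mid d)$$
$$\psi_t(j, m, d) = (j, m, d-1)$$

for $0 < t < T$, $1 \le j \le M$, $1 \le m \le L$, $0 < d \le t$
(assuming the mixture is not changed within a state)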
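4- Termination:

$$P^* = \max_{1 \le i \le M}\ \max_{1 \le m \le L}\ \max_{0 \le d \le T-1}\ \delta_{T-1}(i, m, d)$$
$$(s^*_{T-1}, m^*_{T-1}, d^*_{T-1}) = \arg\max_{1 \le i \le M}\ \max_{1 \le m \le L}\ \max_{0 \le d \le T-1}\ \delta_{T-1}(i, m, d)$$

5- Backtracking:

$$(s^*_t, m^*_t, d^*_t) = \psi_{t+1}(s^*_{t+1}, m^*_{t+1}, d^*_{t+1}), \qquad \text{for } t = T-2, T-3, \ldots, 0$$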
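The five steps of the segmentation algorithm (initialization, recursion on entering a state, recursion on self looping, termination, backtracking) can be sketched as a Viterbi search over pairs of state and duration. The code below is a simplified illustration, not the slides' exact procedure: it works in the log domain, omits the mixture index (as if $L = 1$), and only allows a state to be entered from a different state. The function names and the toy emission model are hypothetical.

```python
import math

def segment(obs, log_pi, log_a, log_b):
    """Duration-aware Viterbi search.

    log_pi[j]:      log initial probability of state j
    log_a[i][j]:    log transition probability from state i to state j
    log_b(j, y, d): log density of observation y in state j at duration d
    Returns the best state sequence and the matching duration sequence.
    """
    n_states, T = len(log_pi), len(obs)
    # 1- Initialization: every state starts with duration 0.
    delta = [{(j, 0): log_pi[j] + log_b(j, obs[0], 0) for j in range(n_states)}]
    psi = [{}]
    for t in range(1, T):
        cur, back = {}, {}
        for j in range(n_states):
            # 2- d = 0: enter state j from another state i at any duration tau.
            cands = [(score + log_a[i][j], (i, tau))
                     for (i, tau), score in delta[t - 1].items() if i != j]
            if cands:
                best, arg = max(cands, key=lambda c: c[0])
                cur[(j, 0)] = best + log_b(j, obs[t], 0)
                back[(j, 0)] = arg
            # 3- d > 0: self loop, the duration grows by one frame.
            for d in range(1, t + 1):
                prev = delta[t - 1].get((j, d - 1))
                if prev is not None:
                    cur[(j, d)] = prev + log_a[j][j] + log_b(j, obs[t], d)
                    back[(j, d)] = (j, d - 1)
        delta.append(cur)
        psi.append(back)
    # 4- Termination: best final (state, duration) pair.
    key = max(delta[-1], key=delta[-1].get)
    # 5- Backtracking.
    path = [key]
    for t in range(T - 1, 0, -1):
        key = psi[t][key]
        path.append(key)
    path.reverse()
    return [j for j, _ in path], [d for _, d in path]

def toy_log_b(j, y, d):
    """State 0 ramps up from 0, state 1 ramps down from 5 (unit variance)."""
    mean = float(d) if j == 0 else 5.0 - d
    return -0.5 * (y - mean) ** 2 - 0.5 * math.log(2.0 * math.pi)

log_half = math.log(0.5)
states, durations = segment(
    [0.0, 1.0, 2.0, 5.0, 4.0, 3.0],
    [log_half, log_half],
    [[log_half, log_half], [log_half, log_half]],
    toy_log_b,
)
print(states, durations)
```

In the toy run the observations rise for three frames and then fall for three frames, so the search assigns the first half to the rising state and the second half to the falling state, with the duration counter restarting at the switch.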
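Now we generalize MMSE formulae for NS-HMM:

$$\hat{g}(y_t) = E\{g(y_t) \mid z_0^t\} = \sum_{s_t=1}^{M}\sum_{m_t=1}^{L}\sum_{n_t=1}^{N}\sum_{p_t=1}^{P}\sum_{d_t} W_t(d_t, z_0^t)\,.\,E\{g(y_t) \mid z_t, s_t, m_t, n_t, p_t, d_t\}$$

$$E\{g(y_t) \mid z_t, s_t, m_t, n_t, p_t, d_t\} = \int g(y_t)\,f(y_t \mid z_t, s_t, m_t, n_t, p_t, d_t)\,dy_t$$

$$W_t(d_t, z_0^t) = \frac{G_t(d_t, z_0^t)}{\sum_{s_t=1}^{M}\sum_{m_t=1}^{L}\sum_{n_t=1}^{N}\sum_{p_t=1}^{P}\sum_{d_t} G_t(d_t, z_0^t)}$$

for $1 \le s_t \le M$, $1 \le m_t \le L$, $1 \le n_t \le N$, $1 \le p_t \le P$, $d_t \le t$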
For the calculation of $E\{g(y_t) \mid \ldots\}$, $g(\cdot)$ has to be specified. It has been shown that for:

$$g(y_t) = Y_t(k), \qquad k = 0, \ldots, K-1$$

($Y_t(k)$: the kth component of the DFT of $y_t$) the computation cost is lower than for other functions.
A linear estimation using the MMSE criterion has shown that the density of the kth component of $g$ in the expectation

$$E\{g_k \mid z_t, s_t, m_t, n_t, p_t, d_t\} = \int Y_t(k)\,f(Y_t(k) \mid z_t, s_t, m_t, n_t, p_t, d_t)\,dY_t(k)$$

is Gaussian, i.e., $f(Y_t(k) \mid z_t, s_t, m_t, n_t, p_t, d_t)$ is Gaussian with mean:

$$\tilde{H}_{s_t, m_t, n_t, p_t, d_t}(k)\,Z_t(k)$$

where $Z_t(k)$ is the kth component of the DFT of $z_t$ and $\tilde{H}_{s_t, m_t, n_t, p_t, d_t}(k)$ is the kth component of the Wiener filter for the corresponding state, mixture and duration of speech and noise.
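Recursive calculation of G, with duration constraints:

For entering a new state:

$$G_t(0, z_0^t) = \sum_{s_{t-1}=1}^{M}\sum_{m_{t-1}=1}^{L}\sum_{n_{t-1}=1}^{N}\sum_{p_{t-1}=1}^{P}\sum_{d_{t-1}} G_{t-1}(d_{t-1}, z_0^{t-1})\,a_{s_{t-1}s_t}\,a_{n_{t-1}n_t}\,c_{m_t|s_t}\,c_{p_t|n_t}\,b(z_t \mid s_t, m_t, n_t, p_t, 0)$$

For staying in the old state:

$$G_t(d, z_0^t) = \sum_{n_{t-1}=1}^{N}\sum_{p_{t-1}=1}^{P} G_{t-1}(d-1, z_0^{t-1})\,a_{s_t s_t}\,a_{n_{t-1}n_t}\,c_{m_t|s_t}\,c_{p_t|n_t}\,b(z_t \mid s_t, m_t, n_t, p_t, d), \qquad 1 \le d \le t$$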