![Page 1: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/1.jpg)
Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone
Joseph Tabrikian
Dept. of Electrical and Computer Engineering
Ben-Gurion University of the Negev
Workshop on: Speech Enhancement and Multichannel Audio Processing
Technion 22.2.2007
BGU
![Page 2: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/2.jpg)
Outline
- Motivation
- Single source pitch estimation and tracking
- Multiple source pitch estimation and tracking
- Experiments
- Conclusion
![Page 3: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/3.jpg)
Motivation
- Speech enhancement
- Sensitivity of many audio processing algorithms to interference. For example:
  - Automatic speech/speaker recognition
  - Speech/music compression
- Single microphone blind source separation (BSS)
- Karaoke
![Page 4: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/4.jpg)
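Single Source - Modeling
Voiced frames - harmonic model:

$$y(t_n)=\sum_{k=1}^{K} b_k\cos(k\omega t_n+\phi_k)+v(t_n),\qquad n=1,\dots,N$$

where $\omega$ is the pitch (fundamental angular frequency) and $v(t_n)$ is additive Gaussian noise.

In matrix notation:

$$\mathbf{y}=\mathbf{A}(\omega)\,\mathbf{b}+\mathbf{v},\qquad \mathbf{v}\sim N(\mathbf{0},\mathbf{R}_v)$$

$$\mathbf{A}(\omega)=\begin{bmatrix}
1 & \cos\omega t_1 & \cdots & \cos K\omega t_1 & \sin\omega t_1 & \cdots & \sin K\omega t_1\\
1 & \cos\omega t_2 & \cdots & \cos K\omega t_2 & \sin\omega t_2 & \cdots & \sin K\omega t_2\\
\vdots & \vdots & & \vdots & \vdots & & \vdots\\
1 & \cos\omega t_N & \cdots & \cos K\omega t_N & \sin\omega t_N & \cdots & \sin K\omega t_N
\end{bmatrix}$$

$$\mathbf{b}=\begin{bmatrix} b_0 & b_{c1} & \cdots & b_{cK} & b_{s1} & \cdots & b_{sK}\end{bmatrix}^T$$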
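As a minimal sketch of the matrix form of the harmonic model on this slide, the following Python builds the matrix of a constant column plus cosine and sine harmonics for a candidate pitch, then recovers the amplitude vector by least squares. The function name `harmonic_matrix`, the sampling rate, the frame length, and the amplitudes are illustrative assumptions, not values from the talk.

```python
import numpy as np

def harmonic_matrix(omega, t, K):
    """N x (2K+1) matrix [1, cos(k*omega*t), sin(k*omega*t)] for k = 1..K."""
    k = np.arange(1, K + 1)
    phase = np.outer(t, k) * omega                  # shape (N, K)
    ones = np.ones((len(t), 1))                     # column for the DC term b_0
    return np.hstack([ones, np.cos(phase), np.sin(phase)])

# Hypothetical example values (not taken from the slides).
fs = 8000.0                                         # sampling rate [Hz]
t = np.arange(240) / fs                             # one 30 ms frame
omega_true = 2 * np.pi * 200.0                      # 200 Hz pitch in rad/s
K = 5
# b = [b_0, b_c1..b_cK, b_s1..b_sK]
b_true = np.array([0.0, 1.0, 0.5, 0.3, 0.2, 0.1, 0.4, 0.0, 0.2, 0.0, 0.1])

A = harmonic_matrix(omega_true, t, K)
y = A @ b_true                                      # noise-free voiced frame
b_hat, *_ = np.linalg.lstsq(A, y, rcond=None)       # least-squares amplitudes
```

With noise-free data and the true pitch, the least-squares solution reproduces `b_true`; with noise it becomes the amplitude estimate that the ML pitch estimator uses implicitly.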
![Page 5: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/5.jpg)
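Single Source – Pitch Tracking
Maximum Likelihood (ML) estimator:

$$\hat{\omega}=\arg\max_{\omega}\left\|\mathbf{P}_{\mathbf{R}_v^{-1/2}\mathbf{A}}\,\mathbf{R}_v^{-1/2}\mathbf{y}\right\|^2$$

$$\mathbf{P}_{\mathbf{R}_v^{-1/2}\mathbf{A}}=\mathbf{R}_v^{-1/2}\mathbf{A}(\omega)\left(\mathbf{A}^H(\omega)\mathbf{R}_v^{-1}\mathbf{A}(\omega)\right)^{-1}\mathbf{A}^H(\omega)\mathbf{R}_v^{-1/2}$$

Pitch tracking:
- The data vector at the mth frame:

$$\mathbf{y}_m=\mathbf{A}(\omega_m)\,\mathbf{b}_m+\mathbf{v}_m,\qquad m=1,\dots,M$$

- $\{\omega_m\}_{m=1}^{M}$ - first-order Markov process:

$$f(\omega_1,\dots,\omega_M)=\prod_{m=1}^{M} f(\omega_m\mid\omega_{m-1})$$

  where $f(\omega_1\mid\omega_0)$ denotes the prior $f(\omega_1)$.
- Maximum A-posteriori Probability (MAP) pitch tracking via the Viterbi algorithm. (Tabrikian-Dubnov-Dickalov 2004)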
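The single-frame ML estimator can be sketched as a grid search over candidate pitches. The sketch below assumes white noise, so the whitening by the noise covariance drops out and the criterion reduces to the energy of the frame projected onto the column space of the harmonic matrix. The helper names, the pitch grid, and the test signal are assumptions made for illustration.

```python
import numpy as np

def harmonic_matrix(omega, t, K):
    """N x (2K+1) matrix [1, cos(k*omega*t), sin(k*omega*t)] for k = 1..K."""
    phase = np.outer(t, np.arange(1, K + 1)) * omega
    return np.hstack([np.ones((len(t), 1)), np.cos(phase), np.sin(phase)])

def ml_pitch(y, t, K, freqs_hz):
    """Grid-search ML pitch under a white-noise assumption.

    Returns the grid frequency maximizing ||P_A y||^2 and all scores.
    """
    scores = np.empty(len(freqs_hz))
    for i, f in enumerate(freqs_hz):
        A = harmonic_matrix(2 * np.pi * f, t, K)
        Q, _ = np.linalg.qr(A)                 # orthonormal basis of span(A)
        scores[i] = np.sum((Q.T @ y) ** 2)     # energy of the projection P_A y
    return freqs_hz[np.argmax(scores)], scores

# Hypothetical test frame: 200 Hz pitch, three harmonics, mild white noise.
rng = np.random.default_rng(0)
fs, N, K = 8000.0, 240, 3
t = np.arange(N) / fs
f_true = 200.0
amps = [1.0, 0.5, 0.3]
y = sum(a * np.cos(2 * np.pi * (k + 1) * f_true * t + 0.3 * k)
        for k, a in enumerate(amps))
y = y + 0.05 * rng.standard_normal(N)

freqs = np.arange(80.0, 401.0, 1.0)            # candidate pitches [Hz]
f_hat, scores = ml_pitch(y, t, K, freqs)
```

The per-frame `scores` are the quantities a tracker would combine with the Markov transition model across frames.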
![Page 6: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/6.jpg)
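Single Source - Voicing Decision
Unvoiced model - colored Gaussian noise model:

$$\mathbf{y}\sim N(\mathbf{0},\mathbf{R}_y)$$

Voiced/unvoiced decision by the Generalized Likelihood Ratio Test (GLRT):

$$\mathrm{GLRT}=\frac{\max_{\omega,\mathbf{b},\sigma_v^2} f(\mathbf{y}\mid\omega,\mathbf{b},\sigma_v^2;H_{voiced})}{\max_{\mathbf{R}_y} f(\mathbf{y}\mid\mathbf{R}_y;H_{unvoiced})}$$

The resulting statistic, which involves the residual $(\mathbf{I}-\mathbf{P}_{\mathbf{A}(\hat{\omega})})\mathbf{y}$, is compared with a threshold: $H_{voiced}$ is decided above it and $H_{unvoiced}$ below it.
(Fisher-Tabrikian-Dubnov 2006)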
![Page 7: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/7.jpg)
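Multiple Sources
ML estimator of $\theta$ from $\{\mathbf{y}_j\}_{j=1}^{J}$ under the model:

$$\mathbf{y}_j=s_j\,\mathbf{a}(\theta)+\mathbf{v}_j$$

with unknown signal and unknown (Gaussian) noise covariance:

$$\hat{\theta}_{ML}=\arg\max_{\theta}\;-\sum_{l=1}^{L-1}\left[\log\max\big(\lambda_l(\mathbf{G}),\sigma^2\big)+\frac{\lambda_l(\mathbf{G})}{\max\big(\lambda_l(\mathbf{G}),\sigma^2\big)}\right]$$

$$\mathbf{G}=\mathbf{T}^T\mathbf{R}_y\mathbf{T},\qquad \mathbf{T}=\mathrm{svd}\big(\mathbf{I}-\mathbf{a}\mathbf{a}^T\big):\ L\times(L-1)$$

$$\sigma^2\to 0:\qquad \hat{\theta}=\arg\max_{\theta}\;-\sum_{l=1}^{L-1}\log\lambda_l(\mathbf{G})$$

$$J\ge L:\qquad \hat{\theta}=\arg\max_{\theta}\;\frac{1}{\mathbf{a}^T\mathbf{R}_y^{-1}\mathbf{a}}\qquad\text{(MVDR)}$$

(Harmanci-Tabrikian-Krolik 2000)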
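The step from the log-eigenvalue criterion to the MVDR form can be checked numerically. The sketch below assumes a unit-norm vector `a`, a sample covariance `R` built from random data with at least as many snapshots as dimensions, and a matrix `T` whose columns span the orthogonal complement of `a`. All names and sizes are hypothetical. For such a `T`, det(TᵀRT) = det(R)·aᵀR⁻¹a, so the negative log-determinant differs from the log MVDR spectrum only by a term that does not depend on `a`.

```python
import numpy as np

def complement_basis(a):
    """L x (L-1) orthonormal basis of the complement of unit vector a,
    taken from the SVD of I - a a^T (singular values: 1, ..., 1, 0)."""
    L = len(a)
    U, _, _ = np.linalg.svd(np.eye(L) - np.outer(a, a))
    return U[:, :L - 1]

def neg_log_eig_cost(a, R):
    """-sum_l log lambda_l(G) with G = T^T R T."""
    T = complement_basis(a)
    G = T.T @ R @ T
    return -np.sum(np.log(np.linalg.eigvalsh(G)))

def mvdr(a, R):
    """MVDR spectrum 1 / (a^T R^{-1} a)."""
    return 1.0 / (a @ np.linalg.solve(R, a))

# Hypothetical data: L-dimensional snapshots, J >= L so R is invertible.
rng = np.random.default_rng(0)
L, J = 6, 40
Y = rng.standard_normal((L, J))
R = Y @ Y.T / J

a = rng.standard_normal(L)
a /= np.linalg.norm(a)                      # unit-norm vector (assumption)

lhs = neg_log_eig_cost(a, R)
rhs = np.log(mvdr(a, R)) - np.log(np.linalg.det(R))
```

Because `log det(R)` is the same for every candidate, maximizing `lhs` over the parameter selects the same value as maximizing the MVDR spectrum.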
![Page 8: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/8.jpg)
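Multiple Sources
Voiced model:

$$\mathbf{y}=\mathbf{A}(\omega)\,\mathbf{b}+\mathbf{v},\qquad \mathbf{v}\sim N(\mathbf{0},\mathbf{R}_v)$$

- $\mathbf{v}$ includes other interferences.
- $\mathbf{R}_v$ is unknown.
- Using $J$ overlapping subframes of size $L_s$ ($2K+1<J<L_s$):

$$\hat{\omega}_{ML}=\arg\max_{\omega}\;-\sum_{j=1}^{J}\log\lambda_j\big(\mathbf{G}(\omega)\big),$$

$$\mathbf{G}(\omega)=\frac{1}{J}\,\mathbf{Y}^T\big(\mathbf{I}-\mathbf{U}_A\mathbf{U}_A^T\big)\mathbf{Y},\qquad \mathbf{A}(\omega)=\mathbf{U}_A\boldsymbol{\Lambda}_A\mathbf{V}_A^T$$

$$\mathbf{R}_y=\frac{1}{J}\,\mathbf{Y}\mathbf{Y}^T$$

- jth column of $\mathbf{Y}$: $\begin{bmatrix} y_j & y_{j+1} & \cdots & y_{j+L_s-1}\end{bmatrix}^T$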
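A minimal sketch of the subframe-based estimator follows. It stacks overlapping subframes (shifted by one sample) into a data matrix, projects out the harmonic subspace of each candidate pitch, and scores the candidate by the negative sum of log-eigenvalues of the residual Gram matrix. The two-source test signal, the pitch grid, the one-sample shift, and the small eigenvalue floor are assumptions made for illustration.

```python
import numpy as np

def harmonic_matrix(omega, t, K):
    """N x (2K+1) matrix [1, cos(k*omega*t), sin(k*omega*t)] for k = 1..K."""
    phase = np.outer(t, np.arange(1, K + 1)) * omega
    return np.hstack([np.ones((len(t), 1)), np.cos(phase), np.sin(phase)])

def subframe_matrix(y, J):
    """Ls x J matrix; column j is the length-Ls subframe starting at sample j."""
    Ls = len(y) - J + 1
    return np.column_stack([y[j:j + Ls] for j in range(J)])

def neg_log_eig_cost(freq_hz, Y, t_sub, K):
    """-sum_j log lambda_j(G), G = Y^T (I - U U^T) Y / J."""
    A = harmonic_matrix(2 * np.pi * freq_hz, t_sub, K)
    U, _, _ = np.linalg.svd(A, full_matrices=False)   # A = U Lambda V^T
    G = Y.T @ (Y - U @ (U.T @ Y)) / Y.shape[1]
    lam = np.maximum(np.linalg.eigvalsh(G), 1e-12)    # numerical guard only
    return -np.sum(np.log(lam))

def tone(f0, amps, t):
    """Harmonic signal with fundamental f0 and the given harmonic amplitudes."""
    return sum(a * np.cos(2 * np.pi * (k + 1) * f0 * t + 0.5 * k)
               for k, a in enumerate(amps))

# Hypothetical mixture: strong source at 200 Hz, weaker source at 150 Hz.
rng = np.random.default_rng(1)
fs, N, J, K = 8000.0, 240, 20, 3                      # 2K+1 < J < Ls
t = np.arange(N) / fs
y = tone(200.0, [1.0, 0.6, 0.4], t) + tone(150.0, [0.3, 0.2, 0.15], t)
y = y + 0.01 * rng.standard_normal(N)

Y = subframe_matrix(y, J)
t_sub = t[:Y.shape[0]]                                # time axis of one subframe
freqs = np.arange(120.0, 321.0, 1.0)
costs = np.array([neg_log_eig_cost(f, Y, t_sub, K) for f in freqs])
f_hat = freqs[np.argmax(costs)]
```

The same time axis `t_sub` serves every subframe because shifting a sinusoid in time keeps it inside the span of the cosine and sine columns at that frequency.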
![Page 9: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/9.jpg)
Multiple Sources
Pitch tracking:
- The data vector at the mth frame:

$$\mathbf{y}_m=\mathbf{A}(\omega_m)\,\mathbf{b}_m+\mathbf{v}_m,\qquad m=1,\dots,M$$

- $\{\omega_m\}_{m=1}^{M}$ - first-order Markov process
- Maximum A-posteriori Probability (MAP) pitch tracking via the Viterbi algorithm
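MAP tracking over a discretized pitch grid can be sketched with a plain Viterbi recursion. The per-frame log-likelihoods would come from the single- or multiple-source likelihood functions. Here they are small made-up numbers, and the random-walk transition model with its `penalty` parameter is an assumption, not the prior used in the talk.

```python
import math

def random_walk_log_trans(S, penalty):
    """log f(state j | state i): normalized exp(-penalty * |i - j|)."""
    rows = []
    for i in range(S):
        w = [math.exp(-penalty * abs(i - j)) for j in range(S)]
        z = sum(w)
        rows.append([math.log(x / z) for x in w])
    return rows

def viterbi_map(loglik, log_trans, log_prior):
    """MAP state path. loglik[m][i] is the log-likelihood of frame m at state i."""
    S = len(log_prior)
    score = [log_prior[i] + loglik[0][i] for i in range(S)]
    back = []                                   # back-pointers per frame
    for m in range(1, len(loglik)):
        prev, score, ptr = score, [], []
        for j in range(S):
            best = max(range(S), key=lambda i: prev[i] + log_trans[i][j])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][j] + loglik[m][j])
        back.append(ptr)
    path = [max(range(S), key=lambda i: score[i])]
    for ptr in reversed(back):                  # trace back from the last frame
        path.append(ptr[path[-1]])
    return path[::-1]

# Hypothetical 4-frame example with 3 pitch states; frame 2 has an outlier.
loglik = [[0.0, -5.0, -5.0],
          [0.0, -5.0, -5.0],
          [-2.0, -5.0, 0.0],
          [0.0, -5.0, -5.0]]
log_prior = [math.log(1.0 / 3.0)] * 3
log_trans = random_walk_log_trans(3, penalty=3.0)

framewise = [max(range(3), key=lambda i: row[i]) for row in loglik]
path = viterbi_map(loglik, log_trans, log_prior)
```

In this toy example the frame-by-frame maximum jumps to state 2 at the outlier frame, while the MAP path stays in state 0 because the two jumps cost more than the likelihood gain.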
![Page 10: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/10.jpg)
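Multiple Sources - Voicing Decision
Unvoiced model - colored Gaussian noise model:

$$\mathbf{y}\sim N(\mathbf{0},\mathbf{R}_y)$$

Voiced/unvoiced decision by the GLRT:

$$\mathrm{GLRT}=\frac{\max_{\omega,\mathbf{b},\mathbf{R}_v} f(\mathbf{y}\mid\omega,\mathbf{b},\mathbf{R}_v;H_{voiced})}{\max_{\mathbf{R}_y} f(\mathbf{y}\mid\mathbf{R}_y;H_{unvoiced})}$$

The statistic is compared with a threshold: $H_{voiced}$ is decided above it and $H_{unvoiced}$ below it.
(Fisher-Tabrikian-Dubnov 2007)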
![Page 11: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/11.jpg)
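Multiple Source Models
Exact ML for the strongest voiced signal, and “locally ML” for other voiced signals.
[Figure: likelihood function versus pitch, with the ML estimate $\hat{\omega}_{1,ML}$ and locally ML (LML) estimates such as $\hat{\omega}_{2,LML}$ marked on the curve.]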
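One possible reading of "locally ML" is to take the global maximum of the likelihood curve for the strongest source and the next-highest local maxima for the remaining sources. The sketch below implements that reading. The function name `pick_pitches` and the sample curve are hypothetical, and the talk may define the locally ML estimates differently.

```python
def pick_pitches(freqs, loglik, n_sources):
    """Return the n_sources highest local maxima of a sampled likelihood curve.

    The first returned value is the global (ML) peak among interior points;
    the others are local maxima in decreasing order of likelihood.
    """
    peaks = [i for i in range(1, len(loglik) - 1)
             if loglik[i] > loglik[i - 1] and loglik[i] >= loglik[i + 1]]
    peaks.sort(key=lambda i: loglik[i], reverse=True)
    return [freqs[i] for i in peaks[:n_sources]]

# Hypothetical normalized log-likelihood curve over a coarse pitch grid [Hz].
freqs = [150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250]
loglik = [-60, -40, -25, -45, -50, 0, -30, -50, -55, -58, -60]

pitches = pick_pitches(freqs, loglik, n_sources=2)
```

Here the curve peaks at 200 Hz and has a secondary peak at 170 Hz, so those two values are returned in that order.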
![Page 12: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/12.jpg)
Experiments – Single Source
![Page 13: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/13.jpg)
Experiments - Two Sources
[Figure: "Two voiced sources". Normalized log-likelihood (axis from -90 to 0) versus frequency (150 to 350 Hz).]
![Page 14: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/14.jpg)
Experiments – Voicing Decision
![Page 15: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/15.jpg)
Experiments – Voicing Decision
![Page 16: Multiple Pitch Tracking for Blind Source Separation Using a Single Microphone Joseph Tabrikian Dept. of Electrical and Computer Engineering Ben-Gurion.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d485503460f94a241e3/html5/thumbnails/16.jpg)
Conclusions
- ML pitch estimators for single and multiple sources have been developed under the harmonic model for voiced frames.
- The derived likelihood functions under the two models allow implementation of the Viterbi algorithm for MAP pitch tracking.
- The GLRT for voicing decision is derived under the two models.
- Future work:
  - Development of multiple hypothesis tracking methods for single microphone BSS.
  - Adaptive estimation of the number of harmonics.