Factor Analysis of Acoustic Features for Streamed Hidden Markov Modeling

Post on 18-Jan-2016


Factor Analysis of Acoustic Features for Streamed Hidden Markov Modeling. Chuan-Wei Ting, Department of Computer Science and Information Engineering, National Cheng Kung University, Tainan, Taiwan. Outline: Introduction; Cepstral Factor Analysis; FA Streamed Hidden Markov Model; Experiments.

Transcript of Factor Analysis of Acoustic Features for Streamed Hidden Markov Modeling

Factor Analysis of Acoustic Features for Streamed Hidden Markov Modeling

Chuan-Wei Ting

Department of Computer Science and Information Engineering,

National Cheng Kung University, Tainan, Taiwan

2

Outline

• Introduction

• Cepstral Factor Analysis

• FA Streamed Hidden Markov Model

• Experiments

• Conclusions & Future Works

3

Outline

• Introduction

• Stochastic modeling

• Cepstral Factor Analysis

• FA Streamed Hidden Markov Model

• Experiments

• Conclusions & Future Works

4

Introduction

• The objective of constructing an acoustic model is to capture the characteristics of the speech signal.

• Stochastic modeling

• Hidden Markov model (HMM)

• Multi-Stream HMM

• Factorial HMM

5

Hidden Markov Model

• Topology of HMM

• Constraints

• All features are “tied” together

• Topology

• Transition moment

• Independence assumption

[Figure: HMM state chain $s_{t-1} \rightarrow s_t \rightarrow s_{t+1}$]

6

Multi-Stream HMM

• Topology of Multi-stream HMM

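[Equation, only partly recoverable from the transcript: the likelihood $p(\mathbf{Y} \mid M)$ is written as a product over the sub-unit segments $j = 1, \dots, J$, and each segment term combines the likelihoods of the streams $m = 1, \dots, M$.]

[Figure: multi-stream HMM topology. Segment $j$ is modeled by parallel stream models $M_j^{(1)}, \dots, M_j^{(m)}, \dots, M_j^{(M)}$, followed by $M_{j+1}^{(1)}, \dots, M_{j+1}^{(m)}, \dots, M_{j+1}^{(M)}$.]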

7

Simplification of Multi-Stream HMM

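• Streams are assumed to be statistically independent

$$\log p(\mathbf{Y} \mid M) = \sum_{j=1}^{J} \sum_{m=1}^{M} \log p(\mathbf{Y}_j^m \mid M_j^m)$$

• Weighted log-likelihood approach

$$\log p(\mathbf{Y} \mid M) = \sum_{j=1}^{J} \sum_{m=1}^{M} w_j^m \log p(\mathbf{Y}_j^m \mid M_j^m)$$

where $w_j^m$ is the weight of stream $m$ in segment $j$.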
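The weighted log-likelihood combination on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the function name `combine_stream_loglik`, the array layout (segments by streams), and the toy numbers are all assumptions made for the example.

```python
import numpy as np

def combine_stream_loglik(stream_loglik, weights=None):
    """Combine per-segment, per-stream log-likelihoods into one score.

    stream_loglik: array of shape (J, M) holding log p(Y_j^m | M_j^m).
    weights: hypothetical stream weights, shape (M,) or (J, M);
             None gives the unweighted sum (independent streams).
    """
    stream_loglik = np.asarray(stream_loglik, dtype=float)
    if weights is None:
        weights = np.ones_like(stream_loglik)
    weights = np.broadcast_to(np.asarray(weights, dtype=float),
                              stream_loglik.shape)
    return float(np.sum(weights * stream_loglik))

# Toy values (invented): J = 2 segments, M = 3 streams.
toy_loglik = np.array([[-1.0, -2.0, -3.0],
                       [-0.5, -1.5, -2.5]])
unweighted = combine_stream_loglik(toy_loglik)
weighted = combine_stream_loglik(toy_loglik, weights=[0.5, 0.3, 0.2])
```

With all weights equal to one, the weighted form reduces to the independent-stream sum.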

8

Factorial HMM

• Topology of FHMM

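[Figure: FHMM topology. $M$ parallel Markov chains $s^{(1)}, s^{(2)}, \dots, s^{(M)}$ evolve over times $t-1$, $t$, $t+1$ and jointly generate the observations $y_{t-1}$, $y_t$, $y_{t+1}$.]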

9

Outline

• Introduction

• Cepstral Factor Analysis

• Features analysis

• Factor analysis

• FA Streamed Hidden Markov Model

• Experiments

• Conclusions & Future Works

10

Cepstral Factor Analysis

• Feature analysis

• Dynamics of different features

[Figure: trajectories of the 1st, 4th, and 13th MFCC. Horizontal axis: time (sec), 0 to 0.45; vertical axis: MFCC value, -15 to 15.]

• Correlations

11

Factor Analysis

• Discover the correlations inherent in observation data.

• Applications

• Data compression

• Signal processing

• Acoustic modeling

12

Mathematical Definition of FA

• FA conducts data analysis of the multivariate observations using the common factors and the specific factors.

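• For a $D$-dimensional feature vector $\mathbf{y} = [y_1, \dots, y_D]^T$, the general form of the FA model is given by

$$\mathbf{y} = \mathbf{W}_f \mathbf{f} + \boldsymbol{\varepsilon}$$

where $\mathbf{f} \sim N(\mathbf{0}, \mathbf{I})$ is the $M \times 1$ common factor, $\mathbf{W}_f$ is the $D \times M$ factor loading matrix, and $\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \boldsymbol{\psi})$ is the $D \times 1$ specific factor.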
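To make the generative form of the FA model concrete (observation = loading matrix times common factor, plus specific factor), the sketch below samples from it and checks that the sample covariance approaches $\mathbf{W}_f \mathbf{W}_f^T + \boldsymbol{\psi}$. The dimensions, the random seed, and the variable names are illustrative assumptions, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
D, M, N = 5, 2, 200_000          # illustrative sizes (assumed)

W_f = rng.normal(size=(D, M))                 # D x M factor loading matrix
psi_diag = rng.uniform(0.1, 0.5, size=D)      # diagonal of psi

f = rng.normal(size=(N, M))                       # common factors ~ N(0, I)
eps = rng.normal(size=(N, D)) * np.sqrt(psi_diag) # specific factors ~ N(0, psi)
Y = f @ W_f.T + eps                               # rows are y = W_f f + eps

R_model = W_f @ W_f.T + np.diag(psi_diag)     # covariance implied by the model
R_sample = (Y.T @ Y) / N                      # estimate of E[y y^T] (zero mean)
max_abs_err = np.max(np.abs(R_model - R_sample))
```

The gap `max_abs_err` shrinks as `N` grows, which is the covariance structure the estimation methods on the following slides rely on.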

13

Principal Component Solution

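• Find an estimator $\mathbf{W}_f$ that will approximate the fundamental expression

$$\mathbf{R}_y = E[\mathbf{y}\mathbf{y}^T] = \mathbf{W}_f \mathbf{W}_f^T + \boldsymbol{\psi}$$

• Decompose the covariance matrix of the observations

$$\mathbf{R}_y = \mathbf{V} \boldsymbol{\Lambda} \mathbf{V}^T = \mathbf{V}_f \boldsymbol{\Lambda}_f^{1/2} \boldsymbol{\Lambda}_f^{1/2} \mathbf{V}_f^T + \mathbf{V}_r \boldsymbol{\Lambda}_r \mathbf{V}_r^T$$

• FA parameters can be estimated by

$$\mathbf{y} = \mathbf{V}\mathbf{c} = \mathbf{V}_f \mathbf{c}_f + \mathbf{V}_r \mathbf{c}_r = \mathbf{W}_f \mathbf{f} + \mathbf{W}_r \mathbf{r}$$

with $\mathbf{W}_f = \mathbf{V}_f \boldsymbol{\Lambda}_f^{1/2}$, $\mathbf{f} = \boldsymbol{\Lambda}_f^{-1/2} \mathbf{c}_f$, $\mathbf{W}_r = \mathbf{V}_r \boldsymbol{\Lambda}_r^{1/2}$, and $\mathbf{r} = \boldsymbol{\Lambda}_r^{-1/2} \mathbf{c}_r$, where the subscript $f$ marks the $M$ leading eigenpairs and $r$ the remaining (residual) ones.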
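A possible NumPy rendering of the principal component solution is sketched below: the covariance matrix is eigendecomposed, the leading eigenpairs form the common-factor loadings, and the rest form the residual loadings. The function name `pc_factor_solution` and the random test matrix are assumptions for illustration only.

```python
import numpy as np

def pc_factor_solution(R_y, M):
    """Split R_y into common and residual loadings via its eigenpairs."""
    eigval, eigvec = np.linalg.eigh(R_y)       # eigh returns ascending order
    order = np.argsort(eigval)[::-1]           # reorder to descending
    eigval, eigvec = eigval[order], eigvec[:, order]
    # Scale eigenvector columns by the square roots of their eigenvalues.
    W_f = eigvec[:, :M] * np.sqrt(eigval[:M])
    W_r = eigvec[:, M:] * np.sqrt(np.clip(eigval[M:], 0.0, None))
    return W_f, W_r

# Toy symmetric positive semi-definite matrix (invented for the example).
rng = np.random.default_rng(1)
A = rng.normal(size=(6, 6))
R_toy = A @ A.T / 6
W_f_pc, W_r_pc = pc_factor_solution(R_toy, M=2)
recon = W_f_pc @ W_f_pc.T + W_r_pc @ W_r_pc.T
```

Keeping both parts reproduces the covariance exactly; dropping the residual part gives the rank-`M` approximation carried by the common factors.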

14

Principal Factor Analysis Solution

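• Start from an initial (diagonal) estimate $\hat{\boldsymbol{\psi}}$ and obtain the loading matrix $\hat{\mathbf{W}}_f$ from

$$\mathbf{R}_y - \hat{\boldsymbol{\psi}} = \hat{\mathbf{W}}_f \hat{\mathbf{W}}_f^T$$

• Obtain an estimate of $\mathbf{W}_f$ by performing a principal component analysis on $\mathbf{R}_y - \boldsymbol{\psi}$.

• This process is continued until the communality estimates $\sum_{m=1}^{M} \hat{w}_{dm}^2$, $d = 1, \dots, D$, converge.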
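The iteration described on this slide (subtract a diagonal estimate of the specific variances, run a principal component analysis on the reduced matrix, update the communalities, repeat) might look like the following. The starting value for the specific variances, the clipping that keeps them positive, and the stopping rule are assumptions of this sketch; the slides do not specify them.

```python
import numpy as np

def principal_factor(R_y, M, n_iter=500, tol=1e-10):
    """Iterated principal factor solution (a sketch under stated assumptions).

    Returns the D x M loading estimate and the diagonal of psi.
    """
    diag_R = np.diag(R_y).copy()
    # Assumed starting point: psi_d = 1 / [R^{-1}]_dd, a common textbook choice.
    psi = 1.0 / np.diag(np.linalg.inv(R_y))
    communality = diag_R - psi
    W = np.zeros((R_y.shape[0], M))
    for _ in range(n_iter):
        reduced = R_y - np.diag(psi)                  # reduced matrix R_y - psi
        eigval, eigvec = np.linalg.eigh(reduced)
        top = np.argsort(eigval)[::-1][:M]            # M leading eigenpairs
        W = eigvec[:, top] * np.sqrt(np.clip(eigval[top], 0.0, None))
        new_communality = np.sum(W ** 2, axis=1)      # sum over m of w_dm^2
        psi = np.clip(diag_R - new_communality, 1e-8, None)
        converged = np.max(np.abs(new_communality - communality)) < tol
        communality = new_communality
        if converged:
            break
    return W, psi

# Toy one-factor correlation matrix (numbers invented for illustration).
w_true = np.array([0.9, 0.8, 0.7, 0.6, 0.5])
R_toy_pf = np.outer(w_true, w_true) + np.diag(1.0 - w_true ** 2)
W_pf, psi_pf = principal_factor(R_toy_pf, M=1)
offdiag_resid = R_toy_pf - W_pf @ W_pf.T
np.fill_diagonal(offdiag_resid, 0.0)
```

The off-diagonal residual shows how much of the correlation structure the estimated loadings leave unexplained.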

15

Maximum Likelihood Solution

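• When FA is carried out on the correlation matrix $\mathbf{R}$

$$\psi_d = 1 - \sum_{m=1}^{M} w_{dm}^2, \quad d = 1, \dots, D$$

$$\boldsymbol{\psi}^{-1/2} \mathbf{R} \boldsymbol{\psi}^{-1/2} \left( \boldsymbol{\psi}^{-1/2} \mathbf{W} \right) = \left( \boldsymbol{\psi}^{-1/2} \mathbf{W} \right) \tilde{\mathbf{U}}$$

• Where $\mathbf{R} = \mathbf{U}^{-1/2} \boldsymbol{\Sigma} \mathbf{U}^{-1/2}$, $\boldsymbol{\Sigma} = \frac{1}{N-1} \sum_{i=1}^{N} (\mathbf{y}_i - \bar{\mathbf{y}})(\mathbf{y}_i - \bar{\mathbf{y}})^T$, $\mathbf{U}$ is the diagonal matrix holding the diagonal elements of $\boldsymbol{\Sigma}$, $\mathbf{W} = [w_{dm}]$, and $\tilde{\mathbf{U}}$ is a diagonal matrix.

• With an initial estimate $\boldsymbol{\psi}_0$, the loading estimate $\hat{\mathbf{W}}$ satisfies

$$\boldsymbol{\psi}_0^{-1/2} \mathbf{R} \boldsymbol{\psi}_0^{-1/2} \left( \boldsymbol{\psi}_0^{-1/2} \hat{\mathbf{W}} \right) = \left( \boldsymbol{\psi}_0^{-1/2} \hat{\mathbf{W}} \right) \tilde{\mathbf{U}}$$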

16

Rotation of Loading Matrix

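• Rotate the loading matrix by an orthogonal matrix $\boldsymbol{\Gamma}$

$$\mathbf{H} = \mathbf{W} \boldsymbol{\Gamma}$$

• Where $\mathbf{H}$ satisfies

$$\mathbf{H}\mathbf{H}^T = (\mathbf{W}\boldsymbol{\Gamma})(\mathbf{W}\boldsymbol{\Gamma})^T = \mathbf{W} \boldsymbol{\Gamma} \boldsymbol{\Gamma}^T \mathbf{W}^T = \mathbf{W}\mathbf{W}^T$$

• Varimax rotation

• Let $q_i = \sum_{j=1}^{D} h_{ij}^2$, $i = 1, \dots, D$, where $\mathbf{H} = [h_{ij}]$

• $\boldsymbol{\Gamma}$ can be obtained by maximizing

$$\sum_{j=1}^{D} \left[ \frac{1}{D} \sum_{i=1}^{D} \left( \frac{h_{ij}^2}{q_i} \right)^2 - \left( \frac{1}{D} \sum_{i=1}^{D} \frac{h_{ij}^2}{q_i} \right)^2 \right]$$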
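As an illustration of finding an orthogonal rotation of the loading matrix, the sketch below uses a widely used SVD-based iteration for the raw (unnormalized) varimax criterion. It is a swapped-in variant, not necessarily the procedure behind the slides: it omits the per-row normalization by communality, and the function name `varimax`, the `gamma` parameter, and the stopping rule are assumptions. The input is the unrotated three-MFCC loading table shown on the "Effectiveness of Rotation" slide.

```python
import numpy as np

def varimax(W, gamma=1.0, n_iter=100, tol=1e-8):
    """Return (H, G) with H = W @ G and G orthogonal (raw varimax sketch)."""
    D, M = W.shape
    G = np.eye(M)
    d_old = 0.0
    for _ in range(n_iter):
        H = W @ G
        # Gradient-like matrix of the raw varimax criterion.
        B = W.T @ (H ** 3 - (gamma / D) * H @ np.diag(np.sum(H ** 2, axis=0)))
        u, s, vt = np.linalg.svd(B)
        G = u @ vt                     # nearest orthogonal matrix to B
        d_new = np.sum(s)
        if d_old != 0.0 and d_new / d_old < 1.0 + tol:
            break
        d_old = d_new
    return W @ G, G

# Unrotated loadings of the 1st, 4th and 13th MFCC (from the slide's table).
W_mfcc = np.array([[0.842, 0.011],
                   [-0.312, -0.724],
                   [0.896, 0.120]])
H_rot, G_rot = varimax(W_mfcc)
```

Because the rotation matrix is orthogonal, the product of the loadings with their transpose is unchanged, so the rotation only redistributes loadings across factors without altering the modeled covariance.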

17

Effectiveness of Rotation

• Obtain greater discriminability

(a) Before rotation

             1st Factor   2nd Factor
1st MFCC       0.842        0.011
4th MFCC      -0.312       -0.724
13th MFCC      0.896        0.120

(b) After rotation

             1st Rotated Factor   2nd Rotated Factor
1st MFCC          -0.892               -0.004
4th MFCC           0.266                0.791
13th MFCC         -0.933               -0.135

18

Outline

• Introduction

• Cepstral Factor Analysis

• FA Streamed Hidden Markov Model

• Survey of different HMMs

• FASHMM

• Experiments

• Conclusions & Future Works

19

FA Streamed HMM

• Using FA, the processes of observed features and hidden states are represented by common factors and residual factors.

20

Survey of Different HMMs (FAHMM)

• Covariance matrix modeling

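• Full vs. diagonal

• Sufficient data problem

• FA representation

$$\mathbf{R}_y^{-1} = \boldsymbol{\psi}^{-1} - \boldsymbol{\psi}^{-1} \mathbf{W}_f \left( \mathbf{I} + \mathbf{W}_f^T \boldsymbol{\psi}^{-1} \mathbf{W}_f \right)^{-1} \mathbf{W}_f^T \boldsymbol{\psi}^{-1}$$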

• State/latent representation

• Discrete vs. continuous
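The FA representation of the covariance matrix gives a cheap inverse: with a diagonal specific-factor covariance, only a small matrix of size (number of factors) squared has to be inverted, via the matrix inversion lemma. The sketch below checks this numerically against a direct inverse. Sizes, seed, and variable names are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
D, M = 6, 2                                   # illustrative sizes (assumed)
W_fa = rng.normal(size=(D, M))                # factor loading matrix
psi_d = rng.uniform(0.2, 1.0, size=D)         # diagonal of psi

R_fa = W_fa @ W_fa.T + np.diag(psi_d)         # FA-structured covariance
psi_inv = np.diag(1.0 / psi_d)                # trivial inverse of a diagonal

# Matrix inversion lemma: only the M x M matrix `small` is solved against.
small = np.eye(M) + W_fa.T @ psi_inv @ W_fa
R_inv_lemma = psi_inv - psi_inv @ W_fa @ np.linalg.solve(small, W_fa.T @ psi_inv)
R_inv_direct = np.linalg.inv(R_fa)

# Free parameters: full covariance versus FA representation.
n_full = D * (D + 1) // 2
n_fa = D * M + D
```

For these sizes the FA form uses 18 parameters against 21 for a full covariance; the saving grows quickly with the feature dimension, which is the point of the full-versus-diagonal trade-off on this slide.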

21

Survey of Different HMMs (Streamed HMM)

• In standard HMM, the joint probability of the observation sequence $Y = \{\mathbf{y}_1, \mathbf{y}_2, \dots, \mathbf{y}_T\}$ and the state sequence $S = \{s_1, s_2, \dots, s_T\}$ is represented by

$$p(S, Y) = p(s_1)\, p(\mathbf{y}_1 \mid s_1) \prod_{t=2}^{T} p(s_t \mid s_{t-1})\, p(\mathbf{y}_t \mid s_t)$$

• Using FHMM, the state at time $t$ is extended to $M$ states, i.e. $s_t = \{ s_t^{(1)}, \dots, s_t^{(m)}, \dots, s_t^{(M)} \}$.

• Likelihood combination

• Multi-stream HMM: sub-word level

• FHMM: frame level
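The standard HMM joint probability on this slide (initial state probability, times the first emission, times the product of transitions and emissions for later frames) can be evaluated in the log domain as follows. The function name, the argument layout, and the toy two-state model are assumptions made for this example.

```python
import numpy as np

def hmm_joint_logprob(states, obs_loglik, log_pi, log_A):
    """log p(S, Y) for one state path.

    states:     length-T sequence of state indices s_1..s_T
    obs_loglik: (T, K) array, entry [t, k] = log p(y_t | s_t = k)
    log_pi:     (K,) log initial state probabilities
    log_A:      (K, K) log transitions, entry [i, j] = log p(s_t = j | s_{t-1} = i)
    """
    states = np.asarray(states)
    total = log_pi[states[0]] + obs_loglik[0, states[0]]
    for t in range(1, len(states)):
        total += log_A[states[t - 1], states[t]] + obs_loglik[t, states[t]]
    return float(total)

# Toy two-state model with invented probabilities.
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3],
              [0.2, 0.8]])
b = np.array([[0.9, 0.1],      # p(y_1 | s)
              [0.2, 0.8],      # p(y_2 | s)
              [0.3, 0.7]])     # p(y_3 | s)
logp = hmm_joint_logprob([0, 1, 1], np.log(b), np.log(pi), np.log(A))
```

For the path 0, 1, 1 the product is 0.6 x 0.9 x 0.3 x 0.8 x 0.8 x 0.7 = 0.072576, which the log-domain sum reproduces.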

22

Likelihood Function of FHMM

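• State transition probability

$$p(s_t \mid s_{t-1}) = \prod_{m=1}^{M} p(s_t^{(m)} \mid s_{t-1}^{(m)})$$

• Likelihood function

$$p(\mathbf{y}_t \mid s_t) = (2\pi)^{-D/2} |\boldsymbol{\Sigma}|^{-1/2} \exp\left\{ -\frac{1}{2} \left( \mathbf{y}_t - \sum_{m=1}^{M} \boldsymbol{\mu}_t^{(m)} \right)^T \boldsymbol{\Sigma}^{-1} \left( \mathbf{y}_t - \sum_{m=1}^{M} \boldsymbol{\mu}_t^{(m)} \right) \right\}$$

where $\boldsymbol{\mu}_t^{(m)}$ is the mean contributed by chain $m$ at time $t$ and $\boldsymbol{\Sigma}$ is the common covariance matrix.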
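A small sketch of the FHMM likelihood function: the observation is Gaussian with a mean equal to the sum of the means selected by each chain's current state, and a covariance shared by all states. The function name `fhmm_emission_logpdf` and the toy values are assumptions for illustration.

```python
import numpy as np

def fhmm_emission_logpdf(y_t, chain_means, cov):
    """log p(y_t | s_t) with mean = sum of per-chain means, common covariance.

    y_t:         (D,) observation
    chain_means: (M, D) array, row m = mean selected by the state of chain m
    cov:         (D, D) common covariance matrix
    """
    y_t = np.asarray(y_t, dtype=float)
    mean = np.sum(np.asarray(chain_means, dtype=float), axis=0)
    diff = y_t - mean
    D = y_t.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)     # Mahalanobis term
    return float(-0.5 * (D * np.log(2.0 * np.pi) + logdet + quad))

# Toy case: two chains whose means add up exactly to the observation.
toy_means = np.array([[1.0, 0.0],
                      [0.0, 2.0]])
logpdf_at_mean = fhmm_emission_logpdf([1.0, 2.0], toy_means, np.eye(2))
```

With an identity covariance and zero residual, the value is simply the negative log of 2π for a two-dimensional observation.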

23

Estimation Approaches for FHMM

• Exact inference

• Expectation maximization (EM) algorithm

• Complexity: $O(TMK^{M+1})$, $O(TK^{2M})$

• Approximations

• Gibbs sampling: $O(TMK)$

• Variational inference: $O(TMK^2)$
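To get a feel for the complexity orders listed on this slide, the snippet below plugs in small illustrative values for the sequence length, the number of chains, and the number of states per chain. The values and the dictionary keys are assumptions, and constant factors are ignored.

```python
def fhmm_cost_orders(T, M, K):
    """Evaluate the complexity expressions from the slide (constants dropped)."""
    return {
        "T*K^(2M)": T * K ** (2 * M),        # exact inference, second expression
        "T*M*K^(M+1)": T * M * K ** (M + 1), # exact inference, first expression
        "T*M*K": T * M * K,                  # Gibbs sampling
        "T*M*K^2": T * M * K ** 2,           # variational inference
    }

# Illustrative setting (assumed): 100 frames, 3 chains, 3 states per chain.
costs = fhmm_cost_orders(T=100, M=3, K=3)
```

Even in this tiny setting the exact expressions are one to two orders of magnitude above the approximate ones, and the gap widens exponentially with the number of chains.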

24

FASHMM

• According to the FA method, each common factor $f_m$ is associated with a group of highly correlated features.

• Correlated features are grouped together in a stream and share the same FA parameters.

• The observed feature vector can be represented by

$$\mathbf{y} = \mathbf{W}\mathbf{f} + \mathbf{W}_r \mathbf{r} = [\mathbf{w}_1\ \mathbf{w}_2\ \cdots\ \mathbf{w}_M\ \mathbf{W}_r]\, [f_1\ f_2\ \cdots\ f_M\ \mathbf{r}^T]^T$$

25

Topology of FASHMM

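[Figure: FASHMM topology. Parallel Markov chains $s^{f_1}, \dots, s^{f_M}$ for the common factors and $s^{r}$ for the residual factor evolve over times $t-1$, $t$, $t+1$ and jointly generate the observations $y_{t-1}$, $y_t$, $y_{t+1}$.]

• State transition probability

$$\begin{aligned} p(s_t \mid s_{t-1}) &= p(s_t^{f_1}, s_t^{f_2}, \dots, s_t^{f_M}, s_t^{r} \mid s_{t-1}^{f_1}, s_{t-1}^{f_2}, \dots, s_{t-1}^{f_M}, s_{t-1}^{r}) \\ &= p(s_t^{r} \mid s_{t-1}^{r}) \prod_{m=1}^{M} p(s_t^{f_m} \mid s_{t-1}^{f_m}) \end{aligned}$$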
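Because the FASHMM state transition factorizes into one term per common-factor chain and one for the residual chain, the transition matrix over joint states is the Kronecker product of the per-stream matrices. The sketch below builds it for two common-factor chains and one residual chain. The two-state left-to-right matrices and all names are invented for illustration.

```python
import numpy as np
from functools import reduce

def joint_transition(stream_transitions):
    """Kronecker product of per-stream transition matrices."""
    return reduce(np.kron, stream_transitions)

# Hypothetical two-state left-to-right chains (values invented).
A_f1 = np.array([[0.9, 0.1], [0.0, 1.0]])   # chain of common factor f_1
A_f2 = np.array([[0.8, 0.2], [0.0, 1.0]])   # chain of common factor f_2
A_r = np.array([[0.7, 0.3], [0.0, 1.0]])    # chain of the residual factor
A_joint = joint_transition([A_f1, A_f2, A_r])

# Joint transition from states (0, 1, 0) to (1, 1, 0) across the three chains.
src = np.ravel_multi_index((0, 1, 0), (2, 2, 2))
dst = np.ravel_multi_index((1, 1, 0), (2, 2, 2))
p_joint = A_joint[src, dst]    # equals A_f1[0, 1] * A_f2[1, 1] * A_r[0, 0]
```

In practice the joint matrix would rarely be materialized, since its size grows exponentially with the number of chains; storing the small per-stream matrices is what keeps the factorized model compact.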

26

Outline

• Introduction

• Cepstral Factor Analysis

• FA Streamed Hidden Markov Model

• Experiments

• Simulated data setup

• HMM vs. FASHMM

• Recognition results & discussion

• Conclusions & Future Works

27

Experimental Setup

• Simulated data

• 4 classes, 5 variables

• Training: 100 sentences, 5 “words” per sentence

• Testing: 50 utterances, 4 “words” per sentence

• Model structure

• HMM

• 7 states per class

• One Gaussian per state

• FASHMM

• 3 states per class

• One Gaussian per state

28

Class 1

[Figure: simulated data for Class 1, Series 1 to Series 5, plotted against sample index.]

        CF1       CF2       CF3       CF4       CF5
V1     0.9662    0.1598    0.1863    0.0171    0.0775
V2    -0.2655   -0.9526   -0.0807   -0.1246   -0.0046
V3     0.2394   -0.1161    0.9639    0.0108    0.0008
V4     0.9697    0.1644    0.1639   -0.0001   -0.0755
V5     0.0675    0.9565   -0.2565   -0.1212   -0.0045

[Figure: common factors CF1 to CF5 for Class 1, plotted against sample index.]
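One simple way to turn a loading table like the Class 1 table into streams is to assign each variable to the common factor on which it loads most strongly in absolute value. This assignment rule is an assumption of the sketch, not a procedure stated in the slides; the numbers are the Class 1 values (rows V1 to V5, columns CF1 to CF5).

```python
import numpy as np

# Class 1 table from the slide: rows V1..V5, columns CF1..CF5.
loadings_class1 = np.array([
    [0.9662, 0.1598, 0.1863, 0.0171, 0.0775],
    [-0.2655, -0.9526, -0.0807, -0.1246, -0.0046],
    [0.2394, -0.1161, 0.9639, 0.0108, 0.0008],
    [0.9697, 0.1644, 0.1639, -0.0001, -0.0755],
    [0.0675, 0.9565, -0.2565, -0.1212, -0.0045],
])
variables = ["V1", "V2", "V3", "V4", "V5"]

# Assumed rule: each variable joins the factor with the largest |loading|.
dominant = np.argmax(np.abs(loadings_class1), axis=1)
streams = {}
for name, cf in zip(variables, dominant):
    streams.setdefault(f"CF{cf + 1}", []).append(name)
```

Under this rule the five variables fall into three groups (V1 with V4, V2 with V5, and V3 alone), matching the three panels shown for FASHMM on the "HMM vs. FASHMM" slide.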

29

Class 2

[Figure: simulated data for Class 2, Series 1 to Series 5, plotted against sample index.]

        CF1       CF2       CF3       CF4       CF5
V1     0.1317    0.9733   -0.1647   -0.0908    0.0008
V2    -0.2007   -0.0951    0.9750    0.0041   -0.0001
V3    -0.9818   -0.1093    0.1515    0.0045   -0.0339
V4    -0.9826   -0.1061    0.1486   -0.0005    0.0337
V5     0.0827    0.9931    0.0077    0.0823   -0.0006

[Figure: common factors CF1 to CF5 for Class 2, plotted against sample index.]

30

Class 3

[Figure: simulated data for Class 3, Series 1 to Series 5, plotted against sample index.]

        CF1       CF2       CF3       CF4       CF5
V1     0.0324    0.9939   -0.0704   -0.0788    0.0004
V2    -0.1435   -0.1093    0.9836   -0.0004   -0.0003
V3     0.9913    0.0285   -0.1243   -0.0013    0.0321
V4     0.9955    0.0228   -0.0867    0.0006   -0.0314
V5     0.0186    0.9926   -0.0903    0.0792   -0.0002

[Figure: common factors CF1 to CF5 for Class 3, plotted against sample index.]

31

Class 4

[Figure: simulated data for Class 4, Series 1 to Series 5, plotted against sample index.]

        CF1       CF2       CF3       CF4       CF5
V1     0.1634   -0.9746   -0.1532   -0.0040    0.0002
V2     0.1133    0.1475    0.9826    0.0013   -0.0001
V3    -0.9876    0.0482   -0.1109   -0.0950    0.0313
V4    -0.9701    0.2110   -0.0517    0.1075    0.0161
V5     0.9887   -0.1165    0.0829   -0.0003    0.0456

[Figure: common factors CF1 to CF5 for Class 4, plotted against sample index.]

32

HMM vs. FASHMM

[Figure: HMM panel showing Series 1 to Series 5 together; FASHMM panels showing Series 1 and 4, Series 2 and 5, and Series 3 separately, each plotted against sample index.]

33

Recognition Results

                         HMM     FASHMM
# States per HMM         7       3 (x4)
Recognition accuracy     100%    100%

34

Discussion

[Figure: Series 1 to Series 5 plotted against sample index.]

35

Outline

• Introduction

• Cepstral Factor Analysis

• FA Streamed Hidden Markov Model

• Experiments

• Conclusions & Future Works

36

Conclusions

• We have presented the FA approach

• Extract the common factors and the residual factors in acoustic features

• Separate the Markov chains for these factors.

• Represent the sophisticated dynamics in the stochastic process of the speech signal.

• A new topology of FA streamed HMM was proposed.

37

Future Works

• More acoustic features

• Model selection

• Streams

• States

• Mixtures

• Large vocabulary continuous speech recognition (LVCSR) task