Factor Analysis of Acoustic Features for Streamed Hidden Markov Modeling


Chuan-Wei Ting

Department of Computer Science and Information Engineering,

National Cheng Kung University, Tainan, Taiwan

2

Outline

• Introduction

• Cepstral Factor Analysis

• FA Streamed Hidden Markov Model

• Experiments

• Conclusions & Future Works

3

Outline

• Introduction

  • Stochastic modeling



• Experiments


4

Introduction

• The objective of constructing an acoustic model is to capture the characteristics of the speech signal.

• Stochastic modeling

• Hidden Markov model (HMM)

• Multi-Stream HMM

• Factorial HMM

5

Hidden Markov Model

• Topology of HMM

• Constraints

• All features are “tied” together

• Topology

• Transition moment

• Independence assumption

[Figure: HMM topology, a single state chain $s_{t-1} \rightarrow s_t \rightarrow s_{t+1}$.]

6

Multi-Stream HMM

• Topology of Multi-stream HMM

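[Equation and figure not recoverable from the slide: the likelihood $p(Y|M)$ of the multi-stream HMM is written over $J$ sub-unit segments and combines the $M$ stream likelihoods $p(Y_j^m|M_j^m)$; the figure depicts the stream models $M_j^{(1)}, \ldots, M_j^{(M)}$ and $M_{j+1}^{(1)}, \ldots, M_{j+1}^{(M)}$.]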

7

Simplification of Multi-Stream HMM

• Streams are assumed to be statistically independent

$$\log p(Y|M) = \sum_{j=1}^{J}\sum_{m=1}^{M} \log p(Y_j^m|M_j^m)$$

• Weighted log-likelihood approach

$$\log p(Y|M) = \sum_{j=1}^{J}\sum_{m=1}^{M} \omega_j^m \log p(Y_j^m|M_j^m)$$
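The two combination rules on this slide can be sketched in a few lines. This is only an illustration: the function name, the nested-list layout, and the numeric values are assumptions, not part of the original slides.

```python
import math

def combined_log_likelihood(stream_logliks, weights=None):
    """Sum (optionally weighted) stream log-likelihoods over segments and streams.

    stream_logliks[j][m] holds log p(Y_j^m | M_j^m) for segment j and stream m.
    weights[j][m] holds the stream weight; None means weight 1 everywhere,
    which corresponds to the plain independence assumption.
    """
    total = 0.0
    for j, segment in enumerate(stream_logliks):
        for m, loglik in enumerate(segment):
            w = 1.0 if weights is None else weights[j][m]
            total += w * loglik
    return total

# Two segments with two streams each; the likelihood values are made up.
logliks = [[math.log(0.5), math.log(0.2)],
           [math.log(0.4), math.log(0.1)]]
unweighted = combined_log_likelihood(logliks)
weighted = combined_log_likelihood(logliks, [[0.7, 0.3], [0.7, 0.3]])
```

With unit weights the result equals the log of the product of all stream likelihoods, so the weighted form is a strict generalization of the independent form.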

8

Factorial HMM

• Topology of FHMM

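[Figure: FHMM topology. $M$ parallel state chains $s_t^{(1)}, s_t^{(2)}, \ldots, s_t^{(M)}$ evolve over times $t-1$, $t$, $t+1$ and jointly emit the observations $y_{t-1}$, $y_t$, $y_{t+1}$.]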

9

Outline

• Introduction

• Cepstral Factor Analysis

  • Feature analysis

• Factor analysis


• Experiments


10

Cepstral Factor Analysis

• Feature analysis

• Dynamics of different features

[Figure: trajectories of the 1st, 4th, and 13th MFCC. Horizontal axis: Time (sec), 0 to 0.45; vertical axis: MFCC, -15 to 15.]

• Correlations

11

Factor Analysis

• Discover the correlations inherent in observation data.

• Applications

• Data compression

• Signal processing

• Acoustic modeling

12

Mathematical Definition of FA

• FA conducts data analysis of the multivariate observations using the common factors and the specific factors.

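• For a $D$-dimensional feature vector $y = [y_1, \ldots, y_D]^T$, the general form of the FA model is given by

$$y = W_f f + \varepsilon$$

• $f$: $M \times 1$ common factor, $f \sim N(0, I)$

• $W_f$: $D \times M$ factor loading matrix

• $\varepsilon$: $D \times 1$ specific factor, $\varepsilon \sim N(0, \psi)$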
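The generative form of the FA model can be checked numerically: data drawn as a loading matrix times common factors plus specific noise should have covariance close to the loading matrix times its transpose plus the noise covariance. The sketch below assumes a diagonal specific-factor covariance; the dimensions and random parameters are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
D, M, N = 5, 2, 200_000            # feature dim, number of factors, samples (illustrative)

W_f = rng.normal(size=(D, M))      # hypothetical D x M factor loading matrix
psi_diag = rng.uniform(0.1, 0.5, size=D)   # assumed diagonal of the specific covariance

f = rng.normal(size=(N, M))                        # common factors, f ~ N(0, I)
eps = rng.normal(size=(N, D)) * np.sqrt(psi_diag)  # specific factors, eps ~ N(0, psi)
y = f @ W_f.T + eps                                # y = W_f f + eps, one row per sample

R_model = W_f @ W_f.T + np.diag(psi_diag)   # covariance implied by the model
R_sample = (y.T @ y) / N                    # empirical E[y y^T]
relative_error = np.abs(R_sample - R_model).max() / np.abs(R_model).max()
```

The relative error shrinks as the number of samples grows, which is the sense in which the FA parameters summarize the second-order statistics of the observations.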

13

Principal Component Solution

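• Find an estimator of $W_f$ that will approximate the fundamental expression

$$R_y = E[yy^T] = W_f W_f^T + \psi$$

• Decompose covariance matrix of observation

$$R_y = V \Lambda V^T = V_f \Lambda_f^{1/2} \Lambda_f^{1/2} V_f^T + V_r \Lambda_r V_r^T$$

• FA parameters can be estimated by

$$y = V \Lambda^{1/2} c = V_f \Lambda_f^{1/2} c_f + V_r \Lambda_r^{1/2} c_r = W_f f + W_r r$$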
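A minimal sketch of the principal component solution, assuming the common-factor loadings are taken from the leading eigenvectors of the covariance matrix scaled by the square roots of their eigenvalues. The function name `pc_loadings` and the example covariance are hypothetical.

```python
import numpy as np

def pc_loadings(R_y, M):
    """Principal component estimate of the loadings from a covariance matrix.

    Returns W_f (D x M), built from the M largest eigenpairs, and the
    diagonal residual psi = diag(R_y - W_f W_f^T).
    """
    eigvals, eigvecs = np.linalg.eigh(R_y)               # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:M]                # indices of the M largest
    W_f = eigvecs[:, order] * np.sqrt(eigvals[order])    # eigenvectors scaled by sqrt(eigenvalue)
    psi = np.diag(np.diag(R_y - W_f @ W_f.T))
    return W_f, psi

# Illustrative covariance with a two-factor structure plus small isotropic noise.
W_true = np.array([[0.9, 0.0], [0.8, 0.1], [0.0, 0.9], [0.1, 0.8]])
R_y = W_true @ W_true.T + 0.05 * np.eye(4)
W_f, psi = pc_loadings(R_y, M=2)
reconstruction_error = np.abs(R_y - (W_f @ W_f.T + psi)).max()
```

The diagonal of the covariance is reproduced exactly by construction; only the off-diagonal entries carry approximation error.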

14

Principal Factor Analysis Solution

• Using an initial estimate $\hat{\psi}$ (diagonal) and then obtain loading matrix by

$$R_y - \hat{\psi} = \hat{W}_f \hat{W}_f^T$$

• Obtain an estimate of $W_f$ by performing a principal component analysis on $R_y - \psi$.

• This process is continued until the communality estimates converge.

$$\hat{h}_d^2 = \sum_{m=1}^{M} w_{dm}^2$$
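The iteration described on this slide might look as follows. This is a sketch under stated assumptions: the initial diagonal estimate (half of each variance), the stopping tolerance, and the example matrix are all choices made here for illustration, not values from the slides.

```python
import numpy as np

def principal_factor_analysis(R_y, M, n_iter=200, tol=1e-8):
    """Iterated principal factor analysis on a covariance or correlation matrix."""
    psi = np.diag(np.diag(R_y)) * 0.5               # assumed initial diagonal estimate
    communality = np.zeros(R_y.shape[0])
    for _ in range(n_iter):
        eigvals, eigvecs = np.linalg.eigh(R_y - psi)    # PCA of the reduced matrix
        order = np.argsort(eigvals)[::-1][:M]
        lam = np.clip(eigvals[order], 0.0, None)        # guard against negative eigenvalues
        W_f = eigvecs[:, order] * np.sqrt(lam)
        new_communality = np.sum(W_f ** 2, axis=1)      # sum over m of w_dm^2 per feature d
        psi = np.diag(np.diag(R_y) - new_communality)   # updated specific variances
        if np.max(np.abs(new_communality - communality)) < tol:
            break
        communality = new_communality
    return W_f, psi, new_communality

# Illustrative correlation matrix with six features and two factors.
W_true = np.array([[0.9, 0.0], [0.8, 0.1], [0.7, 0.2],
                   [0.0, 0.9], [0.1, 0.8], [0.2, 0.7]])
R_y = W_true @ W_true.T + np.diag(1.0 - np.sum(W_true ** 2, axis=1))
W_hat, psi_hat, h2 = principal_factor_analysis(R_y, M=2)
```

Each pass re-estimates the loadings from the reduced matrix and then refreshes the specific variances, so the diagonal of the covariance is always matched exactly while the off-diagonal fit improves.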

15

Maximum Likelihood Solution

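• When FA is carried out on the correlation matrix $R$

$$(\psi^{-1/2} R \psi^{-1/2})(\psi^{-1/2} W) = (\psi^{-1/2} W)\tilde{U}$$

• Where $\psi_d = 1 - \sum_{m=1}^{M} w_{dm}^2,\ d = 1, \ldots, D$, $R = U^{-1/2} \Sigma U^{-1/2}$, $\Sigma = \frac{1}{N-1}\sum_{i=1}^{N}(y_i - \bar{y})(y_i - \bar{y})^T$, $U^{1/2} = \mathrm{diag}(K_{11}^{1/2}, \ldots, K_{DD}^{1/2})$, $W = \{w_{dm}\}$, and $\tilde{U}$ is a diagonal matrix.

Starting from an initial estimate $\psi_0$:

$$(\psi_0^{-1/2} R \psi_0^{-1/2})(\psi_0^{-1/2} \hat{W}) = (\psi_0^{-1/2} \hat{W})\tilde{U}$$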
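Carrying out FA on the correlation matrix first requires scaling the sample covariance by the per-feature variances. The sketch below shows only that preprocessing step, assuming the scaling matrix is built from the diagonal of the sample covariance; the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)
mixing = np.array([[2.0, 0.5, 0.0],
                   [0.0, 1.0, 0.3],
                   [0.0, 0.0, 0.5]])
Y = rng.normal(size=(500, 3)) @ mixing      # made-up correlated observations, one per row

N = Y.shape[0]
y_bar = Y.mean(axis=0)
Sigma = (Y - y_bar).T @ (Y - y_bar) / (N - 1)         # sample covariance
U_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(Sigma)))   # inverse standard deviations
R = U_inv_sqrt @ Sigma @ U_inv_sqrt                   # correlation matrix
```

The result has a unit diagonal, which is why each specific variance can be written as one minus the corresponding communality.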

16

Rotation of Loading Matrix

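• Rotate loading matrix by an orthogonal matrix $\Gamma$

$$H = W\Gamma$$

• Where $\Gamma$ satisfies

$$HH^T = (W\Gamma)(W\Gamma)^T = W\Gamma\Gamma^T W^T = WW^T$$

• Varimax rotation

• Let $q_i = \sum_{j=1}^{D} h_{ij}^2,\ i = 1, \ldots, D$

• $\Gamma$ can be obtained by maximizing, over $H = W\Gamma$,

$$\sum_{j=1}^{D}\left[\frac{1}{D}\sum_{i=1}^{D}\left(\frac{h_{ij}^2}{q_i}\right)^2 - \left(\frac{1}{D}\sum_{i=1}^{D}\frac{h_{ij}^2}{q_i}\right)^2\right]$$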
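A common way to compute a varimax rotation is an iterative SVD-based update. The sketch below implements the raw (unnormalized) varimax criterion, which is a simplification of the row-normalized criterion on this slide; it is not claimed to be the procedure used by the author. The example loadings are the unrotated 1st, 4th, and 13th MFCC loadings from the next slide.

```python
import numpy as np

def varimax(W, max_iter=100, tol=1e-8):
    """Raw varimax rotation; returns rotated loadings H = W @ Gamma and Gamma."""
    D, M = W.shape
    Gamma = np.eye(M)
    objective = 0.0
    for _ in range(max_iter):
        H = W @ Gamma
        # Gradient-like target of the raw varimax criterion; the SVD projects
        # it back onto the set of orthogonal matrices.
        target = W.T @ (H ** 3 - H @ np.diag(np.sum(H ** 2, axis=0)) / D)
        U, s, Vt = np.linalg.svd(target)
        Gamma = U @ Vt
        new_objective = np.sum(s)
        if new_objective < objective * (1 + tol):
            break
        objective = new_objective
    return W @ Gamma, Gamma

W = np.array([[0.842, 0.011],
              [-0.312, -0.724],
              [0.896, 0.120]])
H, Gamma = varimax(W)
```

Because the rotation matrix is orthogonal, the rotated loadings reproduce the same common covariance as the unrotated ones; only the distribution of the loadings across factors changes.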

17

Effectiveness of Rotation

• Obtain greater discriminability

(a)          1st Factor   2nd Factor
1st MFCC      0.842        0.011
4th MFCC     -0.312       -0.724
13th MFCC     0.896        0.120

(b)          1st Rotated Factor   2nd Rotated Factor
1st MFCC     -0.892               -0.004
4th MFCC      0.266                0.791
13th MFCC    -0.933               -0.135

18

Outline

• Introduction


• FA Streamed Hidden Markov Model

  • Survey of different HMMs

• FASHMM

• Experiments


19

FA Streamed HMM

• Using FA, the processes of observed features and hidden states are represented by common factors and residual factors.

20

Survey of Different HMMs (FAHMM)

• Covariance matrix modeling

• Full vs. diagonal

• Sufficient data problem

• FA representation

$$R_y^{-1} = \psi^{-1} - \psi^{-1} W_f (I + W_f^T \psi^{-1} W_f)^{-1} W_f^T \psi^{-1}$$

• State/latent representation

• Discrete vs. continuous
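The FA representation of the covariance matrix makes its inverse cheap to compute, because only a small matrix of size equal to the number of factors has to be inverted (the matrix inversion lemma). The check below uses random illustrative parameters and assumes a diagonal specific covariance.

```python
import numpy as np

rng = np.random.default_rng(2)
D, M = 6, 2                                   # illustrative dimensions
W_f = rng.normal(size=(D, M))                 # hypothetical loading matrix
psi_diag = rng.uniform(0.2, 1.0, size=D)      # assumed diagonal specific variances

R_y = W_f @ W_f.T + np.diag(psi_diag)
direct = np.linalg.inv(R_y)                   # full D x D inversion

psi_inv = np.diag(1.0 / psi_diag)
small = np.eye(M) + W_f.T @ psi_inv @ W_f     # only this M x M matrix is inverted
via_fa = psi_inv - psi_inv @ W_f @ np.linalg.inv(small) @ W_f.T @ psi_inv
```

Both routes give the same inverse; the second one scales with the number of factors rather than with the feature dimension.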

21

Survey of Different HMMs (Streamed HMM)

• In standard HMM, the joint probability of observation sequence $Y = \{y_1, y_2, \ldots, y_T\}$ and state sequence $S = \{s_1, s_2, \ldots, s_T\}$ was represented by

$$p(S, Y) = p(s_1)\,p(y_1|s_1) \prod_{t=2}^{T} p(s_t|s_{t-1})\,p(y_t|s_t)$$

• Using FHMM, the state at time $t$ was extended to $M$ states, i.e. $s_t = \{s_t^{(1)}, \ldots, s_t^{(m)}, \ldots, s_t^{(M)}\}$.

• Likelihood combination

  • Multi-stream HMM: sub-word level

  • FHMM: frame level
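The joint probability of a standard HMM factorizes into an initial term and one transition-emission pair per frame, which is easiest to accumulate in the log domain. The following sketch uses made-up parameters and precomputed emission likelihoods; the names are hypothetical.

```python
import math

def joint_log_prob(states, log_pi, log_A, log_B):
    """log p(S, Y) for one state path.

    log_pi[i]   : log p(s_1 = i)
    log_A[i][j] : log p(s_t = j | s_{t-1} = i)
    log_B[t][i] : log p(y_t | s_t = i), precomputed for the observed y_t
    """
    total = log_pi[states[0]] + log_B[0][states[0]]
    for t in range(1, len(states)):
        total += log_A[states[t - 1]][states[t]] + log_B[t][states[t]]
    return total

def to_log(rows):
    """Element-wise log of a nested list of probabilities."""
    return [[math.log(v) for v in row] for row in rows]

pi = [0.6, 0.4]                              # made-up initial probabilities
A = [[0.7, 0.3], [0.2, 0.8]]                 # made-up transition matrix
B = [[0.5, 0.1], [0.4, 0.3], [0.1, 0.6]]     # B[t][i]: likelihood of y_t in state i
value = joint_log_prob([0, 0, 1], [math.log(p) for p in pi], to_log(A), to_log(B))
```

For the path used here the joint probability is the product 0.6 · 0.5 · 0.7 · 0.4 · 0.3 · 0.6.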

22

Likelihood Function of FHMM

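• State transition probability

$$p(s_t|s_{t-1}) = \prod_{m=1}^{M} p(s_t^{(m)}|s_{t-1}^{(m)})$$

• Likelihood function

$$p(y_t|s_t) = (2\pi)^{-D/2}|\Sigma|^{-1/2} \exp\left\{-\frac{1}{2}\left(y_t - \sum_{m=1}^{M}\mu_t^{(m)}\right)^T \Sigma^{-1} \left(y_t - \sum_{m=1}^{M}\mu_t^{(m)}\right)\right\},$$

where $\Sigma$ is the common covariance matrix.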
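In the FHMM likelihood each chain contributes a mean vector and all state combinations share one covariance matrix. A minimal sketch of that Gaussian log-likelihood follows; the array layout and the parameter values are assumptions for illustration.

```python
import numpy as np

def fhmm_log_emission(y, means, states, cov):
    """log p(y_t | s_t) when the mean is the sum of per-chain state means.

    means[m][k] is the D-dimensional contribution of chain m in state k,
    states[m] is the state of chain m at time t, and cov is the D x D
    covariance shared by all state combinations.
    """
    mu = sum(means[m][k] for m, k in enumerate(states))
    diff = y - mu
    D = y.shape[0]
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (D * np.log(2 * np.pi) + logdet + quad)

# Two chains with two states each, two-dimensional observations (made-up values).
means = np.array([[[0.0, 0.0], [1.0, 0.0]],
                  [[0.0, 0.0], [0.0, 2.0]]])
cov = np.eye(2)
y = np.array([1.0, 2.0])                       # equals the summed mean for states (1, 1)
log_lik = fhmm_log_emission(y, means, (1, 1), cov)
```

When the observation coincides with the summed mean and the covariance is the identity, only the normalization term remains.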

23

Estimation Approaches for FHMM

• Exact inference

• Expectation maximization (EM) algorithm

• Complexity: $O(TMK^{M+1})$, $O(TK^{2M})$

• Approximations

• Gibbs sampling: $O(TMK)$

• Variational inference: $O(TMK^2)$
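To get a feel for how these orders of growth compare, the counts can be evaluated for a small configuration. The sketch assumes $T$ frames, $M$ chains, and $K$ states per chain; the labels attached to the two exact-inference costs are an interpretation, and constants are ignored.

```python
def fhmm_costs(T, M, K):
    """Order-of-magnitude operation counts for the listed inference schemes."""
    return {
        "exact, joint state space": T * K ** (2 * M),
        "exact, chain-wise": T * M * K ** (M + 1),
        "Gibbs sampling": T * M * K,
        "variational inference": T * M * K ** 2,
    }

costs = fhmm_costs(T=100, M=3, K=3)
```

Even for three chains of three states, exact inference is more than an order of magnitude costlier than either approximation, and the gap widens exponentially with the number of chains.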

24

FASHMM

• According to the FA method, each common factor $f_m$ is associated with a group of features that are highly correlated.

• Correlated features are grouped together in a stream and share the same FA parameters.

• Observed feature vector can be represented by

$$y = W_f f + W_r r = [\,w_{f_1}\ w_{f_2}\ \cdots\ w_{f_M}\ W_r\,]\,[\,f_1\ f_2\ \cdots\ f_M\ r^T\,]^T$$
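One plausible way to turn a loading matrix into streams is to assign each feature to the factor on which it loads most strongly in absolute value. This rule is an assumption made here for illustration, not necessarily the grouping rule used by the author. The loadings are the Class 1 values from the simulated experiment.

```python
loadings = {   # Class 1 loadings on CF1..CF5, copied from the experiment table
    "V1": [0.9662, 0.1598, 0.1863, 0.0171, 0.0775],
    "V2": [-0.2655, -0.9526, -0.0807, -0.1246, -0.0046],
    "V3": [0.2394, -0.1161, 0.9639, 0.0108, 0.0008],
    "V4": [0.9697, 0.1644, 0.1639, -0.0001, -0.0755],
    "V5": [0.0675, 0.9565, -0.2565, -0.1212, -0.0045],
}

def group_into_streams(loadings):
    """Group features by the factor with the largest absolute loading."""
    streams = {}
    for name, row in loadings.items():
        dominant = max(range(len(row)), key=lambda m: abs(row[m]))
        streams.setdefault(f"CF{dominant + 1}", []).append(name)
    return streams

streams = group_into_streams(loadings)
```

Under this rule V1 and V4 share a stream, V2 and V5 share a stream, and V3 forms a stream of its own, which matches the three-panel grouping shown for FASHMM in the experiments.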

25

Topology of FASHMM

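[Figure: FASHMM topology. Parallel state chains $s_t^{f_1}, \ldots, s_t^{f_M}$ for the common factors and $s_t^{r}$ for the residual factor evolve over times $t-1$, $t$, $t+1$ and jointly emit the observations $y_{t-1}$, $y_t$, $y_{t+1}$.]

• State transition probability

$$p(s_t|s_{t-1}) = p(s_t^{f_1}, s_t^{f_2}, \ldots, s_t^{f_M}, s_t^{r} \,|\, s_{t-1}^{f_1}, s_{t-1}^{f_2}, \ldots, s_{t-1}^{f_M}, s_{t-1}^{r}) = p(s_t^{r}|s_{t-1}^{r}) \prod_{m=1}^{M} p(s_t^{f_m}|s_{t-1}^{f_m})$$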
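When the chains evolve independently, the transition matrix over the joint state is the Kronecker product of the per-chain transition matrices. The sketch below uses two common-factor chains and one residual chain with made-up two-state transition matrices.

```python
import numpy as np
from functools import reduce

A_f1 = np.array([[0.9, 0.1], [0.0, 1.0]])   # chain for common factor f_1 (made-up values)
A_f2 = np.array([[0.8, 0.2], [0.3, 0.7]])   # chain for common factor f_2
A_r = np.array([[0.6, 0.4], [0.5, 0.5]])    # chain for the residual factor r

# Joint transition matrix over all state combinations (8 x 8 here).
A_joint = reduce(np.kron, [A_f1, A_f2, A_r])

def joint_index(states, sizes):
    """Map a tuple of per-chain states to the row/column index used by np.kron."""
    idx = 0
    for s, k in zip(states, sizes):
        idx = idx * k + s
    return idx

sizes = (2, 2, 2)
prev, curr = (0, 1, 0), (1, 0, 1)
from_joint = A_joint[joint_index(prev, sizes), joint_index(curr, sizes)]
from_product = A_f1[0, 1] * A_f2[1, 0] * A_r[0, 1]
```

The factorized form stores three small matrices instead of one large one, which is the parameter saving that motivates separate chains per factor.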

26

Outline

• Introduction



• Experiments

  • Simulated data setup

• HMM vs. FASHMM

• Recognition results & discussion


27

Experimental Setup

• Simulated data
  • 4 classes, 5 variables
  • Training: 100 sentences, 5 “words” per sentence
  • Testing: 50 utterances, 4 “words” per sentence

• Model structure

  • HMM
    • 7 states per class
    • One Gaussian per state

  • FASHMM
    • 3 states per class
    • One Gaussian per state

28

Class 1

[Figure: Class 1 simulated data, Series 1 to 5.]

      CF1       CF2       CF3       CF4       CF5
V1    0.9662    0.1598    0.1863    0.0171    0.0775
V2   -0.2655   -0.9526   -0.0807   -0.1246   -0.0046
V3    0.2394   -0.1161    0.9639    0.0108    0.0008
V4    0.9697    0.1644    0.1639   -0.0001   -0.0755
V5    0.0675    0.9565   -0.2565   -0.1212   -0.0045

[Figure: Class 1 common factors CF1 to CF5.]

29

Class 2

[Figure: Class 2 simulated data, Series 1 to 5.]

      CF1       CF2       CF3       CF4       CF5
V1    0.1317    0.9733   -0.1647   -0.0908    0.0008
V2   -0.2007   -0.0951    0.9750    0.0041   -0.0001
V3   -0.9818   -0.1093    0.1515    0.0045   -0.0339
V4   -0.9826   -0.1061    0.1486   -0.0005    0.0337
V5    0.0827    0.9931    0.0077    0.0823   -0.0006

[Figure: Class 2 common factors CF1 to CF5.]

30

Class 3

[Figure: Class 3 simulated data, Series 1 to 5.]

      CF1       CF2       CF3       CF4       CF5
V1    0.0324    0.9939   -0.0704   -0.0788    0.0004
V2   -0.1435   -0.1093    0.9836   -0.0004   -0.0003
V3    0.9913    0.0285   -0.1243   -0.0013    0.0321
V4    0.9955    0.0228   -0.0867    0.0006   -0.0314
V5    0.0186    0.9926   -0.0903    0.0792   -0.0002

[Figure: Class 3 common factors CF1 to CF5.]

31

Class 4

[Figure: Class 4 simulated data, Series 1 to 5.]

      CF1       CF2       CF3       CF4       CF5
V1    0.1634   -0.9746   -0.1532   -0.0040    0.0002
V2    0.1133    0.1475    0.9826    0.0013   -0.0001
V3   -0.9876    0.0482   -0.1109   -0.0950    0.0313
V4   -0.9701    0.2110   -0.0517    0.1075    0.0161
V5    0.9887   -0.1165    0.0829   -0.0003    0.0456

[Figure: Class 4 common factors CF1 to CF5.]

32

HMM vs. FASHMM

[Figure: HMM panel showing Series 1 to 5 together; FASHMM panels showing Series 1 and 4, Series 2 and 5, and Series 3 separately.]

33

Recognition Results

                        HMM     FASHMM
# States per HMM        7       3 (x4)
Recognition Accuracy    100%    100%

34

Discussion

[Figure: simulated data, Series 1 to 5.]

35

Outline

• Introduction



• Experiments

• Conclusions & Future Works

36

Conclusions

• We have presented the FA approach

• Extract the common factors and the residual factors from acoustic features

• Separate the Markov chains for these factors.

• Represent the sophisticated dynamics in the stochastic process of the speech signal.

• A new topology of FA streamed HMM was proposed.

37

Future Works

• More acoustic features

• Model selection
  • Streams
  • States
  • Mixtures

• Large vocabulary continuous speech recognition (LVCSR) task
