Viterbi Training
• It is like Baum-Welch, except that instead of computing the expected counts A and E with the forward and backward algorithms, the most probable path for each training sequence is derived with the Viterbi algorithm, and the counts are taken from those paths.
• Guaranteed to converge, since there are only finitely many possible path assignments.
• Maximizes the likelihood of the data along the most probable paths, $P(x^1, \ldots, x^n \mid \theta, \pi^*(x^1), \ldots, \pi^*(x^n))$, rather than the full likelihood that Baum-Welch maximizes.
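A minimal sketch of the procedure, assuming a discrete-emission HMM with a uniform start distribution; the function and variable names (`viterbi_path`, `viterbi_training`, `trans`, `emit`) are illustrative, not taken from the slides:

```python
import numpy as np

def viterbi_path(trans, emit, obs):
    """Most probable state path for one observation sequence (computed in log space)."""
    n_states, L = trans.shape[0], len(obs)
    logt, loge = np.log(trans), np.log(emit)
    v = np.full((L, n_states), -np.inf)
    ptr = np.zeros((L, n_states), dtype=int)
    v[0] = np.log(1.0 / n_states) + loge[:, obs[0]]   # uniform start distribution (assumption)
    for i in range(1, L):
        for l in range(n_states):
            scores = v[i - 1] + logt[:, l]
            ptr[i, l] = np.argmax(scores)
            v[i, l] = scores[ptr[i, l]] + loge[l, obs[i]]
    path = [int(np.argmax(v[-1]))]
    for i in range(L - 1, 0, -1):
        path.append(int(ptr[i, path[-1]]))
    return path[::-1]

def viterbi_training(seqs, n_states, n_symbols, n_iter=20, pseudo=1.0):
    """Re-estimate a_kl and e_k(b) from Viterbi paths instead of Baum-Welch expectations."""
    rng = np.random.default_rng(0)
    trans = rng.dirichlet(np.ones(n_states), size=n_states)
    emit = rng.dirichlet(np.ones(n_symbols), size=n_states)
    for _ in range(n_iter):
        A = np.full((n_states, n_states), pseudo)   # transition counts along the Viterbi paths
        E = np.full((n_states, n_symbols), pseudo)  # emission counts along the Viterbi paths
        for obs in seqs:
            path = viterbi_path(trans, emit, obs)
            for i, (k, x) in enumerate(zip(path, obs)):
                E[k, x] += 1
                if i + 1 < len(path):
                    A[k, path[i + 1]] += 1
        trans = A / A.sum(axis=1, keepdims=True)
        emit = E / E.sum(axis=1, keepdims=True)
    return trans, emit
```

Convergence is handled here with a fixed number of iterations; in practice one stops once the Viterbi paths no longer change between iterations.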
Baum-Welch Example:
Generating model vs. model estimated from 300 rolls of the dice.
Model estimated from 30,000 rolls of the dice.
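To make the "300 rolls" vs. "30,000 rolls" comparison concrete, here is a small sketch of a two-state generating model (a fair and a loaded die); the specific probabilities below are illustrative assumptions, not the values on the slide:

```python
import numpy as np

# Illustrative two-state generating model: state 0 = fair die, state 1 = loaded die.
trans = np.array([[0.95, 0.05],
                  [0.10, 0.90]])
emit = np.array([[1/6] * 6,
                 [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]])

def sample_rolls(n_rolls, seed=1):
    """Sample a sequence of die rolls (symbols 0..5) from the hidden-state model."""
    rng = np.random.default_rng(seed)
    state, rolls = 0, []
    for _ in range(n_rolls):
        rolls.append(int(rng.choice(6, p=emit[state])))
        state = int(rng.choice(2, p=trans[state]))
    return rolls

rolls_300 = [sample_rolls(300)]      # small sample: noisy parameter estimates
rolls_30000 = [sample_rolls(30000)]  # large sample: estimates close to the generating model
```

Feeding these sequences to Baum-Welch (or to the `viterbi_training` sketch above) illustrates how the estimated model approaches the generating model as the amount of training data grows.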
Modeling with labeled sequences
CML (conditional maximum likelihood)
3.4 HMM model structure
• Fully connected model?
– Never works in practice due to local maxima
• In practice, successful models are constructed based on knowledge about the problem
• If we set a_kl = 0, it will remain 0 throughout Baum-Welch estimation (see the re-estimation formula below)
• How should we choose a model structure using our knowledge of the problem?
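The zero-stays-zero behavior follows directly from the standard Baum-Welch re-estimation step, since every term of the expected transition count contains $a_{kl}$ as a factor:

$$A_{kl} = \sum_j \frac{1}{P(x^j)} \sum_i f_k^j(i)\, a_{kl}\, e_l(x^j_{i+1})\, b_l^j(i+1), \qquad a_{kl}^{\text{new}} = \frac{A_{kl}}{\sum_{l'} A_{kl'}}.$$

If $a_{kl} = 0$, then $A_{kl} = 0$, so the re-estimated $a_{kl}^{\text{new}}$ is again 0 (unless a pseudocount is added).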
Duration modeling
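A brief sketch of the standard idea behind duration modeling (the values and names below are illustrative assumptions): a state with self-transition probability $p$ has a geometric length distribution, and chaining $n$ such states yields a negative binomial distribution, which can model more realistic durations.

```python
import numpy as np
from math import comb

p = 0.9                      # self-transition probability (illustrative value)
L = np.arange(1, 51)         # possible durations spent in the state (or array of states)

# Single state with a self-loop: P(duration = L) = p^(L-1) * (1 - p)  -- geometric.
geometric = p ** (L - 1) * (1 - p)

# Array of n states, each with self-loop probability p: duration = n + (number of self-loops),
# giving P(duration = L) = C(L-1, n-1) * (1-p)^n * p^(L-n)  -- negative binomial.
n = 5
neg_binomial = np.array([comb(int(l) - 1, n - 1) * (1 - p) ** n * p ** (int(l) - n) if l >= n else 0.0
                         for l in L])
```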
Silent States
Connecting each of 200 states to every later state requires 200·199/2 = 19,900 transitions. With silent states, around 600 transitions suffice for 200 non-silent (real) states, because each real state only needs a handful of transitions (roughly three) into a parallel chain of silent states.
For an HMM without loops consisting entirely of silent states, all of the HMM algorithms of Sections 3.2 and 3.3 can be extended.
For the forward algorithm:
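A sketch of the usual extension, following the standard treatment: real states are updated from the previous position as before, and silent states are then updated within the same position, processed in an order in which silent state $k$ comes before silent state $l$ whenever $a_{kl} > 0$.

$$f_l(i+1) = e_l(x_{i+1}) \sum_k f_k(i)\, a_{kl} \quad \text{for an emitting state } l,$$
$$f_l(i) = \sum_k f_k(i)\, a_{kl} \quad \text{for a silent state } l,$$

where each sum runs over all states $k$ whose value at position $i$ has already been computed, including silent states.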
For an HMM with loops consisting entirely of silent states, we can instead eliminate the silent states by computing the effective transition probabilities between the real states of the model.
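A small sketch of one way to compute those effective transition probabilities, assuming the transition matrix is partitioned into real and silent blocks (the block names `A_rr`, `A_rs`, `A_ss`, `A_sr` are assumptions for this example): summing over all silent-state paths, including loops, amounts to a matrix inversion.

```python
import numpy as np

def effective_transitions(A_rr, A_rs, A_ss, A_sr):
    """Effective real-to-real transition probabilities, summing over all silent-state paths.

    A_rr: real -> real, A_rs: real -> silent, A_ss: silent -> silent, A_sr: silent -> real.
    (I - A_ss)^-1 = I + A_ss + A_ss^2 + ... accumulates every path through the silent
    states, including loops, provided each silent loop has total probability < 1.
    """
    n_silent = A_ss.shape[0]
    silent_paths = np.linalg.inv(np.eye(n_silent) - A_ss)
    return A_rr + A_rs @ silent_paths @ A_sr
```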
3.5 Higher Order Markov Chains
2nd-order Markov Chain
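As a small illustration (the dictionaries `cond_probs` and `init_probs` are assumptions for this sketch, not the slide's notation): a 2nd-order chain conditions each symbol on the two preceding symbols, and is equivalent to a 1st-order chain over the 16 dinucleotide states.

```python
import numpy as np

def second_order_logprob(seq, cond_probs, init_probs):
    """Log probability of a DNA sequence under a 2nd-order Markov chain.

    cond_probs[(a, b)][c] = P(x_i = c | x_{i-2} = a, x_{i-1} = b)
    init_probs[(a, b)]    = P(x_1 = a, x_2 = b)
    """
    logp = np.log(init_probs[(seq[0], seq[1])])
    for i in range(2, len(seq)):
        logp += np.log(cond_probs[(seq[i - 2], seq[i - 1])][seq[i]])
    return logp

# Equivalently: relabel each pair (x_{i-1}, x_i) as a single state; the 2nd-order chain
# over {A,C,G,T} becomes a 1st-order chain over the 16 states AA, AC, ..., TT,
# in which state (a, b) can only move to states of the form (b, c).
```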
NORF: Non-coding Open Reading Frame
Inhomogeneous Markov Chain
• Use three different Markov chains, one per codon position, to model coding regions:
$\Pr(x) = P(x_1)\,\prod_{i=2}^{L} a^{c(i)}_{x_{i-1} x_i}$, where $c(i) \in \{1, 2, 3\}$ cycles with the codon position
• n-th order emission probabilities, i.e. emissions conditioned on the previous n symbols, $e_l(x_i \mid x_{i-n}, \ldots, x_{i-1})$
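A minimal sketch of scoring with the three chains, assuming `chains[c][(a, b)]` holds the transition probability $a^{c}_{ab}$ of the chain for codon position c (the data layout is an assumption for this example):

```python
import numpy as np

def coding_logprob(seq, chains, frame=0):
    """Log probability of a coding sequence under an inhomogeneous (3-periodic) Markov chain.

    chains[c][(a, b)] = transition probability a^c_{ab} of the chain for codon position c.
    """
    logp = 0.0
    for i in range(1, len(seq)):
        c = (frame + i) % 3              # the chain in use cycles with the codon position
        logp += np.log(chains[c][(seq[i - 1], seq[i])])
    return logp
```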
3.6 Numerical stability of HMM algorithms
• To avoid underflow errors, there are two ways to deal with the problem:
– The log transformation
– Scaling of probabilities
• The log transformation
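A minimal sketch of the log transformation (the helper name `logsumexp` is an assumption for this example): products of probabilities become sums of logs, so the Viterbi max carries over directly, while the forward sum needs a stable log-sum-exp.

```python
import numpy as np

def logsumexp(log_values):
    """Stable log(sum(exp(v))) for combining forward probabilities in log space."""
    m = np.max(log_values)
    if np.isneginf(m):
        return m
    return m + np.log(np.sum(np.exp(log_values - m)))

# In log space the Viterbi recursion becomes additions and a max:
#   V_l(i+1) = log e_l(x_{i+1}) + max_k ( V_k(i) + log a_kl )
# and the forward recursion uses logsumexp in place of the max.
```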
• Scaling of probabilities
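A minimal sketch of the scaled forward algorithm, assuming a discrete-emission HMM (the parameter names are illustrative): each column of forward values is divided by its sum, and the log-likelihood is recovered as the sum of the log scaling factors.

```python
import numpy as np

def scaled_forward(trans, emit, obs, start):
    """Forward algorithm with per-position scaling to avoid underflow.

    Returns the scaled forward matrix and the log-likelihood, which equals the
    sum of the log scaling factors.
    """
    L, n_states = len(obs), trans.shape[0]
    f = np.zeros((L, n_states))
    log_like = 0.0
    f[0] = start * emit[:, obs[0]]
    s = f[0].sum()                       # scaling factor for position 0
    f[0] /= s
    log_like += np.log(s)
    for i in range(1, L):
        f[i] = emit[:, obs[i]] * (f[i - 1] @ trans)
        s = f[i].sum()                   # scaling factor for position i
        f[i] /= s
        log_like += np.log(s)
    return f, log_like
```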