Pattern Recognition

Gerhard Schmidt

Christian-Albrechts-Universität zu Kiel, Faculty of Engineering, Institute of Electrical and Information Engineering, Digital Signal Processing and System Theory

Part 5: Codebook Training

Digital Signal Processing and System Theory | Pattern Recognition | Codebook Training Slide 2

Codebook Training

Contents

❑ Motivation

❑ Application examples

❑ Cost function for the training of a codebook

❑ LBG- and k-means algorithm

❑ Basic schemes

❑ Extensions

❑ Combination with additional mapping schemes

Introduction and Motivation – Part 1

Feature vectors and corresponding codebook – feature space 1:

❑ Codebook with 4 entries

❑ 10,000 feature vectors are quantized

[Figure: scatter plot of the feature vectors and the codebook entries; axes: Feature 1, Feature 2]

Introduction and Motivation – Part 2

Feature vectors and corresponding codebook – feature space 2:

❑ Codebook with 4 entries

❑ 10,000 feature vectors are quantized

[Figure: scatter plot of the feature vectors and the codebook entries; axes: Feature 1, Feature 2]

Codebooks

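Codebook definition:

❑ The codebook is a matrix (the codebook matrix) that is composed of the individual codebook vectors. [Formula not preserved in this transcript]

❑ The codebook vectors should be chosen such that they represent a large number of so-called feature vectors with a small average distance.

❑ For a given set of feature vectors and a given distance measure, the average distance between the feature vectors and their best-matching codebook vectors should be as small as possible. [Formula not preserved in this transcript]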
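Since the formulas of this slide did not survive extraction, the following is only a sketch of how the definitions are commonly written. All symbols ($\mathbf{C}$, $\mathbf{c}_k$, $K$, $\mathbf{x}(n)$, $N$, $d(\cdot,\cdot)$, $D$) are assumed notation and need not match the lecture's own.

```latex
\mathbf{C} = \big[\mathbf{c}_0,\ \mathbf{c}_1,\ \dots,\ \mathbf{c}_{K-1}\big]
\quad\text{with codebook vectors } \mathbf{c}_k,
\qquad
D = \frac{1}{N}\sum_{n=0}^{N-1}\ \min_{k}\ d\big(\mathbf{x}(n),\,\mathbf{c}_k\big)
\ \longrightarrow\ \min .
```

Here each feature vector $\mathbf{x}(n)$ contributes the distance to its closest codebook vector, and the codebook is chosen such that the mean of these contributions becomes small.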

Literature

Codebook training:

❑ L. R. Rabiner, R. W. Schafer: Digital Processing of Speech Signals, Prentice Hall, 1978

❑ C. Bishop: Pattern Recognition and Machine Learning, Springer, 2006

❑ B. Pfister, T. Kaufmann: Sprachverarbeitung, Springer, 2008 (in German)

Current version of „Sprachsignalverarbeitung“ is free via the university library …

Application Examples – Part 1

Basic structure of a codebook search:

[Block diagram] Distortion-reducing preprocessing (beamforming, echo cancellation, and noise reduction) → feature extraction (MFCCs, cepstral coefficients) → feature vector → vector quantization using the codebook (matrix of codebook vectors)

❑ Output 1: „Distance“ of the current feature vector to the „best“ codebook entry

❑ Output 2: Index (address) of the „best“ codebook entry
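The vector quantization block can be illustrated with a minimal sketch. It assumes a squared Euclidean distance and a codebook stored as a list of vectors; the function names `squared_distance` and `codebook_search` and the example values are hypothetical and not taken from the lecture.

```python
from typing import List, Sequence, Tuple


def squared_distance(x: Sequence[float], c: Sequence[float]) -> float:
    """Squared Euclidean distance between a feature vector and a codebook entry."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def codebook_search(x: Sequence[float],
                    codebook: List[Sequence[float]]) -> Tuple[int, float]:
    """Return (index, distance) of the codebook entry that is closest to x."""
    distances = [squared_distance(x, c) for c in codebook]
    best_index = min(range(len(codebook)), key=distances.__getitem__)
    return best_index, distances[best_index]


# Illustrative codebook with 4 entries in a two-dimensional feature space.
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0], [0.0, 4.0]]
index, distance = codebook_search([0.9, 1.2], codebook)
print(index, distance)
```

The two return values correspond to the two outputs of the block diagram: the index (address) of the „best“ entry and its „distance“ to the current feature vector.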

Application Examples – Part 2

Speaker recognition – structure:

[Block diagram] Distortion-reducing preprocessing → feature extraction → feature vector → parallel search in the codebooks for the first through the last speaker → best distances → accumulation of the distances over time → index of the „most probable“ speaker

Application Examples – Part 3

Speaker recognition – basic principle:

❑ The current spectral envelope of the signal is compared to the entries of several codebooks, and the distance to the „best match“ is computed for each codebook.

❑ Each codebook belongs to a speaker that is known in advance and has been trained on that speaker's data.

❑ The smallest distances of each codebook are accumulated over time.

❑ The smallest accumulated distance determines which speaker is selected.

❑ The models of the speakers that are known in advance compete against one or more „universal“ models. A new speaker can be recognized if the „universal“ codebook is better than the best individual one. In this case, a new codebook for the new speaker can be initialized.

❑ The „winner codebook“ is usually updated.
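The accumulation and decision steps above can be sketched as follows. This is a simplified illustration that assumes a squared Euclidean distance and accumulation over a fixed block of frames; the names `best_distance` and `recognize_speaker`, the codebook labels, and all numbers are hypothetical.

```python
from typing import Dict, List, Sequence


def best_distance(x: Sequence[float], codebook: List[Sequence[float]]) -> float:
    """Distance of x to its best-matching entry in one codebook."""
    return min(sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in codebook)


def recognize_speaker(frames: List[Sequence[float]],
                      codebooks: Dict[str, List[Sequence[float]]]) -> str:
    """Accumulate the best distances per codebook and pick the smallest sum."""
    accumulated = {name: 0.0 for name in codebooks}
    for x in frames:
        for name, codebook in codebooks.items():
            accumulated[name] += best_distance(x, codebook)
    return min(accumulated, key=accumulated.get)


codebooks = {
    "speaker_a": [[0.0, 0.0], [1.0, 0.0]],
    "speaker_b": [[5.0, 5.0], [6.0, 5.0]],
    "universal": [[2.5, 2.5]],  # competes against the individual models
}
frames = [[0.1, 0.0], [0.9, 0.1], [0.2, -0.1]]
winner = recognize_speaker(frames, codebooks)
print(winner)
```

If the label `"universal"` wins, the frames fit none of the known speakers well, which is the situation in which a new speaker codebook could be initialized.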

Application Examples – Part 4

Speaker recognition – basic principle:

[Figure: current spectral envelope (dB over frequency) compared with the codebook for the first speaker and the codebook for the second speaker; the „best“ entry in the first codebook and the „best“ entry in the second codebook are marked]

Application Examples – Part 5

Signal reconstruction – structure:

[Block diagram] Distortion-reducing preprocessing → feature extraction → feature vector → codebook search (matrix of codebook vectors), supported by a detection of „good“ and „bad“ SNR conditions → index (address) of the „best“ codebook entry → estimation of the undistorted spectral envelope → undistorted spectral envelope

Application Examples – Part 6

Signal reconstruction – basic principle:

❑ First, the input signal is „conventionally“ noise-reduced. In this processing step it is also determined in which frequency ranges the conventional method does not „open“ (where „open“ means that the attenuation is less than the maximum attenuation).

❑ Based on the conventionally improved signal, the spectral envelope is estimated by, e.g., the logarithmic mel-band signal powers.

❑ The codebook is then searched for the envelope that has the smallest distance to the input signal's envelope within the „allowed“ frequency range.

❑ Because the „allowed“ frequency range is not known a priori, both the features that are used and the cost function have to be chosen appropriately.

❑ Finally, the extracted spectral envelope of the input signal and the best codebook envelope are combined: the codebook envelope is used in those ranges where the conventional noise reduction fails, and the original input envelope is used for the remaining frequency range.
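The search within the „allowed“ range and the final combination can be sketched as below. The sketch assumes that the „allowed“ range is given as a boolean mask per frequency band and that a squared distance is used; `masked_search`, `reconstruct`, and the numbers are hypothetical.

```python
def masked_search(envelope, allowed, codebook):
    """Index of the codebook envelope closest to `envelope`, using allowed bands only."""
    def distance(entry):
        return sum((e - c) ** 2
                   for e, c, ok in zip(envelope, entry, allowed) if ok)
    return min(range(len(codebook)), key=lambda k: distance(codebook[k]))


def reconstruct(envelope, allowed, codebook):
    """Keep the input envelope in allowed bands, use the codebook envelope elsewhere."""
    best = codebook[masked_search(envelope, allowed, codebook)]
    return [e if ok else b for e, b, ok in zip(envelope, best, allowed)]


# Two codebook envelopes (in dB) over four frequency bands.
codebook = [[10.0, 8.0, 6.0, 4.0], [4.0, 6.0, 8.0, 10.0]]
envelope = [9.5, 8.2, 20.0, 21.0]      # upper two bands are dominated by noise
allowed = [True, True, False, False]   # bands where the noise reduction works
combined = reconstruct(envelope, allowed, codebook)
print(combined)  # [9.5, 8.2, 6.0, 4.0]
```

The mask acts like a weighting with the values 0 and 1, so the disturbed bands have no influence on which codebook entry is selected.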

Application Examples – Part 7

Signal reconstruction – basic principle:

[Figure, first panel (dB over frequency): current spectral envelope and „best“ entry of the codebook, with the frequency range marked in which a conventional noise reduction does not work satisfactorily. Second panel (dB over frequency): envelope of the input signal, best codebook envelope, and combined spectral envelope]

Application Examples – Part 8

Bandwidth extension – structure:

[Block diagram] Distortion-reducing preprocessing → feature extraction → envelope of the signal that is limited to telephone bandwidth → search in codebook 1 → index (address) of the „best“ codebook entry → lookup in codebook 2 (codebooks 1 and 2 form a double matrix with pairs of codebook vectors) → combination of the envelope of the broadband codebook and the telephone-bandlimited envelope → spectral broadband envelope

Application Examples – Part 9

Bandwidth extension – basic principle:

❑ For bandwidth extension, two codebooks are trained in parallel: one for the envelopes of the telephone-bandlimited signal and a second one for the envelopes of the broadband signal.

❑ During the training it is important that the input vectors (so-called double vectors) are available synchronously, such that identical feature data can be used for the clustering of the narrowband and broadband data.

❑ The search is done in the narrowband codebook. The corresponding broadband code vector is selected and combined with the extracted narrowband envelope in such a way that the original envelope is used in the telephone band and the broadband codebook entry is used for the remaining frequency range.
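A minimal sketch of the search and combination step is given below. It assumes that the narrowband and broadband codebooks share their indices (entry k of one belongs to entry k of the other), that the telephone band is described by a boolean mask over the broadband bands, and that a squared distance is used. The names `nearest_index`, `extend_bandwidth`, and all values are hypothetical.

```python
def nearest_index(x, codebook):
    """Index of the codebook entry with the smallest squared distance to x."""
    return min(range(len(codebook)),
               key=lambda k: sum((xi - ci) ** 2 for xi, ci in zip(x, codebook[k])))


def extend_bandwidth(nb_envelope, in_band, nb_codebook, wb_codebook):
    """Search in the narrowband codebook, then combine with the paired broadband entry."""
    k = nearest_index(nb_envelope, nb_codebook)
    measured = iter(nb_envelope)
    # Inside the telephone band keep the measured envelope, outside use the codebook.
    return [next(measured) if inside else wb
            for wb, inside in zip(wb_codebook[k], in_band)]


nb_codebook = [[5.0, 6.0, 5.0], [1.0, 2.0, 1.0]]
wb_codebook = [[3.0, 5.0, 6.0, 5.0, 2.0], [0.5, 1.0, 2.0, 1.0, 0.2]]
in_band = [False, True, True, True, False]  # telephone band within 5 broadband bands
extended = extend_bandwidth([4.8, 6.1, 5.2], in_band, nb_codebook, wb_codebook)
print(extended)  # [3.0, 4.8, 6.1, 5.2, 2.0]
```

The number of `True` entries in `in_band` has to equal the length of the narrowband envelope; the sketch does not check this.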

Application Examples – Part 10

Bandwidth extension – basic principle:

[Figure, first panel (dB over frequency): envelope in the telephone band and „best“ entry of the codebook, taken from a codebook with pairs of narrowband and broadband envelopes. Second panel (dB over frequency): input envelope, best broadband codebook envelope, and resulting broadband envelope]

Application Examples – Part 11

More application examples

❑ Speech coding: The aim is basically to code the spectral envelope with as few bits as possible. Vector quantization is a good choice here.

❑ Classification of speech sounds (sibilant sounds, vowels): Several codebook entries are trained for each speech sound. Afterwards, a good classification can be achieved. This can be used, e.g., to select between different preprocessing methods for reducing distortions or to parametrize a universal preprocessing method properly.

❑ (Noise-) environment recognition: Depending on the environment, different hard- and software components should be used (switching between cardioid and omnidirectional microphones, activation or deactivation of dereverberation, etc.). To realize such a classification, codebooks for different environments can be trained and compared during operation.

Cost Functions for the Codebook Training – Part 1

Considerations regarding complexity and convenience:

❑ For creating a codebook, the same or at least a similar cost function to that used during operation should be used.

❑ Usually all codebook entries have to be compared to the input feature vector for every processing frame. Thus, a computationally cheap cost function should be chosen: typically the squared distance or a magnitude distance.

❑ If not all elements of the feature space can be extracted with sufficient quality, a non-negative weighting for the elements can be introduced.

❑ If logarithmic feature elements are used, a zero (or a value close to zero) before taking the logarithm can lead to very large distances. For that reason, a limitation can be introduced.

Cost Functions for the Codebook Training – Part 2

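Approach (the formulas of this slide are not preserved in this transcript):

❑ Difference between the feature elements

❑ Limitation of the difference

❑ Squaring and weighting, with non-negative weights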
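Because the formulas themselves are missing here, the three steps can only be illustrated under assumptions. The sketch below assumes a symmetric clipping of each element difference to ±`limit` and one non-negative weight per feature element; the lecture's exact limitation and weighting may differ, and the name `weighted_limited_distance` is hypothetical.

```python
def weighted_limited_distance(x, c, weights, limit):
    """Cost between feature vector x and codebook vector c.

    Steps: element-wise difference, limitation to [-limit, limit] (assumed form),
    then squaring and weighting with non-negative weights.
    """
    total = 0.0
    for xi, ci, gi in zip(x, c, weights):
        delta = xi - ci                          # difference between the elements
        delta = max(-limit, min(limit, delta))   # limitation
        total += gi * delta * delta              # squaring and weighting
    return total


# The second element mimics a logarithmic feature that became very small.
d = weighted_limited_distance([0.0, -100.0, 3.0], [1.0, 0.0, 3.0],
                              weights=[1.0, 0.5, 2.0], limit=10.0)
print(d)  # 51.0
```

Without the limitation, the second element alone would contribute 0.5 · 100² = 5000 and dominate the distance; with it, the contribution is bounded by 0.5 · 10² = 50.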

Evaluation of Codebooks

Basic principle:

❑ Division of the feature database into training data and evaluation data. An 80/20 division could be used (80 % of the data are used for training, 20 % for evaluation).

❑ The codebook training is done iteratively, i.e., an initial start codebook is improved in each step. [Formula not preserved in this transcript]

❑ A commonly used condition for termination is checked on the evaluation data. [Formula not preserved in this transcript]
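The data division and a possible termination test can be sketched as follows. The stopping rule used here (relative improvement of the average distance below a threshold) is an assumption, since the condition from the slide is not available; `split_data`, `should_stop`, and the default values are hypothetical.

```python
import random


def split_data(vectors, train_fraction=0.8, seed=0):
    """Shuffle the feature vectors and split them into training and evaluation data."""
    shuffled = list(vectors)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def should_stop(previous_distance, current_distance, threshold=1e-3):
    """Assumed rule: stop once the relative improvement falls below the threshold."""
    return previous_distance - current_distance <= threshold * previous_distance


train, evaluation = split_data([[float(i)] for i in range(10)])
print(len(train), len(evaluation))  # 8 2
```

Both distances passed to `should_stop` would be average distances measured on the evaluation data in two successive iterations.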

k-means Algorithm – Part 1

Given:

❑ A large number of training vectors and evaluation vectors

Wanted:

❑ A codebook C of size K with the lowest possible mean distance regarding the (training and) evaluation vectors.

Problem:

❑ So far, no method is known that solves this problem optimally with reasonable computational complexity. Therefore, „suboptimal“ methods are used.

k-means Algorithm – Part 2

Initialization:

❑ Selection of K arbitrary, different vectors out of the training data as codebook vectors.

Iteration:

❑ Classification: Assign each training vector to the codebook vector with minimum distance.

❑ Codebook correction: A new codebook vector is generated by averaging over all training vectors that are assigned to the same codebook vector.

❑ Termination condition: The evaluation vectors are used to check whether the termination condition (mentioned before) is met.
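The steps above can be sketched in a few lines. The sketch assumes a squared Euclidean distance, keeps a codebook vector unchanged if no training vector is assigned to it, and stops when the relative improvement on the evaluation data falls below a threshold; these choices, the function names, and the data are assumptions made for illustration.

```python
import random


def sq_dist(x, c):
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def mean_distance(vectors, codebook):
    """Average distance of the vectors to their best codebook entries."""
    return sum(min(sq_dist(x, c) for c in codebook) for x in vectors) / len(vectors)


def kmeans(train, evaluation, K, threshold=1e-4, max_iter=100, seed=0):
    # Initialization: K arbitrary, different training vectors.
    distinct = sorted({tuple(v) for v in train})
    codebook = [list(v) for v in random.Random(seed).sample(distinct, K)]
    previous = mean_distance(evaluation, codebook)
    for _ in range(max_iter):
        # Classification: assign each training vector to its closest entry.
        clusters = [[] for _ in range(K)]
        for x in train:
            k = min(range(K), key=lambda j: sq_dist(x, codebook[j]))
            clusters[k].append(x)
        # Codebook correction: average over the assigned training vectors.
        for k, members in enumerate(clusters):
            if members:  # an entry without members is left unchanged
                codebook[k] = [sum(col) / len(members) for col in zip(*members)]
        # Termination condition, checked on the evaluation vectors (assumed rule).
        current = mean_distance(evaluation, codebook)
        if previous - current <= threshold * previous:
            break
        previous = current
    return codebook


train = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
         [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
evaluation = [[0.5, 0.5], [10.5, 10.5]]
codebook = sorted(kmeans(train, evaluation, K=2))
print(codebook)
```

For this small data set the two codebook vectors end up at the means of the two groups of training vectors, roughly (0.33, 0.33) and (10.33, 10.33).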

LBG-Algorithm – Part 1

Comparison with the k-means method:

❑ The principle of the LBG algorithm is quite similar to the k-means approach. However, codebooks with increasing size are generated. Usually the size increases in powers of two, but it could also increase linearly.

❑ „LBG“ stands for the names of the inventors of the algorithm: Linde, Buzo, and Gray.

Initialization:

❑ The start codebook consists of only one entry, which is chosen as the mean over all training vectors.

LBG-Algorithm – Part 2

Iteration:

❑ Increasing the codebook size: The codebook size is doubled. The start vectors for the next codebook size are obtained by adding and subtracting vectors with small random entries to and from the code vectors of the previous codebook.

❑ Classification: Assign each training vector to the codebook vector with minimum distance.

❑ Codebook correction: A new codebook vector is generated by averaging over all training vectors that are assigned to the same codebook vector.

❑ Termination condition: The evaluation vectors are used to check whether the termination condition mentioned before is met.
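A compact sketch of the doubling scheme is shown below. For brevity it replaces the evaluation-based termination condition with a fixed number of classification and correction passes per codebook size, which is a simplification and not the lecture's rule. The squared distance, the perturbation size, and the names `refine` and `lbg` are assumptions as well.

```python
import random


def sq_dist(x, c):
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def refine(train, codebook, iterations):
    """Repeated classification and codebook correction for a fixed codebook size."""
    for _ in range(iterations):
        clusters = [[] for _ in codebook]
        for x in train:
            k = min(range(len(codebook)), key=lambda j: sq_dist(x, codebook[j]))
            clusters[k].append(x)
        # An entry without assigned training vectors is left unchanged.
        codebook = [[sum(col) / len(m) for col in zip(*m)] if m else c
                    for c, m in zip(codebook, clusters)]
    return codebook


def lbg(train, target_size, perturbation=1e-2, iterations=10, seed=0):
    rng = random.Random(seed)
    dim = len(train[0])
    # Initialization: a single entry, the mean over all training vectors.
    codebook = [[sum(col) / len(train) for col in zip(*train)]]
    while len(codebook) < target_size:
        # Doubling: add and subtract a small random vector to/from every entry.
        split = []
        for c in codebook:
            eps = [rng.uniform(-perturbation, perturbation) for _ in range(dim)]
            split.append([ci + ei for ci, ei in zip(c, eps)])
            split.append([ci - ei for ci, ei in zip(c, eps)])
        codebook = refine(train, split, iterations)
    return codebook


train = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
         [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
codebook = sorted(lbg(train, target_size=2))
print(codebook)
```

In contrast to the k-means initialization, no training vectors have to be picked at random as start entries; each codebook size starts from the trained codebook of the previous size.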

LBG-Algorithm – Example, Parts 1 to 17

[Figures, one per slide, showing the state of the codebook at each stage of the training:]

❑ Initialization

❑ First codebook split

❑ Codebook of size 2, after the 1st, 2nd, and 3rd iteration

❑ Codebook of size 4, after splitting

❑ Codebook of size 4, after the 1st, 2nd, and 3rd iteration

❑ Codebook of size 8, after splitting

❑ Codebook of size 8, after the 1st, 2nd, 3rd, 4th, 5th, 6th, and 7th iteration

„Intermezzo“

Partner exercise:

❑ Please answer (in groups of two people) the questions that you will get during the lecture!

Extensions for the Codebook Generation

Extensions:

❑ In a next step, all codebook entries can be replaced by the nearest training or evaluation vector. This ensures that the codebook entries are „valid“ vectors (e.g., stable filter coefficients).

❑ An alternative to doubling all codebook entries in the LBG algorithm is to split only the vector that contributes the „biggest“ part to the average distance.

❑ If codebook pairs are to be trained, the two feature vectors are appended first. By choosing the weighting matrix G properly, either of the features can dominate the codebook generation. Alternatively, a weighted sum can be used (both feature vectors contribute to the clustering).
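The first two extensions can be illustrated with a short sketch. It assumes a squared Euclidean distance; the helper names `snap_to_data` and `worst_entry` and the example values are hypothetical.

```python
def sq_dist(x, c):
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def snap_to_data(codebook, vectors):
    """Replace every codebook entry by its nearest training or evaluation vector."""
    return [list(min(vectors, key=lambda x: sq_dist(x, c))) for c in codebook]


def worst_entry(codebook, train):
    """Index of the entry that contributes the biggest part to the summed distance."""
    contribution = [0.0] * len(codebook)
    for x in train:
        distances = [sq_dist(x, c) for c in codebook]
        k = distances.index(min(distances))
        contribution[k] += distances[k]
    return contribution.index(max(contribution))


train = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [14.0, 10.0]]
codebook = [[0.0, 0.4], [12.5, 10.0]]
snapped = snap_to_data(codebook, train)
split_candidate = worst_entry(codebook, train)
print(snapped, split_candidate)  # [[0.0, 0.0], [14.0, 10.0]] 1
```

In an LBG variant that splits only one vector per step, the entry returned by `worst_entry` would be the one that is perturbed and duplicated.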

Combination of Codebook Approaches With Other Mappings – Part 1

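Affine-linear mappings:

❑ Global approach: a single affine-linear mapping is used for all input vectors. [Formula not preserved in this transcript]

❑ Piecewise defined mapping: a separate affine-linear mapping is used for each region of the input feature space, selected by a codebook search. [Formula not preserved in this transcript]

This can be seen as a generalized version of the codebook approach. The codebook approach would use a matrix of zeros for the matrix M, and the y-mean vector would be the best codebook entry.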
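The formulas of this slide are missing, so the following is only a sketch of a common way to write both mappings. The matrix $\mathbf{M}$ is named in the text; the remaining symbols ($\mathbf{x}$, $\hat{\mathbf{y}}$, $\mathbf{m}_x$, $\mathbf{m}_y$, $\mathbf{c}_j$, $d(\cdot,\cdot)$, and the index $k$) are assumed notation.

```latex
\text{Global:}\quad
\hat{\mathbf{y}} = \mathbf{M}\,\big(\mathbf{x} - \mathbf{m}_x\big) + \mathbf{m}_y ,
\qquad
\text{Piecewise:}\quad
\hat{\mathbf{y}} = \mathbf{M}_{k}\,\big(\mathbf{x} - \mathbf{m}_{x,k}\big) + \mathbf{m}_{y,k},
\quad
k = \arg\min_{j}\ d\big(\mathbf{x},\,\mathbf{c}_j\big) .
```

With $\mathbf{M}_k = \mathbf{0}$ the piecewise mapping reduces to $\hat{\mathbf{y}} = \mathbf{m}_{y,k}$, i.e., the output is simply the entry of the output codebook that belongs to the best input entry.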

Combination of Codebook Approaches With Other Mappings – Part 2

[Figure with three panels, each plotted over input feature 1 and input feature 2:]

❑ Example for the relation between input and output features (true output feature)

❑ Approximation by codebook pairs (estimated output feature)

❑ Example for a locally optimized linear mapping (estimated output feature)

Combination of Codebook Approaches With Other Mappings – Part 3

[Block diagram] Extraction of predictor coefficients → conversion into cepstral coefficients → subtract mean value of the input vector → matrix-vector multiplication → add mean value of the output vector → conversion into predictor coefficients → stability check

❑ A codebook search is used for the matrix or vector selection.

Summary and Outlook

Summary:

❑ Motivation

❑ Application examples

❑ Cost functions for the training of a codebook

❑ LBG- and k-means algorithm

❑ Basic schemes

❑ Extensions

❑ Combinations with additional mapping schemes

Next week:

❑ Bandwidth extension