Project 1 : Eigen-Faces Applied to Speech Style Classification

Brad Keserich, Senior, Computer Engineering

College of Engineering and Applied Science; University of Cincinnati; Cincinnati, Ohio

Suryadip Chakraborty, School of Computing Sciences and Informatics

Dr. Dharma Agrawal, Professor, School of Computing Sciences and Informatics


Sponsored by the National Science Foundation, Grant ID No.: DUE-0756921

Introduction

• Speech recognition
• Voice disorders
– Stuttering
– Pausing
– Other lesser-known forms

• Research group focus: Parkinson’s patients


Techniques

• Previous work
– Good results using Neural Network classifiers using Fuzzy values
– Wavelet Transformations are effective
• For this project
– Eigen-faces method adapted to audio


Goals

Investigate the usefulness of the eigen-faces method for speech classification


Objectives

• Acquire data

• Extract salient features

• Analyze Eigen-faces effectiveness


Eigen-faces for audio

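[Figure: the audio signal is divided along the time axis $t$ into windows $w_1, w_2, w_3, w_4, w_5$; each window maps to a feature vector.]

$$ w_i \rightarrow v_i = \begin{bmatrix} f_1 & f_2 & \cdots & f_r \end{bmatrix}^T $$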
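The windowing idea on this slide (windows w_i along time t, each mapped to a feature vector v_i = [f_1 ... f_r]) can be sketched roughly as follows. This is only an illustration: the slides do not say which features were used, so the band-averaged spectral magnitudes, the function name `window_features`, and the parameters `n_windows` and `n_features` are all assumptions.

```python
import numpy as np

def window_features(signal, n_windows=5, n_features=8):
    """Split a 1-D signal into windows and return one feature vector per window.

    Each vector holds the mean spectral magnitude in n_features frequency
    bands (an assumed feature choice, not taken from the slides).
    """
    signal = np.asarray(signal, dtype=float)
    windows = np.array_split(signal, n_windows)        # w_1 ... w_n along time t
    vectors = []
    for w in windows:
        # Taper the window, then take the magnitude spectrum.
        spectrum = np.abs(np.fft.rfft(w * np.hanning(len(w))))
        bands = np.array_split(spectrum, n_features)   # f_1 ... f_r
        vectors.append([band.mean() for band in bands])
    return np.array(vectors)                           # shape (n_windows, n_features)

# Synthetic stand-in for a recorded syllable: a 440 Hz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(4000) / sample_rate
demo_signal = np.sin(2 * np.pi * 440 * t)
demo_vectors = window_features(demo_signal)
print(demo_vectors.shape)  # (5, 8)
```

Each row of `demo_vectors` plays the role of one v_i; stacking the rows gives the training matrix that a PCA step would consume.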

Classifiers using Abstract Features

• Training
– Training set of feature vectors
– Convert to zero-mean truth set
– Top k principal components (using principal component analysis (PCA))
• Classifying
– Project new vectors onto eigenbasis
– Residuals indicate closeness to a class
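A minimal sketch of the training and classifying steps listed above, assuming one eigenbasis is fitted per class and a new vector is assigned to the class whose basis reconstructs it with the smallest residual. The slides do not spell out this arrangement, so the per-class models, the function names, the class labels, and the toy data are all hypothetical.

```python
import numpy as np

def fit_eigenbasis(X, k):
    """Return (mean, basis); basis rows are the top-k principal components."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean                        # zero-mean training set
    # Rows of vt are principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]

def residual(x, mean, basis):
    """Distance between x and its reconstruction from the eigenbasis."""
    centered = np.asarray(x, dtype=float) - mean
    projection = basis.T @ (basis @ centered)  # project onto the eigenbasis
    return float(np.linalg.norm(centered - projection))

def classify(x, models):
    """Pick the label whose eigenbasis leaves the smallest residual."""
    return min(models, key=lambda label: residual(x, *models[label]))

# Toy data: each hypothetical class varies along a different axis in 3-D.
rng = np.random.default_rng(0)
normal = np.outer(rng.normal(size=20), [1.0, 0.0, 0.0]) + 0.01 * rng.normal(size=(20, 3))
stutter = np.outer(rng.normal(size=20), [0.0, 1.0, 0.0]) + 0.01 * rng.normal(size=(20, 3))
models = {
    "normal": fit_eigenbasis(normal, k=1),
    "stutter": fit_eigenbasis(stutter, k=1),
}
demo_label = classify([2.0, 0.0, 0.0], models)
print(demo_label)
```

A small residual means the vector lies close to the subspace spanned by that class's principal components, which is one way to read "residuals indicate closeness to a class".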


Data
• Recorded word: “Ta-Be-Mo-No”
– Consonant + vowel sounds
– Easy to segment
– Use “Ta” portion only
• Use voice acting for data collection
– Same person
– Vary the way the word is spoken
• Variance of speaking style
– Stuttering
– Pausing
– Pace
– Pitch inflections


Pipeline

Segmentation and Labeling
• Automation
– Works well for slow, clear cases
– Not as well for more realistic cases
– Slow cases are close to hand segmentation
• By Hand
– More reliable segmentation at this point
– Done with sample counts in Logic 8
– Label the segments with correct sound
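The slides do not describe how the automated segmentation worked. As one plausible illustration, a short-time energy threshold can recover segment boundaries as sample counts when the syllables are slow and clearly separated. The function `segment_by_energy` and its parameters `frame_len` and `threshold_ratio` are hypothetical, and the input is a synthetic stand-in for a four-syllable recording.

```python
import numpy as np

def segment_by_energy(signal, frame_len=200, threshold_ratio=0.1):
    """Return (start, end) sample indices of regions with high frame energy.

    A frame counts as active when its mean energy exceeds threshold_ratio
    times the loudest frame's energy (both knobs are assumptions).
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)
    active = energy > threshold_ratio * energy.max()

    segments, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i * frame_len                      # segment opens
        elif not is_active and start is not None:
            segments.append((start, i * frame_len))    # segment closes
            start = None
    if start is not None:                              # signal ends mid-segment
        segments.append((start, n_frames * frame_len))
    return segments

# Synthetic four-syllable stand-in: tone bursts separated by silence.
burst = np.sin(2 * np.pi * 0.05 * np.arange(1000))
gap = np.zeros(600)
demo_word = np.concatenate([gap, burst, gap, burst, gap, burst, gap, burst, gap])
demo_segments = segment_by_energy(demo_word)
print(demo_segments)
# [(600, 1600), (2200, 3200), (3800, 4800), (5400, 6400)]
```

On a clean signal like this the first segment would correspond to the “Ta” portion. Stuttered or fast speech blurs the silent gaps, which is consistent with the slide's note that automation does less well on realistic cases.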


Modifications
• Use additional features in the Eigen-faces method
– Stutter detection
– Pauses and spacing within the spoken word
– Pitch inflections

• Utilize Mel-Cepstrum to pick up features

• Substitute Laplacian Eigenmap for PCA
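To make the proposed substitution concrete, here is a rough NumPy sketch of a Laplacian Eigenmap embedding: build a k-nearest-neighbour graph with heat-kernel weights, form the graph Laplacian L = D - W, and keep the eigenvectors with the smallest nonzero eigenvalues of L y = λ D y. The neighbour count, the kernel width `sigma` (used here as exp(-d²/(2σ²))), and the random input are assumptions for illustration, not settings from the project.

```python
import numpy as np

def laplacian_eigenmap(X, n_components=2, n_neighbors=5, sigma=1.0):
    """Embed rows of X with the smallest nontrivial solutions of L y = lambda D y."""
    X = np.asarray(X, dtype=float)
    n = len(X)
    sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)

    # Symmetric k-nearest-neighbour graph with heat-kernel weights.
    W = np.zeros((n, n))
    for i in range(n):
        neighbors = np.argsort(sq_dist[i])[1 : n_neighbors + 1]  # skip self
        W[i, neighbors] = np.exp(-sq_dist[i, neighbors] / (2 * sigma ** 2))
    W = np.maximum(W, W.T)

    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalised Laplacian; its eigenvectors u give y = D^{-1/2} u.
    L_sym = np.eye(n) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    eigenvalues, eigenvectors = np.linalg.eigh(L_sym)   # ascending order

    # Drop the first (trivial) eigenvector and keep the next n_components.
    embedding = d_inv_sqrt[:, None] * eigenvectors[:, 1 : n_components + 1]
    return embedding, eigenvalues[1 : n_components + 1]

# Random feature vectors standing in for per-window speech features.
rng = np.random.default_rng(0)
demo_points = rng.normal(size=(30, 4))
demo_embedding, demo_eigenvalues = laplacian_eigenmap(demo_points, n_components=2)
print(demo_embedding.shape)  # (30, 2)
```

Unlike PCA, this embedding is defined only for the training points, so projecting a new vector would need an out-of-sample extension. That is one practical difference to weigh before swapping it into the classifier.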


Results
• Features performing well
– Blatant stutter detection
– Long durations
– Spectrum analysis
• Good class separability


Conclusions

• Eigen-faces work for spoken audio data

• More tweaking required

• Further research
– Mel-Cepstrum features
– Laplacian Eigenmapping to replace PCA

• May be useful as a front end to Fuzzy-Neuro classifiers


References
1. Wu, H., Siegel, M., & Khosla, P. (1999). Vehicle sound signature recognition by frequency vector principal component analysis. IEEE Transactions on Instrumentation and Measurement, 48(5). doi: http://dx.doi.org/10.1109/19.799662
2. Belkin, M., & Niyogi, P. (2002). Laplacian Eigenmaps for Dimensionality Reduction and Data Representation.
3. Prahallad, K. Speech Technology: A Practical Introduction. Topic: Spectrogram, Cepstrum and Mel-Frequency Analysis. http://www.speech.cs.cmu.edu/11-492/slides/03_mfcc.pdf
