Project 1: Eigen-Faces Applied to Speech Style Classification
Brad Keserich, Senior, Computer Engineering
College of Engineering and Applied Science; University of Cincinnati; Cincinnati, Ohio
Suryadip Chakraborty, School of Computing Sciences and Informatics
Dr. Dharma Agrawal, Professor, School of Computing Sciences and Informatics
Sponsored by the National Science Foundation, Grant ID No.: DUE-0756921
Introduction
• Speech recognition
• Voice disorders
– Stuttering
– Pausing
– Other lesser-known forms
• Research group focus on Parkinson’s patients
Techniques
• Previous work
– Good results using neural network classifiers using fuzzy values
– Wavelet transformations are effective
• For this project
– Eigen-faces method adapted to audio
Goals
Investigate the usefulness of the eigen-faces method for speech classification
Objectives
• Acquire data
• Extract salient features
• Analyze Eigen-faces effectiveness
Eigen-faces for audio
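[Figure: the audio signal along time t is divided into windows w1 to w5; each window w_i is mapped to a feature vector $v_i = (f_1, f_2, \ldots, f_r)^T$]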
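The windowing idea in this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say which features f_1 ... f_r were used, so the low-frequency FFT magnitudes, the non-overlapping windows, and the names `window_features`, `window_len` and `num_features` are all hypothetical choices.

```python
import numpy as np

def window_features(signal, window_len, num_features):
    """Split a 1-D signal into consecutive windows and return one feature
    vector per window (rows: windows w_i, columns: features f_1..f_r)."""
    signal = np.asarray(signal, dtype=float)
    num_windows = len(signal) // window_len
    # Drop the trailing samples that do not fill a whole window.
    frames = signal[: num_windows * window_len].reshape(num_windows, window_len)
    # Assumed feature choice: magnitudes of the lowest FFT bins.
    spectrum = np.abs(np.fft.rfft(frames, axis=1))
    return spectrum[:, :num_features]

# Synthetic stand-in for a recording: 1 s of a 440 Hz tone sampled at 8 kHz.
rate = 8000
t = np.arange(rate) / rate
tone = np.sin(2 * np.pi * 440 * t)
features = window_features(tone, window_len=1600, num_features=128)
print(features.shape)  # (5, 128)
```

Each row of `features` plays the role of one vector v_i; stacking the rows gives the training matrix that an eigen-faces style method would operate on.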
Classifiers using Abstract Features
• Training
– Training set of feature vectors
– Convert to zero-mean truth set
– Top k principal components (using principal component analysis (PCA))
• Classifying
– Project new vectors onto eigenbasis
– Residuals indicate closeness to a class
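The training and classifying steps above might look roughly like the following NumPy sketch. It assumes one eigenbasis per class and picks the class with the smallest residual; the slides do not state whether a single shared basis or per-class bases were used, and the function names and toy data here are hypothetical.

```python
import numpy as np

def train_eigenbasis(vectors, k):
    """Return (mean, basis); basis holds the top-k principal components
    of the training vectors as columns."""
    X = np.asarray(vectors, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean  # zero-mean training set
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k].T

def residual(vector, mean, basis):
    """Distance between a vector and its projection onto the eigenbasis."""
    centered = np.asarray(vector, dtype=float) - mean
    projection = basis @ (basis.T @ centered)
    return float(np.linalg.norm(centered - projection))

def classify(vector, models):
    """Pick the class whose eigenbasis leaves the smallest residual."""
    return min(models, key=lambda label: residual(vector, *models[label]))

rng = np.random.default_rng(0)
# Toy data: class "a" varies along the x axis, class "b" along the y axis.
class_a = np.outer(rng.normal(size=20), [1.0, 0.0, 0.0]) + 0.01 * rng.normal(size=(20, 3))
class_b = np.outer(rng.normal(size=20), [0.0, 1.0, 0.0]) + 0.01 * rng.normal(size=(20, 3))
models = {"a": train_eigenbasis(class_a, k=1), "b": train_eigenbasis(class_b, k=1)}
print(classify([2.0, 0.0, 0.0], models))  # a
```

A small residual means the new vector is well explained by that class's principal components, which is the sense in which residuals indicate closeness to a class.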
Data
• Recorded word: “Ta-Be-Mo-No”
– Consonant + vowel sounds
– Easy to do segmentation
– Use “Ta” portion only
• Use voice acting for data collection
– Same person
– Vary the way the word is spoken
• Variance of speaking style
– Stuttering
– Pausing
– Pace
– Pitch inflections
Pipeline
Segmentation and Labeling
• Automation
– Works well for slow, clear cases
– Works less well for more realistic cases
– Slow cases are close to hand segmentation
• By hand
– More reliable segmentation at this point
– Done with sample counts in Logic 8
– Label the segments with the correct sound
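One simple way to automate segmentation is a short-time energy threshold, sketched below. The slides do not describe the automated method that was actually tried, so this is an assumed stand-in; `segment_by_energy`, `frame_len` and `threshold` are hypothetical names. It returns boundaries as sample counts, comparable to hand-marked boundaries.

```python
import numpy as np

def segment_by_energy(signal, frame_len, threshold):
    """Return (start, end) sample ranges whose mean frame energy exceeds threshold."""
    signal = np.asarray(signal, dtype=float)
    num_frames = len(signal) // frame_len
    frames = signal[: num_frames * frame_len].reshape(num_frames, frame_len)
    active = (frames ** 2).mean(axis=1) > threshold
    segments, start = [], None
    for i, is_active in enumerate(active):
        if is_active and start is None:
            start = i * frame_len  # a sound segment begins
        elif not is_active and start is not None:
            segments.append((start, i * frame_len))  # the segment ends
            start = None
    if start is not None:
        segments.append((start, num_frames * frame_len))
    return segments

# Synthetic example: two bursts of sound separated by silence.
burst = np.ones(400)
silence = np.zeros(400)
signal = np.concatenate([silence, burst, silence, burst, silence])
segments = segment_by_energy(signal, frame_len=100, threshold=0.1)
print(segments)  # [(400, 800), (1200, 1600)]
```

A fixed threshold like this is likely to behave as the slide describes: adequate when syllables are slow and clearly separated, less reliable when they run together.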
Modifications
• Use additional features in the Eigen-faces method
– Stutter detection
– Pauses and spacing within the spoken word
– Pitch inflections
• Utilize Mel-Cepstrum to pick up features
• Substitute Laplacian Eigenmap for PCA
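To give a sense of what substituting a Laplacian Eigenmap for PCA involves, here is a minimal NumPy sketch in the spirit of Belkin and Niyogi's method. The fully connected graph, the heat-kernel width `sigma`, and the function name `laplacian_eigenmap` are assumptions made for illustration, not details taken from the project.

```python
import numpy as np

def laplacian_eigenmap(vectors, dim, sigma=1.0):
    """Embed the given vectors into `dim` coordinates using a graph Laplacian."""
    X = np.asarray(vectors, dtype=float)
    # Fully connected graph with heat-kernel weights (an assumed choice).
    sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    degree = W.sum(axis=1)
    L = np.diag(degree) - W
    # Solve L y = lambda D y through the symmetric form D^-1/2 L D^-1/2.
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    _, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Skip the first (constant) solution; keep the next `dim` coordinates.
    return d_inv_sqrt[:, None] * eigvecs[:, 1 : dim + 1]

# Toy data: two small clusters of 2-D points, four units apart.
cluster = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1]])
points = np.vstack([cluster, cluster + [4.0, 0.0]])
embedding = laplacian_eigenmap(points, dim=1)
print(embedding.shape)  # (8, 1)
```

In this toy case the single embedding coordinate takes one sign for the first cluster and the opposite sign for the second. Unlike PCA, this basic form embeds only the points it was given, so projecting a new vector would need an additional out-of-sample step.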
Results
• Features performing well
– Blatant stutter detection
– Long durations
– Spectrum analysis
• Good class separability
Conclusions
• Eigen-faces work for spoken audio data
• More tweaking required
• Further research
– Mel-Cepstrum features
– Laplacian Eigenmapping to replace PCA
• May be useful as a front end to Fuzzy-Neuro classifiers
References
1. Wu, H., Siegel, M., & Khosla, P. (1999). Vehicle sound signature recognition by frequency vector principal component analysis. IEEE Transactions on Instrumentation and Measurement, 48(5). doi: http://dx.doi.org/10.1109/19.799662
2. Belkin, M., & Niyogi, P. (2002). Laplacian Eigenmaps for Dimensionality Reduction and Data Representation.
3. Prahallad, K. Speech Technology: A Practical Introduction. Topics: Spectrogram, Cepstrum and Mel-Frequency Analysis. http://www.speech.cs.cmu.edu/11-492/slides/03_mfcc.pdf