Post on 15-Aug-2015
*DEVELOPMENT OF SPEAKER VERIFICATION UNDER LIMITED DATA CONDITION
Under guidance of
Dr. G. Pradhan
NIT PATNA (ECE dept.)
NAME-PAMMI KUMARI
M.TECH 2nd yr (ECE dept.)
ROLL NO.-1329005
• Introduction
• Summary of Literature review
• Issues in existing speaker verification systems
• Motivation for the present work
• Baseline speaker verification system
• Experimental results
• Proposal for future work
*OUTLINE
To develop a voice password based speaker verification system
To study the impact of text mismatch on the performance of a voice password based speaker verification system
To develop a voice password based speaker verification system in text-independent mode
To explore methods to model speaker information under limited data conditions
Most applications use speech signals of short duration, around 3-5 s, but speaker verification systems perform poorly on such short signals
This degradation in performance is due to phonetic variability between the training and testing speech data
Objective and Motivation for this work
SPEAKER VERIFICATION: Speaker verification is the process of verifying the identity of a claimant. It performs a one-to-one comparison between a newly input voiceprint and the voiceprint stored in the database for the claimed identity.
* INTRODUCTION
Fig: Block diagram of speaker verification system (blocks: Input speech, Feature extraction, Similarity, Reference model (Speaker #M), Speaker ID (#M), Threshold, Decision, Verification result)
*Modular representation of voice password based speaker verification system
Fig: Voice password speaker verification system (training: Speech, Pre-processing, Feature extraction, Model building, Reference model; testing: Speech, Identity claim, Pre-processing, Feature extraction, Comparison, Decision logic, Accept/reject)
* Cont…
• When an identity claim is made by a speaker, the speech data is compared with the model of the speaker whose identity is claimed.
• A threshold is used to reach the decision.
• If the similarity of the test speech data to the target model is above the threshold, the speaker is accepted; otherwise the claim is rejected.
• This process involves a binary decision (accept/reject) about the claimed identity regardless of the population size.
• Hence, the performance of the verification system does not depend on the size of the population.
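The accept/reject rule described in these bullets can be sketched as follows. This is a minimal illustration rather than the system's actual scoring: the cosine similarity measure, the function names `cosine_similarity` and `verify_claim`, the example vectors, and the threshold of 0.8 are all assumptions made for the example. A higher score is taken to mean a closer match, so a claim is accepted when the score reaches the threshold.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (assumed score)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify_claim(test_vector, claimed_model, threshold):
    """Binary decision for one identity claim.

    Only the model of the claimed speaker is consulted, so the work per
    decision does not grow with the number of enrolled speakers.
    """
    score = cosine_similarity(test_vector, claimed_model)
    return "accept" if score >= threshold else "reject"


# Hypothetical enrolled models, keyed by speaker ID.
models = {"spk_M": [0.9, 0.1, 0.4], "spk_N": [0.1, 0.8, 0.2]}
test = [0.85, 0.15, 0.35]

print(verify_claim(test, models["spk_M"], threshold=0.8))
print(verify_claim(test, models["spk_N"], threshold=0.8))
```

Because each call looks up a single claimed model, adding more speakers to `models` does not change the decision for an existing claim.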
• In the first stage, pre-processing and feature extraction are performed over a database of speakers.
• The second stage is model generation, where speaker models are built from the feature vectors that represent speaker-specific characteristics.
• The third stage is the decision, which accepts or rejects the claimed identity of a speaker.
* A speaker verification system comprises three stages:
Basic block diagram of a biometric system (blocks: Sensor, Pre-processing, Feature extraction, Template generator, Stored template, Matcher, Application device)
* Speaker verification can be classified into:
1) Text-dependent
2) Text-independent
Text-dependent speaker verification: The system is based on the utterance of fixed, predetermined phrases.
Text-independent speaker verification: The reference utterances (what is spoken in training) and the test utterances (what is uttered in actual use) may have completely different content.
*Literature review
• Research in the field of speaker recognition was initially carried out in the 1950s at Bell Laboratories using isolated digits [1].
• In 2000, research described the major elements of the Gaussian mixture model (GMM)-based speaker verification system used successfully in several NIST Speaker Recognition Evaluations (SREs).
• During 1960-1990, most of the research focused on the extraction of speaker-specific information from speech data and on the development of text-dependent speaker verification systems.
• During 1990-2005, speaker recognition methods shifted from template-based pattern matching to statistical modeling. Statistical modeling methods such as GMM and GMM-UBM were proposed.
• During 2005-2014, most of the research focused on the compensation of mismatches and the development of practical verification systems. Compensation methods such as i-vectors and PLDA were proposed.
1. K. H. Davis, et al., “Automatic recognition of spoken digits,” J.A.S.A., 24 (6), pp. 637-642, 1952.
* Cont…
• In the speech analysis stage, though techniques have been developed to improve speaker verification performance, no particular analysis technique is specifically meant for the limited data condition.
• Segmental analysis under the limited data condition provides few feature vectors, which leads to poor speaker models and, in turn, to degraded performance.
* Issues in existing speaker verification system
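To see why limited data yields few feature vectors, the number of frames produced by segmental analysis can be worked out from the utterance duration. The sketch below is an illustration only: the function name `num_frames` and the example durations are assumptions, and the defaults reuse the 20 ms frame size and 2 ms shift quoted for the baseline system.

```python
def num_frames(duration_ms, frame_ms=20, shift_ms=2):
    """Number of full analysis frames that fit in an utterance."""
    if duration_ms < frame_ms:
        return 0
    return 1 + (duration_ms - frame_ms) // shift_ms


# Example durations in seconds (hypothetical).
for seconds in (3, 15, 60):
    print(seconds, "s ->", num_frames(seconds * 1000), "frames")
```

With these settings a 3 s utterance gives 1491 frames against 7491 for 15 s, so the speaker model has to be estimated from roughly one fifth of the feature vectors.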
• Most applications use speech signals of short duration, around 3-5 s, but speaker verification systems perform poorly on such short signals.
• This degradation in performance is due to phonetic variability between the training and testing speech data.
• The phonetic variability may be reduced by artificially generating multiple utterances.
• Most SV systems perform score normalization using cohort-centric normalization. Speaker-centric score normalization may provide better results.
* MOTIVATION FOR THE PRESENT WORK
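The general idea of normalizing a verification score against a cohort can be sketched as follows. This is a generic illustration, not the normalization used in this work: the function name `normalize_score`, the mean and standard-deviation scaling, and the example scores are all assumptions.

```python
import statistics


def normalize_score(raw_score, cohort_scores):
    """Scale a raw score by the mean and spread of a set of cohort scores.

    cohort_scores are scores obtained against impostor (cohort) data;
    which data forms the cohort is what distinguishes one normalization
    scheme from another.
    """
    mean = statistics.mean(cohort_scores)
    std = statistics.pstdev(cohort_scores)
    if std == 0.0:
        # No spread in the cohort: only remove the offset.
        return raw_score - mean
    return (raw_score - mean) / std


# Hypothetical scores.
cohort = [-2.0, -1.0, 0.0, 1.0, 2.0]
print(normalize_score(3.0, cohort))
```

After this scaling, a single threshold can be applied to scores from different speakers, since each score is expressed relative to its own cohort distribution.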
• For the baseline speaker verification system, the following parameters are used:
VAD threshold: 0.1 of the average energy
Features: MFCC
Feature vector: 39 dimensions, 20 ms frame size with 2 ms shift
Modeling: GMM
GMM size: 8, 16, 32, 64
* BASELINE SPEAKER VERIFICATION SYSTEM
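The energy-based VAD setting listed for the baseline can be sketched as follows. This is a minimal illustration under assumptions, not the project's code: the function names, the framing by sample count, and the toy signal are hypothetical. Only the rule of keeping frames whose energy exceeds 0.1 of the average energy comes from the listed parameters.

```python
def split_frames(samples, frame_len, shift):
    """Cut a signal into overlapping frames of frame_len samples."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, shift)]


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def energy_vad(samples, frame_len, shift, ratio=0.1):
    """Keep frames whose energy exceeds ratio * average frame energy."""
    frames = split_frames(samples, frame_len, shift)
    if not frames:
        return []
    energies = [frame_energy(f) for f in frames]
    threshold = ratio * sum(energies) / len(energies)
    return [f for f, e in zip(frames, energies) if e > threshold]


# Toy signal: near-silence, a louder burst, then near-silence again.
signal = [0.01] * 8 + [1.0, -1.0] * 4 + [0.01] * 8
# At an assumed 8 kHz sampling rate, 20 ms and 2 ms would correspond to
# 160 and 16 samples; tiny values are used here to keep the toy readable.
voiced = energy_vad(signal, frame_len=4, shift=2)
print(len(split_frames(signal, 4, 2)), "frames,", len(voiced), "kept")
```

Only the frames that survive this step would be passed on to MFCC extraction and GMM training, which is one more reason short utterances leave few feature vectors.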
*Experimental result for original data
34.613 32.87
32.0971 32.4634
* Experimental result for test 15 sec and train 15 sec
27.4725 25.1374
23.6722 22.6190
• Extraction of features that reduce the impact of phonetic variability.
• Different residues of behavioral features may be extracted in addition to MFCC for speaker verification.
• In this project we considered the GMM modeling technique; in future work, other techniques such as i-vectors may be used.
* Proposal for future work
*Thank you