Development of voice password based speaker verification system

Page 1: Development of voice password based speaker verification system

Voice Password Based Speaker Verification Using Vowel Region

Under guidance of Dr. G. Pradhan, NIT PATNA (ECE dept.)

Presented By: Piyush Kumar (1104091), Kamlesh Kalvaniya (1104080), Niranjan Kumar (1104087)

Page 2: Development of voice password based speaker verification system

Content

• Introduction
• Motivation for present work
• Issues in speaker verification
• Development of baseline
• Proposed speaker verification system
• Summary
• Conclusion

Page 3: Development of voice password based speaker verification system

Introduction

• Speaker verification is the task of validating a person's identity claim from his/her voice.

• Voice password based speaker verification system
  – Speaker is free to choose his/her password
  – Password remains the same for training and verification

Page 4: Development of voice password based speaker verification system

Motivation

• Development of a low complexity speaker verification system with reasonable performance using a few seconds of speech data
  – For mobile based applications
  – Low security person authentication

Page 5: Development of voice password based speaker verification system

Issues in limited Speech Speaker Verification

• Information in human speech
  – Message, language, speaker, emotion/health
  – Recording environment, channel, sensor etc.
• Speaker specific information extracted from speech data varies depending on other factors
• Challenge
  – Enhance the speaker specific information
  – Normalize other variabilities in speech data

Page 6: Development of voice password based speaker verification system

Baseline System

• Gaussian Mixture Model (Text Independent)
  – Database: NIST-2003
  – VAD: Energy based VAD (0.6 * average energy)
  – Feature vector: 13 dimension MFCC appended with delta and delta-delta
  – Modeling: GMM
  – GMM size: 8, 16, 32, 64
  – Comparison: log likelihood score
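
The energy based VAD listed above can be sketched as follows. This is a minimal illustration, not the presenters' implementation: only the 0.6 * average energy threshold comes from the slide, while the frame length (160 samples), the hop (80 samples) and the toy signal are assumptions chosen for the example.

```python
import math


def energy_vad(samples, frame_len=160, hop=80, factor=0.6):
    """Flag frames whose short-time energy exceeds factor * average energy.

    frame_len and hop are assumed values (20 ms and 10 ms at 8 kHz).
    """
    starts = range(0, len(samples) - frame_len + 1, hop)
    energies = [sum(x * x for x in samples[s:s + frame_len]) / frame_len
                for s in starts]
    if not energies:
        return [], []
    threshold = factor * sum(energies) / len(energies)
    flags = [e > threshold for e in energies]
    return energies, flags


# Toy signal: low-level background, a louder burst, then background again.
quiet = [0.01 * math.sin(0.3 * n) for n in range(800)]
loud = [0.5 * math.sin(0.3 * n) for n in range(800)]
signal = quiet + loud + quiet

energies, flags = energy_vad(signal)
speech_frames = [i for i, keep in enumerate(flags) if keep]
print(speech_frames)
```

Only the frames flagged as speech would be passed on to feature extraction; the background frames at both ends are dropped.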

Page 7: Development of voice password based speaker verification system

Flowchart for GMM based SV system

Page 8: Development of voice password based speaker verification system

04/15/2023 N.I.T. PATNA ECE, DEPTT. 8

GMM based SV system EER

Equal error rate (%) for each test/train duration condition:

Gaussian size   Test 15 sec,    Test 15 sec,    Test full,      Test full,
                train 15 sec    train full      train 15 sec    train full
8               34.90           33.18           34.24           32.70
16              33.05           30.50           32.28           29.67
32              32.46           28.78           32.92           27.77
64              32.82           27.42           33.06           26.05
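
For readers unfamiliar with the metric, the equal error rate is the operating point at which the false acceptance rate equals the false rejection rate. The sketch below shows one simple way to approximate it from two score lists. The scores are made-up illustrative values, not data from these experiments, and the function assumes similarity scores (higher means more likely genuine); for a distance such as a DTW distortion the sign would be flipped first.

```python
def equal_error_rate(genuine, impostor):
    """Approximate EER (%) by scanning every observed score as a threshold."""
    best_gap, eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)  # false acceptance
        frr = sum(s < t for s in genuine) / len(genuine)     # false rejection
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, 100.0 * (far + frr) / 2.0
    return eer


# Hypothetical scores, for illustration only.
genuine_scores = [2.1, 1.8, 1.5, 0.9, 2.4]
impostor_scores = [0.2, 0.7, 1.0, 0.4, 1.6]

eer = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.1f}%")
```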

Page 9: Development of voice password based speaker verification system

Conclusion

• Performance is sensitive to duration of testing and training data.

• Performance is more sensitive to duration of training data compared to testing data.

• GMM based SV system may not be suitable for limited data.

Page 10: Development of voice password based speaker verification system

Baseline system for Voice password based system

• Data Collection
  – Data of 100 speakers was collected.
  – Each speaker uttered his/her full name or roll no. as the voice password, which was recorded over phone.
  – No. of male speakers: 81, No. of female speakers: 19
  – Duration of data: 2-5 sec
  – No. of training sessions: 3, No. of testing sessions: 5, with a minimum gap of one day between sessions
  – During the verification task each speaker was compared against his/her own identity and against 19 other impostor speakers.

Page 11: Development of voice password based speaker verification system

Dynamic Time Warping

• DTW is a template matching technique
• Test features and template (model) are sequences of feature vectors
• Aim is to find the distortion between test features and template
• They may have different lengths
• DTW uses dynamic programming to find the optimal path for normalizing the length variation.
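
The dynamic programming recursion described above can be sketched as follows. This is a textbook DTW under stated assumptions, not necessarily the configuration used in the experiments: the Euclidean local distance, the three allowed steps and the normalization by the summed lengths are illustrative choices, and the one-dimensional "feature vectors" stand in for MFCC frames.

```python
import math


def euclidean(a, b):
    """Local distance between two feature vectors of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(test, template):
    """Length-normalized DTW distortion between two feature sequences."""
    n, m = len(test), len(template)
    inf = float("inf")
    # cost[i][j] = best accumulated distance aligning test[:i] with template[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = euclidean(test[i - 1], template[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # test frame repeated
                                 cost[i][j - 1],      # template frame repeated
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m] / (n + m)


template = [[0.0], [1.0], [2.0], [1.0], [0.0]]
slow = [[0.0], [0.0], [1.0], [1.0], [2.0], [2.0], [1.0], [0.0]]  # same shape, slower
other = [[2.0], [2.0], [0.0], [0.0], [2.0]]                      # different shape

print(dtw_distance(slow, template), dtw_distance(other, template))
```

The slower utterance of the same pattern aligns with zero distortion despite its different length, while the different pattern does not.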

Page 12: Development of voice password based speaker verification system

Experimental results for DTW based system for Voice password database

EER (%) for each training session (columns) and testing session (rows), with start-to-end matching and with VAD, for feature dimensions 13 and 39:

          Train session 1             Train session 2             Train session 3
          Start to end  VAD           Start to end  VAD           Start to end  VAD
Test      13     39     13     39     13     39     13     39     13     39     13     39
1         25     28     14     14.6   25     27.9   14.7   15     25.2   26.3   14.7   15.7
2         31     34     17     18.9   30     33.6   18     19.3   29.4   32.6   20     21.05
3         28     29     18     19     29     31     18.7   20     31.5   32.6   20     21.94
4         31     32     15     16     32     32.3   16.1   17.5   30.5   32.6   16.8   18.94
5         31     33     17     18     32     34     18.2   20.7   34.7   35.7   18.9   21.05

Page 13: Development of voice password based speaker verification system

Experimental results for GMM based system for Voice password database

EER (%) for each training session (columns) and testing session (rows):

Test \ Train   Session 1   Session 2   Session 3
Session 1      17.9        19.7        21.2
Session 2      18.34       18.1        20.3
Session 3      18.69       19.6        18.7
Session 4      19.8        20.1        18.9
Session 5      20.6        20.6        20

Page 14: Development of voice password based speaker verification system

DTW using only mean vector of GMM

EER (%) for each training session (columns) and testing session (rows):

Test \ Train   Session 1   Session 2   Session 3
1              15.9        19.7        21.2
2              16.26       18.1        20.3
3              18.69       19.6        18.7
4              19.8        20.1        18.9
5              20.6        20.6        20

Page 15: Development of voice password based speaker verification system

Verification result comparison and discussion

• DTW based system best EER: 14%
• GMM based system best EER: 17.9%
• DTW using mean vector of GMM best EER: 15.9%
• Best result was obtained for DTW.
• Performance of DTW based system depends on detection of end points.
• Performance of DTW based system may be improved by robust end point detection and by enhancing the more speaker specific regions.
• Hence the motivation for the present work.

Page 16: Development of voice password based speaker verification system

Vowel Regions In Speech Signal

• VOP and VEP are two important events in a speech signal
  – VOP (vowel onset point): instant at which the onset of a vowel takes place in the speech signal
  – VEP (vowel end point): instant at which the offset of a vowel takes place in the speech signal

Page 17: Development of voice password based speaker verification system

Vowel Regions In Speech Signal

VOP (circle) and VEP (arrow head) events for an utterance /the sea/

Page 18: Development of voice password based speaker verification system

Vowel Regions In Speech Signal

• Vowel regions are prominent regions in the speech signal:
  – High amplitude
  – Near periodic excitation
  – Long duration
  – Lower zero crossing rate

• Due to the high amplitude, the SNR of vowel regions is high.
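
Two of the properties listed above, high amplitude and low zero crossing rate, can be measured per frame as sketched below. The two synthetic frames are stand-ins invented for the illustration (a strong 200 Hz sinusoid as "vowel-like", a weak 3000 Hz sinusoid as "fricative-like"); they are not real speech, and the 8 kHz rate and 160-sample frame are assumptions.

```python
import math


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


fs = 8000  # assumed sampling rate in Hz
vowel_like = [0.8 * math.sin(2 * math.pi * 200 * n / fs) for n in range(160)]
fricative_like = [0.05 * math.sin(2 * math.pi * 3000 * n / fs + 0.5)
                  for n in range(160)]

for name, frame in (("vowel-like", vowel_like), ("fricative-like", fricative_like)):
    print(name, short_time_energy(frame), zero_crossing_rate(frame))
```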

Page 19: Development of voice password based speaker verification system

Empirical Mode Decomposition

• Empirical Mode Decomposition (EMD)
  – Data-driven, multi-scale, robust to non-stationary signals
  – Fast oscillations are treated as superimposed on slow oscillations
  – Local mean of each decomposed signal is zero and the signal is symmetric about its local mean
  – Impact of noise on the signal can be reduced

• A decomposed signal is defined as an Intrinsic Mode Function (IMF) if it satisfies the following conditions
  – The number of extrema and the number of zero crossings differ by at most one
  – The local average is zero, i.e. the mean of the upper envelope and the lower envelope is zero
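
The first IMF condition, that the number of extrema and the number of zero crossings differ by at most one, can be checked with a few lines of code. The sketch below covers only that counting condition; the envelope-mean condition is not tested here, and the helper names and test signals are assumptions made for the illustration.

```python
import math


def count_extrema(x):
    """Count strict local maxima and minima among interior samples."""
    return sum(1 for i in range(1, len(x) - 1)
               if (x[i] > x[i - 1] and x[i] > x[i + 1])
               or (x[i] < x[i - 1] and x[i] < x[i + 1]))


def count_zero_crossings(x):
    """Count sign changes between adjacent samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def satisfies_count_condition(x):
    """True if extrema and zero crossing counts differ by at most one."""
    return abs(count_extrema(x) - count_zero_crossings(x)) <= 1


# Five cycles of a sinusoid: oscillates symmetrically about zero.
sine = [math.sin(0.1 + math.pi * n / 20) for n in range(200)]
# The same oscillation riding on an offset: it never crosses zero.
offset_sine = [2.0 + s for s in sine]

print(satisfies_count_condition(sine), satisfies_count_condition(offset_sine))
```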

Page 20: Development of voice password based speaker verification system

EMD Algorithm

• For a given input signal X to decompose
  – Identify the local extrema of the signal X.
  – Construct the upper envelope Emax and the lower envelope Emin by interpolating the maxima and minima, respectively.
  – Approximate the local average by the envelope mean Em, the average of the two envelopes Emax and Emin.
  – Compute the candidate intrinsic mode h1 = X - Em.
  – If h1 is an IMF, take imf = h1 and the residue signal r = X - imf. Otherwise repeat the above steps with h1 as the input.
• If r still has an intrinsic oscillation mode, set r as the input signal and repeat the steps.
• A signal S(n) can be represented through its IMFs as follows

  $S(n) = \sum_{i=1}^{N} \mathrm{imf}_i(n) + r(n)$

  where N is the number of IMFs and r(n) is the residue.
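
The sifting procedure can be sketched in a few lines. This is a simplified illustration rather than the presenters' implementation, with three assumptions labeled in the code: the envelopes use piecewise-linear interpolation (EMD implementations commonly use cubic splines), a fixed number of sifting passes replaces the IMF test as the stopping rule, and the two-tone test signal is invented for the example.

```python
import math


def local_extrema(x):
    """Indices of interior local maxima and minima."""
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] <= x[i + 1]]
    return maxima, minima


def linear_envelope(x, idx):
    """Assumption: piecewise-linear envelope through x at idx, ends held flat."""
    n = len(x)
    knots = [(0, x[idx[0]])] + [(i, x[i]) for i in idx] + [(n - 1, x[idx[-1]])]
    env, k = [], 0
    for t in range(n):
        while k + 1 < len(knots) - 1 and knots[k + 1][0] <= t:
            k += 1
        (t0, v0), (t1, v1) = knots[k], knots[k + 1]
        env.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return env


def sift_once(x):
    """One sifting pass: subtract the envelope mean Em from x."""
    maxima, minima = local_extrema(x)
    if not maxima or not minima:
        return None  # too few extrema to build both envelopes
    upper = linear_envelope(x, maxima)
    lower = linear_envelope(x, minima)
    return [xi - (u + l) / 2 for xi, u, l in zip(x, upper, lower)]


def emd(x, max_imfs=4, sift_iters=10):
    """Assumption: a fixed number of sifting passes stands in for the IMF test."""
    imfs, residue = [], list(x)
    for _ in range(max_imfs):
        h = residue
        for _ in range(sift_iters):
            nxt = sift_once(h)
            if nxt is None:
                break
            h = nxt
        if h is residue:  # nothing could be sifted: residue has no oscillation
            break
        imfs.append(h)
        residue = [r - c for r, c in zip(residue, h)]
    return imfs, residue


# Fast oscillation superimposed on a slow one.
x = [math.sin(2 * math.pi * 40 * n / 400) + 0.5 * math.sin(2 * math.pi * 4 * n / 400)
     for n in range(400)]
imfs, residue = emd(x)
recon = [sum(c[n] for c in imfs) + residue[n] for n in range(len(x))]
print(len(imfs), max(abs(a - b) for a, b in zip(recon, x)))
```

Whatever stopping rule is used, the IMFs and the residue add back up to the input, which is the representation S(n) given above.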

Page 21: Development of voice password based speaker verification system

MOTIVATION FOR USE OF EMD

• Environmental effect on the speech data can be deemphasized

• Excitation information present in different frequency range can be analyzed separately.

• Weak transitions can be emphasized in the case of nasal-vowel, semivowel-vowel and diphthongs.

Page 22: Development of voice password based speaker verification system

Flowchart for VOP detection

Page 23: Development of voice password based speaker verification system

VOP EVIDENCE PLOT

Page 24: Development of voice password based speaker verification system

Experiment

Speech data
• Complete TIMIT database
• Number of male speakers: 438
• Number of female speakers: 192
• Sampling frequency: 8 kHz
• VOP experiment was performed on 100 speakers.

Page 25: Development of voice password based speaker verification system

Performance measure

• Identification rate (IR): Percentage of reference VOPs (VEPs) that are matched by detected VOPs (VEPs) within vowel regions

• Spurious rate (SR): Percentage of detected VOPs (VEPs) that are detected outside vowel regions
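
The two measures can be computed as sketched below. The matching rule is an assumption made for the illustration: a reference VOP counts as matched when some detection lies within a tolerance of it (in ms), and a detection counts as spurious when it falls outside every vowel region. The time stamps and regions are hypothetical values, not TIMIT annotations.

```python
def vop_rates(reference, detected, vowel_regions, tol_ms):
    """Return (identification rate %, spurious rate %) for detected VOPs.

    reference, detected: event times in ms.
    vowel_regions: list of (start_ms, end_ms) intervals.
    """
    matched = sum(1 for r in reference
                  if any(abs(d - r) <= tol_ms for d in detected))

    def in_vowel(t):
        return any(start <= t <= end for start, end in vowel_regions)

    spurious = sum(1 for d in detected if not in_vowel(d))
    ir = 100.0 * matched / len(reference)
    sr = 100.0 * spurious / len(detected)
    return ir, sr


# Hypothetical annotations and detections, for illustration only.
reference_vops = [100, 400, 750]
vowel_regions = [(100, 220), (400, 530), (750, 900)]
detected_vops = [108, 425, 620, 760]

for tol in (10, 30):
    ir, sr = vop_rates(reference_vops, detected_vops, vowel_regions, tol)
    print(f"tolerance {tol} ms: IR = {ir:.1f}%, SR = {sr:.1f}%")
```

Widening the tolerance raises the identification rate, while the spurious rate does not depend on it under this definition.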

Page 26: Development of voice password based speaker verification system

Performance of proposed VOP detection method

Method      Detection rate (%)                 Spurious rate (%)
            10 ms   20 ms   30 ms   40 ms
Baseline    47      74      78      88         15
Proposed    62      83      90      96         13

Observation:
• Performance of the proposed method is better than the baseline in terms of both detection rate and spurious rate.
• 83% detection is achieved within a 20 ms window, which is beneficial when comparing strings of vowel regions.

Page 27: Development of voice password based speaker verification system

SV System by applying DTW on Vowel regions only

Page 28: Development of voice password based speaker verification system

SV System by applying DTW on mean of vowel regions only

Page 29: Development of voice password based speaker verification system

Score Normalization

DET Plot for DTW & Normalized DTW

Page 30: Development of voice password based speaker verification system

DET Plot for DTW on vowel regions only

Page 31: Development of voice password based speaker verification system

DET Plot of DTW on mean of vowel regions

Page 32: Development of voice password based speaker verification system
Page 33: Development of voice password based speaker verification system

Conclusion

• The proposed VOP detection algorithm performed better than the best method present in the literature.

• The performance of the proposed algorithm for the voice password SV system is better than that of any of the baseline systems.

• The complexity of the proposed algorithm for the voice password SV system is lower than that of any baseline system, which makes it useful for online SV tasks.