VIU Oct 2007: Speaker Recognition (F. Schiel)

Florian Schiel, Venice International University

Oct 2007

Speaker Recognition = Speaker Identification, Speaker Verification

Agenda

• See the Context

• Speech Recognition vs. Speaker Recognition

• Speaker Identification vs. Speaker Verification

• Speaker Recognition: Basics

• Speaker Verification using HMM

• Discussion

• and then ...

General Approach to Authentication

• Three general ways to perform authentication:
  - proof of knowledge (e.g. password),
  - proof of possession (e.g. chip card),
  - proof of property (biometrics),
  and their combinations

• Biometrics: physiologically based vs. behaviourally based

• Biometric features:
  fingerprint, iris scan, facial scan, hand geometry, signature, voice

from U. Türk 2007

Biometric Features: General Requirements

• universal: can be found in any user
• unique: even for identical twins
• measurable: does not require human evaluation
• robust to short-term and long-term variability
• low dimensionality
• robust to changing environment
• robust to impersonation

from U. Türk 2007

++++++ooo+

Taxonomy of Speech Processing

[Diagram] Natural Language Processing (NLP) and Spoken Language Processing (SLP), with the labels: Lexica, Syntax, Parsing, Spellers, Search / Indexing, Semantics, Terminology, Thesaurus, Dialogue systems, Speech Identification, Speech Synthesis, Speaker recognition, Speech Recognition, Forensics

Speech Recognition

"Decode the spoken content from the acoustic signal"

Speaker Recognition

"Determine the identity of a speaker from acoustic signal"

[Diagram] ASR: acoustic signal + speech models -> "Sehr geehrter .." ("Dear ..")

[Diagram] SI/SV: acoustic signal + speaker characteristics + claimed identity -> Accepted / Rejected

Speaker Verification
• Authentication according to claimed identity
• Result is binary: "accept" / "reject"
• Scaling: effort independent of number of participants
• Accuracy: dependent on size of enrolment data

Speaker Identification
• Identification from limited number of participants
• Result is speaker identity
• Scaling: effort increases linearly with number of participants
• Accuracy: dependent on
  + size of enrolment data
  + number of participants
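The different scaling of the two tasks can be illustrated with a minimal sketch. It assumes a toy scoring function (negative squared distance to a model mean) and made-up speaker names and values; none of these come from the slides, and a real system would score with trained speaker models instead.

```python
def score(features, model_mean):
    """Toy similarity score: negative squared distance to a model mean (higher = closer)."""
    return -sum((f - m) ** 2 for f, m in zip(features, model_mean))

def verify(features, claimed_id, models, threshold):
    """Verification: score ONE model (the claimed identity) and compare with a threshold."""
    return score(features, models[claimed_id]) > threshold

def identify(features, models):
    """Identification: score ALL enrolled models and return the best-matching identity."""
    return max(models, key=lambda spk: score(features, models[spk]))

# Hypothetical enrolled speakers, each reduced to a mean feature vector
models = {"anna": [1.0, 2.0], "ben": [4.0, 0.5], "carla": [-2.0, 3.0]}
sample = [0.9, 2.2]

accepted = verify(sample, "anna", models, threshold=-0.5)   # one model evaluation
rejected = verify(sample, "ben", models, threshold=-0.5)    # one model evaluation
best = identify(sample, models)                             # one evaluation per enrolled speaker
print(accepted, rejected, best)
```

Verification evaluates a single model regardless of how many speakers are enrolled, while identification loops over all of them, which is where the linear increase in effort comes from.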

                  accept          reject
identity ok       correct         false reject
identity wrong    false accept    correct

Speaker Recognition: Applications

Speaker Verification:
– Access control
– Verification of identity via the phone
– Automatic Teller Machines
– Password resetting
– Banking: identity for new accounts etc.
– Protection against theft (cars ...)

Speaker Identification:
– Forensics
– Police work
– Automatic user settings
– Speaker classification: advertising

Speaker Verification: Doddington's Zoo (1)

User = registered speaker, Impostor = non-registered speaker

• Goats : users that are often rejected wrongly (increasing 'false reject' errors)

• Lambs : users that are easily imitated (increasing 'false accept' errors)

• Sheep : users that 'behave' (not goats and not lambs)

• Wolves : particularly successful impostors (increasing 'false accept' errors)

from Doddington 1998

Speaker Verification: Doddington's Zoo (2)

Wolves may perform zero-effort or active impostor attempts to break into an SV system.

Problem: Speaker verification databases do not contain data of active impostor attempts by wolves -> most technical evaluations are based on non-realistic data!

Technical Speech Processing

[Diagram] Analog signal -> anti-aliasing filter -> A / D -> digital signal -> highpass -> feature detection -> vectors -> decoder -> symbols

• Symbols: text, action, semantics
• Examples: "Call Richard!", "Radio off!", "216"

Speaker Verification: Basics (1)

[Diagram] Anti-aliasing filter -> A / D -> highpass -> feature detection -> verification -> "Accept" / "Reject"

[Diagram] Claimed identity (PIN, fingerprint) -> select ID -> speaker models -> verification

Speaker Verification: Basics (2)

[Diagram] Anti-aliasing filter -> A / D -> highpass -> feature detection -> verification -> "Accept" / "Reject"

• Anti-aliasing filter: analog low-pass filter to avoid aliasing effects
• A / D: analog-digital converter

Feature detection: extraction of speaker characteristics

• Window
• Feature computation

Features:
• speaker specific
• robust against noise
• partly long term
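The highpass and window stages of the front end can be sketched as follows. This is only an illustration under assumptions not stated on the slides: the highpass is taken to be a first-order pre-emphasis filter with coefficient 0.97, the window is a Hamming window of 25 ms shifted by 10 ms, and the input is a synthetic tone sampled at 8 kHz.

```python
import math

def pre_emphasis(signal, coeff=0.97):
    """First-order highpass y[n] = x[n] - coeff * x[n-1]; coeff is an assumed value."""
    return [signal[0]] + [signal[n] - coeff * signal[n - 1] for n in range(1, len(signal))]

def hamming(length):
    """Hamming window of the given length."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1)) for n in range(length)]

def frame_signal(signal, frame_len, hop):
    """Cut the signal into overlapping frames and weight each frame with the window."""
    window = hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    return [
        [signal[i * hop + n] * window[n] for n in range(frame_len)]
        for i in range(n_frames)
    ]

sample_rate = 8000  # assumed telephone-quality sampling rate
# One second of a synthetic 200 Hz tone as a stand-in for speech
signal = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(sample_rate)]

# 200 samples = 25 ms window, 80 samples = 10 ms shift (assumed values)
frames = frame_signal(pre_emphasis(signal), frame_len=200, hop=80)
print(len(frames), len(frames[0]))
```

Each windowed frame would then be passed to the feature computation, producing one feature vector per frame.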

Verification

[Diagram] vector sequence S + speaker model of claimed ID -> decision

• p(S | ID) > threshold -> "Accept"
• p(S | ID) < threshold -> "Reject"

Speaker Verification: Tuning

• Error types highly dependent on threshold:
  high security -> false accept low, false reject high
  user friendly -> false reject low, false accept high

[Plot] False accept and false reject rates over the threshold; the two curves cross at the Equal Error Rate

• Both errors increase by:
  - channel disturbance
  - crosstalk
  - noise
  - room acoustics

• Solution:
  - multiple enrolments
  - adaptive learning
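The trade-off between the two error types can be made concrete with a small threshold sweep. The sketch below assumes the rule "accept if score > threshold" and uses made-up score lists purely for illustration; the function names are hypothetical.

```python
def error_rates(user_scores, impostor_scores, threshold):
    """Return (false_reject_rate, false_accept_rate) for the rule: accept if score > threshold."""
    fr = sum(s <= threshold for s in user_scores) / len(user_scores)
    fa = sum(s > threshold for s in impostor_scores) / len(impostor_scores)
    return fr, fa

def equal_error_rate(user_scores, impostor_scores):
    """Try every observed score as threshold; return the one where FR and FA are closest."""
    def gap(th):
        fr, fa = error_rates(user_scores, impostor_scores, th)
        return abs(fr - fa)
    best = min(sorted(set(user_scores) | set(impostor_scores)), key=gap)
    fr, fa = error_rates(user_scores, impostor_scores, best)
    return best, (fr + fa) / 2

# Made-up verification scores, for illustration only
user_scores = [2.1, 1.7, 0.9, 2.5, 1.2, 0.4, 1.9, 2.8]
impostor_scores = [-1.0, 0.2, -0.5, 0.6, 1.0, -1.4, 0.1, -0.2]

threshold, eer = equal_error_rate(user_scores, impostor_scores)
print(threshold, eer)
print(error_rates(user_scores, impostor_scores, 1.5))  # high security: no false accepts
print(error_rates(user_scores, impostor_scores, 0.0))  # user friendly: no false rejects
```

Raising the threshold above the equal-error point trades false accepts for false rejects, and lowering it does the opposite.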

Speaker Verification: Score Normalisation (1)

Problem: How to set the optimal threshold?

HMMs generate likelihoods $p(O \mid \lambda)$, with
O : observation = sequence of features
$\lambda$ : speaker model

Bayes:

$$P(\lambda \mid O) = \frac{p(O \mid \lambda)\, P(\lambda)}{P(O)}$$

but $p(O \mid \lambda)$ is dependent on various factors.

Solution: Bayesian Decision Rule: accept if

$$C_{FR}\, P(\lambda \mid O) > C_{FA}\, P(\bar{\lambda} \mid O)$$

$C_{FR}, C_{FA}$ : cost functions
$\bar{\lambda}$ : model of all speakers other than the claimed one

With Bayes and log on both sides this leads to:

$$\log p(O \mid \lambda) - \log p(O \mid \bar{\lambda}) > \log \frac{C_{FA}\, P(\bar{\lambda})}{C_{FR}\, P(\lambda)} = \text{threshold}$$

Often assumed: costs are equal and speakers occur equally distributed:

$$\log p(O \mid \lambda) - \log p(O \mid \bar{\lambda}) > \log (N - 1)$$

N : number of users and impostors

$p(O \mid \bar{\lambda})$ is estimated using a world or cohort model:

world model : speaker model trained on all speakers

cohort model : speaker model trained on a group of most competing models (wolves)
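A minimal sketch of the normalised decision, for the case of equal costs and equally distributed speakers, where the log-likelihood difference between speaker model and world model is compared with log(N − 1). As a loud simplification, one-dimensional Gaussians stand in for the HMMs, and all model parameters and observations are made up.

```python
import math

def gaussian_log_likelihood(observations, mean, var):
    """Sum of log N(o; mean, var) over 1-D observations (toy stand-in for an HMM score)."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (o - mean) ** 2 / var)
        for o in observations
    )

def verify(observations, speaker_model, world_model, n_speakers):
    """Accept if log p(O|speaker) - log p(O|world) > log(N - 1)."""
    llr = (gaussian_log_likelihood(observations, *speaker_model)
           - gaussian_log_likelihood(observations, *world_model))
    threshold = math.log(n_speakers - 1)
    return llr > threshold, llr, threshold

speaker_model = (1.0, 0.5)   # (mean, variance) of the claimed speaker, made-up values
world_model = (0.0, 2.0)     # (mean, variance) pooled over all speakers, made-up values

genuine = [0.9, 1.2, 1.1, 0.8, 1.0, 1.3, 0.7, 1.05]
impostor = [-0.5, 0.3, -1.2, 0.1, 2.4, -0.8, 0.0, 0.6]

print(verify(genuine, speaker_model, world_model, n_speakers=10))
print(verify(impostor, speaker_model, world_model, n_speakers=10))
```

Because the world-model score is subtracted, factors that raise or lower both likelihoods alike largely cancel, which is the point of the normalisation.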

Speaker Verification: Enrolment

Method: Fixed, pre-specified sentence, e.g. "My voice is my password"
Enrolment: Speak sentence 3 – 5 times
Remarks: Sentence may be intercepted and played back

Method: Fixed, selectable sentence, e.g. maiden name of grandmother
Enrolment: Speak sentence 3 – 5 times
Remarks: Additional security by content

Method: Changing number triplets, e.g. fifteen, thirty-nine, seventy-three
Enrolment: Speak each number 3 – 5 times
Remarks: High security by many possible combinations

Method: System generates a new sentence for each verification
Enrolment: Speak each phoneme 3 – 5 times
Remarks: Elaborate enrolment, high processing effort, very high security
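The "changing number triplets" method can be sketched as a prompt generator. The sketch assumes that the numbers are two-digit (10 to 99) and distinct within a triplet, which the slides do not specify; `random_triplet` is a hypothetical helper name.

```python
import secrets

def random_triplet(low=10, high=99):
    """Draw three distinct numbers in [low, high] for one verification prompt."""
    rng = secrets.SystemRandom()            # OS-level randomness, hard to predict
    return rng.sample(range(low, high + 1), 3)

prompt = random_triplet()
n_combinations = 90 * 89 * 88               # ordered triplets of distinct numbers 10..99
print(prompt, n_combinations)
```

With several hundred thousand possible prompts under these assumptions, a recording of an earlier session is unlikely to match the next prompt, which is what makes playback attacks harder than with a fixed sentence.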

Speaker Verification: HMM types

Method                                                  Model
pre-specified sentence                                  linear
recombination of segments taken from enrolment data     piecewise linear
modeling without time structure                         ergodic

[Columns "Security" and "Accuracy": graphical ratings not preserved]
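The difference between a linear and an ergodic model shows up in the allowed state transitions. The sketch below builds initial transition matrices for both; the number of states and the probability values are assumptions for illustration and would normally be re-estimated in training.

```python
def linear_transitions(n_states, stay=0.6):
    """Linear (left-to-right) topology: each state may only repeat or move to the next one."""
    matrix = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        matrix[i][i] = stay
        matrix[i][i + 1] = 1.0 - stay
    matrix[-1][-1] = 1.0                    # the final state can only repeat
    return matrix

def ergodic_transitions(n_states):
    """Ergodic topology: every state can follow every state, so no time order is imposed."""
    return [[1.0 / n_states] * n_states for _ in range(n_states)]

linear = linear_transitions(4)
ergodic = ergodic_transitions(4)

for row in linear:
    print(row)
for row in ergodic:
    print(row)
```

The zeros in the linear matrix enforce the temporal order of a pre-specified sentence, while the fully populated ergodic matrix corresponds to modeling without time structure.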

Speaker Verification: Features (1)

Variable signal characteristics:
• often required: telephone band 300 – 3300 Hz (higher resonances cut off)
• changing channel characteristics, caused by transmission line, handset, distance to mouth
• static and intermittent noise
• user: health, intoxication, fatigue

Candidates determined by physiology:
• fundamental frequency, average
• wave form of vocal folds, shimmer, jitter, irregularities
• formants: average and dynamics
• places of articulation: fricatives, plosives
• nasal cavity resonance
• sub-glottal resonance

Candidates determined by behaviour:
• voiced/unvoiced ratio
• fundamental frequency, dynamics
• syllable rate, pause/speech ratio
• dialectal features: vowel quality

Candidates determined by speech technology:
• Linear Prediction Coefficients (LPC)
• filter bank, Bark filter bank, Mel filter bank
• Cepstrum, Mel-Cepstrum
• (derivatives with respect to time)
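The derivatives with respect to time mentioned for the technical features can be approximated per frame. The following sketch uses a simple symmetric difference with repeated edge frames, which is one possible choice and not necessarily the one intended on the slide; the cepstral values are made up.

```python
def delta(features):
    """First-order time derivative per coefficient: (c[t+1] - c[t-1]) / 2, edges repeated."""
    padded = [features[0]] + list(features) + [features[-1]]
    return [
        [(nxt - prv) / 2.0 for prv, nxt in zip(padded[t], padded[t + 2])]
        for t in range(len(features))
    ]

def append_deltas(features):
    """Concatenate each static feature vector with its delta vector."""
    return [static + d for static, d in zip(features, delta(features))]

# Three made-up cepstral coefficients per frame, four frames
cepstra = [[1.0, 0.5, 0.0], [2.0, 0.5, -1.0], [4.0, 0.5, -1.0], [4.0, 0.5, 0.0]]
extended = append_deltas(cepstra)
print(extended[1])
```

Appending the deltas doubles the vector dimension and adds information about how the spectrum changes over time to the static description of each frame.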

Speaker Verification: Road Map

[Timeline] 1990 - today - 2010 - 2020

• Access control for security areas
• Authentication via telephone
• Devices "recognise" their user
• Speaker profile on chip cards
• Access control for keyboardless PDAs
• Authentication in the background
• Public speaker profiles
• Automatic alcohol test in the vehicle

Thank You!
