SP_1_intro
Introduction to Digital Speech Processing
69451
Presented by Dr. Allam Mousa
An Najah National University
WHAT IS SPEECH?
Speech is the primary method of human communication.
One goal of digital speech processing is to transmit or store a speech waveform using as few bits as possible while retaining high quality.
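To make "as few bits as possible" concrete, the sketch below computes the raw bit rate of uncompressed (PCM) speech, which is the baseline a speech coder tries to beat. The 8 kHz sampling rate and 8 bits per sample are assumed telephone-style values, not figures from these slides, and the function names are illustrative.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Raw bit rate, in bits per second, of uncompressed PCM audio."""
    return sample_rate_hz * bits_per_sample * channels


def storage_bytes(bit_rate_bps, duration_s):
    """Bytes needed to store a signal of the given bit rate and duration."""
    return bit_rate_bps * duration_s / 8


# Assumed telephone-style parameters: 8000 samples/s, 8 bits per sample, mono.
rate = pcm_bit_rate(8000, 8)
print(rate)                     # 64000 bits per second
print(storage_bytes(rate, 60))  # 480000.0 bytes for one minute
```

Any coder that delivers comparable quality below this rate is saving bits relative to the uncompressed waveform.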
SPEECH PROCESSING
Speech processing aims at modeling and manipulating the speech signal in order to:
– transmit (code) speech efficiently
– produce (synthesize) natural-sounding voice
– recognize (decode) spoken words
Speech is a natural form of communication between humans and it reflects a lot of the variability and complexity of humans! This makes modeling speech an interesting and challenging task. The speech signal contains information from many levels and encodes information about the speaker and acoustic channel; the words and pronunciation; the language syntax and semantics, etc.
Speech technology is becoming increasingly well established: quite sophisticated technology is now incorporated into many widely deployed applications, and speech technologists are much in demand!
Speech Processing
Speech processing is the study of speech signals and the processing methods of these signals.
Speech is the way of choice for humans to communicate:
– no special equipment required
– no physical contact required
– no visibility required
– can communicate while doing something else
SPEECH PROCESS:
1- Production:
SPEECH PROCESS:
2- Propagation: the sound waves propagate through the air at a speed of roughly 340 m/s, reaching the listener’s ears.
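As a rough illustration of propagation, the sketch below estimates how long sound takes to reach a listener. The default speed of about 343 m/s (dry air near 20 °C) and the function name are assumptions made for this example; the speed is a parameter, so another value can be substituted.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed: dry air at about 20 degrees Celsius


def propagation_delay_s(distance_m, speed_m_per_s=SPEED_OF_SOUND_M_PER_S):
    """Time in seconds for a sound wave to travel distance_m."""
    if speed_m_per_s <= 0:
        raise ValueError("speed must be positive")
    return distance_m / speed_m_per_s


# Listener 3.43 m from the speaker: about 10 ms of delay.
print(f"{1000 * propagation_delay_s(3.43):.1f} ms")
```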
SPEECH PROCESS:
3- Perception: the incoming sounds are deciphered by the listener into a received message, thereby completing the chain of events that culminates in the transfer of information from the speaker to the listener.
SOME APPLICATIONS OF SPEECH PROCESSING
– Coding
– Compression
– Synthesis
– Automatic Speech Recognition (ASR)
– Speaker Recognition
– Speech Recognition
– Spoken Language Recognition
– Speech Enhancement
– Echo Cancellation
– Noise Cancellation
– … and more
Speech Processing draws on:
– Signal Processing: Fourier transforms, discrete-time filters, AR(MA) models
– Information Theory: entropy, communication theory, rate-distortion theory
– Statistical SP, stochastic models
– Acoustics: psychoacoustics, room acoustics, speech production
– Phonetics
– Algorithms (Programming)
DIGITAL SIGNAL PROCESSING
SPEECH SOUND CATEGORIES
– Voiced: speech sounds where the vocal folds vibrate.
– Vowels: no blockage of the vocal tract and no turbulence (e.g. “e”)
– Consonants: non-vowels (e.g. “s”)
– Plosives: consonants involving an explosion (e.g. “p”)
Speech Waveforms
Extracts from “my speech”:
(a) start of “y” vowel
(b) “ee” vowel
(c) “s” consonant
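One common way to quantify the difference between a periodic, vowel-like segment and a noise-like “s” segment is the zero-crossing rate. This is a widely used heuristic, not something these slides prescribe. The sketch below uses synthetic stand-ins (a 100 Hz sine for the vowel, uniform random noise for the consonant), and the sampling rate, durations, and function name are all assumptions.

```python
import math
import random


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


SAMPLE_RATE_HZ = 8000  # assumed sampling rate
N_SAMPLES = 400        # 50 ms of signal

# Stand-in for a voiced vowel: a 100 Hz sine wave.
vowel_like = [
    math.sin(2 * math.pi * 100 * n / SAMPLE_RATE_HZ) for n in range(N_SAMPLES)
]

# Stand-in for the "s" consonant: random noise (seeded for repeatability).
rng = random.Random(0)
s_like = [rng.uniform(-1.0, 1.0) for _ in range(N_SAMPLES)]

print(zero_crossing_rate(vowel_like))  # low: few sign changes per period
print(zero_crossing_rate(s_like))      # high: noise changes sign often
```

Real speech is messier than these stand-ins, so a practical voiced/unvoiced decision would usually combine this measure with others, such as short-time energy.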
HUMAN SPEECH PRODUCTION MECHANISM
SPEECH CHAIN
SPEAKER
SPEECH PROCESSING DIAGRAM
SOURCES OF SOUND ENERGY
1- Turbulence: air moving quickly through a small hole (e.g. /s/ in “size”)
2- Explosion: pressure built up behind a blockage is suddenly released (e.g. /p/ in “pop”)
3- Vocal Fold Vibration: like the neck of a balloon (e.g. /a/ in “hard”)
– airflow through the vocal folds (vocal cords) reduces the pressure and they snap shut (Bernoulli effect)
– muscle tension and air pressure build-up force the folds open again and the process repeats
– frequency of vibration (fx) is determined by the tension in the vocal folds and the pressure from the lungs
– for normal breathing and voiceless sounds (e.g. /s/) the vocal folds are held wide open and don’t vibrate
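The sources above are often idealized as two excitation signals: a pulse train repeating at fx for vocal fold vibration, and random noise for turbulence. The sketch below generates both. This is a simplified textbook-style model under assumed parameters (8 kHz sampling, fx = 100 Hz), not the method of these slides, and the function names are hypothetical.

```python
import random


def voiced_excitation(fx_hz, sample_rate_hz, duration_s):
    """Impulse train with one pulse per vocal fold closure (fx_hz per second)."""
    n_samples = round(sample_rate_hz * duration_s)
    period = round(sample_rate_hz / fx_hz)  # samples between pulses
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def unvoiced_excitation(sample_rate_hz, duration_s, seed=0):
    """Uniform random noise standing in for turbulent airflow."""
    rng = random.Random(seed)
    n_samples = round(sample_rate_hz * duration_s)
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]


# Assumed values: 50 ms of signal at 8 kHz, folds vibrating at 100 Hz.
voiced = voiced_excitation(fx_hz=100, sample_rate_hz=8000, duration_s=0.05)
noise = unvoiced_excitation(sample_rate_hz=8000, duration_s=0.05)

print(len(voiced), len(noise))
print(sum(voiced))  # number of pulses in the voiced segment
```

Raising fx_hz shortens the gap between pulses, which corresponds to a higher-pitched voice.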
SPEECH SOUND CATEGORIES:
1-Voiced: speech sounds where the vocal folds vibrate.
2-Vowels: no blockage of the vocal tract and no turbulence
SPEECH SOUND CATEGORIES:
3-Consonants: non-vowels.
4-Plosives: consonants involving an explosion
THE VOCAL TRACT FILTER
SPEECH SPECTROGRAM:
Example: “my speech”
VOCAL TRACT FILTER
The sound spectrum is modified by the shape of the vocal tract. This shape is determined by movements of the jaw, tongue and lips.
• The resonant frequencies of the vocal tract cause peaks in the spectrum called formants.
• The first two formant frequencies are roughly determined by the distances from the tongue hump to the larynx and to the lips respectively.
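For a feel of where formants typically fall, a standard simplification treats the vocal tract as a uniform tube closed at the larynx and open at the lips, whose resonances are F_k = (2k − 1)·c / (4L). This ignores the tongue hump described above, so it only approximates a neutral vowel. The 17 cm tract length, the 340 m/s speed of sound, and the function name are assumptions for this sketch.

```python
def uniform_tube_formants(tract_length_m, count=3, speed_of_sound_m_per_s=340.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    Quarter-wavelength resonator: F_k = (2k - 1) * c / (4 * L), k = 1, 2, ...
    """
    if tract_length_m <= 0:
        raise ValueError("tract length must be positive")
    return [
        (2 * k - 1) * speed_of_sound_m_per_s / (4.0 * tract_length_m)
        for k in range(1, count + 1)
    ]


# Assumed adult vocal tract length of 17 cm.
formants = uniform_tube_formants(0.17)
print([round(f) for f in formants])  # approximately 500, 1500, 2500 Hz
```

A shorter tract scales every resonance upward, which is one reason smaller speakers tend to have higher formants.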
http://www.youtube.com/watch?v=uTOhDqhCKQs
http://www.youtube.com/watch?v=X_JvfZiGEek