Speech acoustics
Uploaded by derek-stiles
![Page 1: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/1.jpg)
Speech acoustics
Objectives:
  Describe relative frequency and intensity of phonemes by voice, manner, and formant frequency.
  Describe various phonemic cues.
  Describe speech constraints.
![Page 2: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/2.jpg)
Average speech intensity
~65 dB SPL (~45 dB HL)
30 dB range
Any vowel has more power than any consonant
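The decibel figures on this slide can be turned into physical quantities with the standard dB definitions. The sketch below is illustrative only: the function names are made up for this example, and the 20 micropascal reference is the usual dB SPL reference rather than something stated on the slide.

```python
P_REF_PA = 20e-6  # standard reference pressure for dB SPL: 20 micropascals


def spl_to_pascals(level_db_spl):
    """Convert a level in dB SPL to RMS sound pressure in pascals."""
    return P_REF_PA * 10 ** (level_db_spl / 20)


def db_to_power_ratio(delta_db):
    """Convert a level difference in dB to a ratio of acoustic powers."""
    return 10 ** (delta_db / 10)


average_speech_pa = spl_to_pascals(65)      # roughly 0.036 Pa
range_power_ratio = db_to_power_ratio(30)   # a 30 dB range is a 1000-fold power ratio
print(f"{average_speech_pa:.4f} Pa, power ratio {range_power_ratio:.0f}")
```

In other words, the 30 dB range means the strongest speech components carry about 1000 times the power of the weakest ones.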
![Page 3: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/3.jpg)
Average speech frequency
Range: ~50 – 10,000 Hz
Most energy below 1000 Hz
Fundamental frequency
  Men: 100 Hz
  Women: 200 Hz
  Children: 300 Hz
  Crying babies: 500 Hz
Cues for talker identity
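As a rough illustration of what these fundamental frequencies imply, the sketch below computes the glottal period for each talker group and lists the harmonics (integer multiples of the fundamental) that fall below 1000 Hz, where the slide says most speech energy lies. The helper names are hypothetical; only the four F0 values come from the slide.

```python
# Approximate fundamental frequencies from the slide, in Hz.
F0_HZ = {"men": 100, "women": 200, "children": 300, "crying babies": 500}


def period_ms(f0_hz):
    """Duration of one cycle of the fundamental, in milliseconds."""
    return 1000.0 / f0_hz


def harmonics_below(f0_hz, limit_hz=1000):
    """Harmonics (integer multiples of F0) strictly below limit_hz."""
    return [k * f0_hz for k in range(1, int(limit_hz // f0_hz) + 1)
            if k * f0_hz < limit_hz]


for talker, f0 in F0_HZ.items():
    print(talker, f"{period_ms(f0):.1f} ms", harmonics_below(f0))
```

A lower fundamental packs more harmonics into the same frequency region, which is one reason voices with different fundamentals sound distinct.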
![Page 4: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/4.jpg)
Average speech duration
Vowels: 130 – 360 msec
Consonants: 20 – 150 msec
Rate: ~5 syllables/second; ~12 phonemes/second
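The rate figures imply average durations that can be checked with simple arithmetic. This is only a back-of-the-envelope sketch using the two rates from the slide; the variable names are illustrative.

```python
SYLLABLES_PER_SECOND = 5    # approximate rate from the slide
PHONEMES_PER_SECOND = 12    # approximate rate from the slide

ms_per_syllable = 1000 / SYLLABLES_PER_SECOND                     # 200 ms
ms_per_phoneme = 1000 / PHONEMES_PER_SECOND                       # about 83 ms
phonemes_per_syllable = PHONEMES_PER_SECOND / SYLLABLES_PER_SECOND  # 2.4

print(f"{ms_per_syllable:.0f} ms/syllable, "
      f"{ms_per_phoneme:.0f} ms/phoneme, "
      f"{phonemes_per_syllable:.1f} phonemes/syllable")
```

The average of about 83 msec per phoneme falls inside the consonant range and below the vowel range quoted above.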
![Page 5: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/5.jpg)
Vowel formants
[Figure: vowel chart with four quadrants labeled High F1 / Low F2, High F1 / High F2, Low F1 / Low F2, and Low F1 / High F2]
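The four F1/F2 combinations correspond to the usual articulatory description of vowels: a high F1 goes with a low (open) vowel and a high F2 goes with a front vowel. The sketch below maps a formant pair to its quadrant. The split points of 500 Hz and 1500 Hz are arbitrary assumptions chosen for illustration, and the example formant values are approximate textbook averages for adult male talkers, not values from the slides.

```python
def describe_vowel(f1_hz, f2_hz, f1_split_hz=500.0, f2_split_hz=1500.0):
    """Label a vowel by quadrant of the F1/F2 plane.

    The split frequencies are illustrative assumptions, not measured
    category boundaries.
    """
    height = "low (open)" if f1_hz >= f1_split_hz else "high (close)"
    backness = "front" if f2_hz >= f2_split_hz else "back"
    return f"{height} {backness}"


# Approximate adult-male formant values (assumed, for illustration only).
EXAMPLES = {"i": (270, 2290), "ae": (660, 1720), "a": (730, 1090), "u": (300, 870)}

for vowel, (f1, f2) in EXAMPLES.items():
    print(vowel, describe_vowel(f1, f2))
```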
![Page 6: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/6.jpg)
Vowel formants
![Page 7: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/7.jpg)
Consonants: place, manner, voicing
![Page 8: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/8.jpg)
Consonants: energy bands

| Consonant | Frequency bands (Hz), low to high | Intensity |
|---|---|---|
| r | 600-800, 1000-1500, 1800-2400 | 46 |
| l | 250-400, 2000-3000 | 43 |
| sh | 1500-2000, 4500-5500 | 41 |
| ng | 250-400, 1000-1500, 2000-3000 | 41 |
| ch | 1500-2000, 4000-5000 | 38 |
| n | 250-350, 1000-1500, 2000-3000 | 37 |
| m | 250-350, 1000-1500, 2500-3500 | 35 |
| th (ð) | 250-350, 4500-6000 | 34 |
| t | 2500-3500 | 34 |
| h | 1500-2000 | 32 |
| k | 2000-2500 | 34 |
| j | 200-300, 2000-3000 | 36 |
| f | 4000-5000 | 34 |
| g | 200-300, 1500-2500 | 33 |
| s | 5000-6000 | 32 |
| z | 200-300, 4000-5000 | 31 |
| v | 300-400, 3500-4500 | 31 |
| p | 1500-2000 | 30 |
| d | 300-400, 2500-3000 | 29 |
| b | 300-400, 2000-2500 | 29 |
| th (θ) | ~6000 | 28 |
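One way to use this table is to ask which consonants have all of their energy above some cutoff frequency, since those are the ones with no low-frequency band to fall back on. The sketch below stores the table as a dictionary and runs that query. The names are hypothetical, the third band for "ng" is entered as 2000-3000 on the assumption that the slide's "200-3000" is a typo, and "~6000" is stored as a single point.

```python
# consonant -> (list of (low_hz, high_hz) bands, intensity), in slide order
CONSONANT_BANDS = {
    "r": ([(600, 800), (1000, 1500), (1800, 2400)], 46),
    "l": ([(250, 400), (2000, 3000)], 43),
    "sh": ([(1500, 2000), (4500, 5500)], 41),
    "ng": ([(250, 400), (1000, 1500), (2000, 3000)], 41),  # slide reads "200-3000"
    "ch": ([(1500, 2000), (4000, 5000)], 38),
    "n": ([(250, 350), (1000, 1500), (2000, 3000)], 37),
    "m": ([(250, 350), (1000, 1500), (2500, 3500)], 35),
    "th (voiced)": ([(250, 350), (4500, 6000)], 34),
    "t": ([(2500, 3500)], 34),
    "h": ([(1500, 2000)], 32),
    "k": ([(2000, 2500)], 34),
    "j": ([(200, 300), (2000, 3000)], 36),
    "f": ([(4000, 5000)], 34),
    "g": ([(200, 300), (1500, 2500)], 33),
    "s": ([(5000, 6000)], 32),
    "z": ([(200, 300), (4000, 5000)], 31),
    "v": ([(300, 400), (3500, 4500)], 31),
    "p": ([(1500, 2000)], 30),
    "d": ([(300, 400), (2500, 3000)], 29),
    "b": ([(300, 400), (2000, 2500)], 29),
    "th (voiceless)": ([(6000, 6000)], 28),  # slide gives "~6000"
}


def only_above(cutoff_hz):
    """Consonants whose every energy band starts at or above cutoff_hz."""
    return [c for c, (bands, _) in CONSONANT_BANDS.items()
            if all(low >= cutoff_hz for low, _ in bands)]


print(only_above(2000))
```

With a 2000 Hz cutoff the query returns t, k, f, s, and voiceless th, all of which are voiceless consonants in this table.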
![Page 9: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/9.jpg)
Phonemic cues - Stops
Closure
  Voiceless stops: silent period
  Voiced stops: low-level energy
Burst
  Wide-band energy, ~40 msec
  Greater intensity for voiceless stops
  Frequency depends on place
Formant transition
  First formant always rising
  Second formant transition depends on place
![Page 10: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/10.jpg)
Phonemic cues - Stops
Voice easier to detect than place
For voiced stops
  Voice-onset time is earlier
  Energy present at fundamental frequency
  Burst energy is lower in amplitude
  Vowels are longer in duration before voiced final stops (“eyes” v. “ice”)
![Page 11: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/11.jpg)
Phonemic cues - Nasals
Always voiced
Continuant
Nasal resonance
  highest for /m/
  lowest for /n/
Second formant (frequency and transition) gives place information
![Page 12: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/12.jpg)
Phonemic cues - Fricatives
Hissing quality
Voiced fricatives
  Periodic
  Lower frequency
  Lower amplitude
  Greater overall energy (from fundamental)
Sibilants (s, z, sh, zh)
  Higher amplitude than other fricatives
![Page 13: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/13.jpg)
[Figure: panels labeled /f/, /θ/, /s/, /ʃ/]
![Page 14: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/14.jpg)
Suprasegmental cues
Stress
  changes in fundamental frequency, intensity, duration
Intonation
  changes in fundamental frequency, pitch pattern
  expresses attitudes, feeling, meaning (command, request, statement)
Duration
  variations in speech sounds due to context of other sounds
![Page 15: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/15.jpg)
Speech constraints
Syntactic
  S = NP (Aux) VP
  NP = (Det) (AP) N (PP): “the naughty boy in the daycare…”
  VP = V (NP) (PP) (Adv): “…took the toy away brusquely”
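The three phrase-structure rules can be written as a pattern over a sequence of category tags, with parentheses on the slide becoming optional groups. The sketch below is one minimal way to do that with regular expressions. It assumes each prepositional phrase has already been chunked into a single PP tag, because the slide does not give a rule for expanding PP, and the function name is hypothetical.

```python
import re

# NP = (Det) (AP) N (PP); tags are separated by single spaces.
NP = r"(?:Det )?(?:AP )?N(?: PP)?"
# VP = V (NP) (PP) (Adv)
VP = rf"V(?: {NP})?(?: PP)?(?: Adv)?"
# S = NP (Aux) VP
SENTENCE = re.compile(rf"{NP}(?: Aux)? {VP}")


def is_sentence(tags):
    """Return True if the tag sequence is licensed by the three rules."""
    return SENTENCE.fullmatch(" ".join(tags)) is not None


# "the naughty boy [in the daycare] took the toy brusquely", PP pre-chunked
print(is_sentence(["Det", "AP", "N", "PP", "V", "Det", "N", "Adv"]))  # True
print(is_sentence(["Det", "V"]))                                      # False: NP needs an N
```

Because a listener knows these patterns, a category that is expected next (for example a noun after a determiner) narrows down what a partly heard word could be.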
![Page 16: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/16.jpg)
Speech constraints
![Page 17: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/17.jpg)
Speech constraints
Syntactic
  The question “What should you eat?”: answer is a noun phrase
  The question “How should you eat?”: answer is an adverbial phrase
![Page 18: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/18.jpg)
Speech constraints
Semantic
  Words in a sentence are related meaningfully
  “Plug the mouse into the computer”
Situational
  Conversation usually refers to the context of the environment
  “I like that oat!”: Mall vs. Farm
![Page 19: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/19.jpg)
Overlapping cues help protect the signal from noise
Speech predictability helps protect the signal from noise
Noise can come from
  the speaker (poor intelligibility, etc.)
  the environment (distractions, etc.)
  the listener (ESL, etc.)
![Page 20: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/20.jpg)
Effects of hearing loss on speech perception
Objectives:
  Describe speech characteristics that are lost and that are preserved for hearing losses of various degree, type and configuration.
![Page 21: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/21.jpg)
[Figure: Auditory Response Area; frequency axis from 20 to 20,000 Hz, level axis from 0 to 160 dB]
![Page 22: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/22.jpg)
[Figure: Auditory Response Area; frequency axis from 20 to 20,000 Hz, level axis from 0 to 160 dB]
![Page 23: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/23.jpg)
[Figure: Auditory Response Area; frequency axis from 20 to 20,000 Hz, level axis from 0 to 160 dB]
![Page 24: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/24.jpg)
Speech audiogram
![Page 25: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/25.jpg)
Speech audiogram
[Figure: audiogram with six thresholds marked X]
![Page 26: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/26.jpg)
Speech audiogram
![Page 27: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/27.jpg)
Consonants: energy bands
(Same table as on page 8.)
![Page 28: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/28.jpg)
Consonants: energy bands
(Same table as on page 8.)
![Page 29: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/29.jpg)
Consonants: energy bands
(Same table as on page 8.)
![Page 30: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/30.jpg)
Speech audiogram
![Page 31: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/31.jpg)
Speech audiogram
![Page 32: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/32.jpg)
![Page 33: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/33.jpg)
![Page 34: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/34.jpg)
![Page 35: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/35.jpg)
34 dots
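If "34 dots" refers to a count-the-dots audiogram (an assumption, since the slide image is not available in this transcript), the arithmetic is simple: such charts spread 100 dots over the speech region, and the number of dots that remain audible approximates the Speech Intelligibility Index. The sketch below shows that calculation, together with the more general form of the index as an importance-weighted sum of band audibilities. Function names and the example weights are hypothetical.

```python
TOTAL_DOTS = 100  # count-the-dots audiograms use 100 dots


def sii_from_dots(audible_dots, total_dots=TOTAL_DOTS):
    """Approximate the index as the fraction of dots that are audible."""
    if not 0 <= audible_dots <= total_dots:
        raise ValueError("audible_dots must lie between 0 and total_dots")
    return audible_dots / total_dots


def weighted_sii(band_importance, band_audibility):
    """Sum of importance times audibility over frequency bands.

    band_importance must sum to 1; each audibility lies in [0, 1].
    """
    if abs(sum(band_importance) - 1.0) > 1e-9:
        raise ValueError("band importance weights must sum to 1")
    return sum(w * a for w, a in zip(band_importance, band_audibility))


print(sii_from_dots(34))                               # 0.34, or 34 as a percentage
print(weighted_sii([0.25] * 4, [1.0, 1.0, 0.0, 0.0]))  # 0.5 with made-up equal weights
```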
![Page 36: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/36.jpg)
Correlating SII to speech
Adult values (children would be worse)
Digits easy
Words hard
![Page 37: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/37.jpg)
X X X X X X
![Page 38: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/38.jpg)
Correlating SII to speech
![Page 39: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/39.jpg)
![Page 40: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/40.jpg)
![Page 41: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/41.jpg)
![Page 42: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/42.jpg)
Deafness
No access to average speech
![Page 43: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/43.jpg)
Severe
Access to only loudest components of speech
Speech production
  High airflow rate
  Speech initiation at low lung volumes
  Poor velar control (nasality)
  High fundamental frequency
  Slow speech rate
![Page 44: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/44.jpg)
Moderate
Access to louder half of speech, or to loud speech
Speech production
  Substitutions and distortions
  Errors in affricates, fricatives and blends
![Page 45: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/45.jpg)
Slight to Mild
Access to all but the quietest components of speech
Speech production
  Fewer distortions/substitutions
  Good intelligibility
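The descriptions of access for each degree of loss ("only loudest components", "louder half", "all but the quietest") can be sketched numerically from the earlier figures of roughly 45 dB HL average speech with a 30 dB range. The sketch below makes several loud assumptions that are not in the slides: the loss is flat across frequency, speech peaks sit 12 dB above the average (one common convention; 15 dB is another), and audibility grows linearly across the range. The function name is hypothetical.

```python
def audible_fraction(threshold_db_hl, speech_avg_db_hl=45.0,
                     peak_offset_db=12.0, range_db=30.0):
    """Fraction of the speech range above a flat hearing threshold.

    Assumes speech peaks lie peak_offset_db above the average level and
    the range extends range_db below those peaks.
    """
    peak_db_hl = speech_avg_db_hl + peak_offset_db
    fraction = (peak_db_hl - threshold_db_hl) / range_db
    return max(0.0, min(1.0, fraction))  # clamp to [0, 1]


for threshold in (20, 42, 50, 90):
    print(threshold, "dB HL ->", round(audible_fraction(threshold), 2))
```

Under these assumptions a flat 42 dB HL threshold leaves the louder half of average speech audible, while a 90 dB HL threshold leaves none of it audible.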
![Page 46: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/46.jpg)
Rising v. Sloping loss
![Page 47: Speech acoustics](https://reader035.fdocuments.in/reader035/viewer/2022062319/558493c6d8b42a043a8b535c/html5/thumbnails/47.jpg)
Rising v. Sloping loss
SII = 64 SII = 45