Page 1

[Figure 8.58 plot: cross-sectional area versus TIME FROM CONSONANT RELEASE (ms), axis from -150 to 100 ms]

Figure 8.58 The various curves represent the assumed time variation of the cross-sectional area of the glottal opening A_g and the supraglottal constriction A_c when a voiceless aspirated stop consonant is produced in intervocalic position. In the case of the curves for A_g, the solid curve represents the trajectory A_g would have if there were no pressure increase in the mouth; it is the same A_g that was assumed for /h/ in figure 8.34. The short-dashed curve represents the actual A_g that would occur as a consequence of the abducting forces due to the increased intraoral pressure for the stop consonant. The sloping straight line following the consonant release (from t = 0 to t = 70 ms) is an approximation to A_g(t) during this time interval and was used to determine the calculated curves in figure 8.60. The supraglottal curves A_c(t) are intended to approximate the changes in cross-sectional area for a velar consonant (dashed lines) and for a labial consonant (solid lines).

Figure 8.59 Equivalent circuit used to calculate the flows and pressures at the release of a voiceless aspirated stop consonant. Values of the elements are discussed in the text and in the legend to figure 7.4. R1 = 1.5 acoustic ohms.

[Figure 8.60 plot: airflow versus TIME RELATIVE TO CONSONANT RELEASE (ms)]

Figure 8.60 Calculated values of airflow versus time for intervocalic voiceless aspirated stop consonants, corresponding to the time variation of glottal and supraglottal areas given in figure 8.58. Calculations are based on the equivalent circuit in figure 8.59, with a subglottal pressure of 8 cm H2O. Values for the C and R elements are given in the legend for figure 7.4. The estimated time of onset of vocal fold vibration is indicated by the arrow at 50 ms.

All images in this file are courtesy of MIT Press. Used with permission.
Page 2

[Figure 8.61 plots: airflow (top) and intraoral pressure (bottom) versus TIME FROM CONSONANT RELEASE (ms)]

Figure 8.61 Calculations of airflow (top) and intraoral pressure (bottom) at the release of a voiceless aspirated stop consonant. Curves are shown for airflow through the supraglottal constriction and through the glottis (U_g). Dashed lines represent estimates for velar release, and solid lines for labial release. See text.

Page 3

Figure 8.68 Low-frequency equivalent circuit for estimating flows and pressures in the vocal tract for voiced obstruent consonants. The volume velocity source U_a accounts for the increase in vocal tract volume due to active expansion of this volume. The resistance R is connected to the circuit during fricatives and after release of the consonants.

Figure 8.69 Schematic representation of midsagittal section of the vocal tract at two points in time during production of the voiced stop consonant /d/ in intervocalic position: at the time the closure for the consonant is produced (solid contour), and immediately before the consonant is released (dashed contour). The changing midsagittal contour is intended to illustrate the expansion of the vocal tract volume during the closure interval for the consonant.

[Figure 8.70 panels: VOICED STOP (left column) and VOICELESS STOP (right column), each plotted against TIME RELATIVE TO CONSONANT RELEASE (ms) from -100 to 100 ms]

Figure 8.70 Schematized representation of the time course of various articulatory and aerodynamic parameters during the production of an intervocalic voiced stop consonant (left) and voiceless stop consonant (right), both consonants being unaspirated. The implosion of the consonant occurs at time -100 ms on the abscissa and the consonant release occurs at time zero. An intraoral pressure of 8 cm H2O is assumed. The top panel in each column gives the forward displacement of the pharyngeal wall (data estimated from Perkell, 1969). The next panel shows the estimated change in volume of the vocal tract during the closure interval. The curves labeled PASS. give the estimated passive volume change as a consequence of the increased intraoral pressure; the curves P + A are the estimated combined volume increase due to active volume change as well as passive change. The third panel gives the intraoral pressure P_m, with the subglottal pressure indicated by the horizontal line. The fourth panel shows the estimated glottal airflow. The bottom panel gives estimates of the percent change in vocal fold stiffness, relative to the stiffness in the vowels, that occurs during the closure interval and immediately following the release.

Page 4

Fricative consonants

[Figure 8.1 scale bars: 1 cm]

Figure 8.1 Midsagittal sections obtained from cineradiographs during the production of the three fricative consonants /f/, /s/, and /š/, as indicated. The subject is an adult female speaker of French. The approximate scale is given between the /f/ and /s/ configurations. In the case of the /s/ configuration, an estimate of the cross-sectional shape at the constriction is given. (This shape is drawn with an enlarged scale, as shown.) For each panel, two midsagittal sections for the tongue (and, in the case of //, for the lips) are shown, representing the fricative in two different phonetic environments. For /f/, the following vowel contexts are the high front rounded vowel /y/ (solid line) and // (dashed line); for /s/, /a/ (solid line) and /i/ (dashed line); and for /š/, /a/ (solid line) and /u/ (dashed line). (After Bothorel et al., 1986.)

[Figure 8.2 plot: area versus TIME RELATIVE TO RELEASE (ms), axis from -200 to 100 ms]

Figure 8.2 Estimates of the time course of the area of the glottal opening (A_g) and the area of the supraglottal constriction (A_c) when a fricative consonant is produced in intervocalic position. The solid lines indicate the area trajectories that would occur in the absence of a subglottal pressure, and the dashed lines indicate the modified areas that are calculated when forces on the structures due to the increased intraoral pressure are taken into account. The subglottal pressure is assumed to be 8 cm H2O.

[Figure 8.3 plots: pressure (top) and airflow (bottom) versus TIME FROM RELEASE (ms)]

Figure 8.3 Top: Time course of intraoral pressure (P_m) and transglottal pressure (ΔP_g) corresponding to the area trajectories for an intervocalic fricative consonant, in figure 8.2. The vertical marks on the ΔP_g curve indicate times when vocal fold vibration is expected to stop and to start, that is, at about -105 ms and 5 ms. Bottom: Calculated airflow vs. time for the intervocalic fricative consonant. Only the average flow is shown in the figure. There are fluctuations in the flow when the vocal folds are vibrating. Offset and onset of vocal fold vibration are expected at times indicated by the vertical marks.
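The relation between the two areas and the two pressures in figures 8.2 and 8.3 can be illustrated with a rough quasi-static calculation. The sketch below is not the calculation used for the figure: it assumes a purely kinetic pressure drop, rho*U^2/(2*A^2), at each constriction, ignores subglottal resistance and wall compliance, and uses an assumed air density. The function name and constants are hypothetical.

```python
import math

RHO = 0.00114   # assumed air density, g/cm^3
CMH2O = 980.0   # approximate dyn/cm^2 per cm H2O

def fricative_pressures(ps_cmh2o, a_g, a_c):
    """Quasi-static flow and pressures for two constrictions in series.

    ps_cmh2o: subglottal pressure in cm H2O
    a_g, a_c: glottal and supraglottal constriction areas in cm^2
    Returns (flow in cm^3/s, intraoral pressure, transglottal pressure),
    with both pressures in cm H2O.
    """
    if a_g <= 0.0:
        # Closed glottis: no flow, the full drop is across the glottis.
        return 0.0, 0.0, ps_cmh2o
    if a_c <= 0.0:
        # Closed constriction: no flow, intraoral pressure equals subglottal.
        return 0.0, ps_cmh2o, 0.0
    ps = ps_cmh2o * CMH2O
    # Assumed model: Ps = (rho / 2) * U^2 * (1 / Ag^2 + 1 / Ac^2)
    u = math.sqrt(2.0 * ps / (RHO * (1.0 / a_g ** 2 + 1.0 / a_c ** 2)))
    p_m = 0.5 * RHO * (u / a_c) ** 2 / CMH2O
    return u, p_m, ps_cmh2o - p_m

# Example: 8 cm H2O with a wide glottis and a narrow constriction.
flow, p_m, dp_g = fricative_pressures(8.0, a_g=0.3, a_c=0.1)
print(round(flow), round(p_m, 2), round(dp_g, 2))
```

Under these assumptions the transglottal pressure falls as the supraglottal constriction narrows relative to the glottis, which is the condition the caption associates with the offset of vocal fold vibration.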

Page 5

[Figure 8.12: spectrogram (top) and spectrum panels 1 to 12 with waveforms]

Figure 8.12 A spectrogram of the utterance /afa/ is shown at the top. The spectra in the three columns immediately below the spectrogram are sampled at several points in three regions of the utterance: in the vowel immediately preceding the consonant (left, panels 1 to 4); within the fricative region (middle, panels 5 to 8); and in the vowel immediately following the consonant (right, panels 9 to 12). These spectra are obtained with a 6.4-ms Hamming window. In the vowel regions, spectra are sampled at four adjacent glottal periods, whereas in the consonant the spectra are sampled more sparsely. In panels 6 to 8, where there is no evidence of glottal vibration, the lower spectra are averages of a series of spectra obtained with a 6.4-ms window over a 20-ms interval. The upper curves in these panels are smoothed versions of the average spectra. Waveforms and time windows, together with the times at which the spectra are sampled, are shown below each spectrum panel.
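The averaging described for panels 6 to 8 can be sketched as follows. This is an illustration only, not the analysis procedure used for the figure: the 1-ms hop between successive windows, the 10-kHz sampling rate in the example, and all function names are assumptions, and a plain DFT is used so that the block needs only the standard library.

```python
import cmath
import math

def hamming(n):
    """Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]

def power_spectrum(frame):
    """Power spectrum (bins 0..n/2) of one Hamming-windowed frame via a direct DFT."""
    n = len(frame)
    x = [s * w for s, w in zip(frame, hamming(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def averaged_spectrum_db(signal, fs, start_s, win_s=0.0064, span_s=0.020, hop_s=0.001):
    """Average short-window power spectra over a longer interval, in dB.

    win_s and span_s follow the caption (6.4 ms and 20 ms);
    hop_s is an assumed spacing between successive windows.
    """
    n = round(win_s * fs)
    first = round(start_s * fs)
    last = first + round(span_s * fs) - n
    hop = max(1, round(hop_s * fs))
    frames = [signal[i:i + n] for i in range(first, last + 1, hop)]
    spectra = [power_spectrum(f) for f in frames]
    mean = [sum(col) / len(spectra) for col in zip(*spectra)]
    return [10.0 * math.log10(p + 1e-12) for p in mean]

# Example with a synthetic 2500-Hz tone sampled at 10 kHz (assumed rate).
fs = 10000
tone = [math.sin(2.0 * math.pi * 2500.0 * t / fs) for t in range(400)]
spec = averaged_spectrum_db(tone, fs, start_s=0.0)
peak_bin = max(range(len(spec)), key=spec.__getitem__)
print(peak_bin * fs / 64)  # frequency of the strongest bin in Hz
```

Averaging power rather than decibel values is a design choice made here; the source does not say which was used.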

Page 6

Figure 8.16 Same as figure 8.12, except that the utterance is /osa/. Panels 1 to 4 represent spectra (6.4-ms window) sampled at several times preceding the VC boundary, and panels 7 to 10 are sampled following the CV boundary. Panels 5 and 6 are average spectra calculated over the first and second halves of the frication region, respectively. The spectra in panels A, B, C, and D were obtained with a longer time window. See text.

Page 7

[Figure 8.19 diagram labels: (a) SOURCE, LIPS, SUBLINGUAL CAVITY, PALATAL CHANNEL, BACK CAVITY; (b); scale bars 1 cm]

Figure 8.19 (a) Schematized representation of the detailed cavity shapes in the anterior portion of the vocal tract for //. The source of sound is assumed to be located at the lower incisors, as indicated. (From Halle and Stevens, 1991b.) (b) The zeros in the transfer function from source to mouth output in (a) are the natural frequencies of this configuration.

Page 8

Voiced fricatives

[Figure 8.76 panels: spectrogram and acoustic measures plotted against TIME (ms), with spectrum panels at the right]

Figure 8.76 Samples of acoustic data for the intervocalic voiced fricative consonant /z/ in the utterance a zip. The spectrogram of the utterance is shown at the top left, and below this are: H1, the amplitude of the first harmonic, measured as in figure 8.71; "noise," the amplitude of the frication noise, taken to be the peak spectrum amplitude (6.4-ms window) in the frequency range of 3.5 to 5 kHz; F1, the first formant frequency, measured as in figure 8.71; the difference H1 - H2 between the amplitudes (in decibels) of the first and second harmonics; and F0, the fundamental frequency. The vertical lines are estimates of the points where intraoral pressure begins to build up and where the pressure begins to decrease, based on examination of the acoustic record. Spectra at selected points near the left and right boundaries of the fricative are displayed at the right, together with waveforms and time windows (6.4-ms duration). The noise spectrum (panel 4) was obtained by averaging spectra over a 55-ms time interval. All other spectra were calculated with a single time window.

Page 9

Figure 8.25 Schematized model of the vocal tract for an affricate consonant. Closure is formed by setting the area A1 to zero at the front of the constriction. The channel behind the constriction, with area A2, forms the fricative portion of the affricate after A1 is released.

Figure 8.26 Conception of the midsagittal section of the anterior part of the vocal tract prior to the initial release (dashed line) and 20 to 30 ms following the release (solid line) of a palatoalveolar affricate.

Page 10

/h/ and aspiration

[Figure 8.34 panels (a) and (b), plotted against TIME (ms) from 0 to 300 ms]

Figure 8.34 (a) Estimate of time course of glottal cross-sectional area during the production of /h/ in intervocalic position. (b) Calculated airflow corresponding to area variation in (a). A lung pressure of 8 cm H2O is assumed. Resistance of subglottal airways is taken to be 1.5 acoustic ohms.
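One way to see how an airflow curve like (b) follows from an area curve like (a) is a static calculation at each instant. The sketch below is a hedged illustration, not the procedure behind the figure: it assumes the lung pressure is divided between a linear subglottal resistance (1.5 acoustic ohms, from the caption) and a kinetic pressure drop rho*U^2/(2*A^2) at the glottis, with an assumed air density. The function name, constants, and example areas are hypothetical.

```python
import math

RHO = 0.00114   # assumed air density, g/cm^3
CMH2O = 980.0   # approximate dyn/cm^2 per cm H2O

def glottal_airflow(area_cm2, p_lung_cmh2o=8.0, r_sub=1.5):
    """Static airflow (cm^3/s) through a glottis of the given area.

    Solves P = r_sub * U + (rho / (2 * A^2)) * U^2 for U >= 0,
    where r_sub is in acoustic ohms (dyn-s/cm^5).
    """
    if area_cm2 <= 0.0:
        return 0.0
    p = p_lung_cmh2o * CMH2O
    k = RHO / (2.0 * area_cm2 ** 2)
    # Positive root of the quadratic k*U^2 + r_sub*U - p = 0.
    return (-r_sub + math.sqrt(r_sub ** 2 + 4.0 * k * p)) / (2.0 * k)

# Example: flow for a few illustrative glottal areas (values assumed).
for area in (0.05, 0.1, 0.2, 0.3):
    print(area, round(glottal_airflow(area)))
```

The subglottal resistance matters more as the glottis opens, since the flow grows and a larger share of the lung pressure is dropped below the glottis.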

Figure 8.33 Schematization of glottal opening for an abducted glottal configuration. The glottal width at the level of the arytenoid cartilages is w; l_m and l_c are the lengths of the membranous and cartilaginous portions of the glottis.

[Figure 8.35 panels: waveforms versus TIME (ms) and spectra versus FREQUENCY (Hz); panel (a) is labeled MODAL]

Figure 8.35 (a) and (b) Schematic representation of the waveforms and (c) and (d) spectra for modal voicing and for breathy voicing. For breathy voicing, the amplitude of the fundamental component is enhanced relative to the amplitudes of higher-frequency components. A fundamental frequency of 125 Hz is assumed. The amplitudes of the waveforms and of the spectral components are selected to be in the range observed for male voices.

Page 11

[Figure 8.36 panels: spectra versus FREQ (kHz), with waveforms and time windows below]

Figure 8.36 Spectra sampled near the onset of the vowel (upper panels) and about 30 ms later (lower panels) in the syllable /ho/ produced by two speakers: a female (left) and a male (right). Waveforms are shown below each spectrum, together with the time window over which the discrete Fourier transform is calculated.

[Figure 8.41 plot versus FREQUENCY (kHz), 0 to 4 kHz; curves labeled periodic source and noise source]

Figure 8.41 Calculated spectra of radiated sound (before adding the effect of an all-pole transfer function) due to sources of turbulence noise in the laryngeal region during /h/ and to the periodic glottal source at a time when the glottis is in a configuration for modal voicing in an adjacent vowel. For the periodic source, the solid line gives the levels in 300-Hz bands, whereas the open circles give the levels of every fourth harmonic, assuming a fundamental frequency of 125 Hz. The spectrum of the noise is in 300-Hz bands. The distance from the mouth is 20 cm. To obtain the spectra for a vowel, the all-pole component of the vocal tract transfer function must be added to these spectra. See text.

Figure 8.38 Schematized views of the laryngeal region in the sagittal plane (left) and in coronal section (right), indicating how airflow through the glottis might impinge on the surface of the epiglottis (left) or on the ventricular folds (right) to produce turbulence noise that is represented as a source of sound pressure.

Page 12

[Figure 8.44 panels: smoothed spectra versus FREQ (kHz), 0 to 5 kHz]

Figure 8.44 Examples of smoothed spectra sampled in /h/ (solid lines) and in the following vowel (dotted lines) for the utterances /he/ (male speaker) and /ho/ (female speaker) as indicated. Spectra were obtained by calculating a discrete Fourier transform (time window of about 30 ms) and then smoothing this spectrum with a weighted frequency window of width about 400 Hz.

Figure 8.50 (a) Midsagittal section of vocal tract for vowel /i/, showing constriction in the palatal region. (From Perkell, 1969.) (b) Schematization of the airway with two constrictions: one at the glottis, with cross-sectional area A_g, and the other in the supraglottal airway, with cross-sectional area A_c.

[Figure 8.54 panels: smoothed spectra versus FREQ (kHz), 0 to 5 kHz; noise spectra labeled [h]]

Figure 8.54 Some smoothed spectra sampled in the syllables /hi/ and /hu/. For each utterance, two spectra are shown: a spectrum sampled within /h/ (solid line) and a spectrum sampled in the following vowel (dotted line). The prominence of the F2 or F3 peak in the noise spectrum is indicated in each panel. The upper row represents a male speaker, and the lower row a female speaker.

Page 13

[Figure 8.57 panels: spectra versus FREQ (kHz) and waveforms versus TIME (ms), 0 to 100 ms]

Figure 8.57 Displays of several aspects of the acoustic signal in the vicinity of voicing onset in the syllable /he/ produced by a female speaker (top) and a male speaker (bottom). The upper panel of each section gives the spectrum (30-ms window) sampled near the onset of glottal vibration (left) and 30 to 40 ms later (right), as indicated by the waveforms immediately below the spectra. Below these waveforms is the complete waveform for about 100 ms near voicing onset. The bottom waveform in each section was obtained by passing the signal through a bandpass filter (bandwidth = 600 Hz) centered on the third formant. This waveform is less regular near voicing onset than it is later, indicating the dominance of the noise component in the glottal source.