Prosody and Suprasegmentals

Dr. Christian DiCanio
cdicanio@buffalo.edu

University at Buffalo

11/24/15

DiCanio (UB) Suprasegmentals 11/24/15 1 / 45

Suprasegmentals

Suprasegmental contrasts refer to those aspects of the speech signal which are not mainly defined by the actions of the oral articulators. Tone, stress, intonation, and length may be “overlaid” on a string of speech sounds.

[Figure: the word [mæ sə tʃu sɪts], with its four syllables labeled for stress as secondary, none, primary, none; x-axis: Time (s), 0 to 1.253.]

Considerations

Prosodic units

Stress

Length

Intonation/Prosody

Suprasegmentals

Suprasegmentals refers either to:

1 the acoustic dimensions of the speech signal not directly produced by the oral articulators (F0, voice quality), the temporal aspects of the speech signal (duration), and the “strength” of such articulations (intensity, hyperarticulation)

2 the phonological categories which utilize such dimensions (stress, tone, length)

Of relevance here is that such dimensions extend beyond the segment or function independently from segmental units.

Acoustics of F0

Acoustics of F0 and pitch

Physically, F0, or fundamental frequency, refers to the greatest common divisor of the component frequencies in a periodic complex wave. Thus, if we have a complex wave with three components (300 Hz, 500 Hz, 700 Hz), the fundamental frequency is 100 Hz.
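The 300/500/700 Hz example can be checked numerically. The sketch below assumes the component frequencies are whole numbers of Hz, in which case the fundamental is their greatest common divisor; the function name is only illustrative.

```python
from functools import reduce
from math import gcd


def fundamental_hz(component_hz):
    """Greatest common divisor of integer-valued component frequencies (Hz)."""
    return reduce(gcd, component_hz)


print(fundamental_hz([300, 500, 700]))  # 100
```

Note that none of the three components is itself at 100 Hz; the fundamental is a property of the whole complex wave.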

The pitch of a sound corresponds to its fundamental frequency (F0), but the two are distinct. Pitch is often perceived as the difference in frequency between successive harmonics of the F0. Thus, we can hear a pitch even if a fundamental frequency is absent.

The pitch range in humans varies. For men, it is usually between 75 and 250 Hz. For women, it is usually between 120 and 350 Hz.

Acoustics of F0

Measurement of F0

While we can measure F0 manually by examining the inverse of period duration, we rely on algorithms in programs like Praat (Boersma and Weenink, 2013) to determine it automatically for us.
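As a small illustration of the manual method, F0 is the inverse of the period. The sketch below assumes we have marked the times of successive glottal pulses by hand; the pulse times are hypothetical, not measured data.

```python
def f0_from_pulse_times(pulse_times_s):
    """Estimate F0 (Hz) as the inverse of the mean period between pulses."""
    periods = [b - a for a, b in zip(pulse_times_s, pulse_times_s[1:])]
    mean_period = sum(periods) / len(periods)
    return 1.0 / mean_period


# Hypothetical hand-marked pulse times (seconds), 8 ms apart.
pulses = [0.000, 0.008, 0.016, 0.024]
print(round(f0_from_pulse_times(pulses)))  # 125
```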

Typically, autocorrelation is used to determine F0. This method uses a roving window (3 × 1/F0min in Praat) over which the correlation between successive pitch periods is calculated.

Within the window, many different components of the signal might correlate with each other, including pieces within a single period of a complex wave or components across several periods.

The greatest correlation should be across subsequent pitch periods.

Acoustics of F0

Autocorrelation varies the distance between windows by a short and long lag, defined by the F0 range. The best correlation across windows is usually a distance of one period, which then defines the F0 for that period.
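The idea can be sketched in a few lines of plain Python. This is a simplified illustration, not Praat's actual algorithm: it takes one analysis window of three periods of the lowest candidate F0, tries every lag allowed by an assumed F0 range (75 to 500 Hz), and reports the lag with the highest normalized correlation. The test signal, sample rate, and function names are all assumptions made for the example.

```python
import math


def synthesize(freqs_hz, sr, n_samples):
    """Sum of equal-amplitude sine waves, sampled at sr Hz."""
    return [sum(math.sin(2 * math.pi * f * i / sr) for f in freqs_hz)
            for i in range(n_samples)]


def autocorr_f0(x, sr, f0_min=75.0, f0_max=500.0):
    """Return the F0 (Hz) whose lag gives the highest normalized correlation."""
    min_lag = int(sr / f0_max)   # shortest lag = highest candidate F0
    max_lag = int(sr / f0_min)   # longest lag = lowest candidate F0
    n = len(x) - max_lag         # samples compared at every lag
    if n <= 0:
        raise ValueError("window too short for the requested F0 range")
    ref_energy = sum(v * v for v in x[:n])
    best_lag, best_r = min_lag, -1.0
    for lag in range(min_lag, max_lag + 1):
        num = sum(x[i] * x[i + lag] for i in range(n))
        den = math.sqrt(ref_energy * sum(v * v for v in x[lag:lag + n]))
        r = num / den if den > 0 else 0.0
        if r > best_r:
            best_lag, best_r = lag, r
    return sr / best_lag


sr = 8000
window = synthesize([200, 300, 400], sr, round(3 * sr / 75))  # 3 / F0min
print(autocorr_f0(window, sr))  # 100.0
```

The test signal contains energy only at 200, 300, and 400 Hz, yet the best lag corresponds to 100 Hz, which also illustrates how a pitch can be recovered when the fundamental itself is absent.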

Acoustics of F0

F0 tracking is statistical, though: the highest correlation “wins” and determines the F0 of the signal. Autocorrelation favors the shortest lag between windows and this can cause problems.

Pitch-doubling occurs when the highest correlation across windows is found at a distance less than one period.

Pitch-halving occurs when the highest correlation across windows is found at a distance greater than one period.

These patterns are typically more common in non-modal phonation types or when the F0 range is incorrectly defined.
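The arithmetic behind these two errors is simple: the tracker reports the inverse of the winning lag, so a lag of k true periods yields the true F0 divided by k. A minimal sketch, with a hypothetical true F0 of 120 Hz:

```python
def reported_f0(true_f0_hz, lag_in_periods):
    """F0 reported when the winning lag spans lag_in_periods true periods."""
    return true_f0_hz / lag_in_periods


print(reported_f0(120, 1))    # 120.0  correct tracking
print(reported_f0(120, 0.5))  # 240.0  pitch-doubling (lag of half a period)
print(reported_f0(120, 2))    # 60.0   pitch-halving (lag of two periods)
```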

Duration

Like pitch, duration is used as a suprasegmental property. In languages like English and Polish, it is used to mark stress.

Duration can be used for segmental contrasts (singletons-geminates). Yet, even in languages with contrastive length, duration may be used to indicate stress.

Other things, apart from stress, also influence vowel length, like word size or consonant voicing.

Duration

Long vowels in multisyllabic words

Duration (in ms) of /aː/ in the sequence /taːt/ in a set of Hungarian words (Tarnóczy, 1965):

210   /taːt/
180   /taːtoɡ/
140   /taːtoɡɒt/
120   /taːtoɡɒtoːk/
110   /taːtoɡɒtoːknɒk/

The more syllables in the word, the shorter the initial “long” vowel.
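To make the size of the effect concrete, the listed durations can be expressed relative to the monosyllable. The sketch below only re-expresses the five values above, keyed by the number of syllables in each word; the function name is illustrative.

```python
def relative_durations(duration_by_syllables):
    """Scale each duration by the duration in the one-syllable word."""
    base = duration_by_syllables[1]
    return {n: d / base for n, d in duration_by_syllables.items()}


# Duration (ms) of the initial long vowel, keyed by syllables in the word.
tarnoczy_ms = {1: 210, 2: 180, 3: 140, 4: 120, 5: 110}

for n, ratio in relative_durations(tarnoczy_ms).items():
    print(f"{n} syllable(s): {ratio:.0%} of the monosyllable duration")
```

In the five-syllable word the vowel is roughly half as long as in the monosyllable.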

Duration

There is contrastive vowel length in Levantine Arabic (Ham, 2001), but word size still has an effect on vowel length.

Duration

Thai vowel length contrast (Abramson, 1974; Roengpitya, 2002)

Short vowel                 Long vowel
pàk    ‘to stick in’        pààk    ‘mouth’
cìp    ‘to sip’             cììp    ‘to fold’
baŋ    ‘to hide’            baaŋ    ‘some’
òt     ‘to abstain’         òòt     ‘to lament’
kham   ‘dusk’               khaam   ‘to cross’
than   ‘on time’            thaan   ‘alms’

Vowel length is neutralized in unstressed syllables in Thai. So, stress is not simply indicated by acoustic duration in the language, but by the presence of a length contrast.

Duration

The effect of stress on vowel length in Thai (Gandour et al., 1996).

Duration

Intrinsic Duration.

Vowel length is tied to vowel height (low vowels tend to be longer than high vowels).

Intrinsic duration in Swedish (Elert, 1964):

long short

high /i y u/ 140 95

mid /e ! "/ 155 103

low /æ œ #/ 164 111

Duration

American English vowel duration (Hillenbrand et al., 1995)

/i/    282 ms      /u/    273 ms
/ɪ/    226 ms      /ʊ/    229 ms
/eɪ/   300 ms      /oʊ/   300 ms
/ɛ/    226 ms      /ɐ/    216 ms
/æ/    311 ms      /ɔ/    318 ms
/a/    300 ms      /ɹ/    297 ms

Lower vowels are longer than higher vowels.

Lax vowels /ɪ, ɛ, ʊ, ɐ/ are shorter than non-lax vowels, with a duration ratio of 1:1.33.
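The 1:1.33 ratio can be reproduced from the twelve durations listed above. The sketch below assumes the ratio compares the mean of the four lax vowels with the mean of all remaining vowels in the table; the grouping is an assumption, since the slide does not say how the ratio was computed.

```python
from statistics import mean

# Durations (ms) copied from the table above (Hillenbrand et al., 1995).
duration_ms = {
    "i": 282, "u": 273, "ɪ": 226, "ʊ": 229, "eɪ": 300, "oʊ": 300,
    "ɛ": 226, "ɐ": 216, "æ": 311, "ɔ": 318, "a": 300, "ɹ": 297,
}
LAX = {"ɪ", "ɛ", "ʊ", "ɐ"}


def lax_ratio(durations, lax):
    """Mean non-lax duration divided by mean lax duration."""
    lax_mean = mean(d for v, d in durations.items() if v in lax)
    other_mean = mean(d for v, d in durations.items() if v not in lax)
    return other_mean / lax_mean


print(round(lax_ratio(duration_ms, LAX), 2))  # 1.33
```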

Loudness

Stress differences in languages also sometimes correlate with differences in loudness, but this is uncommon. In English, stressed syllables are often no louder than unstressed syllables (Ladefoged, 1971).

Loudness often does vary with respect to vowel quality or pitch level.

Loudness

Intrinsic Loudness - low vowels are louder.

Vowel   English: VU (volume unit) meter readings   Hungarian: dB relative to /a/
        (Lehiste & Peterson, 1959)                 (Fónagy, 1966)
/i/     80.1                                        -9.7
/e/     81.1                                        -3.6
/a/     85.7                                         0
/o/     83.5                                        -7.2
/u/     80.4                                       -12.3

Loudness

Intensity profile of 5 different tones in Trique

[Figure: intensity contours for the five tones; x-axis: Time (normalized), 2 to 10.]

Prosodic units

Despite the fact that we can describe speech articulations in terms of segments, there is actually little evidence that this is the main unit we use for speech planning or organizing our speech.

Larger units, like syllables or moras, are the organizational frames over which suprasegmental properties apply.

The syllable is a unit of timing in speech, but it is ill-defined in purely acoustic/articulatory terms.

Tap test: syllables and stress, e.g. “Massachusetts”

Prosodic units

What’s a syllable?

Early work by Stetson proposed that one could identify the syllable in terms of “chest pulses”, though this was disproved by Ladefoged in 1967.

There are no individual pulses in the chest cavity which correspond to distinct syllables in speech.

A quasi-acoustic definition relies on sonority, reflecting the relative intensity of a speech sound. Syllables are defined as having a rise in sonority and a subsequent fall, e.g. [sɪt] low sonority + high sonority + low sonority; [snoʊ] lowest sonority + higher sonority + highest sonority.

All syllables have a “peak” in prominence.

Prosodic units

Problems with sonority

How do we determine whether a particular sound is louder, though? Is [z] louder than [d]? How do syllabic sonorants count, e.g. is “prism” one or two syllables?

In languages like Polish, where sonorants can be part of clusters, it is less clear how to define different syllables in terms of sonority, e.g. [ˈjabwkɔ] ‘apple’ is disyllabic (due to stress), but where does [w] belong?

There are many disagreements about the status of medial consonants, e.g. “fellow.”

Tautological with respect to prominence: a sound is prominent if it forms the peak of a syllable, but a syllable is present if there is a peak in prominence.
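The sonority-based definition, and two of the problem cases above, can be made concrete with a rough sketch. It counts a syllable wherever a segment is more sonorous than both of its neighbors. The numeric scale is an assumption chosen for illustration (stops < fricatives < nasals < liquids < glides < vowels), not a scale given in these slides, and diphthongs are treated as single units.

```python
# Assumed sonority scale, for illustration only.
SONORITY = {
    "p": 1, "t": 1, "k": 1, "b": 1, "d": 1,   # stops
    "s": 2, "z": 2,                           # fricatives
    "m": 3, "n": 3,                           # nasals
    "ɹ": 4, "l": 4,                           # liquids
    "j": 5, "w": 5,                           # glides
    "ɪ": 6, "a": 6, "ɔ": 6, "oʊ": 6,          # vowels
}


def count_sonority_peaks(segments):
    """Count segments that are strictly more sonorous than both neighbors."""
    values = [0] + [SONORITY[s] for s in segments] + [0]  # pad word edges
    return sum(1 for i in range(1, len(values) - 1)
               if values[i - 1] < values[i] > values[i + 1])


print(count_sonority_peaks(["s", "ɪ", "t"]))                 # 1
print(count_sonority_peaks(["s", "n", "oʊ"]))                # 1
print(count_sonority_peaks(["p", "ɹ", "ɪ", "z", "m"]))       # 2
print(count_sonority_peaks(["j", "a", "b", "w", "k", "ɔ"]))  # 3
```

Under this scale “prism” comes out with two peaks, and Polish [ˈjabwkɔ] with three even though the word is disyllabic, which is exactly the kind of mismatch the sonority account struggles with.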

Prosodic units

Solution: planning and coordination

A better solution is one where the syllable is defined as a unit of speech planning. We plan our motor actions in syllable-sized chunks.


... (Browman and Goldstein, 2000), that multiple, competing coupling relations can be specified in the network of oscillators in the coupling graph. For example, in the case of “spot,” the oral constriction gestures of /s/ and /p/ are coupled in-phase to the vowel gesture and simultaneously anti-phase to one another, as shown in the coupling graph in Fig. 7.8. The coupled oscillator planning model (Nam and Saltzman, 2003) predicts that the onset of the vowel gesture should occur midway between the onset of the tongue tip gesture for /s/ and the lip gesture for /p/. As shown in Fig. 7.10 for the phrase “pea spots,” kinematic data (for the same speaker as in Fig. 7.9) supports this prediction.

Figure 7.9 Time functions of vocal tract variables, as measured using X-ray microbeam data, for the phrase “pea pots” showing the in-phase (synchronous within 25 ms) coordination of the lip gesture for the /p/ in “pots” and the /a/ gesture for the vowel in “pots.” Tract variables shown are lip aperture (distance between upper and lower lips), which is controlled for lip closure gestures (/p/ in this example) and tongue tip constriction degree (distance of the tongue tip from the palate), which is controlled in tongue tip gestures (/t/ and /s/ in this example). Also shown is the time function for the distance of the tongue body from the palate, which is small for /i/ and large for the vowel /a/, when the tongue is lowered and back into the pharynx. (The actual controlled tract variable for the vowel /a/ is the degree of constriction of the tongue root in pharynx, which cannot be directly measured using a technique that employs transducers on the front of the tongue only. So distance of the tongue body from the palate is used here as a rough index of tongue root constriction degree.) Boxes delimit the times of presumed active control for the oral constriction gestures for /p/ and /a/. These are determined algorithmically from the velocities of the observed tract variables. The left edge of the box represents gesture onset, the point in time at which the tract variable velocity towards constriction exceeds some threshold value. The right edge of the box represents the gesture release, the point in time at which velocity away from the constricted position exceeds some threshold. The line within the box represents the time at which the constriction target is effectively achieved, defined as the point in time at which the velocity towards constriction drops below threshold.


(Goldstein et al., 2006)
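The velocity-threshold procedure described in the Figure 7.9 caption can be sketched as follows. This is an illustration of the general idea only, not the authors' implementation: the lip-aperture samples, the sampling interval, and the threshold are hypothetical values, and velocity is approximated by simple first differences.

```python
def gesture_landmarks(tract_variable, dt_s, threshold):
    """Return (onset, target, release) sample indices for one gesture.

    tract_variable holds constriction-degree samples (smaller values mean
    a tighter constriction); threshold is a speed in units per second.
    """
    # Velocity towards constriction is positive while the aperture shrinks.
    closing = [-(b - a) / dt_s
               for a, b in zip(tract_variable, tract_variable[1:])]
    # Onset: closing velocity first exceeds the threshold.
    onset = next(i for i, v in enumerate(closing) if v > threshold)
    # Target: closing velocity drops back below the threshold.
    target = next(i for i in range(onset, len(closing))
                  if closing[i] <= threshold)
    # Release: velocity away from the constriction exceeds the threshold.
    release = next(i for i in range(target, len(closing))
                   if -closing[i] > threshold)
    return onset, target, release


# Hypothetical lip aperture (mm), sampled every 10 ms.
lip_aperture_mm = [10, 10, 10, 8, 5, 2, 1, 1, 1, 1, 3, 6, 9, 10, 10]
print(gesture_landmarks(lip_aperture_mm, dt_s=0.01, threshold=50.0))  # (2, 6, 9)
```

Multiplying each index by the sampling interval gives the landmark times that would form the left edge, inner line, and right edge of a box in the figure.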

Prosodic units

The syllable and suprasegmentals

The syllable is also the target of suprasegmental contrasts. Hakha Lai has three tones: falling, rising, and low. The same trajectory occurs across vowels preceding a stop coda as on the entire rime (VC) which includes a sonorant coda.

[Figure 9 from Maddieson (2004): F0 in target words with short vowels and falling tones; x-axis: time in ms, 100 to 900.]

(Maddieson, 2004)


The syllable duration is the constant timing unit, so long vowels occur with a short coda and short vowels occur with a long coda (Maddieson, 2004).

Prosodic units

In many languages, there is evidence of a different timing unit, called the mora.

TABLE I. The possible segment configurations for the mora.

Number of moras   Word          English gloss
1                 da            to be
1                 hi            fire
2                 ko-re         this
2                 i-chi         one
3                 ko-ko-ro      heart
3                 Ke-i-ko       girl's name
3                 Ho-n-da       family name
3                 cho-t-to      a little bit
4                 Kyu-u-de-n    Imperial Palace

... does provide some specific constraint on articulatory timing in Japanese. The results below seem to support a somewhat freer and more abstract view of the temporal properties of the mora but suggest that it is inherently temporal, nevertheless. The following experiments examine this concept by exploring interactions of the mora with other phonological properties of Japanese.

I. EXPERIMENT 1: WORD LENGTH VERSUS NUMBER OF SYLLABLES

Since words should always begin and end at mora boundaries, one of the simplest predictions of the mora hypothesis is that word durations should come in steps that are integral multiples of the duration of a mora at a particular speaking tempo. Thus, if we start with a one-mora word and add successive moras to it, the total word duration should increase by constant amounts in each case. Note that this is not generally true in English (Lehiste, 1972; Port, 1981; Klatt, 1976) where the distinction between stressed and unstressed syllables means that addition of an unstressed syllable adds only a little to total word duration, and where further (unstressed) syllables add less and less to the duration of the word. This experiment tested the prediction that the number of moras in a word should control word duration no matter what the number of syllables is. Thus it is a fairly simple and direct test of the mora hypothesis.
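The prediction can be stated as a small sketch: count the moras in a word and multiply by a fixed per-mora duration. The hyphenated forms follow Table I; the 150 ms mora duration is a hypothetical value chosen only to show the constant step size, not a measurement from the study.

```python
def mora_count(hyphenated_word):
    """Number of moras in a word written with hyphens between moras."""
    return len(hyphenated_word.split("-"))


def predicted_duration_ms(n_moras, mora_ms):
    """Word duration predicted by strict mora timing."""
    return n_moras * mora_ms


MORA_MS = 150  # hypothetical per-mora duration at one speaking tempo

for word in ["da", "ko-re", "ko-ko-ro", "Ho-n-da", "cho-t-to", "Kyu-u-de-n"]:
    n = mora_count(word)
    print(f"{word}: {n} mora(s), predicted {predicted_duration_ms(n, MORA_MS)} ms")
```

On this view “Ho-n-da” and “cho-t-to” are predicted to be as long as “ko-ko-ro”, even though they contain fewer syllables.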

A. Methods

Sentences were constructed from the words on the following list. Two sets of words were used in an attempt to average across any effects due to individual segments. They are mostly nonsense words but are plausible Japanese. The words were constructed so that those in the left column would have a syllable for each new mora, but the list in the right-hand column should undergo several phonological rules that might be expected to reduce the number of syllables.

ra            si [ʃi]
raku          sita [ʃta]
rakuda        sitaku [ʃtaku]
rakudaga      sitakusu [ʃtakʰsu]
rakudagasi    sitakusuru [ʃtakʰsuru]

The words were embedded in a constant carrier sentence Kore wa __ desu, "This is __" and written in Japanese orthography. The words were constructed with the help of a native speaker to make sure that all sounded like possible Japanese words. The first set contains an increasing number of syllables and, of course, moras. According to the mora hypothesis, then, each word should be longer than the preceding one by the same amount of time. Words of the second set undergo a well-known phonological rule in Japanese which results in the apparent deletion of several vowels (McCawley, 1968; Beckman and Shoji, 1984). The rule specifies that the high vowels /i, u/ are either deleted altogether (e.g., after /s, ʃ/) or else made voiceless (whispered) when they occur between two voiceless consonants. Thus a word that is phonemically (and orthographically) /sita/ is actually pronounced [ʃta], and /sitakusuru/ is pronounced [ʃtakʰsuru]. The [ʰ] in the transcription means that the /k/ is released (perhaps into a voiceless vowel) before the /s/ (cf. Beckman and Shoji, 1984). Although this is their auditory impression to an English-speaking phonetician, it must be kept in mind that there is still orthographic support for the presence of the "underlying" moras despite the audible weakening of the underlying syllables. The issue here is what the pattern of total word durations is. If words have an integral number of moras and if each mora tends to have the same duration as every other mora, then we should expect that the word durations will get longer by a constant amount independently of either the segmental content or syllabic structure of the words.

The subjects were four native speakers of Japanese studying as undergraduates at Indiana University. They were asked to read these sentences from a list containing three tokens of each sentence at a comfortable tempo. After some practice readings, they were recorded in a quiet room in the Phonetics Laboratory. The recordings were analyzed on a sound spectrograph and the duration of each test as well as various component segmental units were measured.

[Figure: word duration plotted against NUMBER OF MORAS (1 to 5) for the two word series, ra-ku-da-ga-si and si-ta-ku-su-ru.]

FIG. 1. Duration of the test words from experiment 1 for the two series of words pooled across speakers. The words differ in the number of moras per word; the two series differ in segmental content.


(Port et al., 1987)


Stress and Intonation

Tone vs. Stress

Languages can be divided into two types based on their prosodic system: stress languages and tone languages.

A language has tone if F0 plays a role distinguishing meaning between words. Remember that F0 reflects the frequency at which the vocal folds vibrate, measured in Hz.

A language has stress if two conditions are met (Hyman, 2006):

1 One syllable in the word has the greatest stress (stress is culminative).

2 There is a syllable with stress in every word (stress is obligatory).
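The two conditions can be read as simple checks over a word list. The sketch below assumes words are transcribed with dots between syllables and a leading ˈ on each primary-stressed syllable; that notation and the function names are conventions chosen for the example.

```python
def stressed_syllables(word):
    """Syllables of a dot-separated transcription that carry the ˈ mark."""
    return [s for s in word.split(".") if s.startswith("ˈ")]


def is_obligatory(words):
    """Every word has at least one stressed syllable."""
    return all(len(stressed_syllables(w)) >= 1 for w in words)


def is_culminative(words):
    """No word has more than one syllable with the greatest stress."""
    return all(len(stressed_syllables(w)) <= 1 for w in words)


english = ["ˈæ.ɾɪk", "ɔ.ɾə.ˈmæ.ɾɪk", "kə.ˈtʰæs.tɹə.fi"]
print(is_obligatory(english), is_culminative(english))  # True True
```

A word list in which some words carry no stress mark at all would fail the obligatoriness check, and one with two primary stresses in a single word would fail culminativity.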

Stress

In most stress languages, stressed syllables have higher pitch than unstressed syllables. The duration of the stressed syllable is often longer (as in English, Polish) and its loudness may also be greater.

In English, French, Russian, and Estonian unstressed syllables have reduced vowels, or a reduced set of vowels, e.g. Russian [i, e, a, o, u] in stressed syllables but [i, u, ə] in unstressed syllables.

But... stress can be a tricky thing, especially in larger prosodic contexts (compounds, phrases, sentences).

Acoustics

F0 (or pitch) is always used to distinguish between tones in a tone language. It is also often used to distinguish between unstressed and stressed syllables in a stress language.

In addition to F0, other acoustic characteristics are used to distinguish stress, including length and vowel reduction.

Because tone and stress share acoustic properties, one cannot simply categorize languages by looking at their acoustics. We must also look at how the prosody behaves in words (phonology).

In context, stress is probably better defined as the syllable in a word where an intonational peak (accent) can occur. This is more of an abstract phonological definition than a phonetic one.

[Figure: the word “originally”, segmented as ə ɹ ɪ dʒ ə n l i.]

Stress in English

Word             IPA                  Word             IPA
‘attic’          [ˈæ.ɾɪk]             ‘article’        [ˈaɹ.ɾɪ.kɫ]
‘automatic’      [ɔ.ɾə.ˈmæ.ɾɪk]       ‘articulate’     [aɹ.ˈtʰɪk.ju.lɪt]
‘catastrophe’    [kə.ˈtʰæs.tɹə.fi]    ‘catastrophic’   [kʰæ.ɾəs.ˈtɹa.fɪk]

Stressed syllables in English are often longer than unstressed syllables and have higher pitch.

Unstressed syllables in English often are produced with a reduced vowel, [ə].

In these words, there is always a single syllable that is more prominent than all the others. This is a property of a stress language.

Stress “types”

Stress languages can be divided into two types:

In fixed stress languages, the position of the stressed syllable is predictable.

In variable stress languages, the position of the stressed syllable is not predictable.

Fixed stress: Polish

Word          Gloss           Word                Gloss
ˈklu.bu       ‘club.GEN’      klu.ˈbɔ.vi          ‘club.DAT’
ˈdum.nɨ       ‘proud.NOM’     dum.ˈnɛ.gɔ          ‘proud.DAT’
ˈzvjɛ.ʐɛʊ     ‘animal.NOM’    zvjɛ.ʐɛn.ˈta.mi     ‘animal.INST.PL’
ˈvrot.swaf    ‘Wrocław’       ve vrot.ˈswa.vju    ‘in Wrocław’

Stress always falls on the penultimate (second from last) syllable.
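Because the position is predictable, penultimate stress can be assigned by rule from the syllable count alone. A minimal sketch, assuming words arrive already divided into syllables; treating a monosyllable as stressed on its only syllable is an assumption made here, not something the slide states.

```python
def assign_penultimate_stress(syllables):
    """Mark the second-to-last syllable with ˈ and join with dots."""
    if not syllables:
        return ""
    stressed_index = max(len(syllables) - 2, 0)  # monosyllable: only syllable
    return ".".join("ˈ" + s if i == stressed_index else s
                    for i, s in enumerate(syllables))


print(assign_penultimate_stress(["klu", "bu"]))        # ˈklu.bu
print(assign_penultimate_stress(["klu", "bɔ", "vi"]))  # klu.ˈbɔ.vi
print(assign_penultimate_stress(["dum", "nɛ", "gɔ"]))  # dum.ˈnɛ.gɔ
```

No comparable rule can be written for a variable stress language, where stress position has to be stored with each word or word form.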

Variable stress: Spanish

Stress is used in Spanish morphology, where changes in stress indicate changes in tense/person/mood. Despite this, there are many uninflected words where stress is also unpredictable.

Word          Gloss                   Word               Gloss
ˈto.ko        ‘I touch’               to.ˈko             ‘He/She touched’
to.ka.ˈɾa     ‘He/She will touch’     to.ˈka.ɾa          ‘I/he/she was touching’ (subj.)
ˈa.gi.la      ‘eagle’                 toɾ.ˈti.ja         ‘tortilla’
pe.ˈɾi.ko     ‘parakeet’              pe.ɾi.ˈfe.ɾi.ko    ‘peripheral’

Languages differ substantially in what phonetic cues they use to mark stress differences.

English uses a combination of F0, duration, and vowel quality. Intensity is used to a lesser extent.

Spanish uses mainly F0 and duration to a lesser extent, but not vowel quality.

The location of F0 peak is frequently delayed. Cross-linguistically, it tends to reach its peak on the following syllable.

Spanish: [de.teɾ.ˈmi.no la masa] ‘determino la masa...’

Spanish: [de.teɾ.ˈmi.no kom.pla.ˈsi.ða] ‘determino complacida...’

Intonation

Intonation refers to the use of pitch melodies over an utterance to convey discourse-level meaning, e.g. questions, statements, contradiction, etc.

An intonational contour is applied across an utterance, which may consist of any number of syllables.

Speech acts are conveyed by intonational means in many languages, such as English, Spanish, Polish, etc.

Statements generally have a falling pitch contour, where declination occurs throughout the utterance and is most pronounced in the final syllable.

Questions generally have a rising pitch contour.

Statements and questions

[Figure: the statement “I spent nine dollars ... smoothie at Claire’s”; x-axis: Time (s), 0 to 2.876.]

[Figure: the question “Were you planning to come to the ... tomorrow night?”; x-axis: Time (s), 0 to 2.295.]

Other meanings

But there is a lot of additional meaning that can be conveyed with intonational contours, including counterfactuality, disbelief, etc.

[Figure: the utterance “I spent nine dollars ... a smoothie at Claire’s”; x-axis: Time (s), 0 to 2.786.]

Contrastive focus, for instance, can be placed on almost any word in the utterance, e.g. “I bought the red shoes.” vs. “I bought the red shoes.”, with the focus falling on a different word in each.

Downstep

A general pattern in English intonation is the use of downstepping intonation, where every other syllable receives a slightly lower F0 than the preceding one.

[Figure: the phrase “... little bit of chocolate”; x-axis: Time (s), 0 to 2.093.]

This type of pattern suggests that the units for organizing pitch targets in English are feet (two syllable units) instead of just syllables. We organize an utterance like ‘I wish I had a little bit of chocolate.’ into trochaic (stressed + unstressed) feet: (I) (wísh I) (hád a) (líttle) (bít of) (chócolate).
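The footing above can be reproduced with a short sketch: start a new foot at every stressed syllable and attach following unstressed syllables to it. The syllable division and stress marks are supplied by hand, and the F0 values for the downstepped targets (220 Hz start, 10 Hz step) are hypothetical numbers chosen only to show the stepwise lowering.

```python
def parse_feet(syllables):
    """Group (text, is_stressed) pairs into feet that begin at each stress."""
    feet = []
    for text, is_stressed in syllables:
        if is_stressed or not feet:
            feet.append([text])      # a stressed syllable opens a new foot
        else:
            feet[-1].append(text)    # unstressed syllables join the open foot
    return feet


def downstepped_targets(n_feet, start_hz, step_hz):
    """One F0 target per foot, each lower than the last by step_hz."""
    return [start_hz - i * step_hz for i in range(n_feet)]


utterance = [("I", False), ("wish", True), ("I", False), ("had", True),
             ("a", False), ("lit", True), ("tle", False), ("bit", True),
             ("of", False), ("choc", True), ("late", False)]

feet = parse_feet(utterance)
print(feet)
print(downstepped_targets(len(feet), start_hz=220, step_hz=10))
```

The first call yields six feet, matching the bracketing in the text, and the second assigns one successively lower pitch target to each of them.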

Stress and intonation

When we look for stressed syllables in an utterance, it is often difficult to find them from their known acoustic properties.

This suggests that stress is not simply a set of acoustic properties, but a structural property of syllables in words. When intonational pitch accents fall on a word, they fall on the stressed syllable.

Key topics

Syllables
Moras
Sonority
Fixed stress
Variable stress
Intonation
Downstep and feet

Abramson, A. S. (1974). Experimental phonetics in phonology: Vowel duration in Thai. Pasaa, 4:71–90.

Boersma, P. and Weenink, D. (2013). Praat: doing phonetics by computer [computer program]. www.praat.org.

Gandour, J. T., Potisuk, S., and Harper, M. P. (1996). Effects of stress on vowel length in Thai. In The Fourth International Symposium on Language and Linguistics, pages 95–103. Institute of Language and Culture for Rural Development, Mahidol University.

Goldstein, L., Byrd, D., and Saltzman, E. (2006). The role of the vocal tract gestural action units in understanding the evolution of phonology. In Action to Language via the Mirror Neuron System, pages 215–249. New York: Cambridge University Press.

Ham, W. H. (2001). Phonetic and Phonological Aspects of Geminate Timing. Outstanding Dissertations in Linguistics. Routledge.

Hillenbrand, J., Getty, L. A., Clark, M. J., and Wheeler, K. (1995). Acoustic characteristics of American English vowels. Journal of the Acoustical Society of America, 97(5):3099–3111.

Hyman, L. M. (2006). Word-prosodic typology. Phonology, 23:225–257.

Ladefoged, P. (1971). Preliminaries to Linguistic Phonetics. Chicago, University of Chicago.

Maddieson, I. (2004). Timing and Alignment: A Case Study of Lai. Language and Linguistics, 5(4):729–755.

Port, R. F., Dalby, J., and O’Dell, M. (1987). Evidence for mora timing in Japanese. Journal of the Acoustical Society of America, 81(5):1574–1585.

Roengpitya, R. (2002). A historical and perceptual study of vowel length in Thai. In Macken, M. A., editor, Papers from the 10th Annual Meeting of the Southeast Asian Linguistics Society, pages 353–366. Arizona State University, Program for Southeast Asian Studies.
