Prosody and Suprasegmentals
Dr. Christian DiCanio
cdicanio@buffalo.edu
University at Buffalo
11/24/15
DiCanio (UB) Suprasegmentals 11/24/15 1 / 45
Suprasegmentals
Suprasegmental contrasts refer to those aspects of the speech signal which are not mainly defined by the actions of the oral articulators. Tone, stress, intonation, and length may be “overlaid” on a string of speech sounds.
[Figure: spectrogram of “Massachusetts” [mæ sə tʃu sɪts]. Axes: Frequency (Hz), 0–4500; Time (s), 0–1.253. Stress by syllable: secondary, none, primary, none.]
Considerations
Prosodic units
Stress
Length
Intonation/Prosody
Suprasegmentals
Suprasegmentals refers either to:
1. the acoustic dimensions of the speech signal not directly produced by the oral articulators (F0, voice quality), the temporal aspects of the speech signal (duration), and the “strength” of such articulations (intensity, hyperarticulation)
2. the phonological categories which utilize such dimensions (stress, tone, length)
Of relevance here is that such dimensions extend beyond the segment or function independently from segmental units.
Acoustics of F0
Acoustics of F0 and pitch
Physically, F0, or fundamental frequency, refers to the greatest common divisor of the component frequencies in a periodic complex wave. Thus, if we have a complex wave with three components (300 Hz, 500 Hz, 700 Hz), the fundamental frequency is 100 Hz.
The pitch of a sound corresponds to its fundamental frequency (F0), but the two are distinct. Pitch is often perceived as the difference in frequency between successive harmonics of the F0. Thus, we can hear a pitch even if a fundamental frequency is absent.
The pitch range in humans varies. For men, it is usually between 75 and 250 Hz. For women, it is usually between 120 and 350 Hz.
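As a quick check of the three-component example above, a minimal Python sketch can compute the fundamental as the greatest common divisor of the component frequencies. The function name fundamental_frequency is an illustrative assumption, and the approach only applies to integer-valued frequencies in Hz.

```python
from functools import reduce
from math import gcd


def fundamental_frequency(components_hz):
    """Return the greatest common divisor of integer component frequencies (Hz)."""
    return reduce(gcd, components_hz)


# The slide's example: components at 300, 500, and 700 Hz.
f0 = fundamental_frequency([300, 500, 700])
print(f0)  # 100
```

Note that 100 Hz itself is not among the components, which is the sense in which a fundamental can be physically absent from the signal.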
Measurement of F0
While we can measure F0 manually by examining the inverse of period duration, we rely on algorithms in programs like Praat (Boersma and Weenink, 2013) to determine it automatically for us.
Typically, autocorrelation is used to determine F0. This method uses a roving window (3 × 1/F0min in Praat) over which the correlation between successive pitch periods is calculated.
Within the window, many different components of the signal might correlate with each other, including pieces within a single period of a complex wave or components across several periods.
The greatest correlation should be across subsequent pitch periods.
Autocorrelation varies the distance between windows from a short lag to a long lag, both defined by the F0 range. The best correlation across windows is usually at a distance of one period, which then defines the F0 for that period.
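The lag search described above can be sketched in a few lines of Python. This is a simplified, hypothetical illustration and not Praat's actual algorithm: it analyzes a single frame, compares equal-length stretches of the signal at every candidate lag, and converts the best lag to F0. The function name estimate_f0, the 8000 Hz sample rate, and the 75–500 Hz search range are assumptions chosen for the example; the test signal is a synthetic complex wave with components at 300, 500, and 700 Hz.

```python
import math


def estimate_f0(signal, sample_rate, f0_min, f0_max):
    """Estimate F0 (Hz) of one analysis frame by normalized autocorrelation."""
    min_lag = math.ceil(sample_rate / f0_max)   # shortest candidate period (samples)
    max_lag = math.floor(sample_rate / f0_min)  # longest candidate period (samples)
    n = len(signal) - max_lag                   # samples compared at every lag
    if n <= 0:
        raise ValueError("frame too short for the requested F0 floor")
    base = signal[:n]
    base_energy = sum(x * x for x in base)
    scores = {}
    for lag in range(min_lag, max_lag + 1):
        shifted = signal[lag:lag + n]
        cross = sum(a * b for a, b in zip(base, shifted))
        energy = math.sqrt(base_energy * sum(x * x for x in shifted))
        scores[lag] = cross / energy if energy else 0.0
    best = max(scores.values())
    # Among near-ties, take the shortest lag.
    best_lag = min(lag for lag, r in scores.items() if r >= best - 1e-6)
    return sample_rate / best_lag


sample_rate = 8000
f0_min, f0_max = 75, 500
frame_len = round(3 * sample_rate / f0_min)  # window of 3 x 1/F0min, here 320 samples
frame = [
    sum(math.sin(2 * math.pi * f * i / sample_rate) for f in (300, 500, 700))
    for i in range(frame_len)
]
f0 = estimate_f0(frame, sample_rate, f0_min, f0_max)
print(f0)  # 100.0
```

The winning lag is 80 samples, one full period of the complex wave at this sample rate, so the estimate is 8000 / 80 = 100 Hz.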
F0 tracking is statistical, though: the highest correlation “wins” and determines the F0 of the signal. Autocorrelation favors the shortest lag between windows and this can cause problems.
Pitch-doubling occurs when the highest correlation across windows is found at a distance less than one period.
Pitch-halving occurs when the highest correlation across windows is found at a distance greater than one period.
These patterns are typically more common in non-modal phonation types or when the F0 range is incorrectly defined.
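The effect of an incorrectly defined F0 range can be illustrated with a small, hypothetical Python sketch (again a simplification, not Praat's algorithm). The synthetic signal has a true F0 of 200 Hz; when the search ceiling is set below that value, the one-period lag is excluded and the two-period lag wins, which halves the estimate. The function names, sample rate, and range settings are assumptions made for the example, and only halving is demonstrated here.

```python
import math


def best_lag(frame, min_lag, max_lag):
    """Shortest lag (samples) whose normalized correlation is within 1e-6 of the maximum."""
    n = len(frame) - max_lag

    def corr(lag):
        a, b = frame[:n], frame[lag:lag + n]
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        return num / den if den else 0.0

    scores = {lag: corr(lag) for lag in range(min_lag, max_lag + 1)}
    top = max(scores.values())
    return min(lag for lag, r in scores.items() if r >= top - 1e-6)


def track(frame, sample_rate, floor_hz, ceiling_hz):
    """Convert the winning lag within the given F0 range to an F0 estimate in Hz."""
    lag = best_lag(frame,
                   math.ceil(sample_rate / ceiling_hz),
                   math.floor(sample_rate / floor_hz))
    return sample_rate / lag


sample_rate = 8000
# Complex wave with harmonics at 200, 400, and 600 Hz (true F0 = 200 Hz).
frame = [
    sum(math.sin(2 * math.pi * f * i / sample_rate) for f in (200, 400, 600))
    for i in range(320)
]
good = track(frame, sample_rate, 75, 500)  # range contains the true period
bad = track(frame, sample_rate, 75, 150)   # ceiling set below the true F0
print(good, bad)  # 200.0 100.0
```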
Duration
Like pitch, duration is used as a suprasegmental property. In languages like English and Polish, it is used to mark stress.
Duration can be used for segmental contrasts (singletons-geminates). Yet, even in languages with contrastive length, duration may be used to indicate stress.
Other things, apart from stress, also influence vowel length, like word size or consonant voicing.
Long vowels in multisyllabic words
Duration (in msec) of /aː/ in the sequence /taːt/ in a set of Hungarian words (Tarnóczy, 1965):
Duration (msec)   Word
210               /taːt/
180               /taːtoɡ/
140               /taːtoɡɒt/
120               /taːtoɡɒtoːk/
110               /taːtoɡɒtoːknɒk/
The more syllables in the word, the shorter the initial “long” vowel.
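To make the size of this shortening concrete, the short Python sketch below expresses each duration from the Hungarian list as a percentage of the monosyllabic value. The variable names are illustrative assumptions; the numbers are the five durations listed above, keyed by the number of syllables in the word.

```python
# Number of syllables in the word -> duration (msec) of the initial long vowel.
durations_ms = {1: 210, 2: 180, 3: 140, 4: 120, 5: 110}

baseline = durations_ms[1]
relative = {n: round(100 * d / baseline, 1) for n, d in durations_ms.items()}
print(relative)  # {1: 100.0, 2: 85.7, 3: 66.7, 4: 57.1, 5: 52.4}
```

In the five-syllable word the vowel is roughly half as long as in the monosyllable.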
There is contrastive vowel length in Levantine Arabic (Ham, 2001), but word size still has an effect on vowel length.
Thai vowel length contrast (Abramson, 1974; Roengpitya, 2002)
Short   Gloss         Long    Gloss
pàk     to stick in   pààk    mouth
cìp     to sip        cììp    to fold
baŋ     to hide       baaŋ    some
òt      to abstain    òòt     to lament
kham    dusk          khaam   to cross
than    on time       thaan   alms
Vowel length is neutralized in unstressed syllables in Thai. So, stress is not simply indicated by acoustic duration in the language, but by the presence of a length contrast.
The effect of stress on vowel length in Thai (Gandour et al., 1996).
Intrinsic Duration.
Vowel length is tied to vowel height (low vowels tend to be longer than high vowels).
Intrinsic duration (ms) in Swedish (Elert, 1964):
Height   Vowels      Long   Short
high     /i y u/     140    95
mid      /e ! "/     155    103
low      /æ œ #/     164    111
American English vowel duration (Hillenbrand et al., 1995)
/i/    282 ms    /u/    273 ms
/ɪ/    226 ms    /ʊ/    229 ms
/eɪ/   300 ms    /oʊ/   300 ms
/ɛ/    226 ms    /ɐ/    216 ms
/æ/    311 ms    /ɔ/    318 ms
/a/    300 ms    /ɹ/    297 ms
Lower vowels are longer than higher vowels.
Lax vowels /ɪ, ɛ, ʊ, ɐ/ are shorter than non-lax vowels, with a difference ratio of 1:1.33.
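The 1:1.33 ratio can be recovered from the durations listed above. The Python sketch below assumes the ratio compares the mean duration of the four lax vowels with the mean of the remaining eight entries; that grouping is an assumption about how the figure was derived, not something the slide states.

```python
from statistics import mean

# Durations (ms) from the list above.
lax_ms = [226, 229, 226, 216]                          # the four lax vowels
non_lax_ms = [282, 273, 300, 300, 311, 318, 300, 297]  # the remaining eight entries

ratio = mean(non_lax_ms) / mean(lax_ms)
print(round(ratio, 2))  # 1.33
```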
Loudness
Stress differences in languages also sometimes correlate with differences in loudness, but this is uncommon. In English, stressed syllables are often no louder than unstressed syllables (Ladefoged, 1971).
Loudness often does vary with respect to vowel quality or pitch level.
Intrinsic Loudness - low vowels are louder.
Vowel   English: VU (volume unit) meter readings   Hungarian: dB relative to /a/
        (Lehiste & Peterson, 1959)                 (Fónagy, 1966)
/i/     80.1                                       -9.7
/e/     81.1                                       -3.6
/a/     85.7                                       0
/o/     83.5                                       -7.2
/u/     80.4                                       -12.3
[Figure: Intensity profile of 5 different tones in Trique. Axes: Time (normalized), 2–10; Intensity (dB), 65–85.]
Prosodic units
Despite the fact that we can describe speech articulations in terms of segments, there is actually little evidence that this is the main unit we use for speech planning or organizing our speech.
Larger sized units, like syllables or moras, are the organizational frames over which suprasegmental properties apply.
The syllable is a unit of timing in speech, but it is ill-defined in purely acoustic/articulatory terms.
Tap test - syllables and stress, e.g. “Massachusetts”
What’s a syllable?
Early work by Stetson proposed that one could identify the syllable in terms of “chest pulses”, though this was disproved by Ladefoged in 1967.
There are no individual pulses in the chest cavity which correspond to distinct syllables in speech.
A quasi-acoustic definition relies on sonority, reflecting the relative intensity of a speech sound. Syllables are defined as having a rise in sonority and a subsequent fall, e.g. [sɪt] low sonority + high sonority + low sonority; [snoʊ] lowest sonority + higher sonority + highest sonority.
All syllables have a “peak” in prominence.
Problems with sonority
How do we determine whether a particular sound is louder, though? Is [z] louder than [d]? How do syllabic sonorants count, e.g. is “prism” one or two syllables?
In languages like Polish, where sonorants can be part of clusters, it is less clear how to define different syllables in terms of sonority, e.g. [ˈjabwkɔ] ‘apple’ is disyllabic (due to stress), but where does [w] belong?
There are many disagreements about the status of medial consonants, e.g. “fellow.”
Tautological with respect to prominence: a sound is prominent if it forms the peak of a syllable, but a syllable is present if there is a peak in prominence.
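The sonority-based definition and some of the problems listed above can be made concrete with a small Python sketch that counts sonority peaks in a transcribed word. The numeric sonority ranks and the function name are hypothetical choices made only for illustration; published sonority scales differ in their details.

```python
# Hypothetical sonority ranks (higher = more sonorous); real scales vary by author.
SONORITY = {
    "p": 1, "b": 1, "t": 1, "d": 1, "k": 1,  # stops
    "s": 2, "z": 2,                          # fricatives
    "m": 3, "n": 3,                          # nasals
    "ɹ": 4, "l": 4,                          # liquids
    "j": 5, "w": 5,                          # glides
    "ɪ": 6, "a": 6, "ɔ": 6, "oʊ": 6,         # vowels and diphthongs
}


def count_sonority_peaks(segments):
    """Count segments that are more sonorous than both of their neighbors."""
    ranks = [SONORITY[s] for s in segments]
    peaks = 0
    for i, rank in enumerate(ranks):
        left = ranks[i - 1] if i > 0 else 0
        right = ranks[i + 1] if i < len(ranks) - 1 else 0
        if rank > left and rank > right:
            peaks += 1
    return peaks


print(count_sonority_peaks(["s", "ɪ", "t"]))                 # 1
print(count_sonority_peaks(["s", "n", "oʊ"]))                # 1
print(count_sonority_peaks(["p", "ɹ", "ɪ", "z", "m"]))       # 2
print(count_sonority_peaks(["j", "a", "b", "w", "k", "ɔ"]))  # 3
```

Under these assumed ranks, “prism” comes out with two peaks because the final [m] outranks [z], and the Polish word comes out with three peaks even though it is disyllabic, which is exactly the kind of mismatch the slide points to.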
Solution: planning and coordination
A better solution is one where the syllable is defined as a unit of speech planning. We plan our motor actions in syllable-sized chunks.
[Excerpt from Goldstein et al. (2006), p. 230:]
[...] (Browman and Goldstein, 2000), that multiple, competing coupling relations can be specified in the network of oscillators in the coupling graph. For example, in the case of “spot,” the oral constriction gestures of /s/ and /p/ are coupled in-phase to the vowel gesture and simultaneously anti-phase to one another, as shown in the coupling graph in Fig. 7.8. The coupled oscillator planning model (Nam and Saltzman, 2003) predicts that the onset of the vowel gesture should occur midway between the onset of the tongue tip gesture for /s/ and the lip gesture for /p/. As shown in Fig. 7.10 for the phrase “pea spots,” kinematic data (for the same speaker as in Fig. 7.9) supports this prediction. Nam and [...]
Figure 7.9 Time functions of vocal tract variables, as measured using X-ray microbeam data, for the phrase “pea pots” showing the in-phase (synchronous within 25 ms) coordination of the lip gesture for the /p/ in “pots” and the /a/ gesture for the vowel in “pots.” Tract variables shown are lip aperture (distance between upper and lower lips), which is controlled for lip closure gestures (/p/ in this example) and tongue tip constriction degree (distance of the tongue tip from the palate), which is controlled in tongue tip gestures (/t/ and /s/ in this example). Also shown is the time function for the distance of the tongue body from the palate, which is small for /i/ and large for the vowel /a/, when the tongue is lowered and back into the pharynx. (The actual controlled tract variable for the vowel /a/ is the degree of constriction of the tongue root in pharynx, which cannot be directly measured using a technique that employs transducers on the front of the tongue only. So distance of the tongue body from the palate is used here as a rough index of tongue root constriction degree.) Boxes delimit the times of presumed active control for the oral constriction gestures for /p/ and /a/. These are determined algorithmically from the velocities of the observed tract variables. The left edge of the box represents gesture onset, the point in time at which the tract variable velocity towards constriction exceeds some threshold value. The right edge of the box represents the gesture release, the point in time at which velocity away from the constricted position exceeds some threshold. The line within the box represents the time at which the constriction target is effectively achieved, defined as the point in time at which the velocity towards constriction drops below threshold.
(Goldstein et al., 2006)
The syllable and suprasegmentals
The syllable is also the target of suprasegmental contrasts. Hakha Lai has three tones: falling, rising, and low. The same trajectory occurs across vowels preceding a stop coda as on the entire rime (VC) which includes a sonorant coda.
[Figure: page excerpt from Maddieson (2004), including its Figure 9: F0 in target words with short vowels and falling tones. Axes: F0 (Hz), 100–160; Time (ms), 100–900.]
(Maddieson, 2004)
The syllable duration is the constant timing unit, so long vowels occur with a short coda and short vowels occur with a long coda (Maddieson, 2004).
Moras
In many languages, there is evidence of a different timing unit, called the mora.
[Excerpt from Port et al. (1987), p. 1575:]
TABLE I. The possible segment configurations for the mora.
Number of moras   Word         English gloss
1                 da           to be
1                 hi           fire
2                 ko-re        this
2                 i-chi        one
3                 ko-ko-ro     heart
3                 Ke-i-ko      girl's name
3                 Ho-n-da      family name
3                 cho-t-to     a little bit
4                 Kyu-u-de-n   Imperial Palace
[...] does provide some specific constraint on articulatory timing in Japanese. The results below seem to support a somewhat freer and more abstract view of the temporal properties of the mora but suggest that it is inherently temporal, nevertheless. The following experiments examine this concept by exploring interactions of the mora with other phonological properties of Japanese.
I. EXPERIMENT 1: WORD LENGTH VERSUS NUMBER OF SYLLABLES
Since words should always begin and end at mora boundaries, one of the simplest predictions of the mora hypothesis is that word durations should come in steps that are integral multiples of the duration of a mora at a particular speaking tempo. Thus, if we start with a one-mora word and add successive moras to it, the total word duration should increase by constant amounts in each case. Note that this is not generally true in English (Lehiste, 1972; Port, 1981; Klatt, 1976) where the distinction between stressed and unstressed syllables means that addition of an unstressed syllable adds only a little to total word duration, and where further (unstressed) syllables add less and less to the duration of the word. This experiment tested the prediction that the number of moras in a word should control word duration no matter what the number of syllables is. Thus it is a fairly simple and direct test of the mora hypothesis.
A. Methods
Sentences were constructed from the words on the following list. Two sets of words were used in an attempt to average across any effects due to individual segments. They are mostly nonsense words but are plausible Japanese. The words were constructed so that those in the left column would have a syllable for each new mora, but the list in the right-hand column should undergo several phonological rules that might be expected to reduce the number of syllables.
ra           si [ʃi]
raku         sita [ʃta]
rakuda       sitaku [ʃtaku]
rakudaga     sitakusu [ʃtakʰsu]
rakudagasi   sitakusuru [ʃtakʰsuru]
The words were embedded in a constant carrier sentence Kore wa __ desu, "This is __" and written in Japanese orthography. The words were constructed with the help of a native speaker to make sure that all sounded like possible Japanese words. The first set contains an increasing number of syllables and, of course, moras. According to the mora hypothesis, then, each word should be longer than the preceding one by the same amount of time. Words of the second set undergo a well-known phonological rule in Japanese which results in the apparent deletion of several vowels (McCawley, 1968; Beckman and Shoji, 1984). The rule specifies that the high vowels /i, u/ are either deleted altogether (e.g., after /s, ʃ/) or else made voiceless (whispered) when they occur between two voiceless consonants. Thus a word that is phonemically (and orthographically) /sita/ is actually pronounced [ʃta], and /sitakusuru/ is pronounced [ʃtakʰsuru]. The [ʰ] in the transcription means that the /k/ is released (perhaps into a voiceless vowel) before the /s/ (cf. Beckman and Shoji, 1984). Although this is their auditory impression to an English-speaking phonetician, it must be kept in mind that there is still orthographic support for the presence of the "underlying" moras despite the audible weakening of the underlying syllables. The issue here is what the pattern of total word durations is. If words have an integral number of moras and if each mora tends to have the same duration as every other mora, then we should expect that the word durations will get longer by a constant amount independently of either the segmental content or syllabic structure of the words.
The subjects were four native speakers of Japanese studying as undergraduates at Indiana University. They were asked to read these sentences from a list containing three tokens of each sentence at a comfortable tempo. After some practice readings, they were recorded in a quiet room in the Phonetics Laboratory. The recordings were analyzed on a sound spectrograph and the duration of each test as well as various component segmental units were measured.
FIG. 1. Duration of the test words from experiment 1 for the two series of words pooled across speakers. The words differ in the number of moras per word; the two series differ in segmental content. [Plot: duration, 200–600, against NUMBER OF MORAS, 1–5, for the series ra-ku-da-ga-si and si-ta-ku-su-ru.]
(Port et al., 1987)
Stress and Intonation
Tone vs. Stress
Languages can be divided into two types based on their prosodic system: stress languages and tone languages.
A language has tone if F0 plays a role in distinguishing meaning between words. Remember that F0 reflects the frequency at which the vocal folds vibrate, measured in Hz.
A language has stress if two conditions are met (Hyman, 2006):
1. One syllable in the word has the greatest stress (stress is culminative).
2. There is a syllable with stress in every word (stress is obligatory).
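The two conditions can be phrased as a simple check on a transcribed word. The Python sketch below is a hypothetical illustration that assumes primary stress is marked with the IPA symbol ˈ and ignores secondary stress; the function name and return format are not taken from Hyman (2006).

```python
PRIMARY_STRESS = "ˈ"


def check_stress(transcription):
    """Report whether a transcribed word satisfies the two stress conditions."""
    n = transcription.count(PRIMARY_STRESS)
    return {
        "obligatory": n >= 1,   # the word has a stressed syllable
        "culminative": n <= 1,  # no more than one syllable carries the greatest stress
    }


print(check_stress("kə.ˈtʰæs.tɹə.fi"))  # {'obligatory': True, 'culminative': True}
print(check_stress("to.ko"))            # {'obligatory': False, 'culminative': True}
```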
Stress
In most stress languages, stressed syllables have higher pitch than unstressed syllables. The duration of the stressed syllable is often longer (as in English, Polish) and its loudness may also be greater.
In English, French, Russian, and Estonian, unstressed syllables have reduced vowels, or a reduced set of vowels, e.g. Russian [i, e, a, o, u] in stressed syllables but [i, u, ə] in unstressed syllables.
But... stress can be a tricky thing, especially in larger prosodic contexts (compounds, phrases, sentences).
Acoustics
F0 (or pitch) is always used to distinguish between tones in a tone language. It is also often used to distinguish between unstressed and stressed syllables in a stress language.
In addition to F0, other acoustic characteristics are used to distinguish stress, including length and vowel reduction.
Because tone and stress share acoustic properties, one cannot simply categorize languages by looking at their acoustics. We must also look at how the prosody behaves in words (phonology).
In context, stress is probably better defined as the syllable in a word where an intonational peak (accent) can occur. This is more of an abstract phonological definition than a phonetic one.
[Figure: “originally”, segmented as ə ɹ ɪ dʒ ə n l i]
Stress in English
Word            IPA                   Word             IPA
‘attic’         [ˈæ.ɾɪk]              ‘article’        [ˈaɹ.ɾɪ.kɫ̩]
‘automatic’     [ɔ.ɾə.ˈmæ.ɾɪk]        ‘articulate’     [aɹ.ˈtʰɪk.ju.lɪt]
‘catastrophe’   [kə.ˈtʰæs.tɹə.fi]     ‘catastrophic’   [kʰæ.ɾəs.ˈtɹa.fɪk]
Stressed syllables in English are often longer than unstressed syllables and have higher pitch.
Unstressed syllables in English are often produced with a reduced vowel, [ə].
In these words, there is always a single syllable that is more prominent than all the others. This is a property of a stress language.
Stress “types”
Stress languages can be divided into two types:
In fixed stress languages, the position of the stressed syllable is predictable.
In variable stress languages, the position of the stressed syllable is not predictable.
Fixed stress: Polish
Word          Gloss          Word               Gloss
ˈklu.bu       ‘club.GEN’     klu.ˈbɔ.vi         ‘club.DAT’
ˈdum.nɨ       ‘proud.NOM’    dum.ˈnɛ.gɔ         ‘proud.DAT’
ˈzvjɛ.ʐɛʊ     ‘animal.NOM’   zvjɛ.ʐɛn.ˈta.mi    ‘animal.INST.PL’
ˈvrot.swaf    ‘Wrocław’      ve vrot.ˈswa.vju   ‘in Wrocław’
Stress always falls on the penultimate (second from last) syllable.
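Because the position is predictable, fixed stress can be stated as a rule. The Python sketch below is a minimal illustration of penultimate stress assignment over pre-syllabified input; the function name is hypothetical, and stressing the only syllable of a monosyllabic word is an assumption added for completeness rather than something shown on the slide.

```python
def assign_penultimate_stress(syllables):
    """Join syllables with '.', placing the stress mark before the penultimate one."""
    if not syllables:
        raise ValueError("need at least one syllable")
    # Assumption: a monosyllable is stressed on its only syllable.
    target = max(len(syllables) - 2, 0)
    marked = ["ˈ" + s if i == target else s for i, s in enumerate(syllables)]
    return ".".join(marked)


print(assign_penultimate_stress(["klu", "bu"]))        # ˈklu.bu
print(assign_penultimate_stress(["klu", "bɔ", "vi"]))  # klu.ˈbɔ.vi
```

Adding a suffix syllable shifts the stress one syllable to the right, as in the paired forms above.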
Variable stress: Spanish
Stress is used in Spanish morphology, where changes in stress indicate changes in tense/person/mood. Despite this, there are many uninflected words where stress is also unpredictable.
Word         Gloss                 Word               Gloss
ˈto.ko       ‘I touch’             to.ˈko             ‘He/She touched’
to.ka.ˈɾa    ‘He/She will touch’   to.ˈka.ɾa          ‘I/he/she was touching’ (subj.)
ˈa.gi.la     ‘eagle’               toɾ.ˈti.ja         ‘tortilla’
pe.ˈɾi.ko    ‘parakeet’            pe.ɾi.ˈfe.ɾi.ko    ‘peripheral’
Languages differ substantially in what phonetic cues they use to mark stress differences.
English uses a combination of F0, duration, and vowel quality. Intensity is used to a lesser extent.
Spanish uses mainly F0 and, to a lesser extent, duration, but not vowel quality.
The location of the F0 peak is frequently delayed. Cross-linguistically, F0 tends to reach its peak on the following syllable.
Spanish: [de.teɾ.ˈmi.no la masa] ‘determino la masa...’
Spanish: [de.teɾ.ˈmi.no kom.pla.ˈsi.ða] ‘determino complacida...’
Intonation
Intonation refers to the use of pitch melodies over an utterance to convey discourse-level meaning, e.g. questions, statements, contradiction, etc.
An intonational contour is applied across an utterance, which may consist of any number of syllables.
Speech acts are conveyed by intonational means in many languages, such as English, Spanish, Polish, etc.
Statements generally have a falling pitch contour, where declination occurs throughout the utterance and is most pronounced in the final syllable.
Questions generally have a rising pitch contour.
Statements and questions
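[Figure: pitch track of the statement “I spent nine dollars on a smoothie at Claire’s.” Axes: Pitch (Hz), 60–180; Time (s), 0–2.876.]
[Figure: pitch track of the question “Were you planning to come to the party tomorrow night?” Axes: Pitch (Hz), 80–250; Time (s), 0–2.295.]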
Other meanings
But there is a lot of additional meaning that can be conveyed with intonational contours, including counterfactuality, disbelief, etc.
[Figure: pitch track of “I spent nine dollars on a smoothie at Claire’s.” Axes: Pitch (Hz), 60–180; Time (s), 0–2.786.]
Contrastive focus, for instance, can be placed on almost any word in the utterance, e.g. “I bought the red shoes.” vs. “I bought the red shoes.”, with the focus on a different word in each.
Downstep
A general pattern in English intonation is the use of downstepping intonation, where every other syllable receives a slightly lower F0 than the preceding one.
[Figure: pitch track of “I wish I had a little bit of chocolate.” Axes: Pitch (Hz), 80–220; Time (s), 0–2.093.]
This type of pattern suggests that the units for organizing pitch targets in English are feet (two syllable units) instead of just syllables. We organize an utterance like ‘I wish I had a little bit of chocolate.’ into trochaic (stressed + unstressed) feet: (I) (wísh I) (hád a) (líttle) (bít of) (chócolate).
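The footing above can be reproduced with a simple grouping rule: a new foot begins at every stressed syllable, and any unstressed syllable attaches to the foot before it. The Python sketch below is a hypothetical illustration; the syllable divisions and stress marks are supplied by hand, and “chocolate” is treated as two syllables so that it forms a single two-syllable foot.

```python
def group_into_feet(syllables):
    """Group (text, is_stressed) pairs into feet that each start at a stressed syllable."""
    feet = []
    for text, stressed in syllables:
        if stressed or not feet:
            feet.append([text])    # start a new foot (or an initial unstressed one)
        else:
            feet[-1].append(text)  # unstressed syllable joins the current foot
    return feet


utterance = [
    ("I", False), ("wish", True), ("I", False), ("had", True), ("a", False),
    ("lit", True), ("tle", False), ("bit", True), ("of", False),
    ("choc", True), ("late", False),
]
feet = group_into_feet(utterance)
print(" ".join("(" + " ".join(foot) + ")" for foot in feet))
# (I) (wish I) (had a) (lit tle) (bit of) (choc late)
```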
Stress and intonation
When we look for stressed syllables in an utterance, it is often difficult to find them from their known acoustic properties.
This suggests that stress is not simply a set of acoustic properties, but a structural property of syllables in words. When intonational pitch accents fall on a word, they fall on the stressed syllable.
Key topics
Syllables
Moras
Sonority
Fixed stress
Variable stress
Intonation
Downstep and feet
References
Abramson, A. S. (1974). Experimental phonetics in phonology: Vowel duration in Thai. Pasaa, 4:71–90.
Boersma, P. and Weenink, D. (2013). Praat: doing phonetics by computer [computer program]. www.praat.org.
Gandour, J. T., Potisuk, S., and Harper, M. P. (1996). Effects of stress on vowel length in Thai. In The Fourth International Symposium on Language and Linguistics, pages 95–103. Institute of Language and Culture for Rural Development, Mahidol University.
Goldstein, L., Byrd, D., and Saltzman, E. (2006). The role of the vocal tract gestural action units in understanding the evolution of phonology. In Action to Language via the Mirror Neuron System, pages 215–249. New York: Cambridge University Press.
Ham, W. H. (2001). Phonetic and Phonological Aspects of Geminate Timing. Outstanding Dissertations in Linguistics. Routledge.
Hillenbrand, J., Getty, L. A., Clark, M. J., and Wheeler, K. (1995). Acoustic characteristics of American English vowels. Journal of the Acoustical Society of America, 97(5):3099–3111.
Hyman, L. M. (2006). Word-prosodic typology. Phonology, 23:225–257.
Ladefoged, P. (1971). Preliminaries to Linguistic Phonetics. Chicago, University of Chicago.
Maddieson, I. (2004). Timing and Alignment: A Case Study of Lai. Language and Linguistics, 5(4):729–755.
Port, R. F., Dalby, J., and O’Dell, M. (1987). Evidence for mora timing in Japanese. Journal of the Acoustical Society of America, 81(5):1574–1585.
Roengpitya, R. (2002). A historical and perceptual study of vowel length in Thai. In Macken, M. A., editor, Papers from the 10th Annual Meeting of the Southeast Asian Linguistics Society, pages 353–366. Arizona State University, Program for Southeast Asian Studies.