Some Speech Basics: Phonetic Transcription, Context-dependent variation, and Intonation

Post on 30-Dec-2015


Jennifer J. Venditti
Postdoctoral Research Associate
Columbia Computer Science
12 September 2002

1. Phonetic Transcription

Spelling vs. Sounds

same spelling = different sounds
o: comb, tomb, bomb
oo: blood, food, good
c: court, center, cheese
s: reason, surreal, shy

same sound = different spellings
[i]: sea, see, scene, receive, thief
[s]: cereal, same, miss
[u]: true, few, choose, lieu, do
[ay]: prime, buy, rhyme, lie

combination of letters = single sound
ch: child, beach
th: that, bathe
oo: good, foot
gh: laugh

single letter = combination of sounds
x: exit, Texas
u: use, music

‘silent’ letters
k: knife, know
p: psycho, pterodactyl
e: moose, bone
gh: through

Figures 4.1 and 4.2: Jurafsky & Martin (2000), pages 94-95.

On-line pronunciation dictionaries

               phoneset        number of     English
               derived from    wordforms     variety
LDC PRONLEX    ARPAbet         90,694        American
CMUdict        ARPAbet         100,000       American
CELEX          IPA             160,595       British

Source: Jurafsky & Martin (2000), page 121.
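A minimal sketch of how an ARPAbet dictionary of this kind can be read and queried, shown here because it makes the "same spelling = different sounds" point concrete. It assumes the plain-text layout used by CMUdict (a word followed by its phones, with stress digits on vowels); the three sample entries and the helper name `parse_pron_dict` are illustrative only, not an excerpt from any of the dictionaries listed.

```python
# Sample lines in a CMUdict-style layout: a word, whitespace, then ARPAbet
# phones. Digits on vowels are lexical stress marks.
# These three entries are a hand-picked illustration (an assumption of this
# sketch), not a dictionary excerpt.
SAMPLE = """\
COMB  K OW1 M
TOMB  T UW1 M
BOMB  B AA1 M
"""


def parse_pron_dict(text):
    """Map each lowercase word to its list of phones, stress digits removed."""
    entries = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith(";;;"):
            continue  # skip blank lines and comment lines
        word, *phones = line.split()
        entries[word.lower()] = [p.rstrip("012").lower() for p in phones]
    return entries


prons = parse_pron_dict(SAMPLE)
for word in ("comb", "tomb", "bomb"):
    # The letter "o" is spelled the same but maps to a different vowel each time.
    print(word, prons[word])
```

Stripping the stress digits gives the lowercase, stress-free ARPAbet symbols used on these slides.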

Places of articulation

http://www.chass.utoronto.ca/~danhall/phonetics/sammy.html

labial

dental

alveolar

post-alveolar/palatal

velar

uvular

pharyngeal

laryngeal/glottal

Vocal fold vibration

[UCLA Phonetics Lab demo]

Articulatory parameters for English consonants (in ARPAbet)

PLACE OF ARTICULATION (columns) by MANNER OF ARTICULATION (rows)

          bilabial  labio-dental  inter-dental  alveolar  palatal  velar  glottal
stop      p b                                   t d                k g    q
fric.               f v           th dh         s z       sh zh           h
affric.                                                   ch jh
nasal     m                                     n                  ng
approx.   w                                     l/r       y
flap                                            dx

VOICING: voiceless / voiced (in each pair above, the voiceless phone comes first)
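The chart's three parameters (place, manner, voicing) can be stored as a small feature table, which is how phonological rules later pick out classes of sounds. The sketch below is one possible encoding, not the author's; the names `CONSONANTS` and `natural_class` are hypothetical, and placing `w` under bilabial and `l`/`r` under alveolar is an assumption that follows the usual simplified layout of such charts.

```python
# (place, manner, voiced) for each ARPAbet consonant on the chart.
# Placement of w, l, and r is an assumption of this sketch.
CONSONANTS = {
    "p": ("bilabial", "stop", False),      "b": ("bilabial", "stop", True),
    "t": ("alveolar", "stop", False),      "d": ("alveolar", "stop", True),
    "k": ("velar", "stop", False),         "g": ("velar", "stop", True),
    "q": ("glottal", "stop", False),
    "f": ("labio-dental", "fricative", False),
    "v": ("labio-dental", "fricative", True),
    "th": ("inter-dental", "fricative", False),
    "dh": ("inter-dental", "fricative", True),
    "s": ("alveolar", "fricative", False), "z": ("alveolar", "fricative", True),
    "sh": ("palatal", "fricative", False), "zh": ("palatal", "fricative", True),
    "h": ("glottal", "fricative", False),
    "ch": ("palatal", "affricate", False), "jh": ("palatal", "affricate", True),
    "m": ("bilabial", "nasal", True),
    "n": ("alveolar", "nasal", True),
    "ng": ("velar", "nasal", True),
    "w": ("bilabial", "approximant", True),
    "l": ("alveolar", "approximant", True),
    "r": ("alveolar", "approximant", True),
    "y": ("palatal", "approximant", True),
    "dx": ("alveolar", "flap", True),
}


def natural_class(place=None, manner=None, voiced=None):
    """Return the sorted phones matching every feature that is not None."""
    matches = []
    for phone, (p, m, v) in CONSONANTS.items():
        if place is not None and p != place:
            continue
        if manner is not None and m != manner:
            continue
        if voiced is not None and v != voiced:
            continue
        matches.append(phone)
    return sorted(matches)


print(natural_class(manner="stop", voiced=True))  # the voiced stops
print(natural_class(place="velar"))               # everything made at the velum
```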

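American English vowel space

[Figure: vowel chart with a FRONT-BACK horizontal axis and a HIGH-LOW vertical axis.]

Monophthongs: iy, ih, eh, ae, aa, ao, uw, uh, ah, ax, ix, ux

Diphthongs: ey, ow, aw, oy, ay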

[iy] vs. [uw]

(From a lecture given by Rochelle Newman)

[ae] vs. [aa]

(From a lecture given by Rochelle Newman)

Acoustic landmarks

“Patricia and Patsy and Sally”

[Figure: the utterance annotated with phone labels, including [p], [t], [sh], [s], [l], [n], [ix], [ih], [ax], [ae], and [iy].]

Articulators in action

“Why did Ken set the soggy net on top of his deck?”

(Sample from the Queen’s University / ATR Labs X-ray Film Database)

Exercise (1)

1. Write your name in:
(a) IPA.
(b) ARPAbet (if possible).

2. Choose one of the following triplets and transcribe each word in both IPA and ARPAbet.
cone, tomb, bottom
blood, fool, hook
court, race, cheese
reason, surreal, cash
thing, these, other
laugh, through, ghoul

Figures 4.1 and 4.2: Jurafsky & Martin (2000), pages 94-95.

IPA consonants

(Distributed by the International Phonetic Association.)

IPA vowels

(Distributed by the International Phonetic Association.)
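Since the deck uses both ARPAbet and IPA, a small lookup table shows how the two notations line up. This is a sketch using commonly cited correspondences; the table name `ARPABET_TO_IPA` and the function `to_ipa` are hypothetical, and a few choices (for example `ɹ` for `r`) are assumptions where sources differ.

```python
# Commonly cited ARPAbet-to-IPA correspondences for the phones on these slides.
# Some choices (e.g. "ɹ" for r) vary between sources and are assumptions here.
ARPABET_TO_IPA = {
    # vowels
    "iy": "i", "ih": "ɪ", "ey": "eɪ", "eh": "ɛ", "ae": "æ",
    "aa": "ɑ", "ao": "ɔ", "uh": "ʊ", "ow": "oʊ", "uw": "u",
    "ah": "ʌ", "ax": "ə", "ix": "ɨ", "ux": "ʉ",
    "ay": "aɪ", "aw": "aʊ", "oy": "ɔɪ",
    # consonants
    "p": "p", "b": "b", "t": "t", "d": "d", "k": "k", "g": "g",
    "q": "ʔ", "dx": "ɾ",
    "f": "f", "v": "v", "th": "θ", "dh": "ð",
    "s": "s", "z": "z", "sh": "ʃ", "zh": "ʒ", "h": "h", "hh": "h",
    "ch": "tʃ", "jh": "dʒ",
    "m": "m", "n": "n", "ng": "ŋ",
    "w": "w", "l": "l", "r": "ɹ", "y": "j",
}


def to_ipa(arpabet):
    """Convert a space-separated ARPAbet string to an IPA string."""
    return "".join(ARPABET_TO_IPA[phone] for phone in arpabet.split())


print(to_ipa("n ay f"))    # knife
print(to_ipa("jh ah jh"))  # judge
```

An unknown symbol raises `KeyError`, which is a deliberate choice here: a silent pass-through would hide transcription typos.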

2. Context-dependent phonetic variation

Context-dependent variation

What we would consider a single ‘sound’ can be pronounced differently depending on the phonetic context. For example, the phoneme /t/:

Figure 4.8: Jurafsky & Martin (2000), page 104.

Another regular alternation

I can ask [ay k ae n ae s k]
I can see [ay k ae n s iy]
I can bake [ay k ae m b ey k]
I can play [ay k ae m p l ey]
I can go [ay k ae ng g ow]
I can carry [ay k ae ng k ae r iy]

n → m / __ [+labial stop]
n → ng / __ [+velar stop]

(inopportune [n], insatiable [n], impervious [m], immortal [m], incoherent [ng], ingratitude [ng])
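The two assimilation rules can be sketched as a single pass over a phone sequence. This is an illustration under stated assumptions rather than a full phonological model: the function name `assimilate_nasal` is hypothetical, and counting `m` among the labial stops is an assumption motivated by the "immortal" example.

```python
# Contexts that trigger the rules. Treating the nasal m as a labial stop is an
# assumption of this sketch, motivated by the example "immortal [m]".
LABIAL_STOPS = {"p", "b", "m"}
VELAR_STOPS = {"k", "g"}


def assimilate_nasal(phones):
    """Apply n -> m / __ [+labial stop] and n -> ng / __ [+velar stop]."""
    out = list(phones)
    for i in range(len(out) - 1):
        if out[i] != "n":
            continue
        if out[i + 1] in LABIAL_STOPS:
            out[i] = "m"
        elif out[i + 1] in VELAR_STOPS:
            out[i] = "ng"
    return out


for phrase in ("ay k ae n s iy", "ay k ae n b ey k", "ay k ae n g ow"):
    print(" ".join(assimilate_nasal(phrase.split())))
```

The input strings use the unassimilated form of "can" throughout, so the output shows where the nasal changes and where it stays [n].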

English plurals

hiccup [p] hiccups      flood [d] floods
sock [k] socks          scab [b] scabs
habit [t] habits        frog [g] frogs
spoof [f] spoofs        comb [m] combs
hearth [th] hearths     grave [v] graves
                        lathe [dh] lathes
beach [ch] beaches      fool [l] fools
dish [sh] dishes        sewer [r] sewers
judge [jh] judges       pie [ay] pies
race [s] races          curfew [uw] curfews
axe [s] axes            sofa [ax] sofas
raise [z] raises

Phonological rules for Engl. plurals

Assume that the lexical form of plural is /z/.
Insertion: ∅ → ix / [+sibilant] ^ __ z #
Devoicing: z → s / [-voice] ^ __ #

Insertion before devoicing gives the correct forms:

              bus+PL           cape+PL         hen+PL
lexical:      /b ah s +z/      /k ey p +z/     /h eh n +z/
insertion:    b ah s +ix z     --              --
devoicing:    --               k ey p s        --
surface:      [b ah s ix z]    [k ey p s]      [h eh n z]

Devoicing before insertion gives a wrong form for bus+PL:

              bus+PL           cape+PL         hen+PL
lexical:      /b ah s +z/      /k ey p +z/     /h eh n +z/
devoicing:    b ah s s         k ey p s        --
insertion:    --               --              --
surface:      *[b ah s s]      [k ey p s]      [h eh n z]
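The effect of rule ordering can be checked with a short sketch that applies the two rules as functions over a phone list. The names (`insertion`, `devoicing`, `pluralize`) are hypothetical, the morpheme boundary is not modelled explicitly (the final phone is assumed to be the suffix), and the sibilant and voiceless sets are limited to the ARPAbet consonants used on these slides.

```python
SIBILANTS = {"s", "z", "sh", "zh", "ch", "jh"}
VOICELESS = {"p", "t", "k", "f", "th", "s", "sh", "ch"}


def insertion(phones):
    """Insert ix between a stem-final sibilant and the plural z."""
    if len(phones) >= 2 and phones[-1] == "z" and phones[-2] in SIBILANTS:
        return phones[:-1] + ["ix", "z"]
    return phones


def devoicing(phones):
    """Turn the plural z into s after a voiceless stem-final phone."""
    if len(phones) >= 2 and phones[-1] == "z" and phones[-2] in VOICELESS:
        return phones[:-1] + ["s"]
    return phones


def pluralize(stem, rules=(insertion, devoicing)):
    """Attach lexical /z/ to the stem, then apply the rules in the given order."""
    form = stem.split() + ["z"]
    for rule in rules:
        form = rule(form)
    return " ".join(form)


for stem in ("b ah s", "k ey p", "h eh n"):
    print(pluralize(stem))

# Reversing the order reproduces the starred form for "bus".
print(pluralize("b ah s", rules=(devoicing, insertion)))
```

With devoicing first, the plural /z/ has already become [s] by the time insertion looks for a `z`, so the vowel is never inserted.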

3. Intonation

Intonation makes the difference

A: I’d like to fly to Davenport, Iowa on TWA.
B: TWA doesn’t fly there ...
B1: They fly to Des Moines.
B2: They fly to Des Moines.

A: What types of foods are a good source of vitamins?
B1: Legumes are a good source of vitamins.
B2: Legumes are a good source of vitamins.

A1: I met Mary and Elena’s mother at the mall yesterday.
A2: I met Mary and Elena’s mother at the mall yesterday.

Intonation is about ...

Pitch
Melody, or “tune”
Alignment
Prominence and focus
Chunking, or “phrasing”
... and more ...

Vocal fold vibration

Physical: Fundamental frequency (F0), the rate of vibration of the vocal folds

Perceptual: Pitch

[Figure: perceived pitch plotted against fundamental freq.]

[UCLA Phonetics Lab demo]
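To make the physical side concrete, the sketch below estimates F0 from a short stretch of signal by autocorrelation: the lag at which the signal best matches a shifted copy of itself is one period of vibration. This is a swapped-in textbook technique, not something the slides describe; the function name `estimate_f0`, the 50-400 Hz search range, and the synthetic 200 Hz test tone are all assumptions of the example.

```python
import math


def estimate_f0(samples, sample_rate, f0_min=50.0, f0_max=400.0):
    """Estimate F0 in Hz as sample_rate / (lag with the highest autocorrelation).

    Only lags corresponding to f0_min..f0_max are searched, so the input must
    be longer than sample_rate / f0_min samples.
    """
    min_lag = int(sample_rate / f0_max)
    max_lag = int(sample_rate / f0_min)
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the signal with itself at this lag.
        score = sum(samples[n] * samples[n + lag]
                    for n in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# A synthetic stand-in for voiced speech: 50 ms of a 200 Hz sine at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(400)]
print(estimate_f0(tone, SAMPLE_RATE))
```

Real speech is noisier and only quasi-periodic, so practical pitch trackers add windowing, normalization, and voicing decisions on top of this idea.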

Pitch range

[from Prosody on the Web tutorial on pitch]

Differences can be due to physical size, gender, social identity, excitement level, linguistic factors, etc ...

English Pitch Accents

Certain words in the speech stream can be made structurally and perceptually prominent by the use of pitch accents.

Lenora works for Lucent.
*  *

Pitch accents are local pitch movements (e.g. rising, falling) or pitch maxima/minima that accompany these metrically strong syllables.

The intonational “tune” is the melody that is created by sequences of pitch accents over an utterance.

Intonational tunes: What do they mean?

[Tell me something about the world ...] Lenora works for Lucent.

Lenora works for Lucent. [... I hope she doesn’t have stock options.]

Lenora works for Lucent. [... Really? I wasn’t aware of that.]

[I’ve told you a million times ...] Lenora works for Lucent.

[See works by Bolinger, Ladd, Hirschberg ...]

Alignment differences cue “assertion” vs. “suggestion”

A: I’d like to fly to Davenport, Iowa on TWA.
B: TWA doesn’t fly there ...

[Figure: two pitch tracks of “they fly to Des Moines”, vertical axis from 50 to 400.]

Alignment with different words

A: What types of foods are a good source of vitamins?

B: LEGUMES are a good source of vitamins. (“narrow focus”)

# Legumes are a good source of VITAMINS.

Legumes are a good source of vitamins. (“broad focus”, with three pitch accents marked * * *)

Placement of focal accent

[Figures: three pitch tracks, vertical axis from 50 to 400, one for each accent placement.]

LEGUMES are a good source of vitamins

Legumes are a GOOD source of vitamins

Legumes are a good source of VITAMINS

The rise-fall tune (= “I assert this”) shifts locations.

Chunking, or “phrasing”

A1: I met Mary and Elena’s mother at the mall yesterday.

A2: I met Mary and Elena’s mother at the mall yesterday.

Phrasing can disambiguate

I met Mary and Elena’s mother at the mall yesterday

[Figure: pitch track, vertical axis from 50 to 400, labelled “Mary & Elena’s mother” and “mall”.]

One intonation phrase with relatively flat overall pitch range.

I met Mary and Elena’s mother at the mall yesterday

[Figure: pitch track, vertical axis from 50 to 400, labelled “Mary”, “Elena’s mother”, and “mall”.]

Separate phrases, with expanded pitch movements.

Lists of numbers, nouns

twenty.eight.five
ninety.four.three
seventy.three.seven
forty.seven.seven
seventy.seven.seven

coffee cake and cream
chocolate ice cream and cake
fish fingers and bottles
cheese sandwiches and milk
cream buns and chocolate

[from Prosody on the Web tutorial on chunking]

Exercise (2)

1. Sketch out an F0 contour of
Does Manitowoc have a bowling alley?
as uttered in the following two contexts:
(a) “I know Green Bay has a bowling alley, but ...”
(b) “I know Manitowoc has a theater, but ...”

2. Complete the sentence:
When Madonna sings the song ...
Describe the prosodic phrasing of your utterance.

3. How can phrasing help disambiguate the utterance:
that’s right at the traffic light