CS 551/651: Structure of Spoken Language Lecture 6: Phonological Processes John-Paul Hosom


CS 551/651:Structure of Spoken Language

Lecture 6: Phonological Processes

John-Paul Hosom

Fall 2008


Phonological Processes

• Phonemes undergo systematic variation depending on their context

• For example, forming the past tense:
    cause /k aa z/ → caused /k aa z d/
    talk /t aa k/ → talked /t aa k t/

  /d/ vs. /t/ is predictable based on voicing of the word-final phoneme

• Allophones can be viewed as systematic variations of phonemes that are a result of cultural and physiological processes, but do not distinguish the meaning of an utterance

• For example, /p/ vs. /ph/ in English is predictable: word- or syllable-initial voiceless stops are aspirated

    pit [ph ih t[h]]      tip [th ih p[h]]      kin [kh ih n]
    spit [s p ih t[h]]    stick [s t ih k[h]]   skin [s k ih n]
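The past-tense example above can be sketched as a small rule. This is only a minimal illustration: the `VOICELESS` inventory and the name `past_tense` are assumptions made for the sketch, and it covers just the /d/ vs. /t/ choice shown on the slide.

```python
# Voiceless phonemes in the slide's ASCII phoneme notation (assumed inventory).
VOICELESS = {"p", "t", "k", "f", "th", "s", "sh", "ch", "h"}

def past_tense(phonemes):
    """Append /t/ after a voiceless final phoneme, /d/ otherwise."""
    suffix = "t" if phonemes[-1] in VOICELESS else "d"
    return phonemes + [suffix]

print(past_tense(["k", "aa", "z"]))  # cause -> caused
print(past_tense(["t", "aa", "k"]))  # talk  -> talked
```

The suffix depends only on the voicing of the last phoneme, which is why the variation is predictable and need not be stored per word.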


Phonological Processes

/ph ih th th ih ph kh ih n/

/s p ih th s t ih kh s k ih n/


Phonological Processes

• Other types of phonetic processes: Assimilation, Deletion, Reduction, Insertion, Substitution, Me'tathesis (switching order of two phonemes)

• Assimilation: “A feature of one segment is shared by a neighboring segment”

• Examples of Assimilation:
    Nasalization of vowels before nasal consonants
    in- (negative prefix) becomes im- in words beginning with a bilabial consonant (imbalance, imperfect vs. indifferent, intolerance)
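The in-/im- alternation above can be sketched in a few lines. The `BILABIALS` set and the function name `negative_prefix` are assumptions for illustration; stems are given as phoneme lists and only the first phoneme is inspected.

```python
# Bilabial consonants that trigger the im- form (assumed inventory).
BILABIALS = {"p", "b", "m"}

def negative_prefix(stem_phonemes):
    """Return 'im' before a bilabial-initial stem, 'in' otherwise."""
    return "im" if stem_phonemes[0] in BILABIALS else "in"

print(negative_prefix(["b", "ae", "l", "ax", "n", "s"]))  # balance
print(negative_prefix(["t", "aa", "l", "er", "ax", "n", "s"]))  # tolerance
```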


Phonological Processes

• Assimilation may be due to coarticulation, or it may be language-specific, “arbitrary”:

  “word-final alveolar obstruent may take on place of articulation of following word-initial segment if word-initial segment is palato-alveolar”

    this /dh ih s/ + shop /sh aa ph/ → this shop /dh ih sh sh aa ph/
    this /dh ih s/ + fish /f ih sh/ → this fish /dh ih s f ih sh/
    this /dh ih s/ + thing /th ih ng/ → this thing /dh ih s th ih ng/

  also, depending on dialect, not within a word: misshapen /m ih s sh ey p en/
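The cross-word assimilation rule quoted above might be sketched as follows. The sketch makes several assumptions: it handles only the fricatives /s/ and /z/ (not /t/ or /d/), it always applies the rule although the slide says the obstruent "may" assimilate, and the names `PALATO_ALVEOLAR`, `ASSIMILATED`, and `assimilate` are invented for illustration.

```python
# Palato-alveolar segments that trigger the rule (assumed inventory).
PALATO_ALVEOLAR = {"sh", "zh", "ch", "jh"}
# Alveolar fricatives and their palato-alveolar counterparts (assumption:
# only fricatives are handled in this sketch).
ASSIMILATED = {"s": "sh", "z": "zh"}

def assimilate(word1, word2):
    """Join two words, assimilating a word-final /s/ or /z/ if triggered."""
    last, first = word1[-1], word2[0]
    if last in ASSIMILATED and first in PALATO_ALVEOLAR:
        word1 = word1[:-1] + [ASSIMILATED[last]]
    return word1 + word2

print(assimilate(["dh", "ih", "s"], ["sh", "aa", "ph"]))  # this shop
print(assimilate(["dh", "ih", "s"], ["f", "ih", "sh"]))   # this fish
```

Because the function only operates across a word boundary, it leaves within-word sequences such as the one in "misshapen" untouched.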


Phonological Processes

• Example of assimilation of /s/ with /sh/ but not /f/:

/dh ih sh sh aa pcl ph dh ih s f ih sh/


Phonological Processes

• Substitution: common in foreign accents or speech impairments:

    welcome /v eh l k ah m/
    McDonald /m a k uw d ow n aa r uw d ow/
    Roger /w aa jh er/

• Metathesis: changing order of two phonemes within a word (dialect variation)

    pretty /p er dx iy/
    ask /ae k s/


Phonological Processes

• Deletion:
    Barbara /b aa r b ax r ah/ → /b aa r b r ah/
    memory /m eh m ax r iy/ → /m eh m r iy/

• Reduction: unstressed vowels become /ax/

    conduct (verb) /k ax n d ah k t/
    conduct (noun) /k aa n d ax k t/

• Insertion: voiceless stop inserted between nasal and voiceless consonant; voiceless stop always has same place of articulation as nasal

    fancy /f ae n t s iy/
    Chomsky /ch aa m p s k iy/

  schwa inserted after word-final nasal
    nine /n ay n ax/

dictionary pronunciation=
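The stop-insertion rule above can be sketched as a pass over a dictionary pronunciation. Assumptions in this sketch: the trigger is restricted to voiceless fricatives (as in the "fancy" and "Chomsky" examples) so that an existing stop is not doubled, and the names `HOMORGANIC_STOP`, `VOICELESS_FRICATIVES`, and `insert_stops` are hypothetical.

```python
# Voiceless stop with the same place of articulation as each nasal.
HOMORGANIC_STOP = {"m": "p", "n": "t", "ng": "k"}
# Assumed trigger set: voiceless fricatives only.
VOICELESS_FRICATIVES = {"f", "th", "s", "sh"}

def insert_stops(phonemes):
    """Insert a homorganic voiceless stop between a nasal and a voiceless fricative."""
    out = []
    # Pair each phoneme with its successor (None after the last one).
    for cur, nxt in zip(phonemes, phonemes[1:] + [None]):
        out.append(cur)
        if cur in HOMORGANIC_STOP and nxt in VOICELESS_FRICATIVES:
            out.append(HOMORGANIC_STOP[cur])
    return out

print(insert_stops(["f", "ae", "n", "s", "iy"]))        # fancy
print(insert_stops(["ch", "aa", "m", "s", "k", "iy"]))  # Chomsky
```

Looking up the inserted stop from the nasal is what keeps the two at the same place of articulation: /n/ yields /t/, /m/ yields /p/.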


Phonological Processes

• Deletion:

/m eh m r iy/


Phonological Processes

• Insertion:

/f ae n t s iy ch aa m p s k iy/


Phonological Processes: Ladefoged Rules

• [–voiced, +stop] → [+aspirated] when syllable-initial
    pit vs. spit

• [ax] → [–voiced] after syllable-initial [–voiced, +stop] and before [–voiced, +stop]
    potato

• [+consonantal] → longer at end of phrase
    bib, did, don, nod

• [–voiced, +stop] → [–aspirated] after syllable-initial /s/
    spew, stew, skew

• [+vowel] → shorter before unvoiced phonemes in same syllable
    cap vs. cab, back vs. bag
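The two aspiration rules above (aspirated when syllable-initial, unaspirated after syllable-initial /s/) can be illustrated with a minimal sketch. It assumes that a syllable is given as a list of phonemes and that aspiration is written by appending "h", as in the slide transcriptions; the name `aspirate` is hypothetical, and possible aspiration of syllable-final stops is ignored.

```python
VOICELESS_STOPS = {"p", "t", "k"}

def aspirate(syllable):
    """Mark a syllable-initial voiceless stop as aspirated by appending 'h'."""
    if syllable and syllable[0] in VOICELESS_STOPS:
        return [syllable[0] + "h"] + syllable[1:]
    # A stop after syllable-initial /s/ is not in first position, so it stays plain.
    return list(syllable)

print(aspirate(["p", "ih", "t"]))       # pit
print(aspirate(["s", "p", "ih", "t"]))  # spit
```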


Phonological Processes: Ladefoged Rules

• Devoicing, End-of-Phrase Length:

/ph ax tcl th ey dx ow/

/d aa n n aa dcl d/


Phonological Processes: Ladefoged Rules

• Length before Voiceless:

/kh ae pc ph kh ae bc b b ae kc kh b ae gc g/


Phonological Processes: Ladefoged Rules

• [–voiced] → longer when at end of syllable
    sass, shook vs. push

• [+stop] → unreleased before [+stop]
    apt, act (often see some mark in spectrogram)

• [–voiced, +alveolar, +stop] → [+glottal stop] when before an alveolar nasal in same word
    beaten /b iy q en/

• [+nasal] → [+syllabic] at word end when following [+obstruent]
    chasm /k ae z em/
    NOT film (obstruents are stops, fricatives, and affricates; /l/ is not an obstruent)

• [+liquid] → [+syllabic] at word end and following [+consonant]
    paddle, whistle, kennel
    NOT snarl (consistent with the rule only if /r/ is classified as [+vowel, –syllabic])


Phonological Processes: Ladefoged Rules

/ae pcl tcl th ae kcl tcl th/

/bcl b iy q tcl en ax_h/


Phonological Processes: Ladefoged Rules

• [+alveolar, +stop] → [+voiced, +flap] when between two vowels, second of which is unstressed
    This rule has speaker-dependent variations

• [+alveolar, +stop] → omitted between two consonants
    most people, sandpaper, grand master

• [+consonant] → shortened before identical [+consonant]

• ∅ → [–voice, +stop] between [+nasal] and [–voice, +fricative] when following vowel absent or unstressed
    prince vs. prints (e'penthesis)

• ∅ → [&] (schwa) following word-final [+nasal, +consonantal]
    nine come sang (e'penthesis)
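The flap rule at the top of this slide can be sketched in code once stress is available. The slide notation carries no stress marks, so this sketch assumes a hypothetical convention in which every vowel ends in "1" (stressed) or "0" (unstressed); the names `is_vowel` and `flap` are also invented, and the speaker-dependent variation noted on the slide is not modeled.

```python
ALVEOLAR_STOPS = {"t", "d"}

def is_vowel(phoneme):
    # Assumption: vowels end in a stress digit, "1" (stressed) or "0" (unstressed).
    return phoneme[-1] in "01"

def flap(phonemes):
    """Replace /t/ or /d/ with the flap /dx/ between a vowel and an unstressed vowel."""
    out = list(phonemes)
    for i in range(1, len(phonemes) - 1):
        if (phonemes[i] in ALVEOLAR_STOPS
                and is_vowel(phonemes[i - 1])
                and phonemes[i + 1].endswith("0")):
            out[i] = "dx"
    return out

print(flap(["p", "ax0", "t", "ey1", "t", "ow0"]))  # potato
```

In "potato" only the second /t/ is flapped, because the first /t/ precedes the stressed vowel.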


Phonological Processes: Ladefoged Rules

• “most people and grand masters use sandpaper”

/m ow s pc ph iy pc ph el n gc g r ae n m ae s tc th er z yu z s ae n pc ph ey pc ph er/


Phonological Processes: Ladefoged Rules

• “nine come sang”

/n ay n ax kcl kh ah m ax s ae ng ax/


Phonological Processes: Ladefoged Rules

• [+vowel] → longer in open syllables
    sea vs. seed vs. seat
    sigh vs. side vs. sight

  (equalize length of syllables with differing numbers of segments)

• [+vowel] → longer in stressed syllable
    below vs. billow

  (stressed syllables are longer in duration than unstressed)

• [+vowel] → [+nasal] before [+nasal] consonant

• [+vowel, –stressed] → schwa (vowel reduction)
    able vs. ability
    Canada vs. Canadian
    photograph vs. photography


Phonological Processes: Ladefoged Rules

• “sigh side sight”

/s ay s ay dcl d s ay tcl th/


Phonological Processes: Ladefoged Rules

• “below billow”

/b ax l ow b ih l ow/


Phonological Processes

• Why is this useful?
    (a) Providing models of known phenomena is better than having a classifier learn the phenomena from data
    (b) Provides humans with appropriate cues for understanding, naturalness
    (c) Accurate phonetic modeling improves ability of classifier to discriminate between classes

• Example for Text-to-Speech (case (b)):
    Create a TTS system
    Don’t shorten vowels before voiceless plosives
    → Creates, by default, acoustic cue for voiced plosives
    → Decreases intelligibility or at least naturalness of system


Phonological Processes

• Example for Automatic Speech Recognition (case (c)):
    Train a speech recognizer using “dictionary” pronunciation
    Then, in all cases where a [–voice, +stop] is inserted between [+nasal] and [–voice, +fricative], such as “fancy” (in CMU dictionary as /f ae n s iy/), acoustics show alveolar stop, but it is trained as either nasal /n/ or fricative /s/.
    → Decreases ability of model to discriminate classes
    → Decreases performance of system

• Difficulty is in providing comprehensive, accurate rules that are not inappropriately “forced” on a system


Stops/Plosives

• There are six plosives (oral stops) in American English:

               bilabial   alveolar   velar
    unvoiced |   /p/        /t/       /k/
    voiced   |   /b/        /d/       /g/

  plus the flap /dx/, which is a very short /t/ or /d/

• Plosives can be difficult to identify and discriminate; contextual cues can be varied

• Cue (1) is the formant transitions of neighboring vowels:
    for bilabials, F2 drops at CV boundary
    for alveolars, F2 goes toward 1800 Hz at CV boundary
    for velars, F2 may meet F3 (velar pinch) or be fairly flat

• Cue (2) is that voiced plosives may have pre-voicing; more likely when plosive is between two vowels


Stops/Plosives

• Cue (3) is that voiced plosives usually have VOT of < 30 msec, but unvoiced plosives usually have VOT of > 50 msec

• Cue (4) is that the VOT is shortest for bilabials, longer for alveolars, and longest for velars. (VOT /p/ < /t/ < /k/ and /b/ < /d/ < /g/)

• Cue (5) is that aspirated (unvoiced) plosives show evidence of F2 and F3 during aspiration; voiced plosives usually don’t

• Cue (6) is the spectral shape; in theory, the shape of the spectrum at burst release can be used to distinguish plosives:
    /p/ and /b/ have energy low in frequency or weakly spread throughout spectrum,
    /t/ and /d/ have more energy above 4 kHz (related to alveolar fricatives /s/ and /z/),
    /k/ and /g/ tend to have more well-defined peaks in the spectrum (near formant locations).
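Cue (3) on this slide can be turned into a simple threshold test. The sketch below uses the 30 msec and 50 msec values from the slide; the "ambiguous" label for the region in between and the function name `voicing_from_vot` are assumptions, and since the slide says "usually", the result should be read as a hint rather than a decision.

```python
def voicing_from_vot(vot_ms):
    """Guess plosive voicing from voice onset time (in milliseconds)."""
    if vot_ms < 30:
        return "voiced"
    if vot_ms > 50:
        return "unvoiced"
    # Between the two thresholds, VOT alone does not settle the question.
    return "ambiguous"

print(voicing_from_vot(15))
print(voicing_from_vot(70))
```

In practice this cue would be combined with the others listed here, for example the place-dependent VOT ordering in Cue (4).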


Stops/Plosives

• Other cues related to spectral shape:

Cue (7a): In the context of front vowels, /k/ and /g/ have spectral peak just above F2 of adjacent vowel, making them confusable with /t/ and /d/; but front vowels show more “velar pinch”

Cue (7b): In the context of back vowels, /k/ and /g/ have one spectral peak between 1000 and 1500 Hz, a second peak between 3000 and 4500 Hz.

• Cue (8): Velar bursts also sometimes display “double burst”, or a second burst during the frication

• Cue (9): Post-vocalic consonants are often unreleased; they can be identified by (a) glottalization, (b) sudden drop in vowel energy, or (c) formant movement at end of vowel


Stops/Plosives

• Cue (10): When the plosive is unreleased, the voicing distinction is based more on length of preceding vowel; voiced plosives are associated with longer vowels, unvoiced plosives with shorter vowels

• Cue (11): In V1C1C2V2 patterns, where both C are plosives, the existence of two plosives is indicated by the different formant transitions in V1 and V2, the longer duration of closure, and sometimes a brief “click” in the spectrum indicating a change in place of articulation

• Cue (12): Plosives have different characteristics in stressed vs. unstressed environments. VOT for unvoiced plosives before unstressed vowels is shorter than VOT for unvoiced plosives before stressed vowels; plosives in an unstressed-vowel environment are less spectrally clear; in unstressed syllables, /t/ and /d/ may be realized as a flap /dx/.


Stops/Plosives

• Cue (13): Flaps have short duration (< 30 msec), dip in energy levels between two vowels, weak F2 and F3, and F2 tends toward 1800 Hz

• Cue (14): Consonant clusters can provide restrictions; for 3-consonant clusters (beginning with /s/-plosive), the only valid combinations are:

    /s p l/, /s p r/, /s p y/
    /s t r/, /s t y/
    /s k l/, /s k r/, /s k y/, /s k w/

• Cue (15): In /s/−plosive−vowel combinations, VOT tends to be shorter and duration of /s/ shorter than normal
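The cluster restriction in Cue (14) can be used to narrow down which plosive is possible once the third consonant is known. A minimal sketch, using the nine combinations listed on the slide; the names `VALID_S_CLUSTERS` and `possible_plosives` are hypothetical.

```python
# The nine valid /s/-plosive-consonant clusters listed in Cue (14).
VALID_S_CLUSTERS = {
    ("s", "p", "l"), ("s", "p", "r"), ("s", "p", "y"),
    ("s", "t", "r"), ("s", "t", "y"),
    ("s", "k", "l"), ("s", "k", "r"), ("s", "k", "y"), ("s", "k", "w"),
}

def possible_plosives(third):
    """Plosives that can stand between /s/ and the given third consonant."""
    return sorted(p for (_, p, c) in VALID_S_CLUSTERS if c == third)

print(possible_plosives("w"))  # only one plosive is possible before /w/
print(possible_plosives("l"))
```

For example, if /s/ and /w/ are identified reliably, the plosive between them is fully determined by the cluster constraint.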


Plosives: Unvoiced Initial in Front-Vowel Context

/p iy t iy k iy/


Plosives: Voiced Initial in Front-Vowel Context

/b iy d iy g iy/


Plosives: Unvoiced Initial in Mid-Vowel Context

/p ah t ah k ah/


Plosives: Voiced Initial in Mid-Vowel Context

/b ah d ah g ah/


Plosives: Unvoiced Initial in Back-Vowel Context

/p aa t aa k aa/


Plosives: Voiced Initial in Back-Vowel Context

/b aa d aa g aa/


Plosives: Unvoiced Final in Front-Vowel Context

/iy p iy t iy k/


Plosives: Voiced Final in Front-Vowel Context

/iy b iy d iy g/


Plosives: Unvoiced Final in Mid-Vowel Context

/ah p ah t ah k/


Plosives: Voiced Final in Mid-Vowel Context

/ah b ah d ah g/


Plosives: Unvoiced Final in Back-Vowel Context

/aa p aa t aa k/


Plosives: Voiced Final in Back-Vowel Context

/aa b aa d aa g/