Acoustic Phonetics Investigating physical properties of speech sounds From chapter 7, Rogers (2000)

  • Date posted: 21-Dec-2015

Acoustic Phonetics

Investigating physical properties of speech sounds

From chapter 7, Rogers (2000)

Reviving Sonus 2

Speech Sound Representation Reconsidered

  • Articulatory phonetic approach: describing sounds in terms of how they are produced
  • Problems with this approach
      • Representation is only in terms of symbols
          • Sounds are not like that in reality
      • It does not reflect that some sounds are more easily confused with each other in perception than others
          • E.g., i/e vs. a/k; s/f vs. s/m
  • So we need another way of describing speech sounds

Acoustic representation of speech sounds

  • Representing sounds as they are
      • Visual rather than symbolic representation
      • Depends more on perception than on production or articulation
  • Physical properties are analyzed
  • Similarities and differences between sounds are revealed

Acoustic definition of sound

  • Variation in air pressure
  • Movements of air particles
  • An audible disturbance of a medium produced by a source
      • The source: any object that vibrates
          • E.g., musical instruments, human vocal cords, microphone
      • The medium: any elastic object that carries vibration
          • E.g., air, water

Advantages of acoustic representation

  • The real, physical mechanism of speech communication is represented
      • No convention, no confusion, no controversy
  • Gradual changes in sounds are shown
      • Example: how loud a sound is
  • Small variations are shown
  • Helpful for understanding how computers synthesize speech and how speech recognition works

What to represent?

  • Three aspects in which sounds can differ
      • Pitch
      • Loudness
      • Quality
      • (Length)

How to represent acoustically?

  • Sound is the movement of air particles
  • The best and most widely agreed way of expressing air particle movements: Waveform
  • Another necessary way of representing sound: Spectrum

Waveforms

Waveform properties

  • Simple harmonic motion + elapsed time → Waveform
  • Individual particles move only backward and forward

[Figure: air particle movement, plotted as displacement against time. Labeled stages: no force, initial force, elasticity, inertia, elasticity.]

Simple Waveform

Speech sound properties shown in waveforms

  • Differentiation of sounds
      • Sounds differ from one another, which is crucial for human speech as a means of communication
  • Ways in which sounds can differ
      • Perceptually: Pitch, Loudness, Quality
      • Acoustically: Frequency, Amplitude, Phase
  • Waveforms show differences in
      • Acoustic correlate of Loudness: Amplitude
      • Acoustic correlate of Pitch: Frequency

Amplitude representing Loudness

[Figure: waveforms (a) and (b) over 0 to 0.05 s, one with an amplitude range of -0.5 to 0.5 and one with a range of -1 to 1; the vertical extent is labeled "amplitude".]

Amplitude (cntd.)

  • Air pressure fluctuation
  • The extent of the maximum variation in air pressure from normal during a sound
  • Units: bel, decibel (dB; 1/10 of a bel)
      • dB: ten times the common logarithm of a power ratio
  • Twice the amplitude is not heard as twice as loud
  • Loud sound: particles move farther and more rapidly
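
The decibel definition above can be made concrete with a short calculation. This is a minimal sketch, not part of the original slides: the function names are illustrative, and it assumes that sound power is proportional to the square of the pressure amplitude, which is why an amplitude ratio takes a factor of 20 rather than 10.

```python
import math


def power_ratio_to_db(power_ratio: float) -> float:
    """Level difference in dB for a ratio of two powers."""
    return 10.0 * math.log10(power_ratio)


def amplitude_ratio_to_db(amplitude_ratio: float) -> float:
    """Level difference in dB for a ratio of two pressure amplitudes."""
    # Power goes with amplitude squared, so 10*log10(r**2) equals 20*log10(r).
    return 20.0 * math.log10(amplitude_ratio)


# Doubling the amplitude adds roughly 6 dB, whatever the starting level.
print(f"x2 amplitude:  {amplitude_ratio_to_db(2.0):.2f} dB")
print(f"x10 amplitude: {amplitude_ratio_to_db(10.0):.2f} dB")
print(f"x10 power:     {power_ratio_to_db(10.0):.2f} dB")
```

Because the scale is logarithmic, equal steps in dB correspond to equal ratios of amplitude, not equal differences.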

Frequency representing Pitch

[Figure: waveforms (a) and (b) over 0 to 0.05 s, both with an amplitude range of -0.5 to 0.5, differing in frequency.]

Frequency (cntd.)

  • The rate at which a sound source vibrates
      • Sound sources: tuning forks, vocal cords, etc.
  • Units: Hz, cps (cycles per second)
  • Depends upon
      • Length of the pendulum
      • Length of the tuning fork prongs
  • F(requency) = 1/T(period)
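
The relation F = 1/T can be tried out numerically. The sketch below is illustrative only: the function names and the 100 Hz example value are assumptions, chosen so that the result lines up with the 0.05 s time axis used in the waveform figures.

```python
def frequency_from_period(period_s: float) -> float:
    """F = 1 / T, with T in seconds and F in Hz."""
    return 1.0 / period_s


def period_from_frequency(frequency_hz: float) -> float:
    """T = 1 / F, with F in Hz and T in seconds."""
    return 1.0 / frequency_hz


def cycles_in_window(frequency_hz: float, window_s: float) -> float:
    """Number of cycles (possibly fractional) that fit in a time window."""
    return frequency_hz * window_s


# A 100 Hz tone (hypothetical value) repeats every 0.01 s,
# so a 0.05 s plot window shows about five cycles of it.
period = period_from_frequency(100.0)
cycles = cycles_in_window(100.0, 0.05)
print(period, cycles)
```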

Frequency (cntd.)

  • Standard A frequency: 440 Hz
  • Octave: a note with exactly twice the frequency of another note
      • E.g., A (440 Hz), A’ (880 Hz), A’’ (1760 Hz)
  • Audible frequency range
      • Humans: 20 Hz (or 16 Hz) to 20 kHz
      • Bats: 20 kHz to 100 kHz
  • Fastest telephone vibration: 35 kHz
  • Most human speech sound frequencies: below 8 kHz
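
The octave rule (each octave doubles the frequency) is easy to check in code. This is a small sketch with made-up helper names; the only values taken from the slide are 440 Hz for standard A and the 20 Hz to 20 kHz hearing range.

```python
import math

STANDARD_A_HZ = 440.0


def shift_by_octaves(frequency_hz: float, octaves: int) -> float:
    """Each octave up doubles the frequency; each octave down halves it."""
    return frequency_hz * 2.0 ** octaves


def octaves_between(low_hz: float, high_hz: float) -> float:
    """How many doublings separate two frequencies."""
    return math.log2(high_hz / low_hz)


# A, A', A'' from the slide
print([shift_by_octaves(STANDARD_A_HZ, n) for n in range(3)])
# Width of the human hearing range, expressed in octaves
print(round(octaves_between(20.0, 20000.0), 2))
```

Counted this way, the human hearing range quoted above spans just under ten octaves.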

Frequency (cntd.)

  • Pitch and frequency are not linearly related
      • The relationship is fairly linear only at low frequencies
      • A 600-700 Hz difference sounds greater than a 3600-3700 Hz difference
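
The slides do not name a perceptual pitch scale. As an illustration only, the sketch below uses the mel scale with one widely used approximation formula (an assumption brought in here, not something stated in the deck) to compare the two 100 Hz steps mentioned above.

```python
import math


def hz_to_mel(frequency_hz: float) -> float:
    """One common mel-scale approximation (an assumption, not from the slides)."""
    return 2595.0 * math.log10(1.0 + frequency_hz / 700.0)


def perceptual_step(low_hz: float, high_hz: float) -> float:
    """Size of a frequency step, measured in mels."""
    return hz_to_mel(high_hz) - hz_to_mel(low_hz)


low_step = perceptual_step(600.0, 700.0)
high_step = perceptual_step(3600.0, 3700.0)
print(f"600 -> 700 Hz:   {low_step:.1f} mel")
print(f"3600 -> 3700 Hz: {high_step:.1f} mel")
```

Under this formula the same 100 Hz step comes out roughly three times larger at 600 Hz than at 3600 Hz, in line with the claim on the slide.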

Phase difference

[Figure: two waveforms over 0 to 0.05 s, both with an amplitude range of -0.5 to 0.5, differing in phase.]

Phase (cntd.)

Phase differences cause different waveforms

But

Human ears do not perceive phase differences
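
One way to see how phase can change a waveform without changing its frequency content is to build the same two-tone sound twice with different starting phases and compare. This is a sketch under stated assumptions: the 100 Hz and 200 Hz components, the sample rate, and the function names are all illustrative, and a naive DFT stands in for a real spectrum analyzer.

```python
import cmath
import math

SAMPLE_RATE = 1000   # samples per second (hypothetical)
DURATION_S = 0.05    # same span as the 0.05 s plot windows


def two_tone_wave(phase_of_second: float) -> list[float]:
    """100 Hz + 200 Hz, with an adjustable starting phase on the 200 Hz part."""
    n = round(SAMPLE_RATE * DURATION_S)
    return [
        math.sin(2 * math.pi * 100 * i / SAMPLE_RATE)
        + math.sin(2 * math.pi * 200 * i / SAMPLE_RATE + phase_of_second)
        for i in range(n)
    ]


def magnitude_spectrum(samples: list[float]) -> list[float]:
    """Naive DFT magnitudes; slow, but fine for a few dozen samples."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples)))
        for k in range(n // 2)
    ]


wave_a = two_tone_wave(0.0)
wave_b = two_tone_wave(math.pi / 2)

# Largest sample-by-sample difference between the two waveforms
waveform_gap = max(abs(a - b) for a, b in zip(wave_a, wave_b))
# Largest difference between their magnitude spectra
spectrum_gap = max(
    abs(a - b) for a, b in zip(magnitude_spectrum(wave_a), magnitude_spectrum(wave_b))
)
print(waveform_gap, spectrum_gap)
```

The waveforms differ visibly while the magnitude spectra agree to within rounding error, which is consistent with phase mattering little for what is heard.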

Waveform is not sufficient..

  • Two sounds with the same pitch and loudness can still differ
      • Example: Violin A vs. Piano A
      • Example: [i] vs. [a]
  • Another way of representation is needed: Spectrum

More about waveform first..

To know about spectrum and its representation of quality, we need to know more about waveform

Types of Waveforms: Pure tones vs. Complex waves

  • Most sound sources, including human speech, produce complex vibrations
  • Pure tone: a single simple harmonic motion (SHM), with only one frequency
  • Complex wave: more than one harmonic motion, with multiple frequencies
      • Pure tone + pure tone of the same frequency and phase → another pure tone
      • Pure tone + pure tones of different frequencies → a complex tone
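
The two addition rules above can be checked by synthesizing the tones sample by sample. This is a minimal sketch: the sample rate, the helper names, and the choice of 100, 200, and 300 Hz are assumptions made for illustration.

```python
import math

SAMPLE_RATE = 8000  # samples per second (hypothetical)


def pure_tone(frequency_hz, amplitude=1.0, duration_s=0.05):
    """Samples of a single sinusoid starting at zero phase."""
    n = round(SAMPLE_RATE * duration_s)
    return [amplitude * math.sin(2 * math.pi * frequency_hz * i / SAMPLE_RATE) for i in range(n)]


def mix(*waves):
    """Add waves sample by sample."""
    return [sum(samples) for samples in zip(*waves)]


tone = pure_tone(100.0)
same_freq_sum = mix(tone, tone)
complex_tone = mix(pure_tone(100.0), pure_tone(200.0), pure_tone(300.0))

# Same frequency and phase: still a 100 Hz sinusoid, only twice as large.
is_scaled_copy = all(abs(s - 2.0 * t) < 1e-12 for s, t in zip(same_freq_sum, tone))

# Different frequencies: no longer a sinusoid, but it repeats every 1/100 s.
samples_per_cycle = SAMPLE_RATE // 100
repeats = all(
    abs(complex_tone[i] - complex_tone[i + samples_per_cycle]) < 1e-9
    for i in range(len(complex_tone) - samples_per_cycle)
)
print(is_scaled_copy, repeats)
```

The complex tone repeats at the rate of its lowest component here because 200 Hz and 300 Hz are whole-number multiples of 100 Hz.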

Pure tone (simple wave, simple harmonic motion, sinusoid, sine wave)

Complex wave

[Figure: complex waveform of 100 Hz + 200 Hz + 300 Hz over 0 to 0.05 s; amplitude range -2.499 to 2.499.]

Complex wave

[Figure: complex waveforms. Panels: 0 to 0.05 s (amplitude -2.499 to 2.499) and 0 to 0.0195395 s (amplitude -0.1355 to 0.1318). Caption: [a] production by a female speaker.]

Types of Waveform: Repetitive vs. non-repetitive wave

  • Strictly repetitive (periodic): sine wave, ideal sounds
  • Virtually repetitive (periodic): vowels, sonorants
  • Non-repetitive (aperiodic): obstruents
      • white noise (most complex)
      • click
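
A rough way to tell a repetitive wave from a non-repetitive one is to compare the signal with a copy of itself shifted by one suspected period. The sketch below uses a normalized autocorrelation for this; the technique, the sample rate, and the test signals (a 100 Hz sine and seeded random noise) are choices made here for illustration, not something prescribed by the slides.

```python
import math
import random

SAMPLE_RATE = 8000  # samples per second (hypothetical)


def normalized_autocorrelation(samples, lag):
    """Similarity between a signal and itself shifted by `lag` samples (-1 to 1)."""
    head, tail = samples[:-lag], samples[lag:]
    numerator = sum(a * b for a, b in zip(head, tail))
    denominator = math.sqrt(sum(a * a for a in head) * sum(b * b for b in tail))
    return numerator / denominator if denominator else 0.0


lag = SAMPLE_RATE // 100  # one period of a 100 Hz wave, in samples
sine = [math.sin(2 * math.pi * 100 * i / SAMPLE_RATE) for i in range(800)]
rng = random.Random(0)  # fixed seed so the run is repeatable
noise = [rng.uniform(-1.0, 1.0) for _ in range(800)]

sine_score = normalized_autocorrelation(sine, lag)
noise_score = normalized_autocorrelation(noise, lag)
print(f"periodic sine: {sine_score:.3f}")
print(f"white noise:   {noise_score:.3f}")
```

A score near 1 means the wave lines up with its shifted copy (periodic); a score near 0 means it does not (aperiodic). Real vowels would be expected to fall close to, but below, 1.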

Periodic vs non-periodic wave

[Figure: waveform panels over 0 to 0.05 s (amplitude -2.499 to 2.499), 0 to 0.0195395 s (amplitude -0.1355 to 0.1318), and 0 to 0.0732916 s (amplitude -0.08255 to 0.08606). Labels: periodic wave [a]; aperiodic [s].]

Limitation of Waveform Representation

  • Sound can be heard in three different ways
      • Loudness, Pitch, Quality
  • Quality can’t be represented directly in waveforms
      • A new way of representation is needed: Spectrum

Spectrum

Background Knowledge on Spectrum

  • Sound waves can be either simple or complex
      • Simple: sinusoid
      • Complex: simple waves of different frequencies combined
  • Sound quality can be determined by the way such simple waves combine into a complex wave
  • If a complex wave can be split into its component simple waves, we can see the secret
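
Splitting a complex wave back into its simple components is what a Fourier transform does. The sketch below builds a 100 Hz + 200 Hz + 300 Hz wave and recovers the components with a naive discrete Fourier transform. The component amplitudes, the sample rate, and the function names are made up for the example; a practical analysis would use an FFT library and a window function.

```python
import cmath
import math

SAMPLE_RATE = 2000  # samples per second (hypothetical)
N_SAMPLES = 100     # 0.05 s of signal, so DFT bins fall every 20 Hz
PARTIALS = {100.0: 3.0, 200.0: 2.0, 300.0: 1.0}  # Hz -> amplitude (made-up values)

# Build the complex wave by adding the simple waves together.
wave = [
    sum(a * math.sin(2 * math.pi * f * i / SAMPLE_RATE) for f, a in PARTIALS.items())
    for i in range(N_SAMPLES)
]


def amplitude_spectrum(samples, sample_rate):
    """Naive DFT: map each bin frequency (Hz) to the amplitude of that component."""
    n = len(samples)
    spectrum = {}
    for k in range(1, n // 2):  # skip the 0 Hz bin; this wave has no constant offset
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        spectrum[k * sample_rate / n] = 2.0 * abs(coeff) / n
    return spectrum


# Keep only the bins that carry a noticeable amount of energy.
recovered = {
    freq: round(amp, 6)
    for freq, amp in amplitude_spectrum(wave, SAMPLE_RATE).items()
    if amp > 0.01
}
print(recovered)
```

The recovered frequency and amplitude pairs are exactly what a spectrum plot draws as vertical lines.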

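Waveform and Spectrum (100 Hz + 200 Hz + 300 Hz)

[Figure: "Wave" panels over 0 to 0.05 s and 0 to 0.025 s (amplitude -5.354 to 5.354); "Spectrum" panel with components at 100, 200, and 300 Hz on the Hz axis and amplitude ticks at 2 and 4.]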

An Example of Spectrum

Formants shown in spectrum

  • Frequency component(s) with boosted energy
  • Formant frequency: the frequency of a formant
  • Reason for formant shaping: the filtering function of the vocal tract
  • A decisive aspect of sound quality
  • For vowels, three formants (F1, F2, F3) are especially important for telling them apart

An Example of Formant: Vowel []

[Figure: spectrum with formants F1, F2, and F3 labeled.]

An Example of Formant: Vowel []

[Figure: spectrum with F1, F2, and F3 labeled. X-axis: frequency, 50 to 4050 Hz in 250 Hz steps; y-axis: amplitude, 0 to 6.]

Disadvantages of Spectrum Representation

  • Less intuitive
      • The x-axis denotes frequency
  • No time-varying representation
  • Hard to see the relation to waveforms
  • Thus, a new way of representation is needed: Spectrogram

Spectrogram & its reading

What is a spectrogram?

  • In use since the 1940s
  • Another representation of frequency-domain analysis
  • The most popular way of representing spectral information
  • Three-dimensional representation
      • X-axis: Time
      • Y-axis: Frequency
      • Darkness (or color): Energy
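
The three dimensions listed above correspond to a grid of numbers: one row per time frame, one column per frequency, and a magnitude in each cell. The sketch below computes such a grid with a naive short-time DFT. The test signal, the frame length, and the function names are assumptions for illustration, and no window function is applied, which a real spectrogram tool normally would do.

```python
import cmath
import math

SAMPLE_RATE = 2000  # samples per second (hypothetical)
FRAME_LEN = 100     # 0.05 s per frame, so frequency bins fall every 20 Hz

# Test signal: 0.1 s of a 100 Hz tone followed by 0.1 s of a 300 Hz tone.
signal = [math.sin(2 * math.pi * 100 * i / SAMPLE_RATE) for i in range(200)]
signal += [math.sin(2 * math.pi * 300 * i / SAMPLE_RATE) for i in range(200)]


def frame_magnitudes(frame):
    """Naive DFT magnitudes for one frame (one column of the picture)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def spectrogram(samples, frame_len, hop):
    """Rows are time frames, columns are frequency bins, values are magnitudes."""
    return [
        frame_magnitudes(samples[start:start + frame_len])
        for start in range(0, len(samples) - frame_len + 1, hop)
    ]


rows = spectrogram(signal, FRAME_LEN, hop=FRAME_LEN)
# Strongest frequency in each time frame (the darkest spot in each column)
peak_hz = [row.index(max(row)) * SAMPLE_RATE / FRAME_LEN for row in rows]
print(peak_hz)
```

The strongest frequency moves from 100 Hz to 300 Hz halfway through, which is the kind of change over time that a single spectrum cannot show.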

Waveform & Spectrogram aligned

Spectrogram example (color display of the word “compute”)

Spectrogram example (grayscale display of the word “compute”)

Types of spectrogram

  • Wideband spectrogram: better time resolution
  • Narrowband spectrogram: better frequency resolution
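
The trade-off between the two types comes from the length of the analysis window: a short window pins down when something happens but blurs nearby frequencies, and a long window does the reverse. The sketch below uses the rule of thumb that frequency resolution is roughly one over the window length. The window lengths and the 120 Hz voice pitch are assumed, typical-looking values, not figures from the slides.

```python
def frequency_resolution_hz(window_s: float) -> float:
    """Rule of thumb: components closer than about 1/window cannot be separated."""
    return 1.0 / window_s


WIDEBAND_WINDOW_S = 0.005    # short analysis window (assumed value)
NARROWBAND_WINDOW_S = 0.030  # long analysis window (assumed value)
F0_HZ = 120.0                # hypothetical voice pitch; harmonics are F0 apart

for name, window_s in [("wideband", WIDEBAND_WINDOW_S), ("narrowband", NARROWBAND_WINDOW_S)]:
    resolution = frequency_resolution_hz(window_s)
    shows_harmonics = resolution < F0_HZ
    print(f"{name}: {window_s * 1000:.0f} ms window, about {resolution:.0f} Hz resolution, "
          f"separates harmonics: {shows_harmonics}")
```

With these assumed values only the narrowband setting can separate individual harmonics, while the wideband setting keeps its advantage in timing.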

Wideband vs. narrowband spectrograms of the question "Is Pat sad, or mad?"

The 5th, 10th and 15th harmonics have been marked by white squares in two of the vowels.

Advantages & Disadvantages

  • Advantages
      • Time alignment
  • Disadvantages
      • Less reliable than waveform

Vowel Spectrogram

  • Formant frequencies are critical cues for vowel distinction
  • F1: Height
      • High vowels: low F1
  • F2: Backness
      • Back vowels: low F2
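
The two rules above (low F1 for high vowels, low F2 for back vowels) can be written as a toy classifier. This is only a sketch: the cutoff values are invented for illustration and are not taken from the slides or from any standard, and the F1/F2 pairs are of the kind listed in the deck's formant examples.

```python
F1_HIGH_VOWEL_MAX_HZ = 450.0   # illustrative cutoff, not from the slides
F2_BACK_VOWEL_MAX_HZ = 1400.0  # illustrative cutoff, not from the slides


def describe_vowel(f1_hz: float, f2_hz: float) -> tuple[str, str]:
    """Apply the slide's rules: low F1 means high vowel, low F2 means back vowel."""
    height = "high" if f1_hz <= F1_HIGH_VOWEL_MAX_HZ else "non-high"
    backness = "back" if f2_hz <= F2_BACK_VOWEL_MAX_HZ else "non-back"
    return height, backness


print(describe_vowel(280.0, 2250.0))  # low F1, high F2
print(describe_vowel(310.0, 870.0))   # low F1, low F2
print(describe_vowel(710.0, 1100.0))  # high F1, low F2
```

Real vowel identification is less clear-cut than this, since formant values vary with the speaker and the context.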

Examples of formant frequencies of English monophthongs

Values in Hz, one column per vowel:

Vowel     1     2     3     4     5     6     7     8     9
F3     2900  2550  2490  2490  2640  2380  2300  2500  2390
F2     2250  1900  1770  1660  1100  1030   870  1500  1190
F1      280   400   550   690   710   450   310   900   640

"heed, hid, head, had, hod, hawed, hood, who'd" (a male speaker, American English)

From http://hctv.humnet.ucla.edu/departments/linguistics

Consonant Spectrogram

  • General
      • Acoustic structure more complicated than vowels
      • Adjacent sounds (especially vowels) convey important information: locus
      • High-frequency characteristics, especially for fricatives and affricates

What is LOCUS?

  • Information about the formant transitions from vowels into obstruents or from obstruents into vowels
  • The target frequency that each formant transition heads toward as an obstruction is made, or the frequency the transition comes from as the obstruction is released
  • Characteristic of the consonant's place and manner; roughly the same in different vowel contexts

Stops

  • General
      • Fairly distinct locus for each place
      • Burst
      • Silence during the closure (only at syllable onset position)
      • Virtually no difference during the closure

Stops (cntd.)

  • Voicing distinction
      • Voiced: vertical striations, less abrupt burst, frequently weakened to resemble fricatives or approximants
      • Voiceless: generally abrupt burst in the higher frequency region

Stops (cntd.)

  • Place distinction
      • Bilabial
          • Relatively low F2, F3 locus, rising into and falling out of the vowel
          • Weak and spread vertical lines
      • Alveolar
          • F2 locus about 1800 Hz
          • Strong vertical lines
      • Velar
          • Velar pinch: the vowel's F2 and F3 merging
          • Often double burst
          • Long formant transitions

Stops (cntd.)

  • Manner distinction
      • Silence duration, VOT, F0 of the following vowel

              Silence   VOT     F0
  Aspirated   short     long    high
  Tense       long      short   high
  Lax         mid       mid     low

Examples -- “a bab, a dad, a gag”

Place dependent loci

Fricatives

  • General
      • Random noise pattern, especially in high-frequency regions
  • Place distinction
      • Labiodental [f, v]: rising locus into the following vowel
      • Dental [θ, ð]: major energy above 6000 Hz
      • Alveolar [s, z]: major energy above 4000 Hz
      • Alveopalatal [ʃ, ʒ]: major energy above 2000 Hz
      • Glottal [h]: the trace of the formant frequencies of neighbouring vowels
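
The energy cutoffs above can be read as a simple lookup: find the lowest frequency at which the noise becomes strong and see which band it falls in. The sketch below is a rough illustration of that reading, not a working fricative classifier. The function name and the idea of a single "energy floor" measurement are assumptions, and labiodental and glottal fricatives are left out because the slide describes them by formant behaviour rather than by a cutoff.

```python
# From the slide: frequency above which the major noise energy lies, per place.
ENERGY_FLOORS_HZ = [
    (6000.0, "dental"),
    (4000.0, "alveolar"),
    (2000.0, "alveopalatal"),
]


def guess_fricative_place(energy_floor_hz: float) -> str:
    """Pick the place whose cutoff is the highest one at or below the measured floor."""
    for floor_hz, place in ENERGY_FLOORS_HZ:
        if energy_floor_hz >= floor_hz:
            return place
    return "undetermined"


for measured in (6500.0, 4500.0, 2500.0, 1000.0):  # hypothetical measurements
    print(measured, guess_fricative_place(measured))
```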

Fricatives (cntd.)

  • Weak vs. strong
      • Strong []: darker bands
      • Weak []: spread and fainter
  • Voiced [ ]: often so weak that they are confused with nasals or approximants
  • Cues to tell [] from []: higher formants of [] fall into adjacent vowels

Example – “fie, thigh, sigh, shy”

Example – “ever, weather, fizzer, pleasure”

Nasals

  • General
      • Formants similar to vowels but fainter
      • Very low F1 (about 250 Hz), F2 (about 2500 Hz), and F3 (about 3250 Hz)
  • Place distinction
      • Bilabial [m]: downward F2, F3 locus
      • Alveolar [n]: smaller F2 transition
      • Velar [ŋ]: velar pinch

Examples -- “a Pam, a tan, a kang”

Liquids & Approximants

  • General
      • Formants similar to vowels but fainter (especially in high-frequency regions)
      • Approximately F1 (250 Hz), F2 (1200 Hz), F3 (2400 Hz)
      • Slow formant movements

Liquids & Approximants (cntd.)

  • Phone-specific properties
      • Labial glide [w]:
          • Very low F1 and F2 (600-1000 Hz), which get very close to each other
          • Relatively low F3
          • Rapid falloff of spectral amplitude (formant movements)
      • Palatal glide [y]:
          • Extremely low F1
          • Extremely high F2, F3

Liquids & Approximants (cntd.)

  • Phone-specific properties (cntd.)
      • Flap []: soft burst, short duration
      • Retroflex []:
          • F3 dipping down close to F2
          • General lowering of F3, F4
      • Lateral []:
          • Low F1, F2 (approx. F1 250 Hz, F2 1200 Hz)
          • Usually substantial energy in the high-frequency region

Example – “led, red, wed, yell”

Final remarks on spectrogram

  • The spectrogram is not the only cue for the acoustic distinction of speech sounds.
  • When there is a mismatch between the waveform and the spectrogram, the waveform is generally more reliable.

References & Links

  • http://cslu.cse.ogi.edu/tutordemos/SpectrogramReading/spectrogram_reading.html
  • http://hctv.humnet.ucla.edu/departments/linguistics/VowelsandConsonants/course
  • http://www.cs.indiana.edu/~port/teach/306/speech.acoustics.html
  • http://www.phon.ucl.ac.uk/courses/spsci/b203/week2-5.pdf