ACOUSTICAL THEORY OF SPEECH PRODUCTION

Robert A. Prosek, Ph.D.
CSD 301

Acoustical Theory

•There is nothing more practical than a good theory

•The linear source-filter theory is one of the best in our field

•Based on Gunnar Fant’s “Acoustic Theory of Speech Production”

•The theory expresses articulatory-acoustic relationships

Acoustical Theory

•The source is vocal fold vibration

•for some consonants, the source is more complex

•it can be in the vocal tract, or a combination of vocal fold vibration and a vocal tract source

•The filter is the vocal tract

•extending from the vocal folds to the lips or nares

•like all filters, the vocal tract is frequency dependent

Acoustic Theory

•The source and the filter are assumed to be independent

•this is an assumption made for convenience

•it implies that you can change the output of the vocal folds without changing the vocal tract

•and vice versa: the vocal tract can change without changing the output of the vocal folds

Vowels

•Modeled as a tube closed at one end and open at the other

•the closure is a membrane with a slit in it

•the tube has uniform cross sectional area

•membrane represents the source of energy (vocal folds)

•the energy travels through the tube

•the tube generates no energy on its own

•the tube represents an important class of resonators

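•odd quarter-wavelength relationship: the tube resonates when its length is an odd multiple of a quarter wavelength

•Fn = (2n-1)c/(4l), where n = 1, 2, 3, ..., c is the speed of sound, and l is the length of the tube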
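
The resonance formula follows from the standing-wave condition for a tube closed at one end and open at the other. The short derivation below is only a sketch; the symbol \lambda_n (the wavelength of the n-th resonance) is introduced here and does not appear on the slides.

```latex
l = (2n-1)\,\frac{\lambda_n}{4}, \qquad F_n = \frac{c}{\lambda_n}
\;\Longrightarrow\;
F_n = \frac{(2n-1)\,c}{4\,l}, \qquad n = 1, 2, 3, \dots
```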

Vowels (2)

•There are an infinite number of resonances for this tube

•we need only consider the first three or four

•the model is valid only up to about 5 kHz

•The model was developed by Chiba and Kajiyama in 1941

•based on pipe organs for which a great deal was known


Vowels (3)

•If c = 35000 cm/s, and

•l = 17.5 cm

•What are the first three resonances?

•The simple tube closed at one end and open at the other, with the above length, is a reasonable approximation of /ʌ/ produced by a male talker
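
One way to work the question above is to substitute the given values into Fn = (2n-1)c/(4l) for n = 1, 2, 3. The arithmetic uses only the numbers on this slide.

```latex
\begin{aligned}
F_1 &= \frac{(2\cdot 1 - 1)\cdot 35000}{4 \cdot 17.5} = \frac{35000}{70} = 500\ \text{Hz} \\
F_2 &= \frac{3 \cdot 35000}{70} = 1500\ \text{Hz} \\
F_3 &= \frac{5 \cdot 35000}{70} = 2500\ \text{Hz}
\end{aligned}
```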

Vowels (4)

• Some points to note:

• A curved tube (vocal tract) and a straight tube (model) behave identically acoustically out to about 5 kHz

• this is because the curve affects only acoustic signals with short wavelengths, that is, high frequencies

• The resonances are equally spaced if the tube has uniform cross sectional area

• Remember: all of the energy comes from the source (vocal fold vibration for vowels)

• Changing the length of the tube changes the resonance frequencies

• Influenced by age and sex

• l= 14.5 cm for females

• l= 8.75 cm for children
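
The effect of tube length can be checked numerically. The sketch below applies Fn = (2n-1)c/(4l) to the three lengths quoted on the slides; the function name `tube_resonances` and the dictionary labels are illustrative choices, not part of the original material.

```python
SPEED_OF_SOUND_CM_S = 35000.0  # value of c used on the slides


def tube_resonances(length_cm, count=3, c=SPEED_OF_SOUND_CM_S):
    """Resonances (Hz) of a uniform tube closed at one end: Fn = (2n-1)c/(4l)."""
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, count + 1)]


# Tube lengths quoted on the slides; the labels are for display only.
lengths_cm = {"male": 17.5, "female": 14.5, "child": 8.75}

for talker, length in lengths_cm.items():
    formants = ", ".join(f"{f:.0f} Hz" for f in tube_resonances(length))
    print(f"{talker:>6} (l = {length} cm): {formants}")
```

Because the resonances are inversely proportional to l, the 8.75 cm tube (half the adult male length) gives resonances exactly twice as high.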


Vowels (5)

•A one-vowel model isn’t very useful

•Different vowels are modeled, acoustically, by different vocal tract shapes

•Phonetically, how are vowels distinguished?

•If we place a constriction in the tube (vocal tract)

•the resonances change

•if you change the articulation, you change the vocal tract shape, and therefore the resonance frequencies, amplitudes, and bandwidths

Vowels (6)

•The output energy of a vowel is the product of

• the source energy

• the size and shape of the resonator

• the radiation characteristic

•Glottal source characteristics for vowels

•vocal fold vibration is periodic

•what does this imply for the spectrum?

• f0 or F0 is used to indicate the vocal fundamental frequency

•the amplitude of the harmonics decreases by 12 dB/octave (a spectral slope of -12 dB/octave)
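
A periodic source implies a line spectrum with energy only at whole-number multiples of F0. The sketch below lists those harmonics with an idealized -12 dB/octave slope. The F0 of 100 Hz and the function name `glottal_harmonics` are illustrative assumptions, and real glottal spectra follow the slope only approximately.

```python
import math


def glottal_harmonics(f0_hz, max_freq_hz=5000.0, slope_db_per_octave=-12.0):
    """Return (frequency, relative level in dB) for each harmonic of f0.

    Harmonic k sits at k * f0, which is log2(k) octaves above f0, so its
    level relative to the fundamental is slope * log2(k).
    """
    harmonics = []
    k = 1
    while k * f0_hz <= max_freq_hz:
        harmonics.append((k * f0_hz, slope_db_per_octave * math.log2(k)))
        k += 1
    return harmonics


# F0 = 100 Hz is an illustrative value, not one taken from the slides.
for freq, level in glottal_harmonics(100.0, max_freq_hz=800.0):
    print(f"{freq:6.0f} Hz  {level:6.1f} dB")
```

Each doubling of frequency (100, 200, 400, 800 Hz) lowers the level by a further 12 dB.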

Vowels (7)

• Filter characteristics for vowels

• the vocal tract is a dynamic filter

• it is frequency dependent

• it has, theoretically, an infinite number of resonances

• each resonance has a center frequency, an amplitude and a bandwidth

• for speech, these resonances are called formants

• formants are numbered in succession from the lowest

• F1, F2, F3, etc. (center frequencies)

• A1, A2, A3, etc. (amplitudes)

• B1, B2, B3, etc. (bandwidths)

• the formants together form the transfer function

• input-output relationship

• formants become physically evident only when energized
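
One common way to write a vowel transfer function from its formants is as a cascade of simple resonators, each set by a center frequency F_n and a bandwidth B_n. The expression below is a sketch under that assumption; the symbols T(f) and H_n(f) are introduced here and are not on the slides.

```latex
T(f) = \prod_{n} H_n(f), \qquad
|H_n(f)| = \frac{F_n^2 + (B_n/2)^2}
{\sqrt{\left[(f - F_n)^2 + (B_n/2)^2\right]\left[(f + F_n)^2 + (B_n/2)^2\right]}}
```

Each factor equals 1 at f = 0 and peaks near f = F_n, with a narrower and taller peak as B_n gets smaller.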

Vowels (8)

•Radiation characteristic

•acoustic effect when a sound leaves a small area and enters a large one

•the effect is to raise the slope of the spectrum by 6 dB/octave

•Acoustic Phonetic Relationships for Vowels

•F1 is inversely related to tongue height

•F2 is directly related to tongue advancement

•Lip rounding lowers all formant frequencies
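
Returning to the radiation characteristic on this slide: because the output is a product of source, filter, and radiation, the levels add when expressed in dB. The sketch below combines the -12 dB/octave source slope with the +6 dB/octave radiation slope and leaves the vocal tract filter out (flat). The 100 Hz reference frequency and the function names are illustrative assumptions.

```python
import math

SOURCE_SLOPE_DB_PER_OCTAVE = -12.0    # glottal source spectrum (from the slides)
RADIATION_SLOPE_DB_PER_OCTAVE = 6.0   # radiation characteristic (from the slides)


def level_db(freq_hz, ref_hz, slope_db_per_octave):
    """Level at freq_hz relative to ref_hz for a constant dB/octave slope."""
    return slope_db_per_octave * math.log2(freq_hz / ref_hz)


def radiated_source_level_db(freq_hz, ref_hz=100.0):
    """Source plus radiation, with the vocal tract filter left out (flat).

    Multiplying spectra is the same as adding their levels in dB.
    """
    return (level_db(freq_hz, ref_hz, SOURCE_SLOPE_DB_PER_OCTAVE)
            + level_db(freq_hz, ref_hz, RADIATION_SLOPE_DB_PER_OCTAVE))


for freq in (100.0, 200.0, 400.0, 800.0):
    print(f"{freq:5.0f} Hz  {radiated_source_level_db(freq):6.1f} dB")
```

Under these idealized slopes the radiated spectrum, before any formant shaping, falls by a net 6 dB/octave.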

Vowels (9)

•Perturbation Theory

•Volume velocity variations reflect the way air particles vibrate at a particular point in the vocal tract

•At some points, vibration is minimal (nodes); at others, maximal (antinodes)

•For F1, the antinode is at the open end and the node is at the closed end

•For F2, there are two antinodes and two nodes

•For F3, there are three antinodes and three nodes

•etc.
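
The node and antinode counts above can be reproduced for an idealized tube. The sketch below assumes a lossless uniform tube, closed at the glottis (position 0) and open at the lips, in which volume velocity varies as sin((2n-1)πx/(2l)). That standing-wave form and the function name are assumptions made for illustration.

```python
def velocity_nodes_and_antinodes(n, length_cm):
    """Positions (cm from the closed, glottal end) for the n-th resonance.

    Assumes an idealized lossless uniform tube in which volume velocity
    varies as sin((2n - 1) * pi * x / (2 * l)).
    """
    odd = 2 * n - 1
    nodes = [2 * k * length_cm / odd for k in range(n)]            # sine is zero
    antinodes = [(2 * k + 1) * length_cm / odd for k in range(n)]  # sine is +/-1
    return nodes, antinodes


for n in (1, 2, 3):
    nodes, antinodes = velocity_nodes_and_antinodes(n, 17.5)
    print(f"F{n}: nodes at {[round(x, 2) for x in nodes]} cm, "
          f"antinodes at {[round(x, 2) for x in antinodes]} cm")
```

For every resonance there is a node at the closed end and an antinode at the open end, and the n-th resonance has n of each.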

Vowels (10)

•Perturbation Theory (continued)

•if a constriction (a perturbation of the cross-sectional area) is applied

•the acoustic effect depends on proximity to a node or an antinode

•a constriction near an antinode lowers the formant frequency

•a constriction near a node raises the formant frequency

•lip constrictions lower all formant frequencies

•laryngeal constrictions raise all formant frequencies
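
These rules can be sketched with a simple sensitivity function. For a lossless uniform tube and a small constriction, a common textbook idealization makes the direction of the shift in Fn follow cos((2n-1)πx/l), where x is the distance from the glottis. That cosine form is an assumption brought in for illustration and is not stated on the slides.

```python
import math


def formant_shift_direction(n, position_cm, length_cm):
    """Relative direction of the shift in Fn for a small constriction.

    Sketch for a lossless uniform tube with the glottis at 0 and the lips at
    length_cm. The value is cos((2n - 1) * pi * x / l): +1 at a volume
    velocity node (Fn rises most), -1 at an antinode (Fn falls most).
    """
    return math.cos((2 * n - 1) * math.pi * position_cm / length_cm)


TUBE_CM = 17.5  # tube length used earlier on the slides

for n in (1, 2, 3):
    at_lips = formant_shift_direction(n, TUBE_CM, TUBE_CM)
    at_larynx = formant_shift_direction(n, 0.0, TUBE_CM)
    print(f"F{n}: lips {at_lips:+.1f}, larynx {at_larynx:+.1f}")
```

The value is negative at the lips and positive at the larynx for every formant, which matches the last two bullets above.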


Vowels (11)

•Amplitude relationships

•amplitudes depend on formant frequencies

•if F1 is lowered (raised), A1 lowers (rises)

•if two formant frequencies move closer together, then both peaks increase in amplitude

•how do you raise or lower formant frequencies?
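
The amplitude relationships above can be illustrated with a cascade of formant resonators. The sketch below assumes each formant behaves as a simple two-pole resonator with a fixed 80 Hz bandwidth; the resonator form, the bandwidth, and the shifted formant values are all illustrative assumptions, not figures from the slides.

```python
import math


def resonator_gain(freq_hz, formant_hz, bandwidth_hz):
    """Magnitude response of one formant resonator (gain of 1 at 0 Hz)."""
    half_bw = bandwidth_hz / 2.0
    numerator = formant_hz ** 2 + half_bw ** 2
    denominator = math.sqrt(((freq_hz - formant_hz) ** 2 + half_bw ** 2)
                            * ((freq_hz + formant_hz) ** 2 + half_bw ** 2))
    return numerator / denominator


def peak_level_db(index, formants_hz, bandwidth_hz=80.0):
    """Level (dB) of the cascade of all resonators at formant number `index`."""
    freq = formants_hz[index]
    gain = 1.0
    for formant in formants_hz:
        gain *= resonator_gain(freq, formant, bandwidth_hz)
    return 20.0 * math.log10(gain)


neutral = [500.0, 1500.0, 2500.0]      # uniform 17.5 cm tube
lowered_f1 = [300.0, 1500.0, 2500.0]   # hypothetical: F1 lowered
close_f1_f2 = [500.0, 900.0, 2500.0]   # hypothetical: F2 moved toward F1

for label, formants in (("neutral", neutral),
                        ("lowered F1", lowered_f1),
                        ("F1 and F2 close", close_f1_f2)):
    a1 = peak_level_db(0, formants)
    a2 = peak_level_db(1, formants)
    print(f"{label:>16}: A1 = {a1:5.1f} dB, A2 = {a2:5.1f} dB")
```

In this sketch, lowering F1 reduces A1, and moving F2 toward F1 raises both A1 and A2 relative to the neutral tube.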


Vowels (12)

•Source-Filter Interactions

•Some vocal tract shapes may affect vocal fold vibration

•Singers’ formant

•High impedance constrictions require greater subglottal air pressure

•Vocal tract - vocal fold coupling during open phase of vibratory cycle

Consonants (1)

•The linear source-filter theory can be used to describe the acoustics of consonants as well as vowels

•For consonants, however, the source is not always at the level of the vocal folds

•some sources are in the vocal tract

•these sources are aperiodic

•durations and amplitudes also are different from vowels

•Nonetheless, source-filter theory gives us a series of expectations for the acoustic characteristics of consonants

Consonants (2)

•Fricatives

•Modeled as a tube with a very severe constriction

•The air exiting the constriction is turbulent

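•The Reynolds number gives the conditions for turbulence

•Re = vh/ν, where v is the particle velocity of the air, h is the characteristic dimension of the constriction, and ν is the kinematic viscosity of air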

•Notice that turbulence can be generated in two ways

•Zeros or antiformants can be found in the spectrum

•Because of the turbulence, there is no periodicity unless accompanied by voicing

•What does an aperiodic spectrum look like?
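
The Reynolds number formula on this slide is easy to evaluate directly. The sketch below uses a kinematic viscosity of about 0.15 cm²/s for air at room temperature; the constriction size and air velocities are hypothetical values chosen only to show how Re scales. No critical value is hard-coded because the slides do not give one.

```python
def reynolds_number(velocity_cm_s, dimension_cm, kinematic_viscosity_cm2_s=0.15):
    """Re = v * h / nu, with all quantities in consistent cgs units."""
    return velocity_cm_s * dimension_cm / kinematic_viscosity_cm2_s


# Hypothetical constriction 0.1 cm across, at two different air velocities.
for velocity in (1500.0, 3000.0):
    re = reynolds_number(velocity, 0.1)
    print(f"v = {velocity:.0f} cm/s, h = 0.1 cm -> Re = {re:.0f}")
```

As written, Re rises in proportion to either v or h, which is one way to read the remark that turbulence can be generated in two ways.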

Consonants (3)

•When a fricative constriction is tapered

•the back cavity is involved

•this resembles a tube closed at both ends

•Fn = nc/(2l), where n = 1, 2, 3, ...

•such a situation occurs primarily for articulation disorders
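
The closed-closed formula Fn = nc/(2l) can be evaluated the same way as the vowel tube. In the sketch below the 10 cm back-cavity length is a hypothetical value chosen for illustration, and the function name is not from the slides.

```python
def closed_closed_resonances(length_cm, count=3, c=35000.0):
    """Resonances (Hz) of a uniform tube closed at both ends: Fn = nc/(2l)."""
    return [n * c / (2.0 * length_cm) for n in range(1, count + 1)]


BACK_CAVITY_CM = 10.0  # hypothetical back-cavity length, not from the slides

for n, freq in enumerate(closed_closed_resonances(BACK_CAVITY_CM), start=1):
    print(f"F{n} = {freq:.0f} Hz")
```

Unlike the tube closed at one end, these resonances fall at every whole-number multiple of the lowest one, not only the odd multiples.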

Consonants (4)

•Nasal consonants

•Velopharyngeal port is open and the oral cavity is completely blocked at some point

•The side-branch resonator produces antiformants (zeros)

•The overall vocal tract is longer than for vowels

•What effect does this have on the spectrum?

•Oral formants, nasal formants, nasal antiformants

•Nasal murmur
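
A rough feel for the nasal spectrum comes from reusing the quarter-wavelength formula. The sketch below treats the glottis-to-nares path as a uniform tube and the blocked oral cavity as a uniform side branch closed at its far end, which places its first antiformant at c/(4l). Both simplifications are crude, and the 21.5 cm and 8 cm lengths are hypothetical values, not figures from the slides.

```python
def quarter_wave_frequencies(length_cm, count=2, c=35000.0):
    """Frequencies (Hz) at odd quarter-wavelength multiples: (2n-1)c/(4l)."""
    return [(2 * n - 1) * c / (4.0 * length_cm) for n in range(1, count + 1)]


ORAL_TRACT_CM = 17.5    # vowel tube length used earlier on the slides
NASAL_PATH_CM = 21.5    # hypothetical glottis-to-nares length
SIDE_BRANCH_CM = 8.0    # hypothetical length of the closed oral cavity

vowel_f1 = quarter_wave_frequencies(ORAL_TRACT_CM)[0]
nasal_f1 = quarter_wave_frequencies(NASAL_PATH_CM)[0]
first_zero = quarter_wave_frequencies(SIDE_BRANCH_CM)[0]

print(f"Lowest resonance, vowel tube: {vowel_f1:.0f} Hz")
print(f"Lowest resonance, nasal path: {nasal_f1:.0f} Hz")
print(f"First antiformant from side branch: {first_zero:.0f} Hz")
```

Under these assumptions the longer path lowers the resonances, and a longer side branch would move the antiformant lower as well.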

Consonants (5)

•Stops

•The tube model is not altered very much for stops

•However, the time domain becomes critical

•There is a complete closure of the vocal tract somewhere

•Pressure builds up behind the closure

•Rapid release

•The articulation results in a burst and transitions

Consonants (6)

•Other consonants are variations of these

•Affricates

•Liquids

•Glides

•Diphthongs
