Behrman Chapter 5, 6

Post on 30-Dec-2015


Place less emphasis on…

• Minor anatomical landmarks and features

• Extrinsic muscles of the larynx

• Blood supply to the larynx

• Central motor control of larynx

• Peripheral Sensory control of larynx

• Stress-Strain Properties of Vocal Folds

Laryngeal Activity in Speech/Song

• Sound source to excite the vocal tract
  – Voice
  – Whisper

• Prosody
  – Fundamental frequency (F0) variation
  – Amplitude variation

• Realization of phonetic goals
  – Voicing
  – Devoicing
  – Glottal frication (/h/, /ɦ/)
  – Glottal stop (/ʔ/)
  – Aspiration

• Para-linguistic and extra-linguistic roles
  – Transmit affect
  – Speaker identity

The vocal fold through life…

• Newborns
  – No layered structure of LP
  – LP loose and pliable

• Children
  – Vocal ligament appears 1-4 yrs
  – 3-layered LP is not clear until 15 yrs

• Old age
  – Superficial layer becomes edematous & thicker
  – Thinning of intermediate layer and thickening of deep layer
  – Changes in LP more pronounced in men
  – Muscle atrophy

The Glottal Cycle

Myoelastic Aerodynamic Theory of Phonation

Necessary and Sufficient Conditions

• Vocal Folds are adducted (Adduction)

• Vocal Folds are tensed (Longitudinal Tension)

• Presence of Aerodynamic pressures

2-mass model

[Figure: two-mass model of the vocal fold. Labels: lower part of vocal fold; upper part of vocal fold; mechanical coupling stiffness; TA muscle; coupling between mucosa & muscle]

• VF adducted & tensed → myoelastic pressure (Pme)
• Glottis is closed
• Subglottal air pressure (Psg) ↑
• Psg ~ 8-10 cm H2O, Psg > Pme

• L and R M1 separate
• Transglottal airflow (Utg) = 0

As M1 separates, M2 follows due to mechanical coupling stiffness

Psg > Pme

Glottis begins to open

Psg > Patm (atmospheric pressure), therefore Utg > 0

Utg ↑↑ since glottal aperture << tracheal circumference

Utg ↑, Ptg ↓ due to Bernoulli effect

Pressure drop across the glottis

Bernoulli’s Law

P + ½ρU² = K

where

P = air pressure

ρ = air density

U = air velocity
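As a rough numerical illustration of Bernoulli's law, the sketch below holds the constant K fixed and shows static pressure falling as air velocity rises. The air density (1.2 kg/m³), the value of K, and the velocities are assumptions chosen for illustration only; they are not taken from the slides and are not meant as realistic glottal values.

```python
# Bernoulli's law: P + 0.5 * rho * U**2 = K (constant along a streamline).
RHO_AIR = 1.2  # kg/m^3, assumed approximate density of air


def pressure_from_velocity(k_total, velocity, rho=RHO_AIR):
    """Return static pressure P (Pa) for a given constant K (Pa) and velocity U (m/s)."""
    return k_total - 0.5 * rho * velocity ** 2


def pressure_drop(u_slow, u_fast, rho=RHO_AIR):
    """Return the decrease in pressure (Pa) when air speeds up from u_slow to u_fast."""
    return 0.5 * rho * (u_fast ** 2 - u_slow ** 2)


if __name__ == "__main__":
    k = 800.0  # Pa, hypothetical constant K
    for u in (0.0, 10.0, 20.0, 30.0):
        p = pressure_from_velocity(k, u)
        print(f"U = {u:5.1f} m/s -> P = {p:7.1f} Pa")
```

Faster flow through a narrow constriction therefore goes with lower pressure at that constriction, which is the effect the glottal cycle slides rely on.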

Utg ↑, Ptg ↓ due to Bernoulli effect

Plus “other” aerodynamic effects

Ptg < Pme

M1 returns to midline

M2 follows M1 due to mechanical coupling stiffness

Utg = 0

Pattern repeats 100-200 times a second

Limitations of this simple model

The Glottal Cycle

[Figure: sound pressure wave; x-axis: time, y-axis: instantaneous sound pressure]

Phonation is actually quasi-periodic

• Complex Periodic
  – Vocal fold oscillation

• Aperiodic
  – Broad frequency noise embedded in signal
  – Non-periodic vocal fold oscillation
  – Asymmetry of vocal fold oscillation
  – Air turbulence

• Voicing vs. whispering

Glottal Aerodynamics

• Volume Velocity

• Driving Pressure

• Phonation Threshold Pressure
  – Initiate phonation
  – Sustain phonation

• Laryngeal Airway Resistance
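Laryngeal airway resistance is commonly computed as driving pressure divided by volume velocity. The sketch below assumes that definition. The function name and the example flow value are hypothetical, and the units (cm H2O and L/s) are one common convention rather than something the slides specify.

```python
def laryngeal_airway_resistance(driving_pressure_cmh2o, flow_l_per_s):
    """Return resistance in cm H2O/(L/s): driving pressure divided by volume velocity."""
    if flow_l_per_s <= 0:
        raise ValueError("flow must be positive to compute a resistance")
    return driving_pressure_cmh2o / flow_l_per_s


if __name__ == "__main__":
    # 8 cm H2O lies within the 8-10 cm H2O subglottal pressure range quoted
    # in the glottal cycle slides; the 0.2 L/s flow is a hypothetical value.
    resistance = laryngeal_airway_resistance(8.0, 0.2)
    print(f"Resistance: {resistance:.1f} cm H2O/(L/s)")
```

For a fixed driving pressure, a lower flow gives a higher resistance value, which is why the measure is often read as an index of how tightly the glottis is constricting the airstream.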

Measuring Glottal Behavior

• Videolaryngoscopy
  – Stroboscopy
  – High speed video

Photoglottography (PGG)

[Figure: photoglottogram; x-axis: time, y-axis: illumination]

Electroglottography (EGG)

• Human tissue = good conductor
• Air = poor conductor
• Electrodes placed on each side of thyroid lamina
• High frequency, low current signal is passed between them
• ↑ VF contact = ↓ impedance
• ↓ VF contact = ↑ impedance

Electroglottogram

Glottal Airflow (volume velocity)

• Instantaneous airflow is measured as it leaves the mouth

• Looks similar to a pressure waveform

• Can be inverse filtered to remove effects of vocal tract

• Resultant is an estimate of the airflow at the glottis

Flow Glottogram

Synchronous plots

Sound pressure waveform (at mouth)

Flow glottogram (inverse filtered mask signal)

Photoglottogram

Electroglottogram

F0 Control

• Anatomical factors
  – Males: ↑ VF mass and length = ↓ F0
  – Females: ↓ VF mass and length = ↑ F0

• Subglottal pressure adjustment – show example
  – ↑ Psg = ↑ F0

• Laryngeal and vocal fold adjustments
  – ↑ CT activity = ↑ F0
  – TA activity = ↑ F0 or ↓ F0

• Extralaryngeal adjustments
  – ↑ height of larynx = ↑ F0

Fundamental Frequency (F0)

Average F0

• speaking fundamental frequency (SFF)

• Correlate of pitch

• Infants
  – ~ 350-500 Hz

• Boys & girls (3-10)
  – ~ 270-300 Hz

• Young adult females
  – ~ 220 Hz

• Young adult males
  – ~ 120 Hz

Older females: F0 ↓

Older males: F0 ↑

F0 variability

• F0 varies due to
  – Syllabic & emphatic stress
  – Syntactic and semantic factors
  – Phonetic factors (in some languages)

• Provides a melody (prosody)

• Measures
  – F0 standard deviation: ~ 2-4 semitones for normal speakers
  – F0 range

Maximum Phonational Frequency Range

• highest possible F0 - lowest possible F0

• Not a speech measure

• measured in Hz, semitones or octaves

• Males ~ 80-700 Hz¹

• Females ~ 135-1000 Hz¹

• 3 octaves often considered normal

¹ Baken (1987)
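Because the range can be reported in Hz, semitones or octaves, a short conversion sketch may help. It assumes the standard relations of one octave per doubling of frequency and 12 semitones per octave; the function names are illustrative.

```python
import math


def range_in_octaves(f_low_hz, f_high_hz):
    """Return the size of a frequency range in octaves (one octave = doubling)."""
    return math.log2(f_high_hz / f_low_hz)


def range_in_semitones(f_low_hz, f_high_hz):
    """Return the size of a frequency range in semitones (12 per octave)."""
    return 12.0 * range_in_octaves(f_low_hz, f_high_hz)


if __name__ == "__main__":
    # Ranges quoted on the slide (attributed to Baken, 1987).
    for label, low, high in (("Males", 80.0, 700.0), ("Females", 135.0, 1000.0)):
        octs = range_in_octaves(low, high)
        semis = range_in_semitones(low, high)
        print(f"{label}: {octs:.2f} octaves, {semis:.1f} semitones")
```

Both quoted ranges come out at roughly 3 octaves, in line with the "3 octaves often considered normal" rule of thumb.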

Fundamental Frequency (F0) Control

• Ways to measure F0
  – Time domain vs. frequency domain
  – Manual vs. automated measurement
  – Specific approaches
    • Peak picking
    • Zero crossing
    • Autocorrelation
    • The cepstrum & cepstral analysis

Autocorrelation

[Figure: example data panels with their correlation values: +1.0, +0.1, −0.82, +0.92]
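The idea behind autocorrelation-based F0 measurement is to correlate the signal with delayed copies of itself and take the delay (lag) giving the highest correlation as the period. The sketch below is a minimal, assumption-laden version: the search limits (60-500 Hz), the helper `synth_tone`, and the synthetic test signal are all hypothetical choices, and practical pitch trackers add windowing and voicing decisions that are omitted here.

```python
import math


def synth_tone(f0_hz, sample_rate, duration_s):
    """Hypothetical helper: a fundamental plus a weaker second harmonic."""
    n = round(sample_rate * duration_s)
    return [
        math.sin(2 * math.pi * f0_hz * i / sample_rate)
        + 0.5 * math.sin(2 * math.pi * 2 * f0_hz * i / sample_rate)
        for i in range(n)
    ]


def autocorr_f0(signal, sample_rate, f0_min=60.0, f0_max=500.0):
    """Estimate F0 (Hz) from the lag with the highest normalized autocorrelation."""
    n = len(signal)
    mean = sum(signal) / n
    x = [s - mean for s in signal]           # remove any DC offset
    energy = sum(v * v for v in x)           # autocorrelation at lag 0
    if energy == 0.0:
        return None                          # silent input: no estimate
    lag_min = max(1, int(sample_rate / f0_max))
    lag_max = min(int(sample_rate / f0_min), n - 1)
    best_lag, best_r = None, -1.0
    for lag in range(lag_min, lag_max + 1):
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_r:
            best_lag, best_r = lag, r
    if best_lag is None:
        return None                          # signal too short for the search range
    return sample_rate / best_lag


if __name__ == "__main__":
    tone = synth_tone(120.0, 8000, 0.05)
    print(f"Estimated F0: {autocorr_f0(tone, 8000):.1f} Hz")
```

Because lags are whole numbers of samples, the estimate is quantized; at an 8 kHz sampling rate a 120 Hz tone is recovered only to within about a hertz or two.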

Cepstrum

Amplitude Control

• Subglottal pressure adjustment
  – ↑ Psg = ↑ sound pressure

• Laryngeal and vocal fold adjustments
  – ↑ medial compression = ↑ sound pressure

• Supralaryngeal adjustments

Measuring Amplitude

• Pressure

• Intensity

• Decibel Scale
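To connect sound pressure to the decibel scale, the sketch below uses the standard definition of sound pressure level, 20·log10(p / p_ref), with the conventional reference pressure of 20 µPa in air. The function names are illustrative.

```python
import math

P_REF = 20e-6  # Pa, conventional reference pressure for dB SPL in air


def pressure_to_db_spl(pressure_pa):
    """Convert an RMS sound pressure (Pa) to sound pressure level (dB SPL)."""
    if pressure_pa <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure_pa / P_REF)


def db_spl_to_pressure(level_db):
    """Convert a sound pressure level (dB SPL) back to RMS pressure (Pa)."""
    return P_REF * 10.0 ** (level_db / 20.0)


if __name__ == "__main__":
    for level in (0.0, 65.0, 80.0):
        print(f"{level:5.1f} dB SPL -> {db_spl_to_pressure(level):.6f} Pa")
```

Each tenfold increase in pressure adds 20 dB, so a wide range of pressures maps onto a compact range of decibel values.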

Sound Pressure Level (SPL)

Average SPL

• Correlate of loudness
• Conversation: ~ 65-80 dB SPL

SPL Variability

• SPL to mark stress
• Contributes to prosody
• Measure
  – Standard deviation for neutral reading material: ~ 10 dB SPL

Dynamic Range

• Amplitude analogue to maximum phonational frequency range

• ~50 – 115 dB SPL

Vocal Quality

• no clear acoustic correlates like pitch and loudness

• However, terms have invaded our vocabulary that suggest distinct categories of voice quality

Common Terms

• Breathy
• Tense/strained
• Rough
• Hoarse

Are there features in the acoustic signal that correlate with these quality descriptors?

Breathiness

Perceptual Description
• Audible air escape in the voice

Physiologic Factors
• Diminished or absent closed phase
• Increased airflow

Potential Acoustic Consequences
• Change in harmonic (periodic) energy
  – Sharper harmonic roll off
• Change in aperiodic energy
  – Increased level of aperiodic energy (i.e. noise), particularly in the high frequencies

Harmonics (signal)-to-noise ratio (SNR/HNR)

• Harmonic/noise amplitude

• ↑ HNR
  – Relatively more signal
  – Indicative of normality

• ↓ HNR
  – Relatively more noise
  – Indicative of disorder

• Normative values depend on method of calculation

• “normal” HNR ~ 15
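As the slide notes, HNR values depend on how they are calculated. The sketch below shows only the underlying ratio, assuming HNR is expressed in dB as 10·log10 of harmonic energy over noise energy, and assuming the harmonic and noise components are already available as separate sample lists. Real analysis tools have to estimate those components from a single recording, which is where the methods differ.

```python
import math


def hnr_db(harmonic, noise):
    """Return 10*log10(harmonic energy / noise energy) for two sample lists."""
    e_harmonic = sum(s * s for s in harmonic)
    e_noise = sum(s * s for s in noise)
    if e_noise == 0.0:
        return math.inf       # no noise at all
    if e_harmonic == 0.0:
        return -math.inf      # no periodic energy at all
    return 10.0 * math.log10(e_harmonic / e_noise)


if __name__ == "__main__":
    # Toy components (hypothetical): noise amplitude is one tenth of the
    # harmonic amplitude, i.e. one hundredth of the energy.
    harmonic = [1.0, -1.0] * 50
    noise = [0.1, -0.1] * 50
    print(f"HNR: {hnr_db(harmonic, noise):.1f} dB")
```

More noise relative to the periodic component lowers the value, which matches the slide's reading of low HNR as indicative of disorder.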

[Figure: spectra showing harmonic peaks and the noise ‘floor’; x-axis: frequency, y-axis: amplitude. From Hillenbrand et al. (1996)]

First harmonic amplitude

Prominent Cepstral Peak

Spectral Tilt: Voice Source

Spectral Tilt: Radiated Sound

Peak/average amplitude ratio

From Hillenbrand et al. (1996)

WMU Graduate Students

Tense/Pressed/Effortful/Strained Voice

Perceptual Description
• Sense of effort in production

Physiologic Factors
• Longer closed phase
• Reduced airflow

Potential Acoustic Consequences
• Change in harmonic (periodic) energy
  – Flatter harmonic roll off

[Figure: spectral tilt for pressed and breathy voice]

Acoustic Basis of Vocal Effort

[Figure: scatterplot (dependent variable: effort). Axis labels: effort; Regression Adjusted (Press) Predicted Value; both scaled 100 to 500. Annotations: F0 + RMS + Open Quotient; Perception of Effort]

Tasko, Parker & Hillenbrand (2008)

Roughness

• Perceptual Description
  – Perceived cycle-to-cycle variability in voice

• Physiologic Factors
  – Vocal folds vibrate, but in an irregular way

• Potential Acoustic Consequences
  – Cycle-to-cycle variations in F0 and amplitude
  – Elevated jitter
  – Elevated shimmer

Period/frequency & amplitude variability

• Jitter: variability in the period of each successive cycle of vibration

• Shimmer: variability in the amplitude of each successive cycle of vibration

Jitter and Shimmer

Sources of jitter and shimmer

• Small structural asymmetries of vocal folds

• “Material” on the vocal folds (e.g. mucus)

• Biomechanical events, such as raising/lowering the larynx in the neck

• Small variations in tracheal pressures

• “Bodily” events – system noise

Measuring jitter and shimmer

• Variability in measurement approaches

• Variability in how measures are reported

• Jitter
  – Typically reported as % or msec
  – Normal ~ 0.2 - 1%

• Shimmer
  – Can be % or dB
  – Norms not well established
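Since measurement approaches vary, the sketch below picks one common formulation as an assumption: "local" jitter and shimmer, taken as the mean absolute difference between successive cycles divided by the mean value, plus shimmer in dB as the mean absolute level difference between successive cycles. The inputs are assumed to be already-extracted cycle periods and peak amplitudes; the example numbers are made up.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def _local_perturbation_percent(values):
    """Mean absolute difference of successive values, as % of the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * _mean(diffs) / _mean(values)


def local_jitter_percent(periods):
    """Cycle-to-cycle period variability (%), from a list of periods."""
    return _local_perturbation_percent(periods)


def local_shimmer_percent(amplitudes):
    """Cycle-to-cycle amplitude variability (%), from a list of peak amplitudes."""
    return _local_perturbation_percent(amplitudes)


def local_shimmer_db(amplitudes):
    """Mean absolute level difference (dB) between successive cycle amplitudes."""
    if len(amplitudes) < 2:
        raise ValueError("need at least two cycles")
    steps = [abs(20.0 * math.log10(b / a)) for a, b in zip(amplitudes, amplitudes[1:])]
    return _mean(steps)


if __name__ == "__main__":
    periods_ms = [5.00, 5.02, 4.99, 5.01, 5.00]   # hypothetical cycle periods
    amplitudes = [1.00, 0.97, 1.02, 0.99, 1.01]   # hypothetical peak amplitudes
    print(f"Jitter:  {local_jitter_percent(periods_ms):.2f} %")
    print(f"Shimmer: {local_shimmer_percent(amplitudes):.2f} %")
    print(f"Shimmer: {local_shimmer_db(amplitudes):.2f} dB")
```

A perfectly regular voice would give zero on all three measures; values from different formulations or software are not directly comparable.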

Vocal Register

What is a vocal register?

Vocal Registers

Pulse (Glottal fry)
– 30-80 Hz, mean ~ 60 Hz
– Closed phase very long (90% of cycle)
– May see biphasic pattern of vibration (open, close a bit, open and close completely)
– Low subglottal pressure (2 cm water)
– Energy dies out over the course of a cycle, so parts of the cycle have very little energy
– Hear each individual cycle

Vocal Registers

Modal
– VF are relatively short and thick
– Reduced VF stiffness
– Large amplitude of vibration
– Possesses a clear closed phase
– The result is a voice that is relatively loud and low in pitch
– Average values cited refer to modal register

Vocal Registers

Falsetto
– 500-1100 Hz (275-600 Hz males)
– VF are relatively long and thin
– Increased VF stiffness
– Small amplitude of vibration
– Vibration less complex
– Incomplete closure (no closed phase)
– The result is a voice that is high in pitch