Pitch Perception. Objective To understand the method (s) by which the auditory system processes a...

Post on 16-Dec-2015


Pitch Perception

Objective

• To understand the method(s) by which the auditory system processes a sound in order to determine its pitch.

• Audible range: 20 Hz – 20 kHz

• The pitch of a sound refers to its perceived tonal height and is subjective; it requires the listener to make a perceptual judgement

• Variations in pitch create a sense of melody

Measuring pitch

• A method sometimes used to assign a pitch to a sound objectively:

• the listener adjusts the frequency of a comparison sound of known, variable frequency and similar timbre until the two sounds are perceived as equal in pitch.

• This method gives the pitch as a frequency in Hertz (Hz).

• A complex sound is a sound containing more than one frequency component.

• The sound is harmonic if the frequency components occur at integer multiples of the frequency of a common (though not always present) fundamental component.

• The waveform of a harmonic sound repeats periodically at a rate equal to the frequency of the fundamental component.
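The link between harmonic components and waveform periodicity can be checked numerically. The following is a minimal sketch, not taken from the slides: the fundamental of 262 Hz (roughly C4) and the harmonic amplitudes are assumed values chosen only for illustration.

```python
import math

def harmonic_complex(t, f0, amplitudes):
    """Sum of sinusoids at integer multiples of f0.

    amplitudes[k] scales harmonic number k + 1.
    """
    return sum(a * math.sin(2 * math.pi * (k + 1) * f0 * t)
               for k, a in enumerate(amplitudes))

f0 = 262.0                          # assumed fundamental (roughly C4)
amplitudes = [1.0, 0.5, 0.3, 0.2]   # hypothetical harmonic amplitudes
period = 1.0 / f0                   # one period of the fundamental

# The waveform has the same value one fundamental period later.
for t in (0.0, 0.0007, 0.0021):
    now = harmonic_complex(t, f0, amplitudes)
    later = harmonic_complex(t + period, f0, amplitudes)
    print(f"t={t:.4f} s  x(t)={now:+.6f}  x(t+T)={later:+.6f}")
```

Whatever amplitudes are chosen, the two printed columns agree, because every harmonic completes a whole number of cycles in one fundamental period.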

[Figure: waveform (upper panel, amplitude against time, 0.15 to 0.2 s) and spectrum (lower panel, frequency axis marked at 262, 523, 786, 1048 and 1310 Hz, up to 2000 Hz)]

• Examples of harmonic sounds are the notes produced from musical instruments such as the violin, oboe and flute.

• Very clear sense of pitch

• Diagram: periodic waveform (upper) and spectrum (lower) of an oboe playing C4

• Repetition rate of the waveform

• Fundamental component in the frequency spectrum

Pitch of a pure tone

• If the sound is a pure tone the perceived pitch generally corresponds to and varies with the frequency of the tone.

• Its pitch varies also to some extent with level (asa trk 28).

• What duration must a tone be, before it gives a sense of pitch (i.e. from ‘click’ to ‘tone’)? (asa trk 29)

• Pitch of a sound may be influenced by the presence of another sound (asa trk 30)

The pitch of harmonic sounds

• The pitch of a harmonic sound is found to vary mainly with changes in the fundamental frequency and so this is used as a measure of the pitch.

• Harmonic sounds - clearly defined sense of pitch

• Pitch of the ‘missing fundamental’

• A pitch corresponding to the fundamental frequency may be perceived when it has been removed from the sound (asa trk 37)
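One way to see why a pitch at the fundamental can survive its removal is that the waveform still repeats at the fundamental rate. The sketch below is an illustration under assumptions, not part of the original slides: it uses a 16 kHz sample rate, equal-amplitude components, and a simple normalised autocorrelation to find the repetition rate of a complex with and without its 200 Hz component.

```python
import math

FS = 16000  # sample rate in Hz (assumed for this sketch)

def complex_tone(freqs_hz, n_samples, fs=FS):
    """Equal-amplitude sum of sinusoids, returned as a list of samples."""
    return [sum(math.sin(2 * math.pi * f * n / fs) for f in freqs_hz)
            for n in range(n_samples)]

def normalised_autocorrelation(x, lag, window):
    """Correlation between x[0:window] and the same span shifted by lag."""
    a = x[:window]
    b = x[lag:lag + window]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den

def repetition_rate(freqs_hz, max_lag=120, window=640):
    """Rate (Hz) of the lag, up to max_lag samples, at which the waveform best matches itself."""
    x = complex_tone(freqs_hz, window + max_lag)
    best_lag = max(range(1, max_lag + 1),
                   key=lambda lag: normalised_autocorrelation(x, lag, window))
    return FS / best_lag

print(repetition_rate([200, 400, 600, 800]))  # fundamental present
print(repetition_rate([400, 600, 800]))       # fundamental removed
```

Both calls give 200 Hz: removing the 200 Hz component changes the spectrum but not the 5 ms repetition period of the waveform.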

Theories of Pitch Perception

• There are two important theories of how the auditory system is believed to code the pitch of a sound: the place theory and the temporal theory.

• The place theory is based on the fact that different frequency components of the input sound stimulate different places along the basilar membrane and in turn auditory nerve fibres with different characteristic frequencies.

Place theory of pitch perception

• The pitch of the sound is assumed to be related to the excitation pattern it produces on the BM.

• The pitch of a pure tone may be explained by the position of maximum excitation.

• For a sound made up of many frequency components, many different maxima occur along the basilar membrane at places corresponding to the frequencies of the components.

• The position of the overall maximum, or the position of the maximum due to the lowest frequency component may not correspond to the perceived pitch of the sound.

• It is known that the pitch of a harmonic sound can remain the same even when energy at its fundamental frequency has been removed.

• This cannot be explained by the place theory

Temporal theory of pitch perception

• The waveform of a sound with a strong unambiguous pitch is periodic.

• The basis for the temporal theory of pitch perception is the timing of neural firings, which occur in response to vibrations on the basilar membrane.

• Nerve firings occur at particular phases of the waveform; a process called phase locking.

Temporal theory

• Due to phase locking the time intervals between the successive firings occur at approximately integer multiples of the period of the waveform.

• In this way the waveform periodicity that occurs at each place on the basilar membrane is coded.

• At some point in the auditory system these time intervals have to be measured.
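The relation between phase locking and inter-spike intervals can be illustrated with a toy simulation. This is a sketch under stated assumptions, not a physiological model: the hypothetical fibre fires near a fixed phase of the waveform, skips cycles at random (the firing probability of 0.4 is assumed), and has a small bounded timing jitter of 0.1 ms.

```python
import random

def phase_locked_spike_times(freq_hz, n_cycles, fire_prob=0.4,
                             jitter_s=0.0001, seed=0):
    """Spike times locked to one phase per cycle, with random skipped cycles."""
    rng = random.Random(seed)
    period = 1.0 / freq_hz
    spikes = []
    for cycle in range(n_cycles):
        if rng.random() < fire_prob:          # the fibre does not fire on every cycle
            spikes.append(cycle * period + rng.uniform(-jitter_s, jitter_s))
    return spikes

def intervals_in_periods(spikes, freq_hz):
    """Intervals between successive spikes, expressed in waveform periods."""
    period = 1.0 / freq_hz
    return [(b - a) / period for a, b in zip(spikes, spikes[1:])]

spikes = phase_locked_spike_times(200.0, n_cycles=200)
ratios = intervals_in_periods(spikes, 200.0)
print([round(r, 2) for r in ratios[:10]])  # each value lies close to a whole number
```

Each interval is close to 1, 2, 3, ... periods of the 200 Hz waveform, so a histogram of intervals would show peaks at multiples of 5 ms.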

Temporal theory

• The precision with which the nerve firings are linked to a particular phase in the waveform declines at high frequencies: upper limit of ~ 4-5 kHz.

• The ability to perceive pitches of sounds with fundamental frequencies greater than 5 kHz cannot be explained by this theory.

• It has been found that musical interval and melody perception deteriorates for sounds with fundamental frequencies greater than 5 kHz, although differences in frequency can still be heard.

• Sounds produced by musical instruments, the human voice and most everyday sounds have fundamental frequencies below 5 kHz.

In sum

• The auditory processing parts of the brain are supplied with information concerning the place of stimulation on the basilar membrane (place theory) and neural firing patterns (temporal theory).

• The importance of both types of information may depend on the frequencies present and the type of sound.

• Place coding may dominate for frequencies above 5 kHz where phase locking is reduced - below this temporal information may be dominant.

Frequency discrimination

• The ability to detect changes in frequency over time.

• Difference limen for frequency (DLF) – the smallest detectable change in frequency

• For a pair of pure tones the listener judges whether the second tone is higher or lower in pitch than the first. The DLF is the frequency separation for which there is a certain percentage of correct responses, e.g. 75%.

• Frequency discrimination of pure tones (asa trk 33)

Frequency discrimination

• DLF varies from person to person

• The DLF has been found to depend on frequency, level, duration, suddenness of the frequency change and musical training

• A large increase in DLF for frequencies > 4-5 kHz

• The DLF decreases (discrimination improves) with increasing duration (at least up to 200 ms, Moore 2003)

• The DLF decreases (discrimination improves) with increasing level.

Models of pitch perception

• Models to account for pitch perception in complex tones

• more than one frequency component, and hearing a pitch corresponding to the missing fundamental.

• Pattern recognition models – frequency analysis to determine individual components present, pattern recogniser to determine the pitch from the components

Pattern recognition models

• The pattern recogniser attempts to find the fundamental frequency that corresponds to the harmonics detected – template matching process.

• Some models of this type are:

• Goldstein, J. L., 1973. “An optimum processor theory for the central formation of the pitch of complex tones.” J. Acoust. Soc. Am., 54(6), 1496-1516

Pattern recognition models

• Terhardt, E., G. Stoll and M. Seewann 1982. “Algorithm for the extraction of pitch and pitch salience from complex tonal signals.” J. Acoust. Soc. Am., 71(3), 679-687

• Terhardt, E., G. Stoll and M. Seewann 1982. “Pitch of complex signals according to virtual-pitch theory: Tests, examples, and predictions.” J. Acoust. Soc. Am., 71(3), 671
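The template-matching idea can be sketched in a few lines. The code below is a deliberately simple illustration and is not the algorithm of Goldstein or of Terhardt et al.: the function names, the candidate range of 50 to 1000 Hz in 1 Hz steps, and the mismatch measure (mean distance of each component from the nearest harmonic number) are all assumptions made for this example.

```python
def harmonic_mismatch(components_hz, f0):
    """Mean distance of each component from the nearest harmonic of f0,
    measured in units of harmonic number."""
    total = 0.0
    for f in components_hz:
        n = max(1, round(f / f0))      # nearest harmonic number, at least 1
        total += abs(f / f0 - n)
    return total / len(components_hz)

def estimate_f0(components_hz, candidates_hz=range(50, 1001)):
    """Candidate fundamental whose harmonic template best fits the components.

    Ties (for example between 200 Hz and its subharmonic 100 Hz) are
    resolved in favour of the highest candidate.
    """
    return min(candidates_hz,
               key=lambda f0: (harmonic_mismatch(components_hz, f0), -f0))

print(estimate_f0([200, 400, 600, 800]))   # fundamental present
print(estimate_f0([400, 600, 800]))        # fundamental missing
print(estimate_f0([1800, 2000, 2200]))     # only high harmonics present
```

All three inputs yield 200 Hz, which is the behaviour a pattern recogniser needs in order to account for the pitch of the missing fundamental.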

Model for pitch perception in complex tones

• Some models of pitch perception combine both place and temporal information

• Model proposed by Moore (2003) in earlier editions of his book

[Block diagram: Bank of bandpass filters → Neural transduction → Analysis of spike intervals → Combine intervals across CFs → Most common interval → Pitch]

• Bank of bandpass filters – spectral analysis on the input sound

• filters organised according to their centre frequencies, representing tonotopic organisation of frequencies on the BM

• Each filter may be thought of as representing the frequency response of one point on the BM.

• bandwidths according to the ERB

• Neural transduction – represents mechanical to neural transduction at the synapse between the hair cell and the auditory nerve fibre

• output stream of spike events precisely located in time

• to represent the signal produced by the auditory nerve fibres

• reflects the waveform structure produced at each point on the BM

• Analysis of intervals – periodic output of filter channel – find period / frequencies present in each channel

• Compare time intervals across channels – for a harmonic sound the lowest common interval is that of the fundamental frequency

• Graph represents waveform at 4 points – first 4 harmonics for a 200 Hz fundamental

[Figure: four panels showing the 200, 400, 600 and 800 Hz waveforms (first 4 harmonics) against time (s), 0 to 0.025 s, with one period marked in each panel: 0.005, 0.0025, 0.0016 and 0.0013 s]

Harmonic (Hz)   Period and its multiples (s)
200             0.005
400             0.0025   0.005
600             0.00167  0.0033   0.005
800             0.00125  0.0025   0.00375  0.005

• Intervals between successive firings for a complex sound consisting of the frequency components 200, 400, 600 and 800 Hz.

• Intervals between successive nerve firings indicate the period of each individual harmonic, when sufficiently resolved.

• Lowest common time interval at 0.005s – corresponds to the 200 Hz fundamental frequency

• Computer implementation of the above model:

• Meddis, R. and L. O’Mard. 1997. “A unitary model of pitch perception.” J. Acoust. Soc. Am., 102 (3), 1811-1820
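The interval comparison across channels can be sketched as follows. This is a toy illustration of the "lowest common interval" step only, not the Meddis and O'Mard implementation: it assumes each channel is perfectly resolved, lists idealised intervals (multiples of each component's period) up to an assumed maximum of 20 ms, and matches them within an assumed tolerance of 1 microsecond.

```python
def channel_intervals(freq_hz, max_interval_s):
    """Idealised intervals in one channel: multiples of the component's period."""
    period = 1.0 / freq_hz
    count = int(max_interval_s / period + 1e-9)
    return [k * period for k in range(1, count + 1)]

def lowest_common_interval(freqs_hz, max_interval_s=0.02, tol_s=1e-6):
    """Smallest interval that appears (within tol_s) in every channel."""
    per_channel = [channel_intervals(f, max_interval_s) for f in freqs_hz]
    for candidate in per_channel[0]:            # already in ascending order
        if all(any(abs(candidate - x) <= tol_s for x in channel)
               for channel in per_channel[1:]):
            return candidate
    return None                                  # no shared interval found

interval = lowest_common_interval([200, 400, 600, 800])
print(interval, "s ->", 1.0 / interval, "Hz")

# The shared interval is unchanged when the 200 Hz component is absent.
print(lowest_common_interval([400, 600, 800]))
```

For the four components in the table the shared interval is 0.005 s, corresponding to 200 Hz, and the same interval is found when only 400, 600 and 800 Hz are present.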

Pitch organisation in WTM

• The pitches of notes in Western tonal music (WTM) are most often tuned according to the system of equal temperament.

• Equal tempered tuning was formed out of a requirement for equally spaced intervals in terms of frequency ratio regardless of tonality (the musical key).

• Two notes are an octave apart if their frequencies are in the ratio 2:1.

• In equal tempered tuning the octave is divided into twelve equal logarithmic steps called semitones.

• The fundamental frequencies of adjacent semitones differ by a factor of 2^(1/12) (approximately 1.0595).

• The semitone may be subdivided into ‘cents’.

• There are 100 cents in a semitone and therefore 1200 cents in an octave.

• The ratios of all equal tempered musical intervals (except for the unison and the octave) match approximately the ratios of the corresponding pure tone intervals in the harmonic series; the octave and the unison match exactly

• Only the octave and unison intervals can be described in terms of small integer ratios.

• All other intervals are tempered slightly from the small integer ratio (e.g. the fifth is tempered slightly less than pure, the fourth is slightly greater than pure and the major third is greater than pure).
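The size of the tempering can be worked out directly from the definitions above. In this short sketch the pure ratios 3:2 (fifth), 4:3 (fourth) and 5:4 (major third) are standard values that the slides do not state explicitly, and the function names are chosen for this example.

```python
import math

def equal_tempered_ratio(semitones):
    """Frequency ratio of an interval spanning the given number of semitones."""
    return 2 ** (semitones / 12)

def cents(ratio):
    """Size of a frequency ratio in cents (1200 cents per octave)."""
    return 1200 * math.log2(ratio)

# interval name: (size in semitones, pure small-integer ratio)
INTERVALS = {
    "major third": (4, 5 / 4),
    "fourth": (5, 4 / 3),
    "fifth": (7, 3 / 2),
    "octave": (12, 2 / 1),
}

for name, (semitones, pure_ratio) in INTERVALS.items():
    tempered = cents(equal_tempered_ratio(semitones))
    pure = cents(pure_ratio)
    print(f"{name:12s} tempered {tempered:7.2f}  pure {pure:7.2f}  "
          f"difference {tempered - pure:+6.2f} cents")
```

The differences come out at about -2 cents for the fifth, +2 cents for the fourth and roughly +14 cents for the major third, with the octave matching exactly.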

Pitch in two dimensions

• Pitch perception in music is often thought of in two dimensions, pitch height and pitch chroma (Shepard, 1964). (asa trk 52)

• This is to account for the perceived similarity of pitches that are separated by octaves.

• Pitch height is the low / high dimension of pitch.

• The relative position of a pitch within a given octave is referred to as its chroma.

• In Western music theory this is indicated by the octave equivalent pitch classes, C, C# etc.

• Music notes are identified first by their position within the octave, their chroma, and then by the octave in which they are placed (e.g. G3, F#6).
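The two dimensions map naturally onto note names. The sketch below splits a name such as G3 or F#6 into chroma and octave, and converts it to an equal tempered frequency. It assumes sharps only, single-digit octave numbers, and a reference tuning of A4 = 440 Hz, none of which is specified in the slides.

```python
CHROMAS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def split_note(name):
    """Split a note name such as 'F#6' into (chroma, octave)."""
    chroma, octave = name[:-1], int(name[-1])
    if chroma not in CHROMAS:
        raise ValueError(f"unknown chroma: {chroma!r}")
    return chroma, octave

def semitones_above_c0(name):
    """Pitch height as a count of semitones above C0."""
    chroma, octave = split_note(name)
    return 12 * octave + CHROMAS.index(chroma)

def frequency_hz(name, a4_hz=440.0):
    """Equal tempered frequency, relative to an assumed A4 reference."""
    steps = semitones_above_c0(name) - semitones_above_c0("A4")
    return a4_hz * 2 ** (steps / 12)

for note in ("G3", "G4", "F#6", "C4"):
    chroma, octave = split_note(note)
    print(f"{note:4s} chroma={chroma:2s} octave={octave}  {frequency_hz(note):8.2f} Hz")
```

G3 and G4 share a chroma but differ in height by 12 semitones, which is a frequency ratio of 2:1; C4 comes out near 261.6 Hz under the assumed reference.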