Neural Processing of Amplitude-Modulated Sounds

P. X. JORIS, C. E. SCHREINER, AND A. REES

Laboratory of Auditory Neurophysiology, Division of Neurophysiology, K.U. Leuven, Leuven, Belgium; Coleman Laboratory, Department of Otolaryngology, Keck Center for Integrative Neuroscience, University of California at San Francisco, San Francisco, California; and School of Neurology, Neurobiology, and Psychiatry, The Medical School, University of Newcastle upon Tyne, Newcastle upon Tyne, United Kingdom

I. Temporal Dimensions of Sound
II. Human Sensitivity to Amplitude Modulation
III. Neural Response Measures
IV. Auditory Nerve: Bottleneck to the Central Nervous System
    A. Basic auditory nerve properties
    B. Average response rate and magnitude of synchronization
    C. Phase of synchronization
V. Cochlear Nucleus: Parallel Channels
    A. Basic organization of the CN
    B. AM responses of neuronal types in the CN
VI. Superior Olivary Complex: An Example of Time-to-Rate Conversion
VII. The Nuclei of the Lateral Lemniscus
VIII. Amplitude Modulation Encoding in the Inferior Colliculus: A Center for Convergence
    A. Basic organization of the IC
    B. Modulation transfer functions for IC units: synchronization
    C. Modulation transfer functions for IC units: average rate
    D. What determines the MTF upper limit in the IC?
    E. Is AM encoded in the IC by rate or synchronization?
    F. Relationship between AM responses and other neuronal properties
    G. Is modulation frequency represented topographically in the IC?
    H. Responses to interaural time disparities in modulation envelopes
    I. Contribution of nonlinearities
IX. Amplitude Modulation Encoding in Auditory Thalamus and Cerebral Cortex
    A. Basic layout of the thalamocortical system
    B. Temporal responses in the MGB
    C. Responses to AM in primary auditory cortex: synchronization
    D. Responses to AM in primary auditory cortex: average rate
    E. Responses to AM in primary auditory cortex: influence of modulation parameters
    F. Differences of temporal coding between cortical fields
    G. Cortical mechanisms
    H. Temporal coding of complex sounds
    I. Plasticity of temporal coding properties in auditory cortex
X. Neurophysiological and Psychological Studies in Humans
XI. Conclusion

Joris, P. X., C. E. Schreiner, and A. Rees. Neural Processing of Amplitude-Modulated Sounds. Physiol Rev 84: 541–577, 2004; 10.1152/physrev.00029.2003.

Amplitude modulation (AM) is a temporal feature of most natural acoustic signals. A long psychophysical tradition has shown that AM is important in a variety of perceptual tasks, over a range of time scales. Technical possibilities in stimulus synthesis have reinvigorated this field and brought the modulation dimension back into focus. We address the question of whether specialized neural mechanisms exist to extract AM information, and thus whether consideration of the modulation domain is essential in understanding the neural architecture of the auditory system. The available evidence suggests that this is the case. Peripheral neural structures not only transmit envelope information in the form of neural activity synchronized to the modulation waveform but are often tuned so that they only respond over a limited range of modulation frequencies. Ascending the auditory neuraxis, AM tuning persists but increasingly takes the form of tuning in average firing rate, rather than synchronization, to modulation frequency. There is a decrease in the highest modulation frequencies that influence the neural response, either in average rate or synchronization, as one records at higher and higher levels along the neuraxis. In parallel, there is an increasing tolerance of modulation tuning for other stimulus parameters such as sound pressure level, modulation depth, and type of carrier. At several anatomical levels, consideration of modulation response properties assists the prediction of neural responses to complex natural stimuli. Finally, some evidence exists for a topographic ordering of neurons according to modulation tuning. The picture that emerges is that temporal modulations are a critical stimulus attribute that assists us in the detection, discrimination, identification, parsing, and localization of acoustic sources and that this wide-ranging role is reflected in dedicated physiological properties at different anatomical levels.

www.prv.org. 0031-9333/04 $15.00 Copyright © 2004 the American Physiological Society

I. TEMPORAL DIMENSIONS OF SOUND

Among the sensory systems, audition excels in its speed of operation. This is perhaps not too surprising, since our entire sense of hearing depends on the analysis of rapid changes in acoustic pressure at the two ears. The importance of the temporal dimension is manifest in many structural and functional specializations, starting at the peripheral sense organ and carried through the subsequent stages in the central nervous system. The striking sensitivity of auditory structures to temporal features of the acoustic stimulus has been observed since the earliest electrophysiological recordings, and this sensitivity is equally prominent in behavioral observations of humans and experimental animals.

Importantly, there are multiple temporal dimensions in acoustic stimuli (238). It is useful to distinguish “fine-structure” and “envelope” as two components of a time waveform. The fast pressure variations that determine the spectral content constitute the fine-structure. This fine-structure waxes and wanes in amplitude, and the contour of this amplitude modulation (AM) is the envelope. For example, the waveform of a speech utterance shows bursts of energy that correspond to phonemes. The temporal characteristics of these bursts carry much information (44, 108, 214, 265, 272, 281), but their dominant modulation frequency is rather slow (typically 3–4 Hz, extending up to ~20 Hz) vis-à-vis the temporal capabilities of the peripheral auditory system. Faster modulations of several hundred hertz are also very common, e.g., in segments of voiced speech where they are perceptually associated with voice pitch. These envelope components arise from interactions between fine-structure components and are not present as such, i.e., as acoustic energy, in the waveform. This is illustrated by the superposition of two sine waves, equal in amplitude but separated by a small difference frequency ($f_d$): constructive and destructive interference of the two components generates AM in the form of “beating” at frequency $f_d$. The same principle extends to environmental sound sources, which commonly produce quasi-periodic signals consisting of a range of frequency components (harmonics) that are multiples of a fundamental frequency: the combination of even a limited number of components, e.g., within a cochlear filter, reconstitutes the fundamental frequency in the form of a temporal envelope modulation. (For examples of spectrograms, waveforms, and treatment of AM, see Refs. 99, 100, 177, 180, 302.)
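The beating example can be made explicit with a sum-to-product identity. As a sketch, assume two unit-amplitude sine components at frequencies $f_1$ and $f_2 = f_1 + f_d$ (the symbols $f_1$ and $f_2$ are introduced here only for illustration):

$$\sin(2\pi f_1 t) + \sin(2\pi f_2 t) = 2\cos(\pi f_d t)\,\sin\!\left(2\pi\,\frac{f_1 + f_2}{2}\,t\right)$$

The slowly varying factor $2\cos(\pi f_d t)$ acts as the amplitude of a tone at the mean frequency. Its magnitude $|2\cos(\pi f_d t)|$ repeats every $1/f_d$ seconds, so the envelope fluctuates at rate $f_d$ even though the spectrum contains energy only at $f_1$ and $f_2$.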

The laboratory stimulus most often used in physiological studies of modulation is a pure tone (sinusoid) modulated by another tone. Figure 1A and Equation 1 represent the waveform $s(t)$ of a tone with frequency $f_c$ (the carrier), whose amplitude is modulated by a lower frequency $f_m$ (the modulator) at a modulation depth $m$ ($0 \le m \le 1$)

$$s(t) = [1 + m \sin(2\pi f_m t)]\,\sin(2\pi f_c t) \tag{1}$$

For $f_c \gg f_m$ the first term $[1 + m \sin(2\pi f_m t)]$ is the time-varying amplitude or envelope (see footnote 1). Using trigonometric identities, $s(t)$ can be rewritten as the sum of three components at $f_c$ and at $f_c \pm f_m$ (the upper and lower sidebands)

$$s(t) = \sin(2\pi f_c t) + \frac{m}{2}\,[\cos 2\pi(f_c - f_m)t - \cos 2\pi(f_c + f_m)t] \tag{2}$$

This signal does not contain energy at $f_m$ (Fig. 1, A and B); the modulation in the time waveform is due to the interaction of the components in the signal, which are separated by a difference frequency $f_m$.
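The decomposition into a carrier and two sidebands can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the 20-kHz sampling rate are assumptions made for illustration, while the stimulus parameters (1,000-Hz carrier, 100-Hz modulator, 100% depth, 50 ms) follow Figure 1.

```python
import math

def sam_tone(fc, fm, m, fs, duration):
    """Sample s(t) = [1 + m sin(2*pi*fm*t)] sin(2*pi*fc*t), as in Equation 1."""
    n = int(round(fs * duration))
    return [(1.0 + m * math.sin(2.0 * math.pi * fm * k / fs))
            * math.sin(2.0 * math.pi * fc * k / fs) for k in range(n)]

def amplitude_at(signal, freq, fs):
    """Amplitude of the sinusoidal component at `freq` (single-frequency DFT)."""
    n = len(signal)
    re = sum(x * math.cos(2.0 * math.pi * freq * k / fs) for k, x in enumerate(signal))
    im = sum(x * math.sin(2.0 * math.pi * freq * k / fs) for k, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

fs = 20000.0  # sampling rate in Hz (assumed for this sketch)
s = sam_tone(fc=1000.0, fm=100.0, m=1.0, fs=fs, duration=0.05)

carrier = amplitude_at(s, 1000.0, fs)   # component at fc
lower = amplitude_at(s, 900.0, fs)      # lower sideband, fc - fm
upper = amplitude_at(s, 1100.0, fs)     # upper sideband, fc + fm
at_fm = amplitude_at(s, 100.0, fs)      # component at the modulation frequency
sideband_db = 20.0 * math.log10(carrier / upper)  # carrier-to-sideband level difference

print(carrier, lower, upper, at_fm, sideband_db)
```

With these parameters every component completes a whole number of cycles in the 50-ms window, so the single-frequency estimates are not smeared by spectral leakage. The sidebands come out at half the carrier amplitude (about 6 dB down), and the component at $f_m$ is zero to within rounding error.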

The sinusoidal AM stimulus is special because its envelope consists of a single sinusoidal component. In real-world stimuli, a range of modulations is usually present, which can be summarized by the modulation spectrum: the distribution of modulation energy for the whole waveform or for a selected band of carrier frequencies in the waveform. The subjectively experienced quality of a modulated signal depends on modulation frequency so that the modulation spectrum also defines different perceptual ranges (see sect. II).

Footnote 1: The relationship of $m$ to the waveform is the same as that of the Rayleigh or Michelson contrast ratio used in vision research: $m$ equals the difference between the maximum and minimum luminance divided by their sum.

The impetus in early physiological studies to use modulated stimuli (57, 62, 78, 183, 196) was a desire to go beyond the arsenal of simple stimuli (pure tones, clicks, noise) that dominated much of the research at that time. Somewhat similar to gratings in the visual domain, AM and frequency modulation (FM) were regarded as elementary features of natural stimuli, which could reveal dynamic properties of the auditory system not addressed with simpler stimuli. Interest in responses to AM was rekindled in the 1980s and 1990s through a convergence of different lines of research concerned with the “dynamic range problem,” speech coding, pitch, and spatial localization of high-frequency sounds, among others. However, AM signals are more than just a convenient laboratory tool to study a diversity of psychophysical and physiological phenomena. The question that we are concerned with here is whether envelope processing is embedded in the auditory system, as may be expected from the ecological prominence of envelopes.

Given the theory of natural selection, one can assume that animals are well adapted to their specific acoustic environment and that the statistical structure of the natural auditory environment or the “acoustic ecology” (5) is reflected in the structure and function of the auditory system. Acoustic ecology can be defined as the total ensemble of sounds present in an animal’s environment, from inanimate as well as biological sources. Indeed, the auditory systems of acoustically specialized animals have revealed the existence of highly developed adaptations. Prominent examples include the echolocation system of bats (e.g., Ref. 61), the mating call detection system in frogs (245), and the alarm call differentiation in vervet monkeys (275). Common to these examples is that particular behaviors are elicited by a small set of signals with specific, fairly invariant acoustic properties. Characterization of these lower order physical sound attributes led to the discovery of special neuronal mechanisms.

Relatively little work has been done on the quantitative analysis of amplitude modulation statistics in acoustic ecologies and their consequences for neuronal processing. Not only overtly specialized but all animals are likely to exploit consistencies in statistical properties of the acoustical environment. Nelken et al. (194) found that low-frequency amplitude modulations are prominent in natural environments and are often coherent over different frequency regions, and may be exploited by the auditory system in signal detection. Voss and Clarke (288) computed temporal correlations of music passages and discovered a $1/f$ scaling relation over a few decades. More recently, Attias and Schreiner (6) decomposed music, speech, and animal vocalizations into narrow-band frequency channels and studied the statistics of the amplitude and phase distributions for each channel. They also found a distribution of modulation frequencies following a power-law, indicating that the amplitude modulation statistics of natural sound are non-Gaussian, cover a wide range of modulation frequencies, and scale universally, i.e., the frequency dependence is similar over different frequency ranges. Using a mutual information metric between stimulus and spike trains, it was also found (7) that neurons in the cat inferior colliculus are more efficient at coding naturalistic stimuli than nonnaturalistic stimuli: the information rate per spike for naturalistic stimuli was more than 60% higher than for nonnaturalistic signals. Similar results have been seen in the frog (232). This implies that neural processing is adapted and perhaps optimized for the encoding of naturally occurring modulation information.

FIG. 1. A: superimposed waveforms of an unmodulated 1,000-Hz tone (thin line) and the same tone sinusoidally amplitude modulated (AM) (thick line) at 100% with a modulation frequency of 100 Hz, according to Equation 1. Dashed lines indicate the envelope. The amplitude is referenced to the peak amplitude of the unmodulated tone. B: idealized spectrum of the AM tone in A. At 100% modulation, the amplitude of the sidebands is half that of the carrier, i.e., a difference of 6 dB. C: average response in the form of a poststimulus time (PST) histogram of a nerve fiber to the signal shown in A (stimulus duration, 50 ms). D: spectrum of the PST histogram in C. The components at carrier frequency ($f_c$) and $f_c \pm$ modulation frequency ($f_m$) indicate that there is phase-locking to the fine-structure of the stimulus waveform. The component at $f_m$ is prominently present in the response but is absent in the stimulus (B). The small circle on the ordinate indicates the average firing rate.

Our purpose is to review physiological mechanisms that may be important for the processing of temporal envelope information. We first briefly highlight findings from human psychophysics to illustrate some of the perceptual consequences of AM, but we refrain from a more substantial discussion of the relationship between physiological mechanisms and perception. Rather, our focus is on a simpler and more basic question, namely: within what limits is AM encoded by single auditory neurons, and does the form of encoding suggest that the temporal envelope dimension is a fundamental organizing principle in the auditory system, in the manner that tuning to orientation, direction, or spatial frequency is considered fundamental in vision?

For reasons of space, only occasional reference will be made to the extensive research in bats or nonmammalian vertebrates, even though AM is often an important feature in echolocation signals (156, 198, 258) and their study often preceded the research reviewed here.

II. HUMAN SENSITIVITY TO AMPLITUDE MODULATION

The ability of human listeners to detect and discriminate AM has been a topic of study since the 18th century. The earliest means of producing a sound with a fluctuating amplitude envelope was to mix two pure tones differing slightly in frequency to generate beats. Thomas Young and Helmholtz (287) both described the sensation of fluctuating amplitude experienced when listening to beats, and Helmholtz described the changing quality of the sound as the beat frequency was increased. He noted that “the ear easily follows slow beats of not more than 4 to 6 in a second” while at 30 beats/s it is still possible to hear the pulses of the tone, but it is no longer possible to hear them as distinct events and they have a “jarring and rough” quality.

With improvements in technology, subsequent studies (see Ref. 131 for historical review) extended and quantified these findings. Zwicker (324) showed that the threshold for detecting AM is very small at low modulation frequencies (threshold $m$ ~2% for $f_m$ of 1–4 Hz and $f_c$ of 1 kHz) and increases to a maximum with increasing $f_m$ ($m$ ~5% for $f_m$ of 32 Hz and $f_c$ of 250 Hz; and for $f_m$ of 125 Hz and $f_c$ of 4 kHz). Above this maximum, threshold decreases and falls below the values obtained at low modulation frequencies, but in this range subjects perceive the carrier and the modulation frequency as distinct tones. Zwicker (324) also determined that, for a given carrier, thresholds for the detection of AM and FM measured in terms of their modulation depths coincide on the upper side of the maximum at a modulation frequency he termed the Phasengrenzfrequenz. This led Zwicker to postulate that above the Phasengrenzfrequenz [now termed the critical modulation frequency (CMF) (250, 263)] the carrier and sideband components are analyzed in different critical bands (auditory filters), and thus subjects are not sensitive to differences in the relative phase of the modulation components that enable them to distinguish AM from FM below the CMF. More recent evidence suggests that the situation is more complex than this (180, 263), but nevertheless, it appears that when listening to AM imposed on pure tone carriers, detection may rely on spectral rather than temporal cues over some ranges of modulation frequency.

One means of eliminating spectral cues, and therefore estimating the temporal resolving power of the auditory system, is to measure the detection of sinusoidal modulation imposed on noise rather than a tonal carrier. The broadband spectrum of the noise prevents the listener from detecting the individual spectral components of the stimulus spectrum. The use of such stimuli (9, 285) demonstrated that the relationship between threshold and modulation frequency (the psychophysical temporal modulation transfer function) is essentially a low-pass function with a 3-dB cut-off around 50 Hz and a slope of ~4 dB/octave. The minimum threshold modulation depth is ~5% at low modulation frequencies (<10 Hz) where subjects detect the individual amplitude changes in the stimulus. The upper limit of modulation detection extends to ~2.2 kHz (68, 285, 286). As will become apparent later, this coincides with the very highest limits of neural phase-locking to envelopes obtained for some neurons in the auditory periphery in cats (Fig. 2, Refs. 127, 229) and exceeds the limit for phase-locking to envelopes in more central neurons. This raises questions as to the nature of modulation encoding in the central auditory system, even when one takes into account the encoding of modulations by changes in average rate that become apparent at more central sites.

Although, as Zwicker noted, a distinct pitch at the frequency of modulation is perceived when components of the stimulus spectrum can be resolved, weaker but nevertheless clear pitches are also perceived with modulations containing no resolved components (179, 233). Even modulations imposed on noise carriers can generate pitches which, though weaker than those generated with tonal stimuli, are able to support melody recognition (21, 22). Taken together, these findings demonstrate that the periodicity or residue pitches of some modulations must result solely from temporal analysis, but when resolved components are present, pitch salience is increased. Figure 2 schematically indicates the combinations of carrier and modulation frequencies resulting in the percepts of fluctuation, roughness, and residue pitch. (Sensitivity to binaural envelope disparities is discussed in section VI.)

Two competing models have been proposed to explain the detection of AM. The first consists of a bandpass filter and half-wave rectifier representing processing by the cochlea, followed by a low-pass filter (285). Some measure of the output of this filter provides the basis for the subject’s response (see Ref. 181 for discussion). In essence, therefore, this model is an envelope detector. The second scheme models the detection of modulation by a bank of bandpass filters that are sensitive to different ranges of modulation frequency. A channel or filterbank model of modulation analysis was first proposed by Kay and colleagues (84, 132) on the basis of adaptation studies with FM and AM. Subsequently, the adaptation paradigm was questioned (178, 289), but the concept of a modulation filterbank persists because studies using different psychophysical paradigms have since reported findings which support the concept of modulation frequency tuning. Evidence for such selectivity comes from modulation masking experiments (8, 107), and modulation detection interference (MDI), a phenomenon in which the detection of AM is influenced by modulation at the same frequency but on a very different carrier (318). Dau et al. (36) invoked a model consisting of a modulation filterbank associated with each auditory filter to account for the detection and masking of sinusoidally amplitude-modulated narrowband noise. The latter model was extended (283) to account for comodulation masking release, another phenomenon, like MDI, that indicates some element of modulation waveform analysis across different carrier frequencies (96) (see Ref. 180 for review). Such across-frequency interactions between similar modulation envelopes are likely to contribute to grouping and the construction of auditory images (90). Despite different lines of evidence favoring some form of modulation filterbank, the concept remains controversial, and the experimental findings discussed above do not concur in their estimates of the bandwidths for these putative channels.
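The first of the two models, the envelope detector, can be sketched in a few lines. This is an illustration under stated assumptions rather than the model of Ref. 285: the cochlear bandpass stage is omitted (the tone is assumed to sit inside a single auditory filter), the low-pass stage is a simple one-pole filter, and the 50-Hz cutoff, 4-kHz carrier, 40-kHz sampling rate, and all function names are hypothetical choices.

```python
import math

def envelope_detector(signal, fs, cutoff_hz):
    """Half-wave rectify, then smooth with a one-pole low-pass filter."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / fs)  # smoothing coefficient
    y, out = 0.0, []
    for x in signal:
        y += alpha * (max(x, 0.0) - y)  # rectification followed by leaky integration
        out.append(y)
    return out

def modulation_index(signal, fm, fs):
    """Amplitude of the component at fm divided by the mean level."""
    n = len(signal)
    re = sum(v * math.cos(2.0 * math.pi * fm * k / fs) for k, v in enumerate(signal))
    im = sum(v * math.sin(2.0 * math.pi * fm * k / fs) for k, v in enumerate(signal))
    return (2.0 * math.hypot(re, im) / n) / (sum(signal) / n)

def output_depth(fm, fc=4000.0, m=1.0, fs=40000.0, duration=1.0, cutoff_hz=50.0):
    """Modulation index at the detector output for a sinusoidal AM tone."""
    n = int(round(fs * duration))
    stim = [(1.0 + m * math.sin(2.0 * math.pi * fm * k / fs))
            * math.sin(2.0 * math.pi * fc * k / fs) for k in range(n)]
    out = envelope_detector(stim, fs, cutoff_hz)
    return modulation_index(out[n // 2:], fm, fs)  # second half only: skips the onset transient

depths = {fm: output_depth(fm) for fm in (10.0, 50.0, 200.0)}
print(depths)
```

In this sketch the output modulation shrinks as $f_m$ rises, which is the low-pass behavior the model is meant to capture. A one-pole filter rolls off at about 6 dB/octave, so the numbers are qualitative and are not a fit to psychophysical data.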

III. NEURAL RESPONSE MEASURES

In neurophysiology, one can generally think of a variety of ways in which stimulus features may be “encoded” and processed (208), and it is not immediately obvious which aspects of neuronal behavior are the most relevant for the perceptual task at hand. With few exceptions, the response measures used in studies of AM are average discharge rate (i.e., the number of spikes evoked over several modulation cycles), or some measure of synchronization of the timing of action potentials to the envelope waveform.

FIG. 2. Amplitude modulation (AM) stimuli generate different percepts that encompass several regions of modulation and carrier frequencies. At very low $f_m$, most strongly near 4 Hz and disappearing around 20 Hz, a sensation of fluctuation or rhythm is produced (hatched). The rate at which the temporal envelope of fluent speech varies is also typically 4 Hz (syllables/s). Fluctuation makes a smooth transition to a percept of roughness, which starts at ~15 Hz (bottom curved line), is strongest near 70 Hz, and disappears below 300 Hz (top curved line). Harmonic complex tones produce a pitch that corresponds to a frequency close to the fundamental frequency. However, the lower harmonics can be removed without affecting the pitch, resulting in “residue pitch” if $f_c$ and $f_m$ are chosen within the shaded region. Finally, small interaural time differences (ITD) can be detected between modulated stimuli to the two ears for a region of combinations of $f_m$ and $f_c$ that overlaps with the region for residue pitch (thick line). Note that these are regions in stimulus space where modulation is perceptually relevant, but the precise relationship of these percepts to physiological response modulation is usually unclear. For reference, the small dots indicate −10 dB cutoff values for modulation transfer functions (MTFs) of auditory nerve fibers (cf. Fig. 3C) [based on further analysis of data reported by Joris and Yin (127)]. Delineation of psychophysical regions is based on References 16, 104, 233, 278, 325. The ordinate is truncated at 4 Hz.

The earliest single-unit studies of peripheral auditory neurons already reported synchronization to the fine-structure of tones, in the sense that discharges occur at a particular phase of the cyclical waveform. For example, auditory nerve fibers have the striking capability to “phase-lock” to low-frequency tones up to several kilohertz [4–5 kHz in the cat (121), but the upper limit is species dependent (298)]. Phase-locking also occurs to the stimulus envelope; both forms of phase-locking are immediately apparent in the poststimulus time (PST) histogram (Fig. 1C) to the AM stimulus of Figure 1A. The fine spacing of peaks at intervals of 1 ms indicates phase-locking to the 1-kHz fine-structure; the grouping into broader peaks spaced by 10 ms indicates phase-locking to the 100-Hz envelope. In contrast to the stimulus spectrum (Fig. 1B), the response spectrum (Fig. 1D) shows energy at $f_m$, i.e., the AM signal is demodulated. Several cochlear nonlinearities with asymmetry between the positive and negative part of the transfer function can contribute to this demodulation, the most important being half-wave rectification in the relationship between displacement of hair cell stereocilia and receptor potential, and in the absence of negative firing rates (135). The response spectrum also shows a value at 0 Hz (Fig. 1D: small circle on ordinate) which equals the average firing rate. In this review, we will use the terms envelope synchronization and envelope phase-locking synonymously to refer to synchronization of the response to the stimulus envelope waveform, and use the term rate coding for changes in average firing rate during manipulation of the stimulus modulation parameters.
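Demodulation by an asymmetric nonlinearity can be illustrated with an ideal half-wave rectifier. The sketch below is a deliberately crude stand-in for the cochlear nonlinearities named above: no filtering, adaptation, or spike generation is simulated, and the variable names and sampling rate are assumptions.

```python
import math

def amplitude_at(signal, freq, fs):
    """Amplitude of the sinusoidal component at `freq` (single-frequency DFT)."""
    n = len(signal)
    re = sum(x * math.cos(2.0 * math.pi * freq * k / fs) for k, x in enumerate(signal))
    im = sum(x * math.sin(2.0 * math.pi * freq * k / fs) for k, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

fs, fc, fm, m = 20000.0, 1000.0, 100.0, 1.0
n = 1000  # 50 ms at 20 kHz, a whole number of cycles of fm and fc

# Sinusoidal AM tone (Equation 1) and its half-wave rectified version,
# used here as a toy proxy for instantaneous discharge probability.
stimulus = [(1.0 + m * math.sin(2.0 * math.pi * fm * k / fs))
            * math.sin(2.0 * math.pi * fc * k / fs) for k in range(n)]
response = [max(x, 0.0) for x in stimulus]

stim_fm = amplitude_at(stimulus, fm, fs)   # essentially zero: no energy at fm in the stimulus
resp_fm = amplitude_at(response, fm, fs)   # clearly nonzero after rectification
resp_fc = amplitude_at(response, fc, fs)   # the fine-structure component survives as well
mean_level = sum(response) / n             # the 0-Hz value, analogous to average rate

print(stim_fm, resp_fm, resp_fc, mean_level)
```

For an ideal rectifier and $f_c \gg f_m$, the component at $f_m$ has amplitude close to $m/\pi$ and the mean level is close to $1/\pi$, so rectification alone is enough to produce a response spectrum with energy at $f_m$, at $f_c$, and at 0 Hz.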

Different synchronization measures have been used, sometimes leading to seemingly contradictory statements. The most popular metric is “vector strength” $R$, also called synchronization index (81). Each spike is treated as a vector of unit length and with phase $\theta_i$ between 0 and $2\pi$ measured as the spike time modulo the stimulus period of interest. The x- and y-components of the vector are $x_i = \cos\theta_i$ and $y_i = \sin\theta_i$. The $n$ spikes in a response are combined by vector addition, and the resultant vector is normalized to $n$

$$R = \frac{\sqrt{\left(\sum_{i=1}^{n} x_i\right)^2 + \left(\sum_{i=1}^{n} y_i\right)^2}}{n} \tag{3}$$

which takes values between 0 and 1. $R$ can also be obtained from the Fourier spectrum of the PST or period histogram, in which case it equals the magnitude of the first harmonic, normalized by the DC component (average firing rate). Phase $\theta$ is also retrieved with either technique. Statistical significance of synchronization is usually quantified with the Rayleigh test (23, 168).
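Equation 3 translates directly into code. The following is a minimal sketch, assuming spike times in seconds and a frequency of interest in hertz; the function name, argument names, and the two toy spike trains are hypothetical.

```python
import math

def vector_strength(spike_times, freq):
    """Return (R, mean phase in radians) of spike times relative to `freq`."""
    n = len(spike_times)
    if n == 0:
        raise ValueError("vector strength is undefined for an empty spike train")
    # Phase of each spike: spike time modulo the stimulus period, scaled to 0..2*pi.
    phases = [2.0 * math.pi * ((t * freq) % 1.0) for t in spike_times]
    x = sum(math.cos(p) for p in phases)  # summed x-components
    y = sum(math.sin(p) for p in phases)  # summed y-components
    r = math.hypot(x, y) / n              # resultant length normalized to n
    mean_phase = math.atan2(y, x) % (2.0 * math.pi)
    return r, mean_phase

fm = 100.0
# One spike per cycle, always a quarter period after the cycle start.
locked = [k / fm + 0.0025 for k in range(50)]
# Eight spikes per cycle, evenly spread over the period.
spread = [k / fm + j * 0.00125 for k in range(10) for j in range(8)]

r_locked, phase_locked = vector_strength(locked, fm)
r_spread, _ = vector_strength(spread, fm)
print(r_locked, phase_locked, r_spread)
```

The perfectly locked train gives $R$ of 1 with a mean phase of a quarter cycle, and the evenly spread train gives $R$ of 0, which are the two ends of the range stated above.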

As will become clear in this review, envelope coding at peripheral stages is predominantly temporal rather than rate-based, but these two aspects of the response progressively reverse in prominence at successive stages along the neuraxis. Because both average firing rate and synchronization may contribute to the impact that a neuron has on its postsynaptic targets, many experimenters have combined the two metrics by multiplication ($nR$, with $n$ = total number of spikes, variously called “modulated rate,” “phase-locked rate,” “synchronized rate”), or, equivalently, by reporting the unnormalized Fourier component, expressed in spikes per second (33, 141, 224, 314). Recently, some authors have used $2nR^2$, which is also the statistic used in the Rayleigh test of significance (157, 266). Finally, envelope synchronization is often reported as a gain value (in dB), defined as $20 \log_{10}(2R/m)$, which relates output directly to input and facilitates comparison across studies which use different modulation depths $m$.
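The derived metrics in this paragraph are simple functions of $R$. The sketch below is illustrative and is not taken from any of the cited studies: the function names are hypothetical, dividing $nR$ by total recording time to obtain spikes per second is an assumption about units, and the p-value formula $e^{-nR^2}$ is the usual large-sample approximation for the Rayleigh test, which goes beyond what the text states.

```python
import math

def synchronized_rate(r, n_spikes, total_duration_s):
    """nR expressed in spikes/s (assumes n_spikes were collected over total_duration_s)."""
    return r * n_spikes / total_duration_s

def rayleigh_statistic(r, n_spikes):
    """The 2nR^2 statistic used in the Rayleigh test."""
    return 2.0 * n_spikes * r ** 2

def rayleigh_p_approx(r, n_spikes):
    """Large-sample approximation to the Rayleigh test p-value (assumption of this sketch)."""
    return math.exp(-n_spikes * r ** 2)

def modulation_gain_db(r, m):
    """Envelope synchronization gain, 20 log10(2R/m), in dB."""
    if r <= 0.0 or m <= 0.0:
        raise ValueError("gain is defined only for positive R and m")
    return 20.0 * math.log10(2.0 * r / m)

# Hypothetical example values.
rate = synchronized_rate(r=0.5, n_spikes=200, total_duration_s=2.0)
stat = rayleigh_statistic(r=0.5, n_spikes=100)
gain = modulation_gain_db(r=0.5, m=1.0)
print(rate, stat, gain)
```

The factor 2 in the gain definition has a simple rationale: if discharge probability over the modulation cycle followed the stimulus envelope $1 + m \sin\theta$ exactly, then $R$ would equal $m/2$ and the gain would be 0 dB. Positive gains therefore indicate a response that is more strongly modulated than the stimulus.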

The vector strength metric, often under different names (e.g., selectivity index), has found general use in the quantification of periodic neural signals in sensory and even motor physiology (43). Despite its pervasive use, it is important to be aware of its limitations. First, the metric gives only the degree to which the response is modulated to the frequency at which R is calculated (we use the subscripts m and c to indicate modulation frequency and carrier frequency, respectively). It does not capture the full harmonic content of the cycle histogram at fm so that histograms with a rather different shape can result in the same Rm value (see Ref. 127 for an example). An Rm value of one only results from perfect alignment of all spikes at one phase, but a value of zero does not necessarily indicate a random distribution of spike times. For example, if spike times are equally divided between phase φ and φ + π, the average vector has zero magnitude. Thus a low vector strength should not necessarily be equated to absence of temporal structure in the spike train, but rather is an indication of lack of energy at the frequency for which R was calculated. Second, high R values indicate that spikes are distributed over a narrow time window relative to the period of interest, but such values do not imply a faithful replica of the stimulus modulation waveform in the probability of discharge. As a reference, a PST histogram that closely resembles a half-wave rectified sinusoidal AM signal with m = 1 gives R = 0.5. Higher R values are obtained when the period histograms are more “peaked” than the original sinusoidal modulation signal. Third, R is a compressive metric and is therefore sometimes graphed on an expansive scale (120). Finally, a problem at a more general level is that calculation of Rm requires knowledge of fm, a strategy that the brain cannot use. It may be argued that a “clock” signal is available in the form of the highly synchronized discharge of some types of cochlear nucleus neurons, which could be used to perform a vector strength type calculation in which degree of synchronization is translated into average firing rate, e.g., as suggested in the periodicity extraction scheme by Langner (150). Some authors have used interspike interval or autocorrelation analysis to bring out the time structure of responses that may be more relevant to the operations performed by the central processor (27, 85, 123, 141, 226, 301). In this context it is important to remember that the envelope of most natural sounds is not strictly periodic in the first place and that the raw acoustic waveform is not available as such to the auditory nervous system. Rather, this waveform is decomposed into a multitude of waveforms by virtue of cochlear narrowband filtering (reviewed in Refs. 206, 234). This process profoundly affects the modulation spectrum present in each frequency channel, which is thus determined jointly by the spectrotemporal properties of the acoustic stimulus and by those of the peripheral filtering process (for illustrations, see Ref. 286). In summary, while most studies discussed here have used deterministic stimuli with periodic envelopes and have applied the R metric, it is important to keep in mind that, for natural stimuli, the relationship between neural response modulation and stimulus modulation is more complex and that the neural operations by which the central processor extracts envelope information likely differ fundamentally from the analytical ways of the experimenter.
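Two of the limitations of vector strength noted above can be checked numerically. The sketch below is illustrative only: it treats a cycle histogram as a list of bin weights, and it assumes that a histogram "resembling" the AM signal follows the envelope 1 + m cos θ. The function and variable names are hypothetical.

```python
import cmath
import math


def vector_strength_from_histogram(weights):
    """Vector strength of a cycle histogram given as bin weights over one period."""
    n_bins = len(weights)
    total = sum(weights)
    resultant = sum(
        w * cmath.exp(2j * math.pi * (k + 0.5) / n_bins)
        for k, w in enumerate(weights)
    )
    return abs(resultant) / total


n_bins = 360

# Case 1: spikes split equally between two phases half a cycle apart.
antiphase = [0.0] * n_bins
antiphase[45] = 1.0
antiphase[45 + n_bins // 2] = 1.0
r_antiphase = vector_strength_from_histogram(antiphase)

# Case 2: discharge probability follows the envelope 1 + m*cos(theta), with m = 1.
m = 1.0
envelope_like = [
    1.0 + m * math.cos(2 * math.pi * (k + 0.5) / n_bins) for k in range(n_bins)
]
r_envelope = vector_strength_from_histogram(envelope_like)

# Case 3: a more "peaked" histogram (envelope raised to the 4th power) gives larger R.
peaked = [w ** 4 for w in envelope_like]
r_peaked = vector_strength_from_histogram(peaked)
```

Under these assumptions the antiphase histogram yields R near zero despite its strong temporal structure, the envelope-shaped histogram yields R = m/2, and the more peaked histogram yields a larger value.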

The bulk of studies on AM coding have used the same stimulus strategy, which is to tailor the stimulus to the cell under study. Early work (78, 183) established that peripheral neurons display envelope phase-locking only if the stimulus energy falls within a cell’s tuning curve. For example, Javel (114) shows the lack of response of an auditory-nerve fiber tuned to 800 Hz to a high-frequency AM complex (fc = 5 kHz) modulated at 800 Hz. Most studies using AM stimuli with tonal carriers match fc to the neuron’s characteristic frequency (CF, frequency of lowest rate threshold), and usually also optimize other stimulus parameters for the cell under study. The complementary approach, in which the population response of cells at many different CFs is studied to a limited set of stimuli, has been little used (27, 293).

A description employed acoustically, psychophysically, and physiologically is the modulation transfer function or MTF, which is response modulation relative to input modulation as a function of modulation frequency. Schroeder (257) predicted more than 20 years ago that the concept of MTF would increase in importance because the modulation rather than the carrier usually contains the important information and because highly nonlinear transmission systems often exhibit a quasi-linear response to modulation. Physiologically, MTFs are usually measured as the phase-locking to AM tones of fixed m and fc presented at consecutive modulation frequencies, but other methods have been employed (see sect. IXB). Marked effects on average rate also occur, so that a distinction between temporal MTF (tMTF) and rate MTF (rMTF) is usually drawn.

IV. AUDITORY NERVE: BOTTLENECK TO THE CENTRAL NERVOUS SYSTEM

A. Basic Auditory Nerve Properties

Activity in the auditory nerve represents both the output of the cochlea and the input to the central nervous system, and studies of envelope phase-locking have been conducted both to gain more insight into cochlear processing and to define the limits within which the central processor has to operate. Compared with optic and peripheral somatic nerves, the auditory nerve is highly uniform both morphologically (in caliber and branching pattern) and physiologically. We only discuss type I auditory nerve fibers, which form the bulk of the nerve, since next to nothing is known about the physiology of the unmyelinated type II fibers. Because each type I nerve fiber contacts only a single inner hair cell, its activity can, to a first approximation, be understood from basilar membrane motion at a single point in the cochlea followed by further signal modifications by the inner hair cell and hair cell/nerve synapse (76, 136, 137, 243). The most salient properties are 1) sharp V-shaped tuning to a narrow range of frequencies; 2) a limited dynamic range of ~20–30 dB, reflected in a sigmoidal rate-level function; 3) adaptation of firing rate to sustained stimuli, rather modest compared with adaptation of peripheral nerve fibers in other systems; and 4) phase-locking to low-frequency pure tones (<4–5 kHz in the cat).

Auditory nerve fibers show a bimodal distribution of spontaneous rate (SR), on the basis of which several classes of fibers are defined that differ in a number of properties (158, 246, 305). Fibers with high SR (>18 spikes/s), which in cat form ~60% of the total population, have low thresholds and limited dynamic range. Fibers with medium and low SR have higher thresholds and tend to have “sloping” saturation, i.e., their rate-level functions show a decrease in slope at ~30 dB above threshold but do not fully saturate. Also, low-SR fibers show less adaptation than high-SR fibers (230). Differences between the SR classes have been documented mostly with pure tone and spectrally complex stimuli, but AM stimuli have revealed response differences in the time domain as well. We first discuss how the basic AM parameters m, sound pressure level (SPL), fm, and fc (Fig. 3) influence synchronization and average rate, then describe the response phase.


B. Average Response Rate and Magnitude of Synchronization

When a tone is presented at a fiber’s CF at a fixed suprathreshold level and is modulated with increasing depth, the nerve fiber shows a monotonic, saturating increase in synchronization Rm (Fig. 3A). Although Rm increases with m in absolute terms, synchronization magnitude decreases in relative terms, i.e., the gain (response modulation relative to stimulus modulation) decreases (127). The gain can be as large as 10 dB for m of 10% and decreases to values near 0 dB for m of 100%.

FIG. 3. Basic dimensions and manipulations in an AM signal and their effect on auditory nerve activity. The relationship of an auditory filter (curve) and AM spectrum is shown schematically for variations in modulation depth m (A), sound pressure level (SPL) (B), modulation frequency (fm) (C), and carrier frequency (fc) (D). For each manipulation, three measures of the responses of an auditory nerve fiber are shown: average rate (rate, dashed line), synchronization magnitude (R, solid line), and synchronization phase (φ, thin line).

Responses to AM as a function of stimulus intensity have been studied extensively in a variety of animals (guinea pig, Ref. 33; chinchilla, Ref. 114; cat, Refs. 127, 135, 294; gerbil, Ref. 270). The rate-level function with AM shows only small differences relative to the function obtained with an unmodulated carrier wave (127, 270). The synchronization-level (Rm vs. SPL) function shows a stereotypic nonmonotonic shape; a maximum is reached at low suprathreshold levels, with a decrease in Rm for further increases in SPL (Fig. 3B). It is easy to see how this relationship is expected from the compressive relationship between firing rate and SPL, especially when the modulation depth m is small; maximal modulation of firing rate should occur for amplitude changes centered on the steepest part of the rate-level function, between firing threshold and saturation. At high SPLs, amplitude fluctuations should not translate into fluctuations in firing rate because firing rate is saturated. Qualitatively the synchronization-level function does indeed show the expected nonmonotonic shape. However, compared with quantitative predictions based on the rate-level function, the observed synchronization shows 1) larger maximal R values, 2) a maximum that is displaced towards a higher SPL, and 3) higher synchronization values at high SPLs and a shallow downward slope. These deviations are predicted when adaptation over a short time scale is taken into account (33, 270, 311). Basically, adaptation boosts the coding of stimulus changes so that the operating range over which changes in SPL result in changes in firing rate is larger for responses to AM than for steady-state responses to pure tones.
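The qualitative expectation from a static rate-level function can be illustrated with a small simulation. The sketch below passes the envelope of an AM tone through a sigmoidal rate-level function with a spontaneous-rate floor and computes the resulting vector strength at several levels. All parameter values and names are hypothetical, and adaptation is deliberately omitted, so this corresponds to the baseline prediction rather than to the measured behavior.

```python
import cmath
import math

# All parameter values below are hypothetical, chosen only for illustration.
SPONT = 50.0         # spontaneous rate, spikes/s
MAX_RATE = 250.0     # saturated rate, spikes/s
MIDPOINT_DB = 20.0   # level at the steepest part of the rate-level function
SLOPE_DB = 4.0       # controls the width of the dynamic range


def rate_level(level_db):
    """Static sigmoidal rate-level function (no adaptation)."""
    drive = 1.0 / (1.0 + math.exp(-(level_db - MIDPOINT_DB) / SLOPE_DB))
    return SPONT + (MAX_RATE - SPONT) * drive


def predicted_sync(level_db, m, n_steps=720):
    """Vector strength predicted by passing the AM envelope through rate_level."""
    total = 0.0
    resultant = 0.0 + 0.0j
    for k in range(n_steps):
        theta = 2 * math.pi * (k + 0.5) / n_steps
        # Instantaneous level of the envelope 1 + m*cos(theta), in dB re carrier level.
        inst_level = level_db + 20 * math.log10(1.0 + m * math.cos(theta))
        rate = rate_level(inst_level)
        total += rate
        resultant += rate * cmath.exp(1j * theta)
    return abs(resultant) / total


levels = [-20.0, 0.0, 20.0, 40.0, 60.0]
sync_level_function = [predicted_sync(level, m=0.5) for level in levels]
```

In this sketch synchronization is largest when the carrier level sits on the steep part of the rate-level function and is close to zero both well below threshold and in saturation, which is the nonmonotonic shape described in the text.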

There are systematic differences in AM responses of the different SR classes of auditory nerve fibers. One descriptor commonly used to compare envelope phase-locking across cell populations is the maximal R value of the synchronization-level function (Rmax). Cells with low and medium SR tend to have higher Rmax values than cells with high SR, and this difference is particularly marked at low CFs (<5 kHz) (127, 294). However, the difference in synchronization between these different auditory nerve classes strongly depends on the synchronization metric used (33, 127, 183, 295). In contrast to earlier reports, Cooper et al. (33) concluded that fibers with high SR showed larger envelope synchronization values than low SR fibers. Their result is less of a conflict than it appears if it is taken into account that the metric used by these authors was (unnormalized) modulated rate rather than Rm, that the average discharge rate of fibers with low SR is generally lower than that of fibers with high SR (158), and that the sample of Cooper et al. is biased to high CFs (>8 kHz).

Synchronization is robust in high SR cells at low SPLs and in low and medium SR cells at mid and high SPLs (294). However, the different fiber populations reach maximal synchronization at the same level relative to rate threshold (33, 294). Low SR fibers have a larger dynamic range over which significant modulation is present (33), lending further support to the general hypothesis that these fibers are particularly important for hearing at high SPLs.

The narrow bandpass filtering by the cochlea limits the range of modulation frequencies transmitted by nerve fibers. As schematized in Figure 3C, increase of fm causes the sidebands in the stimulus spectrum to move away from fc. If fc is centered at the CF of the fiber studied, the energy in the sidebands is increasingly attenuated, resulting in a loss of modulation at the output of the peripheral filter. The response as a function of fm is usually referred to as the modulation transfer function (MTF) and again one should clearly distinguish effects on average rate (rMTF) from effects on synchronization to fm (tMTF). The rMTF is usually flat but may show some decrease in rate with increasing fm, particularly in low-SR fibers (127). In contrast, tMTFs all have a low-pass shape (guinea pig, Ref. 203; cat, Ref. 127; rat, Ref. 186; Fig. 3C). These functions are smooth and do not show any structure related to harmonic ratios, i.e., whether or not the AM components (fc and the two sidebands) are integer multiples of fm is inconsequential. The absolute bandwidth of frequency tuning curves, e.g., at 10 dB above threshold, increases with CF (59, 86, 230), and the cut-off frequency of tMTFs shows a concomitant increase with CF (Fig. 2). At very low CFs (a few hundred Hz), a tMTF cut-off frequency can often not be determined because of the broad frequency tuning. Interestingly, for CFs above ~10 kHz, the increase in cut-off frequency is not commensurate with the increase in bandwidth of frequency tuning at these high CFs. This presumably reflects temporal filtering at the hair cell/synaptic level rather than spatial filtering at the mechanical level (86, 127). The highest modulation frequency at which significant envelope phase-locking is observed, in high-CF nerve fibers, is ~2 kHz (127, 229). A less marked feature of many tMTFs is a shallow positive slope in the low-frequency skirt (94, 127). According to Cooper et al. (33), this slope tends to become steeper at high SPLs, consistent with models that include effects of response adaptation (311).
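The sideband argument for the low-pass shape of the tMTF can be sketched in a few lines. The example below assumes a hypothetical Gaussian filter magnitude response centered on CF and ignores filter phase, hair cell and synaptic filtering, and all nonlinearities; the function names and parameter values are illustrative only.

```python
import math


def filter_gain(f, cf, bandwidth):
    """Hypothetical Gaussian magnitude response centered on cf (linear gain)."""
    return math.exp(-0.5 * ((f - cf) / bandwidth) ** 2)


def output_modulation_depth(m, fc, fm, cf, bandwidth):
    """Modulation depth after filtering the carrier and its two sidebands.

    An AM tone (1 + m*cos(2*pi*fm*t)) * cos(2*pi*fc*t) has a carrier of
    amplitude 1 at fc and sidebands of amplitude m/2 at fc - fm and fc + fm.
    For a filter symmetric about fc (and ignoring phase), the output depth is
    the summed sideband amplitude divided by the carrier amplitude.
    """
    carrier = filter_gain(fc, cf, bandwidth)
    lower = (m / 2) * filter_gain(fc - fm, cf, bandwidth)
    upper = (m / 2) * filter_gain(fc + fm, cf, bandwidth)
    return (lower + upper) / carrier


cf = 10000.0        # characteristic frequency, Hz (illustrative)
bandwidth = 500.0   # Gaussian width parameter, Hz (illustrative)
fms = [50.0, 100.0, 200.0, 400.0, 800.0, 1600.0]
tmtf = [output_modulation_depth(1.0, cf, fm, cf, bandwidth) for fm in fms]
```

With the carrier at CF, the output modulation depth falls monotonically as fm increases, and a wider filter would shift the cut-off to higher fm, in line with the dependence on CF described above.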

Clearly, the extent of envelope phase-locking in the auditory nerve is sufficiently wide to encompass psychophysical existence regions (Fig. 2). Javel and Mott (115) attributed the disappearance of residue pitch at fc > 5 kHz to increased sharpness of tuning of high-CF fibers (59, 230). However, while bandwidth limitations may contribute to the upper fm limit of ~800 Hz, they do not explain the disappearance of residue pitch altogether.

The dependence of envelope phase-locking on carrier frequency, relative to CF, has not been explored in great detail (114, 127, 295). It merits further study because the available data suggest an important effect. If fc is moved away from CF, the synchronization-level function shifts to higher SPLs. Consequently, for moderate to loud stimuli, strongest phase-locking is present in fibers with CFs that differ from fc, provided that the stimulus is able to excite these fibers (Fig. 3D). Thus, for all but the weakest signals, the representation of stimulus envelope may be carried mainly by fibers tuned to frequencies that differ from fc.

C. Phase of Synchronization

Few studies reported phase or latency data for AM stimuli. For a given fiber, the phase of response to the envelope shows a slight lead with increasing SPL (127) and, at fixed suprathreshold levels, varies little with changes in carrier frequency (122). In contrast, response envelope phase increases nearly linearly with fm. The slope of this relationship has been used as an estimate of the total delay accrued between the acoustic stimulus and the site of recording, similar to earlier such measurements on responses to pure tones in low-CF fibers (4). The linearity of the phase-fm relationship indicates that it is mostly determined by fixed mechanical and neural transmission delays. Consistent with other delay or onset latency measures, the values obtained vary systematically and inversely with CF (127, 294), as expected from the travelling wave on the basilar membrane which starts at the base of the cochlea and reaches its more apically located maximum after some delay. However, many processes contribute to the total delay (242, 244). Gummer and Johnstone (93) scanned envelope delay of nerve fibers near their tuning curve threshold, using AM complexes of fixed fm and low modulation depth over a large range of carrier frequencies. They found a delay component that was large for carrier frequencies near CF and smaller in the tuning curve tail, and the authors provide several arguments to suggest that this component reflects a delay associated with cochlear bandpass filtering.
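Estimating delay from the slope of the phase-fm relationship amounts to a straight-line fit. The sketch below is a minimal illustration that assumes the envelope phases have already been unwrapped and are expressed in cycles; the function name and the synthetic data are hypothetical.

```python
def envelope_delay(fms, phases_cycles):
    """Least-squares slope of unwrapped envelope phase (cycles) vs fm (Hz).

    Because phase in cycles equals fm * delay plus a constant, the slope has
    units of seconds and estimates the total delay.
    """
    n = len(fms)
    mean_f = sum(fms) / n
    mean_p = sum(phases_cycles) / n
    sxy = sum((f - mean_f) * (p - mean_p) for f, p in zip(fms, phases_cycles))
    sxx = sum((f - mean_f) ** 2 for f in fms)
    slope = sxy / sxx
    intercept = mean_p - slope * mean_f
    return slope, intercept


# Hypothetical data: a 3-ms delay plus a constant phase offset of 0.25 cycle.
fms = [100.0, 200.0, 300.0, 400.0, 500.0]
phases = [0.25 + 0.003 * fm for fm in fms]
delay_s, phase_offset = envelope_delay(fms, phases)
```

Measured phases are only known modulo one cycle, so in practice the unwrapping step, which is not shown here, determines whether the fitted slope is meaningful.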

The preceding descriptions are based on synchronization of the response to the envelope frequency. Again, it is important to bear in mind that such descriptions are incomplete. The shape of cycle histograms can depart severely from the shape (usually sinusoidal) of the stimulus envelope, particularly at high SPLs and at large modulation depths. Therefore, the spectrum of the cycle histogram typically consists of a number of spectral peaks, of which the peak at fm is only one, and not necessarily the largest, component (135, 294). Also, the most salient temporal information present in the discharge patterns is not necessarily revealed by calculation of synchronization to stimulus components. For example, robust phase-locking to fm does not imply that the most common interspike intervals are at the period of fm: for envelope periods of several tens of milliseconds multiple spikes occur per envelope cycle, while periods shorter than a few milliseconds succeed each other too fast to allow a spike in every envelope cycle. An interesting discrepancy between envelope phase-locking and dominant interspike intervals is in “pitch-shift” effects of changes in fc (27, 114): phase-locking to fm stays roughly constant, while the most dominant interspike interval shifts in a direction which parallels the subjective pitch of the AM stimulus.

In summary, envelope information is abundantly available in auditory nerve discharges in temporal form. Each nerve fiber transmits envelope information over a stereotypical range of modulation frequencies, carrier frequencies, and intensities. These ranges are consistent, at least at a qualitative level, with known auditory nerve properties of frequency tuning, compression, adaptation, and spontaneous activity, and computer models incorporating these properties reproduce the main features of AM responses (105, 117, 271). The main way in which the auditory nerve is a bottleneck to the central nervous system for AM signals is in the extent of modulation frequencies over which synchronization occurs. This range cannot be enlarged centrally, except possibly for frequencies at which fine-structure information is available (<4–5 kHz), because AM arises from a time-domain interaction of stimulus components.

V. COCHLEAR NUCLEUS: PARALLEL CHANNELS

The key dynamic properties of cells in the cochlear nucleus (CN) and the differences with the auditory nerve were described in the pioneering studies of Møller (183, 184, 187): enhanced gain over a large dynamic range, low levels of distortion to sinusoidal modulation, i.e., a rather faithful tracking of the sinusoidal envelope, presence of bandpass tMTFs particularly at high SPLs, and similar tMTF shape for different forms of modulation (sinusoidal AM of pure tone or noise carriers, noise-modulated tones, noise-modulated noise). However, the marked diversity of CN cells supports a variety of AM response patterns, evident in the earliest CN studies (78), and necessitates a discussion of AM responses per cell type rather than global statements about the CN or its subdivisions. Limited attempts have been made (not reviewed here) to uncover the mechanisms underlying the auditory nerve to CN transformations, for gain enhancement in particular (72, 228, 296, 323).

A. Basic Organization of the CN

An important insight that emerged from study of the CN with simple stimuli was that a limited number of response patterns or “classes” could be discerned and that these patterns are related to morphological cell classes (18, 202). Especially through the technique of intracellular labeling, many of the structure-function relationships that were surmised earlier on the basis of indirect evidence were solidified. The physiological diversity of these different cell types, combined with the diversity of their central projections (297), led to the concept of functionally specialized, parallel pathways (for review, see Refs. 26, 69, 112, 227, 319).

Briefly, three subnuclei are defined on the basis of the bifurcation pattern of the auditory nerve. The anteroventral cochlear nucleus (AVCN) has three principal cell types. Stellate cells project to the inferior colliculus (IC) and respond to tones with a burst of regularly spaced action potentials called a “chopper” pattern. Bushy cells, which derive their name from their small and confined dendritic tree and which are remarkable for their strong inputs from the auditory nerve, occur in two types. Spherical bushy cells receive large calyceal auditory nerve terminals (end bulbs of Held) and show responses similar to auditory nerve fibers and are therefore called “primary-like” (PL). Their main projection is to binaural nuclei in the superior olivary complex. Globular bushy cells also receive large nerve terminals in the form of modified end bulbs of Held, and show a characteristic “primary-like-with-notch” (PLN) pattern in response to tones. Their main projection is contralaterally in the superior olivary complex where they give rise to giant calyceal endings on cells in the medial nucleus of the trapezoid body, which are inhibitory on binaural cells in the lateral superior olive (LSO). The posteroventral cochlear nucleus (PVCN) contains octopus cells that project to the ventral nucleus of the lateral lemniscus (VNLL) and show pure onset (Oi) responses to tones. It also contains inhibitory multipolar cells that project to the dorsal cochlear nucleus (DCN) and the contralateral CN and which show onset-chopper (Oc) responses. The principal neurons of the DCN are the fusiform cells, which project to the IC and display remarkably nonlinear spectral properties. These properties arise through local inhibitory interactions with interneurons in DCN (type II cells) and presumably with the Oc cells (195).

The classification of CN cells is mostly based on subjective criteria, which contributes to discrepancies in conclusions of different studies. Although there is by no means an agreed upon “task” for each of these circuits, it is clear that each cell type performs a different analysis of the auditory nerve input and conveys its output to a different part of the auditory brain stem. The bushy cells are clearly involved in binaural analysis important for spatial localization of sounds. Stellate cells are able to represent vowel spectrum over a wide range of intensities. Fusiform cells integrate somatosensory and spectral information and may signal important auditory events. Responses to AM offer another illustration of how CN cell types differ in their processing of auditory nerve input.

B. AM Responses of Neuronal Types in the CN

The relationship between AM coding and physiological cell class, as defined by the response to pure tones, was first examined by Frisina and co-workers in the gerbil (70, 71). These authors found that envelope phase-locking in ventral cochlear nucleus (VCN) was generally enhanced relative to the auditory nerve, and they described a hierarchy of enhancement that correlated with the precision of timing of response onset to pure tones. Of the four physiological VCN cell types studied, cells with well-timed onset responses showed the highest gains, followed by choppers, PLN, and PL. The decrease in synchronization with increasing intensity is less than in the auditory nerve and in some cell types depends on fm, resulting in a peaked or tuned tMTF at high SPLs. Particularly these latter two response features, extended dynamic range and selectivity to fm, received much attention in later studies (Fig. 4). The general behavior of synchronization as a function of SPL and fm described by Frisina et al. (71) was confirmed and extended to other cell types in many subsequent studies, even though not all studies agree on the exact hierarchical ordering and the discreteness of the ordering.

Some of the most interesting responses were observed in cells with chopper responses. Choppers are temporally tuned for fm, as reflected in bandpass tMTFs particularly at higher SPLs (gerbil, Ref. 71; cat, Ref. 229). A small percentage of choppers also shows bandpass tuning in their rMTFs (228). The fm causing the strongest synchronization is called the temporal best modulation frequency (tBMF). The occurrence of bandpass tuning is of obvious importance to the concept of a “modulation frequency filter bank” or “modulation channels” (131). This concept has some popularity, particularly in the psychophysical literature (see sect. II), and will be taken up again in our discussion of IC and auditory cortex.

FIG. 4. Two important transformations between the auditory nerve (dashed lines) and cochlear nucleus (solid lines). A: enhancement of envelope synchronization and extended dynamic range is present in many cell types. B: some cell types show bandpass tMTFs.

As mentioned, “chopping” reflects the intrinsic tendency to fire a regular burst of spikes at the beginning of, or sometimes throughout the entire duration of, the stimulus, and these cells have therefore been viewed as resonators or intrinsic oscillators (150). SPL-dependent bandpass tuning and oscillatory responses were also described earlier by Møller (187) in the rat. In a subclass of cells in the guinea pig, the intrinsic behavior is invariant with SPL and affects the temporal characteristics of the response to nondeterministic stimuli (301). There is a possibility that the intrinsic properties make these cells function as envelope filters that decompose the envelope spectrum, much in the way that inner hair cells in the turtle cochlea decompose stimulus frequency by virtue of an intrinsic electrical resonance mechanism (63). Several authors have therefore looked for correlations between AM and intrinsic oscillation behavior. Frisina et al. (71) compared the frequency of chopping with the tBMF for a sample of sustained choppers in VCN. The tBMFs spanned a range (170–700 Hz) roughly similar to the range of chopping frequencies (80–520 Hz), but the correlation between the two response properties was poor. There was a suggestion of interaction between chopping frequency and fm in that the tBMF only rarely exceeded the chopping frequency, which therefore seemed to set an upper bound. In a subpopulation of choppers (sustained choppers with a well-defined tBMF between 150 and 450 Hz), Rhode and Greenberg (229) noted a tendency for maximal envelope synchronization when fm matched the discharge rate to a tone at the same intensity.

A strong and more general relationship, not restricted to choppers, was found by Kim et al. (141) in DCN/PVCN neurons of the unanesthetized decerebrate cat. In this study, the “intrinsic oscillation” frequency of a neuron was measured from the autocorrelation of its responses to pure or AM tones. Frequency of intrinsic oscillation and BMF were well correlated (r = 0.86) with regression close to the diagonal of equality, and the frequency ranges were roughly similar (50–500 Hz) to those reported for VCN choppers (71, 229). Importantly, the remarkably good correlation arose from the pooling of different cell groups, rather than from a within-population trend, complicating any AM-coding scheme based on intrinsic oscillators. At least five cell types contributed to the data, surprisingly also including auditory nerve fibers.
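One simple way to read an oscillation frequency from a spike-train autocorrelation is to locate the tallest peak at nonzero lag. The sketch below is a simplified illustration under that assumption and is not the procedure of the cited study; the function name, bin width, maximum lag, and the perfectly regular example train are all hypothetical.

```python
from collections import Counter


def intrinsic_oscillation_frequency(spike_times, bin_width=0.0005, max_lag=0.02):
    """Frequency (Hz) of the tallest nonzero-lag peak in the spike autocorrelogram."""
    counts = Counter()
    n = len(spike_times)
    for i in range(n):
        for j in range(i + 1, n):
            lag = spike_times[j] - spike_times[i]
            if lag > max_lag:
                break  # spike_times is assumed sorted in ascending order
            counts[round(lag / bin_width)] += 1
    counts.pop(0, None)  # ignore lags that fall in the zero bin
    if not counts:
        return None
    # Tallest bin wins; ties are resolved in favor of the shortest lag.
    best_bin = max(counts, key=lambda b: (counts[b], -b))
    return 1.0 / (best_bin * bin_width)


# Hypothetical chopper-like train: one spike every 4 ms for 200 ms.
spikes = [0.004 * k for k in range(50)]
f_osc = intrinsic_oscillation_frequency(spikes)
```

For this idealized train the dominant lag is the 4-ms interspike interval, giving an oscillation frequency of about 250 Hz; real responses would need smoothing and a criterion for peak significance.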

Besides choppers, the other main constituent cell types of the AVCN are the two types of bushy cells with PL and PLN responses. As expected from their powerful auditory nerve inputs, PL and PLN cells resemble auditory nerve fibers in many regards, and indeed, their Rmax and tMTF cut-off frequency distributions at different CFs largely overlap those of the auditory nerve (129, 229). For PL cells this overlap is virtually complete, but for CFs below ~7 kHz, PLN cells synchronize much better to envelopes than auditory nerve fibers. At very low CFs some bushy cells have enhanced synchronization to both fine-structure and envelopes (124).

Comparisons of cell types across studies illustrate that one has to be careful with simple characterizations of responses to multi-dimensional stimuli like AM. As remarked by Rhode and Greenberg (229), a single response parameter is not sufficient to characterize envelope synchronization. The highest gains found in choppers exceed those of PL cells but are mostly at fm values below 500 Hz (129, 229) so that at higher modulation frequencies PL cells are superior to choppers in transmitting envelope information. Consequently, the hierarchy of modulation enhancement strongly depends on the range of modulation frequencies of interest and also, as pointed out earlier (see sect. IVB), on the chosen metric (266). Rather than providing an exhaustive listing of response parameters for all cell types, we emphasize here the properties by which different CN cells stand out most from the auditory nerve and from each other. For chopper cells this is the bandpass tuning of tMTFs; for bushy cells it is the extent of the tMTF (high cut-off frequencies).

The two main response types found in PVCN are onset (Oi and Oc), associated with the octopus and multipolar morphology, respectively. Both cell types show remarkable envelope phase-locking, in line with the precision of their onset response to pure tones. Oc cells have been particularly well-studied (cat, Refs. 125, 140, 228, 229). These cells show some of the highest gains, over the widest fm and SPL range, which is why Kim et al. (140) proposed that these cells have a special role in the extraction of the fundamental frequency of voiced speech sounds. Moreover, large changes in fc and even use of a wideband carrier have little effect on magnitude of synchronization (228). Oi cells have been studied very little, but the few existing data reveal interesting properties, in line with their biophysical specializations (199). These cells show the highest gains of all CN cells, reaching Rm values near 1 (228). Moreover, their tMTFs are high in gain and invariant for SPL, but all-pass. The rMTFs of these two classes of onset cells also appear unique among CN cell classes because they can be sharply bandpass. It is unclear whether these bandpass rMTFs can sustain a rate code for modulation frequency: among the handful of Oi cells reported, the range of rBMFs was only 350–450 Hz.

Onset units have wider frequency tuning than auditory nerve fibers (80, 118, 231). They therefore provide a test case of the suggestion that is sometimes made that tMTF bandwidths may broaden centrally by virtue of convergence of cells tuned to different CFs (180, 286). However, this would require phase information on the individual spectral components of the AM stimulus, and for frequencies above the pure-tone phase-locking range (<4–5 kHz in cat), such information is not available to the central processor. Indeed, despite their wider frequency tuning, tMTF cut-off frequencies of onset cells do not exceed the limits imposed by the auditory nerve (125, 228, 229).

The DCN has traditionally been regarded as a part of the CN which has poor timing properties (79, 82, 154), and initial studies with AM seemed consistent with that view (horseshoe bat, Ref. 282; kangaroo rat, Ref. 29). However, more recent studies emphasized good AM coding in DCN (cat, Refs. 125, 229, 254; guinea pig, Refs. 322, 323) and specific roles for DCN in temporal processing have been proposed [pitch (150); extraction of envelopes in background noise (73) or at high SPLs (229)]. The tMTFs are typically low-pass or bandpass and differ from other CN cell types in their upper fm limit of phase-locking which never exceeds 800 Hz. To some extent, differences between studies reflect the complexity of this nucleus, both in diversity of response types and in nonlinearity of behavior (319). First, Oc cells can be found in deep DCN and may explain some of the high-gain responses to AM reported for DCN. Second, simple measures like maximum synchronization or cut-off frequency do not reveal the full complexity of DCN responses and give DCN a misleading “AVCN-like” appearance. Even though DCN interneurons and principal neurons can display high gain responses to AM stimuli, their response often shows strong nonmonotonicities, not only in average rate but also in magnitude and phase of envelope synchronization (125, 254, 322). These nonmonotonicities are likely a manifestation in the temporal domain of the intricate inhibitory and excitatory interactions that have been invoked to explain similar complexities in the frequency domain.

A preliminary study by Frisina et al. (73) in the chinchilla suggests that envelope synchronization of DCN neurons can be enhanced by background noise, but more systematic data and comparisons with auditory nerve and VCN are needed to evaluate whether DCN neurons are special in this regard. Rhode and Greenberg (229) studied envelope synchronization in the presence of wide-band noise in different CN cell types of the cat and found that in general there is remarkable preservation of envelope synchronization even at high noise levels.

As for the auditory nerve, few authors have systematically reported envelope phase data. Cells in the CN also show a linear increase in envelope phase with increasing fm, but the slopes are systematically steeper than in the auditory nerve, consistent with additional time delays required for conduction and synaptic transmission (125, 129). Delays calculated from response envelope phase are more tightly distributed and shorter than traditional measures of latency based on response onset (94, 185), as is the case for delay estimates based on fine-structure (65). Most CN studies of AM coding considered only tMTF magnitude and not phase when trying to infer functional consequences of AM tuning for the perception of natural stimuli. Delgutte et al. (40) used both tMTF magnitude and phase of responses in auditory nerve, CN, and IC to predict responses of the same neurons to speech utterances (see below) and stressed the importance of incorporating phase, particularly at very low modulation frequencies, to make successful predictions.
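The link between the slope of the phase-versus-fm function and a time delay can be made concrete with a short sketch. It assumes that envelope phase is expressed in cycles and has already been unwrapped, so that the slope of a straight-line fit has units of cycles/Hz, i.e., seconds. The function name and the data values are hypothetical and serve only as an illustration; they are not taken from any of the cited studies.

```python
def envelope_delay(fm_hz, phase_cycles):
    """Least-squares line through unwrapped envelope phase (cycles) vs fm (Hz).

    Returns (slope, intercept). The slope has units of cycles/Hz = seconds
    and can be read as a delay; the intercept is the phase at fm = 0.
    """
    n = len(fm_hz)
    mean_f = sum(fm_hz) / n
    mean_p = sum(phase_cycles) / n
    sxy = sum((f - mean_f) * (p - mean_p) for f, p in zip(fm_hz, phase_cycles))
    sxx = sum((f - mean_f) ** 2 for f in fm_hz)
    slope = sxy / sxx
    intercept = mean_p - slope * mean_f
    return slope, intercept

# Hypothetical data: a 5-ms delay plus a constant 0.25-cycle phase offset
fm = [50.0, 100.0, 150.0, 200.0, 250.0]
phase = [0.25 + 0.005 * f for f in fm]
delay_s, phase0 = envelope_delay(fm, phase)
print(f"delay = {delay_s * 1e3:.2f} ms, intercept = {phase0:.2f} cycles")
```

A steeper slope in a CN cell than in an auditory nerve fiber would, in this reading, correspond to a longer delay.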

To summarize, the CN shows marked differences in AM coding relative to its auditory nerve input: wider dynamic ranges, higher gains, appearance of bandpass tMTFs, and less sensitivity to the presence of background noise. Furthermore, different cell types show marked diversity in their synchronization and average rate behavior to AM signals. A simple hierarchical ranking does not do justice to the differences among cell types and depends on whether one emphasizes Rmax values (71, 295), breadth of the tMTF (129), or statistical reliability of phase-locking (266). As in the nerve, AM coding is almost entirely temporal: bandpass rMTFs occur rarely, in a few cell classes.

Our knowledge of CN responses to AM is still lacking in many ways and basically does not go far beyond phenomenology. Perhaps the most pressing question is the robustness and relevance of bandpass tMTFs, which many investigators regard as genuine envelope filters. More studies are needed to determine how invariant tMTF tuning is with stimulus parameters, what range of tBMFs is spanned at different CFs, and whether tMTF tuning indeed supports filtering of envelope energy in natural stimuli. Such information would be particularly valuable for carrier frequencies in the range of phase-locking to fine-structure (<4–5 kHz), which is poorly sampled in most studies in small animal species with higher-frequency hearing than humans. There are other lacunae. Data are sparse for certain cell types, most notably pure onset units in PVCN. In most studies, the stimulus is optimized for the cell under study; there is a need for population studies in which the response to a limited set of stimuli is examined for an entire population. Finally, there is currently no evidence for any kind of within-class topographic organization (e.g., within an isofrequency strip) of AM response properties in the CN.

VI. SUPERIOR OLIVARY COMPLEX: AN EXAMPLE OF TIME-TO-RATE CONVERSION

Part of the CN output is directed toward nuclei in the superior olivary complex (SOC). This is an amalgam of large and small nuclei, some of which take part in well-studied circuits whose function is in feedback to the periphery (middle ear reflex and the olivocochlear efferent systems) or in the extraction of binaural differences important in spatial hearing. The preceding and following sections illustrate that, with some notable exceptions, envelope coding in the CN is largely temporally based while at the level of the IC partial conversion to a rate code is apparent. In our discussion of SOC physiology we highlight one aspect of these circuits: the conversion of an envelope time code to an average rate code.

The duplex theory of sound localization holds that the azimuthal spatial position of low-frequency signals is determined primarily on the basis of the minute differences in time at which the acoustic waveform reaches the two ears, interaural time differences (ITDs), while high-frequency signals are localized on the basis of interaural SPL or level differences (ILDs). This classical psychophysical theory seems to be embodied anatomically and physiologically in two binaural circuits in the SOC of most mammals. The circuit centered on the medial superior olive (MSO) detects ITDs and contains primarily low-frequency cells. Another circuit, centered on the lateral superior olive (LSO), detects ILDs and has a bias towards high CFs. The detailed physiology of these circuits and their afferents is beyond the scope of this review (see Refs. 279, 312, 316).

Starting in the mid-1970s, a number of investigators reported that humans can reliably discriminate ITDs of high-frequency signals at thresholds approaching those for low-frequency signals, i.e., ~20 µs, provided that the signals are not pure tones but have a time-varying envelope, as in AM sounds with the parameters illustrated in Figure 2. Clearly, subjects can detect with high precision the on-going envelope differences that occur when complex stimuli are delayed between the two ears. Physiological studies in the IC of cat (317) and rabbit (12) provided evidence for ITD sensitivity to AM signals but indicated that this sensitivity was probably generated at a lower level. Subsequent recordings in the SOC indeed revealed cells that were sensitive to interaural delays of AM signals, and this ITD sensitivity could be understood from the binaural interactions known to occur in these nuclei and the AM coding properties of their afferents.

In the MSO, ITD sensitivity to AM signals is generated by a multiplicative, cross-correlation type operation. These cells behave as coincidence detectors, which has been particularly well-documented for low-frequency signals (81, 126, 313) but holds for modulated signals as well. The average firing rate of high-CF MSO cells to AM signals varies with ITD (Fig. 5A). Moreover, the optimal ITD is predicted from the phases measured from the monaural response to an ipsi- or contralaterally presented AM signal: the firing rate is high when the envelope signals from the two ears arrive in-phase at the site of convergence (10, 122, 313).

In the LSO, ITD sensitivity to AM signals is generated by a subtractive rather than a multiplicative process (Fig. 5B). These cells have ILD sensitivity by virtue of excitatory signals from the ipsilateral ear and inhibitory ones from the contralateral ear. Again bushy cells constitute both contra- and ipsilateral pathways. For ITDs at which the inhibitory and excitatory phase-locked signals reach the LSO cell coincidently, the signals cancel each other and the cell remains silent. At other ITDs cancellation is not perfect and the excitatory ear is now able to drive the cell. Thus the ILD sensitivity of the LSO cell combined with the envelope phase-locking in its afferents generates overall changes in discharge rate with ITD (10, 11, 122, 128, 129). Interestingly, in anesthetized cats LSO neurons show a “chopper” pattern to ipsilateral tone bursts, but unlike choppers in the CN, they lack tuning in the tMTFs (or rMTFs) to ipsilateral stimulation (129).

FIG. 5. Example of sensitivity to envelope interaural time differences (ITDs) in medial and lateral superior olive (MSO, row A; and LSO, row B). Sensitivity to ITDs of binaural AM stimuli (left column) shows a complementary pattern in MSO vs. LSO and is consistent with the response phase to monaural modulation (middle column): ITDs which bring the monaural responses in-phase cause a high firing rate. The complementarity arises from opposite signs of binaural interaction (right column): in MSO a process of coincidence detection operates on excitatory inputs from both sides, while in LSO a subtractive process operates on excitatory ipsi- and inhibitory contralateral inputs. The LSO response to contralateral modulation (middle column, bottom) would be as obtained by presenting an unmodulated stimulus to the ipsilateral, excitatory ear and a modulated stimulus to the contralateral, inhibitory ear.
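The complementary ITD functions produced by a multiplicative (MSO-like) and a subtractive (LSO-like) interaction can be illustrated with a toy calculation. This is only a sketch under strong simplifying assumptions: each input is reduced to a sinusoidal envelope-locked rate with a mean of 1, the two ears are given identical monaural phases, and the subtractive output is half-wave rectified. The function names and the 100-Hz modulation frequency are hypothetical choices, not parameters from the cited recordings.

```python
import math

def envelope(t, fm, m=1.0):
    """Envelope-locked input rate, normalized to a mean of 1."""
    return 1.0 + m * math.cos(2.0 * math.pi * fm * t)

def binaural_rate(itd, fm, mode, n=1000):
    """Mean output over one modulation cycle for an envelope ITD (in s)."""
    period = 1.0 / fm
    total = 0.0
    for k in range(n):
        t = k * period / n
        ipsi = envelope(t, fm)
        contra = envelope(t - itd, fm)
        if mode == "mso":
            # multiplicative: coincidence of two excitatory inputs
            total += ipsi * contra
        else:
            # "lso": ipsilateral excitation minus contralateral inhibition,
            # half-wave rectified because firing rate cannot be negative
            total += max(0.0, ipsi - contra)
    return total / n

fm = 100.0  # illustrative modulation frequency in Hz (period 10 ms)
for itd_ms in (0.0, 2.5, 5.0):
    itd = itd_ms * 1e-3
    print(itd_ms,
          round(binaural_rate(itd, fm, "mso"), 3),
          round(binaural_rate(itd, fm, "lso"), 3))
```

In this sketch the multiplicative output is largest when the two envelopes coincide and smallest half a modulation period away, whereas the subtractive output is zero at coincidence and largest half a period away, which mirrors the complementarity described for MSO and LSO.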

The simple time-to-rate conversion that occurs in binaural SOC nuclei may have analogs in monaural processing, e.g., rMTFs in the SOC of the mustache bat appear to be shaped by monaural excitatory and inhibitory interactions and delays similar to the binaural interactions described in cat and rabbit (91). The envelope ITD sensitivity in MSO and LSO also illustrates the general point that it is probably beneficial for a time-to-rate conversion (or more generally a recoding of a stimulus-locked temporal code into another form) to occur at a peripheral neural level. Indeed, the upper frequency limit (though not necessarily the gain) of phase-locking tends to decrease with subsequent integrative stages so that a de novo comparison of monaural phases by neurons at a higher level in the neuraxis would yield a more restricted ITD sensitivity. The frequency and modulation frequency range over which ITD sensitivity occurs in the IC and higher levels is comparable to that in the SOC but is rate-based (66, 219). For example, the ITD sensitivity of high-frequency cells in the IC extends to modulation frequencies to which the cells no longer phase-lock when tested monaurally [on average 600 Hz binaurally vs. 250 Hz monaurally (12), see also sect. VIII, B and H]. Also, envelope phase-locking in the monaural inputs to the LSO extends to modulation frequencies more than an octave higher than the highest fm at which LSO neurons show ITD sensitivity (~800 Hz) (129). The use of temporal information may thus be one evolutionary reason for the extensive subcortical processing in the auditory system relative to the other sensory systems.

Little is known about envelope sensitivity in other nuclei of the SOC. Olivocochlear efferent neurons in the guinea pig are surprisingly well phase-locked to AM signals below ~400 Hz (94), with bandpass tMTFs peaking at ~100 Hz. Maximal gains were ~8 dB higher than for auditory nerve fibers recorded in the same experiments. It is not known whether modulation differentially affects the targets of the medial olivocochlear neurons (the cochlear outer hair cells), although AM signals have been reported to be effective signals to suppress evoked otoacoustic emissions in humans (162).

Remarkable AM responses were described in monaural cells in the SOC of awake rabbits (145). Cells with sustained responses showed responses to AM similar in several respects to CN choppers, but an unusual class of “off” cells was inhibited during the presentation of pure tones and responded vigorously after stimulus termination. These cells were strongly driven by AM stimuli and showed high gains over a wide range of modulation frequencies, resulting in low-pass tMTFs and rMTFs. Several properties suggested that the responses were in effect a rebound from inhibition phase-locked to the stimulus envelope, a mechanism also observed in the SOC of the bat (91).

VII. THE NUCLEI OF THE LATERAL LEMNISCUS

The nuclei of the lateral lemniscus (NLL) are embedded in the lemniscal fibers that connect the lower brain stem nuclei with the IC (166, 262). As described in these reviews, ventral (VNLL), intermediate (INLL), and dorsal nuclei (DNLL) have been identified, although in some accounts the intermediate nucleus is treated as part of the ventral nucleus, with all NLL neurons located ventral to the DNLL referred to as the ventral complex of the NLL (165, 166). Despite considerable progress over the last decade in understanding the physiology of these nuclei, only two accounts (both in echolocating bats) have described responses to amplitude modulation (Ref. 109, big brown bat, Eptesicus fuscus; Ref. 310, mustache bat, Pteronotus parnellii parnellii). In the big brown bat synchronization to the modulation envelope occurred in nearly all unit types in VNLL, INLL, and DNLL (109). Both low- and band-pass tMTFs were reported. Neurons in VNLL and INLL responded to the highest frequencies of modulation with BMFs between 100 and 1,000 Hz and a preponderance of low-pass tMTFs. In contrast, a narrower range of tBMFs (100–500 Hz) was observed in DNLL units with a high proportion of bandpass tMTFs in sustained units. Responses to a similar range of modulation frequencies were recorded in the DNLL of mustache bat but with differences in the responses of onset and sustained units (310). Most onset units synchronized equally well to modulation frequencies between 100 and 300 Hz but showed markedly bandpass rMTFs. Sustained units responded up to 800 Hz with low-pass tMTFs and flat rMTFs. Inhibition contributes very differently to these responses. Blockade of GABAA receptors led to a reduction in synchronization at all modulation frequencies in sustained neurons while onset units either increased their modulation frequency cut-off to that of sustained neurons or revealed synchronization where none existed before. The shapes of the rMTFs were not changed by blocking inhibition (310).

VIII. AMPLITUDE MODULATION ENCODING IN THE INFERIOR COLLICULUS: A CENTER FOR CONVERGENCE

A. Basic Organization of the IC

The several parallel pathways that diverge in the cochlear nucleus from the common input of the cochlear nerve converge again in the IC, the principal midbrain nucleus in the auditory pathway. The IC is an obligatory processing center for most information ascending via the medial geniculate body to the auditory cortex. Anatomical investigations of the IC in several species have identified a broadly consistent arrangement of subdivisions: a central nucleus (CNIC) receiving most of the main ascending afferent input from many brain stem nuclei is surrounded dorsally, laterally, and rostrally by dorsal (DCIC) and external cortices (ECIC) (166, 200, 201). The CNIC is distinguished from the other subdivisions by its laminar organization. It is composed of two main cell types: disc-shaped or flat cells, interspersed with stellate or less-flat cells (164, 182). This cytoarchitecture gives rise in three dimensions to twisted laminae of cells and fibers (167) that constitute the substrate for the highly tonotopic frequency organization in the IC (173, 237, 252, 264). The frequency-band laminae are oriented so that neuronal best frequency increases along the dorsolateral to ventromedial axis of the nucleus. A defining feature of CNIC is the convergence of temporal, spectral, and spatial information extracted in parallel earlier in the pathway onto this laminar structure. However, the full details of how these converging inputs map onto individual neurons have yet to be elucidated, and it is not known to what extent the different strands of information are processed independently in the IC.

The DCIC and ECIC, as well as differing from the CNIC in their cytoarchitecture, have different inputs and outputs. Descending projections from the cortex terminate predominantly (304), although not exclusively, in the cortical divisions (248). The IC is an important source of both ascending fibers to the thalamus and descending connections to lower brain stem structures (110).

The monaural and binaural response properties of single neurons in the IC have been extensively documented (see Refs. 24, 112, 113). Despite the limitations in our knowledge about its cellular organization, it is clear that the output of the IC is considerably modified relative to its input. This is exemplified by the response patterns of IC neurons to complex sounds including AM. For the most part, such knowledge is derived from studies in anesthetized animals that have focused on neurons recorded in response to monaural stimulation of the ear contralateral to the side of recording, and in what follows monaural stimulation should be assumed unless otherwise specified. Most of the studies discussed here describe recordings attributed to the central nucleus, but depending on the age of the study and the parcellation adopted, in many cases this will have included at least part of the DCIC and ECIC as well as the CNIC. Therefore, in this review the term IC is used to indicate all subdivisions.

B. Modulation Transfer Functions for IC Units: Synchronization

IC neurons show strongly modulated responses that for many modulation frequencies greatly exceed the modulation in the stimulus (144, 222–224). Modulation gains calculated from synchronized responses in the IC are often 15–20 dB (144, 222, 224) and so are larger than equivalent measurements obtained in the auditory nerve and for most neuron types in the CN. The shape of the tMTF depends on the parameters of the stimulus (see below) but is invariably either bandpass or low pass (144, 152, 191, 222–224).
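To give a feel for what a gain of 15–20 dB implies, the sketch below computes a vector strength from spike times and converts it to a gain. It assumes the widely used definition in which gain is 20 log10(2R/m), with R the vector strength and m the stimulus modulation depth; that definition is not restated in this passage, so it is an assumption here. The function names and the spike train are hypothetical.

```python
import math

def vector_strength(spike_times, fm):
    """Vector strength R of spike times (s) relative to modulation at fm (Hz)."""
    x = sum(math.cos(2.0 * math.pi * fm * t) for t in spike_times)
    y = sum(math.sin(2.0 * math.pi * fm * t) for t in spike_times)
    return math.hypot(x, y) / len(spike_times)

def modulation_gain_db(r, m):
    """Gain in dB, ASSUMING the definition 20*log10(2R/m)."""
    return 20.0 * math.log10(2.0 * r / m)

# Hypothetical spike train: one spike per cycle of a 100-Hz modulation, no jitter
spikes = [k / 100.0 for k in range(50)]
r = vector_strength(spikes, 100.0)
print(round(r, 3), round(modulation_gain_db(r, 1.0), 1))

# The same definition applied to a shallow modulation depth (illustrative values)
print(round(modulation_gain_db(0.5, 0.1), 1))
```

Under this assumed definition a fully modulated stimulus (m = 1) cannot yield more than about 6 dB of gain because R cannot exceed 1, so gains of 15–20 dB would arise at modulation depths well below 100%.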

Modulation gain may be enhanced in the IC, but modulation frequencies that elicit a synchronized response are restricted to a lower range than in the periphery. This is manifest in both the tBMFs of neurons in the IC and the range of frequencies over which there is significant modulation of the response (Fig. 9). In the rat, Rees and Møller (223) obtained a modal tBMF in the range of 100–120 Hz. The tBMF never exceeded 200 Hz, and the high-frequency cut-off of the tMTF (measured 10 dB down from the BMF) did not exceed 320 Hz. In guinea pig, tBMFs fall below 150 Hz with most peaking between 50 and 100 Hz (224). Broadly similar values have been obtained in gerbil (144) and squirrel monkey (191). In the latter, 73% of neurons showed a bandpass tMTF for AM with tBMFs between 32 and 64 Hz. In rabbit, single units and multiunit clusters had a mean tBMF of 87 Hz (12). However, it is worth noting that one unit synchronized to a modulation frequency of 925 Hz. For samples of phasic neurons in both young and old mice, tBMFs were all below 200 Hz (291). Similarly in mustache bat, the majority of units (~70%) only synchronized their firing to modulation frequencies below 300 Hz, but a small proportion (4.5%) synchronized up to 500 Hz (20). While these values are broadly similar, the differences that exist more likely reflect species differences than the presence or absence of anesthetic, since there is no segregation of the values consistent with anesthetic status.
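The two summary measures used above, the tBMF and a high-frequency cut-off taken 10 dB below the peak, can be read off a sampled tMTF as in the following sketch. The interpolation on a linear frequency axis, the function name, and the tMTF values are all assumptions made for illustration; the cited studies may have used different procedures.

```python
def bmf_and_cutoff(fm_hz, gain_db, drop_db=10.0):
    """Return (BMF, high-frequency cut-off) from a sampled tMTF.

    The cut-off is the first fm above the BMF at which gain falls drop_db
    below the peak, found by linear interpolation between samples. It is
    None if the gain never falls that far within the sampled range.
    """
    peak = max(range(len(gain_db)), key=lambda i: gain_db[i])
    target = gain_db[peak] - drop_db
    for i in range(peak, len(fm_hz) - 1):
        g0, g1 = gain_db[i], gain_db[i + 1]
        if g0 >= target > g1:
            frac = (g0 - target) / (g0 - g1)
            return fm_hz[peak], fm_hz[i] + frac * (fm_hz[i + 1] - fm_hz[i])
    return fm_hz[peak], None

# Hypothetical bandpass tMTF (values invented for illustration only)
fm = [10, 20, 50, 100, 200, 300, 400]
gain = [2.0, 6.0, 12.0, 15.0, 9.0, 3.0, -4.0]
print(bmf_and_cutoff(fm, gain))
```

For a low-pass tMTF the same function returns the lowest sampled fm as the "BMF", which is one reason a BMF is usually reported only for bandpass functions.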

Rees and Møller (223) demonstrated that the shape of the tMTF is highly dependent on stimulus level as in some cochlear nucleus neurons. When stimulus intensity is close to threshold, tMTFs are usually low-pass functions but become more bandpass as the mean intensity of the stimulus is increased. This change may be accompanied by an upward shift in the tBMF. For neurons with nonmonotonic rate-level functions, however, the tMTF becomes low pass at sound levels falling on the negatively sloping limb of the rate-level function (224). So the relationship between tMTF shape and sound level is indirect, with firing rate, perhaps reflecting the net excitatory drive to the neuron, being the better predictor of the low-frequency slope of the tMTF. Why the effect of stimulus level is only apparent at low modulation frequencies is not clear and may depend on a number of factors including adaptation. Another possibility is that the neuron’s probability of firing at low stimulus intensities is only high near the peak of the modulation cycle, resulting in highly synchronized firing. As intensity is increased, threshold is exceeded for a larger fraction of the modulation cycle, leading to a reduction in synchronization. This effect might not be apparent at high modulation frequencies because the frequency of modulation approaches the neuron’s maximum firing rate, so ultimately only a single spike occurs in each cycle giving a high degree of synchronization whose upper limit is determined by temporal resolution. Such effects become more apparent in the cochlear nucleus and IC than the auditory nerve because of the enhancing effects of time-dependent inhibition, membrane properties, and other nonlinearities in more central neurons, evidenced by their lower spike rates.
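The threshold argument for low modulation frequencies can be sketched numerically. The model below is deliberately crude and is an assumption for illustration, not a model from the cited studies: firing probability is taken to be proportional to the part of a sinusoidal envelope that exceeds a fixed threshold, with no adaptation, refractoriness, or inhibition. The function name, threshold, and modulation depth are hypothetical.

```python
import math

def sync_above_threshold(level, threshold=1.0, m=0.5, n=3600):
    """Vector strength of a firing probability that follows the part of the
    envelope exceeding a fixed threshold (a deliberately crude model)."""
    x = y = total = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * k / n          # phase within the modulation cycle
        env = level * (1.0 + m * math.sin(theta))
        p = max(0.0, env - threshold)          # suprathreshold drive only
        x += p * math.cos(theta)
        y += p * math.sin(theta)
        total += p
    return math.hypot(x, y) / total if total > 0 else 0.0

# Levels are in arbitrary linear units relative to a threshold of 1
for level in (0.7, 1.0, 2.0, 10.0):
    print(level, round(sync_above_threshold(level), 3))
```

In this sketch synchronization is highest when only the envelope peak crosses threshold and falls as the whole cycle becomes suprathreshold, in line with the qualitative account given above.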

Further evidence for a relationship between tMTF shape and firing rate is provided by the effect of background noise. Bandpass tMTFs become low pass with the addition of progressively higher levels of background noise (223). Rees and Palmer (224) showed that this change correlated with the noise-induced shift in the neuron’s input/output function along the level axis and its consequent effect on the firing rate elicited by a stimulus.

C. Modulation Transfer Functions for IC Units: Average Rate

The most striking change in AM responses between the IC and its peripheral inputs is in the tuning of rMTFs; the dependence of average firing rate in the IC on modulation frequency is stronger, more common, and has a much wider diversity of patterns than is the case in the CN or the SOC (Fig. 6). (But it is important to note that we have only limited information about rate responses to modulation in the nuclei of the SOC and lateral lemniscus.)

rMTFs show a wider range of patterns than is usually observed for tMTFs. In the cat, Langner and Schreiner (152) identified specific patterns of rMTF in a population of single units and multi-unit clusters. These included bandpass, low-pass, high-pass, band-reject, and complex types. The majority were bandpass (70% of single units, 58% of multiunits). Similar response patterns are also found in bat (32) and mouse (291). In guinea pig, 45% of rMTFs were bandpass; the remainder included a variety of different shapes, with some units showing little effect of modulation frequency on firing rate (224). Units whose average firing rate did not change with modulation frequency were the most common type encountered in squirrel monkey, making up almost half of the total (191). The most detailed study of rMTFs in the IC is that of Krishna and Semple (144) in gerbil. In addition to confirming the rMTF shapes described previously, Krishna and Semple (144) noted that many rMTFs were characterized by distinct ranges of modulation frequency over which firing rate was enhanced or suppressed. In some, regions of enhancement were separated by a marked region of suppression that defined a worst modulation frequency separating the two maxima.

Like synchronized responses, rate responses to modulation depend on the mean level of the stimulus (144, 224). Where units have bandpass rMTFs and monotonic rate level functions, the heights of the peaks in the rMTFs increase and then decrease with the average level. They are highest when measured at sound levels on the sloping portion of the rate level function and decline as the stimulus level rises into the saturating region of the rate level function (224). Across a population of neurons with peaked rMTFs, increases and decreases in BMF with level were observed (144). In units with rMTFs containing regions of suppression, the suppression often becomes more prominent as stimulus level or modulation depth is increased. In some instances, regions of firing rate enhancement changed to suppression at high stimulus levels. Krishna and Semple (144) postulate that inhibition is an important contributor to these effects.

FIG. 6. Transformations between the cochlear nucleus (CN) and inferior colliculus (IC). Both nuclei show a wide variety of AM responses; each column highlights only one of the types of responses observed and how these are affected by parametric stimulus variations (in SPL, m, fc) in single cells. The most striking difference between CN and IC is in the rate modulation transfer functions (rMTFs), which are only rarely sharply peaked in the CN (C) but frequently so in the IC (D), where they can also show a degree of invariance with AM parameters. This is also the case for temporal modulation transfer functions (tMTFs) in the IC (B), which reach higher maximal synchronization values than in the CN (A) and often show a degree of bandpass selectivity, but their maxima occur at lower fm than in the CN.

There is general agreement across species in the modal value of the rBMF distribution in the IC. In the cat, the modal value for rBMF lies between 30 and 100 Hz (152). These values are in keeping with those reported in rat (222, 223), guinea pig (224), gerbil (144), and bat (32). In the primate, the peak of the distribution of rBMFs of multi-units was 128 Hz (191).

There is less agreement over the upper frequency limit for rBMFs. In the cat, almost 20% of multiunit clusters had rBMFs greater than 200 Hz as did ~5% of single units (152). A few units had rBMFs as high as 1,000 Hz. rBMFs of up to 800 Hz were also reported for some units in bat (32) and mouse (291). In contrast, the maximum rBMFs recorded for single units in gerbil did not exceed 300 Hz (103, 144), and in squirrel monkey, the maximum rBMF value reported was 256 Hz (191). It is quite likely that the differences between these studies reflect true species differences, with there being no such creature as the average mammal. However, other factors might be contributory. The cat data show that rBMFs >300 Hz were more prevalent in multi-unit recordings. As Langner and Schreiner (152) comment, multi-unit recordings may contain responses from the fiber inputs to the IC as well as its neurons. Given that some of these inputs originate from nuclei in which neurons synchronize to higher modulation frequencies than in the IC, their contribution could be misleading. On the other hand, units with high rBMFs may be more difficult to record as single units, and a small number of single units with high BMFs were reported. Krishna and Semple (144) suggest that misclassifying the secondary peak of enhancement as the BMF in those units with more than one rMTF peak might explain the high rBMFs reported in cat. Apart from species differences, the presence or absence of anesthesia is another factor that could account for the observed differences in the ranges of rBMFs. However, it seems unlikely that anesthesia is the only factor, since some of the largest differences are seen when comparing data from different species where no anesthetic was used [compare values above for squirrel monkey (191), bat (32), and mouse (291)]. On the other hand, similar values were obtained in some anesthetized and unanesthetized preparations, e.g., cat (152) and mouse (291). Unfortunately, definitive experiments comparing the presence and absence of anesthetic have yet to be performed.

D. What Determines the MTF Upper Limit in the IC?

Across species, tBMFs and tMTF cut-off frequencies are generally lower in the IC than at more peripheral stages of the pathway. The reasons for this are not clear. In the auditory nerve, filter bandwidth is one limiting factor as evidenced by the correlation between the upper limit of the response to modulation and a fiber’s CF (see sect. IVB and Fig. 2). However, evidence for a similar relationship between the response to AM and CF in the IC is weak. In the cat, the upper boundary of the rBMF distribution (and presumably the tMTF distribution since rBMFs and tBMFs are reported to be similar) for multiunits increases with CF (152). But evidence of such a correlation was not apparent in single-unit data recorded in other species [rat tBMF (223), squirrel monkey (rate or synchronization not specified) (191), bat rBMFs and tBMFs (32), or gerbil (144)]. Krishna and Semple (144) examined a large data set and failed to find any correlation between CF and rBMF or between CF and the cut-off frequency of either rMTFs or tMTFs. Furthermore, the frequency bandwidths of most IC neurons are sufficiently wide to accommodate the stimulus spectrum. Thus it seems something other than frequency bandwidth is primarily responsible for setting the upper frequency limit of the response to AM in the IC.

An alternative possibility is that the shift in the response to lower modulation frequencies in the IC reflects a reduction in temporal resolution. Such a reduction is suggested by an upper frequency limit of 600 Hz for phase-locking to pure tones in the IC, a substantially lower value than pertains in auditory nerve fibers (147). The mechanisms responsible have not been identified, but intrinsic membrane properties and synaptic mechanisms are possible candidates, as is the accumulated loss of temporal resolution en route from the periphery. The contribution of synaptic processing is now being investigated, but thus far blockade of inhibitory or excitatory mechanisms has failed to show any significant influence on the upper limit of synchronization. Neurons in the IC of the mustache bat seldom responded to a wider range of modulation frequencies following the blockade of GABAA, GABAB, or glycinergic inhibition (20). This finding is in contrast to the marked increase in the upper limit of synchronization in DNLL neurons in the same species with GABAergic blockade (310). Similarly, blockade of neither N-methyl-D-aspartate (NMDA) (20, 321) nor DL-α-amino-3-hydroxy-5-methylisoxazole-propionic acid (AMPA) excitatory receptors (321) resulted in changes in the upper limit of synchronization. Likewise, in chinchilla, Caspary et al. (28) found no change in the temporal response to AM with blockade of GABAA receptors, but they did report changes selectively affecting the low-frequency limb of rMTFs in some units.

558 JORIS, SCHREINER, AND REES

Physiol Rev • VOL 84 • APRIL 2004 • www.prv.org


E. Is AM Encoded in the IC by Rate or Synchronization?

Whether AM is encoded in the IC by synchronization or by average firing rate remains an open question. Of course, both measures may be important either independently or combined as synchronized rate. tMTFs and rMTFs and BMFs match in many units, but in a significant percentage of neurons they are different, with, in some cases, no obvious dependence of rate on modulation frequency despite a clearly tuned tMTF (144, 152, 191, 222, 224). Population data on the MTF types obtained using synchronized or average rate measurements were reported in the cat (152). Seventy percent of units had bandpass rMTFs, and only 7% were low pass. In contrast, a much larger proportion of tMTFs showed low-pass functions (48%) compared with bandpass functions (33%), such that 60% of units with low-pass tMTFs had bandpass rMTFs.
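As an illustration of the two measures contrasted here, the following minimal sketch computes one point of an rMTF (average rate) and one point of a tMTF (vector strength, a widely used synchronization index) from a list of spike times. The function names and the toy spike trains are assumptions made for illustration and are not taken from any of the cited studies.

```python
import cmath
import math


def average_rate(spike_times, duration_s):
    """Average firing rate in spikes/s (one point on an rMTF)."""
    return len(spike_times) / duration_s


def vector_strength(spike_times, fm_hz):
    """Synchronization to the modulation period (one point on a tMTF).

    Each spike is mapped to a unit vector at its modulation phase; the
    length of the mean vector is 1 for perfect locking and near 0 when
    spikes are spread evenly over the cycle.
    """
    if not spike_times:
        return 0.0
    total = sum(cmath.exp(2j * math.pi * fm_hz * t) for t in spike_times)
    return abs(total) / len(spike_times)


# Hypothetical spike trains: same rate, different locking to a 50-Hz modulation.
fm = 50.0
locked = [k / fm for k in range(20)]                       # fixed phase every cycle
unlocked = [k / fm + (k % 4) * 0.005 for k in range(20)]   # phase steps around the cycle

rate_locked = average_rate(locked, 0.4)
rate_unlocked = average_rate(unlocked, 0.4)
vs_locked = vector_strength(locked, fm)
vs_unlocked = vector_strength(unlocked, fm)
```

The two trains have identical average rates but very different synchronization, which is the kind of dissociation between rMTF and tMTF described in the text.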

Nevertheless, the relationship between firing rate and modulation frequency that emerges in IC might signal a transformation in the encoding of AM from a temporal to a rate-based representation, and models have been proposed explaining how this might be achieved (105, 149, 160). A common approach invokes coincidence detection in IC neurons operating on synchronized responses to modulation from stellate cells in the cochlear nucleus. Although elegantly simulating many modulation responses of neurons in the IC, current implementations match the BMFs of the IC neuron and its inputs from the cochlear nucleus despite experimental data (cf. sects. V and VII and Fig. 9) which suggest that the BMF ranges are not the same.

As this discussion has shown, synchronized responses to the modulation envelope are well maintained in the colliculus, and rMTFs are not simple reflections of tMTFs. It is premature, therefore, to conclude that temporal based encoding of the modulation envelope has no significance in the IC. Both rate and synchronized coding might be retained with different functional consequences. A rate code could allow the encoding of modulation frequencies that exceed the synchronization limit in the IC, and the data of Schreiner and Langner (251) support this conjecture as does the finding in squirrel monkey that the distribution of rMTFs peaks at a higher frequency than the distribution of tMTFs (191). On the other hand, some studies show that synchronization and rate measures extend over broadly similar ranges of modulation frequency (see sect. VIIIC).

F. Relationship Between AM Responses and Other Neuronal Properties

Possible functional relationships between response to AM and other physiological properties have not been well explored in the IC (at least partly because there is no generally accepted physiological classification scheme such as exists for the CN). A variety of firing patterns to tones are recorded in the IC, and most authors have distinguished onset and sustained responses (see Refs. 112, 113 for review), which can be further subdivided into distinct classes (e.g., Refs. 221, 290). Such patterns depend on the state of intrinsic membrane conductances that in turn are modulated by inhibition (155, 209, 268). Both sustained and onset units can respond to continuous AM stimuli that last several seconds (144, 222, 224). Although some onset units fail to respond to AM, those that do respond at modulation depths well below 100%, negating the argument that the response is effectively to a series of tone bursts. It does seem that onset units are the least likely to respond to AM. In both the bat and the rat, most of the units failing to respond to modulation were onset types (32, 204). Other differences in the response to AM between different unit types are also beginning to emerge. In bat, average rBMFs increased progressively when comparing the responses of tonic, chopper, and onset neurons (32). Sinex et al. (267) report differences between unit types and their responses to sinusoidal and trapezoidal AM. Krishna and Semple (144) describe rMTFs with two peaks separated by a region of suppression. These were predominantly seen in units with sustained or pauser PST histograms. Onset or onset-sustained neurons showed only a single peak of enhancement.

Another property of IC neurons correlating with the response to modulation is regularity of firing. Regular firing, as measured by calculating the coefficient of variation (320), is apparent in a number of different neuronal types (221). A preliminary report (225) shows that units with highly regular intrinsic oscillations show a strong correlation between tBMF and the oscillation frequency. On the other hand, cells with peaked rMTFs are mainly limited to neurons that fire irregularly to tones.
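The regularity measure mentioned above can be illustrated with a simplified sketch. Here the coefficient of variation (CV) is taken as the standard deviation of the interspike intervals divided by their mean, computed over a whole spike train; published regularity analyses often compute it as a function of time after stimulus onset, which this sketch does not attempt. The function name and the spike times are hypothetical.

```python
import statistics


def isi_cv(spike_times):
    """CV of interspike intervals: SD / mean (0 for a perfectly regular train)."""
    intervals = [b - a for a, b in zip(spike_times, spike_times[1:])]
    if len(intervals) < 2:
        raise ValueError("need at least three spikes")
    return statistics.pstdev(intervals) / statistics.mean(intervals)


regular = [0.010 * k for k in range(10)]        # hypothetical regularly firing unit
irregular = [0.0, 0.001, 0.020, 0.021, 0.050]   # hypothetical irregularly firing unit

cv_regular = isi_cv(regular)
cv_irregular = isi_cv(irregular)
```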

G. Is Modulation Frequency Represented Topographically in the IC?

Some of the responses discussed so far, in CN and IC, provide suggestive evidence for a physiological implementation of a modulation filter bank. This view would be strengthened if neurons were found to be spatially organized according to their AM tuning properties, since the creation of spatial maps is a common strategy in nervous systems. Evidence for a topographic representation of modulation frequency in the IC of cat was reported by Schreiner and Langner (251). rBMFs and tBMFs were determined for units encountered in multiple penetrations through the IC at recording sites reconstructed from the coordinates of the electrode penetration and the recording depth. The measured values, together with interpolated points, were assembled to create a map of BMF. Two patterns of rBMF organization emerged. First, a gradient of rBMF extended along the dorsoventral axis of the colliculus with CF. Measurements of rBMF along such electrode penetrations revealed a progressive increase in rBMF with depth, although the overall trend was accompanied by discontinuities and reversals of rBMF. In addition, a map of BMF extended across the plane of the frequency-band laminae. The highest BMFs were found caudally in the lateral half of the lamina. Regions representing the highest BMFs were surrounded by “quasi-concentric” iso-BMF contours representing progressively lower BMFs. The diameter of the contour representing each BMF and the upper limit of BMF increased with CF. Thus, considered in three dimensions, each modulation frequency is represented on the surface of a cone having its base located in the high-frequency region of the IC and its long axis aligned with the dorsoventrally orientated tonotopic axis of the IC (Fig. 7). Schreiner and Langner (251) propose that this map demonstrates the importance of the IC in the perception of periodicity pitch and that such a representation could facilitate the integration of periodicity information across carrier frequency. In support of the map, they cite the corroborative evidence that response latency is spatially mapped across the frequency band laminae in the IC (153) and that BMF is negatively correlated with response latency. This implies that there should be a mapping of BMF along the same axis as the latency map. Evidence for a mapping of modulation frequency has also been reported in a developmental study in the gerbil with responses to the highest modulation frequencies found most laterally as in the cat (103).

The publication of such a mapping of BMF has been influential in the development of theories and models of temporal processing in the auditory pathway (35–37, 105, 149). However, a correlation of BMF with location or with CF has not been confirmed in other studies; indeed, as discussed above, there is still debate about the range of modulation frequencies represented in the IC. Given the concentric organization of the modulation map described in the cat, it is unlikely that a pattern of such complexity would be found unless it were the primary objective of the study. But, as discussed in section VIIIC, the determination of BMFs from multiunit data, on which most of the mapping is based, must proceed with caution. On the other hand, it is difficult in single-unit studies to achieve the necessary sampling density that such mapping ideally requires. An additional complicating factor in this discussion is the lack of invariance of both tBMFs and rBMFs with stimulus level (144, 223). Resolution of this issue may depend on the development of techniques that enable the modulation response properties of large populations of neurons to be determined with high spatial and temporal resolution. Finally, it should be emphasized that the absence of a map would not invalidate the existence of a modulation filter bank. As an analogy, there is some evidence for a map of ITD tuning in the MSO (14, 269, 313), but a spatial organization in the IC has not been convincingly demonstrated (315). Nevertheless, the relevance of ITD tuning for binaural hearing is not in question.

FIG. 7. Bandpass temporal (A) and rate modulation transfer functions (B) in the inferior colliculus (IC), with indication of best modulation frequencies (tBMF and rBMF) and cutoff frequencies. Various definitions have been used for cutoff frequency, usually based on a decrease in gain (e.g., the frequency at which the synchronization value is 3 dB down from the maximal gain at the BMF) or statistical significance (e.g., the highest frequency at which significant synchronization is observed). C: schematic illustration of the proposed map of BMFs in the IC. The concentric circles indicate iso-BMF contours within an iso-frequency plane. Dashed lines connect contours of the same BMF.

H. Responses to Interaural Time Disparities in Modulation Envelopes

Human subjects can localize sounds using on-going ITDs generated by the amplitude envelope, even when the carrier frequency of the sound is above 1.5 kHz and subjects can no longer localize using interaural time differences in the carrier (see sect. VI). Physiological responses to such binaurally disparate amplitude modulations were first investigated systematically by Yin et al. (317). Firing varied cyclically as a function of ITD, at a period equal to that of fm, indicating that the neurons were responding to the interaural delay of the modulation waveform, not of the carrier. In many respects, ITD sensitivity in the IC strongly resembles that in the SOC, e.g., it reflects the same two basic forms of interaction (see sect. VI and Fig. 5). There are also differences, indicating an elaboration of response properties between SOC and IC, but these are outside the scope of this review (13, 66, 172).
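The cyclic dependence on ITD can be illustrated with a deliberately simple coincidence model in which the output is the time-averaged product of the modulation envelopes at the two ears. This is only a sketch under strong assumptions (the carrier is ignored entirely, and the model is not the one used in the cited studies); the function names and parameter values are hypothetical.

```python
import math


def envelope(t_s, fm_hz, depth=1.0):
    """Sinusoidal modulation envelope, 1 + m*cos(2*pi*fm*t)."""
    return 1.0 + depth * math.cos(2.0 * math.pi * fm_hz * t_s)


def coincidence_output(itd_s, fm_hz, duration_s=0.1, fs_hz=20000):
    """Time-averaged product of one envelope and the delayed other envelope."""
    n = int(round(duration_s * fs_hz))
    total = 0.0
    for i in range(n):
        t = i / fs_hz
        total += envelope(t, fm_hz) * envelope(t - itd_s, fm_hz)
    return total / n


fm = 100.0  # hypothetical modulation frequency, period 10 ms
at_zero = coincidence_output(0.0, fm)
at_half_period = coincidence_output(0.005, fm)
at_full_period = coincidence_output(0.010, fm)
```

For 100% sinusoidal modulation averaged over whole cycles, this product equals 1 + 0.5 cos(2π fm ITD), so the model output repeats with a period of 1/fm whatever the carrier frequency.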

The width of ITD tuning to sinusoidal signals is basically determined by the period of the stimulus. Low frequencies are weighted more heavily in responses based on envelope than in those based on fine structure, because envelope MTFs of IC cells typically extend further to low frequencies than their tuning to fine structure. ITD tuning therefore is typically broader to AM signals than to tones. However, even at high CFs, where phase-locking to fine structure is completely lacking, the ITD tuning to broadband noise can be surprisingly sharp (123). The presence of such tuning, in the absence of any ITD sensitivity to pure tones, indicates that envelope fluctuations generated by the interaction of the cochlear bandpass filters with the broadband stimulus can effectively be used in the computation of ITDs.

I. Contribution of Nonlinearities

For all but the lowest modulation depths, the response to a sinusoidal AM in the IC is not sinusoidal but more peaked with firing restricted to only part of the modulation cycle (144, 196, 222). As modulation depth is increased, changes also occur in the phase of the response histograms relative to the stimulus (144, 196, 222). Such changes are consistent with the response following the amplitude envelope at low modulation depths but changing to one which is sensitive to the rate of amplitude change at high depths. Sometimes this is associated with the appearance of a smaller second peak in the histogram indicative of a response to the downward amplitude change in the modulation cycle (222). Direct evidence for such responses comes from experiments using modulations with exponential envelopes (215).

Similarly, asymmetries have been reported in both the rate and temporal responses of IC neurons in guinea pig to exponentially ramped and damped sinusoids (197). When such ramped and damped stimuli have the same half-life, their long-term spectra are identical, but their different temporal structures generate quite distinct percepts (205). The percentage of units showing asymmetry in the magnitude of their temporal or rate responses to these stimuli is greater than obtained using similar analyses in the VCN (216), and the proportion of neurons showing response asymmetry at each stimulus half-life closely matched human psychophysical performance (205).
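The equality of the long-term spectra follows from the fact that time-reversing a real signal leaves its magnitude spectrum unchanged. The sketch below checks this numerically for a single damped sinusoid and its time-reversed (ramped) counterpart. The carrier frequency, half-life, duration, and sampling rate are arbitrary illustrative values, not those of the cited experiments, and the function names are hypothetical.

```python
import cmath
import math


def damped_sinusoid(carrier_hz, half_life_s, duration_s, fs_hz):
    """Sinusoid whose amplitude halves every half_life_s seconds."""
    n = int(round(duration_s * fs_hz))
    return [
        math.exp(-math.log(2.0) * (i / fs_hz) / half_life_s)
        * math.sin(2.0 * math.pi * carrier_hz * i / fs_hz)
        for i in range(n)
    ]


def dft_magnitudes(x):
    """Magnitude of the discrete Fourier transform (direct O(N^2) sum)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


damped = damped_sinusoid(1000.0, 0.004, 0.025, 8000)  # 200 samples
ramped = damped[::-1]                                  # time-reversed waveform

mag_damped = dft_magnitudes(damped)
mag_ramped = dft_magnitudes(ramped)
max_difference = max(abs(a - b) for a, b in zip(mag_damped, mag_ramped))
```

The two waveforms differ sample by sample, yet `max_difference` is at the level of floating-point rounding error, so any response asymmetry must arise from temporal structure rather than from the magnitude spectrum.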

A few studies have investigated nonlinearities in the responses of IC neurons to AM using more complex modulation waveforms. Møller and Rees (189) recorded spike histograms synchronized to the period of a pseudorandom noise used to modulate a tone carrier. Cross-correlation of the pseudorandom noise with the histogram to obtain the impulse response followed by Fourier transformation generates the tMTF. This estimate of the linear component of the response correlates well with responses obtained using sinusoidal modulation. An estimate of the nonlinear component can be obtained by using the impulse response to model the neuron, with the difference between the neuronal and model outputs providing a measure of the nonlinearities present in the neuronal response. The nonlinearities were predominantly even order, perhaps representing asymmetry in the response to increasing and decreasing sound intensity. Application of this technique to the owl IC similarly demonstrated the presence of significant nonlinearity (133). Such nonlinearities are more prominent in the response of IC neurons than those in the cochlear nucleus (184, 188).
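The linear part of this analysis can be sketched as follows, under simplifying assumptions. The period histogram is simulated as the output of a purely linear system (circular convolution of a Gaussian pseudorandom modulator with a known impulse response), so that the cross-correlation estimate can be compared with the truth. All names, the impulse response, and the sequence length are hypothetical; real histograms would also contain rectification, noise, and the nonlinear components discussed in the text.

```python
import cmath
import math
import random


def circular_convolve(signal, kernel):
    """Output of a linear system driven by one period of a periodic signal."""
    n = len(signal)
    return [
        sum(kernel[k] * signal[(t - k) % n] for k in range(len(kernel)))
        for t in range(n)
    ]


def impulse_response_by_xcorr(noise, response, n_lags):
    """Circular cross-correlation of noise and response, normalized by noise power."""
    n = len(noise)
    power = sum(v * v for v in noise)
    return [
        sum(noise[t] * response[(t + lag) % n] for t in range(n)) / power
        for lag in range(n_lags)
    ]


def tmtf_gain(impulse_response):
    """Gain versus modulation-frequency bin: DFT magnitude of the impulse response."""
    n = len(impulse_response)
    return [
        abs(sum(impulse_response[k] * cmath.exp(-2j * math.pi * f * k / n)
                for k in range(n)))
        for f in range(n // 2 + 1)
    ]


random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(8192)]        # pseudorandom modulator
true_h = [math.exp(-k / 3.0) * math.cos(2.0 * math.pi * k / 8.0)
          for k in range(16)]                                # assumed bandpass response
histogram = circular_convolve(noise, true_h)                 # simulated linear "neuron"

est_h = impulse_response_by_xcorr(noise, histogram, len(true_h))
gain = tmtf_gain(est_h)
peak_bin = gain.index(max(gain))
```

Subtracting the prediction of such a linear model from a measured histogram would give the residual that the text interprets as the nonlinear component; in this simulation the residual is small by construction.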

The AM stimulus that ultimately holds the greatest interest for auditory neuroscience is human speech. Delgutte et al. (40) compared the encoding of modulated noise and a speech utterance at the levels of the auditory nerve, CN, and IC. Step responses derived from the responses to modulation indicate that responses to amplitude changes in the IC are more phasic than those in the auditory nerve and, to a lesser extent, the CN. This was borne out by the responses to speech sounds that were characterized by bursts of activity at the onsets of syllables. When the responses to the speech waveform were estimated with the linear component of the modulation, the model accurately predicted the neural response for neurons in the auditory nerve and cochlear nucleus, but the match for the IC was poor.

Although much less abundant than reports using sinusoidal modulation, these studies indicate that the emergence of nonlinear responses to modulated stimuli is a defining characteristic of processing in the IC, and the greater application of such nonsinusoidal AM stimuli is likely to add substantially to our knowledge of nonlinear mechanisms in the IC.

IX. AMPLITUDE MODULATION ENCODING IN AUDITORY THALAMUS AND CEREBRAL CORTEX

A. Basic Layout of the Thalamocortical System

The medial geniculate body (MGB) of the thalamus is an obligatory station for auditory information from the midbrain to the cerebral cortex. Based on cytoarchitecture, connectivities, and physiological response properties, three main thalamic regions can be defined (304). Similarly, auditory cortex consists of several distinct fields that can be grouped into core, belt, and parabelt regions according to connectivity and physiology (130, 218). We discuss the projection systems set up in the thalamus and their relationship with the parcellation of auditory cortex.

The ventral division of the MGB (MGBv) is considered the principal part and is functionally distinguished by a clear tonotopy that is related to its laminar dendritic organization. The MGBv is functionally homogeneous with sharp frequency selectivity, short latencies, and low response thresholds. Several properties, such as the density of inhibitory interneurons, sharpness of tuning, onset latency, and strength of pure-tone phase-locking, vary systematically along the anterior-posterior axis, i.e., orthogonal to the frequency gradient (236). The axons from the ventral division terminate predominantly in tonotopically organized “core” areas of auditory cortex, specifically the primary auditory cortex (AI) as well as the anterior and posterior auditory fields (AAF and PAF, respectively) in the cat and field R in the macaque monkey. The projections from MGBv also reflect the anterior-posterior gradients so that, for example, AAF in the cat receives stronger input from the anterior pole, whereas PAF and the ventroposterior auditory field (VPAF) are chiefly connected with the posterior pole. The same holds for the numerous corticothalamic feedback projections from the cortical core regions to the MGBv.

Two further projection systems parallel to the tonotopic system have been identified. One “diffuse” or nontonotopic system is routed through the dorsal division of the MGB (MGBd). MGBd and its subdivisions are characterized by broad tuning, weak responses to tones, and some preference for more complex sounds. The dominant neurons are stellate cells, and the cortical projection is predominantly to nontonotopical fields in the belt and parabelt regions of auditory cortex such as the second auditory field (AII) in cat and CM in the macaque monkey. The third projection system is associated with the medial division of the MGB (MGBm). This “magnocellular” area is characterized by fairly large multipolar cells and receives polysensory inputs. No clear tonotopic organization is evident, and the neurons are usually broadly tuned or have multiple response areas. MGBm projects to a wide range of cortical fields including areas in the core, belt, and parabelt regions, and it also receives widespread corticothalamic feedback. In addition, the dorsal and medial projection systems are distinguished by their termination predominantly in layers I and VI, while inputs from the main tonotopic system end in layers IV and III.

Functional differences between the three projection systems and their associated regions have been mainly explored using spectral properties, such as frequency and intensity. Again, the importance of temporal dimensions in the perception of complex sounds suggests that much can be gained from the study of temporal response features in the different parts of auditory thalamus and cortex (31, 101, 210).

B. Temporal Responses in the MGB

Relatively few studies have addressed the capability of thalamic neurons to encode temporal information. A study of thalamic neurons in the awake guinea pig (34) revealed that some neurons phase-lock to AM tones with modulation frequencies up to 200 Hz. A more systematic study in the awake squirrel monkey (217) showed that most tMTFs were bandpass with tBMFs between 2 and 128 Hz. The most commonly encountered tBMF was at 32 Hz. MGBm had a higher median tBMF (16 Hz) than MGBv (8 Hz). Over the range of modulation frequencies tested, no significant difference was observed between rBMFs and tBMFs. This suggests that AM coding in the thalamus, at least below ~100 Hz, is mostly conveyed by a temporal code accompanied by rate changes due to the phasic nature of the responses. To date, there is little information available that directly contributes to the question of the increasing prominence of rate-coding in the more central auditory stations.

Changes in modulation depth affected rate and synchronization differently; synchronization increased with increase in m, while the firing rate showed a nonmonotonic dependence. Changes in overall intensity of the AM signal resulted in either monotonic or nonmonotonic changes in firing rate and synchronization, with a higher percentage of nonmonotonic changes in synchronization.

Recently, a number of studies in a variety of structures have utilized complex auditory spectra to estimate the spectrotemporal receptive field (STRF) of neurons using reverse correlation methods (e.g., Refs. 2, 38, 56, 58, 142, 143, 148). The STRF can be interpreted as the average signal preceding an action potential, corresponding to the spectrotemporal impulse response of the neuron. STRF estimates of temporal resolution can be directly related to estimates using isolated AM sounds and would yield the same result in a linear system. Additionally, the use of complex spectra can reveal nonlinearities such as the dependence of the estimated filter shape on spectral and temporal depth of modulation and overall intensity. A recent analysis of temporal filter properties derived from STRFs in MGBv of ketamine-anesthetized cats (175) (Fig. 8) revealed a similar range of tBMFs (35 ± 30 Hz) to that observed in the awake guinea pig and squirrel monkey (34, 217). As seen with isolated AM signals, individual neurons could follow modulation frequencies above 100 Hz. Compared with AM responses in the IC, it appears that the overall range of temporal following capacity in the auditory thalamus is considerably reduced (Fig. 9).
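The interpretation of the STRF as the average signal preceding an action potential can be made concrete with a minimal spike-triggered-average sketch. The stimulus here is Gaussian noise in a few frequency channels, and the simulated neuron fires whenever one channel exceeded a threshold a fixed number of time bins earlier; the channel, latency, threshold, and all names are hypothetical choices for illustration, not a description of the methods of the cited studies.

```python
import random


def spike_triggered_average(stimulus, spike_bins, n_lags):
    """Average stimulus[f][t - lag] over spikes, for each channel f and lag."""
    n_freq = len(stimulus)
    sta = [[0.0] * n_lags for _ in range(n_freq)]
    used = 0
    for t in spike_bins:
        if t < n_lags - 1:
            continue  # not enough stimulus history before this spike
        used += 1
        for f in range(n_freq):
            for lag in range(n_lags):
                sta[f][lag] += stimulus[f][t - lag]
    if used == 0:
        raise ValueError("no usable spikes")
    return [[value / used for value in row] for row in sta]


random.seed(1)
n_freq, n_time, n_lags = 4, 5000, 8
stimulus = [[random.gauss(0.0, 1.0) for _ in range(n_time)] for _ in range(n_freq)]

# Toy neuron: spikes when channel 2 exceeded 1.0 three bins earlier.
spike_bins = [t for t in range(3, n_time) if stimulus[2][t - 3] > 1.0]

sta = spike_triggered_average(stimulus, spike_bins, n_lags)
peak = max(
    ((f, lag) for f in range(n_freq) for lag in range(n_lags)),
    key=lambda p: sta[p[0]][p[1]],
)
```

The estimate recovers the channel and latency that drive the toy neuron; the temporal profile of a measured STRF at its best frequency plays the role of the impulse response from which a tMTF can be derived.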

A number of studies that have explored the coding of click trains in the auditory thalamus contribute significantly to our knowledge of temporal coding in the MGB. Changes in fm of an AM stimulus result in the systematic change of two potentially confounding aspects of the stimulus, namely, a change in the period between events and a change in the rise time of each event. To avoid the effects of rise-time changes with repetition rate, click trains have been widely used to explore temporal coding properties. While these two methods are not totally equivalent, they do capture closely related aspects of repetition rate coding. One of the first studies of temporal coding in the thalamus was carried out using click trains (284) in the awake, paralyzed cat. As in AM studies, maximum limiting rates (i.e., the highest click rate that showed any evidence of phase-locking) varied widely between 6 and 200 Hz. These findings were confirmed and expanded in a series of studies by Rouiller and colleagues (240, 241) in nitrous oxide-anesthetized cats. These investigators distinguished neurons by differences in the temporal precision of the responses. The largest group of neurons (“lockers,” 71%) showed tight temporal locking to the clicks. “Groupers” (8%) responded with weak temporal synchrony, and “special responders” (21%) showed no clear phase-locked responses although changes in firing rate did occur, occasionally resulting in strongest responses for click rates between 200 and 400 Hz. Overall, limiting rates between 10 and 800 Hz were observed, and ~50% of lockers had a limiting rate greater than 100 Hz. Keeping in mind that these limiting rates were not extracted at the 50% value of the transfer functions (the traditional measure of limiting rate), and the inherent differences between click-train analysis and AM analysis, the actual range of temporal resolution estimated by this method appears to be compatible with that observed in AM studies.

Rouiller and De Ribaupierre (240) reported some differences between thalamic subdivisions regarding the percentage of lockers. More lockers were located in the anterior region of MGBv than in the posterior portion, and the highest limiting rates were also encountered in the anterior part. They observed no clear CF dependency for the distribution of lockers but noticed that the lockers had shorter latencies than groupers and special responders. Furthermore, lockers with limiting rates above 100 Hz had response latencies ~2–3 ms shorter than lockers with limiting rates below 100 Hz, similar to the latency-BMF correlation found in the IC (153). No obvious differences in the distribution and range of limiting rates were found between recordings made in the nitrous oxide-anesthetized and awake preparations.

In summary, AM phase-locking in thalamic neurons varies over a wide range from a few Hertz to several hundred Hertz. Some neurons can follow high rates, but the majority of neurons appear to peak at rates below 100 Hz. A subgroup of neurons may respond to temporal information with changes in firing rate rather than in phase-locking; however, the proportion of such a group and its properties are still unexplored. It appears that the majority of neurons show limiting rates below that of the IC, but a detailed comparative study of the transformation of temporal coding from the IC to the MGB is still lacking.

FIG. 8. tMTFs in the medial geniculate body (MGB) and primary auditory cortex (AI). Typical example tMTFs (synchronized firing rate) from neurons in the ventral division of the MGB (A) and in AI (anesthetized cat) (B). C: composite tMTFs for thalamus (dashed line) and cortex. By averaging all tMTFs for thalamic and cortical units separately, the temporal modulation filters of these two stations are approximated. The dotted lines indicate the 6-dB upper cut-off frequency. [Adapted from Miller et al. (176).]

C. Responses to AM in Primary Auditory Cortex: Synchronization

A number of studies provided initial evidence that temporal coding in auditory cortical neurons may be substantially reduced compared with subcortical levels (Fig. 9). Studies with FM and AM in the awake cat (300) and guinea pig (34) showed neurons had maximum following rates of ~30 Hz. In later studies, the range of synchronization of AI neurons to AM was systematically explored in a variety of species. A high percentage of neurons showed bandpass tMTFs (53, 75, 157, 256). The tBMF values in AI were found to be independent of the CF of the neurons (53, 157, 256). Accordingly, temporal information in different frequency channels can be processed independently from each other; within each spectral band, AM information can be decomposed by different neurons into different AM ranges. Much attention has therefore been given to the distribution of optimal modulation frequencies. Preferred modulation frequencies commonly vary between 1 and 40 Hz with the vast majority of tBMFs below 20 Hz. Across all studies, tBMFs above 50 Hz were encountered in only a very small percentage of neurons but could occasionally be as high as 100 Hz (17, 157, 255, 256). The composite tMTF in cat AI (ketamine anesthesia), constructed as the weighted sum of all tMTFs measured, shows a tBMF of 12.8 Hz and a 50% cut-off frequency of 37.4 Hz (Fig. 8) (176).

FIG. 9. An overview of rMTF (left panel) and tMTF (right panel) properties at different anatomical levels. Each entry shows means or medians (circles) ± SD (lines) and lowest and highest values (bar). Dark bars, thick lines, and solid circles are for rBMFs (left) and tBMFs (right); light bars, lines, and empty circles are for upper tMTF cutoff frequencies (right). For convenient comparison, the left panel is arranged mirror-symmetric with respect to the right. The population measures are taken from published data for one anatomical level, sublevel, or cell class; the numbered reference to the publication is shown next to the data, followed by a letter indicating the species (b, bat; c, cat; g, gerbil; gp, guinea pig; m, marmoset; r, rabbit; s, squirrel monkey), and the letter “U” if unanesthetized. Note that part of the differences between studies reflects differences in the metrics used (in particular upper cutoff, which is often defined as a corner frequency or alternatively as the upper limit of significant phase-locking). Approximate ranges of perceptual and sound classes are indicated below the abscissa.

It is tempting to regard the presence of modulation tuning and the range of BMFs as a physiological implementation of a modulation filter bank (e.g., Ref. 35), but the functional consequences of these cortical (and subcortical) observations are at present unclear and should not be overstated. When “spatial-frequency channels” were first described in visual psychophysics and spatial-frequency tuning was later found physiologically, it was suggested that these channels formed the basis for a visual Fourier analysis of the retinal image, but this notion has been discredited (303). There is currently no unequivocal evidence that modulation tuning underlies an analysis of the modulation spectrum in the sense that the cochlea performs an analysis of stimulus spectrum. For example, will an envelope with a low fundamental (e.g., to speech syllables) but fast components (i.e., broad envelope spectrum) recruit neurons tuned to high modulation frequencies? Is the relative phase of different envelope components somehow reflected in neural synchronization or average rate? Even if modulation-tuned neurons do not perform a full envelope decomposition in the Fourier sense, it is easy to see that such envelope tuning could be useful in other ways. For example, modulation-tuned channels could parse spectral stimulus components according to their dominant modulation frequency so that the spectral components with a common modulation frequency can be grouped in a further step.

Differences in temporal processing between cortical neurons and their thalamic inputs are not only evident from population comparisons but were directly observed in functionally connected thalamocortical neuron pairs (34, 175) and were also evident in current source density analysis of the thalamic input and cortical output layers of AI (274). While these correlation studies reveal a reduction of temporal following capacities from MGBv to AI, the temporal modulation preferences in thalamus and cortex are not correlated by rank (175), i.e., thalamic cells with high (low) BMFs do not preferentially project to cortical cells with high (low) BMFs. These findings strongly suggest that a transformation of temporal response properties takes place at the thalamocortical interface.

The width of the transfer function provides a measure of response selectivity. For individual neurons, the bandwidth of tMTFs, estimated at 50% of the maximum, is in the range of the BMF values but can vary by a factor of ~5 (53, 176, 256) in the anesthetized cat. Bandwidth variations of tMTFs in the awake marmoset monkey (157) are of similar magnitude. This means that AM selectivity varies considerably among cortical neurons but that overall the selectivity is relatively poor.
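One way to read a BMF and a 50% bandwidth off a sampled transfer function is sketched below. The linear interpolation between tested modulation frequencies, the function name, and the example values are assumptions made for illustration; published studies differ in their criteria (e.g., 3-dB or 6-dB points, or significance of synchronization) and often work on a logarithmic frequency axis.

```python
def bmf_and_half_max_edges(freqs_hz, gains):
    """Return (BMF, lower edge, upper edge) at 50% of the peak gain.

    Edges are found by linear interpolation between tested frequencies and
    are None when the function never falls to 50% on that side (low-pass or
    high-pass shapes).
    """
    peak = max(range(len(gains)), key=lambda i: gains[i])
    if gains[peak] <= 0:
        raise ValueError("transfer function has no positive peak")
    half = 0.5 * gains[peak]

    def crossing(indices):
        prev = peak
        for i in indices:
            if gains[i] <= half:
                frac = (gains[prev] - half) / (gains[prev] - gains[i])
                return freqs_hz[prev] + frac * (freqs_hz[i] - freqs_hz[prev])
            prev = i
        return None

    lower = crossing(range(peak - 1, -1, -1))
    upper = crossing(range(peak + 1, len(gains)))
    return freqs_hz[peak], lower, upper


# Hypothetical bandpass tMTF sampled at octave-spaced modulation frequencies.
freqs = [2.0, 4.0, 8.0, 16.0, 32.0]
gains = [0.0, 0.25, 1.0, 0.75, 0.25]
bmf, lower_edge, upper_edge = bmf_and_half_max_edges(freqs, gains)
bandwidth = upper_edge - lower_edge
```

In this toy example the bandwidth (about 19 Hz) is roughly twice the BMF (8 Hz), the kind of broad tuning relative to the BMF that the text describes.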

Variations in species, anesthetic state, and estimation method between the different studies do not permit an easy comparison to sort out these different influences on envelope processing. However, it appears that neither anesthesia nor species-specific effects strongly influence the tBMF distribution of cortical neurons. This is not to say that there are no anesthetic effects; however, given the fairly large range of variability in the conditions of these studies, a simple group evaluation is unlikely to provide such evidence.

The range for time-locked AM coding appears to be limited to the envelope frequencies underlying the perception of rhythm, roughness, and the following rate of syllables in communication sounds. The cortical coding of higher modulation frequencies, important for voicing or periodicity pitch information, does not seem to fully utilize the same temporal code.

D. Responses to AM in Primary Auditory Cortex: Average Rate

In view of the successive reduction in envelope synchronization already discussed for the different synaptic stages leading up to cortex, it is not too surprising to find a reduction in tBMF. Adverse effects on synchronization need not, however, affect rate tuning. For example, exquisite frequency and ITD selectivity in average rate is found at the cortical level and can be sharper than in the brain stem. Therefore, we expect to find envelope tuning in rMTFs, as it is already prominently present in the IC.

Bandpass rMTFs are indeed found but appear less common than bandpass tMTFs. In the rat, ~90% of the tMTFs showed bandpass characteristics while only 30% of the rMTFs were bandpass (75). In AI of the awake squirrel monkey (17), this difference was less pronounced, with bandpass behavior for 49% of the tMTFs compared with 39% of rMTFs. The remaining neurons were either low pass, high pass, all pass, or had complex filter shapes. Similar results were reported for the cat (48). In awake marmosets, 73% of AI units had bandpass rMTFs, and many neurons were only driven when temporal modulations were present (157).

An important difference with tMTF tuning is the consistent observation that the tuning for rMTFs extends to higher modulation frequencies, although it is still quite limited compared with the brain stem. There is also a fairly large variance, possibly related to the use of anesthesia, in the reported range of rBMFs and upper cut-off frequencies (e.g., as defined by a 50% reduction in rate) obtained across the various studies in AI (Fig. 9). The majority of rBMFs in anesthetized studies are below 50 Hz (46, 49, 53, 75, 256). Studies in awake animals (17, 34, 157, 190, 247, 260) yielded rBMFs that were either not substantially different from those in anesthetized animals or differed by less than a factor of two. Anesthesia seems to affect the strength of the response (sustained in unanesthetized animals, onset under anesthesia) more than the range of BMFs. Anesthesia may reduce upper cut-off frequencies more substantially in tMTFs than in rMTFs (52, 83, 163) and may affect the temporal coding capacity for the highest temporally coded AM frequencies, including the range of AM frequencies associated with the perceptual attributes of roughness and periodicity pitch (64).

Physiol Rev • VOL 84 • APRIL 2004 • www.prv.org

The general finding that BMFs and upper cut-off frequencies are higher in the rMTF than in the tMTF led Bieser and Muller-Preuss (17) to suggest that “low modulation rates were mostly encoded by phase-locked neural responses and the higher AM sounds by non-phase-locked spike rate variations.” While the experimental evidence for this claim was suggestive but not conclusive, Lu et al. (161) demonstrated more forcefully that this notion might indeed be true and proposed a two-stage model in which temporal modulations are combined over an integration window of ~30 ms; temporal patterns separated by intervals longer than 30 ms are coded explicitly in temporal form, while more rapid patterns are coded implicitly by average rate.
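The arithmetic behind the integration-window idea is simple enough to sketch. The example below treats the window as a hard threshold, which is a deliberate simplification made here and not part of the proposal by Lu et al. (161). The function names and the default of 30 ms (taken from the approximate value quoted above) are illustrative.

```python
def coding_regime(modulation_hz, window_ms=30.0):
    """Classify a repetition rate relative to a fixed integration window.

    Periods longer than the window are labeled 'temporal' (explicit,
    stimulus-locked code); shorter periods are labeled 'rate'.
    The hard boundary is an assumption of this sketch.
    """
    if modulation_hz <= 0:
        raise ValueError("modulation frequency must be positive")
    period_ms = 1000.0 / modulation_hz
    return "temporal" if period_ms > window_ms else "rate"

def boundary_hz(window_ms=30.0):
    """Modulation frequency whose period equals the integration window."""
    return 1000.0 / window_ms

for fm in (8, 20, 50, 200):
    print(fm, "Hz ->", coding_regime(fm))
print("boundary near", round(boundary_hz(), 1), "Hz")
```

With a 30-ms window the boundary falls near 33 Hz, which is of the same order as the cortical tBMFs and synchronization limits discussed in the preceding section.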

It is not entirely clear whether this scheme can fully account for the coding of modulations since, even in awake animals and for only a small fraction of the cells, rBMFs reach maximal values of only a few hundred Hertz. This is only an octave above the highest tBMFs (even when measured on the same cells, e.g., Ref. 157) and lower than the upper limit for periodicity pitch (~800 Hz) and modulation detection (~2.2 kHz). The markedly reduced cortical upper limit, particularly compared with the brain stem, is in stark contrast to the upper limit for ITD sensitivity to AM signals, which appears not to differ between cortex and brain stem and extends to modulation frequencies up to 1,000 Hz (awake rabbit, Ref. 67). Thus envelope-based ITD tuning created in the brain stem is relayed without degradation or recoding to AI, whereas this does not appear to be the case for AM bandpass tuning.

Schulze and Langner (259, 261) suggested an alternative coding strategy; in AI of the awake as well as the anesthetized gerbil, these investigators showed rate tuning of cortical neurons to AM between 50 and 3,000 Hz, clearly outside the range of cortical phase-locking, but only when the carrier frequency was placed far above the cell’s CF. A preliminary study (171) reported similar sensitivity in the IC but attributed the mechanism to difference tones generated in the cochlea, i.e., interpreted it as a spectral rather than a temporal effect. Since psychophysical studies indicate that the perception of periodicity pitch does not depend on difference tones, it is unclear whether the mechanism proposed by Schulze and Langner provides its neural basis, although the authors raise several indirect counterarguments against the role of difference tones as the explanation for their observations.

Overall, then, the timing of cortical discharge encodes low modulation frequencies corresponding to the perceptual ranges characterized by rhythm and fluctuation strength (48, 53, 60) and, potentially, roughness (64, 255). A code based on the mean firing rate may represent fast AMs such as those associated with periodicity pitch, but it remains unclear whether these two coding strategies adequately explain AM coding over the entire perceptual range.
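The distinction between a timing code and a rate code can be illustrated with the two measures that underlie tMTFs and rMTFs. The sketch below assumes that synchronization is quantified with the vector strength, a common choice, and uses made-up spike trains. The function names and spike times are illustrative, not data from any study.

```python
import numpy as np

def vector_strength(spike_times_s, fm_hz):
    """Synchronization to the modulation cycle (0 = none, 1 = perfect)."""
    t = np.asarray(spike_times_s, dtype=float)
    if t.size == 0:
        return 0.0
    phase = 2.0 * np.pi * fm_hz * t
    return float(np.hypot(np.cos(phase).sum(), np.sin(phase).sum()) / t.size)

def average_rate(spike_times_s, duration_s):
    """Mean firing rate in spikes per second."""
    return len(spike_times_s) / duration_s

fm = 10.0  # modulation frequency in Hz (hypothetical)
# One spike per cycle at a fixed phase: fully phase locked
locked = np.arange(10) / 10.0 + 0.02
# Same spike count, but spread evenly over the modulation cycle
spread = np.arange(10) / 100.0

for name, spikes in (("locked", locked), ("spread", spread)):
    print(name, vector_strength(spikes, fm), average_rate(spikes, 1.0))
```

Both hypothetical trains have the same average rate over the 1-s window, yet only the first is synchronized to the envelope. A rate-based measure therefore cannot distinguish them, whereas a synchronization measure can.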

E. Responses to AM in Primary Auditory Cortex: Influence of Modulation Parameters

The results discussed above were mostly derived with a modulation depth (m) of 100%. Decrease in m results in monotonically reduced synchronization (60), especially for m < 0.5 (49). In the awake squirrel monkey, 86% of the neurons had maximum synchronization for 80–100% modulation and showed a monotonic decrease with reduction of m. Average firing rate was essentially constant as a function of modulation depth (17). Values of rBMF and tBMF were little affected by m in the awake marmoset (157).
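For readers who want to see what the modulation depth m does to the stimulus, the following sketch synthesizes a sinusoidally amplitude-modulated tone using the standard form [1 + m sin(2π fm t)] sin(2π fc t). The carrier and modulation frequencies, the sampling rate, and the function names are arbitrary illustrative choices.

```python
import numpy as np

def sam_tone(fc_hz, fm_hz, m, duration_s=0.1, fs_hz=48000):
    """Sinusoidally amplitude-modulated tone with modulation depth m (0..1).

    Returns the time axis, the envelope 1 + m*sin(2*pi*fm*t), and the signal.
    """
    t = np.arange(int(round(duration_s * fs_hz))) / fs_hz
    envelope = 1.0 + m * np.sin(2.0 * np.pi * fm_hz * t)
    signal = envelope * np.sin(2.0 * np.pi * fc_hz * t)
    return t, envelope, signal

def modulation_depth(envelope):
    """Recover m from an envelope as (max - min) / (max + min)."""
    e_max, e_min = float(np.max(envelope)), float(np.min(envelope))
    return (e_max - e_min) / (e_max + e_min)

for m in (1.0, 0.5, 0.1):
    _, env, _ = sam_tone(fc_hz=1000.0, fm_hz=100.0, m=m)
    print(m, round(modulation_depth(env), 3))
```

At m = 1 (100% modulation) the envelope reaches zero once per cycle. As m decreases the envelope fluctuates less around its mean, which is the stimulus change against which the synchronization results above were measured.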

Changes in the overall intensity resulted in minor influences on BMF, cut-off frequency, and shape of the MTF (46, 157, 255). However, the firing rate depended strongly on intensity, revealing a limited range of best levels (49). Phillips and colleagues (211, 212) noticed intensity-specific differences between the responses to low and high modulation frequencies. Better responses were observed for higher modulation frequencies at low intensities and for low modulation frequencies at higher intensities; that is, the shape of MTFs can be level dependent. The rMTF appears to be more resistant to changes in SPL than the tMTF (157).

In a few studies, the effect of the modulation waveform was investigated. These observations suggest a common temporal window within which afferent signals are integrated. Rectangular AM resulted in stronger response synchrony than sinusoidal AM, but the tBMFs were similar (255, 256). Modulation with an exponential sine-wave envelope increased the sharpness of modulation tuning with decreasing duty cycle but showed no dramatic effects on BMF or cut-off frequency (49). Temporal synchronization to binaural beats (generated by binaural interaction in the brain stem, see sect. VI) also revealed cut-off frequencies of ~40 Hz (219). Moreover, results from the awake primate (157) indicate that BMFs for AM and FM are often closely matched for single neurons.

The use of dynamic ripple spectra, i.e., spectral envelopes that are periodic along the frequency axis, to determine the temporal impulse response properties in AI by reverse correlation in anesthetized ferrets (142) and cats (175) revealed tBMFs that essentially overlapped with the value range seen in several other species estimated with AM tones. Direct comparison between two carrier types showed either no significant difference in the tBMFs for tonal and noise carriers (53, 217) or an average tBMF that is slightly lower for tonal carriers (49). This suggests that the carrier bandwidth may have little influence on temporal coding properties.

F. Differences of Temporal Coding Between Cortical Fields

In view of the differences between thalamic subdivisions in terms of thalamocortical connectivity (see sect. IXA) and temporal responses (see sect. IXB), it is of interest whether neurons in different cortical fields also differ in their ability to code temporal information (Fig. 9). Field AAF in the cat, a component of the core area like AI, shows evidence of higher BMFs and limiting rates than AI (111, 255, 256). There is some evidence of spatial clustering in AAF with faster following neurons more abundant for CFs above 10 kHz (53, 111, 255). Further evidence of faster following rates in AAF over AI has been obtained from STRF measurements in mice (159). The duration of STRFs from AAF was found to be shorter than in AI. Because STRF duration is inversely related to the BMF of tMTFs, it follows that AAF neurons have higher BMFs compared with AI. Another predictor for repetition following capacity is the onset latency of isolated CF tones or clicks. Schreiner and Raggio (253) reported a weak but significant negative correlation in cat AI for click latency and BMF, similar to results in the IC (153) and MGB (240). Onset latencies in AAF of cats (111) and mice (159) are shorter than in AI, further supporting the notion that AAF has a higher following capacity than AI.

Cortical fields outside the core areas seem to perform at even lower temporal fidelity than that found in AI. In the cat, tBMFs and rBMFs of cortical fields AII, PAF, and VPAF were 20–80% of those seen for AI (53, 256). Similar results were found in the awake squirrel monkey (17). In the latter study, three groups of cortical fields could be distinguished based on their temporal properties. A group containing AI had average BMFs of ~8 Hz; a group that included the rostral field and the insula had BMFs of 4 Hz and below, and a group containing the anterior-lateral field had a predominance of BMFs around 2 Hz. Combined, these findings suggest that hierarchically “higher” auditory cortical fields primarily receiving input from thalamic projections other than the ventral nucleus show slightly but consistently slower following capacity when tested with AM stimuli than primary cortical fields.

G. Cortical Mechanisms

The cause for the reduced temporal following capacity of cortical neurons compared with subcortical stations is still not entirely clear. A diversity of cellular and network properties is likely to affect cortical temporal behavior. These include mechanisms of adaptation and postexcitation suppression (19, 25, 116), postsuppression rebound (42, 47, 75), intrinsic oscillation (42, 75, 106, 134, 249), and synaptic depression (1, 169, 170). It has been suggested that tBMFs are largely determined by processes intrinsic to the cortical-thalamic network while cut-off frequency seems to be influenced by intrinsic pyramidal cell mechanisms (51). Models that include dynamic synaptic processes have been proposed that can account for many aspects of cortical responses to various repetitive signal envelopes, including sinusoidal AM stimuli (41, 54, 55). Eggermont (55) demonstrated that the envelope synchronization of cortical activity can be modeled based on two main components: the degree of input or presynaptic synchrony and the shape of a temporal filter that is determined by properties of synaptic dynamics. The input synchrony is highly dependent on the shape of the envelope waveform and reflects peripheral integrative mechanisms that determine response latency and spiking jitter (102). The properties of the synaptic dynamics are less stimulus dependent and reflect cortical synaptic activity changes after repeated stimulation that cause short-term synaptic depression or facilitation (1, 169, 170). The synaptic dynamic acts as a temporal low-pass filter on the synchronized input and is dominated by synaptic depression. This two-stage model of cortical modulation transformation holds great promise in unifying many aspects of temporal envelope processing (55) and other temporal behaviors of cortical neurons (41). It is likely, however, that other, conceivably nonlinear, influences also contribute to the shaping of MTFs. This is indicated by the observed relationships of onset latency and the period of intrinsic oscillations with BMF as well as the effects of spectral and temporal stimulus composition on cortical adaptation behavior (19, 280).
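How synaptic depression can act as a temporal low-pass filter is easy to show with a minimal sketch. It assumes a generic depression rule: each presynaptic event scales the synaptic efficacy by a factor f, and the efficacy recovers exponentially toward 1 with time constant tau between events. The parameter values f = 0.6 and tau = 0.3 s are arbitrary illustrations. This is not the specific model of Eggermont (55), and it omits the input-synchrony stage entirely.

```python
import numpy as np

def steady_state_depression(rate_hz, f=0.6, tau_s=0.3):
    """Steady-state efficacy just before each event of a regular train.

    Solves D = 1 - (1 - f*D) * exp(-1 / (rate * tau)) for D.
    f and tau_s are illustrative values, not fitted parameters.
    """
    rate = np.asarray(rate_hz, dtype=float)
    recovery = np.exp(-1.0 / (rate * tau_s))
    return (1.0 - recovery) / (1.0 - f * recovery)

def iterate_depression(rate_hz, f=0.6, tau_s=0.3, n_events=200):
    """Same quantity obtained by stepping through the train event by event."""
    d = 1.0  # fully recovered synapse before the first event
    recovery = np.exp(-1.0 / (rate_hz * tau_s))
    for _ in range(n_events):
        # depress by f at the event, then recover until the next event
        d = 1.0 - (1.0 - f * d) * recovery
    return d

rates = np.array([1.0, 5.0, 10.0, 20.0, 50.0])
print(np.round(steady_state_depression(rates), 3))
```

The steady-state efficacy falls monotonically as the repetition rate increases, so the response per stimulus cycle is attenuated at high rates. In a two-stage scheme of the kind described above, a curve of this shape would multiply the input synchrony.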

H. Temporal Coding of Complex Sounds

Most studies of complex multisyllable or multi-“phrase” communication sounds in auditory cortex noted that neuronal responses were predominantly located at the beginning of each phrase provided that the phrases did not follow each other at rates of more than 20–30 Hz. This effect was not dependent on the species-specific nature of the calls and was seen for speech sounds as well (50). For example, responses to bird songs in cat auditory cortex (273) showed preferred response intervals corresponding to ~10 Hz. Responses to species-specific calls in awake squirrel monkey (74), anesthetized squirrel monkey (192), and anesthetized marmoset (292) all showed “phrase”-locking in the response to repetitive call phrases around 8–12 Hz. Similar values were obtained in the awake guinea pig to various bird and guinea pig vocalizations (34). Wang et al. (292) tested whether the temporal response to complex sounds was tuned like the response to more elemental sounds by using stretched and compressed natural vocalizations of marmosets, without changes in the spectral content of the calls. The responsiveness to the calls was maximal at the natural repetition rate of the phrases near 8 Hz. In other words, the tMTF of most neurons was tuned to the repetition rate of the natural call. Similarly, Nagarajan et al. (192) reported that the response modulation rates of cortical neurons activated by vocalizations in the marmoset monkey were highly correlated with the BMFs found for AM tones.

The pulse repetitions in echolocation calls of bats are another example of temporal structures that require detailed processing by the auditory system. Phase-locked responses of cortical neurons in the bat occur over similar ranges as found for AM and click trains in other mammalian species. Sixty percent of BMFs in AI of Eptesicus fuscus were at or below 10 Hz, but BMFs could be as high as 83 Hz (116). Pulse repetition coding in the awake FM bat Myotis lucifugus and the mustached bat Pteronotus parnellii had limiting rates of ~100 Hz (308) and up to 300 Hz (276), respectively, commensurate with the behaviorally relevant range of timing used in echolocation.

A likely strategy for encoding of complex sounds in auditory cortex is by the temporal-spatial discharge pattern of distributed neuronal populations across the cortical fields (34, 207; see also Refs. 30, 39). Initial studies of the response of cortical neurons to vocalizations (34, 306, 307) combined with more recent studies of the detailed representation of species-specific vocalizations (192, 292) and speech sounds (309) in the primary auditory cortex of New World monkeys and cats provide evidence that behaviorally relevant vocalizations are well represented by spatially distributed but temporally highly coherent neuronal discharges. At major transitions during the course of the signal, a temporally coherent activation of specific neuronal subpopulations across the cortical fields is created. The synchronous timing of responses across many sites in primary auditory cortex (and in parallel in other cortical fields) may provide the necessary means for appropriate grouping or segregation of sequential elements in ongoing foreground and background sounds. The range of modulation frequencies spanned by cortical tMTFs of generally moderate selectivity may be sufficient to provide representational and, perhaps, perceptual invariances of complex sound sequences despite potentially large variations in phoneme rate or in the sequence rate of musical tones. The distributed representation of temporal envelope information in each carrier frequency band allows a segregated processing of different temporal phenomena within a given frequency “channel” as well as processing of similar temporal aspects across frequency channels (194).

I. Plasticity of Temporal Coding Properties in Auditory Cortex

Studies of representational plasticity in auditory cortex of adult animals have largely focused on spectral properties, but several studies have recently examined temporal properties and reported use-dependent changes in the tMTF. Beitel et al. (15) trained owl monkeys to discriminate between two different, sequentially presented, AM rates and rewarded the animals when they correctly indicated that the second stimulus had a higher AM rate. The modulation frequencies were chosen to be in a range (4–40 Hz) where they could induce phase-locked cortical responses. Over the course of the training, AM discrimination thresholds gradually improved. Analysis of the tMTFs of the trained animals revealed that the shape of the transfer function changed dramatically. As a consequence, average limiting rate more than doubled from 12 Hz to ~30 Hz, and BMF increased from 8 to 15 Hz. This result indicates that temporal coding properties of cortical neurons can be modified by learning.

Studies in rat AI investigated the influence of the statistics of the input signal on the reorganization of auditory cortex (138, 139). Stimulation of the nucleus basalis in the basal forebrain has been shown to increase the potential for cortical plasticity without explicit behavioral training of the animals (45, 92, 98, 174). Pairing of nucleus basalis stimulation with acoustic stimulation (139) caused pronounced changes in the tMTFs which depended on the temporal properties of the stimuli paired with the electrical stimulation (Fig. 10). A 20–40% increase of the BMF and cut-off frequency was observed when the modulation frequency of the acoustic stimulus was slightly higher than the normally observed values of the tMTFs. Pairing of electrical stimuli with modulation frequencies below the normal tBMF values caused a decrease in the neuronal cut-off frequencies.

These results indicate that important aspects of temporal properties of the cortex undergo plastic reorganization, reflect aspects of the temporal statistics in the input stimuli, and can be modified by mechanisms involved in learning to match specific auditory tasks even in fully mature animals.


X. NEUROPHYSIOLOGICAL AND PSYCHOLOGICAL STUDIES IN HUMANS

A number of techniques are beginning to provide information about the analysis of AM in the human brain. MTFs generated from steady-state evoked potentials and magnetic responses to the envelope of modulated sounds (e.g., Refs. 146, 213, 220, 235, 239) are at least qualitatively similar to modulation sensitivity demonstrated psychophysically. It is noteworthy, however, that neither psychophysical nor event-related potential measures show much evidence of the bandpass tuning to modulation that is a feature of many single-unit responses. Estimates of group delay in evoked potentials (119, 146) suggest that responses to low fm are predominantly generated at the cortical level and those to high fm in the brain stem. Magnetic responses in auditory cortex suggest a mapping of modulation frequency that lies orthogonal to the tonotopic axis (151). These magnetic responses lock to the temporal envelope of speech signals, and the degree of locking correlates with speech comprehension (3). Functional magnetic resonance imaging (fMRI) has also been applied to the study of modulation: the repetition rate at which a tone burst best elicits a BOLD (blood oxygen level dependent) response decreases progressively from midbrain to thalamus to cortex, with values not dissimilar to those found in single-unit recordings from these structures in other mammals (97). A progressive shift in favor of low modulation frequencies at more central locations was also reported in an fMRI study using sinusoidally amplitude modulated white noise (77). This study also reported some evidence for restricted cortical regions responding better to low or high modulation frequencies but no systematic topographic representation of modulation frequency. At the cortical level other nonsensory factors are likely to play a role in the processing of modulation. Hall et al. (95) have demonstrated that activation of the planum temporale caudal to primary auditory cortex is influenced by attention to modulation.

Ablations and lesions of auditory cortex have been shown to interfere with the processing of temporal tasks, such as the order of events (193), discrimination between 10- and 300-Hz trains of noise bursts (277), the detection of AM frequencies below but not above ~30 Hz (89), and the perception of periodicity pitch (299), to name a few examples. Studies in patients with primary cortical lesions resulting in “word deafness” also show evidence for deteriorated temporal processing capacities (88). In addition, it has been argued (87) that the pathway up to and including primary auditory cortex is not sufficient for the detection of continuous AM in humans. The range of these perceptual deficits encompasses the cortical range of temporal as well as the rate-encoded AM frequencies, corroborating the importance of the coding of envelope phenomena in auditory cortex and in some of the cortical regions to which it connects.

XI. CONCLUSION

Our examination of modulation processing at different anatomical levels reveals a patchy picture with many unsolved issues. As is generally the case in sensory systems, the representation evolves from isomorphic in the periphery to abstracted at the cortical level. Two general trends are clearly discernible with ascending levels: 1) a recoding of modulation selectivity from temporal form to average rate and 2) a decrease in the highest modulation frequencies encoded (either temporally or in average rate). While the first trend seems sensible, the second trend is puzzling, in particular the low upper frequency limit at the cortical level. This observation, as well as others, may lead to the skeptical view that modulation encoding and selectivity at the different anatomical levels are epiphenomenal, in the sense that they are a necessary outcome of other properties (e.g., frequency tuning, adaptation, sensitivity to rise time, connectivity, membrane properties, synaptic dynamics) and that the gradual changes with anatomical level merely reflect change in these properties but do not indicate processing (e.g., the assembly of higher-order selectivities or the recoding of envelope synchronization into a spatially distributed rate code).

FIG. 10. Temporal plasticity in primary auditory cortex. Prolonged pairing of an AM stimulus (15-Hz trains of tones, random carrier frequencies) with electrical stimulation of the nucleus basalis resulted in a shift of the population tMTF to higher AM frequencies (dashed lines) compared with unstimulated control animals (solid line). [Adapted from Kilgard et al. (139).]

However, if we ignore differences along the auditory neuraxis for a moment and take stock of the variety of responses reviewed, a rather optimistic view emerges of neural mechanisms dedicated to AM processing. Indeed, these responses show some of the key properties that are generally considered indicative of the coding of stimulus parameters. Tuning to modulation frequency is prominently present temporally and in average rate, and the range of optimal modulation frequencies so represented spans perceptually relevant ranges. The tuning can show invariance with SPL, modulation depth, and type of carrier and be predictive of the response to complex modulation waveforms in natural stimuli. There is even suggestive evidence for topographic mapping of modulation frequency. Selectivity to modulation waveforms or modulation paradigms more complex than the basic sinusoidally modulated tone is beginning to be reported.

There are several neurobiological avenues to further explore and strengthen the case for dedicated modulation mechanisms and their link to perception. Review of the available data suggests that the most immediate gain, with existing tools, can be expected from inventive stimulus paradigms. Although sinusoidal AM may be considered a complex stimulus in the frequency domain, it is an elementary, simple stimulus in the modulation domain. The vast majority of studies of modulation processing have used single sinusoidal AM tones and have focused on modulation tuning. This is a necessary starting point, but to make a convincing case for the relevance of the tuning observed, the stimulus arsenal should be expanded. Current technology enables synthesis of more complex stimuli that are amenable to parametric exploration yet a step closer to natural stimuli. There are still basic unanswered questions to be addressed with sinusoidal AM, but it is equally clear that important properties and selectivities are only manifest with the use of nonsinusoidal envelopes or stimulus paradigms that involve modulation in ways that are closer to real-world tasks faced by the auditory system. Clever use of such paradigms is likely to make either the skeptical or optimistic view prevail.

We are very grateful to A. Palmer and R. Batra for critical reading of the manuscript.

During the preparation of this review, P. X. Joris was supported by the Fund for Scientific Research-Flanders Grants G.0297.98 and G.0083.02 and Research Fund K.U. Leuven Grant OT/01/42; C. E. Schreiner was supported by National Institutes of Health Grants DC-02260 and NS-34835; and A. Rees was supported by the Wellcome Trust.

Address for reprint requests and other correspondence: P. X. Joris, Laboratory of Auditory Neurophysiology, K.U. Leuven, Campus Gasthuisberg, B-3000 Leuven, Belgium (E-mail: [email protected]).

REFERENCES

1. Abbott LF, Varela JA, Sen K, and Nelson SB. Synaptic depression and cortical gain control. Science 275: 220–224, 1997.

2. Aertsen AM and Johannesma PIM. The spectro-temporal receptive field. A functional characteristic of auditory neurons. Biol Cybern 42: 133–143, 1981.

3. Ahissar E, Nagarajan S, Ahissar M, Protopapas A, Mahncke H, and Merzenich MM. Speech comprehension is correlated with temporal response patterns recorded from auditory cortex. Proc Natl Acad Sci USA 98: 13367–13372, 2001.

4. Anderson DJ, Rose JE, Hind JE, and Brugge JF. Temporal position of discharges in single auditory nerve fibers within the cycle of a sine-wave stimulus: frequency and intensity effects. J Acoust Soc Am 49: 1131–1139, 1971.

5. Atick J. Could information theory provide an ecological theory of sensory processing? Network 3: 213–251, 1992.

6. Attias H and Schreiner CE. Low-order temporal statistics of natural sounds. In: Advances in Neural Information Processing Systems 9, edited by M. C. Mozer, M. I. Jordan, and T. Petsche. Cambridge, MA: MIT Press, 1997, p. 27–33.

7. Attias H and Schreiner CE. Coding of naturalistic stimuli by auditory midbrain neurons. In: Advances in Neural Information Processing Systems 10, edited by M. I. Jordan, M. Kearns, and S. Solla. Cambridge, MA: MIT Press, 1998, p. 103–109.

8. Bacon SP and Grantham DM. Modulation masking: effects of modulation frequency, depth, and phase. J Acoust Soc Am 85: 2575–2588, 1989.

9. Bacon SP and Viemeister NF. Temporal modulation transfer functions in normal hearing and hearing-impaired listeners. Audiology 24: 117–134, 1985.

10. Batra R, Kuwada S, and Fitzpatrick DC. Sensitivity to interaural temporal disparities of low- and high-frequency neurons in the superior olivary complex. I. Heterogeneity of responses. J Neurophysiol 78: 1222–1236, 1997.

11. Batra R, Kuwada S, and Fitzpatrick DC. Sensitivity to interaural temporal disparities of low- and high-frequency neurons in the superior olivary complex. II. Coincidence detection. J Neurophysiol 78: 1237–1247, 1997.

12. Batra R, Kuwada S, and Stanford TR. Temporal coding of envelopes and their interaural delays in the inferior colliculus of the unanesthetized rabbit. J Neurophysiol 61: 257–268, 1989.

13. Batra R, Kuwada S, and Stanford TR. High-frequency neurons in the inferior colliculus that are sensitive to interaural delays of amplitude-modulated tones: evidence for dual binaural influences. J Neurophysiol 70: 64–80, 1993.

14. Beckius GE, Batra R, and Oliver DL. Axons from anteroventral cochlear nucleus that terminate in medial superior olive of cat: observations related to delay lines. J Neurosci 19: 3146–3161, 2001.

15. Beitel R, Schreiner CE, Wang X, Cheung S, Jenkins W, and Merzenich MM. Effects of psychophysical training on the entrainment of primary auditory cortical neurons to amplitude modulated tones. Soc Neurosci Abstr 21: 1180, 1995.

16. Bernstein LR and Trahiotis C. Detection of interaural delay in high-frequency sinusoidally amplitude-modulated tones, two-tone complexes, and bands of noise. J Acoust Soc Am 95: 3561–3567, 1994.

17. Bieser A and Muller-Preuss P. Auditory responsive cortex in the squirrel monkey: neural responses to amplitude-modulated sounds. Exp Brain Res 108: 273–284, 1996.

18. Brawer JR, Morest DK, and Kane EC. The neuronal architecture of the cochlear nucleus of the cat. J Comp Neurol 155: 251–282, 1974.

19. Brosch M and Schreiner CE. Sequence selectivity of neurons in cat primary auditory cortex. Cereb Cortex 10: 1155–1167, 2000.

20. Burger RM and Pollak GD. Analysis of the role of inhibition in shaping responses to sinusoidally amplitude-modulated signals in the inferior colliculus. J Neurophysiol 80: 1686–1701, 1998.

21. Burns EM and Viemeister NF. Nonspectral pitch. J Acoust Soc Am 60: 863–869, 1976.

22. Burns EM and Viemeister NF. Played-again SAM: further observations on the pitch of amplitude-modulated noise. J Acoust Soc Am 70: 1655–1660, 1981.

23. Buunen TJF and Rhode WS. Responses of fibers in the cat’s auditory nerve to the cubic difference tone. J Acoust Soc Am 64: 772–781, 1978.

24. Caird D. Processing in the colliculus. In: The Neurobiology of Hearing: The Central Auditory System, edited by R. A. Altschuler, R. P. Bobbin, B. M. Clopton, and D. W. Hoffman. New York: Raven, 1991, p. 253–292.

25. Calford MB and Semple MN. Monaural inhibition in cat auditory cortex. J Neurophysiol 73: 1876–1891, 1995.

26. Cant NB and Benson CG. Parallel auditory pathways: projection patterns of the different neuronal populations in the dorsal and ventral cochlear nuclei. Brain Res Bull 60: 457–474, 2003.

27. Cariani PA and Delgutte B. Neural correlates of the pitch of complex tones. II. Pitch shift, pitch ambiguity, phase invariance, pitch circularity, rate pitch, and the dominance region for pitch. J Neurophysiol 76: 1717–1734, 1996.

28. Caspary DM, Palombi PS, and Hughes LF. GABAergic inputs shape responses to amplitude modulated stimuli in the inferior colliculus. Hear Res 168: 163–173, 2002.

29. Caspary DM, Rupert AL, and Moushegian G. Neuronal coding of vowel sounds in the cochlear nuclei. Exp Neurol 54: 414–431, 1977.

30. Chistovich LA, Lublinskaja VV, Malinnikova EA, Ogorodnikova EA, Stoljarova EI, and Zhukov SJ. Temporal processing of peripheral auditory patterns of speech. In: The Representation of Speech in the Peripheral Auditory System, edited by R. Carlson and B. Granstrom. Amsterdam: Elsevier, 1982, p. 165–180.

31. Clarey JC, Barone P, and Imig TJ. Physiology of thalamus and cortex. In: The Mammalian Auditory Pathway: Neurophysiology, edited by A. N. Popper and R. R. Fay. New York: Springer, 1992, p. 232–334.

32. Condon CJ, White KR, and Feng AS. Neurons with different temporal firing patterns in the inferior colliculus of the little brown bat differentially process sinusoidal amplitude-modulated signals. J Comp Physiol A Sens Neural Behav Physiol 178: 147–157, 1996.

33. Cooper NP, Robertson D, and Yates GK. Cochlear nerve fiber responses to amplitude-modulated stimuli: variations with spontaneous rate and other response characteristics. J Neurophysiol 70: 370–386, 1993.

34. Creutzfeldt OD, Hellweg FC, and Schreiner CE. Thalamocortical transformation of responses to complex auditory stimuli. Exp Brain Res 39: 87–104, 1980.

35. Dau T, Kollmeier B, and Kohlrausch A. Modeling auditory processing of amplitude modulation. I. Detection and masking with narrow-band carriers. J Acoust Soc Am 102: 2892–2905, 1997.

36. Dau T, Kollmeier B, and Kohlrausch A. Modeling auditory processing of amplitude modulation. II. Spectral and temporal integration. J Acoust Soc Am 102: 2906–2919, 1997.

37. Dau T, Verhey J, and Kohlrausch A. Intrinsic envelope fluctuations and modulation-detection thresholds for narrow-band noise carriers. J Acoust Soc Am 106: 2752–2760, 1999.

38. Decharms RC, Blake DT, and Merzenich MM. Optimizing sound features for cortical neurons. Science 280: 1439–1443, 1998.

39. Delgutte B. Auditory neural processing of speech. In: The Handbook of Phonetic Sciences, edited by W. J. Hardcastle and J. Laver. Oxford, UK: Blackwell, 1997, p. 507–538.

40. Delgutte B, Hammond BM, and Cariani PA. Neural coding of the temporal envelope of speech: relation to modulation transfer functions. In: Psychophysical and Physiological Advances in Hearing, edited by A. R. Palmer, A. Rees, A. Q. Summerfield, and R. Meddis. London: Whurr, 1997, p. 595–603.

41. Denham SL and Denham MJ. An investigation into the role of cortical synaptic depression in auditory processing. In: Emergent Neural Computational Architectures Based on Neuroscience: Towards Neuroscience-Inspired Computing, edited by S. Wermter, D. J. Willshaw, and J. Austin. Berlin: Springer, 2001, p. 494–506.

42. Dinse HR, Krueger K, Akhavan AC, Spengler F, Schoenor G, and Schreiner CE. Low-frequency oscillations of visual, auditory and somatosensory cortical neurons evoked by sensory stimulation. Int J Psychophysiol 26: 205–227, 1997.

43. Drew T and Doucet S. Application of circular statistics to the study of neuronal discharge during locomotion. J Neurosci Methods 38: 171–181, 1991.

44. Drullman R, Festen JM, and Houtgast T. Effect of temporal modulation reduction on spectral contrasts in speech. J Acoust Soc Am 99: 2358–2364, 1996.

45. Edeline JM, Hars B, Maho C, and Hennevin E. Transient and prolonged facilitation of tone-evoked responses induced by basal forebrain stimulations in the rat auditory cortex. Exp Brain Res 97: 373–386, 1994.

46. Eggermont JJ. Rate and synchronization measures of periodicity coding in cat primary auditory cortex. Hear Res 56: 153–167, 1991.

47. Eggermont JJ. Stimulus induced and spontaneous rhythmic firing of single units in cat primary auditory cortex. Hear Res 61: 1–11, 1992.

48. Eggermont JJ. Differential effects of age on click-rate and amplitude modulation-specific coding in primary auditory cortex of the cat. Hear Res 74: 51–66, 1993.

49. Eggermont JJ. Temporal modulation transfer function for AM and FM stimuli in cat auditory cortex. Effects of carrier type, modulating waveform and intensity. Hear Res 74: 51–66, 1994.

50. Eggermont JJ. Representation of a voice onset time continuum in primary auditory cortex of the cat. J Acoust Soc Am 98: 911–920, 1995.

51. Eggermont JJ. How homogeneous is cat primary auditory cortex? Evidence from simultaneous single-unit recordings. Audit Neurosci 2: 79–96, 1996.

52. Eggermont JJ. Firing rate and firing synchrony distinguish dynamic from steady state sound. Neuroreport 8: 2709–2713, 1997.

53. Eggermont JJ. Representation of spectral and temporal sound features in three cortical fields of the cat. Similarities outweigh differences. J Neurophysiol 80: 2743–2764, 1998.

54. Eggermont JJ. The magnitude and phase of temporal modulation transfer functions in cat auditory cortex. J Neurosci 19: 2780–2788, 1999.

55. Eggermont JJ. Temporal modulation transfer functions in cat primary auditory cortex: separating stimulus effects from neural mechanisms. J Neurophysiol 87: 305–321, 2002.

56. Eggermont JJ, Johannesma PIM, and Aertsen AMHJ. Reverse-correlation methods in auditory research. Q Rev Biophys 16: 341–414, 1983.

57. Erulkar SD, Butler RA, and Gerstein GL. Excitation and inhibition in cochlear nucleus. II. Frequency-modulated tones. J Neurophysiol 31: 537–548, 1968.

58. Escabi MA, Schreiner CE, and Miller LM. Dynamic time-frequency processing in the cat midbrain, thalamus, and auditory cortex: spectrotemporal receptive fields obtained using dynamic ripple spectra. Soc Neurosci Abstr 24: 1879, 1998.

59. Evans EF. Cochlear nerve and cochlear nucleus. In: Handbook of Sensory Physiology, edited by W. D. Keidel and W. D. Neff. Berlin: Springer, 1975, p. 1–108.

60. Fastl H, Hesse A, Schorer E, Urbas J, and Muller-Preuss P. Searching for neural correlates of the hearing sensation fluctuation strength in the auditory cortex of squirrel monkeys. Hear Res 23: 199–203, 1986.

61. Fenton MB. Natural history and biosonar signals. In: Hearing by Bats, edited by R. R. Fay and A. N. Popper. New York: Springer, 1995, p. 37–86.

62. Fernald RD and Gerstein GL. Response of cat cochlear nucleus neurons to frequency and amplitude modulated tones. Brain Res 45: 417–435, 1972.

63. Fettiplace R and Fuchs PA. Mechanisms of hair cell tuning. Annu Rev Physiol 61: 809–834, 1999.

64. Fishman YI, Reser DH, Arezzo JC, and Steinschneider M. Complex tone processing in primary auditory cortex of the awake monkey. I. Neural ensemble correlates of roughness. J Acoust Soc Am 108: 235–246, 2000.

65. Fitzgerald JV, Burkitt AN, Clark GM, and Paolini AG. Delay analysis in the auditory brainstem of the rat: comparison with click latency. Hear Res 159: 85–100, 2001.

66. Fitzpatrick DC, Batra R, Stanford TR, and Kuwada S. A neuronal population code for sound localization. Nature 388: 871–874, 1997.

67. Fitzpatrick DC, Kuwada S, and Batra R. Neural sensitivity to interaural time differences: beyond the Jeffress model. J Neurosci 20: 1605–1615, 2000.

68. Forest TG and Green DM. Detection of partially filled gaps in noise and the temporal modulation transfer function. J Acoust Soc Am 82: 1933–1943, 1987.

69. Frisina RD. Subcortical neural coding mechanisms for auditory temporal processing. Hear Res 158: 1–27, 2001.

70. Frisina RD, Smith RL, and Chamberlain SC. Differential encoding of rapid changes in sound amplitude by second-order auditory neurons. Exp Brain Res 60: 417–422, 1985.

71. Frisina RD, Smith RL, and Chamberlain SC. Encoding of amplitude modulation in the gerbil cochlear nucleus. I. A hierarchy of enhancement. Hear Res 44: 99–122, 1990.

72. Frisina RD, Smith RL, and Chamberlain SC. Encoding of amplitude modulation in the gerbil cochlear nucleus. II. Possible neural mechanisms. Hear Res 44: 123–142, 1990.

73. Frisina RD, Walton JP, and Karcich KJ. Dorsal cochlear nucleus single neurons can enhance temporal processing capabilities in background noise. Exp Brain Res 102: 160–164, 1994.

74. Funkenstein HH and Winter P. Responses to acoustic stimuli of units in the auditory cortex of awake squirrel monkeys. Exp Brain Res 18: 464–488, 1973.

75. Gaese BH and Ostwald J. Temporal coding of amplitude and frequency modulation in the rat auditory cortex. Eur J Neurosci 7: 438–450, 1995.

76. Geisler CD. From Sound to Synapse. Oxford, UK: Oxford Univ. Press, 1998.

77. Giraud A, Lorenzi C, Ashburner J, Wable J, Johnsrude I, Frackowiak RSJ, and Kleinschmidt A. Representation of the temporal envelope of sounds in the human brain. J Neurophysiol 84: 1588–1598, 2000.

78. Glattke TJ. Unit responses of the cat cochlear nucleus to amplitude-modulated stimuli. J Acoust Soc Am 45: 419–425, 1968.

79. Godfrey DA, Kiang NYS, and Norris BE. Single unit activity in the dorsal cochlear nucleus of the cat. J Comp Neurol 162: 269–284, 1975.

80. Godfrey DA, Kiang NYS, and Norris BE. Single unit activity in the posteroventral cochlear nucleus of the cat. J Comp Neurol 162: 247–268, 1975.

81. Goldberg JM and Brown PB. Response of binaural neurons of dog superior olivary complex to dichotic tonal stimuli: some physiological mechanisms of sound localization. J Neurophysiol 32: 613–636, 1969.

82. Goldberg JM and Brownell WE. Discharge characteristics of neurons in anteroventral and dorsal cochlear nuclei of cat. Brain Res 64: 35–54, 1973.

83. Goldstein MH Jr, De Ribaupierre F, and Brown RM. Responses of the auditory cortex to repetitive acoustic stimuli. J Acoust Soc Am 31: 356–364, 1959.

84. Green GG and Kay RH. Channels in the human auditory system concerned with the waveform of modulation present in amplitude and frequency-modulated tones. J Physiol 241: 50–52, 1974.

85. Greenberg S. Possible role of low and medium spontaneous rate cochlear nerve fibers in the encoding of waveform periodicity. In: Auditory Frequency Selectivity, edited by B. C. J. Moore and R. D. Patterson. New York: Plenum, 1986, p. 241–248.

86. Greenwood DD and Joris PX. Mechanical and “temporal” filtering as codeterminants of the response by cat primary fibers to amplitude-modulated signals. J Acoust Soc Am 99: 1029–1039, 1996.

87. Griffiths TD, Penhune V, Peretz I, Dean JL, Patterson RD, and Green GG. Frontal processing and auditory perception. Neuroreport 11: 919–922, 2000.

88. Griffiths TD, Rees A, and Green GG. Disorders of human complex sound processing. Neurocase 5: 365–378, 1999.

89. Grigoreva TI, Figurina II, and Vasilev AG. Role of the medial geniculate body in the production of conditioned reflexes to amplitude-modulated stimuli in rats. Zh Vyssh Nervn Deyat 37: 265–271, 1988.

90. Grimault N, Bacon SP, and Micheyl C. Auditory stream segregation on the basis of amplitude-modulation rate. J Acoust Soc Am 111: 1340–1348, 2002.

91. Grothe B. Interaction of excitation and inhibition in processing of pure tone and amplitude-modulated stimuli in the medial superior olive of the mustached bat. J Neurophysiol 71: 706–721, 1994.

92. Gu Q and Singer W. Effects of intracortical infusion of anticholinergic drugs on neuronal plasticity in kitten striate cortex. Eur J Neurosci 5: 475–485, 1993.

93. Gummer AW and Johnstone BM. Group delay measurement from spiral ganglion cells in the basal turn of the guinea pig cochlea. J Acoust Soc Am 76: 1388–1400, 1984.

94. Gummer M, Yates GK, and Johnstone BM. Modulation transfer function of efferent neurones in the guinea pig cochlea. Hear Res 36: 41–52, 1988.

95. Hall DA, Haggard MP, Akeroyd MA, Summerfield AQ, Palmer AR, Elliott MR, and Bowtell RW. Modulation and task effects in auditory processing measured using fMRI. Hum Brain Mapp 10: 107–119, 2000.

96. Hall JW, Haggard MP, and Fernandes MA. Detection in noise by spectro-temporal pattern analysis. J Acoust Soc Am 76: 50–56, 1984.

97. Harms MP and Melcher JR. Sound repetition rate in the human auditory pathway: representations in the waveshape and amplitude of FMRI activation. J Neurophysiol 88: 1433–1450, 2002.

98. Hars B, Maho C, Edeline JM, and Hennevin E. Basal forebrain stimulation facilitates tone-evoked responses in the auditory cortex of awake rat. Neuroscience 56: 61–74, 1993.

99. Hartmann WM. The physical description of signals. In: Hearing, edited by B. C. J. Moore. San Diego, CA: Academic, 1995, p. 1–40.

100. Hartmann WM. Signals, Sound, and Sensation. New York: Springer, 1997.

101. Heil P. Representation of sound onsets in the auditory system. Audiol Neuro-otolaryngol 6: 167–172, 2001.

102. Heil P and Neubauer H. Temporal integration of sound pressure determines thresholds of auditory-nerve fibers. J Neurosci 21: 7404–7415, 2001.

103. Heil P, Schulze H, and Langner G. Ontogenetic development of periodicity in the inferior colliculus of the Mongolian gerbil. Audit Neurosci 1: 363–383, 1995.

104. Henning GB and Ashton J. The effect of carrier and modulation frequency on lateralization based on interaural phase and interaural group delay. Hear Res 4: 185–194, 1981.

105. Hewitt MJ and Meddis R. A computer model of amplitude-modulation sensitivity of single units in the inferior colliculus. J Acoust Soc Am 95: 2145–2159, 1994.

106. Horikawa J, Tanahashi A, and Suga N. After-discharges in the auditory cortex of the mustached bat: no oscillatory discharges for binding auditory information. Hear Res 76: 45–52, 1994.

107. Houtgast T. Frequency selectivity in amplitude-modulation detection. J Acoust Soc Am 85: 1676–1680, 1989.

108. Houtgast T and Steeneken HJM. The modulation transfer function in room acoustics as a predictor of speech intelligibility. Acustica 28: 66–73, 1973.

109. Huffman RF, Argeles PC, and Covey E. Processing of sinusoidally amplitude modulated signals in the nuclei of the lateral lemniscus of the big brown bat, Eptesicus fuscus. Hear Res 126: 181–200, 1998.

110. Huffman RF and Henson OW Jr. The descending auditory pathway and acousticomotor systems: connections with the inferior colliculus. Brain Res Rev 15: 295–323, 1990.

111. Imaizumi K, Priebe NJ, Crum PAC, Bedenbaugh PH, Cheung SW, and Schreiner CE. Modular functional organization in cat anterior auditory field (Abstract). Program No. 488.6. 2003 Abstract Viewer/Itinerary Planner. Washington, DC: Soc. Neurosci, 2003, Online.

112. Irvine DRF. The Auditory Brainstem: A Review of the Structure and Function of Auditory Brainstem Processing Mechanisms. Berlin: Springer-Verlag, 1986.

113. Irvine DRF. Physiology of the auditory brainstem. In: The Mammalian Auditory Pathway: Neurophysiology, edited by A. N. Popper and R. R. Fay. New York: Springer-Verlag, 1992, p. 153–231.

114. Javel E. Coding of AM tones in the chinchilla auditory nerve: implications for the pitch of complex tones. J Acoust Soc Am 68: 133–146, 1980.

115. Javel E and Mott JB. Physiological and psychophysical correlates of temporal processes in hearing. Hear Res 34: 275–294, 1988.

116. Jen PHS, Hou T, and Wu M. Neurons in the inferior colliculus, auditory cortex and pontine nuclei of the FM bat, Eptesicus fuscus, respond to pulse repetition rates differently. Brain Res 613: 152–155, 1993.

117. Jenison RL. A Dynamic Model of the Auditory Periphery Based on the Responses of Single Auditory-Nerve Fibers (PhD thesis). Madison: Univ. of Wisconsin, 1991.

118. Jiang D, Palmer AR, and Winter IM. Frequency extent of two-tone facilitation in onset units in the ventral cochlear nucleus. J Neurophysiol 75: 380–395, 1996.

119. John MS and Picton TW. Human auditory steady-state responses to amplitude-modulated tones: phase and latency measurements. Hear Res 141: 57–79, 2000.

120. Johnson DH. The Response of Single Auditory-Nerve Fibers in the Cat to Single Tones: Synchrony and Average Discharge Rate (PhD thesis). Cambridge, MA: MIT, 1974.

121. Johnson DH. The relationship between spike rate and synchrony in responses of auditory-nerve fibers to single tones. J Acoust Soc Am 68: 1115–1122, 1980.

122. Joris PX. Envelope coding in the lateral superior olive. II. Characteristic delays and comparison with responses in the medial superior olive. J Neurophysiol 76: 2137–2156, 1996.

123. Joris PX. Interaural time sensitivity dominated by cochlea-induced envelope patterns. J Neurosci 23: 6345–6350, 2003.

124. Joris PX, Carney LHC, Smith PH, and Yin TCT. Enhancement of synchronization in the anteroventral cochlear nucleus. I. Responses to tonebursts at the characteristic frequency. J Neurophysiol 71: 1022–1036, 1994.

125. Joris PX and Smith PH. Temporal and binaural properties in dorsal cochlear nucleus and its output tract. J Neurosci 18: 10157–10170, 1998.

126. Joris PX, Smith PH, and Yin TCT. Coincidence detection in the auditory system: 50 years after Jeffress. Neuron 21: 1235–1238, 1998.

127. Joris PX and Yin TCT. Responses to amplitude-modulated tones in the auditory nerve of the cat. J Acoust Soc Am 91: 215–232, 1992.

128. Joris PX and Yin TCT. Envelope coding in the lateral superior olive. I. Sensitivity to interaural time differences. J Neurophysiol 73: 1043–1062, 1995.

129. Joris PX and Yin TCT. Envelope coding in the lateral superior olive. III. Comparison with afferent pathways. J Neurophysiol 79: 253–269, 1998.

130. Kaas JH and Hackett TA. Subdivisions of auditory cortex and processing streams in primates. Proc Natl Acad Sci USA 97: 11793–11799, 2000.

131. Kay RH. Hearing of modulation in sounds. Physiol Rev 62: 894–975, 1982.

132. Kay RH and Matthews DR. On the existence in human auditory pathways of channels selectively tuned to the modulation present in frequency-modulated tones. J Physiol 225: 657–677, 1972.

133. Keller CH and Takahashi TT. Representation of temporal features of complex sounds by the discharge patterns of neurons in the owl’s inferior colliculus. J Neurophysiol 84: 2638–2650, 2000.

134. Kenmochi M and Eggermont JJ. Autonomous cortical rhythms affect temporal modulation transfer functions. Neuroreport 8: 1589–1593, 1997.

135. Khanna SM and Teich MC. Spectral characteristics of the responses of primary auditory-nerve fibers to amplitude-modulated signals. Hear Res 39: 143–158, 1989.

136. Kiang NYS. Peripheral neural processing of auditory information. In: Handbook of Physiology. The Nervous System. Sensory Processes. Bethesda, MD: Am Physiol Soc, 1984, sect. 1, vol. III, pt. 2, chapt. 15, p. 639–674.

137. Kiang NYS. Curious oddments of auditory-nerve studies. Hear Res 49: 1–16, 1990.

138. Kilgard MP and Merzenich MM. Plasticity of temporal information processing in the primary auditory cortex. Nature Neurosci 1: 727–731, 1998.

139. Kilgard MP, Pandya PK, Vazquez J, Gehi A, Schreiner CE, and Merzenich MM. Sensory input directs spatial and temporal plasticity in primary auditory cortex. J Neurophysiol 86: 326–338, 2001.

140. Kim DO, Rhode WS, and Greenberg S. Responses of cochlear nucleus neurons to speech signals: neural encoding of pitch, intensity and other parameters. In: Auditory Frequency Selectivity, edited by B. C. J. Moore and R. D. Patterson. New York: Plenum, 1986, p. 281–288.

141. Kim DO, Sirianni JG, and Chang SO. Responses of DCN-PVCN neurons and auditory nerve fibers in unanesthetized cats to AM and pure tones: analysis with autocorrelation/power-spectrum. Hear Res 45: 95–113, 1990.

142. Kowalski N, Depireux DA, and Shamma SA. Analysis of dynamic spectra in ferret primary auditory cortex. I. Characteristics of single-unit responses to moving ripple spectra. J Neurophysiol 76: 3505–3523, 1996.

143. Kowalski N, Depireux DA, and Shamma SA. Analysis of dynamic spectra in ferret primary auditory cortex. II. Prediction of unit responses to arbitrary dynamic spectra. J Neurophysiol 76: 3524–3534, 1996.

144. Krishna SB and Semple MN. Auditory temporal processing: responses to sinusoidally amplitude-modulated tones in the inferior colliculus. J Neurophysiol 84: 255–273, 2000.

145. Kuwada S and Batra R. Coding of sound envelopes by inhibitory rebound in neurons of the superior olivary complex in the unanesthetized rabbit. J Neurosci 19: 2273–2287, 1999.

146. Kuwada S, Batra R, and Maher VL. Scalp potentials of normal and hearing-impaired subjects in response to sinusoidally amplitude-modulated tones. Hear Res 21: 179–192, 1986.

147. Kuwada S, Yin TCT, Syka J, Buunen TJF, and Wickesberg RE. Binaural interaction in low-frequency neurons in inferior colliculus of the cat. IV. Comparison of monaural and binaural response properties. J Neurophysiol 51: 1306–1325, 1984.

148. Kvale M and Schreiner CE. Perturbative M-sequences for auditory systems identification. Acustica 83: 653–658, 1997.

149. Langner G. Periodicity coding in the auditory system. Hear Res 60: 115–142, 1992.

150. Langner G. Neural processing and representation of periodicity pitch. Acta Otolaryngol Suppl 532: 68–76, 1997.

151. Langner G, Sams M, Heil P, and Schulze H. Frequency and periodicity are represented in orthogonal maps in the human auditory cortex: evidence from magnetencephalography. J Comp Physiol A Sens Neural Behav Physiol 181: 665–676, 1997.

152. Langner G and Schreiner CE. Periodicity coding in the inferior colliculus of the cat. I. Neuronal mechanisms. J Neurophysiol 60: 1799–1822, 1988.

153. Langner G, Schreiner CE, and Merzenich MM. Co-variation of latency and temporal resolution in the inferior colliculus of the cat. Hear Res 31: 197–202, 1987.

154. Lavine RA. Phase-locking in response of single neurons in cochlear nuclear complex of the cat to low-frequency tonal stimuli. J Neurophysiol 34: 467–483, 1971.

155. Le Beau FEN, Rees A, and Malmierca MS. Contribution of GABA- and glycine-mediated inhibition to the monaural temporal response properties of neurons in the inferior colliculus. J Neurophysiol 75: 902–919, 1996.

156. Lesser HD, Oneill WE, Frisina RD, and Emerson RC. On-off units in the moustached bat inferior colliculus are selective for transients resembling “acoustic glint” from fluttering insect targets. Exp Brain Res 82: 137–148, 1990.

157. Liang L, Lu T, and Wang X. Neural representations of sinusoidal amplitude and frequency modulations in the primary auditory cortex of awake primates. J Neurophysiol 87: 2237–2261, 2002.

158. Liberman MC. Auditory-nerve response from cats raised in a low-noise chamber. J Acoust Soc Am 63: 442–455, 1978.

159. Linden JF, Liu RC, Sahani M, Schreiner CE, and Merzenich MM. Spectrotemporal structure of receptive fields in areas AI and AAF of mouse auditory cortex. J Neurophysiol 90: 2660–2675, 2003.

160. Lorenzi C, Micheyl C, and Berthommier F. Neuronal correlates of perceptual amplitude-modulation detection. Hear Res 90: 219–227, 1995.

161. Lu T, Liang L, and Wang X. Neural representations of temporally asymmetric stimuli in the auditory cortex of awake primates. J Neurophysiol 85: 2364–2380, 2001.

162. Maison S, Micheyl C, and Collet L. Medial olivocochlear efferent system in humans studied with amplitude-modulated tones. J Neurophysiol 77: 1759–1768, 1997.

163. Makela JP, Karmos G, Molnar M, Csepe V, and Winkler I. Steady-state responses from the cat auditory cortex. Hear Res 45: 41–50, 1990.

164. Malmierca MS, Blackstad TW, Osen KK, and Molowny RL. The central nucleus of the inferior colliculus in rat: a Golgi and computer reconstruction study of the neuronal and laminar structure. J Comp Neurol 333: 1–27, 1993.

165. Malmierca MS, Leergaard TB, Bajo VM, Bjaalie JG, and Merchan MA. Anatomic evidence of a three-dimensional mosaic pattern of tonotopic organization in the ventral complex of the lateral lemniscus in cat. J Neurosci 18: 10603–10618, 1998.

166. Malmierca MS and Merchan MA. The auditory system. In: The Rat Nervous System, edited by G. Paxinos. San Diego, CA: Academic, 2004, p. 995–1080.

167. Malmierca MS, Rees A, Le Beau FEN, and Bjaalie JG. Laminar organization of frequency-defined axons within and between the inferior colliculi of the guinea pig. J Comp Neurol 357: 124–144, 1995.

168. Mardia KV and Jupp PE. Directional Statistics. New York: Wiley, 1999.

169. Markram H, Lubke J, Frotscher M, and Sakmann B. Regulation of synaptic efficacy by coincidence of postsynaptic APs and EPSPs. Science 275: 213–215, 1997.

170. Markram H and Tsodyks M. Redistribution of synaptic efficacy between neocortical pyramidal neurons. Nature 382: 807–810, 1996.

171. McAlpine D. Are pitch neurones the result of difference tones on the basilar membrane? Ass Res Otolaryngol Abstr 25: 40, 2002.

172. McAlpine D, Jiang D, Shackleton TM, and Palmer AR. Convergent input from brainstem coincidence detectors onto delay-sensitive neurons in the inferior colliculus. J Neurosci 18: 6026–6039, 1998.

173. Merzenich MM and Reid MD. Representation of the cochlea within the inferior colliculus of the cat. Brain Res 77: 397–415, 1974.

174. Metherate R and Weinberger NM. Cholinergic modulation of responses to single tones produces tone-specific receptive field alterations in cat auditory cortex. Synapse 6: 133–145, 1990.

175. Miller LM, Escabi MA, Read HL, and Schreiner CE. Functional convergence of response properties in the auditory thalamocortical system. Neuron 32: 151–160, 2001.

176. Miller LM, Escabi MA, Read HL, and Schreiner CE. Spectrotemporal receptive fields in the lemniscal auditory thalamus and cortex. J Neurophysiol 87: 516–527, 2002.

177. Miller MI and Sachs MB. Representation of voice pitch in discharge patterns of auditory-nerve fibers. Hear Res 14: 257–279, 1984.

178. Moody DB, Cole D, Davidson LM, and Stebbins WC. Evidence for a reappraisal of the psychophysical selective adaption paradigm. J Acoust Soc Am 76: 1076–1079, 1984.

179. Moore BCJ. Effects of relative phase of the components on the pitch of three component complex tones. In: Psychophysics and Physiology of Hearing, edited by E. F. Evans and J. P. Wilson. London: Academic, 1977, p. 349–358.

180. Moore BCJ. An Introduction to the Psychology of Hearing. San Diego, CA: Academic, 2003.

181. Moore BCJ and Sek A. Effects of relative phase and frequency spacing on the detection of three-component amplitude modulation. J Acoust Soc Am 108: 2337–2344, 2000.

182. Morest DK and Oliver DL. The neuronal architecture of the inferior colliculus in the cat: defining the functional anatomy of the auditory midbrain. J Comp Neurol 222: 209–236, 1984.

183. Møller AR. Coding of amplitude and frequency modulated sounds in the cochlear nucleus of the rat. Acta Physiol Scand 86: 223–238, 1972.

184. Møller AR. Responses of units in the cochlear nucleus to sinusoidally amplitude-modulated tones. Exp Neurol 45: 104–117, 1974.

185. Møller AR. Latency of unit responses in cochlear nucleus determined in two different ways. J Neurophysiol 38: 812–821, 1975.

186. Møller AR. Dynamic properties of primary auditory fibers compared with cells in the cochlear nucleus. Acta Physiol Scand 98: 157–167, 1976.

187. Møller AR. Dynamic properties of the responses of single neurones in the cochlear nucleus of the rat. J Physiol 259: 63–82, 1976.

188. Møller AR. Coding of increments and decrements in stimulus intensity in single units in the cochlear nucleus of the rat. J Neurosci Res 4: 1–8, 1979.

189. Møller AR and Rees A. Dynamic properties of the responses of single neurons in the inferior colliculus of the rat. Hear Res 24: 203–215, 1986.

190. Muller-Preuss P. On the mechanisms of call coding through auditory neurons in the squirrel monkey. Eur Arch Psychiatry Neurol Sci 236: 50–55, 1986.

191. Muller-Preuss P, Flachskamm C, and Bieser A. Neural encoding of amplitude modulation within the auditory midbrain of squirrel monkeys. Hear Res 80: 197–208, 1994.

192. Nagarajan S, Cheung S, Bedenbaugh P, Beitel R, Schreiner CE, and Merzenich MM. Representation of spectral and temporal envelope of twitter vocalizations in common marmoset primary auditory cortex. J Neurophysiol 87: 1723–1737, 2002.

193. Neff WD, Diamond DM, and Casseday JH. Behavioral studies of auditory discrimination. In: Handbook of Sensory Physiology, edited by W. D. Keidel and W. D. Neff. New York: Springer, 1975, p. 307–400.

194. Nelken I, Rotman Y, and Bar Yosef O. Responses of auditory-cortex neurons to structural features of natural sounds. Nature 397: 154–157, 1999.

195. Nelken I and Young ED. Two separate inhibitory mechanisms shape the responses of dorsal cochlear nucleus type IV units to narrowband and wideband stimuli. J Neurophysiol 71: 2446–2462, 1994.

196. Nelson PG, Erulkar SD, and Bryan JS. Responses of units of the inferior colliculus to time-varying acoustic stimuli. J Neurophysiol 29: 834–860, 1966.

197. Neuert V, Pressnitzer D, Patterson RD, and Winter IM. The responses of single units in the inferior colliculus of the guinea pig to damped and ramped sinusoids. Hear Res 159: 36–52, 2001.

198. Neuweiler G. Auditory adaptations for prey capture in echolocating bats. Physiol Rev 70: 615–641, 1990.

199. Oertel D, Bal R, Gardner SM, Smith PH, and Joris PX. Detec-tion of synchrony in the activity of auditory nerve fibers by octopuscells of the mammalian cochlear nucleus. Proc Natl Acad Sci USA

97: 11773–11779, 2000.200. Oliver DL and Huerta MF. Inferior and superior colliculi. In: The

Mammalian Auditory Pathway: Neuroanatomy, edited by D. B.Webster, A. N. Popper, and R. R. Fay. New York: Springer-Verlag,1992, p. 168–221.

201. Oliver DL and Shneiderman A. The anatomy of the inferiorcolliculus: a cellular basis for integration of monaural and binauralinformation. In: The Neurobiology of Hearing: The Central Audi-

tory System, edited by R. A. Altschuler, R. P. Bobbin, B. M. Clopton,and D. W. Hoffman. New York: Raven, 1991, p. 195–222.

202. Osen KK. Cytoarchitecture of the cochlear nuclei in the cat.J Comp Neurol 136: 453–483, 1969.

203. Palmer AR. Encoding of rapid amplitude fluctuations by cochlear-nerve fibres in the guinea-pig. Arch Oto-Rhino-Laryngol 236: 197–202, 1982.

204. Palombi PS, Backoff PM, and Caspary DM. Responses of youngand aged rat inferior colliculus neurons to sinusoidally amplitudemodulated stimuli. Hear Res 153: 174–180, 2001.

205. Patterson RD. The sound of a sinusoid: spectral models. J Acoust Soc Am 96: 1409–1418, 1994.

206. Patuzzi RB and Robertson D. Tuning in the mammalian cochlea. Physiol Rev 68: 1009–1082, 1988.

207. Pelleg-Toiba R and Wollberg Z. Discrimination of communication calls in the squirrel monkey: “call detectors” or “cell assemblies.” J Basic Clin Physiol Pharmacol 2: 257–271, 1991.

208. Perkel DH and Bullock TH. Neural coding. Neurosci Res Program 6: 221–348, 1968.

209. Peruzzi D, Sivaramakrishnan S, and Oliver DL. Identification of cell types in brain slices of the inferior colliculus. Neuroscience 101: 403–416, 2000.

210. Phillips DP. Neural representation of stimulus time in the primary auditory cortex. Ann NY Acad Sci 682: 104–118, 1993.

211. Phillips DP and Hall SE. Responses of single neurons in cat auditory cortex to time-varying stimuli: linear amplitude modulations. Exp Brain Res 67: 479–492, 1987.

212. Phillips DP, Hall SE, and Hollett JL. Repetition rate and signal level effects on neuronal responses to brief tone pulses in cat auditory cortex. J Acoust Soc Am 85: 2537–2549, 1989.

213. Picton TW, Dauman R, and Aran JM. Steady-state responses produced in humans using sinusoidal frequency-modulation. J Otolaryngol 16: 140–145, 1987.

214. Plomp R. The role of modulation in hearing. In: Hearing: Physiological Bases and Psychophysics, edited by R. Klinke and R. Hartmann. Berlin: Springer-Verlag, 1983, p. 270–276.

215. Poon PW and Chiu TW. Single cell responses to AM tones of different envelopes at the auditory midbrain. In: Acoustic Signal Processing in the Central Auditory System, edited by J. Syka. New York: Plenum, 1997, p. 253–261.

216. Pressnitzer D, Winter IM, and Patterson RD. The responses of single units in the ventral cochlear nucleus of the guinea pig to damped and ramped sinusoids. Hear Res 149: 155–166, 2000.

217. Preuss A and Muller-Preuss P. Processing of amplitude modulated sounds in the medial geniculate body of the squirrel monkey. Exp Brain Res 79: 201–211, 1990.

218. Read HL, Winer JA, and Schreiner CE. Functional architecture of auditory cortex. Curr Opin Neurobiol 12: 433–440, 2002.

219. Reale RA and Brugge JF. Auditory cortical neurons are sensitive to static and continuously changing interaural phase cues. J Neurophysiol 64: 1247–1260, 1990.

220. Rees A, Green GGR, and Kay RH. Steady-state evoked responses to sinusoidally amplitude-modulated sounds recorded in man. Hear Res 23: 123–133, 1986.

221. Rees A, Malmierca MS, and Le Beau EN. Regularity of firing of neurons in the inferior colliculus. J Neurophysiol 77: 2945–2965, 1997.

222. Rees A and Møller AR. Responses of neurons in the inferior colliculus of the rat to AM and FM tones. Hear Res 10: 301–330, 1983.

223. Rees A and Møller AR. Stimulus properties influencing the responses of inferior colliculus neurons to amplitude-modulated sounds. Hear Res 27: 129–143, 1987.

224. Rees A and Palmer AR. Neuronal responses to amplitude-modulated and pure-tone stimuli in the guinea pig inferior colliculus, and their modification by broadband noise. J Acoust Soc Am 85: 1978–1994, 1989.

225. Rees A and Sarbaz A. The influence of intrinsic oscillations on the encoding of amplitude modulation by neurons in the inferior colliculus. In: Acoustical Signal Processing in the Central Auditory System, edited by J. Syka. New York: Plenum, 1997, p. 239–252.

226. Rhode WS. Interspike intervals as a correlate of periodicity pitch in cat cochlear nucleus. J Acoust Soc Am 97: 2414–2429, 1995.

227. Rhode WS. Physiological-morphological properties of the cochlear nucleus. In: Neurobiology of Hearing: the Central Auditory System, edited by R. A. Altschuler, R. P. Bobbin, B. M. Clopton, and D. W. Hoffman. New York: Raven, 1991, p. 47–77.

228. Rhode WS. Temporal coding of 200% amplitude modulated signals in the ventral cochlear nucleus of the cat. Hear Res 77: 43–68, 1994.

229. Rhode WS and Greenberg S. Encoding of amplitude modulation in the cochlear nucleus of the cat. J Neurophysiol 71: 1797–1825, 1994.

230. Rhode WS and Smith PH. Characteristics of tone-pip response patterns in relationship to spontaneous rate in cat auditory nerve fibers. Hear Res 18: 159–168, 1985.

231. Rhode WS and Smith PH. Encoding timing and intensity in the ventral cochlear nucleus of the cat. J Neurophysiol 56: 261–286, 1986.

232. Rieke F, Bodnar D, and Bialek W. Naturalistic stimuli increase the rate and efficiency of information transmission by primary auditory neurons. Proc R Soc Lond B Biol Sci 262: 259–265, 1995.

233. Ritsma RJ. Existence region of the tonal residue. J Acoust Soc Am 34: 1224–1229, 1962.

234. Robles L and Ruggero MA. Mechanics of the mammalian cochlea. Physiol Rev 81: 1305–1352, 2001.

235. Rodenburg M, Verveij C, and Van Den Brink G. Analysis of evoked responses in man elicited by sinusoidally amplitude modulated noise. Audiology 11: 283–293, 1972.

236. Rodrigues-Dagaeff C, Simm G, De Ribaupierre Y, Villa A, De Ribaupierre F, and Rouiller EM. Functional organization of the ventral division of the medial geniculate body of the cat: evidence for a rostro-caudal gradient of response properties and cortical projections. Hear Res 39: 103–126, 1989.

237. Rose JE, Greenwood DD, Goldberg JM, and Hind JE. Some discharge characteristics of single neurons in the inferior colliculus of the cat. I. Tonotopical organization, relation of spike-counts to tone intensity, and firing patterns of single elements. J Neurophysiol 26: 294–320, 1963.

238. Rosen S. Temporal information in speech: auditory and linguistic aspects. Philos Trans R Soc Lond B Biol Sci 336: 367–373, 1992.

239. Ross B, Borgmann C, Draganova R, Roberts LE, and Pantev C. A high-precision magnetoencephalographic study of human auditory steady-state responses to amplitude-modulated tones. J Acoust Soc Am 108: 679–691, 2000.

240. Rouiller E and De Ribaupierre F. Neurons sensitive to narrow ranges of repetitive acoustic transients in the medial geniculate body of the cat. Brain Res 48: 323–326, 1982.

241. Rouiller E, De Ribaupierre Y, Toros-Morel A, and De Ribaupierre F. Neural coding of clicks in the medial geniculate body of cat. Hear Res 5: 81–100, 1981.

242. Ruggero MA. Systematic errors in indirect estimates of basilar membrane travel times. J Acoust Soc Am 67: 707–710, 1980.

243. Ruggero MA. Physiology and coding of sound in the auditory nerve. In: The Mammalian Auditory Pathway: Neurophysiology, edited by R. R. Fay and A. N. Popper. New York: Springer-Verlag, 1992, p. 34–93.

244. Ruggero MA and Rich NC. Timing of spike initiation in cochlear afferents: dependence on site of innervation. J Neurophysiol 58: 379–403, 1987.

245. Ryan MJ and Rand AS. Phylogenetic inference and the evolution of communication in túngara frogs. In: The Design of Animal Communication, edited by M. D. Hauser and M. Konishi. Cambridge, MA: MIT, 1999, p. 535–557.

246. Sachs MB and Abbas PJ. Rate versus level functions for auditory-nerve fibers in cats: tone-burst stimuli. J Acoust Soc Am 56: 1835–1847, 1974.

247. Saitoh K, Maruyama N, and Kudoh M. Sustained response of auditory cortex units in the cat. In: Brain Mechanisms of Sensation, edited by Y. Katsuki, R. Norgren, and M. Sato. New York: Wiley, 1981, p. 31–43.

248. Saldana E, Feliciano M, and Mugnaini E. Distribution of descending projections from primary auditory neocortex to inferior colliculus mimics the topography of intracollicular projections. J Comp Neurol 371: 15–40, 1996.

249. Sally SL and Kelly JB. Organization of auditory cortex in the albino rat: sound frequency. J Neurophysiol 59: 1627–1638, 1988.

250. Schorer E. Critical modulation frequency based on detection of AM versus FM tones. J Acoust Soc Am 79: 1054–1057, 1986.

251. Schreiner CE and Langner G. Periodicity coding in the inferior colliculus of the cat. II. Topographical organization. J Neurophysiol 60: 1823–1840, 1988.

252. Schreiner CE and Langner G. Laminar fine structure of frequency organization in auditory midbrain. Nature 388: 383–386, 1997.

253. Schreiner CE and Raggio MW. Neuronal responses in cat primary auditory cortex to electrical cochlear stimulation. II. Repetition rate coding. J Neurophysiol 75: 1283–1300, 1996.

254. Schreiner CE and Snyder RL. Modulation transfer characteristics of neurons in the dorsal cochlear nucleus of the cat. Soc Neurosci Abstr 13: 1258, 1987.

255. Schreiner CE and Urbas JV. Representation of amplitude modulation in the auditory cortex of the cat. I. The anterior auditory field (AAF). Hear Res 21: 227–241, 1986.

256. Schreiner CE and Urbas JV. Representation of amplitude modulation in the auditory cortex of the cat. II. Comparison between cortical fields. Hear Res 32: 49–64, 1988.

257. Schroeder MR. Modulation transfer functions: definition and measurement. Acustica 49: 179–182, 1981.

258. Schuller G. Natural ultrasonic echoes from wing beating insects are encoded by collicular neurons in the CF-FM bat, Rhinolophus ferrumequinum. J Comp Physiol A Sens Neural Behav Physiol 155: 121–128, 1984.

259. Schulze H and Langner G. Periodicity coding in the primary auditory cortex of the Mongolian gerbil (Meriones unguiculatus): two different coding strategies for pitch and rhythm? J Comp Physiol A Sens Neural Behav Physiol 181: 651–664, 1997.

260. Schulze H and Langner G. Representation of periodicity pitch in the primary auditory cortex of the Mongolian gerbil. Acta Otolaryngol Suppl 532: 89–95, 1997.

261. Schulze H and Langner G. Auditory cortical responses to amplitude modulations with spectra above frequency receptive fields: evidence for wide band spectral integration. J Comp Physiol A Sens Neural Behav Physiol 185: 493–508, 1999.

262. Schwartz IR. Superior olivary complex and lateral lemniscal nuclei. In: The Mammalian Auditory Pathway: Neuroanatomy, edited by D. B. Webster, A. N. Popper, and R. R. Fay. New York: Springer-Verlag, 1992, p. 117–167.

263. Sek A and Moore BCJ. The critical modulation frequency and its relationship to auditory filtering at low frequencies. J Acoust Soc Am 95: 2606–2615, 1994.

264. Semple MN and Aitkin LM. Representation of sound frequency and laterality by units in central nucleus of cat inferior colliculus. J Neurophysiol 42: 1626–1639, 1979.

265. Shannon RV, Zeng FG, Kamath V, Wygonski J, and Ekelid M. Speech recognition with primarily temporal cues. Science 270: 303–304, 1995.

266. Shofner WP, Sheft S, and Guzman SJ. Responses of ventral cochlear nucleus units in the chinchilla to amplitude modulation by low-frequency, two-tone complexes. J Acoust Soc Am 99: 3592–3605, 1996.

267. Sinex DG, Henderson J, Li HZ, and Chen GD. Responses of chinchilla inferior colliculus neurons to amplitude-modulated tones with different envelopes. J Assoc Res Otolaryngol 3: 390–402, 2002.

268. Sivaramakrishnan S and Oliver DL. Distinct K currents result in physiologically distinct cell types in the inferior colliculus of the rat. J Neurosci 21: 2861–2877, 2001.

269. Smith PH, Joris PX, and Yin TCT. Projections of physiologically characterized spherical bushy cell axons from the cochlear nucleus of the cat: evidence for delay lines to the medial superior olive. J Comp Neurol 331: 245–260, 1993.

270. Smith RL and Brachman ML. Response modulation of auditory-nerve fibers by AM stimuli: effects of average intensity. Hear Res 2: 123–133, 1980.

271. Smith RL and Brachman ML. Adaptation in auditory-nerve fibers: a revised model. Biol Cybern 44: 107–120, 1982.

272. Smith ZM, Delgutte B, and Oxenham AJ. Chimaeric sounds reveal dichotomies in auditory perception. Nature 416: 87–90, 2002.

273. Sovijarvi ARA. Detection of natural complex sounds by cells in the primary auditory cortex of the cat. Acta Physiol Scand 93: 318–335, 1975.

274. Steinschneider M, Reser DH, Fishman YI, Schroeder CE, and Arezzo JC. Click train encoding in primary auditory cortex of the awake monkey: evidence for two mechanisms subserving pitch perception. J Acoust Soc Am 104: 2935–2955, 1998.

275. Struhsaker CT. Auditory communication among vervet monkeys (Cercopithecus aethiops). In: Social Communication Among Primates, edited by S. A. Altmann. Chicago, IL: Univ. of Chicago Press, 1967, p. 281–324.

276. Suga N, O’Neill WD, Kujirai K, and Manabe T. Specificity of combination sensitive neurons for processing of complex biosonar signals in auditory cortex of the mustached bat. J Neurophysiol 49: 1573–1626, 1983.

277. Symmes D. Discrimination of intermittent noise by macaques following lesions of the temporal lobe. Exp Neurol 16: 201–214, 1966.

278. Terhardt E. Über die durch amplitudenmodulierte Sinustöne hervorgerufene Hörempfindung. Acustica 20: 210–214, 1968.

279. Tsuchitani C and Johnson DH. Binaural cues and signal processing in the superior olivary complex. In: Neurobiology of Hearing: The Central Auditory System, edited by R. A. Altschuler, R. P. Bobbin, B. M. Clopton, and D. W. Hoffman. New York: Raven, 1991, p. 163–193.

280. Ulanovsky N, Las L, and Nelken I. Processing of low-probability sounds by cortical neurons. Nature Neurosci 6: 391–398, 2003.

281. Van Tassell DJ, Soli SD, Kirby VM, and Widin GP. Speech waveform envelope cues for consonant recognition. J Acoust Soc Am 82: 1152–1161, 1987.

282. Vater M. Single unit responses in cochlear nucleus of horseshoe bats to sinusoidal frequency and amplitude modulated signals. J Comp Physiol A Sens Neural Behav Physiol 149: 369–388, 1982.

283. Verhey JL, Dau T, and Kollmeier B. Within-channel cues in comodulation masking release (CMR): experiments and model predictions using a modulation-filterbank model. J Acoust Soc Am 106: 2733–2745, 1999.

284. Vernier VG and Galambos R. Response of single medial geniculate units to repetitive click stimuli. Am J Physiol 188: 233–237, 1957.

285. Viemeister NF. Temporal modulation transfer functions based upon modulation thresholds. J Acoust Soc Am 66: 1364–1380, 1979.

286. Viemeister NF and Plack CJ. Time analysis. In: Human Psychophysics, edited by W. A. Yost, A. N. Popper, and R. R. Fay. New York: Springer, 1993, p. 116–154.

287. Von Helmholtz HLF. Die Lehre von den Tonempfindungen als physiologische Grundlage für die Theorie der Musik. Trans. Ellis AJ 1954. On the Sensations of Tone as a Physiological Basis for the Theory of Music. New York: Dover, 1863.

288. Voss RF and Clarke J. 1/f noise in music and speech. Nature 258: 317–318, 1975.

289. Wakefield GH and Viemeister NF. Selective adaption to linear frequency-modulated sweeps: evidence for direction-specific FM channels. J Acoust Soc Am 75: 1588–1592, 1984.

290. Walton JP, Frisina RD, and O’Neill WE. Age-related alteration in processing of temporal sound features in the auditory midbrain of the CBA mouse. J Neurosci 18: 2764–2776, 1998.

291. Walton JP, Simon H, and Frisina RD. Age-related alterations in the neural coding of envelope periodicities. J Neurophysiol 88: 565–578, 2002.

292. Wang X, Merzenich MM, Beitel R, and Schreiner CE. Representation of a species-specific vocalization in the primary auditory cortex of the common marmoset: temporal and spectral characteristics. J Neurophysiol 74: 2685–2706, 1995.

293. Wang X, Merzenich MM, Beitel R, and Schreiner CE. Representation of a species-specific vocalization in the primary auditory cortex of the common marmoset: temporal and spectral characteristics. J Neurophysiol 74: 2685–2706, 1995.

294. Wang X and Sachs MB. Neural encoding of single-formant stimuli in the cat. I. Responses of auditory nerve fibers. J Neurophysiol 70: 1054–1075, 1993.

295. Wang X and Sachs MB. Neural encoding of single-formant stimuli in the cat. II. Responses of anteroventral cochlear nucleus units. J Neurophysiol 71: 59–78, 1994.

296. Wang X and Sachs MB. Transformation of temporal discharge patterns in a ventral cochlear nucleus stellate cell model: implications for physiological mechanisms. J Neurophysiol 73: 1600–1616, 1995.

297. Warr WB. Parallel ascending pathways from the cochlear nucleus: neuroanatomical evidence of functional specialization. In: Contributions to Sensory Physiology, edited by W. D. Neff. New York: Academic, 1982, p. 1–38.

298. Weiss TF and Rose C. A comparison of synchronization filters in different auditory receptor organs. Hear Res 33: 175–180, 1988.

299. Whitfield IC. Auditory cortex and the pitch of complex tones. J Acoust Soc Am 67: 644–647, 1980.

300. Whitfield IC and Evans EF. Responses of auditory cortical neurons to stimuli of changing frequency. J Neurophysiol 28: 655–672, 1965.

301. Wiegrebe L and Winter IM. Temporal representation of iterated rippled noise as a function of delay and sound level in the ventral cochlear nucleus. J Neurophysiol 85: 1206–1219, 2001.

302. Wightman FL and Green DM. The perception of pitch. Am Sci 62: 208–215, 1974.

303. Wilson HR and Wilkinson F. Evolving concepts of spatial channels in vision: from independence to nonlinear interactions. Perception 26: 939–960, 1997.

304. Winer JA. The functional architecture of the medial geniculate body and primary auditory cortex. In: The Mammalian Auditory Pathway: Neuroanatomy, edited by D. B. Webster, A. N. Popper, and R. R. Fay. New York: Springer-Verlag, 1992, p. 222–409.

305. Winter IM, Robertson D, and Yates GK. Diversity of characteristic frequency rate-intensity functions in guinea pig auditory nerve fibers. Hear Res 45: 191–202, 1990.

306. Winter P and Funkenstein HH. The effects of species-specific vocalization on the discharge of auditory cortical cells in the awake squirrel monkey (Saimiri sciureus). Exp Brain Res 18: 489–504, 1973.

307. Wollberg Z and Newman JD. Auditory cortex of squirrel monkey: response patterns of single cells to species-specific vocalisations. Science 175: 212–214, 1972.

308. Wong D, Maekawa M, and Tanaka H. The effects of pulse repetition rate on the delay sensitivity of neurons in the auditory cortex of the FM bat, Myotis lucifugus. J Comp Physiol A Sens Neural Behav Physiol 170: 393–402, 1992.

309. Wong SW and Schreiner CE. Representation of CV-sounds in cat primary auditory cortex: intensity dependence. Speech Communication 41: 93–106, 2003.

310. Yang L and Pollak GD. Differential response properties to amplitude modulated signals in the dorsal nucleus of the lateral lemniscus of the mustache bat and the roles of GABAergic inhibition. J Neurophysiol 77: 324–340, 1997.

311. Yates GK. Dynamic effects in the input/output relationship of auditory nerve. Hear Res 27: 221–230, 1987.

312. Yin TCT. Neural mechanisms of encoding binaural localization cues in the auditory brainstem. In: Integrative Functions in the Mammalian Auditory Pathway, edited by D. Oertel, A. N. Popper, and R. R. Fay. New York: Springer, 2002, p. 99–159.

313. Yin TCT and Chan JCK. Interaural time sensitivity in medial superior olive of cat. J Neurophysiol 64: 465–488, 1990.

314. Yin TCT, Chan JCK, and Irvine DRF. Effects of interaural time delays of noise stimuli on low-frequency cells in the cat’s inferior colliculus. I. Responses to wideband noise. J Neurophysiol 55: 280–300, 1986.

315. Yin TCT, Chan JK, and Kuwada S. Characteristic delays and their topographic distribution in the inferior colliculus of the cat. In: Mechanisms of Hearing, edited by W. R. Webster and L. M. Aitkin. Clayton, Victoria, Australia: Monash Univ. Press, 1983, p. 94–99.

316. Yin TCT, Joris PX, Smith PH, and Chan JCK. Neuronal processing for coding interaural time disparities. In: Binaural and Spatial Hearing in Real and Virtual Environments, edited by R. Gilkey and T. Anderson. New York: Lawrence Erlbaum, 1997, p. 427–445.

317. Yin TCT, Kuwada S, and Sujaku Y. Interaural time sensitivity of high-frequency neurons in the inferior colliculus. J Acoust Soc Am 76: 1401–1410, 1984.

318. Yost WA, Sheft S, and Opie J. Modulation interference in detection and discrimination of amplitude modulation. J Acoust Soc Am 86: 2138–2147, 1989.

319. Young ED. The cochlear nucleus. In: The Synaptic Organization of the Brain, edited by G. M. Shepherd. New York: Oxford Univ. Press, 1998, p. 121–158.

320. Young ED, Robert JM, and Shofner WP. Regularity and latency of units in ventral cochlear nucleus: implications for unit classification and generation of response properties. J Neurophysiol 60: 1–29, 1988.

321. Zhang HM and Kelly JB. AMPA and NMDA receptors regulate responses of neurons in the rat’s inferior colliculus. J Neurophysiol 86: 871–880, 2001.

322. Zhao HB and Liang ZA. Processing of modulation frequency in the dorsal cochlear nucleus of the guinea pig: amplitude modulated tones. Hear Res 82: 244–256, 1995.

323. Zhao HB and Liang ZA. Temporal encoding and transmitting of amplitude and frequency modulations in dorsal cochlear nucleus. Hear Res 106: 83–94, 1997.

324. Zwicker E. Die Grenzen der Hörbarkeit der Amplitudenmodulation und der Frequenzmodulation eines Tones. Acustica 2: 125–133, 1952.

325. Zwicker E and Fastl H. Psychoacoustics. Berlin: Springer, 1999.

AUDITORY MODULATION PROCESSING 577

Physiol Rev • VOL 84 • APRIL 2004 • www.prv.org