Cortical Correlates of Low-Level Perception: From Neural ... · physiological data describing the...

Neuron

Perspective

Cortical Correlates of Low-Level Perception:From Neural Circuits to Percepts

Yves Fregnac1,* and Brice Bathellier11Unite Neuroscience Information et Complexite (UNIC), CentreNational de la Recherche Scientifique, FRE 3693, 91198Gif-sur-Yvette, France*Correspondence: [email protected]://dx.doi.org/10.1016/j.neuron.2015.09.041

Low-level perception results from neural-based computations, which build a multimodal skeleton of uncon-scious or self-generated inferences on our environment. This review identifies bottleneck issues concerningthe role of early primary sensory cortical areas, mostly in rodent and higher mammals (cats and non-humanprimates), where perception substrates can be searched at multiple scales of neural integration. We discussthe limitation of purely bottom-up approaches for providing realistic models of early sensory processing andtheneed for identificationof fast adaptiveprocesses, operatingwithin the timeof apercept. Futureprogresseswill depend on the careful use of comparative neuroscience (guiding the choices of experimental models andspeciesadapted to thequestionsunder study), on thedefinitionof agreed-uponbenchmarks for sensory stim-ulation, on the simultaneous acquisition of neural data at multiple spatio-temporal scales, and on the in vivoidentification of key generic integration and plasticity algorithms validated experimentally and in simulations.

IntroductionLow-level perception can be defined as the neural-based com-

putations building unconscious or self-generated inferences

during the processing of sensory events. These internal infer-

ences are in turn ‘‘psychologically projected into external space

and accepted as our most immediate reality’’ (Gregory, 1997).

Low-level perception does not necessarily require attentional

processes, and, for this reason, it is often confounded with

non-attentive perception. This operation constitutes a necessary

reformatting of sensory input in a compositional neuronal lan-

guage, which will allow more abstract processing by higher

cognitive-like cortical areas (see Abboud et al., 2015). This

grammar of perceptual primitives, which emerge (‘‘pop out’’)

effortlessly, reflects the expectations of our brain, built through

prior sensorimotor experience, about what is to be perceived

(Gregory, 1997). This largely autonomous, dynamic, and adap-

tive process can be observed in the anesthetized brain, although

it is continuously updated through closed-loop interactions in the

awake and attentive states. It feeds and guides our cortically

mediated interactions with the world through fast, but context-

adaptive, behavioral choices (Figure 1A).

Perception is based on raw information detected by sensory

receptors of the organisms, usually classified into distinct

‘‘senses’’ or modalities. Following Aristotle, five major senses

are classically distinguished in humans, depending on the type

of physical signal used for transduction (photon for vision, acous-

tic vibration for hearing, volatile anddissolved chemicals for taste

and olfaction, mechanical forces for the haptic sense). Sherring-

ton later proposed a more functional classification based on the

spatial proximity of the sensory sources relative to our own

body: vision, olfaction, and hearing are the ‘‘teleceptors’’ inform-

ing about the distant environment, while touch and taste are the

‘‘exteroceptors’’ informing about what is within our immediate

reach; in addition, ‘‘interoceptors’’ inform about the bodily func-

tions and ‘‘proprioceptors’’ inform about bodily position.

110 Neuron 88, October 7, 2015 ª2015 Elsevier Inc.

A key attribute of perception is that it usually solves ambigu-

ities that result from incomplete data about the world. Since

ancient debates opposing rationalists and empiricists, it has

been clear that ‘‘what we see (perceive) is not what we get

(receive)’’: perception departs from the physical reality sampled

by our sensors, leading to illusions that reflect ‘‘erroneous’’ infer-

ences about the sensory input. Figures 1B and 1C illustrate two

classical examples of illusions: the Mach bands illusion (Ratliff,

1965), in which discontinuities in the spatial derivative of lumi-

nance induce the perception of illusory contrast edges, and the

‘‘line motion effect,’’ in which the sequential presentation of

two static shapes (a square followed by a rectangle) produces,

for a specific range of stimulus intervals, the perception of

continuousmotion (‘‘phi’’ effect inWertheimer, 1912). Aswe later

develop, canonical circuit motifs forming built-in physiological

filters can account for such biases of visual perception.

As it has long been noticed, a complementary way of solving

ambiguity in object identification is to combine or transfer

perceptual knowledge across sensory channels. In his famous

letter to John Locke in 1688, William Molyneux asked whether

a born blind man who has learnt to distinguish a globe and a

cube by touch would be able to distinguish and name these ob-

jects simply by sight, once he had been enabled to see. Recent

observations in newly sighted humans indicate that the answer is

no, but that cross-modal associations can be rapidly formed

(Held et al., 2011). Hence the use of contextual information in

perception through ‘‘built-in’’ relationships may generalize from

unimodal to multimodal perception. Multimodal illusions support

this idea. For example, in the double flash illusion, a tone triggers

the percept of a luminance change (Watkins et al., 2007), while in

theMcGurk-McDonald effect, the percept of a vocalized syllable

is biased by the visual observation of mismatched mouth mo-

tions (McGurk and MacDonald, 1976). Multisensory integration

is also at the center of our everyday experience of ‘‘body-owner-

ship’’ (see also Blanke et al., 2015). The representation of body

mailto:[email protected]

http://dx.doi.org/10.1016/j.neuron.2015.09.041

http://crossmark.crossref.org/dialog/?doi=10.1016/j.neuron.2015.09.041&domain=pdf

Figure 1. Sensation, Perception, and Illusion(A) A conceptual framework (inspired by Richard Gregory).(B) Mach bands: from top to bottom, displayed luminance profile, measured luminance gradient in positions 1–4, perceived luminance profile. The Mach illusion(blue arrows) corresponds to the perceived lighter (position 2) and darker (position 3) bands.(C) Line motion illusion: top, spatio-temporal sequence of displayed stimuli (square followed by a bar); bottom, perceived downward motion (blue arrow). (A)–(C)adapted from Shulz and Fregnac, 2010.(D)Pictureof the surfacevasculatureoverlying the imaged regionofV1,V2, andV4with the regionof interest (ROI), alongwithdiagramsof the visual stimuli used tomapcortical responses. Scale bar = 1mm. The red arrows indicate the drift directions of the stimuli. Red outlines delineate the areas of V1, V2, and V4 that were analyzed.(E) Differential analysis of the orientation maps recorded simultaneously across the three different visual areas: response profiles produced by gratings (LG) andillusory contours (IC) were indistinguishable in V4, while those in V1 and V2 showed a clear orientation preference shift for IC stimuli. The observed shift reflects the45� angle between the solid inducer orientation and the illusory contour.(F) The relative scale of spatial integration of visual cortical receptive fields is overlaid with the pattern of illusory contours (IC) defined by spatially aligned abuttinghorizontal lines. The hierarchical functional hypothesis of Wang and colleagues is that spatially aligned neurons with smaller RFs in V1 and V2 project to IC-selective neurons with larger RFs in V4, therefore generating a more robust response to IC in V4. For illustrative purposes, the RF diameter of V1 and V2neurons was taken as 1� and that of V4 neuron as 6�. The mechanisms regulating the coordinated activity between feedforward and feedback flow are stillunknown. (D)–(F) are freely adapted from Pan et al., 2012, with permission of Wei Wang.

Neuron

Perspective

parts is intrinsically multimodal, as revealed by illusions such as

the rubber hand illusion (Botvinick and Cohen, 1998), in which

the pairing of a tactile stimulus on the real hand of a subject

with visualization of a touch on a fake hand creates an ownership

illusion that the fake hand belongs to the body. This is also partic-

ularly evident for the global consciousness of one-self as an

entire body, which can be profoundly affected in situations in

which, as in the above example, incongruent multimodal evi-

dence is given to the subject (Blanke, 2012).

To restrict the scope of this article, we will focus on the role

played by primary sensory cortical areas in low-level perception,

by taking examples in a few animal species where the substrates

of perception can be looked for at multiple scales of neural inte-

gration, mostly in the rodent and higher mammals (cats and non-

human primates). David Marr (Marr, 1982) distinguished three

hierarchical levels of analysis in the study of neural systems: (1)

the computation realized by the brain, (2) its algorithmic instanti-

ation, and (3) the biophysical substrate of implementation. Start-

ing from the psychological or psychophysical evidence, which

allow us to characterize the computation corresponding to

low-level perception, we will review current knowledge on the

elementary algorithmic principles found in cortical circuits un-

derlying perception (see Box 1) and discuss the main bottleneck

issues for extracting the rules necessary to build any realistic

model of the early sensory brain (Box 2).

This review will draw mainly from unimodal perception in

vision, which offers a unique and abundantly documentedmodel

of sensory perception necessary for higher cognitive activities

(e.g., sign communication, face recognition, and reading and

writing in humans) and structurally prevalent in the neocortex

of higher mammals (more than half of primate cortex is allocated

to the analysis of visual information distributed across several

tens of individualized retinotopically organized areas; Markov

et al., 2013). Moreover, important advances have been made

on the more holistic aspects of high-level vision (see Abboud

et al., 2015), which we could start to bridge to the abundant

physiological data describing the elementary mechanisms at

play in the early (subcortical and cortical) visual system.

Phenomenology of Low-Level Perception: A HolisticPerspectiveA natural starting point for characterizing perception is to ask:

how are sensory ‘‘bits’’ bound in a ‘‘whole’’ that is meaningful

to the observer?When facing a natural visual scene, human sub-

jects have an immediate conscious perception of the elementary

features that compose it (segmentation), as well as of the higher-

Neuron 88, October 7, 2015 ª2015 Elsevier Inc. 111

Box 1. Current Status of the Field

d ‘‘Pop-out’’ non-attentive perception is constrained by ho-

listic laws that reflect expectations about the composi-

tional structure of what is to be ‘‘perceived.’’

d No definitive neural-based bottom-up model accounts for

the emergence of these psychic laws.

d Models of receptive fields fail to consider the synaptic

nature of their underlying processes and consequently

underestimate the impact of local recurrent networks.

d Models of receptive fields fail to account for the functional

diversity of contextual dependencies and for larger-scale

interactions across features.

d Presentmodels of perception rely on a dominantly feedfor-

ward hierarchy of cortical areas, but multiple examples

attest for the backpropagation of higher-order feature

selectivity onto primary sensory cortical areas.

d Algorithms of fast and reversible plasticity necessary

for transient percept emergence remain to be validated

in vivo.

d The jump from rodent to humans regarding the neural or-

ganization of low-level perception seems presently too

high.

Box 2. Future Directions

d Amathematical framework that goes beyond the receptive

field concept is necessary to define the complex nonlinear

transformation applied by brain circuits tomap physical re-

ality into the perceptual space

d Multiscale models that realistically account both for struc-

tural and functional observations will help to access the

algorithmic principles of cortical circuits.

d Large-scale recordings during uni- and multimodal

perception will make it possible to test if specific attracting

states of cortical neuron populations represent elementary

tokens of percepts.

d New techniques to precisely control spatio-temporal acti-

vation sequences in cortical assemblies will help to eluci-

date the exact nature of the neural code and the temporal

precision below which the percept is lost.

d Comparative physiology approaches should be encour-

aged to discriminate the common principles shared by cir-

cuit motifs from a species-specific use of a sensorymodal-

ity or multimodal strategy.

Neuron

Perspective

order global objects that emerge from their associations (bind-

ing), although not necessarily in this order. Contours, colors,

textured surfaces, shapes, and three-dimensional objects pop

out unambiguously, in a fraction of a second to seconds, accord-

ing to the background or context.

Various conceptual schools have emerged from the holistic

approach. The first attempt came from Gestalt theory, intro-

duced by von Ehrenfels and generally attributed to Kolher. In

its simplest form, Gestalt theory posits that the ‘‘whole’’ and

the ‘‘parts’’ of a percept coexist in a reciprocal dependency

that obeys specific relational rules, such as proximity, similarity,

uniform density, common fate, direction, good continuation,

closure, and symmetry, to name but a few (for review seeWage-

mans et al., 2012). Gestalt psychology remains relevant to our

current understanding of the emergence of relational structure

and higher-order binding. By definition, Gestalt effects are illu-

sion makers, but these illusions are the expression of our brain

grouping information and building, according to Aristotle’s

view, global constructs ‘‘other than the sum of their parts.’’ A

fascinating illustration of the compositional power of perception

is given by the simultaneous extraction, within a single glance, of

both local (fruit) and global (face) features, from Arcimboldo’s

portrait of Emperor Vertumnus (the Roman God of the Seasons).

The original claim of the ‘‘Gestaltists’’—the existence of a

‘‘grammar of seeing’’ as phrased by Kanizsa—has been criti-

cized and extended by other schools of thought. The field of ‘‘en-

action’’ (Palacios and Bozinovic, 2003), for example, insists on

the learnt nature of the elements of this grammar, reflecting

active perception and interactions with the environment, as

opposed to fixed perceptual primitives reflecting the structure

of objects. Other approaches focused on mathematical formula-

tions of this idea: e.g., artificial vision models (Morel et al., 2010)

or symbolic modeling (for review see Wagemans et al., 2012).


Nevertheless, there is a consensus that key perceptual illusions

reveal the existence of perceptual laws, characterizing the pro-

cess of recognition of objects, and that these illusions can be

used to reveal the neural correlates of perception and its compu-

tational underpinnings (Eagleman, 2001).

Several visual illusions in which the specificity of the output is

not predicted by that of the ‘‘parts’’ and their topological union

provide other important clues. For example, binocular rivalry, in

which a different image, or different non-corresponding image

patches (Diaz-Caneja, 1928) are simultaneously presented to

each eye, results in random switching between two possible per-

cepts. This paradigm illustrates the holistic nature of the percep-

tual process by providing evidence for a single percept among

several possible under multiple conflicting inputs. Moreover, it

enables the neurobiological study of how this percept emerges.

Studies performed at the single-cell level in the behaving non-hu-

man primate indicate that the percentage of visual cortical neu-

rons exhibiting activity correlated with the perceptual report in-

creases along the visual cortical area hierarchy, starting with a

few ‘‘reporters’’ in V1 and culminating in the inferotemporal cortex

and the superior temporal sulcus (Leopold and Logothetis, 1996).

Most remarkably, unilateral hemisphere activation (by a caloric

stimulus applied to one inner ear or by transcranial magnetic

stimulation) forces the alternation between the reported rivaling

percepts, which suggests that the conscious percept is selected

through competition between the two hemispheres (the ‘‘inter-

hemispheric switch hypothesis’’ (Miller et al., 2000). These results

suggest that the ambivalence is present in the information distrib-

uted across the sensory cortical hierarchy and that only one of the

two encoded percepts is raised to awareness through a switch

operating at the scale of interacting networks (mesoscopic scale).

The emergence of a global percept together with its corre-

sponding features implies that the brain brings certain features

(potentially even ones absent from physical reality) to conscious-

ness and discards others. For example, in the tactile funneling

Neuron

Perspective

illusion, simultaneous stimulations of two adjacent fingers evoke a

single focal sensation at the center of the stimulus pattern (central

to the two digit tips or spanning the two stimulation sites), even

when no physical stimulus occurs at that site (Chen et al.,

2003). In the ‘‘line-motion’’ illusion (Figure 1C), a stationary square

briefly precedes a long stationary bar presentation and produces

a perception of continuousmotion, absent in the stimulus (Jancke

et al., 2004). In the case of subjective or illusory contours (ICs;

Figure 1D), the emergence of ‘‘non-existent’’ features (contours)

is induced by precise geometrical alignment of real stimulus fea-

tures such as abutting lines (Pan et al., 2012; Seghier and Vuil-

leumier, 2006). These illusions show that the brain does more

than mirroring retinal inputs; it also computes knowledge or infer-

ences encoded in the processing circuits, such as the continuity

of objects boundaries or displacements. Extensive studies

show the cortical origin of these effects, at many levels. Simple

fusion illusions such as the ‘‘funneling’’ and ‘‘line-motion’’ effects

are already reflected in the activity of V1, as shown by voltage-

sensitive dye imaging in monkey and cat (Chen et al., 2003;

Jancke et al., 2004). These may be accounted for by lateral diffu-

sion processes intrinsic to primary sensory areas (Fregnac, 2012).

By contrast, the emergenceof illusory contours (as in Figure 1D)

involves reverberation across multiple networks including sec-

ondary visual areas. fMRI studies in humans have demonstrated

robust activations to illusory contours in V1 and V2 (for review,

see Seghier and Vuilleumier, 2006), but also distinct activations

in areas such as V4 (Mendola et al., 1999; Montaser-Kouhsari

et al., 2007). Electrophysiology in animal models shows a propor-

tional increase in the number of visual cells responding to illusory

contours between V1 and V2 (Lee and Nguyen, 2001; von der

Heydt et al., 1984), but these responses occur a few tens of milli-

seconds later than in downstream areas such as V4 (De Weerd

et al., 1996; Lee and Nguyen, 2001), suggesting that they may

reflect ‘‘top-down’’ cortico-cortical feedback. A recent imaging

and electrophysiology study (Pan et al., 2012) in rhesus ma-

caques, confirming partially a pioneer study in cat V1 and V2

(Sheth et al., 1996), shows that the orientation preference maps

for drifting gratings (DG) and virtual contours (VC) are shifted by

the angle between the solid inducer and the virtual contour in V1

and V2, whereas they superimpose perfectly for V4 (Figures 1D–

1F). Electrophysiological population data confirmed a ‘‘salt and

pepper’’ organization of VC-responsive cells in V1 and V2 already

seen by others. Thus, the orientation domains mapped in early vi-

sual areas V1 and V2mainly encode the local physical features of

the inducer stimulus, whereas a complete overlap of orientation

domains to process real and illusory contours emerges only in

V4. The roleofhigher-orderareas in thisphenomenon issupported

by lesion studies, showing that the perception of ICs, but not of

luminance-defined real contours, is severely impaired after V4

lesion (DeWeerd et al., 1996). In summary, the global organization

of virtual contour decoding seems to agree (superficially) with a

global and distributed multilayered feedforward hierarchy culmi-

nating in V4, but its functional emergence obviously requires the

concomitant activation of cortico-cortical feedback processes in

V1, through mechanisms which are far from being elucidated

(Figure 1F).

Similar back-propagation of the functional influence of higher-

order selectivity onto primary sensory cortical areas is seen also

in multiple context-modulation effects (Sharma et al., 2003; Su-

per et al., 2003; Zipser et al., 1996) and during multimodal

discrimination task learning (Lemus et al., 2010).

The idea of a ‘‘perceptual grammar’’ shaping sensory informa-

tion according to defined compositionality rules extends from

uni- tomultisensory processing. The ventriloquist effect, in which

vision biases sound source localization, provides a good ex-

ample. Just as the facial motion of a puppet expertly animated

by a ventriloquist gives us the illusion that the puppet speaks,

the coincident appearance of an object (even a flashed circle)

in our field of view together with a sound often leads to the incor-

rect impression that the sound originated from the location of the

visual stimulus (Jack and Thurlow, 1973). This effect suggests

the expectation (innate or learned) of a spatial correlation be-

tween inputs, reflecting physical reality. The magnitude of the

cross-modal effect depends on factors such as spatial (Jack

and Thurlow, 1973) and temporal proximity (Bonath et al.,

2014) of the stimuli. Functional electroencephalographic and im-

aging studies indicate that cortical areas traditionally seen as

unisensory (such as the core/belt region of the auditory cortex)

are, together with associative areas, involved in the ventriloquist

illusion (Bonath et al., 2014). Thus, multisensory integration ap-

pears to engage multiple levels of cortical interactions, possibly

across the entire hierarchy of sensory areas.

Bottleneck Issues

Even if holistic models are conceptually rich and powerful, they

remain difficult to link to precise explanatory neural and synaptic

mechanisms, especially via unsupervised means. Except for

Grossberg’s attempt to build a consistent neutrally inspired

computational framework of early visual processing (Grossberg

et al., 2008), entirely bottom-up approaches have not yet pro-

duced convincing implementations of Gestalt principles. This

failure was, in a way, predicted by Wertheimer in his definition

of holism: ‘‘There are wholes, the behavior of which is not deter-

mined by that of their individual elements, but where the part-

processes are themselves determined by the intrinsic nature of

the whole.’’ Although often opposed, Gestalt theory (calling for

a holistic evaluation) and neural-based theories of object recog-

nition are not mutually exclusive: when driven dominantly by

external sensations and bottom-up activation, the cortical

neuronal machinery generates an automatic interpretation of

our environment through built-in mechanisms yet to be identi-

fied, whose effects contribute to the emergence of perceptual

laws applicable to automatized sensing. When driven mainly

by top-down expectations, the computation takes advantage

of a zoo of hallucinatory states internally stored in the same net-

works, to extrapolate what is given to be seen (‘‘controlled hallu-

cination,’’ as coined by Jan Koendrink [Koenderink, 2012]).

Turning these ideas into working models is still beyond reach,

but hopefully possible. Below, we will try to identify the gaps

that need to be filled to attain this goal.

Re-evaluation of the Receptive Field ConceptEfforts to characterize, in cortical sensory areas, the algorithms

of low-level perception (as in Marr’s tripartite definition) focused

initially on the attributes and perceptual invariants encoded by

single neurons. The dogma was that a neuron encodes the input

configuration that maximizes its firing frequency (the ‘‘neuronal


Neuron

Perspective

doctrine’’ of Barlow [Barlow, 1972]). This paradigm naturally

extended to the receptive field (RF): a neuron’s RF can be

defined as the region of sensory space within which the pres-

ence of a stimulus significantly modifies its activity (Hartline,

1938). In the visual system, the RF is the region of the retina

where a local change of luminance (relative to the background)

modulates (enhances or suppresses) the firing rate or the prob-

ability of discharge of the recorded neuron. In the auditory

system, RFs correspond to regions in physical space, in the tem-

poral and frequency domain, or both. In the somatosensory sys-

tem, receptive fields are (usually contiguous) regions of the body

surface or internal organs. A special case of tactile receptive

fields are those corresponding to the whiskers in rodents. This

operational concept used across most sensory modalities (but

see below) has provided a powerful description of neuronal re-

sponses in the visual system with an emphasis on primary visual

cortex (Alonso and Swadlow, 2005) and area MT (Rust et al.,

2006), in somatosensory cortex (Estebanez et al., 2012), and in

auditory cortex (Theunissen and Elie, 2014).

Although receptive fields are primarily a convenient way to

categorize the response of neurons, they imply the existence

of hierarchies of representations in which each neuron codes

for a particular feature of the physical space in a static and

invariant way and in which a scene can be reconstructed in a

predominantly ‘‘bottom-up’’ fashion from assemblies of active

neurons and the knowledge of their associated features. In

particular, in the early visual system of mammalian carnivores

and primates, one observes an undeniable progression in RF

complexity: isotropic ‘‘Mexican hat’’ receptive fields with con-

centric opponent ON and OFF subfields, found in the retina or

the thalamus, detect local contrast change and filter out back-

ground mean luminance. In V1, simple RFs with spatially segre-

gated ON and OFF subfields are selective to position and

contrast edge orientation, while complex RFs generalize orienta-

tion selectivity across position within the receptive field. In V2

and V4, higher-order hypercomplex cells seem to code for cor-

ners and masking singularities. Even if this simplifying scheme

has been challenged, it has inspired the hierarchical organization

of the most advanced models of shape recognition (DiCarlo

et al., 2012) and provided adequate component filters to extract

a perceptual sketch of physical shapes. Hence, piecewise RF

decomposition is a powerful tool to describe the low-level skel-

eton of sensory percepts across subcortical and cortical areas.

This formalism is not without its problems, however. For

example, the full description of a RF requires, in principle, that

one tests the entire space of possible stimuli on a neuron; this

is clearly impossible, especially with poorly parameterized stim-

ulus spaces (e.g., chemicals in olfaction, natural scenes in

vision). In practice, people make simplifying assumptions: for

example, that a RF can be reduced to a linear filter combined

with a static nonlinearity (LNP models; Figures 2A and 2B)

of varying complexity (e.g., Priebe and Ferster, 2012). This

assumption enables the use of reverse correlation techniques

with broadband stimuli (e.g., white noise) to derive the filter—a

process whose logic is obviously circular. Observations in pri-

mary visual cortex (Baudot et al., 2013; David et al., 2004; Four-

nier et al., 2011; Smyth et al., 2003), auditory cortex (Bathellier

et al., 2012; Machens et al., 2004), and even in the retina (Hosoya


et al., 2005) indicate that linear filters with a static nonlinearity

poorly predict neuronal responses, especially for natural stimuli.

Linear receptivefielddescriptionscouldbeextendedwith ‘‘gain

control’’ mechanisms in retina (Solomon et al., 2006), thalamus

(Mante et al., 2008), and primary visual cortex (Rust et al., 2005),

which can have low-contrast enhancement functions as the ‘‘divi-

sive normalization’’ principle (Carandini and Heeger, 2012) in

which the output response of a neuron is divided by a factor pro-

portional to the average (or the sum) firing rate of nearby neurons.

A further, mathematically more general refinement is to

describe the RF as a bank of parallel linear and nonlinear sub-

units, using two main approaches that are, to a certain extent,

equivalent mathematically: spike-triggered covariance analysis

(Rust et al., 2005) (Figure 2C) or Volterra decomposition of re-

sponses to white noise (Fournier et al., 2011, 2014) (Figure 2F).

Further improvements still can be gained from analysis of synap-

tic signals (Fournier et al., 2011) rather than spikes. Other gener-

alized decomposition schemes have been developed (e.g., linear

[Pillow et al., 2008] or nonlinear [NIM model in McFarland et al.,

2013] models accounting for the combined activity of simulta-

neously recorded units and their functional interactions), with

varying degrees of success (Figures 2D and 2E). These models

are useful in that they can provide a finer functional dissection

of afferent circuits to a neuron. On the other hand, they implicitly

describe a neuron’s response in terms of feedforward inputs,

although we know that most of a cortical neuron’s inputs derive

largely from local reverberating circuits. This obvious conundrum

is usually ignored.

Bottleneck Issues

We identify three major issues with the RF concept:

(1) Models of receptive fields fail to consider the synaptic na-

ture of their underlying processes and the impact of local

recurrent networks. Classical system theory provides

spike-based phenomenological models of ‘‘equivalent’’

feedforward circuits whose predictive value is limited to

the input statistics of the training set. As indicated above,

the exploration of RF nonlinearities has led to a variety of

models whose applicability is often limited to particular

stimuli, stimulus ranges, or cortical areas. Claims that

we understand 60% of the variance of visual neuron firing

in V1 (Carandini et al., 2005) is therefore misleading,

because current models do not explain spike output

causally, nor do they account for the complex conduc-

tance and voltage dynamics observed experimentally

(Baudot et al., 2013; Haider et al., 2010; Monier et al.,

2008). Adding higher-order correction terms may help,

but it increases model complexity without guarantying

success over all stimulus ranges. Because features in

natural images often combine multiple orientations and

spatial frequencies in one location, filtering the visual

input through many nonlinear (complex-like) components

(each for a specific feature) may contribute to the detec-

tion of high-order correlations of the visual scene (LeCun

et al., 2015). Combining detailed nonlinear statistical and

biophysical modeling is another alternative. Known non-

linearities in the transfer between retina and cortex (e.g.,

rectification of retinal and thalamic outputs) are generally

Figure 2. Receptive Field Models in Mammalian Primary Visual Cortex(A–C) Functional linear-nonlinear-Poisson (LNP) Models for V1 Neurons, and their characterization using spike-triggered analyses (adapted from Rust et al.,2005). (A) The standard simple cell model, based on a single space-time oriented filter. The stimulus (left arrow) is convolved with the filter, and the output ispassed through a half-wave rectifying and squaring nonlinearity. This signal determines the instantaneous rate of a Poisson spike generator. (B) The ‘‘energymodel’’ of a complex cell, based on a pair of space-time oriented filters with a quadrature (90�) phase relationship (Adelson and Bergen, 1985). Each filter isconvolved with the stimulus, and the responses are squared and summed. The resulting signal drives a Poisson spike generator. (C) Generalized LNP responsemodel. The cell is described by a set of n linear filters (L), which can be excitatory (E) or suppressive (S). The model response is computed by first convolving eachof the filters with the stimulus. An instantaneous nonlinearity (N) governs the combination of excitatory and suppressive signals that drives a Poisson spikegenerator (P).(D and E) Adapted from Dan Butts, with permission. (D) Nonlinear modeling of stimulus processing: the nonlinear input model (NIM) model assumes that eachdirect input to the cell is already a LN process. The resulting receptive field can be seen as a LNLN cascade selective to multiple features (McFarland et al., 2013).(E) Generalized contextual receptive field: the contextual input is represented as multiple nonlinear LFP bands, each reflecting local recurrent contribution due tolaminar-specific intra-columnar processing. Model performance is improved by taking into account trial-to-trial changes in the global cortical state reflected bythe LFP signals.(F) Parallel bank model of synaptic integration (adapted from Fournier et al., 2011, 2014). Left: decomposition method based on a Volterra expansion truncated tothe second-order diagonal terms and a PCA analysis for separating excitatory and inhibitory subunits. Right: example of an intracellular recording of a layer IVstellate cell, filled with biocytin with its morphological reconstruction (bottom). Note that this simple receptive field (see linear kernel, top row) expresses at thesubthreshold level (Vm) three excitatory cross-oriented subunits (see X-Y plots, middle column) and one inhibitory non-oriented subunit.

Neuron

Perspective

ignored, as Sompolinsky and Shapley (1997) pointed out.

Moreover, RFs are usually described one feature at

a time (orientation, direction, disparity .), whereas RF

optimization should take all features into account simulta-

neously—a tricky problem given their sometimes conflict-

ing impact (Figure 2F).

(2) Receptive fields are not invariant. RFs are eminently

adaptive, and the balance between their linear and

nonlinear components can change with the dimension-

ality or the statistical properties of an input (Fournier

et al., 2011; Yeh et al., 2009) (Figure 2F). This has

been shown using nonlinear kernel estimation in V1 neu-

rons for different white-noise stimulus conditions. For

example, Fournier et al. (2011) showed that both the spa-

tio-temporal extent of linear and nonlinear kernels and

their relative weights depended on the spatio-temporal

density of the stimulus. A possible interpretation is that

the simple-like component of a RF derives from feedfor-

ward drive, complex-like components result from recur-

rent lateral connections, and that a neuron’s best RF

description depends on a variable, stimulus-dependent

balance between these components. This view may be

consistent with several studies (Nauhaus et al., 2008; Po-

lat et al., 1998; Sceniak et al., 1999), showing that the


Neuron

Perspective

lateral propagation of activity between adjacent cortical

units decreases substantially when stimulus contrast in-

creases. If similar effects exist for stimulus spatial or tem-

poral density, the cell apparent ‘‘complex-ness’’ would

vary with the stimulus due to changes in the balance be-

tween its lateral and feedforward inputs.

(3) Existing RF models only explain relatively simple res-

ponse features and may not be applicable to multimodal

processing. In the visual system of awake monkeys, for

instance, the responses of some V1 andmany V2 neurons

to illusory contours match the orientation of their RF and

their responses to rigid contours (see section 1). Yet these

RFs generally correspond to filters applied to regions of

stimulus space that are too small to account for the

long-range dependencies needed for illusory contours.

Beyond primary areas, neurons responsive to complex

features (e.g., faces) are described using no longer phys-

ical (defined by the stimulus) but perceptual (defined by

the observer) attributes; what is a RF for such neurons?

More generally, the single-neuron approach seems inad-

equate to define the complex nonlinear transformation

applied by brain circuits to map physical reality into the

perceptual space. A new mathematical framework, one

that goes beyond the RF concept, seems necessary.

This framework will also need to include a general way of treat-

ing multisensory interactions. Neural activity in associative areas

often co-varies with signals from different sensory channels. For

example, in the medial superior temporal area of the monkey ex-

trastriate visual cortex, neurons may respond to both visual and

vestibular inputs to encode head motion (Angelaki et al., 2011).

Inmonkeyprimary auditory cortex, information about sound iden-

tity carried by individual neurons is increased when those sounds

are paired with images (Kayser et al., 2010). In the posterior pari-

etal cortex of rats trained in an auditory visual task, neurons may

respondboth tovisual andauditory inputs,butwithdifferentmulti-

modal covariation rules (Raposo et al., 2014). In short, developing

a general framework suitable for multisensory responses will

require the co-parameterization of multiple modalities. Given the

practical difficulty of doing so with single modalities, one can

appreciate the problems facing us with multimodal conditions.

Dynamic Assembly Codes for PerceptionThe variety of response properties found in a typical sensory area

of cortex suggests that the knowledge of the simultaneous state

of many cortical neurons should correlate better with descrip-

tions of perceptual representations. Measures of single-neuron

responses and receptive fields classically rely on firing rates.

But the timing of action potentials could also carry distributed in-

formation about a stimulus (Abeles, 1991). If spike timing is more

precise than the perceived temporal variations of a stimulus,

spike timing itself could be part of the code. In this case, informa-

tion would have to be decoded in a space defined by N neurons

of the population and P time-bins, where P is the ratio between

the shortest perceived stimulus variations and the uncompressi-

ble variability of spike timing.

Strong experimental evidence for such precise distributed

spatio-temporal patterns exists in the olfactory system of insects


(Laurent, 2002). In mammals, several forms of temporal codes

have been proposed. They include spike-latency codes (spike

timing relative to stimulus onset) (Spors et al., 2006; Van Rullen

and Thorpe, 2001) and spike-phase codes (e.g., relative to hip-

pocampal theta rhythm (Huxter et al., 2003). These paradigms

can be included in the more general distributed spike-sequence

codes proposed by Abeles (‘‘synfire chain’’ hypothesis [Abeles,

1991]). The temporal precision of stimulus-locked activity in

mammalian retina and thalamus neurons (2–5 ms) (Reinagel

and Reid, 2000) is at least compatible with such ideas. In themo-

tor cortex of behaving monkeys, temporal motifs have been

found to be correlated with the nature of a behavioral task (Vaa-

dia et al., 1995) and reinforcement expectation (Riehle et al.,

1997), but the functional significance of these observations

(made by the external observer and not forcibly decoded by

the neurons) is debated because of statistical issues (Ayzenshtat

et al., 2010) and because of incompatibility (a priori) with cortical

dynamics, generally unstable (London et al., 2010).

Temporal codeshavebeen studied in rodent olfaction, inwhich

odor sampling is paced by active sniffing; below the timescale of

sampling (the sniff period), temporal fluctuations of the input can

probably not be perceived, implying that response dynamics, if

they exist, do not encode stimulus fluctuations. In rodents,

odorant binding to odorant receptors (ORs) varies across the

ORs, leading to sequential activation of odorant sensory neurons

(Spors et al., 2006). These sequences likely contribute to the dy-

namic activation of the first relay neurons, themitral cells (MCs) of

the olfactory bulb (Bathellier et al., 2008). Odor identity can be

better decoded fromMCassemblies using temporal than rate co-

des (Cury and Uchida, 2010). Using optogenetic approaches,

Haddadet al. (2013) showed that downstreamneurons in piriform

cortex can use temporal information to adjust their firing rate but

that piriform neuron assemblies do not carry additional informa-

tion in their relative spike timing (Miura et al., 2012). Thus, the ex-

istence and use of temporal codes may be specific only to some

relay stations and pathways, possibly used for specific computa-

tions.Recent studies in auditory andsomatosensory cortex show

a good correlation (Bathellier et al., 2012), possibly even a causal

link (Musall et al., 2014; O’Connor et al., 2013), between cortical

rate codes and perception. But the simplicity of the stimuli and

the behavioral model used in these studies do not exclude the

possibility that cortical structures with a high information load

such as primate V1 use a sparse code based on precise spike

timing, as observed during high-dimensional natural-scene stim-

ulation (Baudot et al., 2013; Vinje and Gallant, 2000).

Whetheror notcertainneuronsaresensitive to thefine temporal

features of their input, a pragmatic approach is to assesswhether

population dynamics can be related to sensation and perception.

This approach has been used successfully in olfaction, in which

stimulus space cannot be parameterized,makingRF approaches

intractable. Population state analysis requires simultaneous re-

cordings from large and representative sets of neurons in an

area of interest (e.g., usingmultielectrode arrays or optical indica-

tors). It then aims at comparing the representations of stimuli us-

ing neural population metrics to infer the structure of sensory,

perceptual, or behavioral space. This approach skirts the com-

plex issue of deriving mathematical models of receptive fields to

fit the data, but developing it into a predictive tool remains a

0.250

1 20Trial #

Cel

l num

ber

87

9.5 kHz

1

26.8 kHz

Correlation

50 µm

25%

5 s

A B mixtures

population response switch

Figure 3. Switch-like Reconfigurations in Local Population Activity Patterns of Mouse Auditory Cortex during Auditory Perception(A) Top: 200 3 200 mm image of OGB1-stained neurons in the mouse auditory cortex. The line scan path used for two-photon calcium imaging in isoflurane-anaesthetized mice is superimposed. Bottom: typical calcium signals from all the neurons shown in (A) during presentation of various short sounds.(B) Single-trial population responses (left column) and similarity matrices (right column) for different mixtures of two pure tones (spectrograms on the top) thatexcited two distinct stereotypical response patterns. This figure shows that although the mixtures were varied gradually, the population response patternchanged in a switch-likemanner for an intermediatemixture ratio (see arrowhead; first response pattern elicitedwith mixtures dominated by 26.8 kHz, blue frame;second pattern, red). Adapted from Bathellier et al., 2012.

Neuron

Perspective

challenging task (Rabinovich et al., 2008). Population analysis has

beenextensivelyapplied to theolfactorybulbof fish (Friedrichand

Laurent, 2001), locust (Stopfer et al., 2003), and themouse (Bath-

ellier et al., 2008). It revealed that olfactory representationsdisplay

strong temporal dynamics even during constant stimuli, suggest-

ing that thedynamicsarepart of theolfactory representation.With

populationmetrics anddimensionality reduction techniques, sim-

ilarity between stimulus representations could be estimated, and

subspacescorresponding to representationsof thesameodorant

at different concentrations were identified (Stopfer et al., 2003),

providingmeans toaddress thedifficult question of concentration

invariance in olfactory perception.

Population analysis techniques have recently been applied to

rodent auditory cortex (Bathellier et al., 2012). Auditory cortex

studies contributed two important pieces of information. First, ac-

tivity in sensory cortex includes spontaneous but coordinated and

complex activity patterns that may reflect a repertoire of intrinsic

cortical dynamics with consequences on responses to sensory

input (Luczak et al., 2009). Importantly, sound-evoked cortical dy-

namics evolve in discrete steps when sounds are varied gradually

(Bathellier et al., 2012) (Figure 3). Altogether, these data suggest

that cortical circuits not only respond to their inputs but produce

their own dynamics in which sensory representations are

embedded, possibly resonating within specific ‘‘attractor’’ states.

Further investigation will be required to fully understand the inter-

play between intrinsic circuit dynamics and neural coding in cor-

tex, but an interestingworking hypothesis could be that the attrac-

torsofcorticaldynamics representelementary tokensofpercepts.

Bottleneck Issues

While increasing evidence points toward the existence of coordi-

nated dynamics in cortical circuits involving large cell assemblies

during perception, a difficult challenge still to overcome is to

prove their importance in perception.Development of large-scale

recording techniques and population analysis methods is one

important aspect, but as much as receptive field approaches,

which they generalize, they will detect only correlations between

brain activity and perception. This will be enough to infer new

algorithms potentially at play in sensory cortex. But to prove

that these algorithms are sufficient to generate perception, two

important tests should be performed. On one side, candidate

algorithms should be explicitly simulated to assess whether

they can quantitatively help reproducing perceptual capabilities

(e.g., object recognition) in a well-controlled theoretical setting,

thereby demonstrating that these algorithms perform the ex-

pected computations (in Marr’s sense). On the other side, to

show their causal involvement in perception in the living system,

the ideal experiment would be to show that, by replaying some

spatio-temporal activity pattern within or across brain areas,

one is able to trigger the emergence of the same percept actually

reported during observation of this very pattern. Such a Ge-

danken experiment, and quantitative tests on the impact pro-

duced by changing the firing rate or even jittering the individual

spike timingof cells composing the sameassembly,were already

proposed by Christoph von der Malsburg (von der Malsburg,

1986) andmay become possible in the near future. Cortical activ-

ity ‘‘re-encoding’’ experiments based on optogenetic tech-

niques, in which an illusory percept is produced by direct cortical

stimulation, recently started tobe tractable in rodents using opto-

genetics for very simple percepts such as object detection

(O’Connor et al., 2013) with the whisker system. So far, current

optogenetics is only able to precisely control the timing of spikes

on small or large populations of neurons, potentially identified by

genetic markers. However, creating precise spatio-temporal

sequences at a scale of the network that can be relevant for a


Neuron

Perspective

general percept is still a serious issuedespite the significant tech-

nical advances recently obtained using spatial light modulators

permitting the asynchronous activation of tens of individual cells

over a few hundred microns in awake mice (Szabo et al., 2014).

From Functional Principles to Neural Architectures forPerceptionBeyond the determination of algorithm principles (in David

Marr’s sense) thatmay underlie perception, another fundamental

question is their biological implementation. We here review a

few examples in which sensory processing principles could be

explained by particular cortical architectures and discuss the

generality of these architectures across senses and species.

As a first example, an important principle of visual processing is

‘‘spatial opponency,’’ which corresponds to suppressive effects

across features (contrast, motion, color contrast, orientation

contrast, center-surround) locatednearby in the retinotopic space

(e.g., center-surroundRF from retina to cortex). This effect can be

mathematically written as a convolution with a Mexican hat-

shaped filter equivalent to the opposite of the second-order

spatial derivative in visuotopic space (Ratliff, 1965). The role of

such operator is to emphasize local contrast and this type ofmotif

explains the Mach band illusion (Figure 1B) (Ratliff, 1965). It also

approximates the Gaussian Laplacian operator used in artificial

vision to extract edges and contours (Marr, 1982) and plays a

role in orientation selectivity. ‘‘Spatial opponency’’ can be imple-

mented in topographically organizedcircuitsby thespatially local-

ized lateral inhibitionarchitectures (Figure4A),whichareobserved

inmanysensory circuits includingprimary sensory neocortical cir-

cuits. Such architectures correspond to various forms of either

feedforward or recurrent inhibition (Figure 4B) when the interneu-

rons involved sendconnectionsmostly to spatially closecells (i.e.,

with neighboring and similar RF, e.g., Figures 4C and 4D). This

type of lateral inhibition is found in many different sensory struc-

tures, such as the retina or primary visual cortex in mammals,

with sometimes further refinements. For example, in cat, ferret,

and macaque, orientation selectivity in simple cells in the tha-

lamo-recipient layer 4 (4Cb for the macaque) is known to result

froma push-pull circuit in which the excitatory neurons receive in-

hibition from similarly orientation-tuned interneurons but in oppo-

sition of phase (Figures 4E and 4F) (Troyer et al., 1998).

While local connectivity motifs such as lateral inhibition under-

lie the extraction of precise local relationship in a visual scene

and probably explain the core of receptive fields in V1, it is un-

likely that they account for the emergence of global percepts

based on inputs spanning large retinotopic distances. However,

the later may in part result from long-distance connectivity motifs

in the cortical circuit, so as to bind together elemental visual fea-

tures in a meaningful way (Figure 5A). In higher mammals (tree

shrew, ferret, and cat as well as non-human primates) recon-

structed pyramidal cell axons that remain within the gray matter

extend over several hypercolumns (up to 6–8 mm in the tree

shrew [Bosking et al., 1997]; in the cat [Gilbert and Li, 2012],

but seeMartin et al. [2014]). Supposedly, these long connections

modulate the response gain of the local circuits (i.e., hypercol-

umn), inducing often suppressive effects although particular

center-surround stimulus conditions can induce specific boost-

ing (Sillito et al., 1995).


In spite of this uncertain status, horizontal connectivity has

long been presented as the biological substrate of iso-prefer-

ence binding. This functional organization principle is derived

from a developmental canonical rule which posits that ‘‘who fires

together (or is alike) tend to wire together’’ and has been imple-

mented successfully by LISSOM models to account for the

development of lateral connectivity in sensory cortical maps

(Miikkulainen et al., 2006). At the psychophysical level, this

view corresponds to the perceptual ‘‘association field’’

(Figure 5B) (Field et al., 1993). This concept assumes the facilita-

tion of collinear and, to a lesser extent, co-circular spatial inte-

gration of oriented contrast edges. This elegant psychophysical

hypothesis accounts in humans for the ‘‘pop-out’’ perception of

smooth contiguous path integration even when immersed in a

sea of randomly oriented edge elements (Field et al., 1993)

(Figure 5A) and the facilitation of target detection by high-

contrast co-aligned flankers (Polat and Sagi, 1993) (Figure 5A).

At the neuronal level, this view is supported by the electrophys-

iological demonstration of a ‘‘neural facilitation field’’ (Gilbert and

Li, 2012) corresponding to a boosting of the response gain to an

optimally oriented contrast edge within the classical RF when

flankers were simultaneously flashed in the immediate ‘‘silent

surround’’ and co-aligned along the preferred orientation axis

of the recorded cell (Figures 5C and 5D). Remarkably, this

gain-control effect seems to depend on top-down signals, as it

is weakened by diverted attention and suppressed by anes-

thesia (Li et al., 2006).

Further evidence and understanding comes from intracellular

data and VSD imaging in the anesthetized cat, demonstrating

long-distance propagation of visually evoked subthreshold ac-

tivity through lateral (and possibly feedback) connectivity outside

the classical receptive field (intracellular: Bringuier et al., 1999;

Fregnac, 2012; VSD: Benucci et al., 2007). More recently, both

techniques have been combined to show that a critical threshold

of spatial synergy and temporal summation has to be crossed

before the impact of the long-range interactions (in themV range)

can be functionally detected (Chavane et al., 2011). Taken

together, these studies suggest the existence (in higher mam-

mals) of structuro-functional collinearity biases, intrinsic to V1

and present at a subthreshold level in the anesthetized brain,

which require attention and feedback from higher cortical areas

to be expressed at a spiking level. No evidence for such circuits

has been found in the rodent.

Bottleneck Issues

One difficulty in identifying canonical motifs is that the functional

expression of cortical circuits adapts continuously to the statis-

tics of the sensory environment. Moreover, most associative

plasticity algorithms have been introduced to build distributed

memories of our environment, during development or learning

(review in Fregnac, 2003). Their implication in perception itself,

although envisioned by James and Hebb, remains largely unex-

plored. The dominant plasticity algorithms are derived from cor-

relation-based rules (Bienenstock et al., 1982), which extend

beyond Hebb’s principle, and are often seen as a universal set

of recipes to build long-lasting assemblies or reinforce causal

chains in processing. Although earlier work focused on syn-

chrony (with a ± 50 ms temporal contiguity window (e.g., Baranyi

and Feher, 1981), the refinement of in vitro techniques showed

Figure 4. Canonical Circuits(A) Lateral inhibition (illustration of center-surround interactions, taken from Cavanaugh et al., 2002). The ratio of Gaussians (RoG) model is constructed fromindependent and spatially stable center and surround components with two Gaussian envelopes of different spatial spread (s).(B) Major types (adapted from Isaacson and Scanziani, 2011) of intracortical inhibition: left, feedback; right, feedforward. Inhibitory cell in blue (round soma);excitatory cell in gray (pyramidal soma). Red axon emitted by the postsynaptic target cell (left, feedback) or by a common excitatory drive (right, feedforward).(C) Sensory-evoked feedforward inhibition (FFI): inhibition enforces precise spike timing in a principal cell in auditory cortex (adapted fromWehr and Zador, 2005):top, timing of action potentials; middle, subthreshold membrane potential; bottom, underlying excitatory and inhibitory synaptic conductances. Action potentialslargely occur only in the narrow time window during which excitation precedes inhibition.(D) Visual cortex of the rat (Liu et al., 2010). Stimulus selectivity in the rodent cortex emerges from a temporal shift in the timing of excitation relative to inhibition.(E) Push-pull in cat, ferret, and higher mammals (adapted from Troyer et al., 1998 and Jens Kremkow, with permission): prototypic recurrent network model oflayer 4 in the mammalian visual cortex ‘‘V1’’ with correlation-based connectivity implementing the push-pull receptive field organization. Inputs from the LGNprovide direct excitatory (push). In cat V1, inhibitory neurons project preferentially to neurons having a receptive field phase difference of around 180�, effectivelyimplementing the pull inhibition. Note also the intracortical reciprocal inhibition between inhibitory I1 and I2 neurons and the intracortical excitatory amplificationfor E1 and E2 neurons.(F) Generalized forms of push-pull architectures: the push-pull organization demonstrated for spatial phase can be hypothetically generalized to other dimensionsthan space such as orientation and direction (Monier et al., 2003, 2008).

Neuron

Perspective

that the temporal order of pre- and postsynaptic spikes is critical

for the sign of the synaptic change. The experimentally defined

Spike-Timing-Dependent Plasticity (STDP+) rule that is used in

most computational models is a modified Hebbian rule

(Figure 6A) that includes a potentiation part when the postsyn-

aptic spike follows shortly the presynaptic spike (necessary for

strengthening causal links) and a depression part for the reverse

order association (necessary for network stability) (Bi and Poo,

1998; Markram et al., 1997). However, its ubiquity remains

debated since other forms of STDP have also been observed,

where the ‘‘causal’’ pre/ post spike sequence induces depres-

sion and the reversed sequence (post / pre, or ‘‘anti-causal’’)

induces potentiation (Bell et al., 1997; Fino and Venance, 2011;

Letzkus et al., 2006; Safo and Regehr, 2005) (Figure 6A). There

also exist synapses with symmetric STDP between layer IV spiny

stellate cells in layer 4 of S1 cortex (Egger et al., 1999).

Interestingly, similar STDP rules have been proposed by some

daring theoreticians (Von der Malsburg, 1981) to operate also on

fast timescales (tens ofmilliseconds), compatible with the waxing

andwaning of a percept. The fast reversible formof STDP that re-

lates the best to perception is anti-Hebbian (STDP�). It has been

described in vivo and in vitro in the electrosensory lobe (ELL) of

the mormyrid electric fish (Bell et al., 1997) and should not be

confounded with classical LTD, in view of the fast time constant

and the self-erasing feature of negative STDP (Figure 6A).

The originality of the STDP� rule as a decorrelation algorithm

reducing input redundancy was theoretically envisioned first in

the context of the visual system (Barlow and Foldiak, 1989). But

it was first demonstrated experimentally in the ELL of the fish

(Bell et al., 1997) in neurons that compare the current sensory

input (electric field image encoded by electrosensory afferents

to the ELL) and the efference copy signal (originating in the electric


Figure 5. The ‘‘Perceptual Association Field’’(A) Top: ‘‘pop-out’’ emergence of a continuous integration path in a sea of randomly oriented Gabor patches (Field et al., 1993). Bottom: facilitation of detection ofa low-contrast vertical Gabor element induced by the simultaneous presentation of co-aligned high contrast flanker elements (Polat and Sagi, 1993).(B) Hypothetical association field induced by an oriented element through lateral interactions (Field et al., 1993).(C) The ‘‘iso-functionalbinding’’ hypothesis (adapted fromGilbert andLi, 2012). An individual superficial layer cortical pyramidal cell forms long-rangeconnections thatextend many millimeters parallel to the cortical surface. Long-range connections (> 500 mm from the injection center) tend to link columns of similar orientationpreference.(D) The ‘‘neural facilitation field’’ (Gilbert and Li, 2012). Left: the responses of V1 neurons are amplified in the awake behaving monkey by collinear contoursextending outside the RF. Introducing a cross-oriented bar between the collinear segments blocks the contour-related facilitation. Right: two-dimensional map offacilitatory (blue) and inhibitory (red) modulation of the response to an optimally oriented line segment centered in the RF (horizontal white bar). The spikingmodulation is suppressed by anesthesia.

Neuron

Perspective

fish from premotor neurons of the electric organ). The efference

copy updates the integration of the new sensory input by sub-

tracting the previous sensory reafference. The resulting functional

impact is to depress the transmission of inputs whose occurrence

reflects past evoked discharge and to strengthen unexpected in-

formation arising from the environment (Bell, 1981). The general-

ization of this fast-acting adaptive rule to other sensory systems

during sensory exploration suggests a plausible model for the

cortical the visuo-oculomotor system of higher mammals. One

can envision (as developed in Figures 6B and 6C) that an effer-

ence copy generated by saccadic oculomotor activity planning

in higher mammals (Crapse and Sommer, 2008) filters out in the

early visual system the predictable retinal changes due to

voluntary movements. This would imply that synapses onto thal-

amus and perigeniculate cells, which convey the contextual


feedback or prediction from the cortex, obey STDP� rules acting

on the timescale of a percept (100 ms), which remains to be

demonstrated. Since re-entrant cortico-thalamic loops are

shared by many sensory systems (review in Briggs and Usrey,

2008) this idea could extend to other sensory modalities.

Together, the examples of connectivity schemes and possible

diversity of fast plasticity algorithms presented above show the

difficulty of assigning particular circuit motifs to every receptive

field properties or operators (e.g., feedforward ON-OFF kernels

[Vidyasagar and Eysel, 2015], lateral diffusion kernels, or cor-

tico-thalamic feedback kernels) or even to the non-stationarities

and context dependencies of receptive fields asmentioned earlier

(long-range lateral connections). The same circuit can have facil-

itatory or depressive effect depending on the considered time-

scale and sensory context. What makes it difficult to extend our

A

C

B Figure 6. Plasticity Algorithms andPerceptual Filtering(A) Polymorphy of associative plasticity algorithms(adapted from Fregnac, 2003). The left panel de-tails different forms of spike timing-dependentplasticity rules established in vitro (co-cultures andacute slices). The induced synaptic change is ex-pressed as a function of the temporal delayseparating postsynaptic firing from presynapticfiring (taken here as the synchrony reference),imposed during the pre-post pairing protocol.From top to bottom: pyramidal cells in hippo-campus or in non-granular layers in neocortex,granular spiny stellate cells in neocortex,GABAergic neurons in hippocampal cultures,GABAergic medium ganglionic layer cells in theelectrosensory lobe (ELL) of the mormyrid electricfish. Synaptic potentiation in pink, depression inblue.(B and C) Prediction of negative STDP in feedbackprojections from higher areas in the early visualsystem. (B) Perceptual filtering in the ELL of themormyrid electric fish at the level of the largePurkinje cells of the ELL, by the efferent copyoriginating from the torus circularis nucleus, pre-motor to the electrical discharge center (EOD). (C)Hypothetical predictive filtering in the early visualsystem. Negative STDP, similar to that describedin the ELL, is expected synapses that project thecortical context feedback back to the sensorythalamic gate. The cortical prediction is comparedto what is received from the retina, resulting in thecancellation of expected correlations.

Neuron

Perspective

understanding of the circuit bases of perception is, first, that the

algorithmic principles of fast reversible plasticity are not fully eluci-

dated (see above), and second, that a large diversity of circuit

motif and mechanisms exists across senses and species.

We have seen that certain circuit principles (e.g., lateral inhibi-

tion) depend on the precise, topographic organization of the en-

coded features. Examples of topographically organized cortical

modules exist in various species and sensory modalities: the

whisker barrels in rodents, the visual hypercolumns in higher

mammals, as well as the blob and interblob patches in the visual

cortex of primates. Although their role is not fully elucidated (Hor-

tonandAdams, 2005), the existenceof thesemodulesmay reflect

the need to organize large flows of information in away that eases

the application of relational rules between elements of the unisen-

sory scene and extract invariant representations (Kavukcuoglu

et al., 2009). To take one concrete example, rodents that are often

nocturnal or extensively travel in underground tunnels (like rats or

mice) require well-structured tactile perception through their

whisker pads. However, what is required for haptic perception

does not seem to be needed for vision in rodents: accordingly,

even if neuronsof thevisual cortexof the rodent exhibit orientation

tuning, they are not spatially clustered according to their orienta-

tion preference (Ohki et al., 2006). This ‘‘salt and pepper’’ organi-

zationof V1 in the rodent contrastswith thecontinuousorientation

preference domains observed in diurnal predators such as cats

andnon-humanprimates. Thisdifference in architecturemaypre-

serve some general principles of network organization between

rodents and more visual mammals with orientation preference

maps. For example, increased connectivity probability between

cells with similar feature selectivity exist both in mice (Cossell

et al., 2015) and in cats or ferrets on the scale of an hypercolumn

(but see above for species-specific constraints for longer-dis-

tance connectivity). But inmice, due to the salt-and-pepper orga-

nization, feedforward inhibition colocalized with excitation domi-

nates in contrast to the spatial segregation of excitatory and

inhibitory sources mediating the push-pull organization of tha-

lamo-recipient RF in layer 4 in the cat and ferret and layer 4b in

monkey V1 (Figures 5D–5F). In mice, interneurons receive inputs

from cells with different orientation tuning (Bock et al., 2011) and

therefore control the global response gain rather than the sharp-

ness of orientation selectivity (Atallah et al., 2012), which is poor

in mouse cortex and mostly arises from thalamic inputs (Lien

and Scanziani, 2013). This interspecies shift in microcircuit archi-

tecture,where functional selectivitymaybe already established in

thalamus for rodents, while requiring cortex for highermammal, is

reflected at the computational level, considering for example the

much higher orientation detection performance in cats and pri-

mates than in mice (Andermann et al., 2010).

Along the same line, long-distance projections of cortical

neurons may also serve very different functions in rodents

compared to higher mammals. Recent connectomics studies

show that, in the rodent cortex, axons fromprimary sensory areas

can massively reach motor or decisional areas and vice versa

(Zingg et al., 2014). In ferret, cat, and tree shrew, long-range


Neuron

Perspective

connections prevalently intrinsic to the same primary area, e.g.,

V1 for vision, are superseded by a stronger involvement of cor-

tico-cortical feedback in non-human primates (Angelucci et al.,

2002) and in humans. In primates, the reach of long-distance

axons is limited to the representation of only a few degrees in

foveal vision, and cortico-cortical loops may be required to go

beyond this spatial diffusion boundary. The selective pressure

produced by the refined grain of the retinotopic map in primates

(resulting in an increase in retino-cortical magnification factor)

and the relative size increase of cortical areas may render very

long-range intrinsic horizontal axons ineffective (possibly too

slow, toodiluted, or tooexpensive) tomediate long-distance cen-

ter-surround interactions in the visual field. Therefore, these

lateral interactions are established in primates via higher areas

at the expense of intra-V1 connectivity.Finally, interspecies differences also exist for the circuits pro-

cessing multisensory information. In primates, very few neurons

project directly from one primary sensory area to another (Falch-

ier et al., 2002), and multimodal responses remain subthreshold

except during precise spatial or temporal coincidence (Stein and

Stanford, 2008). In rodents, cross-modal connectivity is strong

across primary sensory areas (Laramee and Boire, 2014) and

leads to suprathreshold responses (Olcese et al., 2013). Hence,

the classical view of segregated feedforward streams of unimo-

dal sensory information before associative areas is mostly appli-

cable to higher vertebrates and not so much to the rodent. In the

rodent, multimodal interactions can reshape unisensory repre-

sentations already in primary cortical areas, potentially building

cross-modal associations with a lower level of complexity than

associations between sensory modalities made by primates.

These examples together show that it is unavoidable to

choose a comparative approach in the search for the archetypal

circuits of perception and discriminate, based on structural and

ecological evidence, the common principles from the circuit mo-

tifs that reflect a particular use of a sensory modality.

ConclusionThis overview illustrates the complexity of explaining perceptual

processes in terms of realistic neural-based architecture and

rules. A bottom-up strategy on its own seems doomed to fail;

conceptual approaches must be developed to reduce structural

complexity (see Box 2).

Current observations and models suggest that the psycho-

logical laws of visual ‘‘pop-out’’ non-attentive perception rely

principally on the internal architecture of primary sensory areas

expressed through recurrent and horizontal connectivity and

probably shaped by correlation-based associative synaptic

plasticity. This architecture would enable the self-organization

and propagation of activity that ultimately facilitate plausible

feature associations. The rewiring of horizontal connectivity in

auditory cortex of the developing ferret induced by the forced re-

innervation by visual afferents (Sharma et al., 2000) after lesion of

the normal subcortical afferent pathway, and turning it function-

ally in a visual cortex, is today the most impressive demonstra-

tion of the versatility of sensory cortical modules.

Experimentally, the technical possibility of recording simulta-

neously large assemblies of functionally identified neurons with

millisecond precision opens a new era. On the theoretical front,


many paths exist toward modeling the sensory brain, even in

cases as seemingly simple as unimodal, low-level perception

and form recognition in the absence of behavioral attention.

Others argue for ‘‘top-down’’ approaches, via the abstract math-

ematical modeling of network-scale cortico-cortical and cortico-

thalamo-cortical loop architectures (Bastos et al., 2012; Mum-

ford, 1991). They often describe the brain as aBayesian inference

device. This approachworks well to describe the computation (in

Marr’s sense), but it fails to identify algorithms and underlying

circuits. What is missing is a ‘‘middle-out’’ approach that can

identify plausible models to link architecture and cognition.

Large-scale brain initiatives (The Allen Institute, The Human

Brain Project) are currently developing data collection pipelines

and computational infrastructures mostly driven by a bottom-

up approach. This conceptual ‘‘prior’’ takes its ground in the

‘‘cortical column’’ ideation simulated in the Blue Brain Project

(Markram, 2012). It runs on the assumption that a detailed ‘‘bot-

tom-up’’ reconstruction and exhaustive simulation of neuronal

elements will eventually reveal canonical microcircuits, from

which the laws of perception will emerge. In this latter formula-

tion, these circuits are seen as innate building blocks of knowl-

edge for perception that are combined during learning to form

complex Lego constructs composed in such a way as to engram

new memories (Markram and Perin, 2011). The merit of this bold

approach in the rodent is that paradigms developed with pri-

mates (see the inspiring work of Bill Newsome and colleagues

that linked neurometric and psychometric perceptual measures

[Britten et al., 1992]) are now being applied to rodents with

measurable success (Bathellier et al., 2012; Musall et al.,

2014). But to avoid the trap of nested complexity, it remains

necessary to extract generic principles (Marr’s algorithmic level)

and obtain modeling simplifications. In this vein, one should be

careful not to force homologies (between distant species) that

may not always exist, and one should take advantage of the di-

versity of experimental systems, many of which have been

driving the field: the cat visual and somatosensory cortex for vi-

sual perception and early associative memory storage (Spinelli

and Jensen, 1979), the rat hippocampus and entorhinal cortex

for navigation-related computation (Moser and Moser, 2013;

O’Keefe and Recce, 1993), the electric fish electrosensory lobe

for perceptual filtering and novelty detection (Bell, 1981), the in-

sect olfactory system for associative learning of dynamic sen-

sory representations (Cassenaer and Laurent, 2012). We remain

convinced of the importance of ‘‘simpler’’ organisms to discover

canonical principles. ‘‘New directions in Science are launched by

new tools much more often than by new concepts’’ (Dyson,

1997). The development of optogenetics (mostly in rodents)

and its undeniable usefulness has triggered an historical change

in strategy in the study of sensory perception. The pressure to

standardize and the attraction of ‘‘big data’’ have led, too quickly

in our view, to a loss of perspective on diversity and evolution.

In this regard, the choice of the mouse model to study

mammalian vision seems odd to us. For example, the otherwise

laudable goal to industrialize and standardize data collection

could lead to a great impoverishment of visual protocols if one

is not careful enough. Recent studies show that themouse visual

system seems to depend heavily on locomotion (Niell and

Stryker, 2008), and the generalized use of ‘‘running-on-a-ball’’

Neuron

Perspective

paradigms have set a new sensory-motor behavior standard, far

from natural exploration behavior (Wallace et al., 2013). By repli-

cating pioneering experiments showing that active exploration is

needed to induce visually guided reaching behavior in kittens

deprived of vision (Hein and Held, 1967), Stryker and colleagues

identified in the rodent a specific ascending circuit, driven by

locomotion, and controlling sensory-induced plasticity. Howev-

er, if the whole-body locomotion seems needed in rodents, eye

movements and their proprioceptive inflow suffice in higher

mammals (Buisseret et al., 1978). In general, no report of a tight

relation between locomotion and vision has been confirmed in

cats, ferrets, and non-human primates in which primary sensory

and motor cortices are separated by numerous secondary and

associative areas (Markov et al., 2013). Hence, mouse vision

might acutely illustrate the species specificity of sensory-guided

behavior (see section 6).

Comparative studies show that primate brains are not simply

inflated versions of rodent brains. As underlined by Anthony

Movshon (Movshon, 2013), typical long-distance axons in ro-

dents do not only remain within their cortical area of origin as

in the ferret or the cat and rather tend to link multiple areas, sen-

sory, limbic, and motor. If long-range connections underlie our

‘‘perceptual grammar,’’ mice may actually have a very different

perceptual language than higher mammals.

But if, as we argue, the choice of rodents as the reference

model may be a deceiving alley for studying visual processing,

it may come at its advantage when searching for mechanisms

responsible for olfaction, tactile sensing, or multimodal integra-

tion. The reduced size of the computational sheet makes that

the higher visual cortical areas of the mouse abut directly other

primary areas such as S1 and A1, with their interfacing border

constituting an ideal site for multimodal integration (Laramee

and Boire, 2014). This strong connectivity results in suprathres-

hold multimodal interaction even in mouse primary sensory

cortex (Olcese et al., 2013), whereas such influences are sub-

threshold in primates (Stein and Stanford, 2008). In that respect,

the mouse may offer interesting opportunities to understand the

functional significance of heteromodal influences in primary

areas, otherwise present but silent in primates.

Hence, future progress in data-driven modeling of unimodal

and multimodal perception will depend on a judicious choice

of experimental models, on the simultaneous acquisition of neu-

ral data at many spatial and temporal scales, and on the defini-

tion of naturalistic sensory input benchmarks that meaningfully

constrain quantitative data-driven simulations. These goals

emphasize the inescapable roles of well-designed experiments

and of comparative approaches.

ACKNOWLEDGMENTS

We would like to thank Gilles Laurent, whose expert help has helped us toimprove the clarity and focus of this review. We would like to thank also KirstyGrant for her expert views on anti-Hebbian plasticity, and our colleagues CyrilMonier, Daniel Shulz, and Florian Gerard-Mercier for helpful comments. Wealso would like to extend our thanks to the participants of the NYU workshopon canonical plasticity held at La Pietra Villa in July 2015, where some of theseviews were discussed. We acknowledge support from CNRS, The Paris-Sa-clay IDEX (NeuroSaclay and I-Code), the French National Research Agency(ANR: NatStats Sensemaker and Complex-V1), the European Community(FET-Bio-I3 integrated programs IP FP6: FACETS [015879], IP FP7: BRAIN-

SCALES [269921]; Marie Curie CIG [334581]; FET-Open Brain-i-nets[243914]; FET-Flagship: The Human Brain Project [604102]), The InternationalHuman Frontier Science Program Organization (CDA-0064-2015), and theFyssen foundation.

REFERENCES

Abboud, S., Maidenbaum, S., Dehaene, S., and Amedi, A. (2015). A number-form area in the blind. Nat. Commun 6, 6026.

Abeles, M. (1991). Corticonics neural circuits of the cerebral cortex (Cam-bridge University Press).

Adelson, E.H., and Bergen, J.R. (1985). Spatiotemporal energy models for theperception of motion. J. Opt. Soc. Am. A 2, 284–299.

Alonso, J.M., and Swadlow, H.A. (2005). Thalamocortical specificity and thesynthesis of sensory cortical receptive fields. J. Neurophysiol. 94, 26–32.

Andermann, M.L., Kerlin, A.M., and Reid, R.C. (2010). Chronic cellular imagingof mouse visual cortex during operant behavior and passive viewing. Front.Cell. Neurosci. 4, 3.

Angelaki, D.E., Gu, Y., and Deangelis, G.C. (2011). Visual and vestibular cueintegration for heading perception in extrastriate visual cortex. J. Physiol.589, 825–833.

Angelucci, A., Levitt, J.B., and Lund, J.S. (2002). Anatomical origins of the clas-sical receptive field and modulatory surround field of single neurons in ma-caque visual cortical area V1. Prog. Brain Res. 136, 373–388.

Atallah, B.V., Bruns, W., Carandini, M., and Scanziani, M. (2012). Parvalbumin-expressing interneurons linearly transform cortical responses to visual stimuli.Neuron 73, 159–170.

Ayzenshtat, I., Meirovithz, E., Edelman, H., Werner-Reiss, U., Bienenstock, E.,Abeles, M., and Slovin, H. (2010). Precise spatiotemporal patterns among vi-sual cortical areas and their relation to visual stimulus processing.J. Neurosci. 30, 11232–11245.

Baranyi, A., and Feher, O. (1981). Synaptic facilitation requires paired activa-tion of convergent pathways in the neocortex. Nature 290, 413–415.

Barlow, H.B. (1972). Single units and sensation: a neuron doctrine for percep-tual psychology? Perception 1, 371–394.

Barlow, H.B., and Foldiak, P. (1989). Adaptationa and decorrelation in the cor-tex. In The Computing Neuron, R. Durnin, C. Miall, and G. Mitchison, eds. (Ad-dison-Wesley), pp. 54–72.

Bastos, A.M., Usrey, W.M., Adams, R.A., Mangun, G.R., Fries, P., and Friston,K.J. (2012). Canonical microcircuits for predictive coding. Neuron 76,695–711.

Bathellier, B., Buhl, D.L., Accolla, R., and Carleton, A. (2008). Dynamicensemble odor coding in the mammalian olfactory bulb: sensory informationat different timescales. Neuron 57, 586–598.

Bathellier, B., Ushakova, L., and Rumpel, S. (2012). Discrete neocortical dy-namics predict behavioral categorization of sounds. Neuron 76, 435–449.

Baudot, P., Levy, M., Marre, O., Monier, C., Pananceau, M., and Fregnac, Y.(2013). Animation of natural scene by virtual eye-movements evokes high pre-cision and low noise in V1 neurons. Front. Neural Circuits 7, 206.

Bell, C.C. (1981). An efference copy which is modified by reafferent input. Sci-ence 214, 450–453.

Bell, C.C., Han, V.Z., Sugawara, Y., and Grant, K. (1997). Synaptic plasticity ina cerebellum-like structure depends on temporal order. Nature 387, 278–281.

Benucci, A., Frazor, R.A., and Carandini, M. (2007). Standing waves and trav-eling waves distinguish two circuits in visual cortex. Neuron 55, 103–117.

Bi, G.Q., and Poo, M.M. (1998). Synaptic modifications in cultured hippocam-pal neurons: dependence on spike timing, synaptic strength, and postsynapticcell type. J. Neurosci. 18, 10464–10472.


http://refhub.elsevier.com/S0896-6273(15)00830-2/sref162



















































Neuron

Perspective

Bienenstock, E.L., Cooper, L.N., andMunro, P.W. (1982). Theory for the devel-opment of neuron selectivity: orientation specificity and binocular interaction invisual cortex. J. Neurosci. 2, 32–48.

Blanke, O. (2012). Multisensory brain mechanisms of bodily self-conscious-ness. Nat. Rev. Neurosci. 13, 556–571.

Blanke, O., Slater, M., and Serino, A. (2015). Behavioral, Neural, and Computa-tional Principles of Bodily Self-Consciousness. Neuron 88, this issue, 145–166.

Bock, D.D., Lee, W.C., Kerlin, A.M., Andermann, M.L., Hood, G., Wetzel, A.W.,Yurgenson, S., Soucy, E.R., Kim, H.S., and Reid, R.C. (2011). Network anat-omy and in vivo physiology of visual cortical neurons. Nature 471, 177–182.

Bonath, B., Noesselt, T., Krauel, K., Tyll, S., Tempelmann, C., and Hillyard, S.A.(2014). Audio-visual synchronymodulates the ventriloquist illusion and its neu-ral/spatial representation in the auditory cortex. Neuroimage 98, 425–434.

Bosking, W.H., Zhang, Y., Schofield, B., and Fitzpatrick, D. (1997). Orientationselectivity and the arrangement of horizontal connections in tree shrew striatecortex. J. Neurosci. 17, 2112–2127.

Botvinick, M., and Cohen, J. (1998). Rubber hands ‘feel’ touch that eyes see.Nature 391, 756.

Briggs, F., and Usrey, W.M. (2008). Emerging views of corticothalamic func-tion. Curr. Opin. Neurobiol. 18, 403–407.

Bringuier, V., Chavane, F., Glaeser, L., and Fregnac, Y. (1999). Horizontal prop-agation of visual activity in the synaptic integration field of area 17 neurons.Science 283, 695–699.

Britten, K.H., Shadlen, M.N., Newsome, W.T., and Movshon, J.A. (1992). Theanalysis of visual motion: a comparison of neuronal and psychophysical per-formance. J. Neurosci. 12, 4745–4765.

Buisseret, P., Gary-Bobo, E., and Imbert, M. (1978). Ocular motility and recov-ery of orientational properties of visual cortical neurones in dark-reared kittens.Nature 272, 816–817.

Carandini, M., and Heeger, D.J. (2012). Normalization as a canonical neuralcomputation. Nat. Rev. Neurosci. 13, 51–62.

Carandini, M., Demb, J.B., Mante, V., Tolhurst, D.J., Dan, Y., Olshausen, B.A.,Gallant, J.L., and Rust, N.C. (2005). Do we know what the early visual systemdoes? J. Neurosci. 25, 10577–10597.

Cassenaer, S., and Laurent, G. (2012). Conditional modulation of spike-timing-dependent plasticity for olfactory learning. Nature 482, 47–52.

Cavanaugh, J.R., Bair, W., andMovshon, J.A. (2002). Nature and interaction ofsignals from the receptive field center and surround in macaque V1 neurons.J. Neurophysiol. 88, 2530–2546.

Chavane, F., Sharon, D., Jancke, D., Marre, O., Fregnac, Y., and Grinvald, A.(2011). Lateral Spread of Orientation Selectivity in V1 is Controlled by Intracort-ical Cooperativity. Front. Syst. Neurosci 5, 4.

Chen, L.M., Friedman, R.M., and Roe, A.W. (2003). Optical imaging of a tactileillusion in area 3b of the primary somatosensory cortex. Science 302, 881–885.

Cossell, L., Iacaruso, M.F., Muir, D.R., Houlton, R., Sader, E.N., Ko, H., Hofer,S.B., and Mrsic-Flogel, T.D. (2015). Functional organization of excitatory syn-aptic strength in primary visual cortex. Nature 518, 399–403.

Crapse, T.B., and Sommer, M.A. (2008). Corollary discharge circuits in the pri-mate brain. Curr. Opin. Neurobiol. 18, 552–557.

Cury, K.M., and Uchida, N. (2010). Robust odor coding via inhalation-coupledtransient activity in the mammalian olfactory bulb. Neuron 68, 570–585.

David, S.V., Vinje, W.E., and Gallant, J.L. (2004). Natural stimulus statisticsalter the receptive field structure of v1 neurons. J. Neurosci. 24, 6991–7006.

De Weerd, P., Desimone, R., and Ungerleider, L.G. (1996). Cue-dependentdeficits in grating orientation discrimination after V4 lesions in macaques.Vis. Neurosci. 13, 529–538.

Diaz-Caneja, E. (1928). Sur l’alternance binoculaire. Annales d’oculistique(Paris) 165, 721–731.


DiCarlo, J.J., Zoccolan, D., and Rust, N.C. (2012). How does the brain solvevisual object recognition? Neuron 73, 415–434.

Dyson, F. (1997). Imagined Worlds (Harvard University Press).

Eagleman, D.M. (2001). Visual illusions and neurobiology. Nat. Rev. Neurosci.2, 920–926.

Egger, V., Feldmeyer, D., and Sakmann, B. (1999). Coincidence detection andchanges of synaptic efficacy in spiny stellate neurons in rat barrel cortex. Nat.Neurosci. 2, 1098–1105.

Estebanez, L., El Boustani, S., Destexhe, A., and Shulz, D.E. (2012). Correlatedinput reveals coexisting coding schemes in a sensory cortex. Nat. Neurosci.15, 1691–1699.

Falchier, A., Clavagnier, S., Barone, P., and Kennedy, H. (2002). Anatomicalevidence of multimodal integration in primate striate cortex. J. Neurosci. 22,5749–5759.

Field, D.J., Hayes, A., and Hess, R.F. (1993). Contour integration by the humanvisual system: evidence for a local ‘‘association field’’. Vision Res. 33, 173–193.

Fino, E., and Venance, L. (2011). Spike-timing dependent plasticity in striatalinterneurons. Neuropharmacology 60, 780–788.

Fournier, J., Monier, C., Pananceau, M., and Fregnac, Y. (2011). Adaptation ofthe simple or complex nature of V1 receptive fields to visual statistics. Nat.Neurosci. 14, 1053–1060.

Fournier, J., Monier, C., Levy, M., Marre, O., Sari, K., Kisvarday, Z.F., and Fre-gnac, Y. (2014). Hidden complexity of synaptic receptive fields in cat V1.J. Neurosci. 34, 5515–5528.

Fregnac, Y. (2003). Hebbian Plasticity. In The Handbook of Brain Theory andNeural Networks, M.A. Arbib, ed. (MIT Press), pp. 515–522.

Fregnac, Y. (2012). Reading out the synaptic echoes of low level perception inV1. Lect. Notes Comput. Sci. 7583, 486–495.

Friedrich, R.W., and Laurent, G. (2001). Dynamic optimization of odor repre-sentations by slow temporal patterning of mitral cell activity. Science 291,889–894.

Gilbert, C.D., and Li, W. (2012). Adult visual cortical plasticity. Neuron 75,250–264.

Gregory, R.L. (1997). Knowledge in perception and illusion. Philos. Trans. R.Soc. Lond. B Biol. Sci. 352, 1121–1127.

Grossberg, S., Yazdanbakhsh, A., Cao, Y., and Swaminathan, G. (2008). Howdoes binocular rivalry emerge from cortical mechanisms of 3-D vision? VisionRes. 48, 2232–2250.

Haddad, R., Lanjuin, A., Madisen, L., Zeng, H., Murthy, V.N., and Uchida, N.(2013). Olfactory cortical neurons read out a relative time code in the olfactorybulb. Nat. Neurosci. 16, 949–957.

Haider, B., Krause, M.R., Duque, A., Yu, Y., Touryan, J., Mazer, J.A., and Mc-Cormick, D.A. (2010). Synaptic and network mechanisms of sparse and reli-able visual cortical activity during nonclassical receptive field stimulation.Neuron 65, 107–121.

Hartline, H.K. (1938). The response of single optic nerve fibers of the vertebrateeye to illumination of the retina. Am. J. Physiol. 121, 400–415.

Hein, A., and Held, R. (1967). Dissociation of the visual placing response intoelicited and guided components. Science 158, 390–392.

Held, R., Ostrovsky, Y., de Gelder, B., Gandhi, T., Ganesh, S., Mathur, U., andSinha, P. (2011). The newly sighted fail to match seen with felt. Nat. Neurosci.14, 551–553.

Horton, J.C., and Adams, D.L. (2005). The cortical column: a structure withouta function. Philos. Trans. R. Soc. Lond. B Biol. Sci. 360, 837–862.

Hosoya, T., Baccus, S.A., and Meister, M. (2005). Dynamic predictive codingby the retina. Nature 436, 71–77.

Huxter, J., Burgess, N., and O’Keefe, J. (2003). Independent rate and temporalcoding in hippocampal pyramidal cells. Nature 425, 828–832.























































































































Neuron

Perspective

Isaacson, J.S., and Scanziani, M. (2011). How inhibition shapes cortical activ-ity. Neuron 72, 231–243.

Jack, C.E., and Thurlow, W.R. (1973). Effects of degree of visual associationand angle of displacement on the ‘‘ventriloquism’’ effect. Percept. Mot. Skills37, 967–979.

Jancke, D., Chavane, F., Naaman, S., and Grinvald, A. (2004). Imaging corticalcorrelates of illusion in early visual cortex. Nature 428, 423–426.

Kavukcuoglu, K., Ranzato, M.A., Fergus, R., and LeCun, Y. (2009). Learninginvariant features through topographic filter maps. IEEE Conference on Com-puter Vision and Pattern Recognition, 1605–1612.

Kayser, C., Logothetis, N.K., and Panzeri, S. (2010). Visual enhancement of theinformation representation in auditory cortex. Curr. Biol. 20, 19–24.

Koenderink, J.J. (2012). Geometry of imaginary spaces. J. Physiol. Paris 106,173–182.

Laramee, M.E., and Boire, D. (2014). Visual cortical areas of the mouse: com-parison of parcellation and network structure with primates. Front. Neural Cir-cuits 8, 149.

Laurent, G. (2002). Olfactory network dynamics and the coding of multidimen-sional signals. Nat. Rev. Neurosci. 3, 884–895.

LeCun, Y., Bengio, Y., and Hinton, G. (2015). Deep learning. Nature 521,436–444.

Lee, T.S., and Nguyen, M. (2001). Dynamics of subjective contour formation inthe early visual cortex. Proc. Natl. Acad. Sci. USA 98, 1907–1911.

Lemus, L., Hernandez, A., Luna, R., Zainos, A., and Romo, R. (2010). Do sen-sory cortices processmore than one sensory modality during perceptual judg-ments? Neuron 67, 335–348.

Leopold, D.A., and Logothetis, N.K. (1996). Activity changes in early visual cor-tex reflect monkeys’ percepts during binocular rivalry. Nature 379, 549–553.

Letzkus, J.J., Kampa, B.M., and Stuart, G.J. (2006). Learning rules for spiketiming-dependent plasticity depend on dendritic synapse location.J. Neurosci. 26, 10420–10429.

Li, W., Piech, V., and Gilbert, C.D. (2006). Contour saliency in primary visualcortex. Neuron 50, 951–962.

Lien, A.D., and Scanziani, M. (2013). Tuned thalamic excitation is amplified byvisual cortical circuits. Nat. Neurosci. 16, 1315–1323.

Liu, B.H., Li, P., Sun, Y.J., Li, Y.T., Zhang, L.I., and Tao, H.W. (2010). Inter-vening inhibition underlies simple-cell receptive field structure in visual cortex.Nat. Neurosci. 13, 89–96.

London, M., Roth, A., Beeren, L., Hausser, M., and Latham, P.E. (2010). Sensi-tivity to perturbations in vivo implies high noise and suggests rate coding incortex. Nature 466, 123–127.

Luczak, A., Bartho, P., and Harris, K.D. (2009). Spontaneous events outline therealm of possible sensory responses in neocortical populations. Neuron 62,413–425.

Machens, C.K.,Wehr, M.S., and Zador, A.M. (2004). Linearity of cortical recep-tive fields measured with natural sounds. J. Neurosci. 24, 1089–1100.

Mante, V., Bonin, V., and Carandini, M. (2008). Functional mechanismsshaping lateral geniculate responses to artificial and natural stimuli. Neuron58, 625–638.

Markov, N.T., Ercsey-Ravasz, M., Van Essen, D.C., Knoblauch, K., Toroczkai,Z., and Kennedy, H. (2013). Cortical high-density counterstream architectures.Science 342, 1238406.

Markram, H. (2012). The human brain project. Sci. Am. 306, 50–55.

Markram, H., and Perin, R. (2011). Innate neural assemblies for lego memory.Front. Neural Circuits 5, 6.

Markram, H., Lubke, J., Frotscher, M., and Sakmann, B. (1997). Regulation ofsynaptic efficacy by coincidence of postsynaptic APs and EPSPs. Science275, 213–215.

Marr, D. (1982). Vision: A Computational Investigation into the Human Repre-sentation and Processing of Visual Information (Freeman).

Martin, K.A., Roth, S., and Rusch, E.S. (2014). Superficial layer pyramidal cellscommunicate heterogeneously between multiple functional domains of catprimary visual cortex. Nat. Commun. 5, 5252.

McFarland, J.M., Cui, Y., and Butts, D.A. (2013). Inferring nonlinear neuronalcomputation based on physiologically plausible inputs. PLoS Comput. Biol.9, e1003143.

McGurk, H., and MacDonald, J. (1976). Hearing lips and seeing voices. Nature264, 746–748.

Mendola, J.D., Dale, A.M., Fischl, B., Liu, A.K., and Tootell, R.B. (1999). Therepresentation of illusory and real contours in human cortical visual areas re-vealed by functional magnetic resonance imaging. J. Neurosci. 19, 8560–8572.

Miikkulainen, R., Bednar, J., Choe, Y., and Sirosh, J. (2006). ComputationalMaps in the Visual Cortex (Springer).

Miller, S.M., Liu, G.B., Ngo, T.T., Hooper, G., Riek, S., Carson, R.G., and Petti-grew, J.D. (2000). Interhemispheric switching mediates perceptual rivalry.Curr. Biol. 10, 383–392.

Miura, K., Mainen, Z.F., and Uchida, N. (2012). Odor representations in olfac-tory cortex: distributed rate coding and decorrelated population activity.Neuron 74, 1087–1098.

Monier, C., Chavane, F., Baudot, P., Graham, L.J., and Fregnac, Y. (2003).Orientation and direction selectivity of synaptic inputs in visual cortical neu-rons: a diversity of combinations produces spike tuning. Neuron 37, 663–680.

Monier, C., Fournier, J., and Fregnac, Y. (2008). In vitro and in vivomeasures ofevoked excitatory and inhibitory conductance dynamics in sensory cortices.J. Neurosci. Methods 169, 323–365.

Montaser-Kouhsari, L., Landy, M.S., Heeger, D.J., and Larsson, J. (2007).Orientation-selective adaptation to illusory contours in human visual cortex.J. Neurosci. 27, 2186–2195.

Morel, J.M., Petro, A.B., and Sbert, C. (2010). A PDE formalization of Retinextheory. IEEE Trans. Image Process. 19, 2825–2837.

Moser, E.I., and Moser, M.B. (2013). Grid cells and neural coding in high-endcortices. Neuron 80, 765–774.

Movshon, J.A. (2013). Functional models and functional organisms for sys-tems neuroscience. In Cosyne (Salt Lake City), p. 25.

Mumford, D. (1991). On the computational architecture of the neocortex. I. Therole of the thalamo-cortical loop. Biol. Cybern. 65, 135–145.

Musall, S., von der Behrens, W., Mayrhofer, J.M., Weber, B., Helmchen, F.,and Haiss, F. (2014). Tactile frequency discrimination is enhanced by circum-venting neocortical adaptation. Nat. Neurosci. 17, 1567–1573.

Nauhaus, I., Benucci, A., Carandini, M., and Ringach, D.L. (2008). Neuronalselectivity and local map structure in visual cortex. Neuron 57, 673–679.

Niell, C.M., and Stryker, M.P. (2008). Highly selective receptive fields in mousevisual cortex. J. Neurosci. 28, 7520–7536.

O’Connor, D.H., Hires, S.A., Guo, Z.V., Li, N., Yu, J., Sun, Q.Q., Huber, D., andSvoboda, K. (2013). Neural coding during active somatosensation revealed us-ing illusory touch. Nat. Neurosci. 16, 958–965.

O’Keefe, J., and Recce, M.L. (1993). Phase relationship between hippocampalplace units and the EEG theta rhythm. Hippocampus 3, 317–330.

Ohki, K., Chung, S., Kara, P., Hubener, M., Bonhoeffer, T., and Reid, R.C.(2006). Highly ordered arrangement of single neurons in orientation pinwheels.Nature 442, 925–928.

Olcese, U., Iurilli, G., and Medini, P. (2013). Cellular and synaptic architectureof multisensory integration in the mouse neocortex. Neuron 79, 579–593.

Palacios, A.G., and Bozinovic, F. (2003). An ‘‘enactive’’ approach to integrativeand comparative biology: thoughts on the table. Biol. Res. 36, 101–105.






















































































































Neuron

Perspective

Pan, Y., Chen, M., Yin, J., An, X., Zhang, X., Lu, Y., Gong, H., Li, W., andWang,W. (2012). Equivalent representation of real and illusory contours in macaqueV4. J. Neurosci. 32, 6760–6770.

Pillow, J.W., Shlens, J., Paninski, L., Sher, A., Litke, A.M., Chichilnisky, E.J.,and Simoncelli, E.P. (2008). Spatio-temporal correlations and visual signallingin a complete neuronal population. Nature 454, 995–999.

Polat, U., and Sagi, D. (1993). Lateral interactions between spatial channels:suppression and facilitation revealed by lateral masking experiments. VisionRes. 33, 993–999.

Polat, U., Mizobe, K., Pettet, M.W., Kasamatsu, T., and Norcia, A.M. (1998).Collinear stimuli regulate visual responses depending on cell’s contrastthreshold. Nature 391, 580–584.

Priebe, N.J., and Ferster, D. (2012). Mechanisms of neuronal computation inmammalian visual cortex. Neuron 75, 194–208.

Rabinovich, M., Huerta, R., and Laurent, G. (2008). Neuroscience. Transientdynamics for neural processing. Science 321, 48–50.

Raposo, D., Kaufman, M.T., and Churchland, A.K. (2014). A category-free neu-ral population supports evolving demands during decision-making. Nat. Neu-rosci. 17, 1784–1792.

Ratliff, F. (1965). Mach Bands: Quantitative Studies on Neural Networks in theRetina (Holden-Day).

Reinagel, P., and Reid, R.C. (2000). Temporal coding of visual information inthe thalamus. J. Neurosci. 20, 5392–5400.

Riehle, A., Grun, S., Diesmann, M., and Aertsen, A. (1997). Spike synchroniza-tion and rate modulation differentially involved in motor cortical function. Sci-ence 278, 1950–1953.

Rust, N.C., Schwartz, O., Movshon, J.A., and Simoncelli, E.P. (2005). Spatio-temporal elements of macaque v1 receptive fields. Neuron 46, 945–956.

Rust, N.C., Mante, V., Simoncelli, E.P., and Movshon, J.A. (2006). How MTcells analyze the motion of visual patterns. Nat. Neurosci. 9, 1421–1431.

Safo, P.K., and Regehr, W.G. (2005). Endocannabinoids control the inductionof cerebellar LTD. Neuron 48, 647–659.

Sceniak, M.P., Ringach, D.L., Hawken, M.J., and Shapley, R. (1999). Con-trast’s effect on spatial summation by macaque V1 neurons. Nat. Neurosci.2, 733–739.

Seghier, M.L., and Vuilleumier, P. (2006). Functional neuroimaging findings onthe human perception of illusory contours. Neurosci. Biobehav. Rev. 30,595–612.

Sharma, J., Angelucci, A., and Sur, M. (2000). Induction of visual orientationmodules in auditory cortex. Nature 404, 841–847.

Sharma, J., Dragoi, V., Tenenbaum, J.B., Miller, E.K., and Sur, M. (2003). V1neurons signal acquisition of an internal representation of stimulus location.Science 300, 1758–1763.

Sheth, B.R., Sharma, J., Rao, S.C., and Sur, M. (1996). Orientation maps ofsubjective contours in visual cortex. Science 274, 2110–2115.

Shulz, D.E., and Fregnac, Y. (2010). From sensation to perception. In Encyclo-pedia of Behavioural Neuroscience, G. Koob, M. Le Moal, and R. Thompson,eds. (Elsevier), pp. 550–558.

Sillito, A.M., Grieve, K.L., Jones, H.E., Cudeiro, J., and Davis, J. (1995). Visualcortical mechanisms detecting focal orientation discontinuities. Nature 378,492–496.

Smyth, D.,Willmore, B., Baker, G.E., Thompson, I.D., and Tolhurst, D.J. (2003).The receptive-field organization of simple cells in primary visual cortex of fer-rets under natural scene stimulation. J. Neurosci. 23, 4746–4759.

Solomon, S.G., Lee, B.B., and Sun, H. (2006). Suppressive surrounds andcontrast gain in magnocellular-pathway retinal ganglion cells of macaque.J. Neurosci. 26, 8715–8726.

Sompolinsky, H., and Shapley, R. (1997). New perspectives on the mecha-nisms for orientation selectivity. Curr. Opin. Neurobiol. 7, 514–522.


Spinelli, D.N., and Jensen, F.E. (1979). Plasticity: the mirror of experience. Sci-ence 203, 75–78.

Spors, H., Wachowiak, M., Cohen, L.B., and Friedrich, R.W. (2006). Temporaldynamics and latency patterns of receptor neuron input to the olfactory bulb.J. Neurosci. 26, 1247–1259.

Stein, B.E., and Stanford, T.R. (2008). Multisensory integration: current issuesfrom the perspective of the single neuron. Nat. Rev. Neurosci. 9, 255–266.

Stopfer, M., Jayaraman, V., and Laurent, G. (2003). Intensity versus identitycoding in an olfactory system. Neuron 39, 991–1004.

Super, H., van der Togt, C., Spekreijse, H., and Lamme, V.A. (2003). Internalstate of monkey primary visual cortex (V1) predicts figure-ground perception.J. Neurosci. 23, 3407–3414.

Szabo, V., Ventalon, C., De Sars, V., Bradley, J., and Emiliani, V. (2014).Spatially selective holographic photoactivation and functional fluorescenceimaging in freely behaving mice with a fiberscope. Neuron 84, 1157–1169.

Theunissen, F.E., and Elie, J.E. (2014). Neural processing of natural sounds.Nat. Rev. Neurosci. 15, 355–366.

Troyer, T.W., Krukowski, A.E., Priebe, N.J., and Miller, K.D. (1998). Contrast-invariant orientation tuning in cat visual cortex: thalamocortical input tuningand correlation-based intracortical connectivity. J. Neurosci. 18, 5908–5927.

Vaadia, E., Haalman, I., Abeles, M., Bergman, H., Prut, Y., Slovin, H., and Aer-tsen, A. (1995). Dynamics of neuronal interactions in monkey cortex in relationto behavioural events. Nature 373, 515–518.

Van Rullen, R., and Thorpe, S.J. (2001). Rate coding versus temporal ordercoding: what the retinal ganglion cells tell the visual cortex. Neural Comput.13, 1255–1283.

Vidyasagar, T.R., and Eysel, U.T. (2015). Origins of feature selectivities andmaps in the mammalian primary visual cortex. Trends Neurosci. 38, 475–485.

Vinje, W.E., and Gallant, J.L. (2000). Sparse coding and decorrelation in pri-mary visual cortex during natural vision. Science 287, 1273–1276.

von der Heydt, R., Peterhans, E., andBaumgartner, G. (1984). Illusory contoursand cortical neuron responses. Science 224, 1260–1262.

Von der Malsburg, C. (1981). The correlation theory of brain function (Max-Plank Institute for Biophysical Chemistry).

von der Malsburg, C. (1986). Am I thinking brain assemblies? In Brain Theory,G. Palm and A. Aertsen, eds. (Springer), pp. 161–176.

Wagemans, J., Elder, J.H., Kubovy, M., Palmer, S.E., Peterson, M.A., Singh,M., and von der Heydt, R. (2012). A century of Gestalt psychology in visualperception: I. Perceptual grouping and figure-ground organization. Psychol.Bull. 138, 1172–1217.

Wallace, D.J., Greenberg, D.S., Sawinski, J., Rulla, S., Notaro, G., and Kerr,J.N. (2013). Rats maintain an overhead binocular field at the expense of con-stant fusion. Nature 498, 65–69.

Watkins, S., Shams, L., Josephs, O., and Rees, G. (2007). Activity in human V1follows multisensory perception. Neuroimage 37, 572–578.

Wehr, M., and Zador, A.M. (2005). Synaptic mechanisms of forward suppres-sion in rat auditory cortex. Neuron 47, 437–445.

Wertheimer, M. (1912). Experimentelle Studien uber das Sehen von Bewe-gung. In Zeitschrift fur Psychologie (J.A. Barth).

Yeh, C.I., Xing, D., Williams, P.E., and Shapley, R.M. (2009). Stimulusensemble and cortical layer determine V1 spatial receptive fields. Proc. Natl.Acad. Sci. USA 106, 14652–14657.

Zingg, B., Hintiryan, H., Gou, L., Song, M.Y., Bay, M., Bienkowski, M.S., Fos-ter, N.N., Yamashita, S., Bowman, I., Toga, A.W., and Dong, H.W. (2014). Neu-ral networks of the mouse neocortex. Cell 156, 1096–1111.

Zipser, K., Lamme, V.A., and Schiller, P.H. (1996). Contextual modulation inprimary visual cortex. J. Neurosci. 16, 7376–7389.





















































































































Cortical Correlates of Low-Level Perception: From Neural ... · physiological data describing the...

Documents

Transcript of Cortical Correlates of Low-Level Perception: From Neural ... · physiological data describing the...