Spatial and temporal determinants of context modulation in ...

48
Spatial and temporal determinants of context modulation in human contrast perception Markku Kilpeläinen Institute of Behavioural Sciences, University of Helsinki, Finland Academic dissertation to be publicly discussed, by due permission of the Faculty of Behavioural Sciences at the University of Helsinki in Auditorium XII at the Main building, Fabianinkatu 33, on the 19th of April, 2012, at 12 o’clock University of Helsinki Institute of Behavioural Sciences Studies in Psychology 82: 2012

Transcript of Spatial and temporal determinants of context modulation in ...

Page 1: Spatial and temporal determinants of context modulation in ...

Spatial and temporal determinants of context

modulation in human contrast perception

Markku Kilpeläinen

Institute of Behavioural Sciences, University of Helsinki,

Finland

Academic dissertation to be publicly discussed, by due permission of the Faculty of Behavioural Sciences

at the University of Helsinki in Auditorium XII at the Main building, Fabianinkatu 33, on the 19th of April, 2012, at 12 o’clock

University of Helsinki Institute of Behavioural Sciences Studies in Psychology 82: 2012

Page 2: Spatial and temporal determinants of context modulation in ...

2

Supervisors: Professor Kristian Donner, PhD Department of Biosciences University of Helsinki Helsinki, Finland Professor Jussi Saarinen, PhD Institute of Behavioural Sciences University of Helsinki Helsinki, Finland Docent Simo Vanni, MD, PhD Brain Research Unit O.V. Lounasmaa Laboratory School of Science Aalto University Espoo, Finland Reviewers: Professor Lynn Olzak, PhD Department of Psychology Miami University Oxford, OH, USA Professor Matti Weckström, MD, PhD Department of Physics University of Oulu Oulu, Finland Opponent: Professor Mark Georgeson School of Life and Health Sciences Aston University Birmingham, UK

ISSN 1798-842X ISSN-L 1798-842X

ISBN 978-952-10-7915-3 (pbk.) ISBN 978-952-10-7916-0 (PDF)

http://www.ethesis.helsinki.fi Unigrafia

Helsinki, 2012

Page 3: Spatial and temporal determinants of context modulation in ...

3

Contents

Abstract ....................................................................................................................... 4 Tiivistelmä ................................................................................................................... 5 Acknowledgements .................................................................................................... 6 List of original publications ....................................................................................... 7 Abbreviations .............................................................................................................. 8 1. Introduction ............................................................................................................. 9

1.1 The primate visual system ........................................................................... 11 1.2 The extra-classical receptive field ................................................................ 14

2. Aims of the study .................................................................................................. 17 3. General experimental methods ............................................................................ 18

3.1 Psychophysics ............................................................................................. 18 3.1.1 Subjects ........................................................................................... 18 3.1.2 Stimuli .............................................................................................. 18 3.1.3 Procedures ...................................................................................... 18

3.2 Functional magnetic resonance imaging ...................................................... 19 3.3 Modelling ..................................................................................................... 19

4. The individual studies: methods and results ...................................................... 20 4.1 Study I: A change in mean luminance attenuates perceived contrast in a subtractive, local and transient manner .............................................................. 20

4.1.1 Methods ........................................................................................... 20 4.1.2 Results ............................................................................................. 21

4.2 Study II: Modelling the effects of mean luminance changes and the pedestal effect with the same early retinal mechanisms ................................................... 22

4.2.1 Methods ........................................................................................... 23 4.2.2 Results of the luminance step simulation ......................................... 25 4.2.3 Results of the pedestal effect simulation .......................................... 26

4.3 Study III: The time-course of surround suppression in human visual system 28 4.3.1 Methods ........................................................................................... 28 4.3.2 Results ............................................................................................. 28

4.4 Study IV: The spatial extent of the ECRF mechanisms in human visual system ............................................................................................................... 30

4.4.1 Methods ........................................................................................... 31 4.4.2 Results ............................................................................................. 32

5. Discussion ............................................................................................................. 35 5.1 Modulation caused by changes in the underlying context ............................. 35 5.2 Modulation caused by changes in the adjacent context................................ 38 5.3 The temporal nature of the neural signal ...................................................... 42 5.4 Conclusions ................................................................................................. 43

6. References ............................................................................................................ 44

Page 4: Spatial and temporal determinants of context modulation in ...

4

Abstract

The amount of information carried by light from our environment is practically unlimited. The amount of information that can be handled by the biological visual system, in contrast, is strictly limited by the amount of resources (neurons, energy and time) allocated to the system. Within these constraints, the visual system must perform its task of extracting the features most relevant for object recognition and movement planning. To perform the task successfully, the visual system needs to sacrifice the faithful coding of each point’s light intensity. In early visual processing, this is done in (at least) two ways. Firstly, retinal neurons adapt to the local time-averaged luminance in their receptive field, and use most of their dynamic range to signal small contrasts, i.e., deviations from the mean level. Secondly, the system emphasizes spatial and temporal changes. This entails suppressing responses not only to uniform fields of uniform luminance, but also to “repetitive” contrast. This thesis presents measurements and modelling of the spatial and temporal determinants of these forms of information compression in the human visual system. The first key finding of the psychophysical experiments carried out is that the perceived contrast of a target grating is attenuated if the target is presented simultaneously with a change in mean luminance, indicating a lack of adaptation. The attenuation is highly transient and its relative strength decreases with increasing target contrast. A model is presented and supported by simulations, showing that the contrast dependence of attenuation can be explained by two realistic retinal mechanisms: the linear-nonlinear dynamics of cone photoreceptors and the subsequent thresholding and leaky integration of the cone signals by retinal ganglion cells. The model parameters are strictly fixed by known physiology. The same mechanisms also predict the classic pedestal effect, where instead of the mean luminance, the local contrast in an area coextensive with the target changes simultaneously with target presentation. The second key finding of the psychophysical experiments is that the time course of suppression of the perceived contrast of a target grating by a non-overlapping surround grating depends on both the contrast and area of the surround. The contrast effect can be explained by the dynamics of the early retinal response, but the area effect cannot be. Thus, the spatial properties of surround suppression in the human visual system were further explored with neuroimaging, psychophysics and modelling. The structure was found to be similar in humans and non-human primates, and moreover, in agreement with the psychophysically observed effect of surround area on the timing of suppression. On a more general level, the results support the idea that the very early parts of the signals of photoreceptors and subsequent visual neurons are critical in the transmission of visual information.

Page 5: Spatial and temporal determinants of context modulation in ...

5

Tiivistelmä

Ympäristössämme on tarjolla käytännössä rajaton määrä näköinformaatiota. Näköjärjestelmäämme sen sijaan rajoittaa ainakin se, kuinka paljon hermosoluja, energiaa ja aikaa sillä on käytössään. Näiden rajoitusten alaisena näköjärjestelmän täytyy kyetä poimimaan näköinformaatiosta ne piirteet, jotka eniten auttavat meitä esineiden tunnistamisessa ja toiminnan ohjauksessa. Ilmeisesti tästä tehtävästä suoriutuminen on edellyttänyt luopumista ympäristön jokaisen pisteen kirkkauden koodaamisesta. Sen sijaan näköjärjestelmä sopeutuu näkökentän eri osien paikalliseen keskiluminanssiin ja käyttää hermosolujen vastealueen pääasiassa ilmaisemaan pieniä, paikallisia poikkeamia tuosta keskiluminanssista. Lisäksi, järjestelmä käyttää hermosolujen vasteita pääasiassa ajassa ja tilassa tapahtuvien muutosten ilmaisemiseen. Solut eivät juurikaan reagoi tasaisiin pintoihin ja toistuvaan tekstuuriinkin vähemmän kuin yksittäisiin reunoihin. Tässä väitöskirjassa selvitetään yllä kuvattujen, ihmisen näköjärjestelmän suorittamien informaatiomuunnosten ajallisia ja spatiaalisia reunaehtoja. Ensimmäinen kokeellinen päätulos on se, että kontrastiärsykkeen havaittu kontrasti vaimentuu, kun se esitetään samanaikaisesti keskiluminanssin muutoksen kanssa. Vaimennusvaikutus poistuu melko nopeasti ja sen suhteellinen voimakkuus heikkenee ärsykkeen kontrastia kasvatettaessa. Vaimennus ja sen ajalliset piirteet selittyvät adaptaatiomekanismeilla. Väitöskirjassa esitetään mallisimulaatioita, joiden mukaan vaimennuksen kontrastiriippuvuus voidaan selittää kahdella realistisella verkkokalvon mekanismilla: reseptorisoluvasteen lineaarisen alkunousun muuttuminen kompressiiviseksi huippuvasteeksi ja ganglion-solujen suorittama reseptorivasteiden kynnystys ja integrointi. Samoilla mekanismeilla pystytään myös tuottamaan klassinen pedestaali-efekti, jossa keskiluminanssin sijaan muuttuu kohdeärsykkeen alla oleva kontrastirakenne. Toinen kokeellinen päätulos on, että ajoitus, jolla ei-päällekkäinen taustakontrasti maksimaalisesti vaimentaa keskustan havaittua kontrastia, riippuu taustan kontrastista ja pinta-alasta. Kontrastin vaikutus selittyy samalla retinan dynamiikalla kuin luminanssimuutoksen vaikutuksen kontrastiriippuvuus, mutta pinta-alan vaikutus ei. Tästä syystä väitöskirjassa tutkitaan vielä ympäristön aiheuttaman vaimennuksen spatiaalisia rajoitteita aivokuvantamisen, psykofysiikan ja mallinnuksen avulla. Vaimennuksen spatiaalinen rakenne oli samankaltainen kuin apinoilla, ja myös sopusoinnussa sen kanssa, mikä oli taustan pinta-alan vaikutus vaimennuksen ajoitukseen. Väitöskirjan tutkimukset tukevat ajatusta, jonka mukaan näköjärjestelmän hermosolujen viestinnässä signaalin tärkein vaihe on sen lyhyt alkuosa.

Page 6: Spatial and temporal determinants of context modulation in ...

6

Acknowledgements

This work was conducted mostly in the stimulating environment of the Department of Behavioural Sciences in the University of Helsinki. The skills of my supervising professor Kimmo Alho were needed many times in solving various issues of academic administration. Study IV was conducted in the Advanced Magnetic Imaging Centre at the Aalto University. I have been privileged to receive a 4-year salary from the National Doctoral Programme of Psychology, which has enabled me to concentrate on the thesis without the need to constantly worry about funding. This thesis has many supervisors, and it is safe to say that I have received the best and most diversified education any doctoral student could wish for. The clearly observable enthusiasm and insightfulness of Pentti Laurinen (1945-2009) originally drew me to the study of the visual system. A clear manifestation of his enthusiasm is that although he retired on the day I started my full time doctoral studies, he carried on instructing me and has participated in three of this thesis’ studies. At the time when Pena could no longer participate as actively, I was very lucky to be hit by a second intellectual lightning, Professor Kristian Donner. I don’t know whether it is a result of conscious self education or simply the result of talent and vast experience, but Kristian has truly developed the mentoring process to an art. He always emphasizes the student’s own responsibility and competence and always puts quality above quantity and haste. Thank you, Kristian, for the opportunity to enjoy your art. During the rather confusing process of departing from Pena’s care, Professor Jussi Saarinen and Docent Simo Vanni kept me instructed, employed, funded and motivated. Without them, the thesis might never have been completed. Discussions with my colleagues have been instrumental in making the work as interesting as it is. They have often known the solution to something I’ve found puzzling and when they haven’t, they have been eager to think about it with me. Thank you Ilmari, Jenni, Jussi, Kaisa, Lari, Tarja, Viljami and, especially Lauri, for thinking with me. Talking about the visual system and various world visions with other colleagues within the Helsinki region has been a source of academic and personal satisfaction. Thanks are due to Dr. Linda Henriksson, Dr. Hanna Heikkinen, Dr. Frans Vinberg, Dr. Juha Silvanto, Dr. Petri Ala-Laurila, Prof. Ari Koskelainen, and many others. I also thank the two expert reviewers, Prof. Lynn Olzak and Prof. Matti Weckström. Besides the vision science group, my other fellow doctoral students have also been an essential source of normality in the often quite solitary job of a doctoral student. Thank you, Annika, Janne and Sointu, for the normalization lunches. In my childhood home, both education and thinking for oneself were held in great value. As a result, my sister was excellent in school and a suitable role model in many ways and I stayed in school for about 27 years. Thank you mum, dad and Kirsi for the spark to learn. Nothing about my sweet wife Anna could be said here in a way that would truly express her significance. What is clear, though, is that while being a warm, loving and fun companion, her pride and faith in my work has always remained strong even at the times when mine have hit a low. Thank you, Anna. It has also been a privilege to share with you the wisdoms of our dearest and wittiest teacher, Lotta.

Page 7: Spatial and temporal determinants of context modulation in ...

7

List of original publications

I Kilpeläinen, M., Nurminen, L., & Donner, K. (2011). Effects of mean luminance

changes on human contrast perception: Contrast dependence, time-course and

spatial specificity. PLoS ONE, 6, e17200.

II Kilpeläinen, M., Nurminen, L., & Donner, K. (in press). The effect of mean

luminance change and grating pedestals on contrast perception: Model

simulations suggest a common, retinal, origin. Vision research.

III Kilpeläinen, M., Donner, K., & Laurinen, P. (2007). Time course of suppression

by surround gratings: Highly contrast-dependent, but consistently fast. Vision

Research, 47, 3298-3306.

IV Nurminen, L., Kilpeläinen, M., Laurinen, P., & Vanni, S. (2009). Area summation

in human visual system: Psychophysics, fMRI, and modeling. Journal of

Neurophysiology, 102, 2900-2909.

The articles are printed with the permission of the copyright holders.

Page 8: Spatial and temporal determinants of context modulation in ...

8

Abbreviations

BOLD blood-oxygenation-level-dependent (contrast) cpd cycles per degree (of visual angle) fMRI functional magnetic resonance imaging ECRF extra-classical receptive field LGN lateral geniculate nucleus RF receptive field VOI voxel of interest V1 primary visual cortex

Page 9: Spatial and temporal determinants of context modulation in ...

9

1. Introduction

The visual system continuously provides us with an abundance of information that

promotes survival and reproduction. Although in modern industrialized societies

blindness does not necessarily lead to elevated mortality (Krumpaszky, Dietz, Mickler

& Selbmann, 1999), quality of life is nevertheless compromised. For example, cataract

surgery increases the patients’ quality of life considerably, even if the operation leaves

vision clearly below normal levels (Desai, Reidy, Minassian, Vafidis & Bolger, 1996).

People who do not struggle with significant vision related problems are tempted to

take for granted the visual information and the apparent ease of acquiring it. It is

possibly this ease that creates the illusion that the visual system somehow transfers an

accurate copy of our surroundings to our consciousness. In reality, the visual system

actively reconstructs the visual scene as a stream of neuro-chemical signals. The signals

are transferred through the visual system and modulated to varying degrees before they

are combined with stored information to produce conscious percepts and behavioural

responses. It is easy to accept that only representations of elephants can exist inside

one’s head, but why bother with modulating the signals representing the elephant? Why

not make the representation precisely like the elephant in front of the beholder?

The visual system faces the monumental task of processing a practically unlimited

amount of visual information with a biological system constrained at least by the

amount of neurons and energy allocated to the system as well as the range of intensities

the neurons are able to cover (Niven & Laughlin, 2008). Obviously, a considerable

amount of modulation of the neural signals is necessary to transfer as much as possible

of the most relevant information with the limited biological system.

The most compelling need for modulation comes from the fact that the mean light

levels in our surroundings vary by at least 109 -fold from a moonless night to a sunny

day. The largest changes, such as sunrise/sunset or moving from a dense forest to a

sunny, open plain, occur globally, affecting our entire visual environment similarly.

Such changes are covered by relatively slow processes, such as pupil size adjustment,

sensitivity control of the retinal neurons and a switch between different photoreceptors

systems (reviewed by Walraven, Enroth-Cugell, Hood, MacLeod & Schnapf, 1990).

However, the efficient extraction of relevant features from the visual scene requires

modulation that operates on the same spatial and temporal scales as the features

Page 10: Spatial and temporal determinants of context modulation in ...

10

themselves. This thesis concentrates on such fast and local interactions. The different

types of phenomena considered may be described as examples of contextual

modulation. In each case the perceptual responses to a certain target stimulus (e.g., a

wrinkle on the elephant’s trunk) are affected by other stimuli (e.g., other wrinkles).

One of the most studied forms of contextual modulation consists of effects brought

about by fast changes in local light level within which the target stimulus is observed.

When looking at a common day-light natural scene, the light levels falling on different

parts of the retina easily change by a factor of 10-100 from one fixation to another

(Frazor & Geisler, 2006). The photoreceptors could, in principle, simply adjust their

response range so that they would respond maximally to the highest light level they

typically meet and only just to a very faint light level. However, that would result in an

extremely poor ability to signal moderate contrasts (such as some trunk wrinkles) and

significant amounts of energy would be used just to signal the average light level. The

biological visual system has, instead, evolved to adjust the sensitivity of photoreceptors

and other neurons to the local light levels very rapidly. By shifting their intensity

response function so that the steep part of the function is aligned with mean intensity,

the neurons retain maximal sensitivity to the contrasts typically encountered (Rieke &

Rudd, 2009; Shapley & Enroth-Cugell, 1984). However, such adjustment is not

instantaneous. The extent of success in normal visual behaviour, where light levels

change rapidly with each eye movement is one question without an unequivocal answer

in the literature (Geisler, 1978; Mante, Frazor, Bonin, Geisler & Carandini, 2005).

Visual perception is not merely affected by context elements that are coextensive

with the target. Strong modulating effects can originate from completely non-

overlapping, even distant areas of the visual field (Nurminen, Peromaa & Laurinen,

2010; Westheimer, 1967). Since the post-retinal visual system operates with borders

signalled by contrast, rather than uniformly lit areas (Kurki, Peromaa, Hyvärinen &

Saarinen, 2009; Whittle, 1994), interactions between adjacent contrast elements are

critical to understand. One such effect that has been observed on virtually all levels of

the visual system is the so-called surround modulation, where the perceptual or neural

responses to a contrast stimulus are modulated by surrounding contrast stimulation

(Ejima & Takahashi, 1983, 1985; Maffei & Fiorentini, 1976). The fundamental function

of the phenomenon has not been resolved (reviewed by Carandini & Heeger, 2012), but

Page 11: Spatial and temporal determinants of context modulation in ...

11

one quite certain consequence of it is an attenuation of less informative texture (e.g.,

trunk wrinkles) in relation to the more important solitary edges of objects (the trunk,

Grigorescu, Petkov & Westenberg, 2003; Nothdurft, Gallant & Van Essen, 1999).

1.1 The primate visual system

Everything we see is light. Light consists of elementary particles called photons. From a

light source or a reflecting surface, light travels through the pupil to the back of the eye,

where the retina is located (Figure 1). The retina consists of multiple layers of

morphologically different neurons (inset in Figure 1). At the very back of the retina are

the photoreceptors containing visual pigment, i.e., molecules that can absorb photons.

Through the electrochemical process of phototransduction, the photoreceptors signal the

absorption of photons with electric responses that are synaptically transmitted to

second-order neurons. The rod photoreceptors (dark blue in Figure 1) are able to signal

the reception of single photons reliably (Rieke & Baylor, 1998) and thus provide vision

at very low light levels. The cone photoreceptors need much more light for useful

responses, but provide the fast, high resolution vision available to us at higher light

levels. The three different cone types (red, green, and light blue in Figure 1), tuned to

different photon energies of light, make colour vision possible.

Figure 1. A schematic picture of the lower levels of the human visual system. The inset represents a

cross-section of the retina. Credits: University of Michigan Kellogg eye center, Helga Kolb, Webvision.

Reproduced under the Creative Commons License.

Page 12: Spatial and temporal determinants of context modulation in ...

12

The output of the retina consists of the spike trains produced by the retinal ganglion

cells (purple in Figure 1). Approximately 90 % of the ganglion cells in the primate

retina represent three subtypes (Dacey, 2007). The midget ganglion cells are the most

abundant, possess the highest spatial resolution and process red-green colour

information. The parasol ganglion cells have the highest contrast sensitivity and

temporal resolution. The small bistratified ganglion cells process blue-yellow colour

information. All of these ganglion cell types have an approximately circular centre-

surround receptive field (RF) structure (Figure 2A, top). They respond maximally when

the light in the RF centre, but not in the surround, increases (ON-centre) or decreases

(OFF-centre). Such a receptive field also enhances responses to borders between light

and dark surfaces (see, Figure 2B, for an example). Instead, a uniformly lit surface

leaves excitation and inhibition in balance and results in little, if any, firing.

Figure 2. A) Examples of receptive fields of retinal ganglion cells and V1 neurons. B) The receptive field

of a neuron characterizes its response to a given stimulus.

In concert with the above described retinal neurons, the rest of the retinal circuitry,

with altogether 50 different cell types identified so far, performs extremely sophisticated

computations on the visual signals (see, Werblin, 2011, for a review). It is currently

unclear, to what extent all the cell types and computational properties described in non-

primates apply to the primate retina. However, the idea that all the advanced functions

found in the retinas of “lower” vertebrates are implemented only in the cortex in

Page 13: Spatial and temporal determinants of context modulation in ...

13

primates, is a harsh and increasingly outdated oversimplification (Field & Chichilnisky,

2007; Gollisch & Meister, 2010).

The lateral geniculate nucleus (LGN, Figure 1) of the thalamus is the main target of

the information stream from the retina. The LGN has been considered rather

uninteresting in terms of feedforward visual processing, mainly because the number of

neurons in the LGN, and the RF properties thereof, appear to be determined by the

projections from the retina (Ridder, 2006). Also, information from the parasol, midget

and small bistratified ganglion cells, as well as from the two eyes, is kept segregated in

the LGN. Instead, the LGN is considered a gate-keeper of the visual stream. At its most

basic, LGN neurons filter solitary spikes from the retina (Carandini, Horton & Sincich,

2007), which stops most of the maintained activity of single retinal ganglion cells from

entering the cortex. More elaborately, the LGN receives an abundance of connections

from the cortex and other thalamic structures and is hence able to modulate retinal

inputs according to rather high-level parameters such as direction of voluntary attention

(McAlonan, Cavanaugh & Wurtz, 2008; O'Connor, Fukui, Pinsk & Kastner, 2002).

The primary visual cortex (V1, Figure 1), in contrast to the LGN, provides an

obvious progressive step in the analysis of visual information. First and foremost, the

majority of V1 neurons have elongated receptive fields (Figure 2A, bottom) and are

thus selective to the orientation of an edge or stripe in their receptive field (Hubel &

Wiesel, 1962, 1968). A neuron of the primary visual cortex may not respond at all to an

otherwise appropriate stimulus, if its orientation does not match the neuron’s receptive

field (Figure 2B). The visual cortex is organized in vertical columns. Each column

contains, in a circular arrangement, cells sequentially representing all orientations. As a

result, each column informs the rest of the visual system about the orientation of a

segment at a particular location in the visual field (Hubel & Wiesel, 1968). The V1 is

also the first level of the visual system, where information from the two eyes is

combined. In a dimension orthogonal to that defining orientation, there is a columnar

sequence of cells progressing from “pure” left-eye to “pure” right-eye dominance

(Bartfeld & Grinvald, 1992). Together the orthogonal sets of orientation and eye-

dominance slabs make up a so-called hypercolumn. Some V1 neurons are selective to

binocular disparity, which is an important depth cue (Cumming & Parker, 1999), but

depth analysis appears to occur predominantly on higher levels of the visual cortex

Page 14: Spatial and temporal determinants of context modulation in ...

14

(Parker, 2007). Interestingly, the large number of neurons in V1 may be entirely

determined by the need to preserve the spatial resolution provided by the number of

ganglion cells while adding orientation selectivity with comparable precision (Stevens,

2001).

In addition to V1, there are some 15-20 distinct cortical areas that predominantly

process visual information (Van Essen & Maunsell, 1983). However, V1 mainly

projects to three areas of the visual cortex, namely V2, V3 and MT/V5 (Casagrande &

Kaas, 1994). These connections are reciprocal, i.e., there are also feedback connections

from those areas back to V1.

1.2 The extra-classical receptive field

The receptive field of a neuron expresses the location and shape of the area where light

stimulation can evoke a response in the neuron. As shown in Figure 2, the classical RF

generally includes areas where light increases (white) or decreases (black) the response.

However, the response can be modulated by additional stimulation outside the classical

RF. The combination of these modulation-mediating areas with the classical RF is often

called the extra-classical receptive field (ECRF).

The extra-classical receptive field of retinal ganglion cells takes two forms. The

response to the stimulation of the RF can be modulated by adding either uniform

luminance or contrast patterns outside the RF (Shapley & Victor, 1979; Solomon, Lee

& Sun, 2006). These ECRF properties appear to be inherited to some extent by neurons

in V1 (Webb, Dhruv, Solomon, Tailby & Lennie, 2005).

The V1 adds something new to the ECRF, however. In contrast to subcortical

neurons (Bonin, Mante & Carandini, 2005; Solomon et al., 2006; Solomon, White &

Martin, 2002), the ECRFs of V1 neurons are clearly orientation selective (Cavanaugh,

Bair & Movshon, 2002b; Knierim & van Essen, 1992), although not quite as strongly as

their classical RFs are.

Figure 3A presents a typical measurement of a V1 neuron’s ECRF. A grating

matched to the location and properties of the neuron’s conventional receptive field is

increased in diameter (see Figure 3A). Typically, an increase in the stimulus diameter

first increases the neuron’s firing rate. The stimulus diameter at which the response

peaks is often considered the neuron’s summation field. After the peak in firing rate, an

additional increase in stimulus size starts to suppress the firing. At some point the firing

Page 15: Spatial and temporal determinants of context modulation in ...

15

rate plateaus at a level lower than the maximum response but higher than spontaneous

activity. The stimulus diameter at which the firing rate plateaus is considered the

neuron’s surround field.

Figure 3. A) In comparison to a stimulus that approximately matches the receptive field, the response of

neurons in V1 increases with increasing stimulus size. However, after a point, additional increase in

stimulus size suppresses the response. B) When a centre grating is surrounded by a collinear grating, the

perceived contrast of the centre clearly decreases. The effect is significant but considerably weaker if the

surround grating is orthogonal. The physical contrast of all the centre patches is the same.

Measuring a correlate of the ECRF in the human visual system is complicated.

Imaging methods have been used successfully to measure the strength of ECRF effects

(Williams, Singh & Smith, 2003; Zenger-Landolt & Heeger, 2003), but only

psychophysical methods currently have sufficient spatial and temporal resolution to

enable measurement of the spatial structure and temporal dynamics of ECRF effects.

However, psychophysical methods of course measure the final output after multiple

stages of processing, and the relative contributions of the different stages can be only

partially disentangled with experimental designs. In addition, there is no straightforward

perceptual correlate to neural firing rates, especially for supra-threshold stimulation.

Nevertheless, many psychophysically observed figure-ground interactions have been

hypothesized to be functional correlates of ECRF type mechanisms (reviewed by Seriès,

Lorenceau & Frégnac, 2003). For example, when a grating patch is surrounded by

similar grating, the perceived contrast of the patch is clearly affected. With some

stimulus parameters, the effect is incremental, but more often it is suppressive (Cannon

Page 16: Spatial and temporal determinants of context modulation in ...

16

& Fullenkamp, 1991; Ejima & Takahashi, 1985; Olzak & Laurinen, 1999). The effect

can be observed by comparing the panels 1 and 2 in Figure 3B, while fixating the black

dot in between. The effect is strongly orientation selective, a surround of matching

orientation causing much stronger suppression than one of orthogonal orientation

(compare panels 2 and 3 in Figure 3B). In this respect, perception seems to correspond

to V1 neurons rather than sub-cortical neurons.

The mechanisms behind the ECRF of V1 neurons themselves have currently not

been fully resolved. Originally, horizontal connections from other V1 neurons with

adjacent visual field coverage and similar orientation preferences were, quite logically,

assumed to be critical (Gilbert & Wiesel, 1989; Grinvald, Lieke, Frostig & Hildesheim,

1994). However, such connections may be too short and slow to account for the effects

observed experimentally (Angelucci et al., 2002; Bair, Cavanaugh & Movshon, 2003).

Instead, feedback connections from higher visual areas appear to be better suited.

Firstly, receptive fields in those areas, especially in MT/V5 are much larger than in V1.

Secondly, the feedforward and feedback axons conduct much faster than axons of

horizontal connections (Girard, Hupe & Bullier, 2001). Thus, a feedforward-feedback

loop between V1 and a higher visual area would allow the suppressive signals to

propagate from a larger distance and with higher speed. Out of the three areas

predominantly targeted by V1 projections, area V5/MT is likely to have a dominant role

in mediating the suppression. Firstly, surround modulation is unaffected by inactivation

of V2 (Hupé, James, Girard & Bullier, 2001). Secondly, at least in some V1 neurons the

visual area from which suppression is integrated (where the descent from maximum

plateaus in Figure 3A) is so large, that only MT/V5 neurons seem to have large enough

receptive fields (Angelucci et al., 2002). Importantly, though, both the summation field

and the surround field of the V1 ECRF are likely to be formed by a combination of

feedforward, horizontal and feedback connections, with relative roles determined by

stimulus properties, such as contrast and spatial arrangement (Schwabe, Obermayer,

Angelucci & Bressloff, 2006).

In this thesis, the term surround suppression refers to the electrophysiological effect

described in Figure 3A and the psychophysical effect demonstrated in Figure 3B.

Page 17: Spatial and temporal determinants of context modulation in ...

17

2. Aims of the study

Study I

To characterize the effects of an abrupt change in mean luminance on human contrast

perception.

Study II

To analyse, by means of physiologically realistic modelling, to what extent the results

of Study I as well as the classical “pedestal effect” can be explained by known

properties of early retinal processing.

Study III

To psychophysically determine the time-course of surround suppression as a function of

surround contrast and surround size.

Study IV

To analyze the spatial structure of surround modulation mechanisms in the human

visual system with neuroimaging, psychophysics and modelling.

Page 18: Spatial and temporal determinants of context modulation in ...

18

3. General experimental methods

3.1 Psychophysics

3.1.1 Subjects

All subjects who participated in the psychophysical experiments were young adults with

normal or corrected to normal vision. In all psychophysical experiments, at least one

subject was naïve to the purposes of the study.

3.1.2 Stimuli

The stimuli were always presented on high-quality linearized CRT monitors with an

800x600 resolution. Presentation of stimuli was controlled by the Vision Works 3

(Vision research graphics, Durham, NH) or the ViSaGe (Cambridge research systems,

Kent, UK) system. The display was always the only light source in the room.

3.1.3 Procedures

In all the main experiments, the 2-interval forced choice procedure was used. When

perceived contrast was measured (Studies I and III), the subject was first presented with

a target stimulus with the same contrast throughout the measurement and then a

comparison stimulus, the contrast of which was varied from trial to trial. The subject’s

task was to tell, which of the two stimuli had the higher contrast. The value of perceived

contrast was determined with a staircase method. The contrast of the comparison

stimulus was decreased if the subject judged it to be higher and decreased if the contrast

of the target had been perceived as higher. A point where the progress of the

comparison contrast changed direction was considered a reversal point. The value of

perceived contrast is given by the mean of the reversal points. When discrimination

threshold was measured (Study IV), the subject was presented with an identical mask

stimulus in both intervals, with a target added on the mask randomly in one of the

intervals. The subject’s task was to tell, in which of the intervals the target was present.

The threshold contrast was determined with a method of constant stimuli. The

percentage of correct responses was determined for a range of target contrasts and the

target contrast that led to 75 % correct responses was considered the threshold contrast.

Page 19: Spatial and temporal determinants of context modulation in ...

19

3.2 Functional magnetic resonance imaging

In Study IV, the blood oxygenation level dependent (BOLD) signal was measured with

3T General Electric Signa EXCITE (General electric medical systems, Milwaukee, WI)

scanner with an eight-channel receiver coil (repetition time 1.8 s, voxel size (2.5 mm)3.

Borders of retinotopic visual areas were identified with the phase-encoded retinotopic

mapping (Sereno et al., 1995) or multifocal mapping (Vanni, Henriksson & James,

2005). The voxels (volume pixels), from which BOLD signal change was analyzed,

were determined with a functional localizer. The voxels used in the main analyses

within each retinotopic area were those with the highest t-value when the BOLD signal

caused by a high contrast grating (diameter 2 deg) was contrasted with activity during

rest. Both gradient echo and spin echo fMRI sequences were used, but spin echo was

determined more spatially specific and all the conclusions are based on data from spin

echo fMRI. The stimuli were presented with projected with a linearized Christie X3TM

(Christie Digital Systems Ltd) data projector with 1024 x 768 pixel spatial resolution

and 75 Hz refresh rate to a semitransparent screen, subtending 40 x 31 degrees of visual

angle at 34 cm viewing distance. The functional data were preprocessed and analyzed

with SPM2 (Wellcome Department of Imaging Neuroscience, London, UK)

(Frackowiak et al., 2003) Matlab toolbox. The BOLD signal change in each stimulus

condition was estimated by fitting a general linear model to the time-series data.

3.3 Modelling

All model computations were performed with Matlab 7 (MathWorks Inc, Natick, MA,

USA). All integration operations were conducted with numerical methods (ODE45 and

Euler’s in Study II, dblquad in Study IV).

Page 20: Spatial and temporal determinants of context modulation in ...

20

4. The individual studies: methods and results

4.1 Study I: A change in mean luminance attenuates perceived

contrast in a subtractive, local and transient manner

During natural viewing, every eye movement changes the contrast and mean luminance

falling on each part of the retina very abruptly and often substantially. The visual

system quickly adapts to the new image statistics at each retinal location. However,

such adaptation can never be completely instantaneous (Rieke & Rudd, 2009). It is thus

not a surprise that detecting small luminance-defined targets against a background

becomes harder if the luminance of the background is changed simultaneously (Geisler,

1978; Poot, Snippe & van Hateren, 1997). It is more surprising that the contrast

responses of single neurons in the cat LGN and V1 have been reported to be

independent of co-occurring changes in mean luminance (Geisler, Albrecht & Crane,

2007; Mante et al., 2005). Study I sought to clarify this quite profound discrepancy.

4.1.1 Methods

The effect of mean luminance change on perceived supra-threshold contrast was

measured psychophysically in three subjects (1 naïve). The subjects viewed the screen

monocularly through an artificial pupil (diameter 4.1 mm, natural pupil dilated) from a

distance of 103 cm. The stimuli were circular patches of sine wave grating (spatial

frequency 4 cpd, orientation horizontal) or bars of increment or decrement luminance.

The Michelson contrast of the target stimulus was varied (0.1–0.45) between

measurements. The contrast of the comparison was varied during the measurement,

according to the staircase method. The target grating was presented together with an

upward or downward step of mean luminance (between 185 and 1295 Td,

corresponding to 14 and 98 cd/m2), either simultaneously or with various delays (50–

800 ms). The time-course of a single trial is presented in Figure 4. One measurement

proceeded as follows. First, the subject adapted to the baseline luminance of the

measurement for a minimum of 40 seconds. Then, a trial was presented: a fixation

stimulus was followed by an abrupt change in mean luminance. The target stimulus

(duration 100 ms) was presented with either a simultaneous or a delayed onset relative

to the change in mean luminance. A comparison stimulus was presented 1800 ms after

the luminance change. The baseline luminance was shown for 8 seconds before the next

Page 21: Spatial and temporal determinants of context modulation in ...

21

trial. The subject's task was to indicate, whether the target or the comparison stimulus

appeared to have a higher contrast. Trials were continued until both staircases reached 8

reversal points.

Figure 4. After a longer period of adaptation to the baseline luminance, the mean luminance changes

abruptly. The target grating is presented either simultaneously or with a delay of 50-800 ms. The

comparison grating is presented approximately 1800 ms later, after which the mean luminance returns to

the adaptation value.

4.1.2 Results

The perceived contrast of the supra-threshold grating was consistently attenuated by the

co-occurring change in mean luminance (Figure 5A). For contrasts stronger than 0.1-

0.2, absolute attenuation was independent of target contrast, i.e. subtractive, as indicated

by the linear fits (dashed lines in Figure 5A). The attenuation subsided in some 200-400

ms (Figure 5B). Upward steps caused somewhat stronger (mean suppression strength 36

% vs. 17%) and longer lasting (mean time constant 157 ms vs. 78 ms) attenuation than

downward steps. The pattern of results was the same when increment or decrement bars

were used as stimuli instead of gratings and when the mean luminance was changed

only in the immediate vicinity of the target. Finally, attenuation was also clearly present

when detection thresholds were measured instead of perceived contrast.

Page 22: Spatial and temporal determinants of context modulation in ...

22

Figure 5. A change in mean luminance simultaneous with, or closely preceding, the presentation of a

contrast target attenuates perceived contrast consistently, but transiently. A) Perceived contrast of the

target grating as a function of physical contrast of the grating for three subjects. Dashed lines are linear

fits with slopes 0.995 (S1), 0.997 (S2) and 1.040 (S3).B) Perceived contrast of the target grating as a

function of target onset relative to the luminance change onset for two subjects. Smooth curves are best

fitting exponential functions. Data points (and accompanying error bars) are the means (and standard

deviations) of at least 4 staircases. Data of subject S3 was very similar to that of S1.

4.2 Study II: Modelling the effects of mean luminance changes

and the pedestal effect with the same early retinal mechanisms

The attenuation of perceived contrast caused by a simultaneous step in mean luminance

(Study I) can be explained by response compression due to a saturating non-linearity of

cone responses when there is insufficient time for adaptation (Schnapf, Nunn, Meister

& Baylor, 1990). The observed gradual release from attenuation with time would then

mainly reflect the adaptation process. However, the fact that absolute attenuation is

largely independent of contrast (implying that the relative attenuation strength decreases

with increasing stimulus contrast) is not consistent with such an explanation. The simple

prediction based on a saturating non-linearity would be that all contrasts are scaled

according to the compressed dynamic range of the cones, leading to contrast dependent

(divisive) attenuation. In Study II, we developed a computational model to study to

what extent the contrast dependence of the luminance step effect could be explained by

retinal mechanisms that are definitely involved in mediating the contrast signals to

Page 23: Spatial and temporal determinants of context modulation in ...

23

higher levels of the visual system. The basic signal considered in the model is the early

rise of cone photoreceptor responses. Since discriminating a contrast target presented on

top of a spatially coextensive “pedestal” is basically a similar problem from the

viewpoint of that cone signals, we also studied whether the same model could explain

the well-known and intensely investigated pedestal effect (Nachmias & Sansbury,

1974).

4.2.1 Methods

All vision depends on the responses of photoreceptors and on the encoding of the

graded photoreceptor signals into trains of action potentials transmitted by retinal

ganglion cells. In the current implementation, the response of each cone to a light

stimulus caused by its stimulating pixel was calculated according to the cone response

model of Baylor, Hodgkin and Lamb (1974, their equations 8, 33, and 46). The values

of all parameters were taken from reports of relevant electrophysiological experiments.

No free parameters were used. The second, equally pertinent mechanism is the

transformation of the graded cone signals to impulse frequencies, carried out by the

retinal ganglion cells. In the model, the “ganglion cell” was modeled as a leaky

integrate-and-fire operator. The neuron’s membrane voltage V is given by equation (1).

))((1 InhURVdtdV

m

(1),

where τm is the membrane time constant (2.7 ms, Weber & Harman, 2005), U is the

input current response in pA, R is input resistance (here normalized to 1 GΩ, to simplify

the numerical implementation) and Inh is a static inhibition applied to all cone

responses. The model neuron creates a spike when the membrane voltage crosses the

spike threshold (0.13 mV in the luminance step simulation and 0.195 in the pedestal

effect simulation). The strength of inhibition (3.95 pA) was set to match the

physiological spontaneous activity of macaque ganglion cells (Brivanlou, Warland &

Meister, 1998; Croner, Purpura & Kaplan, 1993). The spike frequencies (SF) were then

calculated as (tx-t1) / (x-1), where t is time of the spike and x (here 3) is the number of

spikes considered to carry the signal. Data were qualitatively the same for x = 4.

Page 24: Spatial and temporal determinants of context modulation in ...

24

The simplest signal on which an observer could base contrast perception is the

difference between peak and mean intensities. Thus, in the luminance step simulation,

the difference of the cone responses to the peak and the mean luminances was used as

the input to the leaky integrator. In contrast, in a contrast discrimination task,

determining stimulus contrast is neither necessary nor efficient. Instead, the observer

needs to find a large enough difference in any pixel of the stimulus. Thus, in the

pedestal effect simulation, cone responses from 32 units, spanning luminance levels

from peak to mean, were used as input.

To compare the model results with psychophysical data, a matching procedure was

applied to the model responses. For an array of target and comparison contrasts, we

calculated the probabilities (over noise samples) that the SF produced by the

target+noise is greater than the SF produced by the comparison+noise. The

probabilities were calculated according to equation 2.

21 1

)(

m

ncntm

k

m

jkj

(2),

where t is the target, c is the comparison stimulus, n is noise and m is the number of

noise samples (we used m = 200). The expression (t + nj > c + nk) equals 1 if true and 0

if false. In the pedestal effect simulation, the corresponding expression was (p+t + nj >

p + nk), where p is pedestal contrast.

Figure 6A illustrates the definition of perceived contrast in the luminance step

simulation. The probability of the noisy comparison response being larger than the

noisy target response is plotted as a function of comparison contrast. It can be seen that

with a 0.15 target contrast, the comparison contrast needs to be about 0.1 for the model

to produce larger responses for the comparison in 50 % of comparisons (corresponding

to random answering in a 2-alternative task). Thus, the model predicts roughly 33 %

suppression with 0.15 target contrast. Figure 6B illustrates the definition of

discrimination thresholds in the pedestal effect simulation. The probability for

target+pedestal being larger than pedestal alone is plotted against target contrast, for

four different pedestal contrasts (0, 0.009, 0.017 and 0.025). The graph shows that with

Page 25: Spatial and temporal determinants of context modulation in ...

25

the 0.009 target contrast the probability crosses the 75 % threshold criterion with a

lower target contrast than with no pedestal (black line), thus producing the well known

pedestal effect.

Figure 6. Comparison of simulated noisy responses. A) Judging supra-threshold contrast of a target

(simultaneous with the luminance step) relative to the contrast of a comparison stimulus (steady-state

luminance). The probability that the response for comparison+noise is larger than the response for

target+noise, as a function of comparison contrast. B) Discriminating a contrast target on pedestals. The

probability that the response for pedestal+target+noise is larger than the response for pedestal+noise, as a

function of target contrast. The black, blue, red and green lines correspond to pedestal contrasts 0, 0.009,

0.017 and 0.025, respectively.

4.2.2 Results of the luminance step simulation

Figure 7A presents the spike rates produced by the model for the steady-state and step-

up situations. In qualitative terms, comparing the two graphs reveals firstly, that the

spike rates are consistently suppressed in the step-up situation and secondly, that the

suppression becomes relatively weaker with increasing grating contrast. However, a

quantitative comparison with the psychophysical data requires the application of the

matching procedure of equation 1. Figure 7B compares the simulation results with the

psychophysical data from Study I (Figure 5). Both have been plotted as relative

suppression due to the luminance step, calculated in the conventional way: (Cphysical –

Capparent) / Cphysical, where Capparent corresponds to the comparison contrast with which the

observer or the model judges the comparison stimulus to have the higher contrast on 50

% of the trials (dashed lines in Figure 6A).

Page 26: Spatial and temporal determinants of context modulation in ...

26

Figure 7. Results of simulations of the effect of a luminance step on spike frequencies and perceived

contrast and comparison to psychophysical data. A) Spike frequencies produced by model simulations as

functions of grating contrast for the step-up situation (blue line) and the steady-state situation (red line).

B) Relative suppression caused by the luminance step. The continuous curve is the result of the model

simulations. The data points are derived from Figure5A of Study I (Figure 1A in the original study of

Kilpeläinen et al., 2011), re-plotted as relative suppression of perceived contrast. The symbols for the

three subjects are the same as in the original figure.

4.2.3 Results of the pedestal effect simulation

In this simulation, the model produces discrimination data for 32 different model units,

5 examples of which are presented in Figure 8A. The different curves illustrate how the

curves for units seeing different parts of the grating dip at different pedestal contrasts.

For example, the unit that has the highest threshold at zero pedestal is the one that has

the lowest threshold at pedestal contrasts higher than 0.07 (the curve reaching its

minimum around pedestal contrast 0.13). To simplify matters as far as possible, it was

assumed that the discrimination threshold of the whole population of model units is

equal to the lowest threshold of any model unit. Schematically this corresponds to the

lower envelope of the curve pattern in Figure 8A.

Figure 8B compares simulated 75 % discrimination threshold with the

psychophysical data of Henning and Wichmann (2007). It is evident that the model

produces the familiar dipper function, the shape of which is roughly in agreement with

the psychophysically measured functions (which themselves vary quite considerably

between subjects). Since the main point of the simulation was to qualitatively reproduce

the overall pattern of data, Figure 8C compares the simulations and data, for three

different discrimination threshold criteria, on normalized axes. In addition to the correct

Page 27: Spatial and temporal determinants of context modulation in ...

27

general shape of the functions, the model correctly predicts two overall patterns of the

data. First, lowering the criterion deepens the dip trough and moves it to higher pedestal

contrast. Second, the slopes of the subsequent rising parts of the functions are generally

well reproduced. The main shortcoming is that the model produces a somewhat too

shallow dip in the 60% criterion function.

Figure 8. Results of the pedestal effect simulations. A) Discrimination thresholds from 5 different model

units for the 75 % threshold criterion. B) Comparison of model simulations and data for three subjects at

the 75 % correct threshold criterion (data from Henning and Wichmann, 2007). C) Comparison of

simulations and data at three different % correct criteria (as indicated in the leftmost panel) on normalized

axes. Panels S1 and S2 show data for two subjects and the third panel (bottom right) gives the model

curves for the same levels of % correct. The data markers are as in Figure 4 of Henning & Wichmann

(2007, first two subjects).

Page 28: Spatial and temporal determinants of context modulation in ...

28

4.3 Study III: The time-course of surround suppression in

human visual system

Surround suppression is a form of contextual modulation where the perceptual or neural

response to a centre stimulus is decreased by a completely non-overlapping surround

stimulus (Figure 3B). It is trivial that very large differences in the onset times of the two

stimuli will render the surround ineffective. However, since the critical part of neural

signals is very short and transient (Ludwig, Gilchrist, McSorley & Baddeley, 2005;

Müller, Metha, Krauskopf & Lennie, 2001), even modest differences in the times when

the centre and surround signals arrive at the level of interaction can have a considerable

effect. Such differences are likely to arise since the latencies of the retinal outputs

depend, at least, on the contrasts and sizes of different stimulus parts (Donner, 1981,

1989; Donner & Fagerholm, 2003). Study III measured how the timing of the surround

relative to the centre affects the surround’s suppressive effect, using stimuli designed to

probe effects of the “extra-classical receptive field”.

4.3.1 Methods

The effect of a surround grating on the perceived contrast of a centre grating was

measured psychophysically in three subjects (2 naïve). The subjects viewed the screen

binocularly from a distance of 114 cm. The centre stimulus was a circular patch of sine

wave grating (4 cpd). The surround stimuli were similar gratings in an annulus window.

The contrast and the size of the surround stimulus were separately varied in the two

experiments of this study. The mean luminance of the screen was 33 cd/m2. The most

effective timing of the surround was determined with a procedure very similar to that in

Study I. However, in this study, the centre and the surround stimuli were presented

either with simultaneous onset or with various amounts of asynchrony and the mean

luminance of the screen remained constant. The comparison stimulus was presented

1100 ms after the target. Trials were presented until both staircases reached 10 reversal

points.

4.3.2 Results

The timing of the surround stimulus had a clear effect on its ability to suppress the

perceived contrast of the centre stimulus. If the centre and surround were presented with

an onset difference greater than approximately 75 ms, the surround had very little effect

Page 29: Spatial and temporal determinants of context modulation in ...

29

(Figure 9A). However, simultaneous onset did not lead to the strongest effect, either.

The higher the surround contrast, the later it had to be presented in order to have the

maximal effect (Figure 9B). This indicates that the contrast dependence of neural

latencies, which is known to exist on the level of the retinal output, is transferred up to

the level where the centre and surround signals interact. However, the relative latencies

of the surround were consistently too short, leaving no time for the transfer of

suppression at the site of interaction (compare broken curves and data markers in Figure

9B).

Figure 9. A) Perceived contrast of a centre grating as a function of the onset difference between the

centre and surround, separately for two subjects. Negative values on the x-axis mean that the surround

was presented before the centre. Smooth curves are best fitting normal or lognormal functions. B)

Relative latency, i.e., the opposite of the most effective surround onset time (see broken lines from A), as

a function of surround contrast. Centre contrast was always 0.2. Smooth curves are best fitting contrast-

latency functions. In the broken lines, a calculated median delay for horizontal and feedback connections

has been added (see labels next to curves). The delay calculations are based on the average distance

Page 30: Spatial and temporal determinants of context modulation in ...

30

between the centre and the surround on V1 and cortical propagation velocities. The thick curve indicates

no mediation delay, and the thin solid curve the best fit to the data.

In the first experiment the area of the surround annulus was always 3 times as large

as the centre patch. In the second experiment we studied the effect of surround area on

the optimal timing of the surround stimulus. Decreasing the surround area from 3-fold

increased the relative latency of the surround, and when centre and surround areas were

equal (leftmost data points in Figure 10), the data actually matched the predictions

based on the mediation via feedback connections. In contrast, increasing the surround

area from 3-fold did not decrease the relative latency of the surround. On the contrary,

the latency increased somewhat (rightmost data points in Figure 10).

Figure 10. Relative latency as a function of surround area (area of the annulus / area of the central patch)

for three subjects. Relative latencies and predictions for horizontal and feedback connections (see labels)

calculated as in Figure 9. Predictions were made both for a case where average propagation distance stays

the same (straight lines) and for a case where average propagation distance increases with increasing

surround size (curved lines).

4.4 Study IV: The spatial extent of the ECRF mechanisms in

human visual system

The result of Study III where the relative latency of the surround signal first decreased

with increasing surround area, but then started increasing again, suggests a complex

interplay between the different ECRF components and their latencies. To study the

Page 31: Spatial and temporal determinants of context modulation in ...

31

spatial extents of summation and suppression in the human visual system, Study IV

combined psychophysics, fMRI and computational modelling.

4.4.1 Methods

The basic paradigm was the one that has been routinely used to describe the structure of

the ECRF in single cells in the primate visual system (Cavanaugh, Bair & Movshon,

2002a; Sceniak, Hawken & Shapley, 2001; Solomon et al., 2006), the measurement of

responses to gratings of various sizes (Figure 3A).

There is no straightforward way to psychophysically measure the response to a

certain sub-area (the central area in this case) of a uniform, suprathreshold grating.

Instead, we assumed that discriminating the Gabor target from the underlying grating is

an index of, at least, the slope of the underlying neural contrast-response function or

neural noise, or both (Geisler & Albrecht, 1997; Georgeson & Meese, 2006; Klein,

2006). Both of these factors change monotonically with neural responses around the 15

% contrast used here. We thus interpret an increase in the discrimination threshold as a

correlate of an increase in the responses of the neurons that process the area covered by

the target.

The discrimination threshold with each mask size was measured with the method of

constant stimuli in 4 subjects (1 naïve). All stimuli were drifting gratings in oblique

orientation (1 cpd, 4 Hz, presentation time 180 ms) presented at 13 deg eccentricity. The

target to be detected was a Gabor grating (SD 0.125 deg, see Figure 11A) superimposed

on the centre of the mask. Mask diameter varied between measurements (0.5-24 deg),

while target size was held constant. The target contrast corresponding to 75 % correct

answers was taken as the threshold value. Viewing distance was 49.5 cm, mean

luminance 40 cd/m2.

In fMRI, the stimuli were the same as in the psychophysical experiment, but without

the target Gabor and with a 10.8 s presentation time (drift direction reversed every 450

ms. Attention was controlled with a fixation task. Six subjects (3 naïve) participated in

the fMRI experiment. See 3.2 for further details on fMRI methods.

Both in psychophysics and fMRI, functions of the form difference-of-integrals-of-

Gaussians were fitted to the data for interpolation purposes. The size of the summation

field was measured by the diameter where the fitted function peaked, the size of the

surround field by the diameter where the function had decreased to a value 5 % of that

Page 32: Spatial and temporal determinants of context modulation in ...

32

at the largest diameter. The suppression index was calculated as the (Rpeak - Rsup) / Rpeak,

where Rpeak is the peak value of the function and Rsup is the value at the largest diameter.

4.4.2 Results

The overall shape of the psychophysically measured functions greatly resembles

functions measured in macaque V1 (Cavanaugh et al., 2002a; 2002b). The functions

first increase to a peak and then decrease at a decelerating rate and reach a plateau

(Figure 11B). Quantitative comparison between the two types of data is inevitably quite

speculative, but it is nevertheless interesting that the values observed here (summation

field 2.1 ± 0.3 deg, surround field 6.2 ± 2.5 deg, suppression index 0.34 ± 0.08) are

quite close to average values in the Cavanaugh et al. (2002a; 2002b) macaque data (2.7

± 0.14 deg, 4.5 ± 0.22, 0.32 ± 0.02).

Figure 11. A) The contrast profile of the stimulus used in the psychophysical experiment of Study IV. B)

The threshold contrast increment (∆C) for detecting the gabor target from the mask grating as a function

of mask diameter. The broken lines indicate the summation and surround fields, defined from the fitted

DOiG functions (smooth curves).

The data of the fMRI experiment replicate the familiar pattern, although the function

relating signal size to stimulus diameter reached a plateau only in the data from some

subjects (Figure 12). The average summation field diameter in V1 indicated by the

BOLD signal was 3.2 ± 1.3° (95% CI), the surround field diameter 15 ± 2.3°, and the

suppression index 0.87 ± 0.23. V2/V3 values were 5.6 ± 6.0°, 15 ± 6.4°, and 0.82 ±

0.68. In line with single-cell studies (Shushruth, Ichida, Levitt & Angelucci, 2009;

Zhang et al., 2005), the data showed a trend towards larger summation field size in V2

than in V1.

Page 33: Spatial and temporal determinants of context modulation in ...

33

Figure 12. BOLD signal change as a function of grating diameter in human visual areas V1 and V2. The

individual data of six subjects (gray lines) and mean over subjects (black line). The label V2/V3 refers to

the fact there was no full certainty in all subjects that the VOI was exclusively within V2, as the

activation was close to the V2/V3 border.

The fMRI measurement used a typical (2.5 mm)3 voxel size. Such a cortical volume

includes hundreds of thousands of neurons with significantly varying receptive field

locations. To link the measured voxel responses to the underlying neural responses, a

model was developed. The model included a population of stereotypical model neurons

differing only in their receptive field locations (Figure 13A). The retinotopic locations

of the receptive fields were determined by projecting a single voxel in V1 at 13°

eccentricity to visual space (Figure 13A, top). The projection was based on human V1

magnification factors (Duncan & Boynton, 2003).

After determining the RF positions, the response of a model neuron at each visual

field location to a stimulus size was given by a two-dimensional difference-of-

integrals-of-Gaussians function, integrated over the stimulus area. The function is

essentially the Sceniak et al. (2001) RF model applied in two dimensions. The

parameters for the model neuron responses were extracted separately from the

psychophysical experiment of this study and from macaque single neuron data from

Cavanaugh, Bair and Movshon (2002a; 2002b), but only simulations based on the

psychophysical parameters are shown here. The black and red solid curves in Figure

13A (bottom) illustrate how receptive field positions have a significant influence on the

stimulus diameter at which the responses peak. After computing the response of each

model neuron, the voxel response is achieved by summing over all model neurons i.e.,

over all RF locations. We used 3600 model neurons, which was deemed sufficient to

Page 34: Spatial and temporal determinants of context modulation in ...

34

represent realistically the output of the hundred-fold larger, but much more

computationally demanding actual population.

Figure 13B compares the simulation results with the fMRI data from V1. The model

captures the summation part and some of the suppressing part well, but whereas the

model function plateaus, the measured functions continue decreasing. Figure 13C shows

that the model generalizes to a completely novel stimulus paradigm, where the

measured neural population is displaced from the centre of stimulation.

Figure 13. A) The logic of the voxel response model. The model neurons are evenly distributed in the

voxel in cortex (dashed line square). The positions of the ECRFs of the model neurons are determined by

projecting the voxels to visual space according to human magnification factors (dashed-line quadrangle).

Different positions within the voxel, and relative to the centre of stimulation, lead to differing response

functions for the different model neurons (red vs. black curve). The voxel response is the sum of all

individual responses (dashed curve). B) Comparison of fMRI data (one thin line for each subject) and

model simulations (thick black curve). Normalized responses are plotted as functions of stimulus

diameter. C) fMRI data and model simulations when the voxel of interest is in the centre of stimulation

(black lines, same settings as in A, but separate data) or 3.6 deg toward the fovea from the centre of

stimulation (red lines).

Page 35: Spatial and temporal determinants of context modulation in ...

35

5. Discussion

The visual system is an active system that receives an immensely dynamic spatio-

temporal pattern of light and transforms that into features that facilitate object

recognition and action planning. Performing this task within the constraints of the

biological visual system requires the visual information to be modulated and processed

between light absorption and perception. This thesis focuses on two types of modulation

effects, where the neural signal elicited by a certain stimulus is affected by additional

stimulation either at the same or at immediately adjacent location in space and time. I

will first discuss these two types of effects separately and then consider the temporal

nature of neural signals, a concept that links many of the studies of the thesis.

5.1 Modulation caused by changes in the underlying context

Arguably the simplest change in the context of a contrast stimulus is that its mean

luminance changes at the moment of presentation. Such changes are very naturalistic,

since they occur (with varying magnitude) with every saccade during viewing of natural

scenes (Frazor & Geisler, 2006). Effects of such a change on the perception of supra-

threshold contrast were investigated in Study I. In line with earlier results regarding

detection thresholds (Geisler, 1978; Poot et al., 1997), the change in mean luminance

caused significant attenuation of the perceived contrast of the target grating. Attenuation

as such is an expected result. At the moment of the change, the cones are adapted to the

lower mean luminance and the 0.7 log step upward in mean luminance transiently

consumes about 60 % of the dynamic range of the cones (Schnapf et al., 1990; Valeton

& Van Norren, 1983), leaving a heavily compressed range available for transmitting a

signal about contrast, i.e., each “pixel’s” deviation from the mean luminance.

Why, then, was attenuation not observed in the two single neuron recordings (see,

Geisler et al., 2007; Mante et al., 2005) in the cat visual system? Since we obtained very

similar results with stimuli near and much above threshold, as well as with gratings and

increment/decrement bars, stimulus differences are probably not the cause of the

discrepancy between the single neuron recordings and the psychophysical data. The

most probable reason is that the single cell studies were interested in responses after a

step to a certain absolute luminance level, rather than responses after a luminance step

Page 36: Spatial and temporal determinants of context modulation in ...

36

of a certain size. In those studies, responses were averaged over the size and direction of

luminance steps, potentially obscuring the effects of step size.

However, if the compression of cone peak amplitudes were the only effect at play,

attenuation should increase with increasing target contrast. Instead, attenuation was

roughly constant in absolute terms, meaning that it became relatively weaker at higher

contrasts. While the compression of cone responses would undoubtedly affect

increments at the peak of the responses, it may lose its significance in responses

signalling strong contrasts with sharp onset. This is due to a basic, but rather

underappreciated fact of retinal signalling: Unlike the peak amplitude, the amplitude

during the earliest rise of photoreceptor responses is a completely linear function of

stimulus intensity over a very wide range of stimulus intensities (Baylor et al., 1974;

Lamb & Pugh, 1992) and adaptation levels (Friedburg, Thomas & Lamb, 2001; Hood &

Birch, 1993). Only the earliest rise of the response is strictly linear, however, and the

effect of the compressive non-linearity gradually increases towards the peak of the

response. It follows that with a weak stimulus contrast, a mechanism which integrates

and thresholds the cone signals, such as a retinal ganglion cell, integrates the cone

contrast all the way to the non-linear parts to produce reliable spike responses (see solid

curves in Figure 14 for a schematic illustration). In that situation, the difference integral

for the responses with the luminance step (blue) reaches spike thresholds (dashed

arrows) much more slowly than the integral for the responses with a steady mean

luminance (red) and, consequently, produces a lower spike frequency. With an

increasing contrast (dashed lines), however, the ganglion cell reaches its thresholds

increasingly at the same pace in the two situations, approaching (but never quite

reaching) the point where all information is based on the strictly linear part of the

photoreceptor response. As a result, the spike rates also converge with increasing

contrast (Figure 7A).

Page 37: Spatial and temporal determinants of context modulation in ...

37

Figure 14. A schematic presentation of spike generation by an integrating and thresholding operator from

integrals of cone contrast responses (Rmax-Rmean). Blue lines correspond to the step-up situation, red lines

to the steady-state situation. Solid lines show the situation for a low contrast stimulus, dashed lines for a

high contrast stimulus.

Study II tested whether the above presented principle, implemented within a realistic

computational model, could reproduce the psychophysical data of Study I. The

modelling study provides a proof of principle. The model correctly captures the

decrease of attenuation, and roughly the strength of attenuation, with parameters strictly

fixed by primate physiology.

The credibility of the model is further supported by the fact that the same

mechanisms also reproduced the main features of the classical pedestal effect, where the

contrast context underlying the target changes abruptly and simultaneously with target

onset. The pedestal effect has traditionally been explained with a hypothetical sigmoid-

shaped contrast-response function (Legge & Foley, 1980), where the initial dip in

thresholds would reflect the initial acceleration of the contrast-response function and the

subsequent rise would be caused by deceleration of the function at higher contrasts.

Later it has been indicated that the need for signals to exceed internal noise levels,

which are possibly signal-dependent, can also lead to the initial dip (Geisler & Albrecht,

1997; Gur, Beylin & Snodderly, 1997; Georgeson & Meese, 2006). Common to the

theories above is the assumption that discrimination is always determined by the

neurons most sensitive to the target stimulus. More current theories suggest that to

correctly grasp the discrimination performance of a psychophysical observer, it is

necessary to consider neurons with varying stimulus preferences (Chirimuuta &

Tolhurst, 2005; Goris, Wichmann & Henning, 2009).

The model presented here is not categorically at odds with any of the earlier theories.

For example, the idea of multiple units with varying preferences is inherent in the idea

Page 38: Spatial and temporal determinants of context modulation in ...

38

of units that read the cone responses at different parts of the grating. However, Study II

demonstrates that basic retinal mechanisms, which undoubtedly are at work in any case,

can generate the functions observed in psychophysical experiments. Thus, it seems that

the relevance of those mechanisms can only be dismissed by an explanation of where

and how the effect is lost and a similar function recreated by mechanisms upstream.

Since there are, at least on the surface, many differences between the two

psychophysical data sets of Study II, it may appear that the two simulations also rely on

quite different mechanisms. It is thus worthwhile to summarize the common aspects of

the two simulations. Firstly, cone responses were computed with identical functions and

parameters. Secondly, cone signals were transformed into spike rates with the same

functions and only one modest parameter change, a higher spike threshold in the

pedestal effect simulation. Even this adjustment is understandable since, in reality, there

are likely post-retinal thresholding mechanisms, such as the spike rate dependent spike

transfer probability of thalamic neurons (Carandini et al., 2007), which should not

significantly affect the perception of supra-threshold contrast. Thirdly, we used the cone

contrast as input to the thresholding integrator in the luminance step simulation but the

responses to each individual pixel of the grating in the pedestal effect. That, however, is

physiologically completely realistic, since parasol/Y ganglion cells of the primate retina

are capable of responding to both aspects of a grating stimulus (Crook et al., 2008;

Demb, Zaghloul, Haarsma & Sterling, 2001). Of course, it is not currently known how

the two response types are combined in different situations at higher levels of the visual

system. Finally, although the two main mechanisms are necessary in both simulations,

the linear-nonlinear cone responses are more critical in creating decelerating,

converging contrast response functions in the luminance step simulation. In the pedestal

effect simulation, in contrast, the threshold mechanism as such creates the dip and the

nature of cone signals becomes important on the later part of the dipper function.

5.2 Modulation caused by changes in the adjacent context

In surround suppression, the effect investigated in Studies III and IV, the target stimulus

itself remains completely unchanged and the modulation effect is instead caused by a

change in the immediately adjacent visual space. This brings about the need for an

additional mechanism, one which transmits the signal from neurons processing the

Page 39: Spatial and temporal determinants of context modulation in ...

39

surround area to neurons processing the centre. The identification of the mechanism(s)

can greatly benefit from characterization of the spatial and temporal principles involved.

The temporal principles involved in surround suppression (called contrast-contrast

suppression in the original publication) were determined in Study III. The determinants

of the optimal timing of the surround stimulus (∆L) were divided into three linearly

summed factors, according to equation 3.

)()()( lDALCLL (3),

where ∆L(C) is the latency difference due to different contrasts of the centre and

surround stimuli, ∆L(A) is the difference due to different stimulus areas and D(l) is the

delay caused by the mediation of the suppressive signal from the neurons processing the

surround to the neurons processing the centre, at the level where the two signals

interact.

The contrast-related latency difference ∆L(C) followed the contrast-latency

relationship of retinal ganglion cells (Donner, 1989), which is indicated by the similar

shape of the model predictions and the data patterns in Figure 9B. The relationship is in

line with the effects of contrast on reaction times (Donner & Fagerholm, 2003), and

arises from a thresholding mechanism acting on the early rise of photoreceptor

responses, similarly as outlined in Studies I and II. However, when delays due to the

transmission of suppression D(l) derived from the literature (Girard et al., 2001) were

included, it appeared that the observed suppression was consistently too fast. This is

indicated by the vertical distance between the data and the model predictions based

either on horizontal connections or on feedback connections (see labels horiz. and

feedb.) in Figure 9B.

Perhaps the larger area of the surround gave it an advantage in the latency

comparison by affecting ∆L(A). We therefore proceeded to measure the effect of

surround area. When the areas and contrasts of the centre and surround stimuli were

equal, the optimal presentation time of the surround stimulus was approximately 5 ms

before the centre (leftmost data points in Figure 10). This is in fair agreement with the

model predictions based on feedback connections, but irreconcilable with the

predictions based on horizontal connections. When surround size was increased to 2-

Page 40: Spatial and temporal determinants of context modulation in ...

40

fold and 3-fold, the relative latency of the surround decreased, although the average

propagation distance in cortex almost certainly increased. This is most easily

understood as reflecting acceleration in the surround signals due to increased surround

area. A similar acceleration effect of increased grating area has been observed in human

reaction times (Harwerth & Levi, 1978) and it can plausibly be explained by many

complementary mechanisms. On one hand, the neurons that mediate the suppressive

surround field get a stronger input to their RF and thus respond with a shorter latency

(see eg., Liu, Hashemi-Nezhad & Lyon, 2011). In addition, more neurons processing

the surround get an input and, as a result, more neurons processing the centre get their

surround fields stimulated, leading to probability summation type of acceleration.

In contrast, the subsequent slowing down of the surround signal with an increase

from 3-fold to 10-fold area is a much more puzzling result. It is true that the average

distance from the surround to the centre increases, but not very much. Why does the

area-related acceleration not overcome the increasing propagation distance after the 3-

fold surround area? It is, at this point, important to keep in mind that we measured the

time-course of the strongest suppression, not the fastest. In that light, it is interesting

that a recent study found three types of ECRFs in primate V1. All neurons show

surround suppression right after response onset, but after that, the suppression subsides

in about 25 % of neurons and turns to facilitation in a few percent of neurons (Liu et al.,

2011). As a result, the surround timing leading to the strongest suppression is likely the

result of a complex combination of faster surround components from longer distances

and the slower components from shorter distances. After all, Bair et al. (2003) did not

show that the connections underlying surround suppression were exclusively feedback,

only that they were too fast to be exclusively horizontal.

Regardless of the complexity of the underlying circuitry, understanding the spatial

structure of the ECRF mechanism could shed light on the complex relationship between

stimulus area and latency. In Study IV, three complementary methodologies were

combined to this end. Functional MRI was used to get an estimate from a specific level

of the visual system (e.g., V1). The drawback of fMRI is that it is known to be an

extremely indirect correlate of spiking activity (Attwell et al., 2010). However, in most

cases, the correlation is very strong (Lee et al., 2010; Nir, Dinstein, Malach & Heeger,

2008). Psychophysics was used to probe what the ECRF of a spatially localized (unlike

Page 41: Spatial and temporal determinants of context modulation in ...

41

dispersed as in fMRI) neural population could approximately look like. Also,

psychophysical data are likely to correlate closely with spiking activity, as signals can

only be transmitted to the level of decision making by spiking. Finally, computational

modelling was used to analyse the relation between fMRI, psychophysics and primate

single neurons.

It is quite encouraging that the data from the different methods mostly produce a

rather coherent picture, although no attempts were made to force the different data sets

into agreement via parameter adjustments or otherwise. The comparison of primate

single cell data and our psychophysical data on one hand, and the fMRI data and

modelling results on the other hand, suggests that the ECRF profile in human V1 is

quite similar to that in non-human primates. The notable discrepancy is that in the fMRI

data, suppression is much stronger and more far-reaching than in single cells,

psychophysics or in the model simulations. Some justifiable adjustments of the

parameters of the model neurons could improve the fit. However, we suspected that the

discrepancy is mainly caused by a dissociation of spiking and the BOLD signal. An

analogous dissociation in suppression extent has been observed between spiking and

synaptic activity in single neurons in cat V1 (Anderson, Lampl, Gillespie & Ferster,

2001). Synaptic activity, in turn, correlates with the BOLD signal even in certain

situations where spiking does not (Logothetis, Pauls, Augath, Trinath & Oeltermann,

2001).

In most of the model simulations, we used parameters from our psychophysical study

rather than the macaque single cell data. Although it might appear that our rationale of

using contrast discrimination thresholds to probe cortical responses is in contradiction

with our claims about the retinal origin of the pedestal effect in Study II, this is not the

case. While the effects considered in Study II probably originate in the retina, the

signals of course thereafter travel through the cortex, where they can be modulated, for

example, by stimulation within the surround fields of V1 neurons.

What about the relationship between the time-course and spatial extent of surround

suppression? If we now take our estimate of the surround field diameter, 6.2 degrees at

13 degree eccentricity, and take into consideration the change of surround size as a

function of eccentricity in primates (Cavanaugh et al., 2002a; Teichert, Wachtler,

Michler, Gail & Eckhorn, 2007), a rough approximation of the surround diameter in

Page 42: Spatial and temporal determinants of context modulation in ...

42

human fovea would be about 3 degrees. Although such a comparison is infested with

uncertainty, it is nevertheless interesting that the accelerating effect of surround area

ceased somewhere between diameters 2.1 and 3.4 degrees in Study III.

5.3 The temporal nature of the neural signal

Studies I, II, and III all support the idea that the critical part of the neural response to

stimulus onset in cone vision is extremely short and transient. Although modulation by

underlying (Study II) and adjacent contrast context (Study III) probably involves at least

partly different mechanisms, the time-courses are remarkably similar. Both in Study III

and an overlay masking study by Georgeson and Georgeson (1987), the modulating

stimulus is effective only if presented ± 100 ms from the target stimulus. This suggests

that in feed forward visual processing, the modulating stimulus must be able to affect

the target response during its short initial part in order to be effective. The critical role

of the early part has been shown in a variety of paradigms (Hegdé & Van Essen, 2004;

Ludwig et al., 2005; Müller et al., 2001; Osborne, Bialek & Lisberger, 2004; Watson &

Nachmias, 1977; however, see also Tolhurst, 1975). Some forms of higher-level

contextual modulation, where the perceptual significance of a stimulus is altered after

the primary coding of the stimulus, such as perceptual pop-out (Smith, Kelly & Lee,

2007; Zipser, Lamme & Schiller, 1996) and attentional modulation (Roelfsema, Lamme

& Spekreijse, 1998; Wannig, Stanisor & Roelfsema, 2011) may work on longer time

scales. The effect of the luminance change in Study I also lasts a bit longer, 200-400 ms,

probably because the system must not only cope with the transient neural signal caused

by the modulating stimulus, but must also adapt to the new mean luminance in order to

restore steady-state responsivity. Indeed, the time course observed in Study I is roughly

in agreement with fast adaptation observed in macaque cones (Schnapf et al., 1990).

Due to the dynamic nature of normal visual processing observed here and in many other

studies, it cannot be emphasized enough that changing certain properties of a

modulating stimulus (e.g., size or contrast) does not merely change its strength, but also

the timing of its neural representations. As a result, the changes observed in the strength

of modulation cannot be considered as direct indicators of the strength of the

modulating signal, but indicate the combined effect of strength and timing.

Page 43: Spatial and temporal determinants of context modulation in ...

43

5.4 Conclusions

The current study provides information about the spatial and temporal constraints that

affect the effectiveness of modulating visual signals. The thesis adds to the evidence

that the relevant part of neural signals is very short and transient (Studies I, II and III).

The thesis shows that basic retinal signalling is a plausible and powerful determinant of

the time window within which modulating signals can be effective, whether the

modulation arises from stimulation at the same (Study II) or adjacent (Study III)

location in the visual field. However, in the latter case, timing of effective modulation is

also affected by spatial properties of the surround in a way that suggests the necessity of

cortical mechanisms, probably feedforward-feedback-loops between primary visual

cortex and higher visual cortices (Studies III and IV).

Page 44: Spatial and temporal determinants of context modulation in ...

44

6. References

Anderson, J. S., Lampl, I., Gillespie, D. C., & Ferster, D. (2001). Membrane potential and conductance changes underlying length tuning of cells in cat primary visual cortex. Journal of Neuroscience, 21, 2104-2112.

Angelucci, A., Levitt, J. B., Walton, E. J. S., Hupe, J. M., Bullier, J., & Lund, J. S. (2002). Circuits for local and global signal integration in primary visual cortex. Journal of Neuroscience, 22, 8633-8646.

Attwell, D., Buchan, A. M., Charpak, S., Lauritzen, M., MacVicar, B. A., & Newman, E. A. (2010). Glial and neuronal control of brain blood flow. Nature, 468, 232-243.

Bair, W., Cavanaugh, J. R., & Movshon, J. A. (2003). Time course and time-distance relationships for surround suppression in macaque V1 neurons. Journal of Neuroscience, 23, 7690-7701.

Bartfeld, E., & Grinvald, A. (1992). Relationships between orientation-preference pinwheels, cytochrome oxidase blobs, and ocular-dominance columns in primate striate cortex. Proceedings of the National Academy of Sciences, 89, 11905-11909.

Baylor, D. A., Hodgkin, A. L., & Lamb, T. D. (1974). Electrical response of turtle cones to flashes and steps of light. Journal of Physiology-London, 242, 685-727.

Bonin, V., Mante, V., & Carandini, M. (2005). The suppressive field of neurons in lateral geniculate nucleus. Journal of Neuroscience, 25, 10844-10856.

Brivanlou, I. H., Warland, D. K., & Meister, M. (1998). Mechanisms of concerted firing among retinal ganglion cells. Neuron, 20, 527-539.

Cannon, M. W., & Fullenkamp, S. C. (1991). Spatial interactions in apparent contrast - inhibitory effects among grating patterns of different spatial-frequencies, spatial positions and orientations. Vision Research, 31, 1985-1998.

Carandini, M., & Heeger, D. J. (2012). Normalization as a canonical neural computation. Nature Reviews Neuroscience, 13, 51-62.

Carandini, M., Horton, J. C., & Sincich, L. C. (2007). Thalamic filtering of retinal spike trains by postsynaptic summation. Journal of Vision, 7.

Casagrande, V. A., & Kaas, J. H. (1994). The afferent, intrinsic, and efferent connections of primary visual cortex in primates. In A. Peters & K. Rockland (Eds.), Cerebral cortex (Vol. 10, pp. 201-259). New York: Plenum Press.

Cavanaugh, J. R., Bair, W., & Movshon, J. A. (2002a). Nature and interaction of signals from the receptive field center and surround in macaque V1 neurons. Journal of Neurophysiology, 88, 2530-2546.

Cavanaugh, J. R., Bair, W., & Movshon, J. A. (2002b). Selectivity and spatial distribution of signals from the receptive field surround in macaque V1 neurons. Journal of Neurophysiology, 88, 2547 - 2556.

Chirimuuta, M., & Tolhurst, D. J. (2005). Does a bayesian model of V1 contrast coding offer a neurophysiological account of human contrast discrimination? Vision Research, 45, 2943-2959.

Croner, L. J., Purpura, K., & Kaplan, E. (1993). Response variability in retinal ganglion-cells of primates. Proceedings of the National Academy of Sciences of the United States of America, 90, 8128-8130.

Crook, J. D., Peterson, B. B., Packer, O. S., Robinson, F. R., Troy, J. B., & Dacey, D. M. (2008). Y-cell receptive field and collicular projection of parasol ganglion cells in macaque monkey retina. Journal of Neuroscience, 28, 11277-11291.

Cumming, B. G., & Parker, A. J. (1999). Binocular neurons in V1 of awake monkeys are selective for absolute, not relative, disparity. Journal of Neuroscience, 19, 5602-5618.

Dacey, D. M. (2007). Physiology, morphology and spatial densities of identified ganglion cell types in primate retina. In G. Bock & J. Goode (Eds.), Ciba foundation symposium 184 - Higher-order processing in the visual system (pp. 12-34). Chichester: John Wiley & Sons, Ltd.

Demb, J. B., Zaghloul, K., Haarsma, L., & Sterling, P. (2001). Bipolar cells contribute to nonlinear spatial summation in the brisk-transient (Y) ganglion cell in mammalian retina. Journal of Neuroscience, 21, 7447-7454.

Desai, P., Reidy, A., Minassian, D. C., Vafidis, G., & Bolger, J. (1996). Gains from cataract surgery: Visual function and quality of life. British Journal of Ophthalmology, 80, 868-873.

Donner, K. (1981). Receptive-fields of frog retinal ganglion-cells - response formation and light-dark-adaptation. Journal of Physiology-London, 319, 131-142.

Donner, K. (1989). Visual latency and brightness: An interpretation based on the responses of rods and ganglion-cells in the frog retina. Visual Neuroscience, 3, 39-51.

Page 45: Spatial and temporal determinants of context modulation in ...

45

Donner, K., & Fagerholm, P. (2003). Visual reaction time: Neural conditions for the equivalence of stimulus area and contrast. Vision Research, 43, 2937-2940.

Duncan, R. O., & Boynton, G. M. (2003). Cortical magnification within human primary visual cortex correlates with acuity thresholds. Neuron, 38, 659-671.

Ejima, Y., & Takahashi, S. (1983). Effects of high-contrast peripheral patterns on the detection threshold of sinusoidal targets. Journal of the Optical Society of America, 73, 1695-1700.

Ejima, Y., & Takahashi, S. (1985). Apparent contrast of a sinusoidal grating in the simultaneous presence of peripheral gratings. Vision Research, 25, 1223-1232.

Field, G. D., & Chichilnisky, E. J. (2007). Information processing in the primate retina: Circuitry and coding. Annual Review of Neuroscience, 30, 1-30.

Frackowiak, R. S., Friston, K. J., Frith, C. D., Dolan, R., Ashburner, J., Price, C., et al. (2003). Human brain function (2nd ed.). New York, NY: Academic Press.

Frazor, R. A., & Geisler, W. S. (2006). Local luminance and contrast in natural images. Vision Research, 46, 1585-1598.

Friedburg, C., Thomas, M. M., & Lamb, T. D. (2001). Time course of the flash response of dark- and light-adapted human rod photoreceptors derived from the electroretinogram. Journal of Physiology-London, 534, 217-242.

Geisler, W. S. (1978). Adaptation, afterimages and cone saturation. Vision Research, 18, 279-289. Geisler, W. S., & Albrecht, D. G. (1997). Visual cortex neurons in monkeys and cats: Detection,

discrimination, and identification. Visual Neuroscience, 14, 897-919. Geisler, W. S., Albrecht, D. G., & Crane, A. M. (2007). Responses of neurons in primary visual cortex to

transient changes in local contrast and luminance. Journal of Neuroscience, 27, 5063-5067. Georgeson, M. A., & Georgeson, J. M. (1987). Facilitation and masking of briefly presented gratings -

time-course and contrast dependence. Vision Research, 27, 369-379. Georgeson, M. A., & Meese, T. S. (2006). Fixed or variable noise in contrast discrimination? The jury's

still out. Vision Research, 46, 4294-4303. Gilbert, C. D., & Wiesel, T. N. (1989). Columnar specificity of intrinsic horizontal and corticocortical

connections in cat visual cortex. Journal of Neuroscience, 9, 2432-2442. Girard, P., Hupe, J. M., & Bullier, J. (2001). Feedforward and feedback connections between areas V1

and V2 of the monkey have similar rapid conduction velocities. Journal of Neurophysiology, 85, 1328-1331.

Gollisch, T., & Meister, M. (2010). Eye smarter than scientists believed: Neural computations in circuits of the retina. Neuron, 65, 150-164.

Goris, R. L. T., Wichmann, F. A., & Henning, G. B. (2009). A neurophysiologically plausible population code model for human contrast discrimination. Journal of Vision, 9.

Grigorescu, C., Petkov, N., & Westenberg, M. A. (2003). Contour detection based on nonclassical receptive field inhibition. IEEE Transactions on Image Processing, 12, 729-739.

Grinvald, A., Lieke, E. E., Frostig, R. D., & Hildesheim, R. (1994). Cortical point-spread function and long-range lateral interactions revealed by real-time optical imaging of macaque monkey primary visual-cortex. Journal of Neuroscience, 14, 2545-2568.

Gur, M., Beylin, A., & Snodderly, D. M. (1997). Response variability of neurons in primary visual cortex (V1) of alert monkeys. The Journal of Neuroscience, 17, 2914-2920.

Harwerth, R. S., & Levi, D. M. (1978). Reaction-time as a measure of suprathreshold grating detection. Vision Research, 18, 1579-1586.

Hegdé, J., & Van Essen, D. C. (2004). Temporal dynamics of shape analysis in macaque visual area V2. Journal of Neurophysiology, 92, 3030-3042.

Henning, G. B., & Wichmann, F. A. (2007). Some observations on the pedestal effect. Journal of Vision, 7.

Hood, D. C., & Birch, D. G. (1993). Human cone receptor activity - the leading-edge of the alpha-wave and models of receptor activity. Visual Neuroscience, 10, 857-871.

Hubel, D. H., & Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in cats visual cortex. Journal of Physiology-London, 160, 106-154.

Hubel, D. H., & Wiesel, T. N. (1968). Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology-London, 195, 215-243.

Hupé, J.-M., James, A. C., Girard, P., & Bullier, J. (2001). Response modulations by static texture surround in area V1 of the macaque monkey do not depend on feedback connections from V2. Journal of Neurophysiology, 85, 146-163.

Page 46: Spatial and temporal determinants of context modulation in ...

46

Kilpeläinen, M., Nurminen, L., & Donner, K. (2011). Effects of mean luminance changes on human contrast perception: Contrast dependence, time-course and spatial specificity. PLoS ONE, 6, e17200.

Klein, S. A. (2006). Separating transducer non-linearities and multiplicative noise in contrast discrimination. Vision Research, 46, 4279-4293.

Knierim, J. J., & van Essen, D. C. (1992). Neuronal responses to static texture patterns in area V1 of the alert macaque monkey. Journal of Neurophysiology, 67, 961-980.

Krumpaszky, H. G., Dietz, K., Mickler, A., & Selbmann, H. K. (1999). Mortality in blind subjects. Ophthalmologica, 213, 48-53.

Kurki, I., Peromaa, T., Hyvärinen, A., & Saarinen, J. (2009). Visual features underlying perceived brightness as revealed by classification images. PLoS ONE, 4, e7432.

Lamb, T. D., & Pugh, E. N. (1992). A quantitative account of the activation steps involved in phototransduction in amphibian photoreceptors. Journal of Physiology-London, 449, 719-758.

Lee, J. H., Durand, R., Gradinaru, V., Zhang, F., Goshen, I., Kim, D.-S., et al. (2010). Global and local fMRI signals driven by neurons defined optogenetically by type and wiring. Nature, 465, 788-792.

Legge, G. E., & Foley, J. M. (1980). Contrast masking in human vision. Journal of the Optical Society of America, 70, 1458-1471.

Liu, Y. J., Hashemi-Nezhad, M., & Lyon, D. C. (2011). Dynamics of extraclassical surround modulation in three types of V1 neurons. Journal of Neurophysiology, 105, 1306-1317.

Logothetis, N. K., Pauls, J., Augath, M., Trinath, T., & Oeltermann, A. (2001). Neurophysiological investigation of the basis of the fMRI signal. Nature, 412, 150-157.

Ludwig, C. J. H., Gilchrist, I. D., McSorley, E., & Baddeley, R. J. (2005). The temporal impulse response underlying saccadic decisions. Journal of Neuroscience, 25, 9907-9912.

Maffei, L., & Fiorentini, A. (1976). The unresponsive regions of visual cortical receptive-fields. Vision Research, 16, 1131-1139.

Mante, V., Frazor, R. A., Bonin, V., Geisler, W. S., & Carandini, M. (2005). Independence of luminance and contrast in natural scenes and in the early visual system. Nature Neuroscience, 8, 1690-1697.

McAlonan, K., Cavanaugh, J., & Wurtz, R. H. (2008). Guarding the gateway to cortex with attention in visual thalamus. Nature, 456, 391-394.

Müller, J. R., Metha, A. B., Krauskopf, J., & Lennie, P. (2001). Information conveyed by onset transients in responses of striate cortical neurons. Journal of Neuroscience, 21, 6978-6990.

Nachmias, J., & Sansbury, R. V. (1974). Grating contrast - discrimination may be better than detection. Vision Research, 14, 1039-1042.

Nir, Y., Dinstein, I., Malach, R., & Heeger, D. J. (2008). BOLD and spiking activity. Nature Neuroscience, 11, 523-524.

Niven, J. E., & Laughlin, S. B. (2008). Energy limitation as a selective pressure on the evolution of sensory systems. Journal of Experimental Biology, 211, 1792-1804.

Nothdurft, H.-C., Gallant, J. L., & Van Essen, D. C. (1999). Response modulation by texture surround in primate area V1: Correlates of "Popout" Under anesthesia. Visual Neuroscience, 16, 15-34.

Nurminen, L., Peromaa, T., & Laurinen, P. (2010). Surround suppression and facilitation in the fovea: Very long-range spatial interactions in contrast perception. Journal of Vision, 10.

O'Connor, D. H., Fukui, M. M., Pinsk, M. A., & Kastner, S. (2002). Attention modulates responses in the human lateral geniculate nucleus. Nature Neuroscience, 5, 1203-1209.

Olzak, L. A., & Laurinen, P. I. (1999). Multiple gain control processes in contrast-contrast phenomena. Vision Research, 39, 3983-3987.

Osborne, L. C., Bialek, W., & Lisberger, S. G. (2004). Time course of information about motion direction in visual area MT of macaque monkeys. Journal of Neuroscience, 24, 3210-3222.

Parker, A. J. (2007). Binocular depth perception and the cerebral cortex. Nature Reviews Neuroscience, 8, 379-391.

Poot, L., Snippe, H. P., & van Hateren, J. H. (1997). Dynamics of adaptation at high luminances: Adaptation is faster after luminance decrements than after luminance increments. Journal of the Optical Society of America A-Optics Image Science and Vision, 14, 2499-2508.

Ridder, W. (2006). Visual evoked potentials in animals. In J. R. Heckenlively & G. B. Arden (Eds.), Principles and practice of clinical electrophysiology of vision (2 ed.). Cambridge, MA: MIT Press.

Rieke, F., & Baylor, D. A. (1998). Single-photon detection by rod cells of the retina. Reviews of Modern Physics, 70, 1027.

Rieke, F., & Rudd, M. E. (2009). The challenges natural images pose for visual adaptation. Neuron, 64, 605-616.

Page 47: Spatial and temporal determinants of context modulation in ...

47

Roelfsema, P. R., Lamme, V. A. F., & Spekreijse, H. (1998). Object-based attention in the primary visual cortex of the macaque monkey. Nature, 395, 376-381.

Sceniak, M. P., Hawken, M. J., & Shapley, R. (2001). Visual spatial characterization of macaque V1 neurons. Journal of Neurophysiology, 85, 1873-1887.

Schnapf, J. L., Nunn, B. J., Meister, M., & Baylor, D. A. (1990). Visual transduction in cones of the monkey macaca fascicularis. Journal of Physiology-London, 427, 681-713.

Schwabe, L., Obermayer, K., Angelucci, A., & Bressloff, P. C. (2006). The role of feedback in shaping the extra-classical receptive field of cortical neurons: A recurrent network model. Journal of Neuroscience, 26, 9117-9129.

Sereno, M. I., Dale, A. M., Reppas, J. B., Kwong, K. K., Belliveau, J. W., Brady, T. J., et al. (1995). Borders of multiple visual areas in humans revealed by functional magnetic-resonance-imaging. Science, 268, 889-893.

Seriès, P., Lorenceau, J., & Frégnac, Y. (2003). The "Silent" Surround of V1 receptive fields: Theory and experiments. Journal of Physiology-Paris, 97, 453-474.

Shapley, R., & Enroth-Cugell, C. (1984). Visual adaptation and retinal gain controls. In N. Osborne & G. Chader (Eds.), Progress in retinal research (Vol. 3, pp. 263-346). Oxford: Pergamon Press.

Shapley, R. M., & Victor, J. D. (1979). Nonlinear spatial summation and the contrast gain control of cat retinal ganglion cells. Journal of Physiology,290, 141-161.

Shushruth, S., Ichida, J. M., Levitt, J. B., & Angelucci, A. (2009). Comparison of spatial summation properties of neurons in macaque V1 and V2. Journal of Neurophysiology, 102, 2069-2083.

Smith, M. A., Kelly, R. C., & Lee, T. S. (2007). Dynamics of response to perceptual pop-out stimuli in macaque V1. Journal of Neurophysiology, 98, 3436-3449.

Solomon, S. G., Lee, B. B., & Sun, H. (2006). Suppressive surrounds and contrast gain in magnocellular-pathway retinal ganglion cells of macaque. Journal Of Neuroscience, 26, 8715-8726.

Solomon, S. G., White, A. J. R., & Martin, P. R. (2002). Extraclassical receptive field properties of parvocellular, magnocellular, and koniocellular cells in the primate lateral geniculate nucleus. Journal of Neuroscience, 22, 338-349.

Stevens, C. F. (2001). An evolutionary scaling law for the primate visual system and its basis in cortical function. Nature, 411, 193-195.

Teichert, T., Wachtler, T., Michler, F., Gail, A., & Eckhorn, R. (2007). Scale-invariance of receptive field properties in primary visual cortex. BMC Neuroscience, 8, 38.

Tolhurst, D. J. (1975). Reaction-times in detection of gratings by human observers - probabilistic mechanism. Vision Research, 15, 1143-1149.

Valeton, J. M., & Van Norren, D. (1983). Light adaptation of primate cones: An analysis based on extracellular data. Vision Research, 23, 1539-1547.

Walraven, J., Enroth-Cugell, C., Hood, D. C., MacLeod, D. I. A., & Schnapf, J. L. (1990). The control of visual sensitivity: Receptoral and postreceptoral processes. In L. Spillmann & J. S. Werner (Eds.), Visual perception: The neurophysiological foundations (pp. 53-101). San Diego, CA: Academic Press, Inc.

Van Essen, D. C., & Maunsell, J. H. R. (1983). Hierarchical organization and functional streams in the visual cortex. Trends in Neurosciences, 6, 370-375.

Vanni, S., Henriksson, L., & James, A. C. (2005). Multifocal fMRI mapping of visual cortical areas. Neuroimage, 27, 95-105.

Wannig, A., Stanisor, L., & Roelfsema, P. R. (2011). Automatic spread of attentional response modulation along gestalt criteria in primary visual cortex. Nature Neuroscience, 14, 1243-1244.

Watson, A. B., & Nachmias, J. (1977). Patterns of temporal interaction in the detection of gratings. Vision Research, 17, 893-902.

Webb, B. S., Dhruv, N. T., Solomon, S. G., Tailby, C., & Lennie, P. (2005). Early and late mechanisms of surround suppression in striate cortex of macaque. Journal of Neuroscience, 25, 11666-11675.

Weber, A. J., & Harman, C. D. (2005). Structure-function relations of parasol cells in the normal and glaucomatous primate retina. Investigative Ophthalmology and Visual Science, 46, 3197-3207.

Werblin, F. S. (2011). The retinal hypercircuit: A repeating synaptic interactive motif underlying visual function. The Journal of Physiology, 589, 3691-3702.

Westheimer, G. (1967). Spatial interaction in human cone vision. Journal of Physiology-London, 190, 139-154.

Whittle, P. (1994). The psychophysics of contrast-brightness. In Gilchrist (Ed.), Lightness, brightness and transparency (pp. 35–110). Hillsdale, NJ: Lawrence Erlbaum Associates.

Page 48: Spatial and temporal determinants of context modulation in ...

48

Williams, A. L., Singh, K. D., & Smith, A. T. (2003). Surround modulation measured with functional MRI in the human visual cortex. Journal of Neurophysiology, 89, 525-533.

Zenger-Landolt, B., & Heeger, D. J. (2003). Response suppression in V1 agrees with psychophysics of surround masking. Journal of Neuroscience, 23, 6884-6893.

Zhang, B., Zheng, J. H., Watanabe, I., Maruko, I., Bi, H., Smith, E. L., et al. (2005). Delayed maturation of receptive field center/surround mechanisms in V2. Proceedings of the National Academy of Sciences of the United States of America, 102, 5862-5867.

Zipser, K., Lamme, V. A. F., & Schiller, P. H. (1996). Contextual modulation in primary visual cortex. Journal of Neuroscience, 16, 7376-7389.