This seminar is about how cognition (especially visual perception) connects with the world The...


Page 1: This seminar is about how cognition (especially visual perception) connects with the world

The central concept will be the notion of “picking out” or selecting. The usual mechanism appealed to in explaining this selection is attention (sometimes called focal attention or selective attention).

Why do we need to select? This is a nontrivial question, and we will consider several different answers:

• We need to select because we can’t process all the information available. This is the resource-limitation reason.
• We need to select because of the way relevant information in the world is packaged. This gives rise to the Binding Problem.
• We need to select because certain patterns cannot be computed without first marking certain special elements of a scene.
• We need to select because selection is the first line of contact between the mind and the world, and it precedes all conceptualizing and encoding.

Page 2: Attention in Psychology: Historical Background

Attention was one of the first concepts to appear in Psychology texts (ca 1730) – e.g., Ebbinghaus, Titchener, …

Early discussions (Hatfield, 1998) focused on properties such as:

• Narrowing of range of sensitivity (Aristotle, 4th century BC)
• Active Directing (Lucretius, 1st century AD)
• Involuntary shifts (Augustine of Hippo, 400 AD)
• Clarity (Buridan, 14th century)
• Fixation over time (Descartes, 17th century)
• Laws of Attention (Titchener, 1908)
  • Independence of clarity and other attributes (e.g., loudness)
  • Law of two levels of clarity (focus vs non-focus)
  • Law of accommodation (cuing) and law of Inertia (disengagement)
  • Law of prior entry (attended stimuli have temporal priority)
• All the above phenomena (William James, early 1900s)

Page 3: The functions of focal attention

A central notion in the present analysis is the notion of “picking out” or selecting. The usual mechanism that is appealed to in explaining perceptual selection is attention (sometimes called focal attention or selective attention).

Why must we select anyway? This is a rarely asked question to which there are several answers:

• We need to select because we can’t process all the information available. This is the resource-limitation reason. <But in what ways is it limited? Along what dimensions?>
• We need to select because certain patterns cannot be computed without first marking certain special elements of a scene.
• We need to select because of the way relevant information in the world is packaged (Strawson’s Collecting Principles). It is a response to the Binding Problem.
• We need to select because selection is a consequence of the first line of causal contact between mind and world: it precedes all conceptualizing and predicating.

Page 4: Attention and Selection

I will first concentrate on the Selection or Filtering aspects of attention. I will ask:

1. Why do we need to select anyway? Because our processing capacity is limited?
  • The Big Question: In what way is it limited? (Miller, 1956)
  • We will return to this core question after some preliminaries on the early study of attention as selection and the filter theory.

2. On what basis do we select? Some alternatives:
  • We select according to what is important to us (e.g., affordances)
  • We select what can be described physically (i.e., “channels”)
  • We select based on what can be encoded without accessing LTM
  • We “pick out” things to which we subsequently attach concepts: i.e., we pick out objects (or regions?)

3. What happens to what we have not selected? A largely unsolved mystery (though in some cases there are plausible answers).

Page 5: Big Question #1: Why do we need to select information? Because capacity is limited.

Along which dimensions is human information processing capacity limited?

• Channel capacity: Shannon-Hartley Theorem
• Capacity measured in some sort of “chunks” (Miller)
• Capacity measured in terms of the number of arguments that can be simultaneously bound to cognitive routines (Newell)

To what things in the world can the arguments of visual predicates be bound?

Page 6: Amount of information in terms of the Information-theoretic measure (entropy)

• Amount of information in a signal depends on how much one’s estimate of the probability of events is changed by the signal.
• $H = -\sum_i p_i \log_2 p_i$ … information in bits
• “One if by land, two if by sea” contains one bit of information if the two possibilities were equally likely, less if they were not (e.g., if one was twice as likely as the other the information in the message would be $-(\tfrac{1}{3}\log_2\tfrac{1}{3} + \tfrac{2}{3}\log_2\tfrac{2}{3}) = 0.92$ bits <using Excel>)
• The amount of information transmitted depends on the potential amount of information in the message and the amount of correlation between message sent and message received. So information transmitted is a type of I-O correlation measure.
• The information measure is an “ideal receiver” or competence measure. It is the maximum information that could be transmitted, given the statistical properties of messages, assuming that the sender and receiver know the code.
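The entropy arithmetic on this slide can be checked in a few lines of Python. This is only a minimal sketch: the function name `entropy_bits` is made up for illustration, and the slide's own figure was obtained in Excel.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy H = -sum(p * log2(p)), in bits.

    Outcomes with probability zero contribute nothing to the sum.
    """
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Two equally likely signals ("by land" vs "by sea").
equal = entropy_bits([0.5, 0.5])

# One signal twice as likely as the other.
skewed = entropy_bits([1 / 3, 2 / 3])

print(f"equally likely: {equal:.2f} bits")
print(f"2:1 odds:       {skewed:.2f} bits")
```

The equally likely case gives exactly one bit, and the 2:1 case gives roughly 0.92 bits, which is the value quoted on the slide.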

Page 7: Information transmitted in a typical absolute judgment experiment

Information transmitted in an experiment in which subjects were presented with tones drawn from a known practiced set (of a given size, which determines the value of input information) and had to name the tones from a learned name set.

The information transmitted was always around 2.5 bits, which corresponds to about 6 equiprobable alternatives (2^2.5 ≈ 5.7)!
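One way to make “information transmitted” concrete is to compute the mutual information between stimulus and response from a confusion matrix of counts. The sketch below assumes that reading of the measure; the function name `transmitted_bits` and both count matrices are hypothetical illustrations, not data from the experiment.

```python
import math

def transmitted_bits(counts):
    """Mutual information (bits) between stimuli (rows) and responses (columns).

    `counts[i][j]` is how often stimulus i was given response j.
    """
    total = sum(sum(row) for row in counts)
    row_p = [sum(row) / total for row in counts]
    col_p = [sum(row[j] for row in counts) / total for j in range(len(counts[0]))]
    bits = 0.0
    for i, row in enumerate(counts):
        for j, n in enumerate(row):
            if n:  # empty cells contribute nothing
                p = n / total
                bits += p * math.log2(p / (row_p[i] * col_p[j]))
    return bits

# Hypothetical data: four tones, every one named correctly.
perfect = [[25, 0, 0, 0], [0, 25, 0, 0], [0, 0, 25, 0], [0, 0, 0, 25]]
# Hypothetical data: two tones, responses unrelated to the stimulus.
guessing = [[10, 10], [10, 10]]

print(transmitted_bits(perfect))   # all of the 2 bits of input get through
print(transmitted_bits(guessing))  # nothing gets through
print(2 ** 2.5)                    # equiprobable alternatives equivalent to 2.5 bits
```

With perfect naming the transmitted information equals the input information, and with pure guessing it is zero; a ceiling near 2.5 bits lies between these extremes for larger tone sets.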

Page 8: Short term memory capacity is independent of the amount of information per item!

This table shows the STM capacity as a function of the type of item. The number of items recalled remains roughly constant (at 7±2) while the amount of information recalled increases rapidly as the information carried by each item increases.

Type of item    Number of alternatives    Bits per item    Items recalled    Transmitted info (bits)
Binary digit    2                         1                9                 9
Digit           10                        3.32             8                 27
Letter          26                        4.7              8                 38
Letter/Digit    36                        5.2              8                 42
Syllables       100                       6.65             7.5               50
All words       20,000                    14.28            7                 100
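The two derived columns of the table follow from the others: bits per item is the base-2 logarithm of the number of alternatives, and transmitted information is bits per item times items recalled. The sketch below recomputes them under that assumption; the names `ROWS`, `bits_per_item` and `recalled_bits` are made up for illustration, and the results agree with the slide only to within rounding.

```python
import math

# (type of item, number of alternatives, items recalled, transmitted bits as given on the slide)
ROWS = [
    ("Binary digit", 2, 9, 9),
    ("Digit", 10, 8, 27),
    ("Letter", 26, 8, 38),
    ("Letter/Digit", 36, 8, 42),
    ("Syllables", 100, 7.5, 50),
    ("All words", 20000, 7, 100),
]

def bits_per_item(alternatives):
    """Information carried by one item drawn from equally likely alternatives."""
    return math.log2(alternatives)

def recalled_bits(alternatives, items_recalled):
    """Total information recalled: bits per item times number of items."""
    return bits_per_item(alternatives) * items_recalled

for name, alternatives, items, slide_value in ROWS:
    computed = recalled_bits(alternatives, items)
    print(f"{name:<13} {bits_per_item(alternatives):6.2f} bits/item  "
          f"{computed:6.1f} bits recalled (slide: {slide_value})")
```

The number of items stays between 7 and 9 in every row while the recalled information grows by roughly a factor of ten, which is the point of the slide.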

Page 9: Why can we retain different amounts of information just by using a different encoding vocabulary?

Answer: The architecture of the cognitive system has the property that it can deal with a fixed maximum number of items, regardless of what the items are.

This property can be exploited to get around the bottleneck of short-term memory. We do this by recoding the input into a smaller number of discrete units, called chunks.

• There is also evidence that it takes additional time to encode and decode chunks, so the recoding technique is a case of time-capacity tradeoff or what is known in CS as a compute-vs-store tradeoff.
• Allen Newell’s novel model to account for the time taken in the Sternberg memory scan experiment attributes the observed RT to encoding or chunking.

Page 10: Example of the use of chunking

•To recall a string of binary bits – e.g., 00101110101110110101001

•People can recall a string of about 8 binary digits. If they learn a binary encoding rule (00→0, 01→1, 10→2, 11→3) they can recall about 8 such chunks, or 16 binary bits. If they learn a 3:1 chunking rule (called the Octal number system) they can recall a 24-bit string, etc.
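The recoding rules above can be sketched as a small round-trip in Python. The helper names `chunk_bits` and `unchunk` are made up for illustration, and the 24-bit string is the slide's 23-bit example with one assumed trailing 0 added so that it divides evenly into groups of 2 and 3.

```python
def chunk_bits(bits, size):
    """Recode a binary string into one digit per group of `size` bits.

    size=2 gives the 2:1 rule (00->0, 01->1, 10->2, 11->3); size=3 gives octal.
    """
    if len(bits) % size:
        raise ValueError("bit string length must be a multiple of the chunk size")
    return [int(bits[i:i + size], 2) for i in range(0, len(bits), size)]

def unchunk(digits, size):
    """Decode chunks back into the original binary string."""
    return "".join(format(d, f"0{size}b") for d in digits)

# Slide example padded with one extra 0 (an assumption) to reach 24 bits.
bits = "00101110101110110101001" + "0"

pairs = chunk_bits(bits, 2)   # 12 chunks: still more than about 8 items
octal = chunk_bits(bits, 3)   # 8 chunks: within the span

print(len(pairs), pairs)
print(len(octal), octal)
print(unchunk(octal, 3) == bits)
```

The 24 bits become 8 octal chunks, so the same span of about 8 items now carries three times as many bits; the cost is the extra encoding and decoding step.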

Page 11: Early studies: Colin Cherry’s “Cocktail Party Problem”

• What determines how well you can select one conversation among several? Why are we so good at it?
• The more controlled version of this study used dichotic presentations – one “channel” per ear.
• Cherry found that when attention is fully occupied in selecting information from one ear (through use of the “shadowing” task), almost nothing is noticed in the “rejected” ear (only whether or not it was speech).
• More careful observations show this was not quite true:
  • Change in spectral properties (pitch) is noticed
  • You are likely to notice your name spoken
  • Even meaning is extracted, as shown by involuntary ear switching and the disambiguating effect of rejected channel content

Page 12: Broadbent’s Filter Theory

Broadbent, D. E. (1958). Perception and Communication. London: Pergamon Press.

[Figure: Broadbent’s filter model. Box labels: Senses; Very Short Term Store; Filter; Limited Capacity Channel; Store of conditional probabilities of past events (in LTM); Motor planner; Effectors; Rehearsal loop]

Page 13: Stroop Effect

Baseline: Name the colors of the ink

Page 14: Stroop Effect in English

Name the colors of the ink

RED GREEN BLUE PINK BROWN ORANGE GREEN PINK RED YELLOW GREEN YELLOW RED BROWN RED BLUE BROWN GREEN RED ORANGE RED BLUE YELLOW PINK ORANGE GREEN BLUE BROWN PINK RED YELLOW GREEN YELLOW RED BROWN PINK RED YELLOW GREEN YELLOW RED PINK ORANGE GREEN BLUE BROWN PINK RED YELLOW GREEN YELLOW RED BROWN RED BLUE GREEN BROWN YELLOW GREEN YELLOW RED PINK ORANGE GREEN RED BLUE BROWN GREEN RED ORANGE RED BLUE YELLOW YELLOW GREEN YELLOW RED BROWN PINK RED YELLOW GREEN PINK RED YELLOW

Page 15: Stroop Effect in Portuguese

Name the colors of the ink

VERMELHO VERDE AZUL MARROM ROSA ALARANJADO VERDE ROSA VERMELHO AMARELO VERDE AMARELO VERMELHO MARROM VERMELHO AZUL MARROM VERDE VERMELHO ALARANJADO VERMELHO AZUL AMARELO ROSA ALARANJADO VERDE AZUL MARROM ROSA VERMELHO AMARELO VERDE AMARELO VERMELHO MARROM ROSA VERMELHO AMARELO VERDE AMARELO VERMELHO ROSA ALARANJADO VERDE AZUL MARROM ROSA VERMELHO AMARELO VERDE AMARELO VERMELHO BROWN VERMELHO AZUL MARROM VERDE AMARELO VERDE AMARELO VERMELHO ROSA ALARANJADO VERDE VERMELHO AZUL MARROM VERDE VERMELHO ALARANJADO VERMELHO AZUL

Page 16: Visual analogues illustrating the two-channel selection problem

In these examples you are to read only the text in shadows and ignore the rest. Read as quickly as you can and when you are finished, close your eyes or look away from the text.

Page 17: Visual analogue #1 illustrating the two-channel selection problem

In performing an experiment like this one on man attention car it house is boy critically hat important she that candy the old material horse that tree is pen being phone read cow by book the hot subject tape for pin the stand relevant view task sky be read cohesive man and car gramatically house complete boy but hat without shoe either candy being horse so tree easy pen that phone full cow attention book is hot not tape required pin in stand order view to sky read red it nor too difficult.

Page 18: Visual analogue #2 illustrating the two-channel selection problem

It is important that the subject man be car pushed slightly boy beyond that his normal limits horse of tree competence open for be only in phone this cow way book can hot one tape be pin certain stand that snaps he with is his paying teeth attention in to the the empty relevant air task and rather minimal than to the attention candy to horse the tree second or peripheral task.

Page 19: Degree of Interference of the attended message, as well as its interpretation, shows that the rejected message was understood

• Moral: Although the rejected channel appears to be rejected, it is being processed enough to understand the words!
• The semantic interpretation of the attended message depends on the meaning content of the rejected message. Subjects were asked to paraphrase the attended message in:
  • Channel 1 (attended): “I think I will go down to the bank but I will be back for dinner”
  • Channel 2 (rejected): “The election results will depend on the value of the dollar against the Euro and on the state of the domestic economy”
  • OR Channel 2 (rejected): “The rain has resulted in erosion by the overflowing river”

(Lackner, J. R., & Garrett, M. F. (1972). Resolving ambiguity: Effects of biasing context in the unattended ear. Cognition, 1, 359-372.)

Page 20: From here on I will focus on the special case of visual attention

• Visual working memory and visual selection
• What is the nature of the input, storage and information processing limits in vision?

Page 21: Studies of the capacity of Visual Working Memory (Luck & Vogel, 1997)

People appear to be able to retain about 4 properties of an object (4 colors, 4 shapes, 4 orientations, etc) over a short time

People can also retain the identity of 4 objects for a short time.

Luck and Vogel found that as long as there are not more than 4 properties per object, people can retain large numbers of properties when the properties are on different objects (a phenomenon that is reminiscent of Miller’s “chunking hypothesis” except the chunks are visual objects).

Page 22: What does visual attention select? (What is the basis for selection?)

If visual attention is selection, what does it select?

• An obvious answer is places. We can select places by moving our eyes so our gaze lands on different places.
• When places are selected, are they selected automatically?
• Must we always move our eyes to change what we attend to?
  • Studies of Covert Attention-Movement: Posner (1980).
• How does attention switch from one place to another?
• Is it always the case that we attend to places? Can we attend to any other property?
  • Can we select on the basis of color, depth, spatial frequency, affordances, or the property a painting has of having been painted by Da Vinci (a property to which Bernard Berenson was able to attend extremely well)? cf Gibson

Page 23: What else can visual attention select?

• Regions? Can we control the size and shape of the region that is selected, or is selection always punctate and data-driven?
  • Zoom Lens model of spatial attention (Eriksen & St James, 1986).
• Controlling where attention moves: Is this automatic or voluntary?
• How do we know where to direct our attention? How do we specify a location prior to attending to it? We need a way to specify where or what prior to attending to it! Keep this conundrum in mind – we will return to it later!
• How narrowly can we focus our attention? Can we make it pick out one out of several objects?
• Are there special conditions under which we are able to pick out individual things? We will return to “attentional resolution” or the minimum spacing for selecting individual things.

Page 24: Covert movement of attention

Example of an experiment using a cue-validity paradigm for showing that the locus of attention moves without eye movements and for estimating its speed. Posner, M. I. (1980). Orienting of Attention. Quarterly Journal of Experimental Psychology, 32, 3-25.

Page 25: Exogenous vs endogenous control of attention

• In the Posner paradigm illustrated in the last slide, attention was automatically seized by the onset of a luminance change (exogenous attention allocation). Other experiments show that this can also be done under voluntary (endogenous) control – e.g., by providing a cue for which direction to move attention.
• Posner, Tsal and others showed that when attention goes from A to B, intermediate locations are maximally sensitive to detecting a signal at intermediate times.
  • Although this suggests a continuously moving “spotlight” of attention, there are other models that claim that this results from attentional activation that fades at the starting place and grows at the target place, creating an overlap in intermediate locations (Sperling).
• Both exogenous and endogenous control produce movement of attention, but they differ in some of their effects.
  • Endogenously moved attention does not lead to Inhibition of Return
  • Endogenously controlled movement does not appear to affect detection sensitivity, but it does affect discrimination
  • Endogenously controlled effects are stronger and appear earlier

Page 26: Extension of Posner’s demonstration of attention switch

Does the improved detection in intermediate locations entail that the “spotlight of attention” moves continuously through empty space?

[Figure: display sequence. Labels: Fixation frame; Cue; Target-cue interval; Detection target. Target positions: Cued; Uncued; Along the path]

Page 27: Sperling & Weichselgartner (1995) “Episodic” or Quantal Theory of Attention switching

Assumes a quantal “shift” in attention in which the spotlight pointed at location -2 is extinguished and, simultaneously, the spotlight at location +2 is turned on. Because extinction and onset take a measurable amount of time, there is a brief period when the spotlights partially illuminate both locations simultaneously.
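The quantal account can be illustrated with a toy simulation in which one spotlight fades out at location -2 while another fades in at location +2 and nothing travels between them. Everything numerical here is an assumption made for illustration (a linear fade lasting 100 ms, Gaussian spatial profiles of unit width, the names `gate`, `spotlight` and `attention`); none of it is taken from Sperling & Weichselgartner's model fits.

```python
import math

def gate(t, onset=0.0, duration=100.0):
    """Fraction of the switch completed at time t (ms); assumed linear ramp."""
    return min(1.0, max(0.0, (t - onset) / duration))

def spotlight(x, center, width=1.0):
    """Assumed Gaussian spatial profile of a stationary spotlight."""
    return math.exp(-((x - center) ** 2) / (2 * width ** 2))

def attention(x, t, old=-2.0, new=2.0):
    """Total attentional activation at location x and time t.

    The old spotlight is weighted by (1 - gate) and the new one by gate,
    so neither spotlight ever changes position.
    """
    g = gate(t)
    return (1.0 - g) * spotlight(x, old) + g * spotlight(x, new)

for t in (0, 50, 100):
    profile = [round(attention(x, t), 2) for x in (-2, 0, 2)]
    print(f"t = {t:3d} ms  activation at -2, 0, +2: {profile}")
```

Midway through the switch both end locations are partially attended at the same time, which is the brief overlap the slide describes, even though no peak of activation moves across the space between them.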

Page 28: Review of the basis for selection

If attention serves as a gatekeeper between the world and visual cognition, then we must ask: On what (properties, things) does it base its selection?

We have already seen that attention appears to care about certain kinds of bundles of information that Miller called “chunks”. But what do chunks correspond to in vision?

A visual “chunk” is given by the way the world is “parsed” into things and non-things. What counts as a “thing” is an empirical question still being investigated, but whatever it comes out to be exactly, it seems to be a precursor of real objects

Page 29: The object-based view of attention selection

When we discuss some of the reasons for attention and the mechanisms involved I will propose that there are good reasons for supposing that attention attaches itself to objects rather than locations

Page 30: Selecting whole embedded shapes

It seems we can attend to entire (or at least large parts of) random shapes embedded in other random shapes and recall the attended ones to some degree

But we fail to recall the shapes we did not attend to.

Page 31: We can select a shape even when it is intertwined among other similar shapes

Are there items on the left and on the right that have the same shape? On a surprise test at the end, subjects were not able to recall shapes that had been present but had not been attended in the task (Rock & Gutman, 1981)

Page 32: Another negative attention effect: Inattentional Blindness

Page 33: Inattentional Blindness

The background task is to report which of two arms of the + is longer. One critical trial per subject, after about 3 or 4 background trials. Another “critical” trial presented as a divided attention control.

25% of subjects failed to see the square when it was presented in the parafovea (2° from fixation).

But 65% failed to see it when it was at fixation!

When the background task cross was made 10% as large, Inattentional Blindness increased from 25% to 66%.

Inattentional Blindness may be due to concentration of attention on the primary task, or to the inhibition of non-attended regions or objects.

Mack, A., & Rock, I. (1998). Inattentional blindness. Cambridge, MA: MIT Press.

Page 34: Does inhibition play a role? Noticing odd stimuli when their location is pre-marked

[Figure: two display conditions, labeled “Inattentional Blindness 50%” and “Inattentional Blindness 20%”]

Page 35: Evidence of negative attention or inhibition?

The increase in inattentional blindness when there are markers may be due to the inhibition of markers (since they are potentially distractors to the primary task).

Often attending to one set of things results in the active inhibition of things not attended (since they may be potentially disruptive)

Page 36: Other examples of attentionally induced inhibition

Negative Priming (Treisman & DeSchepper, 1996).

• Is there a figure on the right that is the same as the figure on the left?
• When the figure on the left is one that had appeared as an ignored figure on the right, RT is long and accuracy poor.
• This “negative priming” effect persisted over 200 intervening trials and lasted for a month!

Page 37: Inhibition of return

• If we vary the time between the cue and target in a modified Posner paradigm, we find that when the Cue-Target-Onset-Asynchrony (CTOA) gets to around 300-900 ms, reaction time to the target begins to increase. This is called Inhibition-of-return (Klein, 2000).
• To get this effect we actually have to attract attention to the target location and then attract it back to the origin. IOR is one of many examples of an inhibition effect being produced by attention.

Slowed detection due to Inhibition of Return

Page 38: Inhibition of return aids “foraging” and search

The observer fixates a small black disk at the center of the empty screen where the image to be searched (a picture from the Where’s Waldo series) is presented. After several saccades (illustrated in the top figure) as the observer searches for Waldo, the fixation stimulus reappeared at a specified location (black circles on the left) while the search stimulus remained (1b) or was removed (1c). The task was to foveate the target disk as rapidly as possible. Arrows illustrate a saccade to the target in the near (0°) condition.

In one experiment this target was presented at the most recently fixated location or other locations around an equi-eccentric circle. Shown in Figure 1(d) are the data from the experiment in which the penultimate fixation (labeled 0°) was used to generate the target location (two back). Saccadic reaction time when the target was located by the first post-saccade (as in b and c) increased with increases in the target’s proximity to a previously fixated place, but only when the scene was maintained.

Page 39: Attention and visual objects

Page 40: Exploring the limits of attention and the units over which selection operates

• It appears that the human information-processing bottleneck cannot be expressed perspicuously in terms of information-theoretic measures, nor can it be specified in physical parameters (e.g., in terms of locations or spatio-temporal regions), although such measures often do capture important aspects of attention (e.g., visual attention often moves continuously through space).
• But there are other possible ways one might consider expressing the limits of attention.
  • Over the past 25 years evidence has been accumulating that the human attention system is, at least in part, tuned to individual objects in the world. This would certainly make sense from an evolutionary perspective. But what does this mean?

Page 41: Summary of what we have so far

• We saw that visual representations must be conceptual for empirical and logical reasons
  • The empirical reasons derive in part from the nature of generalizations and errors of recall
  • The logical reason is that vision must interact with thoughts and lead to new beliefs and plans of action
• We saw that a large part of vision is cognitively impenetrable and encapsulated and that cognition can only be brought to bear prior to or after its automatic operation: as attention or interpretation.
• We saw that there are good design reasons for vision to be selective and we considered several bases for selection. But selection has turned out to be a more difficult question than it appeared: it consists in more than just filtering information down to a more manageable amount, and is also required for other reasons. These other reasons make it plausible that selection should operate over objects rather than bits of information in the Shannon sense.

Page 42:

The increasingly important role played by objects in studies of visual attention

Miller’s ‘Magic Number 7’ has continued to haunt us even beyond studies of short-term memory (STM).

There is a limitation in visual information processing that is beyond the limitation of acuity and of channel capacity: The perceptual system is limited in what it can individuate and how many of these individuals it can deal with at one time.

The capacity to individuate is different from memory capacity and discrimination capacity.

This notion of individuating and of individuals may be related to Miller’s “chunks”, but it has a special role in vision which we will explore later. A chunk is a relatively ill-defined notion in general whereas the units of visual attention are better thought of as objects

Page 43:

Experimental evidence for attentional selection of objects

Single Object Advantage: pairs of judgments are faster when both apply to the same perceived object

Entire objects acquire enhanced sensitivity from focal attention to a part of the object

Single-Object advantage occurs even with generalized “objects” defined in feature space

Simultanagnosia and hemispatial neglect show object-based effects

Attention moves with moving objects: IOR, Object Files, MOT

Page 44:

Single-object superiority even when the shapes are controlled

Page 45:

More controls for the Baylis study… (Baylis, 1994)

Controls for separability, convexity, area…

Page 46:
Page 47:

There is also evidence from neuropsychology that is consistent with the object-based view

Neglect

Balint and simultanagnosic patients

Page 48:

Visual neglect syndrome is object-based

When a right neglect patient is shown a dumbbell that rotates, the patient continues to neglect the object that had been on the right, even though it is now on the left (Behrmann & Tipper, 1999).

Page 49:

Simultanagnosic (Balint Syndrome) patients only attend to one object at a time

Simultanagnosic patients cannot judge the relative length of two lines, but they can tell that a figure made by connecting the ends of the lines is not a rectangle but a trapezoid (Holmes & Horrax, 1919).

Page 50:

Balint patients can only attend to one object at a time even if they are overlapping

Luria, 1959

Page 51:

Objecthood endures over time

Picking out objects is an example of the parsing of a scene into things that are likely to be physical objects. But the same must occur in time – temporal parsing entails solving the correspondence problem

Several studies have shown that what counts as an object (as the same object) endures over time and over changes in location

Certain forms of changes in location as well as disappearances in time preserve objecthood.

This gives what we have been calling a “visual object” a real physical-object character and partly justifies our calling it an “object”.

Page 52:

The “Ternus Configuration” to demonstrate the early visual effect of objecthood

Short time delays result in “element motion” in which the middle object persists as the “same object” and does not appear to move so the end objects appear to move

Page 53:

Long time delays result in “group motion” in which the middle object does not persist but is perceived as a new object each time it reappears

Page 54:

Inhibition of return appears to be object-based (as well as to some extent location-based)

The original study used static objects. Then Tipper, Driver & Weaver (1991) showed that IOR moves with the inhibited object.

Page 55:

IOR appears to be object-based (it travels with the object that was attended)

Page 56:

Objects endure despite changes in location; and they carry their history with them!

Object File Theory of Kahneman & Treisman

Letters are faster to read if they appear in the same box where they appeared initially. Priming travels with the object. According to the theory, when an object first appears, a file is created for it and the properties of the object are encoded and subsequently accessed through this object-file.

[Figure: object-file display with panels 1, 2, 3 and the letters A, A, B]
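The object-file idea can be read as a small data structure. The sketch below is only an illustration under assumed names (`ObjectFile` and `preview_benefit` are hypothetical and do not come from Kahneman & Treisman): each file is keyed to an object rather than to a location, so a previewed letter stays with its box when the box moves.

```python
from dataclasses import dataclass, field


@dataclass
class ObjectFile:
    """Hypothetical record opened when an object first appears."""
    object_id: int        # stands for the object itself, not its location
    location: tuple       # current position; updated as the object moves
    properties: dict = field(default_factory=dict)


def preview_benefit(files, object_id, letter):
    """True if `letter` was already encoded in the file of this same object."""
    return files[object_id].properties.get("letter") == letter


# Preview display: letter A appears in box 1, letter B in box 2.
files = {
    1: ObjectFile(1, (0, 0), {"letter": "A"}),
    2: ObjectFile(2, (4, 0), {"letter": "B"}),
}

# The boxes move; only the location field of each file changes.
files[1].location = (0, 3)
files[2].location = (4, 3)

same_object = preview_benefit(files, 1, "A")       # A reappears in its original box
different_object = preview_benefit(files, 2, "A")  # A reappears in the other box
```

In this toy version the priming effect corresponds to `same_object` being true while `different_object` is false, regardless of where the boxes ended up.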

Page 57:

Demo of Object File Experiment

Page 58:

Demo of Object File Experiment

Object File 1

Object File 2

Page 59:

The limitation of individuation and selection

There are obviously limitations on the input side of vision that depend on the acuity of the sensors and the range of physical properties to which they respond.

But there is a limitation beyond that of acuity: The perceptual system is limited in what it can individuate and how many of these individuals it can deal with at one time.

The capacity to individuate is different from the capacity to discriminate.

There is reason to think that individuating is a separate and distinct process in early vision

Page 60:

Picking out is different from discriminating: Pick out the third contour from the left

Page 61:

Individuating as a distinct process

Individuating has its own psychometric function: The minimum distance for individuating is much larger than for discriminating.

It may be that in vision our attention is limited in the number of things we can individuate and simultaneously access (more on this later). But how do you determine what counts as a “thing”?

Individuating is a prerequisite for recognition of patterns and other properties defined among a number of individual parts

An example of how we can easily detect patterns if they are defined over a small enough number of parts is subitizing

Another area where the concept of an individual has become important is in cognitive development, where it is clear that babies are sensitive to the numerosity of individual things in a way that is distinct from their perceptual abilities but is limited in its capacity

Page 62:

Pick out 3 dots and keep track of them

In a field of identical elements you can select a number of them and move your attention among them (e.g., “move one up” or “move 2 right”, etc.) so long as at no time do you have to hold on to more than 4 dots

Page 63:

Pick out 3 dots I will cue and keep track of them

After you pick out the 3 cued dots, I’ll ask you to move your attention from the center one. Describe the new relation among the three dots.

In a field of identical elements you can select several of them and move your attention among them (e.g., “move one up” or “move 2 right”, etc.) so long as at no time do you have to hold on to more than 4 dots (Intriligator & Cavanagh, 2001)

Page 64:

Individuals and patterns

Vision does not recognize patterns by applying templates since the size, shape, retinal location, orientation, and other properties must be abstracted away.

A pattern is encoded over time (and often over saccades), therefore the visual system must keep track of the individual parts and merge descriptions of the same part at different times and stages of encoding

Therefore in order to recognize a pattern, the visual system must pick out individual parts and bind them to the representation being constructed

Examples include what Ullman called “visual routines”

Page 65:

Are there collinear items (n>3)?

Page 66:

Several objects must be picked out at once in making relational judgments

The same is true for other relational judgments like inside or on-the-same-contour… etc. We must pick out the relevant individual objects first. Respond: Inside-same contour? On-same contour?

Page 67:

When items cannot be individuated, predicates over them cannot be evaluated

Do these figures contain one or two distinct curves? Individuating these curves requires a “curve tracing” operation, so Number_of_curves (C1, C2, …) takes time proportional to the length of the shortest curve.

Page 68:

The figure on the left is one continuous curve, the one on the right is two distinct curves – as shown in color.

Page 69:

Signature subitizing phenomena only appear when objects are automatically individuated and indexed

Trick, L. M., & Pylyshyn, Z. W. (1994). Why are small and large numbers enumerated differently? A limited capacity preattentive stage in vision. Psychological Review, 101(1), 80-102.

Page 70:

A different view of the role of attention

What’s the equivalent of “chunks” in vision (Visual Chunks?)

Attention as the “glue” that allows properties that occur together to be represented as conjoined

Experiments showing the special difficulty that vision has in detecting conjunctions of several properties have provided a basis for understanding an important problem in visual analysis

Page 71:

Encoding conjunctions of properties

Experiments showing the special difficulty that vision has in detecting conjunctions of several properties have provided a basis for understanding an important problem in visual analysis

Page 72:

How are conjunctions of features detected?

Read the vertical line of digits in the following display

Under these conditions Conjunction Errors are very frequent

Page 73:

Rapid visual search (Treisman)

Find the following simple figure in the next slide:

Page 74:

This case is easy – and the time is independent of how many nontargets there are – because there is only one red item. This is called a ‘popout’ search

Page 75:

This case is also easy – and the time is independent of how many nontargets there are – because there is only one right-leaning item. This is also a ‘popout’ search.

Page 76:

Rapid visual search (conjunction)

Find the following simple figure in the next slide:

Page 77:
Page 78:

Find the unique item in this slide

Page 79:

Serial vs parallel search?

Finding an element that differs from all others in a scene by a single feature – called a single-feature search – is fast, error-free and almost independent of how many nontargets there are;

Finding an object that differs from all others by a conjunction of two or more features (and that shares at least one feature with each object in the scene) – called a conjunction search – is usually slow, error-prone, and is worse the more nontargets there are in the scene*.

These results suggest that in order to find a conjunction, which requires solving the binding problem, attention has to be scanned serially to all objects.

* This way of putting it simplifies things. Under certain conditions the serial-parallel distinction breaks down.
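The contrast between the two kinds of search is often summarized as a difference in how response time grows with display size. The following is a toy model, not a fit to any data: the base time and per-item time are made-up parameters, and the conjunction case assumes a strictly serial, self-terminating scan, which the footnote above notes is a simplification.

```python
def expected_rt_ms(set_size, search_type, base_ms=400.0, per_item_ms=50.0):
    """Toy expected response time for a target-present trial.

    base_ms and per_item_ms are illustrative values, not measurements.
    """
    if search_type == "feature":
        # Popout: the target is found in parallel, so set size does not matter.
        return base_ms
    if search_type == "conjunction":
        # Serial self-terminating scan: on average the target is reached
        # after inspecting (set_size + 1) / 2 items.
        return base_ms + per_item_ms * (set_size + 1) / 2
    raise ValueError(f"unknown search type: {search_type}")


for n in (4, 8, 16):
    print(n, expected_rt_ms(n, "feature"), expected_rt_ms(n, "conjunction"))
```

The feature column stays flat as the display grows, while the conjunction column rises linearly with the number of items.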

Page 80:

Single-Feature vs Conjunction-feature search

Page 81:

Treisman’s Attention as Glue Hypothesis

Another answer to what attention is for

The purpose of visual attention is to Bind properties together in order to recognize objects

This is called the “binding problem” or the “many properties problem” and it is of considerable interest to philosophers as well as vision scientists

We can recognize not only the presence of “squareness” and “redness” in our field of view, but we can also distinguish between different ways they may be conjoined

Page 82:

How is the binding problem solved?

Here is the most common view. To determine whether properties P and Q are conjoined, you detect P, encode its location, then check whether Q is found at that location.

A more detailed model is Treisman’s Feature Integration Theory. It postulates separate maps for each feature type, as well as a master map that allows attention to direct the search for features.
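The co-location account can be sketched as follows. This is a deliberately simplified illustration: the dictionary-based feature maps, the punctate grid locations, and the function name `find_conjunction` are assumptions made for the example, not part of Feature Integration Theory as published.

```python
# Each feature map records, for one feature type, which value was
# detected at which location. Locations are punctate (x, y) pairs here.
color_map = {(0, 0): "red", (1, 0): "green", (2, 0): "red"}
shape_map = {(0, 0): "circle", (1, 0): "square", (2, 0): "square"}


def find_conjunction(color, shape):
    """Scan locations one at a time and report where both features co-occur."""
    master_map = sorted(set(color_map) | set(shape_map))  # all occupied locations
    for location in master_map:  # the attention "beam" visits each location in turn
        if color_map.get(location) == color and shape_map.get(location) == shape:
            return location      # conjunction detected at this location
    return None


red_square_at = find_conjunction("red", "square")
green_circle_at = find_conjunction("green", "circle")
```

Here greenness and circularity are both present in the display, yet `green_circle_at` is `None` because they never share a location. The sketch only works because each property is assumed to sit at a single point.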

Page 83:

The role of attention to location in Treisman’s Feature Integration Theory

[Figure: Feature Integration Theory diagram. Labels: Original Input; Color maps (R, Y, G); Shape maps; Orientation maps; Master location map; Attention “beam”; Conjunction detected]

Page 84:

Problems with solving the binding problem in terms of co-location of features

In order for the conjunction-by-location to work one has to have determined the location of each property

A punctate location will not do since properties have extension and the extension is relevant (cf. encode INSIDE)

A region will not do unless one knows the boundaries of the region

In either case one has to have identified the relevant object before one can use its location

It’s the object that has the properties in question that determines whether the properties are conjoined

Properties are conjoined just in case they are properties of the same object

Page 85:

But in encoding properties, early vision can’t just bind them together according to their spatial co-occurrence – even their co-occurrence within the same region. That’s because the relevant region depends on the object. So the selection and binding must be according to the objects that have those properties

Page 86:

The problem is even worse when the relation between items is “Inside”

Page 87:

If co-location of properties will not give us a way of solving the binding problem, what will?

It’s not being at the same location that binds properties together, it’s being properties of the same object

This is why we need object-based selection and why the object-based attention literature is relevant …

Page 88:

An alternative view of how we solve the binding problem

If we assume that only properties of selected objects are encoded and that these are stored in object files associated with each object, then properties that belong to the same object are stored in the same object file, which is why they get bound together

This automatically solves the binding problem!

This is the view exemplified by both FINST Theory (1989) and Object File Theory (1992) to be described later

The assumption that only properties of selected objects are encoded raises the question of what happens to properties of the other objects or properties in a display (more on this later)

The logical answer is that they are not encoded and therefore not available to conceptualization and cognition

But this is counter-intuitive!
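A minimal sketch of this alternative, under assumed names (`encode_selected` and `is_conjoined` are hypothetical): properties are written only into the files of selected objects, so asking whether two properties are conjoined becomes a lookup within a single file, and unselected objects leave no record.

```python
# A toy scene: each object has properties, and no location is consulted below.
scene = {
    "obj_a": {"color": "red", "shape": "square"},
    "obj_b": {"color": "green", "shape": "circle"},
    "obj_c": {"color": "red", "shape": "circle"},
}


def encode_selected(scene, selected):
    """Open one object file per selected object and store its properties there."""
    return {name: dict(scene[name]) for name in selected}


def is_conjoined(object_files, **wanted):
    """True if some single object file holds all of the wanted properties."""
    return any(
        all(props.get(key) == value for key, value in wanted.items())
        for props in object_files.values()
    )


object_files = encode_selected(scene, selected=["obj_a", "obj_b"])

red_square = is_conjoined(object_files, color="red", shape="square")
red_circle = is_conjoined(object_files, color="red", shape="circle")
```

In this toy run `red_circle` is false even though redness and circularity are each recorded somewhere: no single selected object has both, and the one object that does (`obj_c`) was never selected, so none of its properties were encoded.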

Page 89:

The FINST theory

Why do we need Indexes? Some background on nonconceptual selection

Page 90:

We need to be able to pick out individual visual objects directly – without mediation of concepts

We need to make nonconceptual contact with the world through perception in order to stop the regress of concepts being defined in terms of other concepts which are defined in terms of still other concepts

What must you be able to do to decide that object O falls under concept C?

Sometimes called the symbol grounding problem

The current proposal is that nonconceptual selection of individual objects is the primitive basis for all conceptualization and predication

My argument for nonconceptual selection of token objects as the primitive operation is primarily empirical

I begin with the problem of incremental construction of visual representations

Page 91:

Incremental construction of visual representations and the correspondence problem

A personal experience: Drawing geometry diagrams and reasoning from the diagram

This problem arises because the visual representation is constructed incrementally over time

But visual representations are always constructed over time

Amodal completion (Kanizsa)

Page 92:

Begin by drawing a line….

Page 93:

Now draw a second line….

Page 94:

And draw a third line….

Page 95:

Notice what you have so far…. (noticings are local – you encode what you attend to)

There is an intersection of two lines…

But which of the two lines you drew are they?

There is no way to indicate which individual things are seen again unless there is a way to refer to individual things

Page 96:

Look around some more to see what is there ….

Here is another intersection of two lines…

Is it the same intersection as the one seen earlier?

To be able to tell without a reference to individuals you would have to encode unique properties of the individual lines. Which properties should you encode?

L3

L6

Page 97:

Example of geometrical figure used in solving a problem in plane geometry: Not all of it is seen or noticed at once – coding is incremental

Consider what happens when vertices are encountered while the figure is scanned. When are two such encounters of the very same vertex?

Page 98:

When a new property of a vertex is noticed, which part of the current representation should be updated? When should a new vertex-representation be added? Answering these questions requires keeping track of individual distal objects. We proposed the mechanism of visual indexes (FINSTs) for this function.
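One way to picture the role of an index in incremental encoding is sketched below. Everything here is an assumption made for illustration: the class name `IndexPool`, the capacity of four (chosen to echo the four-dot limit mentioned in the tracking slides), and the use of Python object identity to stand in for a direct link to a distal object. It is not a model of how the visual system implements indexes.

```python
class IndexPool:
    """Hypothetical pool of visual indexes, each bound to one distal object."""

    def __init__(self, capacity=4):
        self.capacity = capacity
        self.bound = {}    # index number -> the distal object it points to
        self.records = {}  # index number -> properties noticed so far

    def index_for(self, thing):
        """Return the index already bound to `thing`, or bind a free one."""
        for idx, target in self.bound.items():
            if target is thing:  # same individual, whatever its properties
                return idx
        if len(self.bound) >= self.capacity:
            raise RuntimeError("no free index")
        idx = len(self.bound) + 1
        self.bound[idx] = thing
        self.records[idx] = {}
        return idx

    def notice(self, thing, **properties):
        """Route newly noticed properties to the record of the indexed individual."""
        idx = self.index_for(thing)
        self.records[idx].update(properties)
        return idx


class Vertex:
    """Stands in for a vertex out in the world."""


v1, v2 = Vertex(), Vertex()
pool = IndexPool()
first = pool.notice(v1, angle="acute")
second = pool.notice(v2, angle="acute")  # same properties, different individual
again = pool.notice(v1, label="A")       # same individual encountered again
```

The point of the sketch is that the decision to update an existing record or open a new one never compares descriptions: two vertices with identical noticed properties get separate records, and a re-encountered vertex updates its own record.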

Page 99:

Keeping track by encoding unique properties of individual items will not work in general

A description cannot keep picking out the same individual when the individual is changing its properties unpredictably, even if the description is continually updated

A perceptual representation is always built up over time, so you would need a way to retrieve and update the previous representation of a particular token element when new properties of that token element are noticed

Some writers have postulated a “marking” process for counting or computing relational predicates. But where is the “mark” placed? It can’t be placed in the representation, because its purpose is to keep track of things in the world.

People can pick out several individual items even if the items are in a field of identical items – e.g., pick out a dot in a uniform field of dots (examples later)

Page 100:

* Footnote

Notice that in the previous example it would not help if you labeled the diagram as you drew it. Why not?

Because to refer to something as “the thing labeled L1” you would have to be able to think “X is the thing labeled L1” which requires that X be able to pick out that particular thing. But picking out a particular thing is the original problem!

Another way to use the label would be if you could think “This is line L1”. But of course you couldn’t have that thought unless you had a way to think “this”! See Perry quote.

Being able to think “this” is another way to view the very problem under discussion. You need an independent way to pick out and refer to an individual visual object – even if it is labeled! (You also need to do this for several individuals simultaneously – this1, this2, … thisn – but more on that later).

Page 101:

The requirements for picking out individual things and keeping track of them reminded me of an early comic book character called “Plastic Man”

Page 102:

Imagine being able to place several of your fingers on things in the world without being able to detect their properties in this way, but being able to refer to those things so you could move your gaze or attention to them. If you could, you would possess FINgers of INSTantiation = FINSTs!

Page 103:

FINSTs and subset selection

Burkell demonstration that a subset of objects can be selected and subsequent search carried out over this subset.

Currie demonstration that the subset selection can withstand a saccadic eye movement

Page 104:

Example of the operation of Visual Indexes: Subset selection for search

[Figure: example displays for single-feature search and conjunction-feature search, with the target item indicated]

Burkell, J., & Pylyshyn, Z. W. (1997). Searching through subsets: A test of the visual indexing hypothesis. Spatial Vision, 11(2), 225-258.

Page 105:

Subset search results:

Only the properties of the subset matter – but properties of the entire subset are taken into account (since that is what distinguishes a feature search from a conjunction search)

If the subset is a single-feature search it is fast and the slope is very shallow

If the subset is a conjunction search set, it takes longer and is more sensitive to the set size

The dispersion among the targets does not matter

Page 106:

Multiple Object Tracking

One of the clearest cases illustrating object-based attention is Multiple Object Tracking

Keeping track of individual objects in a scene requires a mechanism for individuating, selecting, accessing and tracking the identity of individuals over time.

These are the functions that we have proposed are carried out by the mechanism of visual indexes (FINSTs).

We have been using a variety of methods for studying visual indexing, including subitizing, subset selection for search, and Multiple Object Tracking (MOT).

Page 107:

Multiple Object Tracking

In a typical experiment, 8 simple identical objects are presented on a screen and 4 of them are briefly distinguished in some visual manner – usually by flashing them on and off.

After these 4 “targets” have been briefly identified, all objects resume their identical appearance and move randomly. The subjects’ task is to keep track of which ones had earlier been designated as targets.

After a period of 5-10 seconds the motion stops and subjects must indicate, using a mouse, which objects were the targets.

People are very good at this task (80%-98% correct). The question is: How do they do it?

Page 108:

Keep track of the objects that flash

Page 109:

How do we do it? What properties of individual objects do we use?

Page 110:

Keep track of the objects that flash

Page 111:

How do we do it? What properties of individual objects do we use?

Page 112:

Explaining Multiple Object Tracking

Basic finding: People (even 5-year-old children) can track 4 to 5 individual objects that have no unique visual properties

How is it done? Can it be done by keeping track of the only distinctive property of objects – their location?

Page 113:

A possible location-based tracking algorithm

Use of this algorithm assumes (1) focal attention is required to encode locations (i.e., encoding is not parallel), (2) focal attention is unitary and has to be scanned continuously from location to location. It assumes no encoding (or dwell) time at each element.

1. While the targets are visually distinct, scan attention to each target in turn and encode its location on a list.

2. When targets begin to move, check the n’th position in the list and go to the location encoded there: Call it Loc(n).

3. Find the closest element to Loc(n).

4. Store the actual location of the element found in #3 in position n in the list: this becomes the new value of Loc(n).

5. Move attention to the location encoded in the next list position, Loc(n+1).

6. Repeat from #3 until elements stop moving.

7. Report elements whose locations are on the list.
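
The steps of the location-based algorithm can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model used to generate the predictions in the seminar: the names `closest` and `serial_track`, the representation of a trial as a list of frames of (x, y) positions, and the simplification that attention visits exactly one list position per frame are all hypothetical.

```python
import math

def closest(point, elements):
    """Return the index of the element nearest to `point`."""
    return min(range(len(elements)), key=lambda i: math.dist(point, elements[i]))

def serial_track(frames, target_ids):
    """Serial location-based tracking (illustrative sketch).

    frames: list of frames; each frame is a list of (x, y) positions,
            and element i is the same physical object in every frame.
    target_ids: indexes of the objects flashed as targets in frame 0.
    Assumption: attention visits one list position per frame.
    """
    # Step 1: encode the location of each target on a list.
    loc = [frames[0][i] for i in target_ids]
    n = 0
    for positions in frames[1:]:
        # Steps 2-3: go to Loc(n) and find the closest element.
        found = closest(loc[n], positions)
        # Step 4: store that element's actual location as the new Loc(n).
        loc[n] = positions[found]
        # Step 5: move attention to the next list position.
        n = (n + 1) % len(loc)
    # Step 7: report the elements whose locations are on the list.
    return [closest(p, frames[-1]) for p in loc]

# Slow, well-separated motion: the stored locations stay close to the targets.
slow = [[(0.1 * t, 0), (10, 0.1 * t), (0, 10), (10 - 0.1 * t, 10)]
        for t in range(20)]
print(serial_track(slow, [0, 3]))

# Fast motion: objects 0 and 1 trade places while attention is elsewhere.
fast = [
    [(0, 0), (5, 0), (100, 100)],
    [(0, 0), (5, 0), (100, 100)],
    [(4, 0), (1, 0), (100, 100)],
    [(5, 0), (0, 0), (100, 100)],
]
print(serial_track(fast, [0, 2]))
```

In the second trial the stored location of the first target is stale by the time attention returns to it, so the nearest element is a nontarget. This is the kind of error that makes the predicted performance of a serial scheme depend on how fast attention can move relative to the objects.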

Page 114:

Predicted performance for the serial tracking algorithm as a function of the speed of movement of attention

Page 115:

If we are not using and updating objects’ locations, then how are we tracking them?

Our hypothesis, which is independently motivated, is that there are a small number of primitive indexes or pointers, each of which can pick out a particular individual object.

The index keeps providing access to the object as the object changes its properties and its location.

The object is not selected by using an encoding of any of its properties. It is picked out nonconceptually, just as the demonstrative “that” does in language.

Nonconceptual selection is selection without classification (without encoding the selected thing as having certain properties or as being a member of a certain category).

Nonconceptual contact with the world is essential in order to ground concepts in causal connections.

Page 116:

A FINST is a mechanism that:

1. Picks out, and

2. Keeps track of individual distal visual objects, and

3. Does so directly (i.e., without mediation of concepts and without appealing to or using any encoded properties of the individuals). Therefore,

4. FINSTs pick out and track individuals qua individuals rather than as bearers of certain properties

5. FINSTs do not pick out and track individuals as members of any category: The connection to the world is purely causal and nonconceptual; not a “seeing as” relation.

So the perceiver literally does not know what is being selected and tracked, even though this indexed selection allows further properties of the object in question to be encoded subsequently!

Page 117:

Additional examples of MOT

MOT with occlusion

MOT with virtual occluders

MOT with implosions

MOT with line endpoints

"Rubber band" displays

MOT with IDs (corners)

Page 118:

Superimposed Gabor patches

Blaser, E., Pylyshyn, Z. W., & Holcombe, A. O. (2000). Tracking an object through feature-space. Nature, 408(Nov 9), 196-199.

Page 119:

Changing feature dimensions

Page 120:

Trajectories:

* pseudo-random and independent

* frequent changes in speed and direction

* Gabors frequently "pass" each other along one or more dimensions

Surfaces in feature-space

Page 121:

Snapshots

snapshots taken every 250 msec

1) People are able to track this fixed-location “object” and

2) Single-object advantage is obtained

Page 122:

Reprise … what are FINSTs?

They are a primitive reference mechanism that refers to individual objects in the world (FINGs?)

Objects are picked out and referred to without using any encoding of their properties, including their location. Picking out objects is prior to encoding their locations!

Indexing is nonconceptual because it does not represent an individual as a member of some conceptual category – not even as being in the category individual or object!

FINSTs serve as visual demonstratives, much like the terms this or that do in language, by picking out and referring to individuals without using their properties.

The central function of FINST indexes is to bind arguments of visual predicates or of motor commands to things in the world to which they must refer. Only predicates with bound arguments can be evaluated.

Page 123:

FINSTs and Object Files form the link between the world and its conceptualization

Object File contents are conceptual!

Information (causal) link

FINST Demonstrative reference link

Page 124:

Schema for how FINSTs function in visual-motor control: FINSTs allow arguments to be bound to things (FINGs) so that individual token things can be referred to. This is a logical prerequisite for the execution of motor commands like Move(L,x).

(Of course it says nothing about how the command can be executed!)

Page 125:

The binding hypothesis of the visual-cognitive bottleneck

FINST Theory claims that the bottleneck between vision and cognition is in the number of objects that can be simultaneously bound to the arguments of cognitive routines.

Another way to put this is that visual cognition can simultaneously attend to only about 4 objects.

There is direct evidence for the limit of about 4 visual objects in visual working memory (Luck & Vogel, 1997).

This sense of “attend to” means refer to or bind to mental symbols.
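
The binding idea can be made concrete with a toy data structure. The sketch below is an analogy only, not a claim about how FINSTs are implemented: the class `IndexPool`, its methods, the token names `this1`, `this2`, ..., and the default capacity of 4 are all hypothetical choices made for illustration. An index holds a reference to the object itself rather than a description of it, and a predicate can be evaluated only when every argument is bound.

```python
class IndexPool:
    """Hypothetical pool of FINST-like indexes (illustrative analogy only)."""

    def __init__(self, capacity=4):
        # Assumed limit of about 4 simultaneously bound objects.
        self.capacity = capacity
        self.bound = {}  # index token -> reference to the object itself

    def assign(self, obj):
        """Bind a free index to obj; return its token, or None if none is free."""
        if len(self.bound) >= self.capacity:
            return None
        token = f"this{len(self.bound) + 1}"
        self.bound[token] = obj  # no properties of obj are consulted or stored
        return token

    def evaluate(self, predicate, *tokens):
        """Evaluate a predicate whose arguments are all bound indexes."""
        if any(t not in self.bound for t in tokens):
            raise LookupError("predicate has an unbound argument")
        return predicate(*(self.bound[t] for t in tokens))


def left_of(p, q):
    # Properties are read only when the predicate is evaluated.
    return p["x"] < q["x"]


a, b = {"x": 1}, {"x": 5}
pool = IndexPool()
tokens = [pool.assign(o) for o in (a, b, {"x": 7}, {"x": 9}, {"x": 11})]
print(tokens)                                  # the fifth request gets no index
print(pool.evaluate(left_of, "this1", "this2"))
a["x"] = 10                                    # the object changes its location
print(pool.evaluate(left_of, "this1", "this2"))
```

Because the binding is by reference, the index continues to pick out the same object after its properties change, whereas a stored description such as "the object at x = 1" would no longer apply to it.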

Page 126:

Schema for how FINSTs function in Robot Vision

Pylyshyn, Z. W. (2000). Situating vision in the world. Trends in Cognitive Sciences, 4(5), 197-207.

Page 127:

Summary of some properties of indexing revealed by recent experiments

1. Targets can be tracked even when they disappear behind an occluder and, under certain conditions, even when all objects disappear from view (Scholl & Pylyshyn, 1999; Keane & Pylyshyn, VSS2003). Demo: MOT with occlusion

2. Properties of targets are not encoded during MOT nor are they used in tracking. Changes in target properties are not even noticed (Scholl, Pylyshyn & Franconeri, 1999; Bahrami, 2003).

3. Not all well-defined clusters of features can be tracked: Only ones that correspond to objects (Scholl, Pylyshyn & Feldman, 2001). Demo: "Rubber band" displays

Page 128:

Summary of some properties of indexing revealed by recent experiments

4. Indexes are assigned primarily in an exogenous, automatic, involuntary and data-driven manner (cf. the distinction between interrupt and test). They can also be assigned endogenously (voluntarily) but we believe this happens only by moving focal attention to each target serially (Annan & Pylyshyn, VSS2003).

5. Index maintenance in tracking appears to be non-predictive and non-attentive (Keane & Pylyshyn, VSS2003; Leonard & Pylyshyn, VSS2003).

6. Target-target confusions are much more numerous than target-nontarget confusions. The reason appears to be that nontargets are inhibited, which may prevent them from being swapped with targets (Pylyshyn & Leonard, VSS2003).

Page 129:

Summary of some properties of indexing revealed by recent experiments

7. Keeping track of objects as targets is easier than keeping track of their identity (when the latter is provided at the start of the trial by a name or special location). The poorer recall of object identities is surprising, given that in order to judge an object as a target one needs to trace its identity back to an object that had been visibly distinct at the start of a trial! So why is ID lost?

8. One reason is that target-target confusions are much more numerous than target-nontarget confusions. But why should this be so?

9. One reason may be that nontargets are inhibited, which may prevent them from being swapped with targets. We have shown this is so experimentally. But that leaves a serious puzzle: How can inhibition travel with objects when no indexes are available for tracking?

Page 130:

Further implications of FINST Index Theory

Review and examination of some unstated assumptions

Page 131:

FINSTs play a central role in the story of object-based selection and the binding problem

When objects are individuated, a FINST index may be assigned to them and their properties may be stored in associated Object Files

FINSTs are reference tokens or demonstrative references to objects in the world (call these FINGs?)

The main function of FINST indexes is to bind arguments of visual predicates to selected objects and thereby to bind objects to their associated Object Files.

FINSTs are the part of a representation that gives it a direct referential content – that enables the representation to refer directly (and nonconceptually) to things in the world.

FINSTs allow individuals to be selected and tracked without conceptualizing them. This primitive individuation is essential for accounting for how vision can connect with action

Page 132:

Some surprising aspects of FINST index theory

In appealing to FINST theory to explain how the binding problem is solved we tacitly assumed that all properties in a scene are encoded as properties of indexed objects.

All properties are stored in object files.

Page 133:

The assumption that no properties other than properties of selected objects can be encoded is in conflict with strong intuitions – namely that we see much more than we conceptualize and are aware of. So what do we do about the things we “see” but do not conceptualize?

Some philosophers say they are represented nonconceptually.

But what makes this a nonconceptual representation, as opposed to just a causal reaction?

At the very minimum, postulating that something is a representation must allow generalizations to be captured over its content, which would otherwise not be available.

Traditionally representations are explanatory because they account for the possibility of misrepresentation and they also enter into conceptualizations and inferences. But unselected objects and unencoded properties don’t seem to fit this requirement (or do they?)

Page 134:

An intriguing possibility…

Maybe we visually encode far less than we think we do!

This possibility has received a great deal of recent attention with the discovery of various ‘blindnesses’ such as change-blindness and inattentional blindness

Page 135:

Change Blindness: What changes between flashes?

Harborside

Airplane

Helicopter

Dinner

Farm scene

Paris corner

Page 136:

Maybe information about unattended objects is not represented at all!!

A possible view is that certain topographical or biological reactions (e.g., retinal activity) are not representations – because they have no truth values and so cannot misrepresent.

One must distinguish between causal and represented properties.

Properties that cause objects to be indexed and tracked and result in object files being created need not be encoded and made available to cognition.

Page 137:

A few (serious) loose ends …

Page 138:

The beginnings of the puzzle of individuating prior to indexing, and what that might mean!

If moving objects are inhibited then inhibition moves along with the objects. How can this be unless they are being tracked? And if they are being tracked there must be at least 8 FINSTs!

This puzzle may signal the need for a kind of individuation that is weaker than the individuation we have discussed so far – a mere clustering, circumscribing, figure-ground distinction without a pointer or access mechanism – i.e. without reference!

It turns out that such a circumscribing-clustering process is needed to fulfill many different functions in early vision. It is needed whenever the correspondence problem arises – whenever visual elements need to be placed in correspondence or paired with other elements. This occurs in computing stereo, apparent motion, and other grouping situations in which the number of elements does not affect ease of pairing (or even results in faster pairing when there are more elements). Correspondence is not computed over continuous visual manifolds but only over some pre-clustered elements.

Page 139:

Puzzles raised by FINST theory and MOT results

If only information about indexed objects is encoded and made available to the cognitive mind, what happens to information about other parts of the visual scene? (cf. Change Blindness)

There are, after all, only about 4 or 5 indexes and surely we see a lot more of the world than 4 or 5 objects!

This raises the question of whether non-indexed objects are ‘processed’ in any sense at all, and whether they are even represented in some (presumably nonconceptual) way.

Do objects that are not indexed have any effect on the visual system at all?

The mystery of unattended objects

Functional blindness in normal vision

Page 140:

Can objects be individuated but not indexed? A new twist to this story

We have recently obtained evidence that objects that are not tracked in MOT are nonetheless being inhibited and the inhibition moves with them.

It is harder to detect a probe dot on an untracked object than on either a tracked object or empty space!

But how can inhibition move with nontargets when the space through which they move is not inhibited? Doesn’t this require the nontargets to be tracked?

Page 141:

What happens to unattended objects in vision (esp. in tracking)?

There are three possibilities:

1. No properties other than those of indexed objects are encoded

It may be that the richness of visual phenomenology is illusory!

Visual information without experience & vice-versa

2. Other properties are encoded but are only available within modules (e.g., two visual systems)

3. Unattended (unindexed) objects are tracked but access to them is inhibited

Mack & Rock

MOT research

Page 142:

Another puzzle: Punctate inhibition of moving objects?

We have recently obtained evidence that nontargets are inhibited (as measured by the rate of detection of small faint probe dots).

There appears to be no inhibition of the empty region through which the nontargets move.

The inhibition is spatially local.

How can a punctate moving object be inhibited unless the object is being tracked? And how can it be tracked if there are many (n > 5) of them? But there is some sense in which moving objects must be tracked:

E.g., Dynamic random-dot stereograms, kinetic depth effect

Maybe Indexing is a two-stage process?

1. Individuate

2. Reference (for accessing)

Page 143:

The beginnings of the puzzle of individuating prior to indexing, and what that might mean!

If moving objects are inhibited then inhibition moves along with the objects. How can this be unless they are being tracked? And if they are being tracked there must be at least 8 FINSTs!

This puzzle may signal the need for a kind of individuation that is weaker than the individuation we have discussed so far – a mere clustering, circumscribing, figure-ground distinction without a pointer or access mechanism – i.e. without reference!

It turns out that such a circumscribing-clustering process is needed to fulfill many different functions in early vision. It is needed whenever the correspondence problem arises – whenever visual elements need to be placed in correspondence or paired with other elements. This occurs in stereo, apparent motion, and other situations in which increasing the number of elements does not increase the difficulty of computing correspondences. Correspondence is not computed over continuous visual manifolds but only over some pre-clustered elements.

Page 144:

Inhibition of nontargets: Probe detection while tracking and not tracking

[Figure: bar chart of Probes Detected (%), axis from 40% to 100%, by Probe Location (Open Space, Nontarget, Target) for the Nontrack Control and Tracking conditions]

Page 145:

Why does the apparent motion take the form it does? An example of a Natural Constraint (Marr)

The principle appears to be one of minimizing the vector difference between each possible correspondence pair and that of its nearest neighbors

This principle arises from (is justified by) the natural constraints of rigidity and opacity:

In our kind of world most image features arise from distal elements on the surface of objects, i.e., all but a vanishingly small proportion of perceived distal elements are on the visible surface of opaque rigid objects

Therefore each distal element is likely to move the same amount and in the same direction as elements near to it
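
The minimization principle stated above can be illustrated with a small brute-force sketch. It is one possible reading of the principle, offered under stated assumptions rather than as the procedure the visual system or any published model uses: the function names, the cost (the summed difference between each element's displacement vector and that of its single nearest neighbor), and the exhaustive search over pairings are all hypothetical and only practical for a handful of elements.

```python
import math
from itertools import permutations

def nearest_neighbor(i, points):
    """Index of the point closest to points[i], excluding i itself."""
    others = [j for j in range(len(points)) if j != i]
    return min(others, key=lambda j: math.dist(points[i], points[j]))

def match_by_smoothness(frame1, frame2):
    """Pair elements of frame1 with elements of frame2.

    Returns a tuple m where frame1[i] is paired with frame2[m[i]], chosen
    so that each element's displacement differs as little as possible from
    the displacement of its nearest neighbor in frame1.
    """
    n = len(frame1)
    nn = [nearest_neighbor(i, frame1) for i in range(n)]

    def cost(perm):
        # Displacement vector of every element under this candidate pairing.
        d = [(frame2[perm[i]][0] - frame1[i][0],
              frame2[perm[i]][1] - frame1[i][1]) for i in range(n)]
        # Sum of vector differences between neighboring displacements.
        return sum(math.dist(d[i], d[nn[i]]) for i in range(n))

    return min(permutations(range(n)), key=cost)

# Three elements on a rigid surface, all shifted right by 0.75.
frame1 = [(0, 0), (1, 0), (0, 2)]
frame2 = [(0.75, 0), (1.75, 0), (0.75, 2)]
print(match_by_smoothness(frame1, frame2))
```

In this display the element at (1, 0) is closer to the new position of its neighbor than to its own new position, so pairing by proximity alone would mismatch it. The neighbor-consistency cost instead favors the pairing in which all elements move the same amount in the same direction, which is what the rigidity constraint predicts.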

Page 146:

Views of a dome

Page 147:

Views of a dome

Animation of spheres

Page 148:

Structure from Motion Demo

Cylinder Kinetic Depth Effect

Page 149:

The correspondence problem for biological motion