Mid level vision, neglected yet still important. Ken Nakayama, Harvard University

Post on 20-Jan-2016


1

Mid level vision, neglected yet still important

Ken Nakayama

Harvard University

2

21st C challenge

Existence and variation of occlusion and variable sources of illumination pose unsolved problems for vision

3

• Object representation needs an intermediate level format

• Low level vision alone is not even explanatory for a wide range of visual processes (motion, stereo, search)

• Missing -- a satisfactory scientific description of surface level vision

4

1950s

1970s Visual take-over of the whole brain

1980s

visual

Half of the primate brain and a substantial fraction of the human brain are devoted to vision

5

Macaque monkey brain flattened

Visual regions shown in color

6

Global division of the visual system

dorsal (where, how)

ventral (what)

7

[Diagram labels: image; surfaces; where / how (dorsal, parietal); what (ventral, temporal)]

Marr's 3 levels: primal sketch, 2.5-D sketch, 3-D object

alternative view: action, object recognition

BYPASS?

8

[Diagram labels: image; surfaces; where / how (dorsal, parietal); what (ventral, temporal)]

Marr's 3 levels: primal sketch, 2.5-D sketch, 3-D object

alternative view: action, object recognition

motion, search, depth, attention

9

Kanizsa

Phenomenology, reviving the Gestalt approach

Level: surfaces

Method: phenomenology

Practitioner: Gaetano Kanizsa

new concepts: amodal and modal completion

10

Amodal completion (behind)

Modal completion (in front)

11

Inferences, but at what level?

12

13

Amodal completion trumps knowledge of horses

Suggests that there is a completion process within the visual system

14

Amodal completion allows fragments to be grouped and thus recognized (strongest evidence)

letter B

spot the 5 letter Bs

From Bregman, 1990

same fragments

15

Occlusion and the problem of segmentation for object recognition

[Figure axes: x, y, z]

Border ownership issues - for 3-D scenes, borders cannot be shared. The border dispute needs resolution

Rule - the border belongs to the closest surface

What belongs together?
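The rule on this slide can be written down in a few lines. This is only a minimal sketch, not a model from the talk: the `Region` type, its `depth` field (assumed to be a known distance from the viewer), and the function name `border_owner` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """An image region whose viewing distance is assumed to be known."""
    name: str
    depth: float  # distance from the viewer; smaller means closer (assumption)


def border_owner(a: Region, b: Region) -> Region:
    """Apply the slide's rule: the shared border belongs to the closest surface."""
    if a.depth == b.depth:
        raise ValueError("equal depths: the rule does not resolve the dispute")
    return a if a.depth < b.depth else b


occluder = Region("occluder", depth=1.0)
face = Region("face", depth=2.0)
print(border_owner(occluder, face).name)  # the nearer region owns the border
```

In a real scene the depths would themselves have to be inferred (for example from stereoscopic disparity); here they are simply given.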

16

Problem of segmentation: Kanizsa's figure

Normal or amputee?

17

Border ownership dictated by “lines” prevents modal and amodal completion

18

New sources of evidence

Surface in front “owns” the border. Thus face on right is broken up, on left is OK

Stereoscopic disparity

Nakayama et al., Perception '89 - faces easier to recognize on left

19

Stereoscopic depth also determines border ownership between regions. Nearer surface will own the border (for opaque surfaces)

Nakayama & Shimojo stereo demonstrations

20

Image level can’t even explain much lower level vision

Deployment of attention, motion perception, texture, visual search

21

[Diagram labels: image; surfaces; where / how (dorsal, parietal); what (ventral, temporal)]

22

Surfaces needed for much lower visual function

[Diagram labels, first version: image features; texture perception, visual search, motion perception]

[Diagram labels, second version: image; surface representation features; texture perception, visual search, motion perception]

23

He and Nakayama search task

Nature (1992). Used stereo vision

24

[Diagram labels: image; surface representation features; texture perception, visual search, motion perception]

25

Random dot stereogram

[Figure: left eye (LE) and right eye (RE) arrays of random binary dots, with unpaired points marked]

The correspondence problem: an image based problem
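A small sketch may help make the random dot stereogram and its correspondence problem concrete. It assumes the textbook construction (a patch of dots is displaced horizontally in one eye's image and the uncovered strip is refilled with fresh dots), which may differ from the figure on the slide. The function names `make_stereogram` and `best_shift` and all parameter values are hypothetical.

```python
import random


def make_stereogram(rows, cols, patch, shift, seed=0):
    """Return (left, right) arrays of binary dots.

    `patch` is (top, left_col, height, width); that region is displaced
    `shift` columns leftward in the right eye's image.
    """
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(cols)] for _ in range(rows)]
    right = [row[:] for row in left]
    top, lft, h, w = patch
    for r in range(top, top + h):
        # Copy the patch, displaced leftward, into the right eye's image.
        for c in range(lft, lft + w):
            right[r][c - shift] = left[r][c]
        # The strip uncovered by the displacement gets fresh dots that
        # have no partner in the left eye's image (unpaired points).
        for c in range(lft + w - shift, lft + w):
            right[r][c] = rng.randint(0, 1)
    return left, right


def best_shift(left, right, patch, max_shift):
    """Brute-force correspondence: the shift that matches the most dots."""
    top, lft, h, w = patch

    def matches(s):
        return sum(left[r][c] == right[r][c - s]
                   for r in range(top, top + h)
                   for c in range(lft, lft + w))

    return max(range(max_shift + 1), key=matches)


patch = (4, 8, 8, 8)
left, right = make_stereogram(16, 24, patch, shift=2)
print(best_shift(left, right, patch, max_shift=4))
```

The matching step works purely on image dots, which is the sense in which correspondence is an image based problem. The refilled strip is the part that matching cannot account for, because those dots exist in one eye only.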

26

L.E. only R.E. only

27

invisible to right eye

What would happen if we presented unpaired points by themselves?

What gives rise to unpaired points?

occluding surfaces
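The geometry behind unpaired points can be sketched with similar triangles. The example assumes a simplified top-down layout that is not taken from the slide: two eyes on the line z = 0, a flat occluder parallel to a flat background, and an occluder wide enough that the two hidden intervals overlap. All names and numbers are hypothetical.

```python
def hidden_interval(eye_x, occ_left, occ_right, occ_z, back_z):
    """Interval of the background plane hidden from an eye at (eye_x, 0)."""
    k = back_z / occ_z  # how much the occluder's "shadow" is magnified
    return (eye_x + (occ_left - eye_x) * k, eye_x + (occ_right - eye_x) * k)


def monocular_zones(eye_sep, occ_left, occ_right, occ_z, back_z):
    """Background strips that are visible to exactly one eye."""
    if not 0 < occ_z < back_z:
        raise ValueError("occluder must lie between the eyes and the background")
    left_hidden = hidden_interval(-eye_sep / 2, occ_left, occ_right, occ_z, back_z)
    right_hidden = hidden_interval(+eye_sep / 2, occ_left, occ_right, occ_z, back_z)
    return {
        # Hidden from the right eye but not from the left eye.
        "left eye only": (right_hidden[0], left_hidden[0]),
        # Hidden from the left eye but not from the right eye.
        "right eye only": (right_hidden[1], left_hidden[1]),
    }


zones = monocular_zones(eye_sep=6, occ_left=-5, occ_right=5, occ_z=50, back_z=100)
print(zones)
```

With these numbers each strip is 6 units wide, equal to `eye_sep * (back_z / occ_z - 1)`. The left-eye-only strip lies to the left of the occluder and the right-eye-only strip to the right of it, so the side on which an unpaired point appears carries information about the occluding surface.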

28

binocular

left only

right only

no depth

front

back

DaVinci stereopsis (Nakayama & Shimojo)

29

Scene depth from unpaired gaps

Gillam and Nakayama, 1999

LE RE

30

Forest vs plane

3-D arrangement planar

A plane is a surface which can occlude; a set of random sticks cannot

31

Planes vs sticks

abutting

separated

interleaved

Gillam and Nakayama, 200

32

Level of processing: high or low level inference?

Hypothesis - inferences learned via associative cortical learning

33

generic view principle

when faced with more than one surface interpretation of an image, the visual system assumes it is viewing the scene from a generic, not accidental, vantage point.

Nakayama and Shimojo

34

LE RE

folded wings?

folded cards?

Why don’t we interpolate depth and see folded wings and cards?

Some counterintuitive observations

35

36

Accidental vs generic vantage points

37

• accidental view

• generic view

38

cube (volume)

square (surface)

surfaces, images

viewing sphere

39

generic view principle

when faced with more than one surface interpretation of an image, the visual system assumes it is viewing the scene from a generic, not accidental, vantage point.

40

Candidate surfaces: S_1, S_2, ..., S_n

Images: I_1, I_2, ..., I_m

Perception (inverse optics)

Learning (optics): image sampling through locomotion

$p(I_m \mid S_n)$
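One way to read this slide is as a likelihood table p(I | S) that is filled in by sampling views, then inverted at perception time. The sketch below is an illustration under stated assumptions, not the speaker's model: it uses orthographic projection, viewpoints drawn uniformly from the viewing sphere, two made-up candidates (a straight rod and a rod bent at a right angle), equal priors, and Bayes' rule for the inversion. All function names are hypothetical.

```python
import math
import random


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def random_view(rng):
    """Viewing direction drawn uniformly from the viewing sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)


def image_is_straight(rod, view, tol_deg=2.0):
    """Project along `view`; True if the image is a single straight line."""
    def project(v):
        return sub(v, tuple(dot(v, view) * d for d in view))

    u = project(sub(rod[1], rod[0]))
    v = project(sub(rod[2], rod[1]))
    if norm(u) < 1e-9 or norm(v) < 1e-9:
        return False  # a segment seen end-on projects to a point
    angle = math.degrees(math.atan2(norm(cross(u, v)), dot(u, v)))
    return angle < tol_deg or angle > 180.0 - tol_deg


def likelihood(rod, n=20000, seed=0):
    """Estimate p(I = straight line | S = rod) by sampling viewpoints."""
    rng = random.Random(seed)
    return sum(image_is_straight(rod, random_view(rng)) for _ in range(n)) / n


surfaces = {
    "straight rod": ((-1, 0, 0), (0, 0, 0), (1, 0, 0)),
    "bent rod": ((-1, 0, 0), (0, 0, 0), (0, 1, 0)),
}
like = {name: likelihood(rod) for name, rod in surfaces.items()}
total = sum(like.values())  # equal priors assumed
posterior = {name: p / total for name, p in like.items()}
print(like)
print(posterior)
```

The straight rod yields a straight image from essentially every vantage point, whereas the bent rod does so only from the narrow band of accidental views lying close to the plane of the bend. Given a straight image, the posterior therefore favors the interpretation for which the view is generic.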

41

LE RE

folded wings?

folded cards?

Why don’t we interpolate depth and see folded wings and cards?

Some counterintuitive observations

42

A B

LE RE

conclusion: this is a generic view of crossed bars, not wings

43

[Figure panels: (a) folded card, regions 1, 2, 3; (b) transparency, regions 1, 2, 3]

this is the generic view of a transparent surface in front, not a folded card

44

Neural mechanisms of surface representation?

Cells in V2 respond to subjective contours

Strategy: vary stimuli in ways that lead to appearance and disappearance of subjective contours

45

Recordings from a single cell in area V2 of monkey

Physiological correlates of illusory contours in single unit recordings

[Figure: V2 receptive field. Response to real line: yes; illusory line: yes; control: no]

46

Bakin, Nakayama, and Gilbert, 2000

47

Edgar Rubin figure and ground

Edge labeling? Contrast polarity vs edge labeling

Cells coding border ownership?

Von der Heydt et al.

48

[Figure: contrast polarity (+/-) patterns]

image based response

49

Border ownership based response

50

Border ownership cells

Von der Heydt and colleagues

51

Von der Heydt (1984)

Bakin, Nakayama, Gilbert (2000)

DaVinci stereopsis

Border ownership cells (V2)

yes

Mechanistic account of surface representation? -->

V2

V2

52

21st C challenge

Existence and variation of occlusion and variable sources of illumination pose unsolved problems for vision

53

• Object representation needs an intermediate level format

• Low level vision alone is not even explanatory for a wide range of visual processes (motion, stereo, search)

• Missing -- a satisfactory scientific description of surface level vision

-- demos of the importance of illumination for object recognition

54

importance of shadow processing


Ted Adelson

55

outline: no

shadow face: yes

reduce contrast: yes

Shadow processing in object recognition

56

reduce contrast: yes

add outline: no

Outline is very destructive to seeing regions as shaded. The line is interpreted as a bounding contour of an object

57