Visual Recognition With Humans in the Loop Steve Branson Catherine Wah Florian Schroff Boris Babenko...
![Page 1: Visual Recognition With Humans in the Loop Steve Branson Catherine Wah Florian Schroff Boris Babenko Serge Belongie Peter Welinder Pietro Perona ECCV 2010,](https://reader036.fdocuments.in/reader036/viewer/2022062511/551a546d5503463e778b5604/html5/thumbnails/1.jpg)
1
Visual Recognition With Humans in the Loop
Steve Branson
Catherine Wah
Florian Schroff
Boris Babenko
Serge Belongie
Peter Welinder
Pietro Perona
ECCV 2010, Crete, Greece
2
What type of bird is this?
What type of bird is this?
3…?
Field Guide
4
What type of bird is this?
Computer Vision
?
5
What type of bird is this?
Bird?
Computer Vision
6
What type of bird is this?
Chair? Bottle?
Computer Vision
7
Parakeet Auklet
• Field guides difficult for average users
• Computer vision doesn’t work perfectly (yet)
• Research mostly on basic-level categories
8
Visual Recognition With Humans in the Loop
Parakeet Auklet
What kind of bird is this?
9
Levels of Categorization
Airplane? Chair? Bottle? …
Basic-Level Categories
[Griffin et al. ’07, Lazebnik et al. ’06, Grauman et al. ’06, Everingham et al. ’06, Felzenszwalb et al. ’08, Viola et al. ’01, …]
10
Levels of Categorization
American Goldfinch? Indigo Bunting? …
Subordinate Categories
[Belhumeur et al. ’08, Nilsback et al. ’08, …]
11
Levels of Categorization
Yellow Belly? Blue Belly?…
Parts and Attributes
[Farhadi et al. ‘09, Lampert et al. ’09, Kumar et al. ‘09]
12
Visual 20 Questions Game
Blue Belly? no
Cone-shaped Beak? yes
Striped Wing? yes
American Goldfinch? yes
Hard classification problems can be turned into a sequence of easy ones
13
Recognition With Humans in the Loop
Computer Vision
Cone-shaped Beak? yes
American Goldfinch? yes
Computer Vision
• Computers: reduce number of required questions
• Humans: drive up accuracy of vision algorithms
14
Research Agenda
Heavy Reliance on Human Assistance → More Automated (as Computer Vision Improves)
• 2010: Blue belly? no. Cone-shaped beak? yes. Striped Wing? yes. American Goldfinch? yes
• 2015: Striped Wing? yes. American Goldfinch? yes
• 2025: Fully Automatic. American Goldfinch? yes
15
Field Guides
www.whatbird.com
17
Example Questions
23
Basic Algorithm
Input Image ($x$)
Computer Vision: $p(c \mid x)$
Max Expected Information Gain
Question 1: Is the belly black?
A: NO ($u_1$)
$p(c \mid x, u_1)$
Max Expected Information Gain
Question 2: Is the bill hooked?
A: YES ($u_2$)
$p(c \mid x, u_1, u_2)$
…
24
Without Computer Vision
Input Image ($x$)
Class Prior: $p(c)$
Max Expected Information Gain
Question 1: Is the belly black?
A: NO ($u_1$)
$p(c \mid u_1)$
Max Expected Information Gain
Question 2: Is the bill hooked?
A: YES ($u_2$)
$p(c \mid u_1, u_2)$
…
25
Basic Algorithm
Select the next question that maximizes expected information gain:
• Easy to compute if we can estimate probabilities of the form $p(c \mid x, u_1, u_2, \ldots, u_t)$
where $c$ is the object class, $x$ is the image, and $u_1, u_2, \ldots, u_t$ is the sequence of user responses.
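The selection rule on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes each candidate question has a known table of answer probabilities per class, and the function names (`entropy`, `expected_information_gain`, `select_question`) are hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy, in bits, of a probability vector."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-(nz * np.log2(nz)).sum())

def expected_information_gain(posterior, answer_given_class):
    """Expected drop in class entropy from asking one question.

    posterior:          shape (C,), current p(c | x, u_1..u_t)
    answer_given_class: shape (C, A), assumed p(answer | c) for the question
    """
    posterior = np.asarray(posterior, dtype=float)
    answer_given_class = np.asarray(answer_given_class, dtype=float)
    joint = posterior[:, None] * answer_given_class  # p(c, answer)
    p_answer = joint.sum(axis=0)                     # p(answer)
    gain = entropy(posterior)
    for a in range(joint.shape[1]):
        if p_answer[a] > 0:
            # Subtract the entropy of the posterior after seeing answer a,
            # weighted by how likely that answer is.
            gain -= p_answer[a] * entropy(joint[:, a] / p_answer[a])
    return gain

def select_question(posterior, question_models, asked=()):
    """Return (index, gain) of the unasked question with the largest gain."""
    best, best_gain = None, -np.inf
    for q, model in enumerate(question_models):
        if q in asked:
            continue
        g = expected_information_gain(posterior, model)
        if g > best_gain:
            best, best_gain = q, g
    return best, best_gain

# Toy example: 4 equally likely classes, two yes/no questions.
uniform = np.full(4, 0.25)
splits_in_half = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
uninformative = np.full((4, 2), 0.5)
best, gain = select_question(uniform, [uninformative, splits_in_half])
```

In the toy example the question that splits the classes in half is worth one bit, while the question whose answer does not depend on the class is worth nothing, so the former is selected.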
26
Basic Algorithm
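$p(c \mid x, u_1, u_2, \ldots, u_t) = \dfrac{p(u_1, u_2, \ldots, u_t \mid c)\, p(c \mid x)}{Z}$
• $p(u_1, u_2, \ldots, u_t \mid c)$: model of user responses
• $p(c \mid x)$: computer vision estimate
• $Z$: normalization factor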
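As a rough sketch of this update, the posterior is the elementwise product of the user-response likelihood and the computer vision estimate, divided by its sum. The function name `update_posterior` and the toy numbers are assumptions for illustration only.

```python
import numpy as np

def update_posterior(vision_probs, response_likelihoods):
    """Combine p(c | x) with p(u_1..u_t | c) and normalize by Z.

    vision_probs:         shape (C,), computer vision estimate p(c | x)
    response_likelihoods: shape (C,), user-response model p(u_1..u_t | c)
    """
    unnormalized = (np.asarray(response_likelihoods, dtype=float)
                    * np.asarray(vision_probs, dtype=float))
    z = unnormalized.sum()  # normalization factor Z
    if z == 0:
        raise ValueError("responses are impossible under every class")
    return unnormalized / z

# Toy example with three classes (numbers are made up).
vision = np.array([0.5, 0.3, 0.2])
likelihood = np.array([0.1, 0.6, 0.3])
posterior = update_posterior(vision, likelihood)
```

Passing a class prior $p(c)$ in place of `vision_probs` gives the variant without computer vision.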
28
Modeling User Responses
• Assume: $p(u_1, u_2, \ldots, u_t \mid c) = \prod_{i=1}^{t} p(u_i \mid c)$
• Estimate $p(u_i \mid c)$ using Mechanical Turk
[Figure: histograms of responses to "What is the color of the belly?" for Pine Grosbeak over the colors grey, red, black, white, brown, and blue, one panel per confidence level: Definitely, Probably, Guessing]
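One way to picture this user model: count how often workers give each answer for images of each class, normalize the counts into $p(u_i \mid c)$, and multiply the per-question terms under the independence assumption. The sketch below is illustrative only; the smoothing pseudocount, the function names, and the counts are assumptions, and an answer index could stand for a (value, confidence) pair such as "black, Definitely".

```python
import numpy as np

def estimate_response_model(counts, pseudocount=1.0):
    """Estimate p(u = a | c) from a (C, A) table of answer counts.

    The additive pseudocount is an assumed smoothing choice so that
    no answer gets probability zero.
    """
    counts = np.asarray(counts, dtype=float) + pseudocount
    return counts / counts.sum(axis=1, keepdims=True)

def sequence_likelihood(models, answers):
    """p(u_1..u_t | c) as a product of per-question terms p(u_i | c).

    models:  list of (C, A) arrays, one per question asked
    answers: list of observed answer indices, one per question asked
    """
    likelihood = np.ones(models[0].shape[0])
    for model, a in zip(models, answers):
        likelihood *= model[:, a]
    return likelihood

# Toy counts for 2 classes and a yes/no question (made-up numbers).
model = estimate_response_model([[8, 2], [1, 9]], pseudocount=0.0)
lik = sequence_likelihood([model, model], [0, 1])
```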
29
Incorporating Computer Vision
• Use any recognition algorithm that can estimate: p(c|x)
• We experimented with two simple methods:
• 1-vs-all SVM: $p(c \mid x) \propto \exp\{m(x)\}$
• Attribute-based classification [Lampert et al. ’09, Farhadi et al. ’09]: $p(c \mid x) \propto \prod_i p(a_i \mid c)$
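For the 1-vs-all SVM case, one common way to turn per-class scores into a distribution is to exponentiate and normalize them. The sketch below assumes that form; the `scale` parameter and the function name are hypothetical and not taken from the slides.

```python
import numpy as np

def scores_to_probs(scores, scale=1.0):
    """Map per-class classifier scores to p(c | x) by exponentiating.

    scores: shape (C,), one score per class (for example SVM margins)
    scale:  assumed tuning parameter controlling how peaked the result is
    """
    s = scale * np.asarray(scores, dtype=float)
    s = s - s.max()  # shift for numerical stability; result is unchanged
    e = np.exp(s)
    return e / e.sum()

probs = scores_to_probs([1.0, 0.0, -1.0])
```

Any recognition algorithm that outputs a probability per class could be substituted here, as the first bullet on this slide notes.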
30
Incorporating Computer Vision
• Used VLFeat and MKL code + color features [Vedaldi et al. ’08, Vedaldi et al. ’09]
• Features: Self Similarity, Color Histograms, Color Layout, Bag of Words, Spatial Pyramid, Geometric Blur, Color SIFT, SIFT
• Multiple Kernels
31
Birds 200 Dataset
• 200 classes, 6000+ images, 288 binary attributes
• Why birds?
Black-footed Albatross
Groove-Billed Ani
Parakeet Auklet Field Sparrow Vesper Sparrow
Arctic Tern Forster’s Tern Common Tern Baird’s Sparrow Henslow’s Sparrow
34
Results: Without Computer Vision
Comparing Different User Models
35
Results: Without Computer Vision
Perfect Users: 100% accuracy in 8 ≈ log2(200) questions
if users' answers agree with field guides…
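The figure of 8 questions follows from the number of classes: each perfectly answered yes/no question can at best halve the set of candidates, so 200 classes need at least log2(200) binary questions, rounded up. A quick check, with hypothetical variable names:

```python
import math

num_classes = 200
bits = math.log2(num_classes)  # about 7.64 bits to pick one class out of 200
questions = math.ceil(bits)    # whole yes/no questions needed, at best
```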
36
Results: Without Computer Vision
Real users answer questions
MTurkers don’t always agree with field guides…
38
Results: Without Computer Vision
Probabilistic User Model: tolerate imperfect user responses
39
Results: With Computer Vision
40
Results: With Computer Vision
Users drive performance: 19% → 68%
Just Computer Vision: 19%
41
Results: With Computer Vision
Computer Vision Reduces Manual Labor: 11.1 → 6.5 questions
42
Examples
Different Questions Asked w/ and w/out Computer Vision
Western Grebe
Without computer vision: Q #1: Is the shape perching-like? no (Def.)
With computer vision: Q #1: Is the throat white? yes (Def.)
43
Examples
computer vision
Magnolia Warbler
User Input Helps Correct Computer Vision
Is the breast pattern solid? no (definitely)
Common Yellowthroat
Magnolia Warbler
Common Yellowthroat
44
Recognition is Not Always Successful
Acadian Flycatcher
Least Flycatcher
Parakeet Auklet
Least Auklet
Is the belly multi-colored? yes (Def.)
Unlimited questions
45
Summary
• Recognition of fine-grained categories
• More reliable than field guides
• Computer vision reduces manual labor: 11.1 → 6.5 questions
• Users drive up performance: 19% → 68%
49
Future Work
• Extend to domains other than birds
• Methodologies for generating questions
• Improve computer vision
50
Questions?
Project page and datasets available at:
http://vision.caltech.edu/visipedia/
http://vision.ucsd.edu/project/visipedia/