A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early...
![Page 1: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/1.jpg)
Some slides from A. Gupta, R. Salakhutdinov, A. Efros, L. Zitnick and others
![Page 2: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/2.jpg)
Why should we care about Computer Vision?
![Page 3: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/3.jpg)
Computer Vision / Deep Learning is Everywhere
• Google, Facebook, Uber, Apple
– Strong deep learning / computer vision groups hiring everywhere
– Beyond Research: Development
• Image Search
• Automated Driving
Startups Sold Every Day
• Vision Factory, EuVision, Flutter…
![Page 4: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/4.jpg)
Computer Vision Works!
● Surprisingly recent development
![Page 8: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/8.jpg)
What is the goal of Computer Vision?
![Page 9: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/9.jpg)
What is the goal of Computer Vision?
To create autonomous systems that “understand” visual data
![Page 10: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/10.jpg)
What does it mean to understand?
![Page 11: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/11.jpg)
Early days of Computer Vision
![Page 12: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/12.jpg)
Early days of Computer Vision
“What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking.”
-- David Marr, Vision (1982)
![Page 13: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/13.jpg)
Early days of Computer Vision
“What does it mean, to see? The plain man's answer (and Aristotle's, too) would be, to know what is where by looking. In other words, vision is the process of discovering from images what is present in the world, and where it is.”
-- David Marr, Vision (1982)
![Page 14: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/14.jpg)
Early days of Computer Vision
Slide Credit: Abhinav Gupta
![Page 15: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/15.jpg)
Early days of Computer Vision
Answer #1: pixel of brightness 43 at position (124,54) …and depth 0.7 meters
![Page 16: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/16.jpg)
Early days of Computer Vision
Answer #1: pixel of brightness 43 at position (124,54) …and depth 0.7 meters
Answer #2: looks like flat sittable surface of the couch
![Page 17: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/17.jpg)
So we’re done?
![Page 18: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/18.jpg)
So we’re done?
No!
![Page 19: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/19.jpg)
Measurement vs. Perception
![Page 20: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/20.jpg)
Brightness: Measurement vs. Perception
Slide Credit: Alyosha Efros
![Page 21: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/21.jpg)
Brightness: Measurement vs. Perception
Proof!
![Page 22: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/22.jpg)
Measurement
Length
Müller-Lyer Illusion
http://www.michaelbach.de/ot/sze_muelue/index.html
![Page 23: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/23.jpg)
Measurement
• Capturing physical quantities like pixel brightness, depth, etc.
Perception/Understanding
• A high-level representation that captures the semantic structure of the scene and its constituent objects.
• Subjective – Depends on Task and Agent
• Intersection of what you see and what you believe (prior knowledge)
Slide Credit: Abhinav Gupta
![Page 24: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/24.jpg)
…but why do we care about perception?
The goals of computer vision (what + where) are in terms of what humans care about.
![Page 25: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/25.jpg)
So what do humans care about?
Slide Credit: Abhinav Gupta
![Page 26: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/26.jpg)
Image Classification/ Scene Recognition
Living Room
![Page 27: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/27.jpg)
Couch
Table
Object Detection
![Page 28: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/28.jpg)
Couch
Table
Object Segmentation/Categorization
![Page 29: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/29.jpg)
3D Understanding
![Page 30: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/30.jpg)
Can Sit
Can Walk
Can Move
Can Push
Functional Understanding
![Page 31: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/31.jpg)
Pose Estimation:
Slide Credit: Abhinav Gupta
![Page 32: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/32.jpg)
Activity Recognition:
What is he doing?
![Page 33: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/33.jpg)
Input Image Surface Connection Graph
Surface Normal Segmentation
![Page 34: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/34.jpg)
Surface Normal Estimation
Slide Credit: Abhinav Gupta
![Page 35: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/35.jpg)
Why are these problems hard?
![Page 36: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/36.jpg)
Challenges 1: viewpoint variation
Michelangelo 1475-1564
slide by Fei-Fei, Fergus & Torralba
![Page 37: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/37.jpg)
Challenges 2: illumination
slide credit: S. Ullman
![Page 38: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/38.jpg)
Challenges 3: occlusion
Magritte, 1957
slide by Fei-Fei, Fergus & Torralba
![Page 39: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/39.jpg)
Challenges 4: scale
slide by Fei-Fei, Fergus & Torralba
![Page 40: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/40.jpg)
Challenges 5: deformation
Xu, Beihong 1943
![Page 41: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/41.jpg)
Challenges 6: background clutter
Klimt, 1913
slide by Fei-Fei, Fergus & Torralba
![Page 42: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/42.jpg)
Challenges 7: object intra-class variation
slide by Fei-Fei, Fergus & Torralba
![Page 43: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/43.jpg)
Challenges 8: local ambiguity
slide by Fei-Fei, Fergus & Torralba
![Page 44: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/44.jpg)
Challenges 9: the world behind the image
Slide Credit: Alyosha Efros
![Page 45: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/45.jpg)
How do we solve it?
![Page 46: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/46.jpg)
In the days of old
About 4 years ago...
![Page 47: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/47.jpg)
In the days of old
● Take original image
● Do some feature preprocessing
○ Histogram of Oriented Gradients (HOG)
○ SIFT
○ SURF
● Run through some classifier
○ Often SVMs or Decision Trees (specifically random forests)
■ We’ll cover SVMs later in class (similar to perceptron)
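The pipeline above (hand-designed features, then a classifier) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code used in the lecture: `hog_features` is a simplified HOG-style descriptor (per-cell orientation histograms, no block normalization), the classifier is a plain perceptron standing in for an SVM, and `make_stripes` produces a synthetic toy dataset invented here for the example.

```python
import numpy as np

def hog_features(image, cell=8, bins=9):
    """Simplified HOG-style descriptor (assumption: no block normalization).

    Each cell gets a histogram of unsigned gradient orientations,
    weighted by gradient magnitude.
    """
    image = image.astype(np.float64)
    gy, gx = np.gradient(image)                      # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0   # unsigned orientation
    bin_index = np.minimum((angle / (180.0 / bins)).astype(int), bins - 1)

    rows, cols = image.shape[0] // cell, image.shape[1] // cell
    hist = np.zeros((rows, cols, bins))
    for r in range(rows):
        for c in range(cols):
            window = (slice(r * cell, (r + 1) * cell),
                      slice(c * cell, (c + 1) * cell))
            hist[r, c] = np.bincount(bin_index[window].ravel(),
                                     weights=magnitude[window].ravel(),
                                     minlength=bins)
    feature = hist.ravel()
    return feature / (np.linalg.norm(feature) + 1e-8)

def make_stripes(vertical, rng, size=16):
    """Hypothetical toy image: vertical or horizontal stripes plus noise."""
    ramp = np.sin(np.arange(size) * np.pi / 4.0)
    img = np.tile(ramp, (size, 1))       # intensity varies along columns
    if not vertical:
        img = img.T
    return img + 0.05 * rng.standard_normal((size, size))

def train_perceptron(X, y, epochs=20):
    """Classic perceptron updates; labels must be +1 or -1."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:   # misclassified: move the boundary
                w += yi * xi
                b += yi
    return w, b

rng = np.random.default_rng(0)
labels = np.array([1, -1] * 15)          # +1 = vertical, -1 = horizontal
images = [make_stripes(vertical=(lab == 1), rng=rng) for lab in labels]
X = np.stack([hog_features(img) for img in images])

w, b = train_perceptron(X[:20], labels[:20])         # first 20 for training
predictions = np.sign(X[20:] @ w + b)                # last 10 held out
accuracy = float(np.mean(predictions == labels[20:]))
print(f"held-out accuracy: {accuracy:.2f}")
```

The point of the sketch is the division of labor: the feature step is fixed by hand, and only the final linear classifier is learned.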
![Page 48: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/48.jpg)
Histograms of oriented gradients
From Deva Ramanan’s Lake Como slides
![Page 49: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/49.jpg)
Lowe’s SIFT features
![Page 50: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/50.jpg)
Deep Learning Era
● We can actually learn better representations directly from raw pixel values!
● Run through Convolutional Neural Networks, and other types of Neural Networks
○ We’ll cover neural networks and CNNs later in class
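To make "learning representations from raw pixels" concrete, here is a small sketch that learns a single 3x3 convolution filter by gradient descent instead of designing it by hand. The setup is an assumption made for illustration: the training target is the response of a Sobel edge filter on a random image, and there is one filter with no nonlinearity, which is far simpler than a real CNN.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Cross-correlation with 'valid' padding, the operation CNN layers use."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))    # stand-in for raw pixel values

# Illustrative target: what a hand-designed horizontal-gradient filter outputs.
sobel_x = np.array([[-1.0, 0.0, 1.0],
                    [-2.0, 0.0, 2.0],
                    [-1.0, 0.0, 1.0]])
target = conv2d_valid(image, sobel_x)

kernel = np.zeros((3, 3))                # the filter starts with no structure
learning_rate = 0.1
for step in range(400):
    error = conv2d_valid(image, kernel) - target
    # Gradient of the mean squared error with respect to each kernel weight:
    # correlate the input image with the error map.
    grad = 2.0 * conv2d_valid(image, error) / error.size
    kernel -= learning_rate * grad

print(np.round(kernel, 3))
```

The filter weights are recovered from data alone. In a CNN the same idea is applied to many filters across several layers, with the loss coming from the final task rather than from a hand-picked target.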
![Page 51: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/51.jpg)
What Changed?
![Page 52: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/52.jpg)
More Computational Power
![Page 53: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/53.jpg)
Better Algorithms
![Page 54: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/54.jpg)
Most Importantly
![Page 55: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/55.jpg)
More Data!!!
![Page 56: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/56.jpg)
Revisiting “Understanding”
● Is it actually enough to just know what’s in an image and where?
![Page 57: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/57.jpg)
Q: Do you see a fruit that Gallagher would likely smash with the Sledge-O-Matic?
![Page 58: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/58.jpg)
Clearly more here
![Page 59: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/59.jpg)
Idea 1: Caption Generation
![Page 60: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/60.jpg)
Slide Credit: Larry Zitnick
Problem: No good evaluations
![Page 61: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/61.jpg)
Next try: Visual Question Answering
● Input
○ I: An image (MS COCO)
○ Q: A question about the image
● Output
○ A: The answer to the question
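The input/output structure above can be written down as a small data type, which also shows why this task is easier to score than captioning: a predicted answer can be compared directly with a reference answer. Everything in this sketch is hypothetical: the `VQAExample` type, the image identifiers, the toy questions, and the exact-match metric, which is a simplification rather than the official benchmark measure. The "model" is a trivial baseline that ignores the image and always returns the most frequent training answer.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass(frozen=True)
class VQAExample:
    image_id: str   # hypothetical identifier standing in for the image I
    question: str   # Q: a question about the image
    answer: str     # A: the reference answer

def normalize(text):
    """Lowercase and trim so 'Yes ' and 'yes' count as the same answer."""
    return text.strip().lower()

def most_common_answer(train_examples):
    """Baseline 'model': always predict the most frequent training answer."""
    counts = Counter(normalize(ex.answer) for ex in train_examples)
    return counts.most_common(1)[0][0]

def exact_match_accuracy(predictions, examples):
    """Fraction of predictions that equal the reference answer exactly."""
    correct = sum(normalize(pred) == normalize(ex.answer)
                  for pred, ex in zip(predictions, examples))
    return correct / len(examples)

# Toy data invented for this example.
train = [
    VQAExample("img_001", "Is there a couch?", "yes"),
    VQAExample("img_002", "Is the table round?", "Yes"),
    VQAExample("img_003", "How many chairs are there?", "2"),
    VQAExample("img_004", "Is the lamp on?", "no"),
]
held_out = [
    VQAExample("img_005", "Is there a window?", "yes"),
    VQAExample("img_006", "Is the door open?", "no"),
    VQAExample("img_007", "Is there a rug?", "yes"),
]

prior = most_common_answer(train)
accuracy = exact_match_accuracy([prior] * len(held_out), held_out)
print(prior, round(accuracy, 3))
```

A baseline like this is a useful sanity check: a model that really uses the image should beat an answer chosen without looking at it.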
![Page 62: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/62.jpg)
![Page 63: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/63.jpg)
Do we need “embodiment”?
● Perhaps we can only judge how good perception / language understanding is in the context of an agent
![Page 65: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/65.jpg)
Other Interesting Directions
![Page 66: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/66.jpg)
![Page 67: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/67.jpg)
Slide Credit: Larry Zitnick
![Page 68: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/68.jpg)
Incorporating Outside Knowledge
![Page 69: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/69.jpg)
Using Knowledge Graphs
![Page 70: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/70.jpg)
![Page 71: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/71.jpg)
Elephant Shrew
![Page 72: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/72.jpg)
Elephant Shrew
![Page 73: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/73.jpg)
Elephant Shrew
![Page 74: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/74.jpg)
Elephant Shrew
![Page 75: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/75.jpg)
Elephant Shrew
![Page 76: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/76.jpg)
Elephant Shrew
![Page 77: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/77.jpg)
![Page 78: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/78.jpg)
![Page 79: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/79.jpg)
![Page 80: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/80.jpg)
![Page 81: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/81.jpg)
Elephant Shrew
![Page 82: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/82.jpg)
Using Web Search
![Page 83: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/83.jpg)
Q: Do you see a fruit that Gallagher would likely smash with the Sledge-O-Matic?
![Page 84: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/84.jpg)
How I would solve the question
![Page 85: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/85.jpg)
How I would solve the question
![Page 86: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/86.jpg)
Q: Do you see a fruit that Gallagher would likely smash with the Sledge-O-Matic?
![Page 87: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/87.jpg)
Q: Do you see a fruit that Gallagher would likely smash with the Sledge-O-Matic?
![Page 88: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/88.jpg)
Q: Do you see a fruit that Gallagher would likely smash with the Sledge-O-Matic?
A: Yes
![Page 89: A. Efros, L. Zitnick and others Some slides from A. Gupta ...ninamf/courses/401sp18/... · Early days of Computer Vision Slide Credit: Abhinav Gupta. Early days of Computer Vision](https://reader035.fdocuments.in/reader035/viewer/2022081407/5f1bffa192cc18723b221b94/html5/thumbnails/89.jpg)
Use Query to get info
[Figure labels: qy, Page, Article Encoder, hinfo]