CS 2770: Computer Visionkovashka/cs2770_sp17/vision... · 2017-01-03 · Convolutional Neural...
CS 2770: Computer Vision
Introduction
Prof. Adriana Kovashka
University of Pittsburgh
January 5, 2017
About the Instructor
Born 1985 in Sofia, Bulgaria
Got BA in 2008 at Pomona College, CA (Computer Science & Media Studies)
Got PhD in 2014 at University of Texas at Austin (Computer Vision)
Course Info
• Course website: http://people.cs.pitt.edu/~kovashka/cs2770
• Instructor: Adriana Kovashka ([email protected])
– Use "CS2770" at the beginning of your Subject
• Office: Sennott Square 5325
• Office hours: Tue/Thu, 3:30pm - 5:30pm
TA
• Keren Ye ([email protected])
• Office: Sennott Square 5501
• Office hours: TBD
– Do the Doodle by the end of Friday:
http://doodle.com/poll/v3m8acmcdsiydqhq
Textbooks
• Computer Vision: Algorithms and Applications by Richard Szeliski
• Visual Object Recognition by Kristen Grauman and Bastian Leibe
• More resources available on course webpage
• Your notes from class are your best study material; the slides do not contain complete notes
Course Goals
• To learn about the basic computer vision tasks and approaches
• To get experience with some computer vision techniques
• To learn/apply basic machine learning (a key component of modern computer vision)
• To think critically about vision approaches, and to see connections between works and potential for improvement
Policies and Schedule
http://people.cs.pitt.edu/~kovashka/cs2770/
Should I take this class?
• It will be a lot of work!
– But you will learn a lot
• Some parts will be hard and require that you pay close attention!
– But I will have periodic ungraded pop quizzes to see how you’re doing
– I will also pick on students randomly to answer questions
– Use instructor’s and TA’s office hours!!!
Questions?
Plan for Today
• Introductions
• What is computer vision?
– Why do we care?
– What are the challenges?
– What is the current research like?
• Overview of topics (if time)
Introductions
• What is your name?
• What one thing outside of school are you passionate about?
• Do you have any prior experience with computer vision?
• What do you hope to get out of this class?
• Every time you speak, please remind me of your name
Computer Vision
What is computer vision?
Done?
Kristen Grauman (adapted)
"We see with our brains, not with our eyes" (Oliver Sacks and others)
What is computer vision?
• Automatic understanding of images and video
– Computing properties of the 3D world from visual data (measurement)
– Algorithms and representations to allow a machine to recognize objects, people, scenes, and activities (perception and interpretation)
– Algorithms to mine, search, and interact with visual data (search and organization)
Kristen Grauman
Vision for measurement
Real-time stereo: NASA Mars Rover
Structure from motion: Pollefeys et al.
Multi-view stereo for community photo collections: Goesele et al.
Slide credit: L. Lazebnik, Kristen Grauman
Vision for perception, interpretation
Objects, Activities, Scenes, Locations, Text / writing, Faces, Gestures, Motions, Emotions…
[Figure: photo annotated with labels: sky, water, Ferris wheel, amusement park, Cedar Point, 12 E, tree, carousel, deck, people waiting in line, ride, umbrellas, pedestrians, maxair, bench, Lake Erie, people sitting on ride, The Wicked Twister]
Kristen Grauman
Visual search, organization
[Figure labels: Query, Image or video archives, Relevant content]
Kristen Grauman
Related disciplines
[Figure: computer vision and its related disciplines: cognitive science, algorithms, image processing, artificial intelligence, graphics, machine learning]
Kristen Grauman
Vision and graphics
Images → Model: Vision
Model → Images: Graphics
Inverse problems: analysis and synthesis.
Kristen Grauman
Why vision?
• Images and video are everywhere!
Personal photo albums
Surveillance and security
Movies, news, sports
Medical and scientific images
144k hours uploaded to YouTube daily
4.5 mil photos uploaded to Flickr daily
10 bil images indexed by Google
Adapted from Lana Lazebnik
Why vision?
• As image sources multiply, so do applications
– Relieve humans of boring, easy tasks
– Human-computer interaction
– Perception for robotics / autonomous agents
– Organize and give access to visual content
– Description of image content for the visually impaired
– Fun applications (e.g. transfer art styles to my photos)
Adapted from Kristen Grauman
Faces and digital cameras
Setting camera focus via face detection
Camera waits for everyone to smile to take a photo [Canon]
Kristen Grauman
Face recognition
Devi Parikh
Linking to info with a mobile device
kooaba
Situated search
Yeh et al., MIT
MSR Lincoln
Kristen Grauman
Exploring photo collections
Snavely et al.
Kristen Grauman
Special visual effects
The Matrix
What Dreams May Come
Mocap for Pirates of the Caribbean, Industrial Light and Magic
Source: S. Seitz
Kristen Grauman
Interactive systems
Yong Jae Lee
Video-based interfaces
Human joystick
NewsBreaker Live
Assistive technology systems
Camera Mouse
Boston College
Kristen Grauman
YouTube Link
Vision for medical & neuroimages
Image guided surgery
MIT AI Vision Group
fMRI data
Golland et al.
Kristen Grauman
Safety & security
Navigation, driver safety
Monitoring pool (Poseidon)
Surveillance
Pedestrian detection (MERL, Viola et al.)
Kristen Grauman
Healthy eating
Im2calories by Myers et al., ICCV 2015
figure source
FarmBot.io
YouTube Link
Self-training for sports?
Pirsiavash et al., “Assessing the Quality of Actions”, ECCV 2014
Image generation
Radford et al., ICLR 2016
Reed et al., ICML 2016
YouTube link
Seeing AI
Obstacles?
Kristen Grauman
Read more about the history: Szeliski Sec. 1.2
What the computer gets
Adapted from Kristen Grauman and Lana Lazebnik
Why is this problematic?
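One way to make "what the computer gets" concrete is to treat an image as nothing more than a grid of numbers. The sketch below is illustrative only: the 5x5 pixel values and the helper names `brighten` and `count_changed_pixels` are assumptions made up for this example, not part of the original slides. It shows that a simple change in lighting alters every number the computer receives, even though a person would say the picture still shows the same thing.

```python
# A tiny 5x5 grayscale "image": each entry is a pixel intensity in [0, 255].
# The values are invented for illustration (a bright vertical bar on a dark background).
image = [
    [10, 10, 200, 10, 10],
    [10, 10, 200, 10, 10],
    [10, 10, 200, 10, 10],
    [10, 10, 200, 10, 10],
    [10, 10, 200, 10, 10],
]


def brighten(img, delta):
    """Simulate brighter lighting by adding delta to every pixel, clipped at 255."""
    return [[min(255, p + delta) for p in row] for row in img]


def count_changed_pixels(a, b):
    """Count the positions where two equally sized images hold different numbers."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


brighter = brighten(image, 40)
changed = count_changed_pixels(image, brighter)
total = len(image) * len(image[0])
print(f"{changed} of {total} pixel values changed; the content is still the same bar")
```

A pixel-by-pixel comparison therefore says the two images are completely different, which is one reason raw pixel values alone are a poor basis for recognition.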
Why is vision difficult?
• Ill-posed problem: real world much more complex than what we can measure in images
– 3D → 2D
• Impossible to literally “invert” image formation process with limited information
• Need information outside of this particular image to generalize what image portrays (e.g. to resolve occlusion)
Adapted from Kristen Grauman
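The 3D → 2D point can be illustrated with a minimal sketch of an idealized pinhole camera. This model is an assumption chosen for illustration; the slides do not specify a camera model, and the function name `project` and the sample points are hypothetical. Two different 3D points on the same ray through the camera center land on the same image location, so the image alone cannot tell them apart.

```python
def project(point, focal_length=1.0):
    """Project a 3D point (X, Y, Z) onto the image plane of an ideal pinhole camera.

    The camera sits at the origin looking down the +Z axis; depth Z must be positive.
    """
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (Z > 0)")
    return (focal_length * x / z, focal_length * y / z)


# Two made-up points on the same ray: the second is twice as far away and twice as large in X, Y.
near_point = (1.0, 2.0, 4.0)
far_point = (2.0, 4.0, 8.0)

print(project(near_point))  # image coordinates of the near point
print(project(far_point))   # identical image coordinates for the far point
```

Because depth is divided out, recovering the 3D point from its 2D projection needs information from outside the single image, such as a second view or prior knowledge about object size.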
Challenges: many nuisance parameters
Illumination, Object pose, Clutter, Viewpoint, Intra-class appearance, Occlusions
Think again about the pixels…
Kristen Grauman
Challenges: intra-class variation
slide credit: Fei-Fei, Fergus & Torralba
CMOA Pittsburgh
Challenges: importance of context
slide credit: Fei-Fei, Fergus & Torralba
Challenges: Complexity
• Thousands to millions of pixels in an image
• 3,000-30,000 human recognizable object categories
• 30+ degrees of freedom in the pose of articulated objects (humans)
• Billions of images indexed by Google Image Search
• 1.424 billion smart camera phones sold in 2015
• About half of the cerebral cortex in primates is devoted to processing visual information [Felleman and van Essen 1991]
Kristen Grauman
Challenges: Limited supervision
[Figure labels: More, Less]
Kristen Grauman
Challenges: Vision requires reasoning
Antol et al., “VQA: Visual Question Answering”, ICCV 2015
• Ok, clearly the vision problem is deep and challenging… time to give up?
• Active research area with exciting progress!
• How datasets changed: [figure]
Kristen Grauman
Datasets today
ImageNet: 22k categories, 14mil images
Microsoft COCO: 80 categories, 300k images
PASCAL: 20 categories, 12k images
SUN: 5k categories, 130k images
Some Visual Recognition Problems
Recognition: What is this?
Recognition: What objects do you see?
carriage
horse
person
person
truck
street
building
table
balcony
car
Detection: Where are the cars?
Activity: What is this person doing?
Scene: Is this an indoor scene?
Instance: Which city? Which building?
Visual question answering: What are all these people participating in?
The Latest at CVPR 2016
* CVPR = IEEE Conference on Computer Vision and Pattern Recognition
Our ability to detect objects has gone from 34 mAP in 2008 to 73 mAP at 7 FPS (frames per second) or 63 mAP at 45 FPS in 2016
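The mAP figures above are mean average precision scores, usually reported as percentages. As a rough illustration of what the number measures, here is a simplified, non-interpolated version of average precision over a ranked list of detections, averaged over classes. This is a sketch under stated assumptions: benchmark protocols such as PASCAL VOC use their own interpolated variants and an overlap test to decide whether a detection counts as correct, and the function names and toy data here are hypothetical.

```python
def average_precision(ranked_hits, num_positives):
    """Simplified average precision for one class.

    ranked_hits: booleans for detections sorted by decreasing confidence,
                 True if the detection matches a ground-truth object.
    num_positives: total number of ground-truth objects of this class.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_hit in enumerate(ranked_hits, start=1):
        if is_hit:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / num_positives if num_positives else 0.0


def mean_average_precision(per_class):
    """Average the per-class AP values; per_class is a list of (ranked_hits, num_positives)."""
    aps = [average_precision(hits, n) for hits, n in per_class]
    return sum(aps) / len(aps)


# Toy data: class A has one false alarm ranked between two correct detections,
# class B has two correct detections and nothing else.
toy = [([True, False, True], 2), ([True, True], 2)]
print(f"mAP = {100 * mean_average_precision(toy):.1f}")
```

A detector is rewarded both for finding the objects and for ranking correct detections above false alarms, which is why a single mAP number is a common summary of detection quality.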
You Only Look Once: Unified, Real-Time Object Detection
Redmon et al., CVPR 2016
Force from Motion:
Decoding Physical Sensation from a First Person Video
Park et al., CVPR 2016
MovieQA: Understanding Stories in Movies through Question-Answering
Tapaswi et al., CVPR 2016
Visually Indicated Sounds
Owens et al., CVPR 2016
Anticipating Visual Representations from Unlabeled Video
Vondrick et al., CVPR 2016
Image Style Transfer Using Convolutional Neural Networks
Gatys et al., CVPR 2016
DeepArt.io – try it for yourself! (Image Style Transfer Using Convolutional Neural Networks)
Images:
Styles:
DeepArt.io – try it for yourself! (Image Style Transfer Using Convolutional Neural Networks)
Results:
Seeing Behind the Camera:
Identifying the Authorship of a Photograph
Thomas and Kovashka, CVPR 2016
Is computer vision solved?
• Given an image, we can guess with 81% accuracy what object categories are shown (ResNet)
• … but we only answer “why” questions about images with 14% accuracy!
Why does it seem that it’s solved?
• Deep learning makes excellent use of massive data (labeled for the task of interest?)
– But it’s hard to understand how it does so
– It doesn’t work well when massive data is not available and your task is different than tasks for which data is available
• Sometimes the manner in which deep methods work is not intellectually appealing, but our “smarter” / more complex methods perform worse
Overview of Topics
Overview of topics
• Lower-level vision
– Analyzing textures, edges and gradients in images, without concern for the semantics (e.g. objects) of the image
• Higher-level vision
– Making predictions about the semantics or higher-level functions of content in images (e.g. objects, attributes, styles, motion, etc.)
– Involves machine learning; we’ll cover some basics of this then go back to low-level tasks
Features and filters
• Transforming and describing images; textures, colors, edges
Kristen Grauman
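As a small taste of what a filter does, the sketch below computes a horizontal image gradient with a central difference, which is equivalent to sliding the filter [-1, 0, 1] across each row. The tiny image and the function name `horizontal_gradient` are assumptions made for illustration; the course may use different filters and tools.

```python
# A made-up 3x6 grayscale image: dark on the left, bright on the right.
image = [
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
    [0, 0, 0, 255, 255, 255],
]


def horizontal_gradient(img):
    """Central difference I(x+1) - I(x-1) at every interior column of every row.

    Border columns are skipped, so each output row is two pixels narrower.
    """
    return [
        [row[x + 1] - row[x - 1] for x in range(1, len(row) - 1)]
        for row in img
    ]


gradient = horizontal_gradient(image)
for row in gradient:
    print(row)  # large values mark the dark-to-bright transition
```

The response is zero in the flat regions and large only around the boundary between dark and bright, which is the basic idea behind detecting edges with filters.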
Features and filters
• Detecting distinctive + repeatable features
• Describing images with local statistics
Indexing and search
• Matching features and regions across images
Kristen Grauman
Image formation
• How does light in the 3D world project to form 2D images?
Kristen Grauman
Multiple views
• Multi-view geometry, matching, invariant features, stereo vision
Hartley and Zisserman
Lowe
Fei-Fei Li
Kristen Grauman
Grouping and fitting
• Clustering, segmentation, fitting; what parts belong together?
[fig from Shi et al]
Kristen Grauman
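To give a flavor of "what parts belong together?", here is a minimal sketch of k-means clustering applied to pixel intensities. k-means is used here only as a familiar example of clustering; the slide does not name a specific algorithm, and the function name `kmeans_1d`, the starting centers, and the intensity values are all assumptions made for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Cluster numbers by repeatedly assigning each to its nearest center
    and moving each center to the mean of its assigned values."""
    centers = list(centers)

    def nearest(v):
        return min(range(len(centers)), key=lambda i: abs(v - centers[i]))

    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            groups[nearest(v)].append(v)
        # Keep the old center if no value was assigned to it.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]

    labels = [nearest(v) for v in values]
    return centers, labels


# Made-up pixel intensities: three dark pixels and three bright pixels.
intensities = [10, 12, 14, 200, 205, 210]
centers, labels = kmeans_1d(intensities, centers=[0.0, 255.0])
print(centers)  # one center per group
print(labels)   # group index for each pixel
```

Grouping pixels by intensity alone is far too crude for real segmentation, but the same assign-and-update loop carries over when each pixel is described by richer features such as color, texture, and position.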
Visual recognition
• Recognizing objects and categories, learning techniques
Kristen Grauman
Object detection
• Detecting novel instances of objects
• Classifying regions as one of several categories
Attribute-based description
• Describing the high-level properties of objects
• Allows recognition of unseen objects
Convolutional neural networks
• State-of-the-art on many recognition tasks
Image → Prediction
Yosinski et al., ICML DL workshop 2015
Krizhevsky et al.
Recurrent neural networks
• Sequence processing, e.g. question answering
Wu et al., CVPR 2016
Motion and tracking
• Tracking objects, video analysis
Tomas Izo
Kristen Grauman
Pose and actions
• Automatically annotating human pose (joints)
• Recognizing actions in first-person video
Your Homework
• Fill out Doodle
• Read entire course website
• Do first reading
Next Time
• Linear algebra review
• Matlab tutorial