Introduction
Selim Aksoy
Department of Computer Engineering
Bilkent University
What is computer vision?
• Analysis of digital images by a computer.
• Stockman and Shapiro: making useful decisions about real physical objects and scenes based on sensed images.
• Trucco and Verri: computing properties of the 3D world from one or more digital images.
• Ballard and Brown: construction of explicit, meaningful description of physical objects from images.
• Forsyth and Ponce: extracting descriptions of the world from pictures or sequences of pictures.
CS 484, Spring 2009 ©2009, Selim Aksoy
Why study computer vision?
• Possibility of building intelligent machines is fascinating.
• Capability of understanding the visual world is a prerequisite for such machines.
• Much of the human brain is dedicated to vision.
• Humans solve many visual problems effortlessly, yet we have little understanding of visual cognition.
Why study computer vision?
• An image is worth 1000 words.
• Images and videos are everywhere.
• Fast-growing collections and many useful applications.
• Goals of vision research:
  • Give machines the ability to understand scenes.
  • Aid understanding and modeling of human vision.
  • Automate visual operations.
Applications
• Medical image analysis
• Security
• Biometrics
• Surveillance
• Tracking
• Target recognition
• Industrial inspection, quality control
• Remote sensing
• Robotics
• Document analysis
• Multimedia
• Assisted living
• Human-computer interfaces
• …
Medical image analysis
http://www.clarontech.com
Medical image analysis
http://www.clarontech.com
Medical image analysis
http://www.clarontech.com
Medical image analysis
Cancer detection and grading
Medical image analysis
Slice of lung
Adapted from Linda Shapiro, U of Washington
Biometrics
Adapted from Anil Jain, Michigan State
Biometrics
Adapted from Anil Jain, Michigan State
Surveillance and tracking
University of Central Florida, Computer Vision Lab
Surveillance and tracking
Adapted from Octavia Camps, Penn State
Surveillance and tracking
Adapted from Martial Hebert, CMU
Surveillance and tracking
University of Central Florida, Computer Vision Lab
Generating traffic patterns
Surveillance and tracking
Adapted from Martial Hebert, CMU, and Masaharu Kobashi, U of Washington
Tracking in UAV videos
Vehicle and pedestrian protection
http://www.mobileye-vision.com
Lane departure warning, collision warning, traffic sign recognition, pedestrian recognition, blind spot warning
Forest fire monitoring system
Adapted from Enis Cetin, Bilkent University
Early warning of forest fires
Land cover classification
Object recognition
Object recognition
Recognition of buildings and building groups
Object recognition
Automatic mapping; agriculture
Content-based retrieval
Finding similar regions: airports
Robotics
Adapted from Linda Shapiro, U of Washington
Robotics
Adapted from Steven Seitz, U of Washington
Autonomous navigation
Michigan State University
General Dynamics Robotics Systems
http://www.gdrs.com
Industrial automation
Color Vision Systems
http://www.cvs.com.au
Automatic fruit sorting
Industrial automation
Industrial robotics; bin picking
http://www.braintech.com
Postal service automation
General Dynamics Robotics Systems
http://www.gdrs.com
Document analysis
Digit recognition, AT&T Labs
http://www.research.att.com/~yann
Adapted from Steven Seitz, U of Washington
Document analysis
Adapted from Shapiro and Stockman
Document analysis
Adapted from Linda Shapiro, U of Washington
Sports video analysis
Tennis review system
http://www.hawkeyeinnovations.co.uk
Scene classification
Organizing image archives
Adapted from Pinar Duygulu, Bilkent University
Photo tourism: exploring photo collections
Building 3D scene models from individual photos
Adapted from Steven Seitz, U of Washington
Content-based retrieval
Content-based retrieval
Content-based retrieval
Online shopping catalog search
http://www.like.com
Face detection and recognition
Object recognition
Adapted from Rob Fergus, MIT
3D scanning
Adapted from Linda Shapiro, U of Washington
3D reconstruction
Adapted from David Forsyth, UC Berkeley
3D reconstruction
Adapted from David Forsyth, UC Berkeley
Motion capture
Adapted from Linda Shapiro, U of Washington
Visual effects
Adapted from Linda Shapiro, U of Washington
Mosaic
Adapted from David Forsyth, UC Berkeley
Mosaic
Adapted from David Forsyth, UC Berkeley
Critical issues
• What information should be extracted?
• How can it be extracted?
• How should it be represented?
• How can it be used to aid analysis and understanding?
Challenge
• What do you see in the picture?
  • A hand holding a man
  • A hand holding a shiny sphere
  • An Escher drawing
Adapted from Octavia Camps, Penn State
Perception and grouping
Subjective contours
Perception and grouping
Subjective contours
Adapted from Michael Black, Brown University
Perception and grouping
Adapted from Gonzalez and Woods
Perception and grouping
Adapted from Gonzalez and Woods
Perception and grouping
Occlusion
Adapted from Michael Black, Brown University
Perception and grouping
• The shape of junctions constrains the possible interpretations of the scene.
• Ambiguous: paint and surface boundaries can be confused.
Adapted from Michael Black, Brown University
Challenges 1: viewpoint variation
Michelangelo 1475-1564
Adapted from L. Fei-Fei, R. Fergus, A. Torralba
Challenges 2: illumination
Adapted from L. Fei-Fei, R. Fergus, A. Torralba
Challenges 3: occlusion
Magritte, 1957
Adapted from L. Fei-Fei, R. Fergus, A. Torralba
Challenges 4: scale
Adapted from L. Fei-Fei, R. Fergus, A. Torralba
Challenges 5: deformation
Xu, Beihong 1943
Adapted from L. Fei-Fei, R. Fergus, A. Torralba
Challenges 6: background clutter
Klimt, 1913
Adapted from L. Fei-Fei, R. Fergus, A. Torralba
Challenges 7: intra-class variation
Adapted from L. Fei-Fei, R. Fergus, A. Torralba
Recognition
• How can different cues such as color, texture, shape, motion, etc., be used for recognition?
• Which parts of the image should be recognized together?
• How can objects be recognized without focusing on detail?
• How can objects with many free parameters be recognized?
• How do we structure very large model bases?
Color
Adapted from Martial Hebert, CMU
Texture
Adapted from David Forsyth, UC Berkeley
Segmentation
Original images, color regions, texture regions, line clusters
Adapted from Linda Shapiro, U of Washington
Segmentation
Adapted from Jianbo Shi, U Penn
Shape
Model database
Recognized objects
Adapted from Enis Cetin, Bilkent University
Motion
Adapted from Michael Black, Brown University
Recognition
Adapted from Michael Black, Brown University
Recognition
Adapted from Michael Black, Brown University
Recognition
Adapted from Michael Black, Brown University
Recognition
Adapted from Michael Black, Brown University
Recognition
Adapted from Michael Black, Brown University
Recognition
Recognition
Adapted from David Forsyth, UC Berkeley
Detection
Adapted from David Forsyth, UC Berkeley
Detection
Adapted from David Forsyth, UC Berkeley
Detection
Adapted from Michael Black, Brown University
Parts and relations
Adapted from Michael Black, Brown University
Parts and relations
Adapted from Michael Black, Brown University
Context
Adapted from Antonio Torralba, MIT
Context
Adapted from Antonio Torralba, MIT
Context
Adapted from Derek Hoiem, CMU
Context
Adapted from Derek Hoiem, CMU
Stages of computer vision
• Low-level: image → image
• Mid-level: image → features / attributes
• High-level: features → “making sense”, recognition
Image analysis / image understanding
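The three stages can be illustrated with a toy pipeline. This is a minimal sketch, assuming a tiny grayscale image stored as a list of lists. The function names (`low_level`, `mid_level`, `high_level`), the threshold, and the decision rule are all hypothetical choices for illustration and are not taken from the course material.

```python
def low_level(image, threshold=128):
    """Image -> image: binarize a grayscale image (a simple low-level operation)."""
    return [[1 if px >= threshold else 0 for px in row] for row in image]


def mid_level(binary):
    """Image -> features: measure the area and bounding box of the foreground."""
    coords = [(r, c) for r, row in enumerate(binary)
              for c, v in enumerate(row) if v]
    if not coords:
        return {"area": 0, "height": 0, "width": 0}
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    return {
        "area": len(coords),
        "height": max(rows) - min(rows) + 1,
        "width": max(cols) - min(cols) + 1,
    }


def high_level(features):
    """Features -> decision: a hypothetical rule that names the shape."""
    if features["area"] == 0:
        return "empty"
    if features["width"] > 2 * features["height"]:
        return "horizontal bar"
    if features["height"] > 2 * features["width"]:
        return "vertical bar"
    return "blob"


# A bright horizontal stripe on a dark background (made-up pixel values).
image = [
    [10, 10, 10, 10, 10, 10],
    [10, 200, 210, 220, 205, 215],
    [10, 10, 10, 10, 10, 10],
]
features = mid_level(low_level(image))
label = high_level(features)
print(features, label)
```

Each function consumes only the output of the previous stage, which mirrors the image → image, image → features, features → recognition progression listed above.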
Low-level
sharpening
blurring
Adapted from Linda Shapiro, U of Washington
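Blurring and sharpening are typical low-level (image → image) operations. The slide does not say which filters were used, so the following is only a sketch under assumptions: a 3x3 box filter for blurring and unsharp masking for sharpening, applied to a grayscale image stored as a list of lists. The function names and the `amount` parameter are illustrative.

```python
def box_blur(image):
    """Blur by replacing each pixel with the mean of its 3x3 neighborhood.

    Border pixels reuse the nearest valid pixel (replicate padding).
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    total += image[rr][cc]
            out[r][c] = total / 9.0
    return out


def sharpen(image, amount=1.0):
    """Unsharp masking: add back the detail that blurring removes."""
    blurred = box_blur(image)
    return [
        [min(255.0, max(0.0, px + amount * (px - b)))
         for px, b in zip(row, brow)]
        for row, brow in zip(image, blurred)
    ]


# A vertical step edge: dark on the left, bright on the right.
image = [[50, 50, 50, 150, 150, 150] for _ in range(4)]
blurred = box_blur(image)
sharpened = sharpen(image)
print([round(v, 1) for v in blurred[1]])
print([round(v, 1) for v in sharpened[1]])
```

On the step edge, blurring shrinks the jump between the two pixels next to the edge, while sharpening enlarges it; flat areas away from the edge are left unchanged by both.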
Low-level
original image → Canny → edge image
Mid-level
edge image → ORT → circular arcs and line segments (data structure)
Adapted from Linda Shapiro, U of Washington
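The low-level step on this slide produces an edge image with the Canny detector. The sketch below is not Canny: it is a simpler stand-in that thresholds the Sobel gradient magnitude, which is enough to show the image → edge image idea. A full Canny detector additionally applies Gaussian smoothing, non-maximum suppression, and hysteresis thresholding. The function name and the threshold value are assumptions for illustration.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edges(image, threshold=100.0):
    """Return a binary edge map: 1 where the gradient magnitude is large.

    Border pixels are left at 0 because the 3x3 window does not fit there.
    """
    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            gx = gy = 0.0
            for i in range(3):
                for j in range(3):
                    px = image[r + i - 1][c + j - 1]
                    gx += SOBEL_X[i][j] * px
                    gy += SOBEL_Y[i][j] * px
            if math.hypot(gx, gy) >= threshold:
                edges[r][c] = 1
    return edges


# A vertical step edge between columns 2 and 3.
image = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edge_image = sobel_edges(image)
for row in edge_image:
    print(row)
```

The response is two pixels wide around the step; thinning such responses to a single-pixel contour is what Canny's non-maximum suppression is for.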
Mid-level
original color image → K-means clustering (followed by connected component analysis) → regions of homogeneous color (data structure)
Adapted from Linda Shapiro, U of Washington
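The mid-level step above, clustering pixel colors and then splitting each cluster into spatially connected regions, can be sketched as follows. This is a minimal illustration under assumptions: pixels are RGB tuples in a list of lists, the initial centers are simply the first `k` distinct colors (random initialization is more common), and regions use 4-connectivity. The function names and the test image are made up for the example.

```python
from collections import deque


def kmeans_labels(image, k, iterations=10):
    """Cluster pixel colors with Lloyd's k-means; return a per-pixel cluster map.

    Initial centers are the first k distinct colors in scan order (an
    illustrative, deterministic choice).
    """
    pixels = [px for row in image for px in row]
    centers = list(dict.fromkeys(pixels))[:k]

    def nearest(px):
        return min(range(len(centers)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(px, centers[i])))

    for _ in range(iterations):
        groups = [[] for _ in centers]
        for px in pixels:
            groups[nearest(px)].append(px)
        # Move each center to the mean of its members (keep it if the group is empty).
        centers = [
            tuple(sum(ch) / len(g) for ch in zip(*g)) if g else ctr
            for g, ctr in zip(groups, centers)
        ]
    return [[nearest(px) for px in row] for row in image]


def connected_components(label_map):
    """Split each cluster into 4-connected regions; return (region map, count)."""
    h, w = len(label_map), len(label_map[0])
    regions = [[-1] * w for _ in range(h)]
    count = 0
    for r0 in range(h):
        for c0 in range(w):
            if regions[r0][c0] != -1:
                continue
            regions[r0][c0] = count
            queue = deque([(r0, c0)])
            while queue:
                r, c = queue.popleft()
                for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= rr < h and 0 <= cc < w
                            and regions[rr][cc] == -1
                            and label_map[rr][cc] == label_map[r0][c0]):
                        regions[rr][cc] = count
                        queue.append((rr, cc))
            count += 1
    return regions, count


RED, DARK_RED, BLUE = (250, 10, 10), (230, 20, 20), (10, 10, 240)
image = [
    [RED,      DARK_RED, BLUE, RED],
    [DARK_RED, RED,      BLUE, DARK_RED],
    [RED,      RED,      BLUE, RED],
]
clusters = kmeans_labels(image, k=2)
regions, n_regions = connected_components(clusters)
print(clusters)
print(regions, n_regions)
```

In this example the two reddish shades fall into one color cluster, but the blue column separates them spatially, so connected component analysis reports the left and right reddish areas as distinct regions. That is why clustering alone is not enough to obtain regions of homogeneous color.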
Low-level to high-level
low-level → edge image → mid-level → consistent line clusters → high-level
Adapted from Linda Shapiro, U of Washington