Post on 05-Aug-2021
Introduction
Selim Aksoy
Department of Computer Engineering
Bilkent University
saksoy@cs.bilkent.edu.tr
What is computer vision?
• Analysis of digital images by a computer.
• Stockman and Shapiro: making useful decisions about real physical objects and scenes based on sensed images.
• Trucco and Verri: computing properties of the 3D world from one or more digital images.
• Ballard and Brown: construction of explicit, meaningful descriptions of physical objects from images.
• Forsyth and Ponce: extracting descriptions of the world from pictures or sequences of pictures.
CS 484, Spring 2009 ©2009, Selim Aksoy
Why study computer vision?
• Possibility of building intelligent machines is fascinating.
• Capability of understanding the visual world is a prerequisite for such machines.
• Much of the human brain is dedicated to vision.
• Humans solve many visual problems effortlessly, yet we have little understanding of visual cognition.
Why study computer vision?
• An image is worth 1000 words.
• Images and videos are everywhere.
• Fast-growing collections and many useful applications.
• Goals of vision research:
  • Give machines the ability to understand scenes.
  • Aid understanding and modeling of human vision.
  • Automate visual operations.
Applications
• Medical image analysis
• Security
• Biometrics
• Surveillance
• Tracking
• Target recognition
• Industrial inspection, quality control
• Remote sensing
• Robotics
• Document analysis
• Multimedia
• Assisted living
• Human-computer interfaces
• …
Medical image analysis
http://www.clarontech.com
Cancer detection and grading
Slice of lung
Adapted from Linda Shapiro, U of Washington
Biometrics
Adapted from Anil Jain, Michigan State
Surveillance and tracking
University of Central Florida, Computer Vision Lab
Adapted from Octavia Camps, Penn State
Adapted from Martial Hebert, CMU
Generating traffic patterns
University of Central Florida, Computer Vision Lab
Tracking in UAV videos
Adapted from Martial Hebert, CMU, and Masaharu Kobashi, U of Washington
Vehicle and pedestrian protection
Lane departure warning, collision warning, traffic sign recognition, pedestrian recognition, blind spot warning
http://www.mobileye-vision.com
Forest fire monitoring system
Early warning of forest fires
Adapted from Enis Cetin, Bilkent University
Land cover classification
Object recognition
Recognition of buildings and building groups
Automatic mapping; agriculture
Content-based retrieval
Finding similar regions: airports
Robotics
Adapted from Linda Shapiro, U of Washington
Adapted from Steven Seitz, U of Washington
Autonomous navigation
Michigan State University
General Dynamics Robotics Systems
http://www.gdrs.com
Industrial automation
Automatic fruit sorting
Color Vision Systems
http://www.cvs.com.au
Industrial robotics; bin picking
http://www.braintech.com
Postal service automation
General Dynamics Robotics Systems
http://www.gdrs.com
Document analysis
Digit recognition, AT&T labs
http://www.research.att.com/~yann
Adapted from Steven Seitz, U of Washington
Adapted from Shapiro and Stockman
Adapted from Linda Shapiro, U of Washington
Sports video analysis
Tennis review system
http://www.hawkeyeinnovations.co.uk
Scene classification
Organizing image archives
Adapted from Pinar Duygulu, Bilkent University
Photo tourism: exploring photo collections
Building 3D scene models from individual photos
Adapted from Steven Seitz, U of Washington
Content-based retrieval
Online shopping catalog search
http://www.like.com
Face detection and recognition
Object recognition
Adapted from Rob Fergus, MIT
3D scanning
Adapted from Linda Shapiro, U of Washington
3D reconstruction
Adapted from David Forsyth, UC Berkeley
Motion capture
Adapted from Linda Shapiro, U of Washington
Visual effects
Adapted from Linda Shapiro, U of Washington
Mosaic
Adapted from David Forsyth, UC Berkeley
Critical issues
• What information should be extracted?
• How can it be extracted?
• How should it be represented?
• How can it be used to aid analysis and understanding?
Challenge
• What do you see in the picture?
  • A hand holding a man
  • A hand holding a shiny sphere
  • An Escher drawing
Adapted from Octavia Camps, Penn State
Perception and grouping
Subjective contours
Adapted from Michael Black, Brown University
Adapted from Gonzalez and Woods
Perception and grouping
Occlusion
Adapted from Michael Black, Brown University
• The shape of junctions constrains the possible interpretations of the scene.
• Ambiguous: paint and surface boundaries can be confused.
Adapted from Michael Black, Brown University
Challenges 1: viewpoint variation
Michelangelo 1475-1564
Challenges 2: illumination
Challenges 3: occlusion
Magritte, 1957
Challenges 4: scale
Challenges 5: deformation
Xu, Beihong 1943
Challenges 6: background clutter
Klimt, 1913
Challenges 7: intra-class variation
Adapted from L. Fei-Fei, R. Fergus, A. Torralba
Recognition
• How can different cues such as color, texture, shape, motion, etc., be used for recognition?
• Which parts of an image should be recognized together?
• How can objects be recognized without focusing on detail?
• How can objects with many free parameters be recognized?
• How do we structure very large model bases?
Color
Adapted from Martial Hebert, CMU
Texture
Adapted from David Forsyth, UC Berkeley
Segmentation
Panels: Original Images, Color Regions, Texture Regions, Line Clusters
Adapted from Linda Shapiro, U of Washington
Adapted from Jianbo Shi, U Penn
Shape
Model database; recognized objects
Adapted from Enis Cetin, Bilkent University
Motion
Adapted from Michael Black, Brown University
Recognition
Adapted from Michael Black, Brown University
Adapted from David Forsyth, UC Berkeley
Detection
Adapted from David Forsyth, UC Berkeley
Adapted from Michael Black, Brown University
Parts and relations
Adapted from Michael Black, Brown University
Context
Adapted from Antonio Torralba, MIT
Adapted from Derek Hoiem, CMU
Stages of computer vision
• Low-level: image → image
• Mid-level: image → features / attributes
• High-level: features → "making sense", recognition
Image analysis / image understanding
Low-level
Image → image: sharpening, blurring
Adapted from Linda Shapiro, U of Washington
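As a rough illustration of a low-level, image-to-image operation, the sketch below blurs a grayscale image with a 3x3 mean filter and sharpens it by unsharp masking. The slides do not say which filters produced the examples, so both filter choices, the function names `box_blur` and `sharpen`, and the `amount` parameter are assumptions made for this example only.

```python
def box_blur(image):
    """Blur a grayscale image (list of rows) with a 3x3 mean filter.

    Border pixels are handled by clamping coordinates to the image,
    which is one common choice among several.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += image[yy][xx]
            out[y][x] = total / 9.0
    return out


def sharpen(image, amount=1.0):
    """Unsharp masking: add back the detail that blurring removes."""
    blurred = box_blur(image)
    return [
        [p + amount * (p - b) for p, b in zip(row, blurred_row)]
        for row, blurred_row in zip(image, blurred)
    ]


# A vertical step edge: dark on the left, bright on the right.
step = [[0, 0, 10, 10] for _ in range(3)]
smooth = box_blur(step)   # the step becomes a gradual ramp
crisp = sharpen(step)     # the step overshoots on both sides
```

Both functions take an image and return an image of the same size, which is what places them at the low level of the pipeline.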
Low-level
Canny: original image → edge image
Mid-level
ORT: edge image → circular arcs and line segments (data structure)
Adapted from Linda Shapiro, U of Washington
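The Canny step turns an image into an edge image. A full Canny detector also smooths the image, thins edges by non-maximum suppression, and links them with hysteresis thresholding. The sketch below is a deliberately simpler stand-in that only thresholds the Sobel gradient magnitude, to show the image-to-edge-image idea. The names `edge_map` and `threshold` are hypothetical.

```python
import math

# Sobel kernels for horizontal and vertical intensity changes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def edge_map(image, threshold):
    """Return a binary edge image: 1 where gradient magnitude >= threshold.

    This is NOT Canny; it omits smoothing, non-maximum suppression and
    hysteresis. Border pixels are left as 0 for simplicity.
    """
    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(SOBEL_X[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(SOBEL_Y[j][i] * image[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            if math.hypot(gx, gy) >= threshold:
                edges[y][x] = 1
    return edges


# Dark left half, bright right half: the edge runs down the middle.
image = [[0, 0, 0, 100, 100, 100] for _ in range(5)]
edges = edge_map(image, threshold=200)
```

Because only a threshold is applied, the detected edge is two pixels wide here; thinning it to one pixel is the job of the non-maximum suppression step that this sketch leaves out.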
Mid-level
K-means clustering (followed by connected component analysis): original color image → regions of homogeneous color (data structure)
Adapted from Linda Shapiro, U of Washington
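A minimal sketch of this mid-level step, assuming RGB pixels stored as tuples and caller-supplied initial cluster centers: k-means assigns each pixel a color cluster, and connected component analysis then splits each cluster into spatially connected regions. The function names, the fixed iteration count, and the use of 4-connectivity are assumptions for illustration and are not taken from the slides.

```python
from collections import deque


def nearest(pixel, centers):
    """Index of the center closest to `pixel` in squared RGB distance."""
    return min(range(len(centers)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(pixel, centers[k])))


def kmeans_labels(pixels, initial_centers, iterations=10):
    """Plain k-means on color tuples; returns one cluster label per pixel."""
    centers = [tuple(c) for c in initial_centers]
    for _ in range(iterations):
        labels = [nearest(p, centers) for p in pixels]
        for k in range(len(centers)):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return [nearest(p, centers) for p in pixels]


def connected_regions(label_image):
    """Number 4-connected regions of equal label; returns (region map, count)."""
    h, w = len(label_image), len(label_image[0])
    region = [[-1] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if region[sy][sx] != -1:
                continue
            region[sy][sx] = count
            queue = deque([(sy, sx)])
            while queue:  # breadth-first flood fill
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and region[ny][nx] == -1
                            and label_image[ny][nx] == label_image[y][x]):
                        region[ny][nx] = count
                        queue.append((ny, nx))
            count += 1
    return region, count


def segment(image, initial_centers):
    """Color image -> regions of homogeneous color."""
    h, w = len(image), len(image[0])
    pixels = [p for row in image for p in row]
    labels = kmeans_labels(pixels, initial_centers)
    label_image = [labels[y * w:(y + 1) * w] for y in range(h)]
    return connected_regions(label_image)


RED, BLUE = (255, 0, 0), (0, 0, 255)
image = [[RED, BLUE, RED] for _ in range(3)]
regions, count = segment(image, initial_centers=[(200, 0, 0), (0, 0, 200)])
```

In this toy image k-means finds two color clusters, but the two red stripes are not adjacent, so connected component analysis reports three regions. That difference is why the clustering step is followed by connected components.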
Low-level to high-level
Stages shown: low-level (edge image), mid-level (consistent line clusters), high-level
Adapted from Linda Shapiro, U of Washington