CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of...

53
CS376 Computer Vision Qixing Huang January 23 th 2019 Slide Credit: Kristen Grauman

Transcript of CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of...

Page 1: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

CS376 Computer Vision

Qixing Huang

January 23th 2019

Slide Credit: Kristen Grauman

Page 2: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Introductions

• Instructor

– Prof. Qixing Huang

• TA:

– Zhenpei Yang

Page 3: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Today

• Course overview

• Requirements, logistics

Page 4: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

What is computer vision

Done?

Page 5: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Computer Vision

• Automatic understanding of images and video

– 1. Computing properties of the 3D world from visual data (measurement)

Page 6: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

1. Vision for measurement (Sensing)

Page 7: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Computer Vision

• Automatic understanding of images and video

– 1. Computing properties of the 3D world from visual data (measurement)

– 2. Data representations and algorithms to allow a machine to recognize objects, people, scenes, and activities. (perception and interpolation)

Page 8: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

2. Vision for perception, interpretation

Page 9: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Computer Vision

• Automatic understanding of images and video

– 1. Computing properties of the 3D world from visual data (measurement)

– 2. Data representations and algorithms to allow a machine to recognize objects, people, scenes, and activities. (perception and interpolation)

– 3. Data representations and algorithms to mine, search, and interact with visual data (search and organization)

Page 10: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

3. Visual search, organization

Different representations and algorithms involved

Page 11: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Computer Vision

• Automatic understanding of images and video

– 1. Computing properties of the 3D world from visual data (measurement)

– 2. Data representations and algorithms to allow a machine to recognize objects, people, scenes, and activities. (perception and interpolation)

– 3. Data representations and algorithms to mine, search, and interact with visual data (search and organization)

Page 12: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Related disciplines

Computervision

GeometryTopology

Functional analysis

OptimizationMachineLearning

Algorithms

Graphics

CognitiveScience

NLPRobotics

Bio/Medical Imaging

Page 13: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Vision and graphics

Inverse problems: analysis and synthesis

Page 14: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Dataset

Representation

ML Algorithm

Page 15: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Progress charted by datasets

Page 16: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Progress charted by datasets

Page 17: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Progress charted by datasets

Page 18: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Progress charted by datasets

Page 19: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Data Representations

Page 20: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Deformable-Part-Model (Felzenszwalb et al. 10)

Pictorial Structures (Fischler et al. 73)

1973 2010

Page 21: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Deformable-Part-Model (Felzenszwalb et al. 10)

Pictorial Structures (Fischler et al. 73)

1973 2010

Page 22: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

SIFT (Lowe 04)

1973 20102004

Page 23: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

HOG (Dalal and Triggs 05)GIST (Oliva and Torralba 01)

1973 201020042001 2005

Page 24: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

AlexNet (Krizhevsky et al. 12)

1973 201020042001 2005 2012

Page 25: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

VGG19 (Simonyan and Zisserman 14)

1973 201020042001 2005 20122014

Page 26: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

ResNet (He et al. 16)

1973 201020042001 2005 20122014 2016

Page 27: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

PointNet (R. Qi and Su et al. 17)

1973 201020042001 2005 20122014 2016 2017

Page 28: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Machine Learning Algorithms

Page 29: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Normalized Cut (Shi and Malik 97)

1997

Page 30: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Graph Cut (Boykov et al. 99)

1997 1999

Page 31: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

AdaBoosting for face detection (Viola and Jones 04)

TextonBoost for segmentation (Shotton et al. 06)

1997 1999 2004 2006

Page 32: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Support Vector Machine in Deformable Part Model(Felzenszwalb et al. 10)

1997 1999 2004 2006 2010

Page 33: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Back-propagation in neural network training/implementation(Rumelhart et al. 86, LeCun et al. 98, Abadi et al. 16)

1997 1999 2004 2006 2010 2012

Page 34: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Adam: A method for stochastic optimization(Kingma and Ba 14)

1997 1999 2004 2006 2010 2012 2014

Page 35: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Goals of this course

• Upper division undergrad course

• Introduction to primary topics– Fundamentals of computer vision- image processing,

grouping, multiple views– Recognition – emphasis on machine learning (the most

active research area in computer vision)

• Hands-on experience with algorithms • Basic knowledge of the research field of computer

vision

Page 36: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Topics overview

• Features & filters

• Grouping & fitting

• Multiple views

• Recognition

Page 37: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Features and filters

Transforming and describing images;Textures, colors, edges

Building blocks forneural networks

Page 38: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Grouping & fitting

Clustering, Segmentation, fitting; what parts belong together?

Shi et al.

Page 39: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Multiple Views

Invariant features, matchingEpipolar geometryStructure-from-motion, stereo

Page 40: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Recognition and learning

Data representation (vectorized) -> machine learning techniques

Page 41: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Textbooks

• Recommend book:

– Computer Vision: Algorithms and Applications

– By Rick Szeliski

– http://szeliski.org/Book

– Online

Page 42: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Assignments

• Majority - Programming problem

– Implementation

– Explanation, results

• Code in Matlab – available on CS Unix machines (see course page)

• Optional Latex templates

• Most of these assignments take significant time to do. We recommend starting early

Page 43: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Matlab

• Built-in toolboxes for low-level image processing, visualization

• Compact programs

• Intuitive interactive debugging

• Widely used in engineering

• I use it a lot!

Page 44: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Assignment 0

• A0: Matlab warmup + basic image manipulation

• Out today, due Tues Jan 29

• Verify CS account and Matlab access

• Look at the tutorial online

Page 45: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Preview of assignments

Seam carving

Page 46: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Preview of assignments

Grouping for segmentation

Page 47: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Preview of assignments

Structure-from-motion and stereo

Page 48: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Preview of assignments

Image classification

Page 49: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Preview of assignments

Object detection

Page 50: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Collaboration policy

• All responses and code must be written individually unless otherwise specified

• Students submitting answers or code found to be identical or substantially similar (due to inappropriate collaboration) risk failing the course.

Page 51: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Assignment deadlines

• Due about every two weeks

– Tentative deadlines posted online but could slightly shift depending on lecture pace

• Assignments in by 11:59 pm on due date

– Submit on Canvas, following submission instructions given in assignment

– Deadlines are firm. We will use timestamp

Page 52: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Miscellaneous

• Slides, announcements via class website

• No laptops, phones, tablets, etc. open in class please

• Please use the front rows

Page 53: CS376 Computer Visionhuangqx/CS376_Lecture_1.pdf · Computer Vision •Automatic understanding of images and video –1. Computing properties of the 3D world from visual data (measurement)

Coming up

• Now: check out Matlab tutorial online

• A0 due Tuesday Jan. 29

• Textbook reading posted for next week