20110220 computer vision_eruhimov_lecture01
-
Upload
computer-science-club -
Category
Documents
-
view
7.184 -
download
1
Transcript of 20110220 computer vision_eruhimov_lecture01
![Page 1: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/1.jpg)
Computer vision for robotics
Victor EruhimovCTO, itseez
http://www.itseez.com
![Page 2: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/2.jpg)
Why do we need computer vision?
• Smart video surveillance• Biometrics• Automatic Driver Assistance Systems• Machine vision (Visual inspection)• Image retrieval (e.g. Google Goggles)• Movie production• Robotics
![Page 3: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/3.jpg)
Vision is hard! Even for humans…
![Page 4: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/4.jpg)
Texai parking
![Page 5: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/5.jpg)
Agenda• Camera model• Stereo vision
– Stereo vision on GPU
• Object detection methods– Sliding window– Local descriptors
• Applications– Textured object detection– Outlet detection– Visual odometry
![Page 6: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/6.jpg)
Pinhole camera model
![Page 7: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/7.jpg)
Distortion model
![Page 8: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/8.jpg)
Reprojection error
€
ui
v i
⎛
⎝ ⎜
⎞
⎠ ⎟
⎧ ⎨ ⎩
⎫ ⎬ ⎭i=1..n
€
x i
y i
zi
⎛
⎝
⎜ ⎜ ⎜
⎞
⎠
⎟ ⎟ ⎟
⎧
⎨ ⎪
⎩ ⎪
⎫
⎬ ⎪
⎭ ⎪
i=1..n
€
uip
v ip
⎛
⎝ ⎜
⎞
⎠ ⎟= ˆ P f
x i
y i
zi
⎛
⎝
⎜ ⎜ ⎜
⎞
⎠
⎟ ⎟ ⎟,α
⎡
⎣
⎢ ⎢ ⎢
⎤
⎦
⎥ ⎥ ⎥
€
error P( ) =ui
v i
⎛
⎝ ⎜
⎞
⎠ ⎟−
uip
v ip
⎛
⎝ ⎜
⎞
⎠ ⎟
⎡
⎣ ⎢ ⎢
⎤
⎦ ⎥ ⎥
2
i
∑
![Page 9: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/9.jpg)
Homography
€
˜ u =h11u + h12v + h13
h31u + h32v + h33
˜ v =h21u + h22v + h23
h31u + h32v + h33
€
˜ u
˜ v
1
⎛
⎝
⎜ ⎜ ⎜
⎞
⎠
⎟ ⎟ ⎟= H
u
v
1
⎛
⎝
⎜ ⎜ ⎜
⎞
⎠
⎟ ⎟ ⎟
![Page 10: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/10.jpg)
Perspective-n-Points problem
• P4P• RANSAC (RANdom SAmple Consensus)€
uip
v ip
⎛
⎝ ⎜
⎞
⎠ ⎟= ˆ P R
x i
y i
zi
⎛
⎝
⎜ ⎜ ⎜
⎞
⎠
⎟ ⎟ ⎟+ T
⎡
⎣
⎢ ⎢ ⎢
⎤
⎦
⎥ ⎥ ⎥
![Page 11: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/11.jpg)
Stereo: epipolar geometry
0
1
1,,
R
R
LL y
x
Fyx
Fundamental matrix constraint
![Page 12: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/12.jpg)
Stereo Rectification• Algorithm steps are shown at right:• Goal:
– Each row of the image contains the same world points– “Epipolar constraint”
12
Result: Epipolar alignment of features:
All: Gary Bradski and Adrian Kaehler: Learning OpenCV
![Page 13: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/13.jpg)
Stereo correspondence
• Block matching• Dynamic programming• Inter-scanline dependencies
– Segmentation– Belief propagation
![Page 14: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/14.jpg)
Stereo correspondence block matching
For each block in left image:
Search for the corresponding block in the right image such that SSD or SAD between pixel intensities is minimum
![Page 15: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/15.jpg)
Pre- and post processing
• Low texture filtering• SSD/SAD minimum
ambiguity removal• Using gradients
instead of intensities• Speckle filtering
![Page 16: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/16.jpg)
Stereo Matching
![Page 17: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/17.jpg)
Parallel implementation of block matching
• The outer cycle iterates through disparity values
• We compute SSD and compare it with the current minimum for each pixel in a tile
• Different tiles reuse the results of each other
17
![Page 18: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/18.jpg)
Parallelization scheme
18
![Page 19: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/19.jpg)
Optimization concepts
• Not using texture – saving registers• 1 thread per 8 pixels processing – using cache• Reducing the amount of arithmetic
operations• Non-parallelizable functions (speckle
filtering) are done on CPU
19
![Page 20: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/20.jpg)
Performance summary
• CPU (i5 750 2.66GHz), GPU (Fermi card 448 cores)
• Block matching on CPU+2xGPU is 10 times faster than CPU implementation with SSE optimization, enabling real-time processing of HD images!
![Page 21: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/21.jpg)
Full-HD stereo in realtime
http://www.youtube.com/watch?v=ThE7sRAtaWU
![Page 22: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/22.jpg)
Applications of stereo vision
• Machine vision• Automatic Driver Assistance• Movie production• Robotics
– Object recognition– Visual odometry / SLAM
![Page 23: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/23.jpg)
Object detection
![Page 24: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/24.jpg)
Sliding window approach
![Page 25: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/25.jpg)
Cascade classifier
Stage 1 Stage 2 Stage 3
image face face
Not face Not face Not face
face
Real-time in year 2000!
![Page 26: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/26.jpg)
Face detection
![Page 27: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/27.jpg)
Object detection with local descriptors
• Detect keypoints• Calculate local descriptors for each point• Match descriptors for different images• Validate matches with a geometry model
![Page 28: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/28.jpg)
FAST feature detector
![Page 29: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/29.jpg)
Keypoints example
![Page 30: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/30.jpg)
SIFT descriptor
David Lowe, 2004
![Page 31: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/31.jpg)
SURF descriptor
• 4x4 square regions inside a square window 20*s
• 4 values per square region
![Page 32: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/32.jpg)
More descriptors
• One way descriptor• C-descriptor, FERNS, BRIEF• HoG• Daisy
![Page 33: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/33.jpg)
Matching descriptors example
![Page 34: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/34.jpg)
Ways to improve matching
• Increase the inliers to outliers ratio– Distance threshold– Distance ratio threshold (second to first NN distance)
– Backward-forward matching– Windowed matching
• Increase the amount of inliers– One to many matching
![Page 35: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/35.jpg)
Random Sample Consensus
• Do n iterations until #inliers > inlierThreshold– Draw k matches randomly– Find the transformation– Calculate inliers count– Remember the best solution
The number of iterations required ~
€
10 *# matches
# inliers
⎛
⎝ ⎜
⎞
⎠ ⎟k
![Page 36: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/36.jpg)
Geometry validation
![Page 37: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/37.jpg)
Scaling up
• FLANN (Fast Library for Approximate Nearest Neighbors)– In OpenCV thanks to Marius Muja
• Bag of Words– In OpenCV thanks to Ken Chatfield
• Vocabulary trees– Is going to be in OpenCV thanks to Patrick
Mihelich
![Page 38: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/38.jpg)
Projects
• Textured object detection• PR2 robot automatic plugin• Visual odometry / SLAM
![Page 39: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/39.jpg)
Textured object detection
![Page 40: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/40.jpg)
Object detection exampleIryna Gordon and David G. Lowe, "What and where: 3D object recognition with accurate pose," in Toward Category-Level Object Recognition, eds. J. Ponce, M. Hebert, C. Schmid, and A. Zisserman, (Springer-Verlag, 2006), pp. 67-82.
Manuel Martinez Torres, Alvaro Collet Romea, and Siddhartha Srinivasa, MOPED: A Scalable and Low Latency Object Recognition and Pose Estimation System, Proceedings of ICRA 2010, May, 2010.
![Page 41: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/41.jpg)
Keypoint detection
• We are looking for small dark regions
• This operation takes only ~10ms on 640x480 image
• The rest of the algorithm works only with keypoint regions
Itseez Ltd. http://itseez.com
![Page 42: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/42.jpg)
Classification with one way descriptor
• Introduced by Hinterstoisser et al (Technical U of Munich, Ecole Polytechnique) at CVPR 2009
• A test patch is compared to samples of affine-transformed training patches with Euclidean distance
• The closest patch together with a pose guess are reconstructed
Itseez Ltd. http://itseez.com
![Page 43: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/43.jpg)
Keypoint classification examples
• One way descriptor does the most of the outlet detection job for us. Few holes are misclassified Ground hole
Power hole
Non-hole keypoint from outlet image
Background keypoint
Itseez Ltd. http://itseez.com
![Page 44: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/44.jpg)
Object detection
• Object pose is reconstructed by geometry validation (using geomertic hashing)
Itseez Ltd. http://itseez.com
![Page 45: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/45.jpg)
Outlet detection: challenging cases
Shadows
Severe lighting conditions
Partial occlusions
Itseez Ltd. http://itseez.com
![Page 46: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/46.jpg)
PR2 plugin (outlet and plug detection)
http://www.youtube.com/watch?v=GWcepdggXsU
![Page 47: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/47.jpg)
Visual odometry
![Page 48: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/48.jpg)
Visual odometry (II)
![Page 49: 20110220 computer vision_eruhimov_lecture01](https://reader035.fdocuments.in/reader035/viewer/2022062513/55583608d8b42acb078b483e/html5/thumbnails/49.jpg)
More fun