Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely...
![Page 1: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/1.jpg)
Project 4 Results
• Representation – SIFT and HoG are popular and successful.
• Data – Hugely varying results from hard mining.
• Learning – Non-linear classifier usually better.
Zachary, Hung-I, Paul, Emanuel
![Page 2: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/2.jpg)
Project 5
• Project 5 handout
![Page 3: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/3.jpg)
Final Project
• Proposals due Friday.
– One paragraph with one paper link.
• Can opt out of project 5.
• Higher expectations; must be able to present during final exam slot.
![Page 4: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/4.jpg)
11/14/2011
CS143, Brown
James Hays
Stereo: Epipolar geometry
Slides by Kristen Grauman
![Page 5: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/5.jpg)
Multiple views
Hartley and Zisserman
Lowe
Multi-view geometry, matching, invariant features, stereo vision
![Page 6: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/6.jpg)
Why multiple views?
• Structure and depth are inherently ambiguous from single views.
Images from Lana Lazebnik
![Page 7: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/7.jpg)
Why multiple views?
• Structure and depth are inherently ambiguous from single views.
Optical center
P1, P2
P1’ = P2’
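The ambiguity in this figure can be checked numerically. The sketch below assumes an ideal pinhole camera with the optical center at the origin and the image plane at distance `f`; the function name `project` and the sample coordinates are illustrative, not from the slides. Any two scene points on the same ray through the optical center land on the same image point.

```python
def project(point, f=1.0):
    """Perspective projection of a 3D point (X, Y, Z) onto the image plane at distance f."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)

# P2 lies on the same ray through the optical center as P1, but farther away.
P1 = (1.0, 2.0, 4.0)
P2 = tuple(2.5 * c for c in P1)

p1 = project(P1)
p2 = project(P2)
print(p1, p2)  # both points project to the same image location
```

A single image therefore cannot tell P1 from P2, which is why a second view is needed.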
![Page 8: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/8.jpg)
• What cues help us to perceive 3d shape and depth?
![Page 9: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/9.jpg)
Shading
[Figure from Prados & Faugeras 2006]
![Page 10: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/10.jpg)
Focus/defocus
[figs from H. Jin and P. Favaro, 2002]
Images from same point of view, different camera parameters
3d shape / depth estimates
![Page 11: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/11.jpg)
Texture
[From A.M. Loh. The recovery of 3-D structure using visual texture patterns. PhD thesis]
![Page 12: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/12.jpg)
Perspective effects
Image credit: S. Seitz
![Page 13: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/13.jpg)
Motion
Figures from L. Zhang http://www.brainconnection.com/teasers/?main=illusion/motion-shape
![Page 14: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/14.jpg)
Estimating scene shape
• “Shape from X”: Shading, Texture, Focus, Motion…
• Stereo:
– shape from “motion” between two views
– infer 3d shape of scene from two (multiple) images from different viewpoints
Main idea:
[Figure labels: scene point, optical center, image plane]
![Page 15: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/15.jpg)
Outline
• Human stereopsis
• Stereograms
• Epipolar geometry and the epipolar constraint
– Case example with parallel optical axes
– General case with calibrated cameras
![Page 16: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/16.jpg)
Human eye
Fig from Shapiro and Stockman
Pupil/Iris – control amount of light passing through lens
Retina - contains sensor cells, where image is formed
Fovea – highest concentration of cones
Rough analogy with human visual system:
![Page 17: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/17.jpg)
Human stereopsis: disparity
Human eyes fixate on point in space – rotate so that corresponding images form in centers of fovea.
![Page 18: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/18.jpg)
Disparity occurs when eyes fixate on one object; others appear at different visual angles
Human stereopsis: disparity
![Page 19: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/19.jpg)
Disparity: d = r-l = D-F.
Human stereopsis: disparity
Forsyth & Ponce
![Page 20: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/20.jpg)
Random dot stereograms
• Julesz 1960: Do we identify local brightness patterns before fusion (monocular process) or after (binocular)?
• To test: pair of synthetic images obtained by randomly spraying black dots on white objects
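One common way to build such a pair is sketched below, as an assumption about the construction rather than a record of Julesz's exact procedure. The left image is random dots. The right image is a copy in which a central square is shifted horizontally, and the strip it uncovers is refilled with fresh random dots. The function name, image size, and shift are all illustrative.

```python
import random

def random_dot_stereogram(width, height, square, shift, seed=0):
    """Return (left, right) binary images; `square` = (x0, y0, size) is shifted left by `shift` in the right image."""
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    right = [row[:] for row in left]
    x0, y0, size = square
    for y in range(y0, y0 + size):
        # Copy the square from the left image into a shifted position.
        for x in range(x0, x0 + size):
            right[y][x - shift] = left[y][x]
        # Refill the strip uncovered by the shift with new random dots.
        for x in range(x0 + size - shift, x0 + size):
            right[y][x] = rng.randint(0, 1)
    return left, right

left, right = random_dot_stereogram(32, 32, square=(12, 12, 8), shift=2)
```

Neither image contains a visible square on its own; the square exists only as a disparity between the two.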
![Page 21: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/21.jpg)
Random dot stereograms
Forsyth & Ponce
![Page 22: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/22.jpg)
Random dot stereograms
![Page 23: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/23.jpg)
Random dot stereograms
• When viewed monocularly, they appear random; when viewed stereoscopically, see 3d structure.
• Conclusion: human binocular fusion not directly associated with the physical retinas; must involve the central nervous system
• Imaginary “cyclopean retina” that combines the left and right image stimuli as a single unit
![Page 24: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/24.jpg)
Stereo photography and stereo viewers
Invented by Sir Charles Wheatstone, 1838
Image from fisher-price.com
Take two pictures of the same subject from two slightly different viewpoints and display so that each eye sees only one of the images.
![Page 25: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/25.jpg)
http://www.johnsonshawmuseum.org
![Page 26: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/26.jpg)
http://www.johnsonshawmuseum.org
![Page 27: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/27.jpg)
Public Library, Stereoscopic Looking Room, Chicago, by Phillips, 1923
![Page 28: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/28.jpg)
http://www.well.com/~jimg/stereo/stereo_list.html
![Page 29: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/29.jpg)
Autostereograms
Images from magiceye.com
Exploit disparity as depth cue using single image.
(Single image random dot stereogram, Single image stereogram)
![Page 30: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/30.jpg)
Images from magiceye.com
Autostereograms
![Page 31: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/31.jpg)
Estimating depth with stereo
• Stereo: shape from “motion” between two views
• We’ll need to consider:
– Info on camera pose (“calibration”)
– Image point correspondences
[Figure labels: scene point, optical center, image plane]
![Page 32: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/32.jpg)
Two cameras, simultaneous views
Single moving camera and static scene
Stereo vision
![Page 33: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/33.jpg)
Camera parameters
Camera frame 1
Intrinsic parameters: image coordinates relative to camera ↔ pixel coordinates
Extrinsic parameters: camera frame 1 ↔ camera frame 2
Camera frame 2
• Extrinsic params: rotation matrix and translation vector
• Intrinsic params: focal length, pixel sizes (mm), image center point, radial distortion parameters
We’ll assume for now that these parameters are given and fixed.
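A minimal sketch of how the two parameter sets are used follows, under stated assumptions: the extrinsics map camera frame 1 to camera frame 2 as X2 = R·X1 + t (other texts use a different convention), pixels are square so the focal length `f` is given in pixels, and radial distortion is ignored. The names `to_camera2` and `to_pixel` and all numeric values are hypothetical.

```python
def to_camera2(point, R, t):
    """Extrinsics: map a 3D point from camera frame 1 to camera frame 2 (assumed X2 = R X1 + t)."""
    return tuple(sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))

def to_pixel(point, f, cx, cy):
    """Intrinsics: project a camera-frame point to pixel coordinates (no distortion, square pixels)."""
    X, Y, Z = point
    return (f * X / Z + cx, f * Y / Z + cy)

R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]  # no rotation between the cameras
t = (-0.1, 0.0, 0.0)                                     # pure sideways translation

X1 = (0.2, 0.0, 2.0)            # scene point in camera frame 1
X2 = to_camera2(X1, R, t)       # same point in camera frame 2

pixel1 = to_pixel(X1, f=500.0, cx=320.0, cy=240.0)
pixel2 = to_pixel(X2, f=500.0, cx=320.0, cy=240.0)
print(pixel1, pixel2)
```

With identity rotation and a purely horizontal translation, the two pixels differ only in their x coordinate, which is the parallel-axis setup treated next.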
![Page 34: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/34.jpg)
Outline
• Human stereopsis
• Stereograms
• Epipolar geometry and the epipolar constraint
– Case example with parallel optical axes
– General case with calibrated cameras
![Page 35: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/35.jpg)
Geometry for a simple stereo system
• First, assuming parallel optical axes, known camera parameters (i.e., calibrated cameras):
![Page 36: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/36.jpg)
baseline
optical center (left)
optical center (right)
Focal length
World point
image point (left)
image point (right)
Depth of p
![Page 37: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/37.jpg)
• Assume parallel optical axes, known camera parameters (i.e., calibrated cameras). What is expression for Z?
Geometry for a simple stereo system
Similar triangles (pl, P, pr) and (Ol, P, Or):

$$\frac{T + x_l - x_r}{Z - f} = \frac{T}{Z}$$

$$Z = f\,\frac{T}{x_r - x_l}$$

where $x_r - x_l$ is the disparity.
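As a worked example of the relation Z = f·T / (x_r − x_l), the sketch below computes depth for two points. It follows this slide's sign convention for disparity (x_r − x_l); other texts define it as x_l − x_r. The function name and the numbers are illustrative, with the focal length assumed to be in pixels and the baseline in meters, so Z comes out in meters.

```python
def depth_from_disparity(focal_length, baseline, x_left, x_right):
    """Depth Z = f * T / (x_r - x_l) for a calibrated stereo pair with parallel optical axes."""
    disparity = x_right - x_left
    if disparity == 0:
        raise ValueError("zero disparity: the point is infinitely far away")
    return focal_length * baseline / disparity

# Same cameras, two different scene points.
z_near = depth_from_disparity(focal_length=500.0, baseline=0.1, x_left=10.0, x_right=35.0)
z_far = depth_from_disparity(focal_length=500.0, baseline=0.1, x_left=10.0, x_right=15.0)
print(z_near, z_far)  # larger disparity gives smaller depth
```

Depth is inversely proportional to disparity, so depth resolution degrades quickly for distant points, where a one-pixel error in disparity changes Z by a large amount.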
![Page 38: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/38.jpg)
Depth from disparity
image I(x,y)
image I´(x´,y´)
Disparity map D(x,y)
(x´,y´) = (x+D(x,y), y)
So if we could find the corresponding points in two images, we could estimate relative depth…
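The mapping (x´,y´) = (x+D(x,y), y) and the idea of relative depth can be sketched as follows. The disparity map here is a made-up nested list indexed as `D[y][x]`, and inverse disparity is used as a stand-in for relative depth, since true depth would also need the focal length and baseline.

```python
def corresponding_point(x, y, disparity_map):
    """Map pixel (x, y) in image I to (x', y') = (x + D(x, y), y) in image I'."""
    return (x + disparity_map[y][x], y)

def relative_depth_map(disparity_map):
    """Inverse disparity per pixel; None where disparity is zero (point at infinity)."""
    return [[(1.0 / d if d != 0 else None) for d in row] for row in disparity_map]

# Hypothetical 2x3 disparity map, indexed D[y][x].
D = [[4, 4, 2],
     [4, 2, 2]]

match = corresponding_point(1, 0, D)
depth = relative_depth_map(D)
print(match, depth)
```

Pixels with disparity 4 get a relative depth of 0.25 and pixels with disparity 2 get 0.5, so the high-disparity region is the closer one.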
![Page 39: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/39.jpg)
Summary
• Depth from stereo: main idea is to triangulate from corresponding image points.
• Epipolar geometry defined by two cameras
– We’ve assumed known extrinsic parameters relating their poses
• Epipolar constraint limits where points from one view will be imaged in the other
– Makes search for correspondences quicker
• Terms: epipole, epipolar plane / lines, disparity, rectification, intrinsic/extrinsic parameters, essential matrix, baseline
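The epipolar constraint can be illustrated with the essential matrix. The sketch below assumes the convention X2 = R·X1 + t between camera frames, for which E = [t]×R and matching normalized image points satisfy x2ᵀ E x1 = 0. The helper names and all numeric values are hypothetical, and the transcript itself does not state this formula.

```python
import math

def cross_matrix(t):
    """Skew-symmetric matrix [t]x such that [t]x v = t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]

def matmul(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def epipolar_residual(x1, x2, E):
    """Value of x2^T E x1; zero when x1 and x2 satisfy the epipolar constraint."""
    Ex1 = [sum(E[i][j] * x1[j] for j in range(3)) for i in range(3)]
    return sum(x2[i] * Ex1[i] for i in range(3))

# Assumed relative pose: small rotation about the y axis plus a translation.
a = 0.1
R = [[math.cos(a), 0.0, math.sin(a)], [0.0, 1.0, 0.0], [-math.sin(a), 0.0, math.cos(a)]]
t = (-0.1, 0.0, 0.02)
E = matmul(cross_matrix(t), R)

X1 = (0.3, -0.2, 2.0)                                                      # scene point, frame 1
X2 = tuple(sum(R[i][j] * X1[j] for j in range(3)) + t[i] for i in range(3))  # same point, frame 2
x1 = tuple(c / X1[2] for c in X1)                                          # normalized image points
x2 = tuple(c / X2[2] for c in X2)

residual = epipolar_residual(x1, x2, E)
wrong_x2 = (x2[0], x2[1] + 0.1, 1.0)          # a point off the epipolar line
wrong_residual = epipolar_residual(x1, wrong_x2, E)
print(residual, wrong_residual)
```

The true match gives a residual that is zero up to rounding, while the perturbed point does not, which is how the constraint reduces the correspondence search from the whole image to a single line.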
![Page 40: Project 4 Results Representation – SIFT and HoG are popular and successful. Data – Hugely varying results from hard mining. Learning – Non-linear classifier.](https://reader036.fdocuments.in/reader036/viewer/2022062516/56649d9d5503460f94a86ef5/html5/thumbnails/40.jpg)
Coming up
– Computing correspondences
– Non-geometric stereo constraints