3D Vision – Real Objects, Real Cameras · 2017. 2. 21. · Without a 3-D model of the world,...

3D Vision – Real Objects, Real Cameras Chapter 11 (parts of) , 12 (parts of) Computerized Image Analysis 2 Anders Brun, anders@cb.uu.se

3D Vision

• Philosophy
• Image formation
  ◦ The pinhole camera
  ◦ Projective geometry
  ◦ Artefacts and challenges
• Camera calibration
• Stereo vision
• Structured light

Philosophy: Why 3-D?

• Why do we model things in 3-D?
• Without a 3-D model of the world, events are more difficult to predict: movement, grasping, collision estimation, real size estimation, …
• Example, 2-D: A car on the highway looks bigger and appears to drive faster as it approaches.
• Example, 3-D: A car on the highway has constant size and speed as it approaches.

Philosophy: 3-D cues …

Photo: Greg Keene

• Shape from:
  ◦ Focus
  ◦ Lighting
  ◦ Stereo
  ◦ Structured light
  ◦ …

Philosophy: 3-D cues …

Philosophy: Marr and 2.5-D

• Primal sketch: Edges and areas
• 2.5-D sketch: Texture and depth
• 3-D model: A hierarchical 3-D model of the world

Teddy dataset, from http://cat.middlebury.edu/

Philosophies

• Build an accurate 3-D world representation
  1. Build a complete 3-D model of the scene
  2. Plan the task using the 3-D model
  3. Example: Build a model of the scene, then find the teddy bear and send a robot arm to grab it.
• Plan as you go, act and react
  1. Collect features from the scene
  2. Use the features to guide your actions
  3. Example: Find the teddy bear using template matching in the image, then send the robot hand in that direction. Possibly take more images when halfway.

Passive, Active and Dynamic Vision

• Passive vision:
  ◦ The camera has a fixed location
• Dynamic vision:
  ◦ The camera is moving but cannot be steered
• Active vision:
  ◦ The camera can be steered

The pinhole camera

• The pinhole camera is an idealized model.
• A real aperture is not a point.
• A real aperture has a non-vanishing area and typically also a lens…

The pinhole camera model

• Where is the point P projected on the image plane inside the camera?

[Figure labels: P = (X, Y, Z); focal point or origin (the “pinhole”); image plane]

$x = -f \frac{X}{Z}$

The pinhole camera model (alternative)

• Imagine an observer located at the focal point.
• A screen is placed at distance f from the observer.
• Where on this screen is P projected?

[Figure labels: f = focal length; P = (X, Y, Z); focal point (the “observer”); screen]

$x = +f \frac{X}{Z}$

$y = +f \frac{Y}{Z}$

The pinhole camera model

• In the pinhole camera, the world appears to be upside down (or rotated 180°).
• The alternative interpretation is useful in computer graphics. It tells you exactly where to draw P on a screen in front of the observer in order to make it appear real to the observer. (Note the change of sign.)
• The alternative interpretation leads directly to “projective geometry”.
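The two sign conventions can be compared in a few lines of Python. This is a minimal sketch, assuming NumPy and points given in camera coordinates with Z ≠ 0; the function name `project_pinhole` and the sample values are illustrative and not part of the lecture material.

```python
import numpy as np

def project_pinhole(points, f, in_front=True):
    """Project 3-D points (N x 3, camera coordinates, Z != 0) to 2-D.

    in_front=True  : screen at distance f in front of the observer, x = +f X / Z
    in_front=False : image plane behind the pinhole,               x = -f X / Z
    """
    points = np.asarray(points, dtype=float)
    sign = 1.0 if in_front else -1.0
    # Divide X and Y by Z (kept as a column so broadcasting works per point).
    return sign * f * points[:, :2] / points[:, 2:3]

# Two points that differ only in depth (hypothetical values).
P = np.array([[1.0, 2.0, 4.0],
              [1.0, 2.0, 8.0]])
screen = project_pinhole(P, f=2.0)                  # observer/screen model
image = project_pinhole(P, f=2.0, in_front=False)   # pinhole model, sign flipped
print(screen)
print(image)
```

Doubling Z halves the projected coordinates, which is the 2-D effect of a car looking bigger as it approaches.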

Projective Geometry (Very Briefly)

• Points in 2-D are represented by lines in 3-D
• The 3-D space is called the embedding space
• All points along a line are equivalent
• This is analogous to a photograph: every point (position) in a photograph (2-D) corresponds to a line or ray in reality (3-D)

[Figure label: equivalence class]

• We can convert points in the ordinary plane to the projective plane
• 2-D (x, y) → 3-D (x, y, 1)
• In general: D-dimensional → (D+1)-dimensional
• Points x and α x are equivalent, α ≠ 0

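(linear) transformation H x’

$\alpha \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$

$x' = \frac{h_{11} x + h_{12} y + h_{13}}{h_{31} x + h_{32} y + h_{33}}$

$y' = \frac{h_{21} x + h_{22} y + h_{23}}{h_{31} x + h_{32} y + h_{33}}$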

• Homography, a map from (D+1)-dim to (D+1)-dim
• Linear in the (D+1)-dim embedding space
• x’ = H x
• Represents a perspective transformation in D-dim space
• This is very nice!
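As a small illustration of x’ = H x in the embedding space, the sketch below lifts 2-D points to (x, y, 1), applies a 3×3 matrix, and divides by the last coordinate. It assumes NumPy; the function name `apply_homography` and the matrix entries are made up for the example.

```python
import numpy as np

def apply_homography(H, pts):
    """Map N x 2 points through a 3 x 3 homography H."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # (x, y) -> (x, y, 1)
    mapped = homog @ H.T                               # linear map in the embedding space
    return mapped[:, :2] / mapped[:, 2:3]              # back to the ordinary plane

# Hypothetical homography with a perspective term (h31 != 0).
H = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 3.0],
              [0.5, 0.0, 1.0]])
pts = np.array([[0.0, 0.0], [2.0, 1.0]])

out = apply_homography(H, pts)
out_scaled = apply_homography(5.0 * H, pts)  # H and 5 H are equivalent
print(out)
```

Scaling H by any non-zero factor leaves the result unchanged, which is the equivalence of x and α x expressed for the matrix.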

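• Using homographies, we can express a rich class of transformations using linear mappings:
  ◦ Identity: $H = I$
  ◦ Isometric: $H = \begin{pmatrix} R & -Rt \\ 0^T & 1 \end{pmatrix}$
  ◦ Similarity: $H = \begin{pmatrix} sR & -Rt \\ 0^T & 1 \end{pmatrix}$
  ◦ Affine: $H = \begin{pmatrix} A & t \\ 0^T & 1 \end{pmatrix}$
  ◦ Perspective: any $H$ with $\det(H) \neq 0$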

Perspective Transformations

• Remember this example? We wanted to compute the perspective transformation parameters.

From Feature based methods for structure and motion estimation by P. H. S. Torr and A. Zisserman

$x' = \frac{h_{11} x + h_{12} y + h_{13}}{h_{31} x + h_{32} y + h_{33}}$

$y' = \frac{h_{21} x + h_{22} y + h_{23}}{h_{31} x + h_{32} y + h_{33}}$

• Estimating H from point correspondences (simplified version; check the book for a more advanced version)
• Each point correspondence translates to 2 linear equations (in the coefficients of H)
• Assuming h33 = 1, we need 4 corresponding 2-D point pairs (x, y, x’, y’) to solve this equation system (8 unknowns).
• This way of solving for the parameters has severe practical disadvantages, but it shows that it is at least possible.

$h_{31} x x' + h_{32} y x' + h_{33} x' - h_{11} x - h_{12} y - h_{13} = 0$

$h_{31} x y' + h_{32} y y' + h_{33} y' - h_{21} x - h_{22} y - h_{23} = 0$

$\begin{pmatrix} P_x(x, y, x', y') \\ P_y(x, y, x', y') \end{pmatrix} h = 0$
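The simplified approach can be sketched as an 8×8 linear solve. This is only an illustration under the slide's assumption h33 = 1, with NumPy; `homography_from_4_points`, `H_true` and the point values are hypothetical names and data used to generate a test case.

```python
import numpy as np

def homography_from_4_points(src, dst):
    """Solve the 8 x 8 linear system for H with h33 fixed to 1.

    src, dst: sequences of four (x, y) and (x', y') pairs.
    """
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        # h11 x + h12 y + h13 - h31 x x' - h32 y x' = x'
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        b.append(xp)
        # h21 x + h22 y + h23 - h31 x y' - h32 y y' = y'
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.append(yp)
    h = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

# Hypothetical ground-truth homography, used only to generate test data.
H_true = np.array([[1.0, 0.2, 3.0],
                   [0.1, 1.5, -2.0],
                   [0.01, 0.02, 1.0]])
src = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
mapped = np.hstack([src, np.ones((4, 1))]) @ H_true.T
dst = mapped[:, :2] / mapped[:, 2:3]

H_est = homography_from_4_points(src, dst)
print(np.round(H_est, 6))
```

One practical disadvantage is visible in the construction: fixing h33 = 1 fails whenever the true h33 is zero or very small.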

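• A cleaner and more stable solution
• Multiply both sides with the “cross product matrix”

$\begin{pmatrix} 0 & -1 & y' \\ 1 & 0 & -x' \\ -y' & x' & 0 \end{pmatrix} \alpha \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} 0 & -1 & y' \\ 1 & 0 & -x' \\ -y' & x' & 0 \end{pmatrix} \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$

The left-hand side vanishes, because the cross product of a vector with itself is zero:

$0 = \begin{pmatrix} 0 & -1 & y' \\ 1 & 0 & -x' \\ -y' & x' & 0 \end{pmatrix} \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}$

$0 = Q(x, y, x', y') \, h$

“Now three equations killing two unknowns”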
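A possible way to solve 0 = Q h is to stack the Q blocks of all correspondences and take the right singular vector with the smallest singular value. The use of the SVD here is an assumption of this sketch, not something stated on the slide, and the names `cross_matrix` and `estimate_homography` are hypothetical. With h stacked row by row, the 3×9 block Q equals the Kronecker product of the cross product matrix and the row vector (x, y, 1).

```python
import numpy as np

def cross_matrix(xp, yp):
    """Cross product matrix of the homogeneous point (x', y', 1)."""
    return np.array([[0.0, -1.0, yp],
                     [1.0, 0.0, -xp],
                     [-yp, xp, 0.0]])

def estimate_homography(src, dst):
    """Estimate H (up to scale) from four or more point pairs."""
    rows = []
    for (x, y), (xp, yp) in zip(src, dst):
        xh = np.array([[x, y, 1.0]])
        rows.append(np.kron(cross_matrix(xp, yp), xh))  # 3 x 9 block Q
    Q = np.vstack(rows)
    _, _, Vt = np.linalg.svd(Q)
    return Vt[-1].reshape(3, 3)  # null vector of Q, reshaped row by row

# Hypothetical ground truth, used only to generate correspondences.
H_true = np.array([[1.0, 0.2, 3.0],
                   [0.1, 1.5, -2.0],
                   [0.01, 0.02, 1.0]])
src = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [0.5, 0.2]])
mapped = np.hstack([src, np.ones((len(src), 1))]) @ H_true.T
dst = mapped[:, :2] / mapped[:, 2:3]

H_est = estimate_homography(src, dst)
H_norm = H_est / H_est[2, 2]  # fix the free scale only for comparison
print(np.round(H_norm, 6))
```

No entry of H has to be fixed in advance, so the estimate does not break down when h33 is small; the scale is only normalized afterwards for display.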

Single perspective camera

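$\alpha u = \begin{pmatrix} f & s & -w_0 \\ 0 & g & -v_0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} R & -Rt \\ 0^T & 1 \end{pmatrix} X$

$\alpha u = M X$

• M: Projection matrix
• Internal parameters: the first matrix (f, g, s, w0, v0)
• External parameters: the last matrix (R, t)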
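The factorization of M can be written out directly. The sketch below assumes NumPy, keeps the slide's sign convention (−w0 and −v0 in the third column), and reads t as the camera position in world coordinates, since the last matrix maps X to R(X − t). The names `projection_matrix` and `project` and all numbers are illustrative.

```python
import numpy as np

def projection_matrix(f, g, s, w0, v0, R, t):
    """Compose M = K [I | 0] [[R, -R t], [0^T, 1]]."""
    K = np.array([[f, s, -w0],
                  [0.0, g, -v0],
                  [0.0, 0.0, 1.0]])                       # internal parameters
    P0 = np.hstack([np.eye(3), np.zeros((3, 1))])        # drops the 4th coordinate
    E = np.eye(4)                                        # external parameters
    E[:3, :3] = R
    E[:3, 3] = -R @ t
    return K @ P0 @ E

def project(M, X):
    """Project a 3-D point X and divide out the scale alpha."""
    u = M @ np.append(X, 1.0)
    return u[:2] / u[2]

# Hypothetical camera: no rotation, placed 5 units behind the world origin.
R = np.eye(3)
t = np.array([0.0, 0.0, -5.0])
M = projection_matrix(f=100.0, g=100.0, s=0.0, w0=-320.0, v0=-240.0, R=R, t=t)

u = project(M, np.array([1.0, 2.0, 5.0]))
print(u)
```

In camera coordinates the point is (1, 2, 10), so the result is 100·(1/10) + 320 and 100·(2/10) + 240.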

• Estimation of M from known 3-D coordinates (X, Y, Z, 1) and their projections in a camera (x’, y’, 1)
• This is analogous to the homographic projection
• Algorithms exist to solve this with 6 correspondences

$\alpha \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} m_{11} & m_{12} & m_{13} & m_{14} \\ m_{21} & m_{22} & m_{23} & m_{24} \\ m_{31} & m_{32} & m_{33} & m_{34} \end{pmatrix} \begin{pmatrix} X \\ Y \\ Z \\ 1 \end{pmatrix}$

• This enables calibration from 6 known points
• M can be factored: you can estimate the camera focal length, image coordinate system, camera position and rotation.
• Triangulation: If you know several M_i, then you can also estimate a position X (3-D) using several camera projections u_i (2-D).
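Both estimation problems can be sketched with the same cross product trick used for homographies. This is an illustration assuming NumPy and noise-free data, with an SVD null-vector solve chosen for the sketch; the function names, the two camera matrices and the random test points (eight, although six suffice in principle) are hypothetical.

```python
import numpy as np

def cross_matrix(x, y):
    """Cross product matrix of the homogeneous image point (x, y, 1)."""
    return np.array([[0.0, -1.0, y],
                     [1.0, 0.0, -x],
                     [-y, x, 0.0]])

def project(M, X):
    u = M @ np.append(X, 1.0)
    return u[:2] / u[2]

def estimate_projection_matrix(world_pts, image_pts):
    """Estimate M (up to scale) from six or more 3-D/2-D correspondences."""
    rows = [np.kron(cross_matrix(x, y), np.append(Xw, 1.0)[None, :])
            for Xw, (x, y) in zip(world_pts, image_pts)]
    _, _, Vt = np.linalg.svd(np.vstack(rows))
    return Vt[-1].reshape(3, 4)

def triangulate(Ms, us):
    """Estimate a 3-D point from its projections u_i in cameras M_i."""
    rows = [cross_matrix(x, y) @ M for M, (x, y) in zip(Ms, us)]
    _, _, Vt = np.linalg.svd(np.vstack(rows))
    X = Vt[-1]
    return X[:3] / X[3]

# Two hypothetical cameras, the second shifted one unit along the x axis.
M1 = np.array([[100.0, 0.0, 0.0, 0.0], [0.0, 100.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]])
M2 = np.array([[100.0, 0.0, 0.0, -100.0], [0.0, 100.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]])

# Calibration: recover M1 from known points and their projections.
rng = np.random.default_rng(0)
world = rng.uniform(-1.0, 1.0, size=(8, 3)) + np.array([0.0, 0.0, 5.0])
image = np.array([project(M1, Xw) for Xw in world])
M1_est = estimate_projection_matrix(world, image)

# Triangulation: recover a point from its two projections.
X_true = np.array([0.3, -0.2, 6.0])
X_est = triangulate([M1, M2], [project(M1, X_true), project(M2, X_true)])
print(np.round(X_est, 6))
```

Since M is only determined up to scale, the estimate is best checked by reprojecting points rather than by comparing matrix entries.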

Marker based motion capture

Images: courtesy of Lennart Svensson

External calibration

• Rotation + position, 6 DoF, “calibration”

Motion capture applications

• Animation
• Biomechanical analysis
• Industrial analysis

Image formation – Lenses

• Thin lens: $z z' = f^2$

[Figure labels: object plane, object focal point, image focal point, image plane]

• Magnification: m = x/X
• From similarity: x/z’ = X/f
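A short numerical example of the thin lens relations, assuming that z and z’ are measured from the object focal point and the image focal point respectively (Newton's form of the lens equation, which is how the figure labels are read here). The function names and values are illustrative.

```python
def image_offset(f, z):
    """Solve z z' = f^2 for z', the image distance beyond the image focal point."""
    return f * f / z

def magnification(f, z):
    """m = x / X; from x / z' = X / f it follows that m = z' / f = f / z."""
    return image_offset(f, z) / f

# Hypothetical 50 mm lens with an object 2 m beyond the object focal point.
f = 0.05   # metres
z = 2.0    # metres
z_prime = image_offset(f, z)
m = magnification(f, z)
print(z_prime, m)
```

The image plane ends up 1.25 mm behind the image focal point, and the object is imaged at 1/40 of its real size.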

• Depth of field
• Objects within the depth of field are scattered within an area smaller than a pixel, i.e. they are depicted sharply.

[Figure labels: object plane, object focal point, image focal point, image plane; ε = size of a pixel]

• Aperture size and focal length both affect the depth of field. A larger aperture will yield a smaller depth of field.
• Depth of focus
• “Depth of focus” is analogous: how much the image plane can be shifted without scattering light from a point in focus over more than a pixel.

AACAM – @ Matlab File Exchange

• Matlab code for a non-perfect pinhole camera
  ◦ Set aperture radius and focal length
  ◦ Set depth of field
  ◦ Set object distance and aperture radius

• (Systems of) lenses → distortions:
  ◦ Spherical aberration: shorter focal length close to the edges of the lens (image from Wikipedia)
  ◦ Coma
  ◦ Chromatic aberration
  ◦ Astigmatism
  ◦ Geometric distortion: barrel distortion, pincushion distortion

Is this really a problem?

• In old and cheap cameras, yes
• Uppsala 1999-01-01

From http://www.uu.se/carpediem/1999/

Is this really a problem?

• But also for e.g. modern GoPro cameras!

Camera Calibration Toolbox

• A Matlab toolbox for camera calibration:
• http://www.vision.caltech.edu/bouguetj/calib_doc/
• Freely available

Camera Toolbox Calibration

• Focal length: The focal length in pixels is stored in the 2x1 vector fc.
• Principal point: The principal point coordinates are stored in the 2x1 vector cc.
• Skew coefficient: The skew coefficient defining the angle between the x and y pixel axes is stored in the scalar alpha_c.
• Distortions: The image distortion coefficients (radial and tangential distortions) are stored in the 5x1 vector kc.
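To show how these parameters could enter a projection, here is a Python sketch of a radial plus tangential distortion model of the kind described in the toolbox documentation. The exact ordering of the coefficients in kc (two radial terms, two tangential terms, then a third radial term) is stated from memory and should be checked against the toolbox documentation; the function name `project_with_distortion` and all numbers are hypothetical, and this is not the toolbox's own code.

```python
import numpy as np

def project_with_distortion(Xc, fc, cc, alpha_c, kc):
    """Project a point Xc = (X, Y, Z) in camera coordinates to pixel coordinates."""
    x, y = Xc[0] / Xc[2], Xc[1] / Xc[2]          # normalized pinhole projection
    r2 = x * x + y * y
    radial = 1.0 + kc[0] * r2 + kc[1] * r2**2 + kc[4] * r2**3
    dx = 2 * kc[2] * x * y + kc[3] * (r2 + 2 * x * x)   # tangential terms
    dy = kc[2] * (r2 + 2 * y * y) + 2 * kc[3] * x * y
    xd, yd = radial * x + dx, radial * y + dy
    return np.array([fc[0] * (xd + alpha_c * yd) + cc[0],
                     fc[1] * yd + cc[1]])

# Hypothetical calibration values.
fc = np.array([800.0, 800.0])
cc = np.array([320.0, 240.0])
Xc = np.array([1.0, 0.5, 2.0])

ideal = project_with_distortion(Xc, fc, cc, 0.0, np.zeros(5))
barrel = project_with_distortion(Xc, fc, cc, 0.0, np.array([-0.2, 0.0, 0.0, 0.0, 0.0]))
print(ideal, barrel)
```

With all coefficients zero the model reduces to the ideal pinhole projection; a negative first radial coefficient pulls the point towards the principal point, as in barrel distortion.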

Stereo – Basic equations

P=(X,Y,Z)

$x_1 = -f \frac{X}{Z}$

$x_2 = -f \frac{X - B}{Z}$

$\Rightarrow Z = \frac{fB}{x_2 - x_1} = \frac{fB}{d}$

Stereo – the general case

• It may happen that the relation between the two cameras is not a parallax translation
• Then the “epipolar constraint” applies
• By “rectification”, epipolar lines are aligned with scanlines

From: Epipolar Rectification by Fusiello et al.

Stereo – Disparity Estimation

• Search horizontally for patch disparity, using e.g. the sum of squared differences (SSD)

Teddy dataset, from http://cat.middlebury.edu/
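A brute-force version of the horizontal patch search might look as follows. It is a sketch assuming NumPy, a rectified grey-scale pair, and the convention that a pixel at column c in the left image appears at column c − d in the right image; the function name `ssd_disparity` and the synthetic test pair are made up for the example.

```python
import numpy as np

def ssd_disparity(left, right, max_disp, half=2):
    """Per-pixel disparity by minimizing the SSD over (2*half+1)^2 patches."""
    h, w = left.shape
    disp = np.zeros((h, w), dtype=int)
    for r in range(half, h - half):
        for c in range(half + max_disp, w - half):
            patch = left[r - half:r + half + 1, c - half:c + half + 1]
            costs = []
            for d in range(max_disp + 1):
                cand = right[r - half:r + half + 1, c - d - half:c - d + half + 1]
                costs.append(np.sum((patch - cand) ** 2))
            disp[r, c] = int(np.argmin(costs))  # disparity with the lowest SSD
    return disp

# Synthetic pair: the left image is the right image shifted 3 pixels.
rng = np.random.default_rng(0)
right = rng.random((12, 30))
left = np.roll(right, 3, axis=1)

disp = ssd_disparity(left, right, max_disp=5)
print(disp[6, 10])
```

Border pixels where the patch or the search range would leave the image are left at zero. Real images also contain untextured and occluded regions where the minimum is ambiguous, which is why additional constraints are useful.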

Stereo – Depth estimation

• A simple formula converting disparity d to distance Z when the inter-camera distance is B:

$Z = \frac{fB}{d}$

[Figure panels: patch-based estimate; ground truth]
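The conversion is a one-liner, but zero disparity needs care. The sketch below assumes NumPy, a focal length f in pixels and a baseline B in metres; mapping non-positive disparities to infinity is a choice made for this example, and the function name and numbers are hypothetical.

```python
import numpy as np

def depth_from_disparity(d, f, B):
    """Z = f B / d, with non-positive disparities mapped to infinity."""
    d = np.asarray(d, dtype=float)
    Z = np.full(d.shape, np.inf)
    np.divide(f * B, d, out=Z, where=d > 0)
    return Z

f, B = 500.0, 0.1   # hypothetical focal length (pixels) and baseline (metres)

# Consistency check with the basic stereo equations for a point at X = 0.3, Z = 2.
X, Z_true = 0.3, 2.0
x1 = -f * X / Z_true
x2 = -f * (X - B) / Z_true
d = x2 - x1

Z = depth_from_disparity([d, 50.0, 0.0], f, B)
print(Z)
```

Because Z is inversely proportional to d, a fixed disparity error of one pixel costs far more depth accuracy for distant points than for near ones.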

Stereo – Constraints

• Constraints (Marr and Poggio):
  ◦ Each point in each image is assigned at most one disparity value
  ◦ The disparity varies smoothly at most locations in the images
• However…
• Different regularization may be applied to the depth function x

Stereo from Segmentation

• Alternative approach:
  ◦ Make a segmentation of the image first
  ◦ Apply a linear model in each segmented region
  ◦ Refine the models in the regions …

From Segment-based Stereo Matching Using Graph Cuts by Hong and Chen

Large Scale 3D Maps (C3/SAAB)

Courtesy of Petter Torle C3 Technologies

Large Scale 3D Maps (C3/SAAB)

Structured Light

• A light source helps the stereo algorithm to find matching points.
• Often used in industrial applications

From: http://mesh.brown.edu/3DPGP-2009/homework/hw2/hw2.html

More Structured Light

• Microsoft Kinect, using infrared light

• http://www.youtube.com/watch?v=nvvQJxgykcU

Other Computer Vision Code

• OpenCV
  ◦ Free to use
  ◦ Supports IPP speedups
  ◦ http://en.wikipedia.org/wiki/OpenCV
  ◦ http://sourceforge.net/projects/opencvlibrary/
  ◦ http://opencv.willowgarage.com/wiki/
• Intel® Integrated Performance Primitives 6.0
  ◦ http://www.intel.com/cd/software/products/asmo-na/eng/302910.htm
  ◦ Commercial (but cheap)
  ◦ Includes Computer Vision, Signal Processing, Data Compression, …

Typical Exam Questions …

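• Project this object (points) using a pinhole camera
• Can geometric transformation compensate for lens distortions in general?
• Explain the parameters building up the projection matrix M

$M = \begin{pmatrix} f & s & -w_0 \\ 0 & g & -v_0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} R & -Rt \\ 0^T & 1 \end{pmatrix}$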

Thank You!

• Email questions to: anders.brun@it.uu.se
