SURF Features
Jacky Baltes
Dept. of Computer Science
University of Manitoba
Email: [email protected]
WWW: http://www.cs.umanitoba.ca/~jacky
Salient Spatial Features
● Trying to find interest points
● Points that can be found independent of perspective transformations
– Distinctive (Unique in local region)
● Surrounding of pixel is rich in structure
– Repeatable (Different views)
– Stable under geometric, photometric transformations
– Stable to noise
– Well defined position in the image
Applications
● Tracking
● Object recognition
● Human action recognition
● Panorama stitching
● Robot localization
● Texture recognition
Covariant to Scale Change
● Changes in scale do not alter the structure of the image
Salient Features
● Scale Invariant Feature Transform (SIFT)
– David Lowe, 1999
● Harris Affine Corner Detector
● Hessian Affine Corner Detector
● Edge Based Regions
● Intensity Based Regions
● Maximally Stable Extremal Regions (MSER)
● Entropy Based Salient Regions
Scale Invariant Feature Transform (SIFT)
● Scale-space Theorem:
– A local 3D maximum of |NLoG| in (x, y, σ)
– Can be identified at different scales (scale invariant keypoint)
– Laplacian kernel
● $\mathrm{NLoG}(x, y, \sigma) = \sigma^2 \nabla^2 G$
The Laplacian of Gaussian (LoG)
● Convolution of an image by the following kernel
– $g(x, y, t) = \frac{1}{2\pi t} e^{-(x^2 + y^2)/(2t)}$
● $L(x, y; t) = g(x, y, t) * I(x, y)$
● t is the scale parameter (it sets the width of the kernel); L is called the scale-space representation
● Scale-normalized Laplacian: $\nabla^2_{norm} L(x, y; t) = t\,(L_{xx} + L_{yy})$
● Based on the Laplacian operator, which is the sum of the second partial derivatives in Euclidean space
● A popular blob detector is based on LoG
Everything Clear?
● Maybe if you are a mathematician
● Another derivation
● Sobel, Prewitt edge detectors are gradient based
● Maximum of 1st derivative
● Or zero-crossing of 2nd derivative
● How to calculate 2nd derivative?
Laplacian
● Need to calculate 2nd derivative
– Change in 1st derivative
● Pixel neighborhood around e
a b c
d e f
g h i
● 4 connectedness
– (f – e) – (e – d) + (h – e) – (e – b)
– = f + d + h + b – 4e
– Kernel:
0 1 0
1 -4 1
0 1 0
● 8 connectedness
– (f – e) – (e – d) + (h – e) – (e – b) + (i – e) – (e – a) + (c – e) – (e – g)
– = f + d + h + b + i + a + c + g – 8e
– Kernel:
1 1 1
1 -8 1
1 1 1
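The two kernels can be tried out on a small image. The following is a minimal sketch in plain Python; the helper name `apply_kernel` and the step-edge test image are made up for illustration, and border pixels are simply left at 0.

```python
# 4- and 8-connected Laplacian kernels from the derivation above.
LAPLACIAN_4 = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]
LAPLACIAN_8 = [[1, 1, 1], [1, -8, 1], [1, 1, 1]]


def apply_kernel(image, kernel):
    """Apply a 3x3 kernel to the interior pixels of a 2D list.

    Border pixels are left at 0 (an arbitrary choice for this sketch).
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(
                kernel[j][i] * image[y + j - 1][x + i - 1]
                for j in range(3)
                for i in range(3)
            )
    return out


# Hypothetical test image: a vertical step edge from 0 to 10.
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
response = apply_kernel(image, LAPLACIAN_4)
print(response[2])  # [0, 0, 10, -10, 0, 0]
```

The response changes sign between the two columns on either side of the step, which is the zero crossing that marks the edge.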
Laplacian
● Detect the zero crossing of the Laplacian to detect edges
● Because of the use of the 2nd derivative, very sensitive to noise
● Remove noise by blurring the image first with a Gaussian kernel of size σ
$\mathrm{LoG}(x, y) = -\frac{1}{\pi\sigma^4}\left[1 - \frac{x^2 + y^2}{2\sigma^2}\right] e^{-\frac{x^2 + y^2}{2\sigma^2}}$
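To get a feel for the shape of the LoG kernel, it can be sampled on a small grid. This is a sketch that assumes the standard form LoG(x, y) = -1/(πσ⁴) · [1 - (x² + y²)/(2σ²)] · exp(-(x² + y²)/(2σ²)); the function name `log_kernel` and the choice σ = 1, radius = 4 are illustrative only.

```python
import math


def log_kernel(sigma, radius):
    """Sample the Laplacian of Gaussian on a (2*radius+1) x (2*radius+1) grid."""
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            r2 = (x * x + y * y) / (2.0 * sigma ** 2)
            row.append(-1.0 / (math.pi * sigma ** 4) * (1.0 - r2) * math.exp(-r2))
        kernel.append(row)
    return kernel


kernel = log_kernel(sigma=1.0, radius=4)
center = kernel[4][4]                    # strongest (negative) value at the origin
total = sum(sum(row) for row in kernel)  # close to zero: flat regions give no response
print(center, total)
```

The kernel is negative in the centre and positive in the surrounding ring, with the sign change at x² + y² = 2σ².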
Gradient based procedure
● Sobel kernels (x direction, then y direction)
-1 0 1
-2 0 2
-1 0 1

-1 -2 -1
0 0 0
1 2 1
● Laplacian kernel
1 1 1
1 -8 1
1 1 1
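The gradient based procedure can be sketched with the two Sobel kernels as follows. The helper names and the step-edge test image are hypothetical, and border pixels are left at 0.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def correlate_at(image, kernel, y, x):
    """Weighted sum of the 3x3 neighborhood centred on (y, x)."""
    return sum(
        kernel[j][i] * image[y + j - 1][x + i - 1]
        for j in range(3)
        for i in range(3)
    )


def gradient_magnitude(image):
    """Sobel gradient magnitude for interior pixels (borders left at 0)."""
    h, w = len(image), len(image[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = correlate_at(image, SOBEL_X, y, x)
            gy = correlate_at(image, SOBEL_Y, y, x)
            mag[y][x] = math.hypot(gx, gy)
    return mag


# Hypothetical test image: a vertical step edge from 0 to 10.
image = [[0, 0, 0, 10, 10, 10] for _ in range(5)]
magnitude = gradient_magnitude(image)
print(magnitude[2])  # [0.0, 0.0, 40.0, 40.0, 0.0, 0.0]
```

The edge shows up as a maximum of the first derivative, in contrast to the zero crossing produced by the Laplacian.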
Zero-crossing based procedure
LoG
Laplacian of Gaussian
Gaussian
● Plot in Scilab
Edge-based Segmentation: examples
● Prewitt: needs edge linking
● Canny: needs “cleaning”
Difference of Gaussian
● Calculate the difference of Gaussian
– Radius 1 = 1.0, Radius 2 = 2.0
Difference of Gaussian
● Approximation of Laplacian |NLoG|
● $\mathrm{DoG}(x, y, \sigma) = G(x, y, k\sigma) - G(x, y, \sigma)$
● Invariant to scale and rotation
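Why the difference of Gaussians approximates the scale-normalized Laplacian can be seen from the diffusion equation that the Gaussian satisfies. The following is a sketch of the standard argument (as given in Lowe's SIFT paper), assuming k is a constant factor close to 1:

```latex
\frac{\partial G}{\partial \sigma} = \sigma \nabla^2 G
\quad\Rightarrow\quad
\sigma \nabla^2 G \approx \frac{G(x, y, k\sigma) - G(x, y, \sigma)}{k\sigma - \sigma}
\quad\Rightarrow\quad
G(x, y, k\sigma) - G(x, y, \sigma) \approx (k - 1)\,\sigma^2 \nabla^2 G
```

The factor (k - 1) is the same at every scale, so it does not change where the extrema are located.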
SIFT Features
Scale Space Representation
● L is the scale-space representation
● Obtained by convolving with a Gaussian kernel of size t
● $g(x, y, t) = \frac{1}{2\pi t} e^{-(x^2 + y^2)/(2t)}$
● $L(x, y; t) = g(x, y, t) * I(x, y)$
● And partial derivatives of L
● $L_x = \frac{\partial L}{\partial x}, \quad L_y = \frac{\partial L}{\partial y}$
Structure Tensor
● Also called 2nd moment matrix
● Derived from the gradient
● Measures the predominant gradient in a neighborhood and its coherence
$S_w(p) = \begin{bmatrix} I_x^2(p) & I_x(p)\,I_y(p) \\ I_x(p)\,I_y(p) & I_y^2(p) \end{bmatrix}$
Harris Affine Corner Detector
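● Gradient distribution matrix (M)
● $M = \sigma_D^2\, g(\sigma_I) * \begin{bmatrix} I_x^2(p, \sigma_D) & I_x(p, \sigma_D)\,I_y(p, \sigma_D) \\ I_x(p, \sigma_D)\,I_y(p, \sigma_D) & I_y^2(p, \sigma_D) \end{bmatrix}$
● Calculate eigenvalues of M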
Eigenvectors and Eigenvalues
● Are vectors x and scalars λ for a matrix M such that
● $M x = \lambda x$
● A nonzero solution x only exists if (M-λI) has no inverse.
● Characteristic/secular equation
– det(M-λI)=0
Eigenvectors and Eigenvalues
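● Given a 2*2 matrix M
● $M = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$
● $\det(M - \lambda I) = 0$
● $\lambda_{1,2} = \frac{(a_{11} + a_{22}) \pm \sqrt{(a_{11} + a_{22})^2 - 4(a_{11}a_{22} - a_{12}a_{21})}}{2}$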
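The closed form for a 2*2 matrix follows from det(M-λI)=0, which expands to λ² - trace(M)·λ + det(M) = 0. A small sketch (the function name `eigenvalues_2x2` is made up for illustration; complex eigenvalues are rejected, which never happens for the symmetric matrices used in corner detection):

```python
import math


def eigenvalues_2x2(a11, a12, a21, a22):
    """Real eigenvalues of [[a11, a12], [a21, a22]], larger one first."""
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    disc = trace * trace - 4.0 * det
    if disc < 0:
        raise ValueError("matrix has complex eigenvalues")
    root = math.sqrt(disc)
    return (trace + root) / 2.0, (trace - root) / 2.0


lam1, lam2 = eigenvalues_2x2(2.0, 1.0, 1.0, 2.0)
print(lam1, lam2)  # 3.0 1.0
```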
Harris Affine Corner Detector
● Eigenvalues show
● the direction of change
● Curvature
● $C = \det(M) - k\,\mathrm{trace}^2(M)$
● $C = \lambda_1 \lambda_2 - k\,(\lambda_1 + \lambda_2)^2$
● Corners are stable under different lighting conditions
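The corner measure C = det(M) - k·trace²(M) can be evaluated without computing the eigenvalues explicitly. A minimal sketch, assuming M is the symmetric matrix [[ixx, ixy], [ixy, iyy]]; the default k = 0.04 is a commonly used value and not one given in these slides, and the three sample matrices are invented.

```python
def harris_response(ixx, ixy, iyy, k=0.04):
    """Corner measure C = det(M) - k * trace(M)^2 for M = [[ixx, ixy], [ixy, iyy]]."""
    det = ixx * iyy - ixy * ixy
    trace = ixx + iyy
    return det - k * trace * trace


corner = harris_response(10.0, 0.0, 10.0)  # two large eigenvalues: C is positive
edge = harris_response(10.0, 0.0, 0.0)     # one large eigenvalue: C is negative
flat = harris_response(0.0, 0.0, 0.0)      # no gradient: C is zero
print(corner, edge, flat)
```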
Hessian Affine Corner Detector
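● The Hessian of a function with two arguments is defined as
● $H(p) = \begin{bmatrix} \frac{\partial^2 I}{\partial x^2}(p) & \frac{\partial^2 I}{\partial x \partial y}(p) \\ \frac{\partial^2 I}{\partial y \partial x}(p) & \frac{\partial^2 I}{\partial y^2}(p) \end{bmatrix}$
● If the second partial derivatives of the function are continuous, then
● $\frac{\partial^2 I}{\partial x \partial y}(p) = \frac{\partial^2 I}{\partial y \partial x}(p)$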
Hessian Approximation
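● Hessian approximation using Gaussians
● $H(p, \sigma) = \begin{bmatrix} L_{xx}(p, \sigma) & L_{xy}(p, \sigma) \\ L_{xy}(p, \sigma) & L_{yy}(p, \sigma) \end{bmatrix}$
● Convolution of image by the second derivative of a Gaussian
● $L_{xx}(p, \sigma) = \frac{\partial^2}{\partial x^2} g(\sigma) * I(p)$
● Blob detector using maxima of the determinant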
SURF Algorithm
● Uses the integral image to compute averages over areas efficiently (4 lookups and 3 arithmetic operations)
Integral Image
● Each entry is the sum of all the pixels to the left of and above the pixel
● Sum of any rectangular region can be extracted in constant time using four lookups and three additions/subtractions
+ Bottom Right
- Top Right
- Bottom Left
+ Top Left
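The four-lookup rule can be sketched as follows. The function names `integral_image` and `box_sum` are illustrative; the table is padded with an extra zero row and column (a convenience assumed here, not something the slides specify) so that no special cases are needed at the image border.

```python
def integral_image(image):
    """ii[y][x] holds the sum of image[0..y-1][0..x-1] (zero-padded top and left)."""
    h, w = len(image), len(image[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += image[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of image[top..bottom][left..right] (inclusive) using four lookups."""
    return (
        ii[bottom + 1][right + 1]  # + bottom right
        - ii[top][right + 1]       # - top right
        - ii[bottom + 1][left]     # - bottom left
        + ii[top][left]            # + top left
    )


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ii = integral_image(image)
print(box_sum(ii, 0, 0, 2, 2))  # 45 (whole image)
print(box_sum(ii, 1, 1, 2, 2))  # 28 (5 + 6 + 8 + 9)
```

The cost of `box_sum` does not depend on the size of the rectangle, which is what makes large box filters as cheap as small ones.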
SURF and Hessian Matrix
● SURF is based on calculating the Hessian matrix
● Authors claim more robust than Harris detector
● The Hessian collects the 2nd order derivatives; its determinant takes large values at maxima and minima
Approximation of Gaussian
● 2nd order partial derivatives
● Cropped and discretized
– L_yy, L_xy
Box Filters
● Coarse approximation allows computation of value by integral image
– L_yy L_xy
Scale Space Transform
● Useful for finding interest points
● Can scale filter without increased computational cost
Scale Space
● What filter sizes do we need to use?
● 9x9 box filters are approximations of Gaussian with σ = 1.2
● $\det(H_{approx}) = D_{xx} D_{yy} - (w\,D_{xy})^2$
● w is a weight that corrects for the approximation of the Gaussian.
● Analysis shows that w=0.9 is a good enough approximation
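The blob response at one pixel and one scale can be sketched as below, assuming the form det(H_approx) = D_xx·D_yy - (w·D_xy)² from the SURF paper. The function name and the sample box-filter responses are made up for illustration.

```python
def hessian_response(dxx, dyy, dxy, w=0.9):
    """Approximate Hessian determinant from the three box-filter responses."""
    return dxx * dyy - (w * dxy) ** 2


# Hypothetical box-filter responses at a single pixel.
blob = hessian_response(dxx=4.0, dyy=3.0, dxy=1.0)     # same-sign curvature: positive
saddle = hessian_response(dxx=4.0, dyy=-3.0, dxy=1.0)  # opposite-sign curvature: negative
print(blob, saddle)
```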
Scale Space
● The approximation of the Hessian determinant is equivalent to finding blobs
● Used to detect local maxima in the scale space using different sized filters
● Stored in the so-called blob response map
Scale Space
● Doubling of σ represents one octave of the scale space
● Each octave has a constant number of scale levels
● Filter size needs to be increased by 6 pixels
– Lobes are set to 1/3 of the filter size
– Each lobe needs to be increased by 2 pixels
– To keep a central pixel
Scale Space
● 9, 15, 21, 27 are first octave
– Filter sizes 9 to 27 correspond to σ from 1.2 to 3.6 (σ = 1.2 × size / 9); after interpolation, interest points are found at scales between about 1.6 and 3.2
– Min and max scale levels per octave are used only for comparison, to suppress maxima that are not maxima in scale space
● Filter size increase doubles per octave
● 15, 27, 39, 51
● Large change in σ in first two octaves can be avoided by scaling image first
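The filter sizes listed above follow a simple pattern: the step between levels is 6 in the first octave and doubles with each octave. One way to generate them is the closed form used in the OpenSURF notes, size = 3 · (2^octave · level + 1) with 1-based octave and level. In the sketch below the function names are illustrative, and the mapping σ = 1.2 · size / 9 is an assumption extrapolated from the statement that a 9x9 filter corresponds to σ = 1.2.

```python
def filter_size(octave, level):
    """Side length of the box filter for a 1-based octave and level."""
    return 3 * (2 ** octave * level + 1)


def filter_sigma(size, base_sigma=1.2, base_size=9):
    """Gaussian scale that a box filter of the given size approximates (assumed linear)."""
    return base_sigma * size / base_size


octaves = {
    octave: [filter_size(octave, level) for level in (1, 2, 3, 4)]
    for octave in (1, 2, 3)
}
for octave, sizes in octaves.items():
    print(octave, sizes, [round(filter_sigma(s), 2) for s in sizes])
```

Under this assumption the 27x27 filter itself corresponds to σ = 3.6; the smaller upper value quoted in the SURF paper refers to the coarsest scale at which a maximum can still be localized after interpolation.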
Scale Space
Haar Wavelets
Orientation
● s is the scale at which an interest point was detected
● Calculate response of Haar wavelets in the x and y direction around the interest point
● Radius is 6*s
– Sampling is s
– So take 6 samples along one direction
● Size of the Haar wavelets is 4*s
Orientation
● Responses weighted by Gaussian with σ=2s
● X direction is Haar wavelet response in x
● Y direction is Haar wavelet response in y
● Sum all points in 60 deg. window
– Important parameter
● Orientation is window with the longest vector
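The sliding-window step can be sketched as follows, assuming the Gaussian-weighted Haar responses are already available as (dx, dy) pairs. The function name, the 5 degree step between window positions, and the sample responses are all assumptions made for illustration.

```python
import math


def dominant_orientation(responses, window=math.pi / 3, steps=72):
    """Angle (radians) of the longest summed vector over all 60 degree windows.

    responses: list of (dx, dy) Haar wavelet responses, already weighted.
    """
    two_pi = 2.0 * math.pi
    best_len2, best_angle = 0.0, 0.0
    for i in range(steps):
        start = two_pi * i / steps
        sum_dx = sum_dy = 0.0
        for dx, dy in responses:
            angle = math.atan2(dy, dx) % two_pi
            if (angle - start) % two_pi < window:  # response falls inside the window
                sum_dx += dx
                sum_dy += dy
        len2 = sum_dx * sum_dx + sum_dy * sum_dy
        if len2 > best_len2:
            best_len2, best_angle = len2, math.atan2(sum_dy, sum_dx)
    return best_angle


# Hypothetical responses: three agree on 45 degrees, two are outliers.
responses = [(1.0, 1.0), (2.0, 2.0), (1.5, 1.5), (-0.5, 0.1), (0.1, -0.3)]
orientation = dominant_orientation(responses)
print(math.degrees(orientation))
```

A narrower window reacts to single dominant responses, while a wider one averages over too many directions, which is why the window size matters.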
Feature Descriptors
● Similar to SIFT (David Lowe)
● Generate square window
– Size is 20s
– Orientation along the orientation calculated
● Split window into 4x4 square subregions
● Calculate Haar wavelet response in x and y
● Rotate along the orientation
● Weight with a Gaussian of σ=3.3s
Feature Descriptor
● Calculate sum of changes as well as absolute sum of changes
● 4 entries per field, 16 sub-regions = 64 entries
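How the 64 entries come about can be sketched as below. The sketch assumes the rotated and weighted Haar responses have already been sampled on a 20x20 grid (5x5 samples per subregion, as in the SURF paper) and that the final vector is normalized to unit length; neither detail is stated on the slides, and the function name and input are hypothetical.

```python
import math


def surf_descriptor(dx, dy):
    """Build a 64-entry descriptor from 20x20 grids of Haar responses dx and dy."""
    descriptor = []
    for by in range(4):          # 4x4 subregions
        for bx in range(4):
            s_dx = s_dy = s_adx = s_ady = 0.0
            for y in range(5 * by, 5 * by + 5):      # 5x5 samples per subregion
                for x in range(5 * bx, 5 * bx + 5):
                    s_dx += dx[y][x]
                    s_dy += dy[y][x]
                    s_adx += abs(dx[y][x])
                    s_ady += abs(dy[y][x])
            descriptor.extend([s_dx, s_dy, s_adx, s_ady])
    norm = math.sqrt(sum(v * v for v in descriptor))
    return [v / norm for v in descriptor] if norm > 0 else descriptor


# Hypothetical input: constant responses over the whole window.
dx = [[1.0] * 20 for _ in range(20)]
dy = [[-1.0] * 20 for _ in range(20)]
descriptor = surf_descriptor(dx, dy)
print(len(descriptor), descriptor[:4])
```

Keeping both the signed sums and the sums of absolute values lets the descriptor tell a uniform gradient apart from an alternating pattern, whose signed sums cancel out.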
Robustness to Noise
Evaluation
● The set of features of the feature descriptors was arrived at by experimentation
● Evaluations
– Camera calibration
– Object detection
● Reported to be faster and more robust than the other detectors evaluated
References
● Herbert Bay, Andreas Ess, Tinne Tuytelaars, Luc Van Gool, "SURF: Speeded Up Robust Features", Computer Vision and Image Understanding (CVIU), Vol. 110, No. 3, pp. 346-359, 2008
● SURF, David Tam (Ryerson), Computer Robotics Vision (CRV) Tutorials 2010, www.computerroboticsvision.org
● Salient Feature Detectors and Descriptors: Affine-Hessian, Harris, MSER, SIFT, SURF, Amir-Hossein Shabani, Computer Robotics Vision (CRV) Tutorials 2009, www.computerroboticsvision.org
● Chris Evans. Notes on the OpenSURF Library. January 18, 2009, http://opensurf1.googlecode.com/files/OpenSURF.pdf
● http://en.wikipedia.org/wiki/Hessian_matrix