A Fully Affine Invariant Feature Detector
Wei Li (1, 2), Zelin Shi (2), Jian Yin (3)
(1) Graduate University of Chinese Academy of Sciences, Beijing, 100049, China
(2) Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 110016, China
(3) The Research Institute on General Development of Air Force, Beijing, 100076, China
[email protected] [email protected]
Abstract
This paper proposes a Fully Affine Invariant Feature (FAIF) detector
based on affine Gaussian scale-space. The covariance matrix of a
Maximally Stable Extremal Region (MSER) is interpreted as an isotropy
measure of an image patch. A local anisotropic image patch can be
regarded as an affine-transformed isotropic image patch, so the affine
deformation of an MSER can be estimated from its covariance matrix.
According to affine Gaussian scale-space theory, filters must be
compatible with local image structures: an anisotropic image patch
should be smoothed by an elliptical Gaussian filter, which is difficult
to construct directly. In order to use circular Gaussian filters, FAIF
transforms affine Gaussian scale-space into scale space by rotating and
compressing an anisotropic image region into an isotropic one. The fully
affine invariant features are then detected on the isotropic image
patches by the Scale Invariant Feature Transform (SIFT) algorithm.
Experimental results show that FAIF yields many more matches than the
state-of-the-art algorithms.
1. Introduction
Affine invariant features have been shown to be well suited to matching
and recognition as well as many other applications. Because an affine
transformation is sufficient to locally model image distortions arising
from geometric deformations, the majority of image features have been
built to be affine invariant. MSER [1] is one of the most outstanding
image feature detectors. It is commonly referred to as an affine
invariant feature detector because of its superior performance in wide
baseline image feature matching. MSER features have high localization
accuracy and can be matched across large view angles [3], [4]. However,
MSER is sensitive to scale changes and usually detects too few features.
Therefore, MSER is not fully affine invariant. SIFT [2] is the most
widely used image feature extraction method, owing to its scale space
implementation [5] and robust descriptor [9]. However, SIFT does not
handle the view angle component of an affine transformation [6].
Therefore, we still face the problem of extracting affine invariant
features. The difficulty is to obtain invariance to both scale and view
angle changes. Lindeberg has put forward the scale space and affine
Gaussian scale-space [7] theories to solve this problem. Scale space has
been widely used to cope with scale changes, but affine Gaussian
scale-space has received less attention because it is too complex to
build. To deal with this problem, FAIF converts the affine Gaussian
scale-space construction problem into a scale space construction
problem. Since scale space can be easily implemented, affine Gaussian
scale-space is no longer complex to build.
This paper introduces and analyzes scale space and affine Gaussian
scale-space in Section 2. In Section 3, we put forward the FAIF
algorithm. Section 4 is devoted to experiments in which FAIF is compared
with the state-of-the-art algorithms. We conclude the paper in
Section 5.
2. Scale space and affine Gaussian
scale-space
Scale space theory was proposed by Lindeberg to deal with isotropic
scale changes. The scale space of an image is defined as a function,
$L(x, y, \sigma)$, that is produced from the convolution of a
variable-scale circular Gaussian kernel, $G(x, y, \sigma)$, with an
input image, $I(x, y)$:
$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$$
where $*$ is the convolution operation in the x and y directions, and
$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2)/(2\sigma^2)}$$
21st International Conference on Pattern Recognition (ICPR 2012), November 11-15, 2012. Tsukuba, Japan
978-4-9906441-1-6 ©2012 IAPR
The Laplacian-of-Gaussians is defined by:
$$\mathrm{LoG}(X, \sigma_n) = \sigma_n^2 \left( L_{xx}(X, \sigma_n) + L_{yy}(X, \sigma_n) \right) \qquad (1)$$
where $\sigma_n$ is the size of the Gaussian kernel and $L_a$ is the
derivative computed in the $a$ direction. The absolute value of LoG
attains an extremum when the size of the Gaussian kernel matches the
size of a blob-like structure.
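As a small numerical check of this scale-selection property, the sketch below assumes a hypothetical unit-mass Gaussian blob of standard deviation `blob_sigma` (an illustrative choice, not taken from the paper). Smoothing such a blob with $G(x, y, \sigma)$ gives a Gaussian of variance `blob_sigma**2 + sigma**2`, so the scale-normalized LoG of Eq. (1) at the blob center has the closed form $-\sigma^2 / (\pi (s_0^2 + \sigma^2)^2)$, where $s_0$ is the blob's standard deviation. The function names are illustrative.

```python
import math

def normalized_log_at_center(sigma, blob_sigma):
    """Scale-normalized LoG (Eq. 1) at the center of a unit-mass Gaussian blob.

    Assumption: the blob itself is a Gaussian with standard deviation
    blob_sigma, so the smoothed image is a Gaussian of variance v below.
    """
    v = blob_sigma ** 2 + sigma ** 2
    # L_xx(0) + L_yy(0) = -1 / (pi * v^2) for a 2-D Gaussian of variance v.
    return -(sigma ** 2) / (math.pi * v ** 2)

def select_scale(blob_sigma, sigmas):
    """Return the candidate kernel size with the largest |LoG| response."""
    return max(sigmas, key=lambda s: abs(normalized_log_at_center(s, blob_sigma)))

# Candidate kernel sizes from 0.5 to 10.0 in steps of 0.25.
candidate_sigmas = [0.5 + 0.25 * k for k in range(39)]
best_sigma = select_scale(3.0, candidate_sigmas)
print(best_sigma)
```

Under these assumptions the selected kernel size coincides with the blob size, which is the behavior the paragraph above describes.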
In the case of affine transformation, when the scale
change is not necessarily the same in every direction,
scales detected in scale space do not reflect the real
transformation of a point. To deal with this problem,
Lindeberg has put forward the affine Gaussian scale-
space [7].
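In affine Gaussian scale-space the second moment matrix $\mu$ at a given
point $X$ is defined by:
$$\mu(X, \Sigma_I, \Sigma_D) = \begin{bmatrix} \mu_{11} & \mu_{12} \\ \mu_{21} & \mu_{22} \end{bmatrix} = g(X, \Sigma_I) * \begin{bmatrix} L_x^2(X, \Sigma_D) & L_x L_y(X, \Sigma_D) \\ L_x L_y(X, \Sigma_D) & L_y^2(X, \Sigma_D) \end{bmatrix} \qquad (2)$$
where $\Sigma_I$ is the integration scale, $\Sigma_D$ is the
differentiation scale and $g(X, \Sigma_I)$ (defined in Eq. (3)) is an
anisotropic Gaussian filter which is used to smooth the neighborhood of
the point $X$.
$$g(X; \Sigma) = \frac{1}{2\pi\sqrt{\det \Sigma}} e^{-X^T \Sigma^{-1} X / 2} \qquad (3)$$
$$\Sigma_I = t\,\mu^{-1}, \quad \Sigma_D = s\,\mu^{-1}, \quad t, s \in \mathbb{R}_+ \qquad (4)$$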
Affine Gaussian scale-space theory has demonstrated that if the second
moment matrix of a point satisfies Eq. (4), this point is affine
invariant. To summarize, affine Gaussian scale-space theory shows that,
in affine invariant feature extraction, an image should be smoothed by
different filters on different image patches. Gaussian filters must be
compatible with local image structures, which are measured by second
moment matrices (see Figure 1).
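To make the difference between elliptical and circular filters concrete, the following minimal sketch evaluates an anisotropic Gaussian of the form used in affine Gaussian scale-space, $g(X; \Sigma) = e^{-X^T \Sigma^{-1} X / 2} / (2\pi\sqrt{\det \Sigma})$, for a 2x2 covariance matrix. The function names and the example matrices are illustrative assumptions. When the covariance is $\sigma^2$ times the identity, the filter reduces to the circular Gaussian of ordinary scale space.

```python
import math

def affine_gaussian(x, y, cov):
    """Evaluate an anisotropic 2-D Gaussian with 2x2 covariance `cov` at (x, y)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form X^T cov^{-1} X, using the closed-form 2x2 inverse.
    q = (d * x * x - (b + c) * x * y + a * y * y) / det
    return math.exp(-q / 2.0) / (2.0 * math.pi * math.sqrt(det))

def circular_gaussian(x, y, sigma):
    """Evaluate the isotropic Gaussian kernel G(x, y, sigma) of scale space."""
    return math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) / (2.0 * math.pi * sigma ** 2)

# Isotropic covariance: the elliptical filter equals the circular one.
iso = affine_gaussian(1.0, -0.5, ((4.0, 0.0), (0.0, 4.0)))
circ = circular_gaussian(1.0, -0.5, 2.0)

# Anisotropic covariance: level sets are ellipses, so a point 3 units along x
# has the same weight as a point 1 unit along y.
elliptical = ((9.0, 0.0), (0.0, 1.0))
along_x = affine_gaussian(3.0, 0.0, elliptical)
along_y = affine_gaussian(0.0, 1.0, elliptical)
```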
3. Fully affine invariant feature detector
Figure 1. Gaussian filters compatible with local image structures
Figure 1 shows that the Gaussian filters used in affine Gaussian
scale-space are elliptical. Three parameters must be set to determine an
ellipse, so smoothing an image with elliptical filters is too complex.
The core of affine Gaussian scale-space theory is that Gaussian filters
and image structures must be compatible. We can therefore transform the
image structure to make it compatible with the circular Gaussian filters
used in scale space.
Figure 2. Ellipses correspond to MSERs
Figure 3. Relationship between an ellipse and the covariance matrix of MSER
As shown in Figure 2, the covariance matrix of an MSER corresponds to an
ellipse. This ellipse is represented by
$$(x - q)^T \Sigma_q^{-1} (x - q) = 1, \quad x, q \in \mathbb{R}^2 \qquad (5)$$
where $q$ is the center of the ellipse and $\Sigma_q$ is the covariance
matrix of the MSER. $\lambda_1$ and $\lambda_2$ in Figure 3 [8] are the
eigenvalues of $\Sigma_q$. The orientation of the ellipse is determined
by the eigenvectors of $\Sigma_q$. FAIF interprets the covariance matrix
of the MSER as an isotropy measure: an image patch is isotropic if
$\lambda_1$ equals $\lambda_2$; otherwise it is anisotropic. Without
loss of generality, we suppose that a local anisotropic image region is
an affine-transformed isotropic region. If we transform image patches
from anisotropic to isotropic, elliptical patches become circular
patches. Consequently, circular Gaussian filters can be used to smooth
these isotropic image patches, and affine Gaussian scale-space is
thereby transformed into scale space.
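The isotropy measure can be illustrated with a short sketch. It assumes a region is given as a list of (x, y) pixel coordinates (a hypothetical representation; an MSER implementation may expose regions differently) and uses the closed-form eigen-decomposition of a symmetric 2x2 matrix. The helper names and the two toy regions are illustrative only.

```python
import math

def region_covariance(pixels):
    """Return the centroid and 2x2 covariance matrix of a list of (x, y) pixels."""
    n = len(pixels)
    mx = sum(x for x, _ in pixels) / n
    my = sum(y for _, y in pixels) / n
    sxx = sum((x - mx) ** 2 for x, _ in pixels) / n
    syy = sum((y - my) ** 2 for _, y in pixels) / n
    sxy = sum((x - mx) * (y - my) for x, y in pixels) / n
    return (mx, my), ((sxx, sxy), (sxy, syy))

def eigen_sym2(cov):
    """Eigenvalues (largest first) and major-axis angle of a symmetric 2x2 matrix."""
    a, b, d = cov[0][0], cov[0][1], cov[1][1]
    mean = (a + d) / 2.0
    dev = math.hypot((a - d) / 2.0, b)
    theta = 0.5 * math.atan2(2.0 * b, a - d)  # orientation of the major axis
    return mean + dev, mean - dev, theta

def anisotropy(cov):
    """Ratio of major to minor axis length; 1.0 means the patch is isotropic.

    Raises ZeroDivisionError for a degenerate (line-like) region.
    """
    lam_max, lam_min, _ = eigen_sym2(cov)
    return math.sqrt(lam_max / lam_min)

# Toy regions: a 9x3 block (anisotropic) and a 3x3 block (isotropic).
elongated = [(x, y) for x in range(9) for y in range(3)]
square = [(x, y) for x in range(3) for y in range(3)]
center_e, cov_e = region_covariance(elongated)
center_s, cov_s = region_covariance(square)
theta_e = eigen_sym2(cov_e)[2]
```

Here `anisotropy(cov_s)` is 1 for the square block, while the elongated block yields a ratio well above 1 with its major axis along x.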
Figure 4. Transformation from affine to similarity
We propose a rotating and compressing procedure to transform anisotropic
image patches into isotropic ones. As shown in Figure 4, we rotate the
ellipse until its major axis coincides with the x axis of the image
coordinate system and then compress the ellipse along its major axis
until it becomes a circle. The compression ratio is the ratio of the
lengths of the ellipse's major and minor axes. Before compressing the
ellipse, we smooth the anisotropic image patch along the x direction
with a Gaussian filter whose standard deviation is
$0.8\sqrt{t^2 - 1}$, where $t$ is the compression ratio. Morel and Yu
[5] found that the image should be smoothed in this manner in order to
guarantee aliasing-free image sampling.
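The rotate-and-compress step can be sketched as a single 2x2 matrix. The sketch below assumes the ellipse is described by a symmetric 2x2 covariance matrix whose largest eigenvalue corresponds to the major axis, and that the compression ratio $t$ is the square root of the eigenvalue ratio; these conventions and the function names are assumptions made for illustration, not the authors' implementation. It also returns the anti-aliasing standard deviation $0.8\sqrt{t^2 - 1}$ and checks that the transformed covariance is isotropic.

```python
import math

def normalizing_transform(cov):
    """Return (A, t, sigma_aa) for a symmetric 2x2 covariance matrix.

    A rotates the major axis onto the x axis and then compresses x by 1/t.
    sigma_aa is the standard deviation of the x-direction smoothing filter.
    """
    a, b, d = cov[0][0], cov[0][1], cov[1][1]
    mean = (a + d) / 2.0
    dev = math.hypot((a - d) / 2.0, b)
    lam_max, lam_min = mean + dev, mean - dev
    theta = 0.5 * math.atan2(2.0 * b, a - d)   # major-axis orientation
    t = math.sqrt(lam_max / lam_min)           # major/minor axis length ratio
    c, s = math.cos(theta), math.sin(theta)
    # Rotation by -theta is [[c, s], [-s, c]]; the first row is scaled by 1/t.
    A = ((c / t, s / t), (-s, c))
    sigma_aa = 0.8 * math.sqrt(max(t * t - 1.0, 0.0))
    return A, t, sigma_aa

def transform_covariance(A, cov):
    """Covariance after applying the linear map A, i.e. A * cov * A^T."""
    m = [[sum(A[i][k] * cov[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
    return [[sum(m[i][k] * A[j][k] for k in range(2)) for j in range(2)] for i in range(2)]

# Example: an ellipse with eigenvalues 16 and 4 whose major axis is at 30 degrees.
example_cov = ((13.0, 3.0 * math.sqrt(3.0)), (3.0 * math.sqrt(3.0), 7.0))
A, t, sigma_aa = normalizing_transform(example_cov)
normalized_cov = transform_covariance(A, example_cov)
```

For this example the compression ratio is 2 and the normalized covariance is 4 times the identity, so the patch has become isotropic.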
Note that we use the transformation from an ellipse to a circle only to
illustrate how an anisotropic patch is transformed into an isotropic
one. In the implementation of FAIF, the procedure is applied to a
rectangular patch that encloses the ellipse. The rectangle and the
ellipse share the same center, and the rectangle's width and length are
twice the ellipse's minor and major axes, respectively.
Because SIFT is the best image feature extraction method as far as
translation, rotation and scale invariance are concerned, FAIF detects
fully affine invariant features on isotropic image patches by SIFT.
The outline of FAIF is as follows:
1. Detect MSERs.
2. Compute the covariance matrix of each MSER. Then compute the
eigenvalues and eigenvectors of the inverse of the covariance matrix.
3. Rotate and smooth a rectangular patch that contains the ellipse
corresponding to the covariance matrix of the MSER. Compress this
image patch until the rectangle becomes a square.
4. Extract FAIF features from the isotropic patch by SIFT. Translate the
coordinates of the FAIF features into their original image
coordinates.
5. Match FAIF features and eliminate outliers by RANSAC [10].
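The coordinate translation in step 4 can be sketched as the inverse of the rotate-and-compress map. The sketch below assumes that patch coordinates are measured from the patch center and that `A` is the 2x2 matrix that rotated and compressed the patch; both conventions, and the function names, are hypothetical and chosen only for illustration.

```python
import math

def image_to_patch(point, center, A):
    """Map an image point into the isotropic patch: A * (point - center)."""
    dx, dy = point[0] - center[0], point[1] - center[1]
    return (A[0][0] * dx + A[0][1] * dy, A[1][0] * dx + A[1][1] * dy)

def patch_to_image(patch_point, center, A):
    """Map a feature found in the isotropic patch back to image coordinates."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if det == 0:
        raise ValueError("transform is not invertible")
    u, v = patch_point
    # Closed-form inverse of a 2x2 matrix applied to (u, v).
    dx = (A[1][1] * u - A[0][1] * v) / det
    dy = (-A[1][0] * u + A[0][0] * v) / det
    return (center[0] + dx, center[1] + dy)

# Example: major axis at 30 degrees, compression ratio 2, ellipse centered at (120, 80).
theta, ratio = math.radians(30.0), 2.0
c, s = math.cos(theta), math.sin(theta)
A = ((c / ratio, s / ratio), (-s, c))
center = (120.0, 80.0)

feature_in_image = (131.5, 72.25)
feature_in_patch = image_to_patch(feature_in_image, center, A)
recovered = patch_to_image(feature_in_patch, center, A)

# A point 2 units along the major axis lands 1 unit along x in the patch.
on_major_axis = (center[0] + 2.0 * c, center[1] + 2.0 * s)
compressed = image_to_patch(on_major_axis, center, A)
```

Keeping the forward and inverse maps as a pair makes it easy to verify by round trip that features detected on the isotropic patch return to their original positions before matching.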
4. Experiments
In this experiment, FAIF is compared with SIFT and MSER in feature
matching performance. Three groups of images are used, which we call
Graffiti [4], Magazine T0 and Magazine T2 [6], respectively. The
Graffiti group includes the standard test images provided by
Mikolajczyk. The other groups are pictures of a magazine taken from
different viewpoints with varying tilts and transition tilts [6]. Tilt
is the consequence of a change in the latitude view angle. When both the
latitude and longitude angles change, a transition tilt is produced.
Images in group Magazine T2 have transition tilts. An image in Magazine
T2 has greater affine deformation than an image in Magazine T0 if they
have the same longitude angle.
Figures 5 to 7 illustrate the matching results. In each figure, from top
to bottom are the matching results of SIFT, MSER and FAIF. In each
picture, correspondences are connected by white segments. Table 1
summarizes the number of matches for each algorithm.
Figure 5. Matches of Graffiti by SIFT (top), MSER (middle) and FAIF (bottom)
Figure 6. Matches of Magazine T0 by SIFT (top), MSER (middle) and FAIF (bottom)
Figure 7. Matches of Magazine T2 by SIFT (top), MSER (middle) and FAIF (bottom)
Table 1. Comparison of the number of matches of FAIF, MSER and SIFT
               FAIF   MSER   SIFT
Graffiti       5000     66      0
Magazine T0    2450     31      0
Magazine T2    8283     66      1
Experimental results show that FAIF finds matches in all image groups.
As shown in Figure 6, FAIF works well up to 80 degrees, and it would be
unrealistic to insist on larger angles, because the observed surface in
general becomes reflective and the resulting photo differs completely
from the frontal view under such a large view angle change. The matching
results in Table 1 indicate that FAIF is fully affine invariant.
SIFT fails in this experiment. Although the images in Magazine T2 are
very clear, SIFT finds at most one match (Table 1). One reason for this
is that SIFT extracts features in scale space, whereas affine Gaussian
scale-space should be used to cope with the affine deformations caused
by changes in viewpoint angle.
In many region detector comparison experiments, MSER has shown better
performance than other detectors. This experiment is consistent with
that conclusion: MSER obtains a certain number of matches even at very
large view angles. The reason is that MSER involves no Gaussian
smoothing, so it does not suffer from filters that are incompatible with
image structures.
5. Conclusion
In this paper, we propose a fully affine invariant feature detector.
FAIF uses the covariance matrix of an MSER to transform an anisotropic
region into an isotropic region. Fully affine invariant features are
then detected on the isotropic patch. In this manner, we can detect
fully affine invariant features based on affine Gaussian scale-space.
Experimental results show that FAIF is fully affine invariant and yields
enough matches to meet the requirements of robust image matching. In
future work, we intend to investigate new methods for transforming
affine Gaussian scale-space into scale space.
References
[1] J. Matas, O. Chum, M. Urban and T. Pajdla. Robust
wide baseline stereo from maximally stable extremal
regions. Image and vision computing, 22(10): 761-767,
2004.
[2] D.G. Lowe. Distinctive image features from
scale-invariant keypoints. International journal of
computer vision, 60(2): 91-110, 2004.
[3] T. Tuytelaars and K. Mikolajczyk. Local invariant
feature detectors: a survey. Foundations and Trends in
Computer Graphics and Vision, 3(3): 177-280, 2008.
[4] K. Mikolajczyk, et al. A comparison of affine region
detectors. International journal of computer vision,
65(1): 43-72, 2005.
[5] J.M. Morel and G. Yu. Is SIFT scale invariant? Inverse
Problems and Imaging, 5(1): 115-136, 2011.
[6] J.M. Morel and G. Yu. ASIFT, A new framework for
fully affine invariant image comparison. SIAM Journal
on Imaging Sciences, 2(2): 438-469, 2009.
[7] T. Lindeberg and J. Garding. Shape-adapted smoothing
in estimation of 3-D shape cues from affine
deformations of local 2-D brightness structure. Image
and vision computing, 15(6): 415-434, 1997.
[8] J. Garding and T. Lindeberg. Direct computation of
shape cues using scale-adapted spatial derivative
operators. International journal of computer vision, 17(2): 163-191, 1996.
[9] K. Mikolajczyk and C. Schmid. A performance
evaluation of local descriptors. IEEE Transactions on
Pattern Analysis and Machine Intelligence, 27(10):
1615-1630, 2005.
[10] M. A. Fischler and R. C. Bolles. Random sample
consensus: a paradigm for model fitting with
applications to image analysis and automated
cartography. Communications of the ACM, 24(6):
381-395, 1981.