
A Fully Affine Invariant Feature Detector

Wei Li1,2, Zelin Shi2, Jian Yin3

1 Graduate University of Chinese Academy of Sciences, Beijing, 100049, China
2 Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, 110016, China
3 The Research Institute on General Development of Air Force, Beijing, 100076, China

[email protected] [email protected]

Abstract

This paper proposes a Fully Affine Invariant Feature (FAIF) detector based on affine Gaussian scale-space. The covariance matrix of a Maximally Stable Extremal Region (MSER) is interpreted as an isotropy measure of an image patch. A local anisotropic image patch can be regarded as an affine-transformed isotropic image patch, so the affine deformation of an MSER can be estimated from its covariance matrix. According to affine Gaussian scale-space theory, filters must be compatible with local image structures: an anisotropic image patch should be smoothed by an elliptical Gaussian filter, which is difficult to construct directly. In order to use circular Gaussian filters instead, FAIF transforms affine Gaussian scale-space into scale space by rotating and compressing an anisotropic image region into an isotropic one. The fully affine invariant features are then detected on the isotropic image patches by the Scale Invariant Feature Transform (SIFT) algorithm. Experimental results show that FAIF yields many more matches than the state-of-the-art algorithms.

1. Introduction

Affine invariant features have been shown to be well suited to matching and recognition as well as many other applications. Because an affine transformation is sufficient to locally model image distortions arising from geometric deformations, the majority of image features have been built to be affine invariant. MSER [1] is one of the most outstanding image feature detectors. It is commonly referred to as an affine invariant feature detector because of its superior performance in wide baseline image feature matching. MSER features have high localization accuracy and can be matched under large view angles [3], [4]. However, MSER is sensitive to scale changes and usually detects too few features. Therefore, MSER is not fully affine invariant. SIFT [2] is the most widely used image feature extraction method because of its scale space implementation [5] and robust descriptor [9]. However, SIFT does not deal with the view angle changes of an affine transformation [6]. Therefore, we still face the problem of extracting affine invariant features. The difficulty is to obtain invariance to both scale and view angle changes. Lindeberg has put forward the scale space and affine Gaussian scale-space [7] theories to solve this problem. Scale space has been widely used to cope with scale changes, but affine Gaussian scale-space has received much less attention because it is too complex to build. To deal with this problem, FAIF converts the affine Gaussian scale-space construction problem into a scale space construction problem. As scale space can be easily implemented, affine Gaussian scale-space is no longer complex to build.

This paper introduces and analyzes scale space and affine Gaussian scale-space in Section 2. Section 3 puts forward the FAIF algorithm. Section 4 is devoted to experiments in which FAIF is compared with state-of-the-art algorithms. Section 5 concludes the paper.

2. Scale space and affine Gaussian scale-space

Scale space theory was proposed by Lindeberg to deal with isotropic scale changes. The scale space of an image is defined as a function, $L(x, y, \sigma)$, that is produced from the convolution of a variable-scale circular Gaussian kernel, $G(x, y, \sigma)$, with an input image, $I(x, y)$:

$$L(x, y, \sigma) = G(x, y, \sigma) * I(x, y)$$

where $*$ is the convolution operation in the x and y directions, and

$$G(x, y, \sigma) = \frac{1}{2\pi\sigma^2} e^{-(x^2 + y^2)/(2\sigma^2)}$$

21st International Conference on Pattern Recognition (ICPR 2012), November 11-15, 2012, Tsukuba, Japan. 978-4-9906441-1-6 ©2012 IAPR

The Laplacian-of-Gaussians is defined by:

$$LoG(X, \sigma_n) = \sigma_n^2 \left( L_{xx}(X, \sigma_n) + L_{yy}(X, \sigma_n) \right) \tag{1}$$

where $\sigma_n$ is the size of the Gaussian kernel and $L_a$ is the derivative computed in the $a$ direction. The absolute value of LoG attains an extremum when the size of the Gaussian kernel matches the size of a blob-like structure.
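The scale-selection property of Eq. (1) can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes a synthetic Gaussian blob with a standard deviation of 6 pixels as the blob-like structure, approximates the second derivatives by finite differences, and uses illustrative helper names (`gaussian_kernel_1d`, `smooth`, `normalized_log_at_center`) that do not come from the paper.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    """Sampled, normalized 1-D Gaussian truncated at 4 sigma."""
    radius = int(np.ceil(4 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    return kernel / kernel.sum()

def smooth(image, sigma):
    """L = G * I, using the separability of the circular Gaussian."""
    k = gaussian_kernel_1d(sigma)
    rows = np.apply_along_axis(np.convolve, 1, image, k, mode="same")
    return np.apply_along_axis(np.convolve, 0, rows, k, mode="same")

def normalized_log_at_center(image, sigma):
    """sigma^2 * (L_xx + L_yy) at the image center, by finite differences."""
    L = smooth(image, sigma)
    c = image.shape[0] // 2
    laplacian = (L[c, c + 1] + L[c, c - 1] + L[c + 1, c] + L[c - 1, c]
                 - 4.0 * L[c, c])
    return sigma**2 * laplacian

# Synthetic blob (assumption): Gaussian bump with standard deviation 6 pixels.
size, blob_sigma = 129, 6.0
yy, xx = np.mgrid[:size, :size] - size // 2
blob = np.exp(-(xx**2 + yy**2) / (2.0 * blob_sigma**2))

sigmas = np.arange(2.0, 11.0)  # candidate kernel sizes 2, 3, ..., 10
responses = np.array([abs(normalized_log_at_center(blob, s)) for s in sigmas])
best_sigma = sigmas[np.argmax(responses)]
print(best_sigma)
```

For this idealized blob the strongest response is expected at the kernel size equal to the blob's own standard deviation, which is the behavior that scale selection in scale space relies on.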

In the case of an affine transformation, where the scale change is not necessarily the same in every direction, scales detected in scale space do not reflect the real transformation of a point. To deal with this problem, Lindeberg has put forward the affine Gaussian scale-space [7].

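In affine Gaussian scale-space the second moment matrix $\mu$ at a given point $X$ is defined by:

$$\mu(X, \Sigma_I, \Sigma_D) = \begin{bmatrix} \mu_{11} & \mu_{12} \\ \mu_{21} & \mu_{22} \end{bmatrix} = g(X; \Sigma_I) * \begin{bmatrix} L_x^2(X, \Sigma_D) & L_x L_y(X, \Sigma_D) \\ L_x L_y(X, \Sigma_D) & L_y^2(X, \Sigma_D) \end{bmatrix} \tag{2}$$

where $\Sigma_I$ is the integration scale, $\Sigma_D$ is the differentiation scale and $g(X; \Sigma_I)$ (defined in Eq. (3)) is an anisotropic Gaussian filter which is used to smooth the neighborhood of the point $X$.

$$g(X; \Sigma) = \frac{1}{2\pi\sqrt{\det \Sigma}} e^{-X^T \Sigma^{-1} X / 2} \tag{3}$$

$$\Sigma_I = t\,\mu^{-1}, \quad \Sigma_D = s\,\mu^{-1}, \quad t, s \in \mathbb{R} \tag{4}$$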

Affine Gaussian scale-space theory has demonstrated that if a point's second moment matrix satisfies Eq. (4), this point is affine invariant.

To summarize, affine Gaussian scale-space theory shows that, in affine invariant feature extraction, an image should be smoothed by different filters on different image patches. Gaussian filters must be compatible with the local image structures, which are measured by second moment matrices (see Fig. 1).
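To make the anisotropic filter of Eq. (3) concrete, the sketch below samples it on a pixel grid. It is an illustration under stated assumptions rather than part of FAIF: the function name `affine_gaussian_kernel`, the grid radius and the example covariance values are all chosen here for demonstration. It also checks that the isotropic case $\Sigma = \sigma^2 I$ reduces to the circular kernel $G(x, y, \sigma)$ of scale space.

```python
import numpy as np

def affine_gaussian_kernel(Sigma, radius):
    """Sample g(X; Sigma) of Eq. (3) on a (2*radius+1) x (2*radius+1) grid."""
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    X = np.stack([xs, ys], axis=-1).astype(float)   # grid of (x, y) points
    # Quadratic form X^T Sigma^{-1} X evaluated at every grid point.
    quad = np.einsum("...i,ij,...j->...", X, np.linalg.inv(Sigma), X)
    return np.exp(-quad / 2.0) / (2.0 * np.pi * np.sqrt(np.linalg.det(Sigma)))

# Anisotropic covariance (assumed example values): an elliptical filter.
Sigma_aniso = np.array([[9.0, 3.0],
                        [3.0, 4.0]])
g_aniso = affine_gaussian_kernel(Sigma_aniso, radius=20)

# Isotropic special case Sigma = sigma^2 * I: the circular scale-space kernel.
sigma = 3.0
g_iso = affine_gaussian_kernel(sigma**2 * np.eye(2), radius=20)
ys, xs = np.mgrid[-20:21, -20:21]
g_circular = np.exp(-(xs**2 + ys**2) / (2 * sigma**2)) / (2 * np.pi * sigma**2)
```

The elliptical kernel needs the three independent entries of the symmetric matrix `Sigma`, whereas the circular one needs only `sigma`.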

3. Fully affine invariant feature detector

Figure 1. Gaussian filters compatible with local image structures

Figure 1 shows that the Gaussian filters used in affine Gaussian scale-space are elliptic. Three parameters must be set to determine an ellipse, so smoothing an image with elliptic filters is complex. The core of affine Gaussian scale-space theory is that Gaussian filters and image structures must be compatible. We can therefore transform the image structure instead, to make it compatible with the circular Gaussian filters used in scale space.

Figure 2. Ellipses corresponding to MSERs

Figure 3. Relationship between an ellipse and the covariance matrix of an MSER

As shown in Figure 2, the covariance matrix of an MSER corresponds to an ellipse. This ellipse is represented by

$$(x - q)^T \Sigma_q (x - q) = 1, \quad x, q \in \mathbb{R}^2 \tag{5}$$

where $q$ is the center of the ellipse and $\Sigma_q$ is the covariance matrix of the MSER. $\lambda_1$ and $\lambda_2$ in Figure 3 [8] are the eigenvalues of $\Sigma_q$. The orientation of the ellipse is determined by the eigenvectors of $\Sigma_q$. FAIF interprets the covariance matrix of an MSER as an isotropy measure. An image patch is isotropic if $\lambda_1$ equals $\lambda_2$; otherwise it is anisotropic. Without loss of generality, we suppose that a local anisotropic image region is an affine-transformed isotropic region. If we transform image patches from anisotropic to isotropic, elliptic patches are transformed into circular patches. Consequently, circular Gaussian filters can be used to smooth these isotropic image patches, and affine Gaussian scale-space is thereby transformed into scale space.
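The isotropy measure described above can be sketched as follows. This is an assumption-laden illustration, not the authors' code: it takes the region's covariance to be the statistical covariance of its pixel coordinates (computed with `np.cov`), uses a synthetic filled ellipse in place of a detected MSER, and the function name `region_shape` is hypothetical. The eigenvalue ratio and the eigenvector directions are the same whether the matrix or its inverse is analyzed, so the isotropy test does not depend on that choice.

```python
import numpy as np

def region_shape(pixels):
    """Centroid, covariance, axis ratio and major-axis angle of a pixel set.

    pixels: (N, 2) array of (x, y) coordinates belonging to one region.
    """
    q = pixels.mean(axis=0)
    cov = np.cov(pixels, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    axis_ratio = np.sqrt(evals[1] / evals[0])    # 1.0 means isotropic
    major = evecs[:, 1]                          # direction of largest spread
    angle_deg = np.degrees(np.arctan2(major[1], major[0])) % 180.0
    return q, cov, axis_ratio, angle_deg

# Synthetic region (assumption): filled ellipse with semi-axes 20 and 10,
# whose major axis is rotated by 30 degrees from the x axis.
theta = np.radians(30.0)
ys, xs = np.mgrid[-40:41, -40:41]
u = xs * np.cos(theta) + ys * np.sin(theta)      # coordinate along major axis
v = -xs * np.sin(theta) + ys * np.cos(theta)     # coordinate along minor axis
mask = (u / 20.0) ** 2 + (v / 10.0) ** 2 <= 1.0
pixels = np.column_stack([xs[mask], ys[mask]]).astype(float)

q, cov, axis_ratio, angle_deg = region_shape(pixels)
print(round(axis_ratio, 2), round(angle_deg, 1))
```

An axis ratio close to 1 indicates an isotropic patch; here the ratio should be close to 2 and the angle close to 30 degrees, up to pixel discretization.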

[Figure 3 labels: axes x and y; ellipse semi-axes $1/\lambda_1^{1/2}$ and $1/\lambda_2^{1/2}$]

Figure 4. Transformation from affine to similarity

We propose a rotating and compressing procedure to transform anisotropic image patches into isotropic ones. As shown in Figure 4, we rotate the ellipse until its major axis coincides with the x axis of the image coordinate system and then compress the ellipse along its major axis until it becomes a circle. The compression ratio is the ratio of the lengths of the ellipse's major and minor axes. Before compressing the ellipse, we smooth the anisotropic image patch along the x direction with a Gaussian filter of standard deviation $0.8\sqrt{t^2 - 1}$, where $t$ is the compression ratio. Morel and Yu [5] found that smoothing the image in this manner guarantees aliasing-free image sampling.
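The geometry of the rotating and compressing procedure can be sketched with a small amount of linear algebra. The code below is a hedged illustration, not the FAIF implementation: it assumes the input is the statistical covariance of the region's pixel coordinates (so the major axis is the eigenvector with the largest eigenvalue; if the ellipse matrix of Eq. (5) is used instead, the roles of the largest and smallest eigenvalues swap), the function name `normalizing_transform` is hypothetical, and the image resampling itself is not shown.

```python
import numpy as np

def normalizing_transform(cov):
    """Return (A, t, sigma_aa) for a 2x2 region covariance matrix.

    A rotates the major axis onto the x axis and then compresses x by t.
    sigma_aa = 0.8 * sqrt(t^2 - 1) is the smoothing applied along x
    before compression.
    """
    evals, evecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    R = np.vstack([evecs[:, 1], evecs[:, 0]])     # rows: major axis, minor axis
    if np.linalg.det(R) < 0:                      # keep a proper rotation
        R[1] *= -1.0
    t = np.sqrt(evals[1] / evals[0])              # major / minor axis length
    A = np.diag([1.0 / t, 1.0]) @ R               # rotate, then compress x
    sigma_aa = 0.8 * np.sqrt(max(t**2 - 1.0, 0.0))
    return A, t, sigma_aa

# Example covariance (assumed values): standard deviations 10 and 5 along
# axes rotated by 30 degrees.
theta = np.radians(30.0)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
cov = rot @ np.diag([100.0, 25.0]) @ rot.T

A, t, sigma_aa = normalizing_transform(cov)
cov_normalized = A @ cov @ A.T                    # covariance after the warp
```

After the transform the covariance is a multiple of the identity, which is the isotropy condition that allows circular Gaussian filters to be used on the warped patch.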

Note that we use the transformation from an ellipse to a circle to illustrate how to transform an anisotropic patch into an isotropic one. In the implementation of FAIF, this procedure is applied to a rectangular patch which encloses the ellipse. The rectangle and the ellipse share the same center. The rectangle's width and length are twice the ellipse's minor and major axes, respectively.

Because SIFT is the best image feature extraction method as far as translation, rotation and scale invariance are concerned, FAIF detects fully affine invariant features on isotropic image patches by SIFT. The outline of FAIF is as follows:

1. Detect MSERs.

2. Compute the covariance matrix of each MSER. Then compute the eigenvalues and eigenvectors of the inverse of the covariance matrix.

3. Rotate and smooth a rectangular patch which contains the ellipse corresponding to the covariance matrix of the MSER. Compress this image patch until the rectangle becomes a square.

4. Extract FAIF features from the isotropic patch by SIFT. Translate the coordinates of the FAIF features into their original image coordinates.

5. Match FAIF features and eliminate outliers by RANSAC [10].
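The coordinate translation in step 4 of the outline above can be sketched as the inverse of the rotate-and-compress mapping. This is a minimal illustration under assumptions: the forward mapping is taken to be $x_{patch} = A (x_{image} - q) + c$, where $A$ is the 2x2 rotate-and-compress matrix, $q$ the region center and $c$ the corresponding point in the normalized patch; the function names and all numeric values are hypothetical. Only keypoint positions are handled; keypoint scale and orientation would need a separate treatment.

```python
import numpy as np

def image_to_patch(points_image, A, q, patch_center):
    """Forward mapping: (N, 2) image coordinates to normalized-patch coordinates."""
    return (points_image - q) @ A.T + patch_center

def patch_to_image(points_patch, A, q, patch_center):
    """Inverse mapping: (N, 2) normalized-patch coordinates back to the image."""
    A_inv = np.linalg.inv(A)
    return (points_patch - patch_center) @ A_inv.T + q

# Assumed example: rotation by -30 degrees followed by x-compression with t = 2.
theta = np.radians(-30.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
A = np.diag([0.5, 1.0]) @ R
q = np.array([320.0, 240.0])             # region center in the original image
patch_center = np.array([32.0, 32.0])    # the same point inside the patch

# Hypothetical keypoint positions detected in the isotropic patch.
keypoints_patch = np.array([[32.0, 32.0], [40.0, 20.0], [10.0, 50.0]])
keypoints_image = patch_to_image(keypoints_patch, A, q, patch_center)
```

Mapping the recovered image coordinates forward again reproduces the patch coordinates, which is a simple consistency check on the two functions.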

4. Experiments

In this experiment, FAIF is compared with SIFT and MSER in feature matching performance. Three groups of images have been used in our experiment. We call them Graffiti [4], Magazine T0 and Magazine T2 [6], respectively. The Graffiti group includes the standard test images provided by Mikolajczyk. The other groups are pictures of a magazine taken from different viewpoints with varying tilts and transition tilts [6]. Tilt is the consequence of a change in the latitude view angle. When both the latitude and longitude angles change, a transition tilt is produced. Images in group Magazine T2 have transition tilts. An image in Magazine T2 has greater affine deformation than an image in Magazine T0 if they have the same longitude angle.

Figures 5 to 7 illustrate the matching results. In each figure, from top to bottom, are the matching results of SIFT, MSER and FAIF. In each picture, correspondences are connected by white segments. Table 1 summarizes the number of matches for each algorithm.

Figure 5. Matches of Graffiti by SIFT (top), MSER (middle) and FAIF (bottom)

Figure 6. Matches of Magazine T0 by SIFT (top), MSER (middle) and FAIF (bottom)

Figure 7. Matches of Magazine T2 by SIFT (top), MSER (middle) and FAIF (bottom)

Table 1. Number of matches of FAIF, MSER and SIFT

              FAIF   MSER   SIFT
Graffiti      5000     66      0
Magazine T0   2450     31      0
Magazine T2   8283     66      1

Experimental results show that FAIF yields far more matches than MSER and SIFT in all three image groups (Table 1). As shown in Figure 6, FAIF works well up to 80 degrees, and it would be unrealistic to insist on bigger angles, because the observed surface in general becomes reflective and the resulting photo is totally different from the frontal view under such a large view angle change. The matching results in Table 1 indicate that FAIF is fully affine invariant.

SIFT fails completely in this experiment. Although the images in Magazine T2 are very clear, SIFT finds only one match (Table 1). One reason for this is that SIFT extracts features in scale space, whereas affine Gaussian scale-space should be used to cope with affine deformations caused by changes in view angle.

In many region detector comparison experiments, MSER has shown better performance than other detectors. This conclusion is supported by this experiment: MSER obtains a certain number of matches even under very large view angles. The reason is that MSER involves no Gaussian smoothing. Therefore, MSER does not suffer from the problem of filters being incompatible with image structures.

5. Conclusion

In this paper, we propose a fully affine invariant feature detector. FAIF uses the covariance matrix of an MSER to transform an anisotropic region into an isotropic region. Fully affine invariant features are detected on the isotropic patch. In this manner, we can detect fully affine invariant features based on affine Gaussian scale-space. Experimental results show that FAIF is fully affine invariant and yields enough matches to meet the requirements of robust image matching. In future work, we intend to investigate new methods for transforming affine Gaussian scale-space into scale space.

References

[1] J. Matas, O. Chum, M. Urban and T. Pajdla. Robust

wide baseline stereo from maximally stable extremal

regions. Image and vision computing, 22(10): 761-767,

2004.

[2] D.G. Lowe. Distinctive image features from

scale-invariant keypoints. International journal of

computer vision, 60(2): 91-110, 2004.

[3] T. Tuytelaars and K. Mikolajczyk. Local invariant

feature detectors: a survey. Foundations and Trends in

Computer Graphics and Vision, 3(3): 177-280, 2008.

[4] K. Mikolajczyk, et al. A comparison of affine region

detectors. International journal of computer vision,

65(1): 43-72, 2005.

[5] J.M. Morel and G. Yu. Is SIFT scale invariant? Inverse

Problems and Imaging, 5(1): 115-136, 2011.

[6] J.M. Morel and G. Yu. ASIFT, A new framework for

fully affine invariant image comparison. SIAM Journal

on Imaging Sciences, 2(2):438-469, 2009.

[7] T. Lindeberg and J. Garding. Shape-adapted smoothing

in estimation of 3-D shape cues from affine

deformations of local 2-D brightness structure. Image

and vision computing, 15(6): 415-434, 1997.

[8] J. Garding and T. Lindeberg. Direct computation of

shape cues using scale-adapted spatial derivative

operators. International journal of computer vision, 17(2): 163-191, 1996.

[9] K. Mikolajczyk and C. Schmid. A performance

evaluation of local descriptors. IEEE Transactions on

Pattern Analysis and Machine Intelligence, 27(10):

1615-1630, 2005.

[10] M. A. Fischler and R. C. Bolles. Random sample

consensus: a paradigm for model fitting with

applications to image analysis and automated

cartography. Communications of the ACM, 24(6):

381-395, 1981.
