Analysis of PCA-Based and Fisher Discriminant-Based Image Recognition Algorithms

Computer Science Technical Report

ANALYSIS OF PCA-BASED AND FISHER DISCRIMINANT-BASED IMAGE RECOGNITION ALGORITHMS

Wendy S. Yambor
July 2000
Technical Report CS-00-103

Computer Science Department
Colorado State University
Fort Collins, CO 80523-1873
Phone: (970) 491-5792
Fax: (970) 491-2466
WWW: http://www.cs.colostate.edu

THESIS

ANALYSIS OF PCA-BASED AND FISHER DISCRIMINANT-BASED IMAGE RECOGNITION ALGORITHMS

Submitted by Wendy S. Yambor Department of Computer Science

In Partial Fulfillment of the Requirements For the Degree of Master of Science Colorado State University Fort Collins, Colorado Summer 2000

COLORADO STATE UNIVERSITY

July 6, 2000 WE HEREBY RECOMMEND THAT THE THESIS PREPARED UNDER OUR SUPERVISION BY WENDY S. YAMBOR ENTITLED ANALYSIS OF PCA-BASED AND FISHER DISCRIMINANT-BASED IMAGE RECOGNITION ALGORITHMS BE ACCEPTED AS FULFILLING IN PART REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE.

Committee on Graduate Work

Advisor Co-Advisor Department Head


ABSTRACT OF THESIS

ANALYSIS OF PCA-BASED AND FISHER DISCRIMINANT-BASED IMAGE RECOGNITION ALGORITHMS

One method of identifying images is to measure the similarity between images. This is accomplished by using measures such as the L1 norm, L2 norm, covariance, Mahalanobis distance, and correlation. These similarity measures can be calculated on the images in their original space or on the images projected into a new space. I discuss two alternative spaces in which these similarity measures may be calculated: the subspace created by the eigenvectors of the covariance matrix of the training data and the subspace created by the Fisher basis vectors of the data. Variations of these spaces are discussed, as well as the behavior of similarity measures within these spaces. Experiments are presented comparing recognition rates for different similarity measures and spaces using hand-labeled imagery from two domains: human face recognition and classifying an image as a cat or a dog.

Wendy S. Yambor
Computer Science Department
Colorado State University
Fort Collins, CO 80523
Summer 2000


Acknowledgments

I thank my committee, Ross Beveridge, Bruce Draper, Michael Kirby, and Adele Howe, for their support and knowledge over the past two years. Every member of my committee has been involved in some aspect of this thesis. It is through their interest and persuasion that I gained knowledge in this field.

I thank Jonathon Phillips for providing me with the results and images from the FERET evaluation. Furthermore, I thank Jonathon for patiently answering numerous questions.


Table of Contents

1. Introduction
1.1 Previous Work
1.2 A General Algorithm
1.3 Why Study These Subspaces?
1.4 Organization of Following Sections
2. Eigenspace Projection
2.1 Recognizing Images Using Eigenspace, Tutorial on Original Method
2.2 Tutorial for Snapshot Method of Eigenspace Projection
2.3 Variations
3. Fisher Discriminants
3.1 Fisher Discriminants Tutorial (Original Method)
3.2 Fisher Discriminants Tutorial (Orthonormal Basis Method)
4. Variations
4.1 Eigenvector Selection
4.2 Ordering Eigenvectors by Like-Image Difference
4.3 Similarity & Distance Measures
4.4 Are Similarity Measures the Same Inside and Outside of Eigenspace?
5. Experiments
5.1 Datasets
5.1.1 The Cat & Dog Dataset
5.1.2 The FERET Dataset
5.1.3 The Restructured FERET Dataset
5.2 Bagging and Combining Similarity Measures
5.2.1 Adding Distance Measures
5.2.2 Distance Measure Aggregation
5.2.3 Correlating Distance Metrics
5.3 Like-Image Difference on the FERET Dataset
5.4 Cat & Dog Experiments
5.5 FERET Experiments
6. Conclusion
6.1 Experiment Summary
6.2 Future Work
Appendix I
References


1. Introduction

Two image recognition systems are examined: eigenspace projection and Fisher discriminants. Each of these systems examines images in a subspace. The eigenvectors of the covariance matrix of the training data create the eigenspace. The basis vectors calculated by Fisher discriminants create the Fisher discriminant subspace. Variations of these subspaces are examined. The first variation is the selection of vectors used to create the subspaces. The second variation is the measure used to calculate the difference between images projected into these subspaces. Experiments are performed to test hypotheses regarding the relative performance of the subspaces and difference measures.

Neither eigenspace projection nor Fisher discriminants are new ideas. Both have been examined by researchers for many years. It is the work of these researchers that has helped to revolutionize image recognition and bring face recognition to the point where it is now usable in industry.

1.1 Previous Work Projecting images into eigenspace is a standard procedure for many appearance-based object recognition algorithms. A basic explanation of eigenspace projection is provided by [20]. Michael Kirby was the first to introduce the idea of the low-dimensional characterization of faces. Examples of his use of eigenspace projection can be found in [7,8,16]. Turk & Pentland worked with eigenspace projection for face recognition [21].


More recently, Shree Nayar used eigenspace projection to identify objects, using a turntable to view objects at different angles, as explained in [11].

R.A. Fisher developed Fisher's linear discriminant in the 1930s [5]. Not until recently have Fisher discriminants been utilized for object recognition. An explanation of Fisher discriminants can be found in [4]. Swets and Weng used Fisher discriminants to cluster images for the purpose of identification in 1996 [18,19,23]. Belhumeur, Hespanha, and Kriegman also used Fisher discriminants to identify faces by training and testing with several faces under different lighting [1].

1.2 A General Algorithm

An image may be viewed as a vector of pixels, where the value of each entry in the vector is the grayscale value of the corresponding pixel. For example, an 8x8 image may be unwrapped and treated as a vector of length 64. The image is said to sit in N-dimensional space, where N is the number of pixels (and the length of the vector). This vector representation of the image is considered to be the original space of the image.
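As a concrete illustration, a minimal NumPy sketch of this unwrapping is shown below; the 8x8 array `img` is a hypothetical stand-in for a real grayscale image.

```python
import numpy as np

# Hypothetical 8x8 grayscale image with pixel values in [0, 255].
img = np.random.randint(0, 256, size=(8, 8))

# Unwrap (flatten) the image row by row into a vector of length N = 64.
# This vector is the image's representation in its original space.
x = img.flatten()

print(img.shape)  # (8, 8)
print(x.shape)    # (64,)
```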

The original space of an image is just one of infinitely many spaces in which the image can be examined. Two specific subspaces are the subspace created by the eigenvectors of the covariance matrix of the training data and the subspace defined by the basis vectors calculated by Fisher discriminants. The majority of subspaces, including eigenspace, do not optimize discrimination characteristics. Eigenspace optimizes variance among the images. The exception to this statement is Fisher discriminants, which does optimize discrimination characteristics.

Although some of the details may vary, there is a basic algorithm for identifying images by projecting them into a subspace. First, one selects a subspace onto which to project the images. Once this subspace is selected, all training images are projected into it. Next, each test image is projected into this subspace. Each test image is then compared to all the training images using a similarity or distance measure; the training image found to be most similar or closest to the test image is used to identify the test image.
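A minimal sketch of this basic algorithm is given below, assuming the subspace basis vectors have already been computed and that the L2 (Euclidean) norm is used as the distance measure; the function and variable names (`identify`, `basis`, `mean`, etc.) are illustrative only.

```python
import numpy as np

def identify(test_images, train_images, train_labels, basis, mean):
    """Label each test image with the label of its closest training image.

    test_images, train_images: arrays of shape (num_images, N), one
    flattened image per row; basis: (N, k) matrix whose columns span the
    chosen subspace; mean: length-N mean image used for centering.
    """
    # Project all training images and all test images into the subspace.
    train_proj = (train_images - mean) @ basis
    test_proj = (test_images - mean) @ basis

    labels = []
    for t in test_proj:
        # Distance from this test image to every projected training image.
        dists = np.linalg.norm(train_proj - t, axis=1)
        # The closest training image identifies the test image.
        labels.append(train_labels[np.argmin(dists)])
    return labels
```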

1.3 Why Study These Subspaces?

Projecting images into subspaces has been studied for many years, as discussed in the previous work section. The research into these subspaces has helped to revolutionize image recognition algorithms, specifically face recognition. When studying these subspaces an interesting question arises: under what conditions does projecting an image into a subspace improve performance? The answer to this question is not an easy one. Which specific subspace (if any at all) improves performance depends on the specific problem. Furthermore, variations within the subspace also affect performance. For example, the selection of vectors used to create the subspace and the measure used to decide which images are the closest match both affect performance.


1.4 Organization of Following Sections

I discuss two alternative spaces commonly used to identify images. In chapter 2, I discuss eigenspaces. Eigenspace projection, also known as Karhunen-Loeve (KL) and Principal Component Analysis (PCA), projects images into a subspace such that the first orthogonal dimension of this subspace captures the greatest amount of variance among the images and the last dimension of this subspace captures the least amount of variance among the images. Two methods of creating an eigenspace are examined: the original method and a method designed for high-resolution images known as the snapshot method. In chapter 3, I discuss Fisher discriminants. Fisher discriminants project images such that images of the same class are close to each other while images of different classes are far apart. Two methods of calculating Fisher discriminants are examined: the original method and a method that first projects the images into an orthonormal basis defining a subspace spanned by the training set.

Once images are projected into one of these spaces, a similarity measure is used to decide which images are the closest matches. Chapter 4 discusses variations of these two methods, such as methods of selecting specific eigenvectors to create the subspace and similarity measures. In chapter 5, I discuss experiments performed on both of these methods using two datasets. The first dataset is the Cat & Dog dataset, which was developed at Colorado State University. The second dataset is the FERET dataset, which was made available to me by Jonathon Phillips at the National Institute of Standards and Technology [10,12,13].


2. Eigenspace Projection

Eigenspace is calculated by identifying the eigenvectors of the covariance matrix derived from a set of training images. The eigenvectors corresponding to non-zero eigenvalues of the covariance matrix form an orthonormal basis that rotates and/or reflects the images in the N-dimensional space. Specifically, each image is stored in a vector of size N.

$$x^i = \left[ x_1^i \; \ldots \; x_N^i \right]^T \qquad (1)$$

The images are mean centered by subtracting the mean image from each image vector.

$$\bar{x}^i = x^i - m, \quad \text{where} \quad m = \frac{1}{P} \sum_{i=1}^{P} x^i \qquad (2)$$

These vectors are combined, side by side, to create a data matrix of size N x P (where P is the number of images).

$$X = \left[ \bar{x}^1 \mid \bar{x}^2 \mid \ldots \mid \bar{x}^P \right] \qquad (3)$$

The data matrix X is multiplied by its transpose to calculate the covariance matrix.

$$\Omega = X X^T \qquad (4)$$
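A minimal NumPy sketch of equations (1) through (4) is given below, assuming a small set of training images that have already been unwrapped into column vectors; the random data and the variable names (`images`, `m`, `X`, `omega`) are illustrative only.

```python
import numpy as np

# P hypothetical training images, each unwrapped into a vector of length N.
# Each column of `images` is one image vector x^i, as in equation (1).
P, N = 10, 64
images = np.random.rand(N, P)

# Mean image and mean-centered image vectors, equation (2).
m = images.mean(axis=1, keepdims=True)
X = images - m                      # data matrix of size N x P, equation (3)

# Covariance matrix, equation (4).
omega = X @ X.T                     # size N x N

# Eigenvalues/eigenvectors of the symmetric covariance matrix.
eigvals, eigvecs = np.linalg.eigh(omega)

# Only a handful of eigenvalues (at most P) are numerically non-zero
# when P is much smaller than N.
print(np.sum(eigvals > 1e-10))
```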

This covariance matrix has up to P eigenvectors associated with non-zero eigenvalues, assuming P