Face Recognition
A Project Report
On
Face Recognition

Submitted in partial fulfillment of the requirements
For the award of the degree of
Bachelor of Technology
In
Computer Science and Engineering

By
Ramesh Kumar Verma
Roll No. 1309710918
Semester VII

Under the Supervision of
Mr. Lucknesh Kumar

Galgotias College of Engineering & Technology
Greater Noida 201306
Affiliated to
Uttar Pradesh Technical University
Lucknow
GALGOTIAS COLLEGE OF ENGINEERING & TECHNOLOGY
GREATER NOIDA - 201306, UTTAR PRADESH, INDIA.
CERTIFICATE
This is to certify that the project report entitled "Face Recognition" submitted by
Ramesh Kumar Verma, Deepak Kumar and Arun Kumar to the Dr. A.P.J. Abdul
Kalam Technical University, Uttar Pradesh, in partial fulfillment for the award
of the Degree of Bachelor of Technology in Computer Science & Engineering is a
bonafide record of the project work carried out by them under my supervision during
the year 2015-2016.
Dr. Bhawna Mallick
Professor and Head
Department of CSE

Mr. Lucknesh Kumar
Designation
Department of CSE
ACKNOWLEDGEMENT
The preparation of this project would not have been possible without the substantial support of a few people, whose names deserve mention.
It is with great pleasure that we present our project report on "Human Face Detection and Recognition" to the Department of Computer Science and Engineering at Galgotias College of Engineering & Technology, Greater Noida.
I am grateful to Galgotias College of Engineering & Technology for permitting me to undertake a research project on "Human Face Detection and Recognition". While developing this project I have learnt a lot; it will be an unforgettable experience. I faced many difficulties while developing it, but with the help of some special people I gained confidence and developed the project well.
On the submission of our thesis report on "Human Face Detection and Recognition", we would like to extend our gratitude and sincere thanks to our supervisor Mr. Lucknesh Kumar, Department of Computer Science and Engineering, for his constant motivation and support during the course of our work over the last year. We truly appreciate and value his esteemed guidance and encouragement from the beginning to the end of this thesis. We are indebted to him for having helped us shape the problem and for providing insights towards the solution.
I would like to thank all the staff of Galgotias College of Engineering & Technology, Greater Noida for their support and for making my training valuable.
Ramesh Kumar Verma (Roll No. 1309710918)
Deepak Kumar (Roll No. 1209710033)
Arun Kumar (Roll No. 1209710024)
ABSTRACT
Keywords: PCA (Principal Component Analysis), MPCA, LDA, Eigenface and OpenCV.
Human face detection and recognition play important roles in many applications such as video surveillance and face image database management. In our project, we have studied both face recognition and detection techniques and developed algorithms for them. For face recognition, the algorithms used are PCA (Principal Component Analysis), MPCA (Multilinear Principal Component Analysis) and LDA (Linear Discriminant Analysis), in which an unknown test image is recognized by comparing it with the known training images stored in the database, and information about the recognized person is returned. These techniques work well under robust conditions such as complex backgrounds and different face positions, and, as experimentally observed, they give different rates of accuracy under different conditions. For face detection, we have developed an algorithm that can detect human faces in an image, using skin color as a tool for detection. This technique works well for Indian faces, whose complexion varies within a certain range. We have taken real-life examples and simulated the algorithms in C# (.NET) successfully.
LIST OF FIGURES
Figure  Title
3.1  Lagrangian Droplet Motion
4.1  Vertical Manifold
4.2  20° Bend Manifold
4.3  90° Bend Manifold
4.4  Spiral Manifold
4.5  Spiral Manifold Configuration (θ = 225°)
4.6  Spiral Manifold with Different Flow Entry Angles (20°, 32.5° and 45°)
4.7  Helical Manifold (Helical Angles 30°, 35°, 40°, 45° and 50°)
4.8  Spiral Manifold
CONTENTS
CERTIFICATE
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF FIGURES
CHAPTER 1 INTRODUCTION
1.1 Introduction
1.2 Face Recognition
CHAPTER 2 TOOLS/ENVIRONMENT
2.1 Software Requirements
2.2 Hardware Requirements
CHAPTER 3 ANALYSIS
3.1 Modules
3.2 Flow Diagram
CHAPTER 4 DESIGN
4.1 Mathematical Background
4.2 PCA Algorithm
CHAPTER 5
5. Conclusion
References
CHAPTER 1
1.1 INTRODUCTION
Humans are very good at recognizing faces and complex patterns. Even the passage of time
does not affect this capability, and therefore it would help if computers became as robust as
humans in face recognition. A face recognition system can help in many ways:
* Checking for criminal records.
* Enhancing security by using surveillance cameras in conjunction with a face recognition system.
* Finding lost children by using the images received from cameras fitted in public places.
* Knowing in advance if a VIP is entering a hotel.
* Detecting a criminal in a public place.
* Comparing an entity with a set of entities in different areas of science.
* Pattern recognition.
This project is a step towards developing a face recognition system that can recognize
static images. It can be modified to work with dynamic images; in that case the dynamic
images received from the camera would first be converted into static ones, and then the
same procedure applied to them. But then several other factors must be considered,
such as the distance between the camera and the person, the magnification factor, and
the view [top, side, front].
1.2 FACE RECOGNITION
The face recognition algorithms used here are Principal Component Analysis (PCA), Multilinear Principal Component Analysis (MPCA) and Linear Discriminant Analysis (LDA). Every algorithm has its own advantages. While PCA is the simplest and fastest algorithm, MPCA and LDA, which have been applied together as a single algorithm named MPCA-LDA, provide better results under complex circumstances such as varying face position and luminance. Each of them is discussed below.
2.1 PRINCIPAL COMPONENT ANALYSIS (PCA)
Principal component analysis (PCA) was invented in 1901 by Karl Pearson. PCA involves a mathematical procedure that transforms a number of possibly correlated variables into a number of uncorrelated variables called principal components, related to the original variables by an orthogonal transformation. This transformation is defined in such a way that the first principal component has as high a variance as possible (that is, accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it be orthogonal to the preceding components. PCA is sensitive to the relative scaling of the original variables. Depending on the field of application, it is also named the discrete Karhunen–Loève transform (KLT), the Hotelling transform or proper orthogonal decomposition (POD).
The major advantage of PCA is that the eigenface approach helps reduce the size of the database required for recognition of a test image. The trained images are not stored as raw images; rather, they are stored as their weights, which are found by projecting each trained image onto the set of eigenfaces obtained.
2.1.1 The eigenface approach
In the language of information theory, the relevant information in a face needs to be extracted and encoded efficiently, and one face encoding is compared with a similarly encoded database. The trick behind extracting such information is to capture as many variations as possible from the set of training images. Mathematically, the principal components of the distribution of faces are found using the eigenface approach. First the eigenvectors of the covariance matrix of the set of face images are found, and then they are sorted according to their corresponding eigenvalues. A threshold eigenvalue is then chosen, and eigenvectors with eigenvalues less than that threshold are discarded, so that ultimately the eigenvectors with the most significant eigenvalues are selected. The set of face images is then projected onto the significant eigenvectors to obtain a set called eigenfaces. Every face has a contribution to the eigenfaces obtained. The span of the best M eigenfaces, an M-dimensional subspace, is called the "face space". Each individual face can be represented exactly as a linear combination of the eigenfaces, or approximated using only the eigenfaces with the most significant eigenvalues.
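The eigenface pipeline just described can be sketched as follows. The project itself is implemented in C#; this NumPy version, with random stand-in images and an arbitrarily chosen number of kept eigenfaces, is only illustrative:

```python
import numpy as np

# Hypothetical tiny training set: M "images" of d pixels each (flattened).
# Real face images would be, e.g., 256*256 pixels; 8 images of 16 pixels
# keep the sketch readable.
rng = np.random.default_rng(0)
M, d = 8, 16
images = rng.random((M, d))

# Mean face and deviation (difference) images.
mean_face = images.mean(axis=0)
A = (images - mean_face).T            # d x M matrix of deviation images

# Work with the small M x M matrix A^T A instead of the huge d x d covariance.
L = A.T @ A
eigvals, V = np.linalg.eigh(L)        # eigh: L is symmetric

# Sort by descending eigenvalue and keep only the most significant ones.
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]
k = 4                                  # threshold choice: keep the k largest
U = A @ V[:, :k]                       # eigenfaces: columns span the "face space"
U /= np.linalg.norm(U, axis=0)         # normalise each eigenface

# Each training image is stored only as its weight vector in face space.
weights = U.T @ A                      # k x M
print(weights.shape)                   # (4, 8)
```

This is the storage saving mentioned above: instead of d pixel values per image, only k weights per image are kept.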
CHAPTER 2
Tools/Environment Used
Software Requirements:
Operating System : Windows operating system
Language : C#
Hardware Requirements:
Processor : Pentium processor of 400MHz or higher.
RAM : Minimum 64MB primary memory.
Hard disk : Minimum 1GB hard disk space.
Monitor : Preferably color monitor (16 bit color) and above.
Web camera.
Compact Disk drive.
A keyboard and a mouse.
CHAPTER 3
Analysis
3.1 Modules
Add Image/Registration
Image Capture
Login
Eigen face Computation
Identification
A module is a small part of our project and plays a very important role in the project
and in coding. In software engineering terms we treat it as a small part of a system,
whereas in our programming language it is a small part of the program, which in some
cases we also call a function, and which constitutes the main program.
The importance of modules in software development is that we can easily understand
the system we are developing and what its main uses are. During the project we may
create many modules, and finally we combine them to form a system.
Module Description
Add Image/Registration
Add Image is a module concerned with adding an image along with the user ID used for
login by the person whose image we are taking. In this module we add an image by
capturing it from the web camera and store it in our system. During registration four
images are captured. Each image is stored four times, as a minimum of sixteen images
is required by the comparison algorithm.
Image Capture Module
This module is used to capture an image using the web camera. It is written as a
separate thread to avoid the system hanging. It is used to capture images in both the
login module and the registration module.
Login
This module's function is to compare the captured image with the images stored in the
system. It uses the eigenface computation defined in the next module for comparison.
Eigenface Computation
This module is used to compute the "face space" used for face recognition. The
recognition is actually carried out in the FaceBundle object, but preparing that
object requires a lot of computation. The steps are:
* Compute an average face.
* Build a covariance matrix.
* Compute eigenvalues and eigenvectors.
* Select only the sixteen largest eigenvalues (and their corresponding eigenvectors).
* Compute the eigenfaces using our eigenvectors.
* Compute the eigenspace for our given images.
Identification
This module takes the image from the above module and compares it with, or searches
among, the images already in the database. If an image is matched, a success message
is shown to the user.
3.2 Flow Diagram

Registration:
[Flow diagram: the user creates a username and registers an image; the request is processed and, on success (True), the username and image are saved for future use; otherwise (False) registration is declined.]

Login:
[Flow diagram: the user enters a username and an image as password; the request is processed and, after identifying the image, authentication is either provided or declined.]

Stages of face recognition:
[Flow diagram: start, capture image and enter login ID, then either register (store the captured image) or log in (compare the captured image against stored images, showing a success or failure message). Recognition proceeds through face location detection, feature extraction, and facial image classification.]
CHAPTER 4
Design
5.1 Mathematical Background
This section illustrates the mathematical concepts that are the backbone of Principal
Component Analysis. It is less important to remember the exact mechanics of the
mathematical techniques than it is to understand the intuition behind them. The topics
are covered independently of each other and examples are given.
Variance, covariance, the covariance matrix, and eigenvectors and eigenvalues are the
basis of the design algorithm.
a. Variance
The variance is a measure of the spread of data. Statisticians are usually concerned with
taking a sample of a population. To use election polls as an example, the population is all
the people in the country, whereas a sample is a subset of the population that the
statisticians measure. The great thing about statistics is that by measuring only a sample
of the population, we can work out what the measurement would most likely be if we used
the entire population.
Let's take an example:
X = [1 2 4 6 12 25 45 68 67 65 98]
We could simply use the symbol X to refer to this entire set of numbers. To refer to
an individual number in this data set, we will use a subscript on the symbol X to indicate a
specific number. There are a number of things that we can calculate about a data set. For
example, we can calculate the mean of the sample, given by the formula:
Mean = sum of all numbers / total number of numbers
Unfortunately, the mean doesn't tell us a lot about the data except for a sort of middle
point. For example, these two data sets have exactly the same mean (10), but are
obviously quite different: [0 8 12 20] and [8 9 11 12]. So what is different about these two
sets? It is the spread of the data. The variance is a measure of how spread
out data is, and it is closely related to the standard deviation (SD).
The SD is "the average distance from the mean of the data set to a point". The way to
calculate it is to compute the squares of the distances from each data point to the mean of
the set, add them all up, divide by n-1, and take the positive square root. As a formula:
SD = sqrt( sum of (Xi - mean)^2 / (n - 1) )
The variance is simply the square of the standard deviation.
b. Covariance
Variance and SD are purely 1-dimensional. Data sets like this could be: the heights of all the
people in the room, the marks for the last CSC378 exam, etc. However, many data sets have
more than one dimension, and the aim of the statistical analysis of these data sets is
usually to see if there is any relationship between the dimensions. For example, we might
have as our data set both the heights of all the students in a class and the marks they
received for a paper. We could then perform statistical analysis to see if the height of a
student has any effect on their mark. It is useful to have a measure of how much
the dimensions vary from the mean with respect to each other.
Covariance is such a measure. It is always measured between 2 dimensions. If we
calculate the covariance between one dimension and itself, we get the variance. So if we
had a three-dimensional data set (x, y, z), then we could measure the covariance between
the x and y dimensions, the x and z dimensions, and the y and z dimensions. Measuring
the covariance between x and x, or y and y, or z and z would give us the variance of the
x, y and z dimensions respectively.
The formula for covariance is very similar to the formula for variance:
cov(X, Y) = sum of (Xi - mean of X)(Yi - mean of Y) / (n - 1)
How does this work? Let's use some example data. Imagine we have gone out into the world
and collected some 2-dimensional data: say we have asked a bunch of students how many
hours in total they spent studying CSC309, and the mark they received. So we have two
dimensions: the first is the H dimension, the hours studied, and the second is the
M dimension, the mark received.
So what does the covariance between H and M tell us? The exact value is not as
important as its sign (i.e., positive or negative). If the value is positive, that indicates
that both dimensions increase together, meaning that, in general, as the number of hours
of study increased, so did the final mark.
If the value is negative, then as one dimension increases the other decreases. If we had
ended up with a negative covariance here, it would mean the opposite: as the number of
hours of study increased, the final mark decreased.
In the last case, if the covariance is zero, it indicates that the two dimensions are
independent of each other.
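The sign test described above can be sketched with hypothetical hours-studied (H) and marks (M) data; the numbers below are invented for illustration:

```python
# Hypothetical study hours (H) and marks (M) for ten students.
H = [9, 15, 25, 14, 10, 18, 0, 16, 5, 19]
M = [39, 56, 93, 61, 50, 75, 32, 85, 42, 70]

def covariance(xs, ys):
    # cov(X, Y) = sum of (Xi - mean X)(Yi - mean Y) / (n - 1)
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

print(covariance(H, M))   # positive: marks rise with hours studied
print(covariance(H, H))   # cov of a dimension with itself is its variance
```

Swapping the arguments gives the same result, since cov(a, b) = cov(b, a).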
c. The covariance Matrix
A useful way to get all the possible covariance values between all the different
dimensions is to calculate them all and put them in a matrix. As an example, we will make
up the covariance matrix for an imaginary 3-dimensional data set, using the usual
dimensions x, y and z. The covariance matrix then has 3 rows and 3 columns, and the
values are:
cov(x,x) cov(x,y) cov(x,z)
C = cov(y,x) cov(y,y) cov(y,z)
cov(z,x) cov(z,y) cov(z,z)
Point to note: Down the main diagonal, we see that the covariance value is between one
of the dimensions and itself. These are the variances for that dimension. The other point
is that since cov(a,b) = cov(b,a), the matrix is symmetrical about the main diagonal.
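Both points to note can be checked numerically; this NumPy sketch uses an invented 3-dimensional data set (with `rowvar=False`, `np.cov` treats each column as a dimension):

```python
import numpy as np

# Hypothetical 3-dimensional data set: rows are observations of (x, y, z).
data = np.array([
    [2.5, 2.4, 1.0],
    [0.5, 0.7, 3.1],
    [2.2, 2.9, 0.9],
    [1.9, 2.2, 1.5],
    [3.1, 3.0, 0.4],
])

C = np.cov(data, rowvar=False)   # 3 x 3 covariance matrix, n-1 normalisation

print(C.shape)                                             # (3, 3)
print(np.allclose(C, C.T))                                 # symmetric: cov(a,b) == cov(b,a)
print(np.allclose(np.diag(C), data.var(axis=0, ddof=1)))   # diagonal entries are the variances
```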
d. Eigenvectors and Eigenvalues
If we multiply a square matrix with any other vector then we will get another vector that
is transformed from its original position. It is the nature of the transformation that the
eigenvectors arise from. Imagine a transformation matrix that, when multiplied on the
left, reflected vectors in the line y=x. Then we can see that if there were a vector that lay
on the line y=x, it is reflection of itself. This vector (and all multiples of it, because it
wouldn't matter how long the vector was), would be an eigenvector of that transformation
matrix. Eigenvectors can only be found for square matrices. And not every square matrix
has eigenvectors. And given an n x n matrix that does have eigenvectors, there are n of
them. Another property of eigenvectors is that even if we scale the vector by some
amount before we multiply it, we will still get the same multiple of it as a result. This is
because if we scale a vector by some amount, all we are doing is making it longer,
17
Lastly, all the eigenvectors of a symmetric matrix (such as a covariance matrix) are
perpendicular, i.e. at right angles to each other, no matter how many dimensions we have.
By the way, another word for perpendicular, in math talk, is orthogonal. This is important
because it means that we can express the data in terms of these perpendicular
eigenvectors, instead of expressing it in terms of the x and y axes. Every eigenvector has
a value associated with it, called its eigenvalue. Principal eigenvectors are those with
the highest eigenvalues associated with them.
5.2 PCA Algorithm
a. Eigenfaces Approach
Extract the relevant information in a face image [the principal components] and encode it
in a suitable data structure. For recognition, take the sample image, encode it in the
same way, and compare it with the set of encoded images. In mathematical terms, we want
to find the eigenvectors and eigenvalues of the covariance matrix of the images, where one
image is just a single point in a high-dimensional space [n * n], where n * n are the
dimensions of an image. There can be many eigenvectors for a covariance matrix, but very
few of them are the principal ones. Though each eigenvector can be used for finding a
different amount of variation among the face images, we are only interested in the
principal eigenvectors, because these account for substantial variations among a bunch of
images; they show the most significant relationships between the data dimensions.
The eigenvectors with the highest eigenvalues are the principal components of the image
set. We may lose some information if we ignore the components of lesser significance, but
if the corresponding eigenvalues are small, we won't lose much. Using this set of
eigenvectors we can construct eigenfaces.
b. Finding Eigenfaces
(1) Collect a bunch [say 15] of sample face images. The dimensions of all images should
be the same. An image can be stored in an array of n*n dimensions, which can be
considered an image vector:
{ Γi | i = 1, 2, ..., M }
where M is the number of images.
(2) Find the average image of the bunch of images:
ψ = (1/M) Σi=1..M Γi    (1)
(3) Find the deviation images [img1 - avg, img2 - avg, ..., imgM - avg]:
Φi = Γi - ψ,  i = 1, ..., M.    (2)
(4) Calculate the covariance matrix:
C = A Aᵀ,    (3)
C = [ c(1,1)  c(1,2)  ⋯  c(1,d)
        ⋮                ⋱       ⋮
      c(d,1)  c(d,2)  ⋯  c(d,d) ]
where A = [Φ1 Φ2 ... ΦM] is the matrix whose columns are the deviation images and d is the number of pixels in an image.
But the problem with this approach is that we may not be able to complete this operation
for a bunch of images, because the covariance matrix will be very large. For example, the
covariance matrix for images of dimension 256 * 256 will consist of 65536 (256 * 256)
rows and the same number of columns. So it is very hard, or may be practically
impossible, to store that matrix, and computing it would require considerable
computational resources.
So for solving this problem we can first compute the matrix L:
L = AᵀA    (4)
and then find its eigenvectors
vi (i = 1, ..., M).
The eigenvectors of the covariance matrix C can then be found as
ui = A vi,
where ui (i = 1, ..., M) are the eigenvectors of C.
(5) Using these eigenvectors, we can construct the eigenfaces. But we are interested in the
eigenvectors with high eigenvalues, so eigenvectors with eigenvalues less than a threshold
can be dropped. We will keep only those images which correspond to the highest
eigenvalues. This set of images is called the face space. For doing this in Java, we have
used the Colt algebra package. These are the steps involved in the implementation:
i) Find L = AᵀA [from (4)] and convert it into a DenseDoubleMatrix2D using the Colt
matrix classes.
ii) Find the eigenvectors associated with it using the class
cern.colt.matrix.linalg.EigenvalueDecomposition.
This will be an M by M [M = number of training images] matrix.
iii) By multiplying that with A [the difference-image matrix] we obtain the actual
eigenvector matrix [U] of the covariance of A. It will be of size X by M [where X is the
total number of pixels in an image].
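The L = AᵀA trick in steps i)-iii) can be checked numerically. The sketch below uses NumPy rather than the Colt package, with random data standing in for the difference-image matrix; it verifies that u = A v is indeed an eigenvector of the full covariance matrix A Aᵀ:

```python
import numpy as np

# Hypothetical deviation-image matrix A: d pixels x M images, with d >> M.
rng = np.random.default_rng(1)
d, M = 1024, 6
A = rng.random((d, M)) - 0.5

# Small matrix L = A^T A (M x M) instead of the d x d covariance A A^T.
L = A.T @ A
lam, V = np.linalg.eigh(L)

# u_i = A v_i are eigenvectors of A A^T with the same eigenvalues:
# (A A^T)(A v) = A (A^T A v) = A (lambda v) = lambda (A v).
U = A @ V                               # d x M eigenvector matrix of A A^T

# Check the top eigenvector against the full covariance matrix.
C = A @ A.T
print(np.allclose(C @ U[:, -1], lam[-1] * U[:, -1]))   # True
```

So only an M x M eigendecomposition is ever needed, which is what makes the eigenface computation tractable.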
c. Classifying Face Images
The eigenfaces derived in the previous section seem adequate for describing face
images under very controlled conditions, so we decided to investigate their usefulness as
a tool for face recognition. Since accurate reconstruction of the image is not a
requirement, a smaller number of eigenfaces is sufficient for the identification process.
So identification becomes a pattern recognition task.
Algorithm:
1. Convert the test image into a matrix so that all its pixels are stored in a
column vector of 256*256 [rows] by 1 [column].
2. Find the weights associated with each training image. This operation can simply be
performed by:
Weight Matrix = TransposeOf(EigenVector-of-CovarianceMatrix) * DifferenceImageMatrix.
This matrix will be of size N by N, where N is the total number of face images. Each
entry in a column then represents the weight of that particular image with respect to a
particular eigenvector.
3. Project the test image into the "face space" by the same operation as above. Since
here we are projecting a single image, we will get a matrix of size N [rows] by 1
[column]; let's call this the TestProjection matrix:
ωk = ukᵀ (Γ - ψ),  for k = 1, 2, ..., N,
where N is the total number of training images.
4. Find the distance between each element of the TestProjection matrix and the
corresponding element of the Weight matrix. We will get a new matrix of N [rows] by N
[columns].
5. Find the 2-norm of the above matrix. This will be a matrix of 1 [row] by N
[columns]. Find the minimum value among the column values. If it is within some
threshold value, return that column number. That number represents the image number,
and shows that the test image is nearest to that particular image from the set of
training images. If the minimum value is above the threshold, the test image can be
considered a new image which is not in our training image set, and it can be added to
the training image set by applying the same procedure [mentioned in section 5.2]. So the
system is a kind of learning system which automatically increases its knowledge when it
encounters an unknown image [the one it could not identify].
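A compact NumPy sketch of this identification procedure follows. The images, face space, and threshold value are all hypothetical stand-ins (the report's implementation is in C#); the point is the weight computation, projection, and nearest-neighbour comparison:

```python
import numpy as np

# Hypothetical tiny training set: N "images" of d pixels, k eigenfaces kept.
rng = np.random.default_rng(2)
d, N, k = 64, 5, 3
train = rng.random((N, d))
mean_face = train.mean(axis=0)
A = (train - mean_face).T               # d x N deviation images
lam, V = np.linalg.eigh(A.T @ A)
U = A @ V[:, -k:]                       # k most significant eigenvectors
U /= np.linalg.norm(U, axis=0)
W = U.T @ A                             # weight matrix of the training images

def identify(test_image, threshold=5.0):
    # Project the test image into face space ...
    w = U.T @ (test_image - mean_face)
    # ... then find the nearest training image by Euclidean (2-norm) distance.
    dists = np.linalg.norm(W - w[:, None], axis=0)
    best = int(np.argmin(dists))
    return best if dists[best] < threshold else None   # None: unknown face

# A training image should match itself (distance zero in face space).
print(identify(train[2]))   # 2
```

When `identify` returns `None`, the image can be added to the training set, mirroring the learning behaviour described in step 5.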
CHAPTER 5
Conclusion
1. The user will be authenticated not only with the username but also with an image of the user.
2. For processing, certain lines on the face are used so that the image can be identified from different angles.
3. The image processing is good enough to provide security for the website.
Future Enhancements
1. The project can be enhanced for processing 3D images.
2. Authentication can be implemented by capturing a video clip of a person.
3. This can also be used to process the signatures of a person for providing authentication.
4. We can also use this in real-time applications.
5. Authentication can be embedded into a web application, which will be an added advantage for providing login for websites.
References
[01] Y. Zhang, C. McCullough, and J. R. Sullins, "Hand-Drawn Face Sketch Recognition by Humans and a PCA-Based Algorithm," IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 40, no. 3, May 2010.
[02] K. W. Bowyer, K. Chang, P. J. Flynn, and X. Chen, "Face recognition using 2-D, 3-D and infrared: Is multimodal better than multisampling?" Proc. IEEE, vol. 94, no. 11, Nov. 2012.
[03] G. Medioni, J. Choi, C.-H. Kuo, and D. Fidaleo, "Identifying noncooperative subjects at a distance using face images and inferred three-dimensional face models," IEEE Trans. Syst., Man, Cybern. A, Syst., Humans, vol. 39, no. 1, pp. 12-24, Jan. 2009.
Websites
http://www.imageprocessingplace.com/
http://www.graphicsmagick.org/
http://www.imagemagick.org/
http://www.mediacy.com/
Books
Digital Image Processing Projects - Rs tech Technology
Image Processing by Micheal Pedilla