Face recognition and Detection by PCA Algorithm
Uploaded by laxmana-rao
Table of Contents
1. Introduction
2. Objectives
3. Tools/Environment Used
4. Analysis
5. Design
5.1 Mathematical Background
5.2 PCA Algorithm
6. Testing
7. Snapshots
8. Conclusion
9. Future Enhancements
10. References
1. Introduction
Humans are very good at recognizing faces and complex patterns, and even the passage of time does not
affect this capability. It would therefore help if computers became as robust as humans at
face recognition. A face recognition system can help in many ways:
* Checking for criminal records.
* Enhancing security by using surveillance cameras in conjunction with a face recognition system.
* Finding lost children by using the images received from cameras fitted at public places.
* Knowing in advance if a VIP is entering a hotel.
* Detecting a criminal at a public place.
* Comparing an entity with a set of entities in different areas of science.
* Pattern recognition.
This project is a step towards developing a face recognition system that can recognize static
images. It can be modified to work with dynamic images: the dynamic images
received from the camera can first be converted into static ones, and then the same procedure
can be applied to them. In that case, however, several other factors must be considered, such as the
distance between the camera and the person, the magnification factor, and the view [top, side, front].
2. Tools/Environment Used
Software Requirements:
Operating System : Windows operating system
Hardware Requirements:
Processor : Pentium processor of 400MHz or higher.
RAM : Minimum 64MB primary memory.
Hard disk : Minimum 1GB hard disk space.
Monitor : Preferably color monitor (16 bit color) and above.
Web camera.
Compact Disk drive.
A keyboard and a mouse.
3. Analysis
Modules
Add Image/Registration
Image Capture
Login
Eigenface Computation
Identification
A module is a small part of the project, and modules play an important role both in the design and in
the code. In software engineering a module is treated as a small part of a system, whereas
in a programming language it is a small part of the program, in some cases called a
function; together the modules constitute the main program.
Dividing a system into modules makes it easier to understand what the system being
developed is and what its main uses are. During the project many
modules are created, and finally they are combined to form the system.
Module Description
Add Image/Registration
Add Image is the module concerned with adding images together with the user id for login
of the person whose image is being taken. The images are captured from the web camera
and stored in the system. During registration four images are captured. Each image is stored
four times, because the comparison algorithm requires a minimum of sixteen images.
Image Capture Module
This module captures an image using the web camera. It runs as a separate thread
to avoid hanging the system. It is used by both the login module and the registration
module.
Login
This module's function is to compare the captured image with the images stored in the system.
It uses the Eigenface Computation module described below for the comparison.
Eigenface Computation
This module computes the "face space" used for face recognition. The recognition itself is
carried out in the FaceBundle object, but preparing that object requires
a lot of computation. The steps are:
* Compute an average face.
* Build a covariance matrix.
* Compute the eigenvalues and eigenvectors.
* Select only the sixteen largest eigenvalues (and their corresponding eigenvectors).
* Compute the faces using these eigenvectors.
* Compute the eigenspace for the given images.
Identification
This module takes the image from the module above and compares it
with the images already in the database. If a match is found, a success
message is shown to the user.
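[Flow diagram: Registration. Create username, register your image, then save the username and image for future use.]
[Flow diagram: Login. Enter username and password (image); the request is processed by identifying the image. If true, authentication is provided; if false, authentication is declined.]
[Flow diagram: Overall flow. Start, then choose Login or Register. Register: capture image and store. Login: enter login id, capture image and compare; on success a success message is shown, otherwise a failure message.]
[Figure: Stages of face recognition: face location detection, feature extraction, facial image classification.]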
4. Design
5.1 Mathematical Background
This section describes the mathematical concepts that are the backbone of Principal
Component Analysis. It is less important to remember the exact mechanics of a mathematical
technique than it is to understand the intuition behind it. The topics are covered
independently of each other, and examples are given.
Variance, covariance, the covariance matrix, and eigenvectors and eigenvalues are the basis of the
design algorithm.
a. Variance
The variance is a measure of the spread of data. Statisticians are usually concerned with taking a
sample of a population. To use election polls as an example, the population is all the people in
the country, whereas a sample is a subset of the population that the statisticians measure. The
great thing about statistics is that by only measuring a sample of the population, we can work out
what is most likely to be the measurement if we used the entire population.
Let's take an example:
X = [1 2 4 6 12 25 45 68 67 65 98]
We can simply use the symbol X to refer to this entire set of numbers. To refer to an
individual number in the data set, we use a subscript on the symbol X, so X_i is the i-th
number. There are a number of things that we can calculate about a data set. For example, we can
calculate the mean of the sample, given by the formula:
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$
that is, the sum of all numbers divided by the total number of numbers.
Unfortunately, the mean doesn't tell us a lot about the data except for a sort of middle point. For
example, these two data sets have exactly the same mean (10), but are obviously quite different:
[0 8 12 20] and [8 9 11 12]
So what is different about these two sets? It is the spread of the data that is different. The
standard deviation (SD) and the variance both measure how spread out the data is.
The SD is, roughly, "the average distance from the mean of the data set to a point". To calculate it,
compute the squares of the distance from each data point to the mean of the set, add them all
up, divide by n-1, and take the positive square root. The variance is simply the square of the SD. As formulae:
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}, \qquad s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}$$
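As a quick check of the divide-by-(n-1) recipe described above, the following minimal Java sketch computes the mean, variance and SD of the two example data sets. The class and method names (`SpreadDemo`, `mean`, `variance`, `standardDeviation`) are illustrative only and are not taken from the project code.

```java
final class SpreadDemo {

    /** Arithmetic mean: sum of all numbers divided by how many there are. */
    static double mean(double[] x) {
        double sum = 0.0;
        for (double value : x) sum += value;
        return sum / x.length;
    }

    /** Sample variance: squared distances from the mean, summed and divided by n-1. */
    static double variance(double[] x) {
        double m = mean(x);
        double sumOfSquares = 0.0;
        for (double value : x) sumOfSquares += (value - m) * (value - m);
        return sumOfSquares / (x.length - 1);
    }

    /** Standard deviation: positive square root of the variance. */
    static double standardDeviation(double[] x) {
        return Math.sqrt(variance(x));
    }

    public static void main(String[] args) {
        double[] wide = {0, 8, 12, 20};
        double[] narrow = {8, 9, 11, 12};
        System.out.printf("wide:   mean=%.1f variance=%.4f sd=%.4f%n",
                mean(wide), variance(wide), standardDeviation(wide));
        System.out.printf("narrow: mean=%.1f variance=%.4f sd=%.4f%n",
                mean(narrow), variance(narrow), standardDeviation(narrow));
    }
}
```

Both sets have mean 10, but the first has an SD of about 8.33 and the second about 1.83, which is exactly the difference in spread that the mean alone hides.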
b. Covariance
Variance and SD are purely 1-dimensional. Data sets like this could be: the heights of all the people in
the room, the marks for the last CSC378 exam, etc. However, many data sets have more than one
dimension, and the aim of the statistical analysis of these data sets is usually to see if there is
any relationship between the dimensions. For example, we might have as our data set both the
height of all the students in a class and the mark they received for a paper. We could then
perform statistical analysis to see if the height of a student has any effect on their mark. It is
useful to have a measure of how much the dimensions vary from the mean with respect to
each other.
Covariance is such a measure. It is always measured between two dimensions. If we calculate the
covariance between one dimension and itself, we get the variance. So if we had a three
dimensional data set (x, y, z), then we could measure the covariance between the x and y
dimensions, the x and z dimensions, and the y and z dimensions. Measuring the covariance
between x and x, or y and y, or z and z would give us the variance of the x, y and z dimensions
respectively.
The formula for covariance is very similar to the formula for variance:
$$\mathrm{var}(X) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(X_i - \bar{X})}{n-1}, \qquad \mathrm{cov}(X, Y) = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n-1}$$
How does this work? Let's use some example data. Imagine we have gone into the world and
collected some 2-dimensional data: say we have asked a group of students how many hours in
total they spent studying CSC309, and the mark that they received. So we have two
dimensions: the first is the H dimension, the hours studied, and the second is the M dimension,
the mark received.
So what does the covariance between H and M tell us? The exact value is not as important as its
sign (i.e. positive or negative). If the value is positive, then both dimensions
increase together, meaning that, in general, as the number of hours of study increased, so did the
final mark.
If the value is negative, then as one dimension increases the other decreases. If we had ended up
with a negative covariance, it would mean the opposite: as the number of hours of study
increased, the final mark decreased.
In the last case, if the covariance is zero, the two dimensions are uncorrelated: there is no
linear relationship between them.
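To make the role of the sign concrete, here is a minimal Java sketch that computes the sample covariance (dividing by n-1) between hours studied and marks received. The data values are invented for illustration and are not survey results; `CovarianceDemo` and `covariance` are hypothetical names.

```java
final class CovarianceDemo {

    /** Sample covariance of two equally long data series (divides by n-1). */
    static double covariance(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two series of equal length >= 2");
        }
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            sum += (x[i] - meanX) * (y[i] - meanY);
        }
        return sum / (x.length - 1);
    }

    public static void main(String[] args) {
        double[] hours = {2, 4, 6, 8};            // H dimension (made-up values)
        double[] marks = {50, 60, 70, 80};        // M dimension rising with H
        double[] fallingMarks = {80, 70, 60, 50}; // M dimension falling with H
        System.out.println("cov(H, M) rising:  " + covariance(hours, marks));
        System.out.println("cov(H, M) falling: " + covariance(hours, fallingMarks));
        System.out.println("cov(H, H) = var(H): " + covariance(hours, hours));
    }
}
```

With these values the first covariance is positive, the second is its negative, and the covariance of H with itself is the variance of H.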
c. The covariance Matrix
A useful way to get all the possible covariance values between all the different dimensions is to
calculate them all and put them in a matrix. As an example, we will make up the covariance matrix
for an imaginary 3-dimensional data set, using the usual dimensions x, y and z. The
covariance matrix then has 3 rows and 3 columns, and its values are:
    cov(x,x) cov(x,y) cov(x,z)
C = cov(y,x) cov(y,y) cov(y,z)
    cov(z,x) cov(z,y) cov(z,z)
Points to note: down the main diagonal, the covariance value is between one of the
dimensions and itself, so these are the variances of those dimensions. The other point is that since
cov(a,b) = cov(b,a), the matrix is symmetrical about the main diagonal.
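The two properties noted above (variances on the diagonal, symmetry about it) can be checked with a small Java sketch. The layout `data[d]` (all samples of dimension d) and the names `CovarianceMatrixDemo` and `covarianceMatrix` are assumptions made for this example, and the sample values are invented.

```java
final class CovarianceMatrixDemo {

    /** Sample covariance of two equally long data series (divides by n-1). */
    static double covariance(double[] x, double[] y) {
        double meanX = 0.0, meanY = 0.0;
        for (int i = 0; i < x.length; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= x.length;
        meanY /= y.length;
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) sum += (x[i] - meanX) * (y[i] - meanY);
        return sum / (x.length - 1);
    }

    /** Entry [i][j] is the covariance between dimension i and dimension j. */
    static double[][] covarianceMatrix(double[][] data) {
        int dims = data.length;
        double[][] c = new double[dims][dims];
        for (int i = 0; i < dims; i++) {
            for (int j = 0; j < dims; j++) {
                c[i][j] = covariance(data[i], data[j]);
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] data = {
            {1, 2, 3, 4},   // x
            {2, 4, 6, 8},   // y
            {4, 3, 2, 1}    // z
        };
        for (double[] row : covarianceMatrix(data)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

Every entry is computed independently here, so the symmetry of the printed matrix is a consequence of cov(a,b) = cov(b,a) rather than of copying values across the diagonal.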
d. Eigenvectors and Eigenvalues
If we multiply a square matrix by a vector, we get another vector that is
transformed from its original position. It is from the nature of this transformation that eigenvectors
arise. Imagine a transformation matrix that, when multiplied on the left, reflects vectors in
the line y=x. Then a vector that lies on the line y=x is its own reflection.
This vector (and all multiples of it, because it doesn't matter how long the vector is)
is an eigenvector of that transformation matrix. Eigenvectors can only be found for
square matrices, and not every square matrix has real eigenvectors. An n x n matrix
has at most n linearly independent eigenvectors, and a symmetric n x n matrix, such as a
covariance matrix, has exactly n. Another property of eigenvectors is that even if we
scale the vector by some amount before we multiply it, we still get the same multiple of it as
a result. This is because scaling a vector only makes it longer; it does not change its
direction.
Lastly, the eigenvectors of a symmetric matrix, such as a covariance matrix, can always be chosen to be
perpendicular, i.e. at right angles to each other, no matter how many dimensions there are. Another word
for perpendicular, in math terms, is orthogonal. This is important because it means that we can express the data in terms of
these perpendicular eigenvectors, instead of expressing it in terms of the x and y axes. Every
eigenvector has a value associated with it, which is called its eigenvalue. The principal eigenvectors
are those which have the highest eigenvalues associated with them.
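As a worked example of the reflection described above (a sketch; the matrix name R is introduced here only for illustration), the matrix that reflects vectors in the line y=x simply swaps the two coordinates:

$$R = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad R \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} = 1 \cdot \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad R \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \end{pmatrix} = -1 \cdot \begin{pmatrix} 1 \\ -1 \end{pmatrix}$$

So (1, 1), which lies on the line y=x, is an eigenvector with eigenvalue 1, and (1, -1) is an eigenvector with eigenvalue -1. Scaling does not matter: R applied to (3, 3) gives (3, 3) again. Because R is symmetric, its two eigenvectors are perpendicular: 1*1 + 1*(-1) = 0.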
5.2 PCA Algorithm
a. Eigen faces Approach
The idea is to extract the relevant information in a face image [the principal components] and encode that information
in a suitable data structure. For recognition, the sample image is encoded in the same way
and compared with the set of encoded images. In mathematical terms, we want to find the
eigenvectors and eigenvalues of the covariance matrix of a set of images, where one image is just a single point
in a high-dimensional space [n * n], n * n being the dimensions of an image. There can be many
eigenvectors for a covariance matrix, but very few of them are the principal ones. Each
eigenvector captures a different amount of the variation among the face images, but
we are only interested in the principal eigenvectors because these account for substantial
variations among a set of images. They show the most significant relationships between
the data dimensions.
The eigenvectors with the highest eigenvalues are the principal components of the image set. We
lose some information if we ignore the components of lesser significance, but if their eigenvalues
are small we won't lose much. Using this set of eigenvectors we can construct eigenfaces.
b. Finding EigenFaces
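(1) Collect a set [say 15] of sample face images. The dimensions of all images should be the same.
An image can be stored in an array of n*n values, which can be treated as an image
vector $\Gamma_i$. The training set is then
$$\Gamma_1, \Gamma_2, \ldots, \Gamma_M$$
where M is the number of images.
(2) Find the average image of the set:
$$\Psi = \frac{1}{M} \sum_{i=1}^{M} \Gamma_i$$
(3) Find the deviated images [avg - img1, avg - img2, ..., avg - imgM]:
$$\Phi_i = \Psi - \Gamma_i, \qquad i = 1, \ldots, M$$
(4) Calculate the covariance matrix
$$C = A A^T$$
where $A = [\Phi_1 \; \Phi_2 \; \ldots \; \Phi_M]$ is the difference image matrix, with one deviated image per column.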
The problem with this approach is that we may not be able to complete this operation for a
set of images because the covariance matrix will be huge. For example, for images of
256 * 256 pixels the covariance matrix will have [256 * 256] = 65536 rows and the same number of
columns. It is very hard, or practically impossible, to store that matrix, and computing it
requires considerable computational resources.
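To solve this problem we can first compute the much smaller M x M matrix
$$L = A^T A$$
where A is the difference image matrix (one deviated image per column), and then find the eigenvectors $v_i$ of L:
$$L v_i = \mu_i v_i$$
The eigenvectors of the covariance matrix $C = A A^T$ can then be found by
$$u_i = A v_i$$
where $u_i$ are the eigenvectors of C.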
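The reason this shortcut works can be shown in two lines. Writing the covariance matrix as $C = A A^T$ with A the difference image matrix, and using $\mu_i$ (a symbol assumed here) for the eigenvalue belonging to $v_i$:

$$L v_i = \mu_i v_i \;\Longrightarrow\; A^T A \, v_i = \mu_i v_i \;\Longrightarrow\; A A^T (A v_i) = \mu_i (A v_i) \;\Longrightarrow\; C \, (A v_i) = \mu_i \, (A v_i)$$

So $A v_i$ is an eigenvector of C with the same eigenvalue $\mu_i$. For 256 * 256 images and M = 16 training images, C would have 65536 * 65536 (about 4.3 billion) entries, while L has only 16 * 16 = 256.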
(5) Using these eigenvectors, we can construct the eigenfaces. We are only interested in the
eigenvectors with high eigenvalues, so eigenvectors whose eigenvalue is below a threshold can be
dropped. We keep only those eigenfaces which correspond to the highest eigenvalues. This
set of images is called the face space. To do this in Java, we used the Colt linear algebra package.
These are the steps involved in the implementation:
i) Find the matrix L = A^T A [from step 4] and
convert it into a DenseDoubleMatrix2D using the Colt matrix classes.
ii) Find the eigenvectors associated with it using the class
cern.colt.matrix.linalg.EigenvalueDecomposition
This gives an M by M matrix [M = number of training images].
iii) Multiplying 'A' [the difference image matrix] by that eigenvector matrix gives the actual
eigenvector matrix [U] of the covariance of 'A'. It is of size X by M [where X is the total number of
pixels in an image], with one eigenface per column.
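The steps above can be sketched in plain Java without any external library. This is only an illustration under stated assumptions, not the project's Colt-based code: a small cyclic Jacobi iteration stands in for `cern.colt.matrix.linalg.EigenvalueDecomposition`, the tiny 4-pixel "images" are invented, and all class and method names are hypothetical. Sorting the eigenvalues and keeping only the largest ones is left out for brevity.

```java
final class EigenfaceSketch {

    /** Average image; images[i] is training image i flattened to X pixels. */
    static double[] meanFace(double[][] images) {
        double[] mean = new double[images[0].length];
        for (double[] img : images)
            for (int p = 0; p < mean.length; p++) mean[p] += img[p] / images.length;
        return mean;
    }

    /** Difference matrix A (X rows, M columns); column i is avg - img_i. */
    static double[][] differenceMatrix(double[][] images, double[] mean) {
        double[][] a = new double[mean.length][images.length];
        for (int i = 0; i < images.length; i++)
            for (int p = 0; p < mean.length; p++) a[p][i] = mean[p] - images[i][p];
        return a;
    }

    /** Small M x M matrix L = A^T A, used instead of the huge X x X covariance matrix. */
    static double[][] smallMatrix(double[][] a) {
        int m = a[0].length;
        double[][] l = new double[m][m];
        for (int i = 0; i < m; i++)
            for (int j = 0; j < m; j++)
                for (double[] row : a) l[i][j] += row[i] * row[j];
        return l;
    }

    /**
     * Cyclic Jacobi iteration for a symmetric matrix (stand-in for a library
     * eigen-decomposition). Returns the eigenvalues; column k of v receives
     * the eigenvector belonging to eigenvalue k.
     */
    static double[] jacobi(double[][] sym, double[][] v) {
        int n = sym.length;
        double[][] a = new double[n][];
        for (int i = 0; i < n; i++) {
            a[i] = sym[i].clone();
            for (int j = 0; j < n; j++) v[i][j] = (i == j) ? 1.0 : 0.0;
        }
        for (int sweep = 0; sweep < 50; sweep++) {
            for (int p = 0; p < n; p++) {
                for (int q = p + 1; q < n; q++) {
                    if (a[p][q] == 0.0) continue; // nothing to eliminate
                    double theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q]);
                    double t = (theta == 0.0) ? 1.0
                            : Math.signum(theta) / (Math.abs(theta) + Math.sqrt(theta * theta + 1.0));
                    double c = 1.0 / Math.sqrt(t * t + 1.0), s = t * c;
                    for (int k = 0; k < n; k++) { // rotate columns p, q of A and V
                        double akp = a[k][p], akq = a[k][q];
                        a[k][p] = c * akp - s * akq;
                        a[k][q] = s * akp + c * akq;
                        double vkp = v[k][p], vkq = v[k][q];
                        v[k][p] = c * vkp - s * vkq;
                        v[k][q] = s * vkp + c * vkq;
                    }
                    for (int k = 0; k < n; k++) { // rotate rows p, q of A
                        double apk = a[p][k], aqk = a[q][k];
                        a[p][k] = c * apk - s * aqk;
                        a[q][k] = s * apk + c * aqk;
                    }
                }
            }
        }
        double[] eigenvalues = new double[n];
        for (int i = 0; i < n; i++) eigenvalues[i] = a[i][i];
        return eigenvalues;
    }

    /** Eigenvector matrix U = A V of the covariance A A^T (X rows, M columns). */
    static double[][] eigenfaces(double[][] a, double[][] v) {
        int m = v.length;
        double[][] u = new double[a.length][m];
        for (int p = 0; p < a.length; p++)
            for (int k = 0; k < m; k++)
                for (int i = 0; i < m; i++) u[p][k] += a[p][i] * v[i][k];
        return u;
    }

    public static void main(String[] args) {
        // Hypothetical training set: three 2x2 "images" flattened to 4 pixels each.
        double[][] images = {{10, 20, 30, 40}, {20, 10, 40, 30}, {30, 40, 10, 20}};
        double[][] a = differenceMatrix(images, meanFace(images));
        double[][] v = new double[images.length][images.length];
        double[] mu = jacobi(smallMatrix(a), v);
        double[][] u = eigenfaces(a, v);
        for (int k = 0; k < mu.length; k++)
            System.out.printf("eigenvalue %.3f, first eigenface pixel %.3f%n", mu[k], u[0][k]);
    }
}
```

Because the deviated images always sum to zero, one eigenvalue of the small matrix is (numerically) zero and its eigenface is the zero vector; a real implementation would drop it along with the other low-eigenvalue components.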
c. Classifying Face Images
The eigenfaces derived in the previous section seem adequate for describing face images
under very controlled conditions, so we decided to investigate their usefulness as a tool for face
recognition. Since accurate reconstruction of the image is not a requirement, a smaller
number of eigenfaces is sufficient for the identification process. Identification thus becomes a
pattern recognition task.
Algorithm:
1. Convert the test image into a matrix so that all of its pixels are stored in a matrix of
256*256 [rows] by 1 [column] size.
2. Find the weights associated with each training image. This operation is simply
Weight Matrix = TransposeOf (EigenVector-of-CovarianceMatrix) * DifferenceImageMatrix.
This matrix is of size N by N, where N is the total number of face images. Each entry in a
column represents the weight of that particular image with respect to a
particular eigenvector.
3. Project the test image into "face space" with the same operation as defined above. Because
here we are projecting a single image, we get a matrix of size N [rows] by 1
[column]. Let's call this the 'TestProjection' matrix. Its entries are
$$\omega_k = u_k^T \Phi$$
for k = 1, 2, ..., N, where N is the total number of training images, $u_k$ is the k-th eigenvector of the
covariance matrix, and $\Phi$ is the difference between the average image and the test image.
4. Subtract the TestProjection matrix from each column of the Weight matrix, element by
element. This gives a new matrix of N [rows] by N [columns].
5. Find the 2-norm of each column of that matrix. This gives a matrix of 1 [row] by N
[columns]. Find the minimum of these column values. If it is within some threshold
value, return that column number. That number is the image number: it
shows that the test image is nearest to that particular image from the set of training images. If the
minimum value is above the threshold value, the test image can be considered a new
image which is not in our training image set. It can then be stored in the training image set by
applying the same procedure [mentioned in section 5.2]. The system is thus a kind of learning
system which automatically increases its knowledge when it encounters an unknown image [one
which it could not recognize].
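A minimal Java sketch of this nearest-image search follows. It assumes the weights are stored with one row per training image (i.e. the columns of the Weight matrix laid out as rows) and that the caller supplies the threshold; `FaceClassifier`, `project`, `distance` and `classify` are hypothetical names, and the numbers in `main` are invented.

```java
final class FaceClassifier {

    /** Projects a difference image onto the eigenfaces: w[k] = u_k . phi (U is X by K). */
    static double[] project(double[][] u, double[] phi) {
        double[] w = new double[u[0].length];
        for (int k = 0; k < w.length; k++)
            for (int p = 0; p < u.length; p++) w[k] += u[p][k] * phi[p];
        return w;
    }

    /** Euclidean distance (2-norm of the difference) between two weight vectors. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int k = 0; k < a.length; k++) sum += (a[k] - b[k]) * (a[k] - b[k]);
        return Math.sqrt(sum);
    }

    /**
     * Returns the index of the training image whose weights are nearest to the
     * test projection, or -1 if even the nearest one is farther than the threshold
     * (the test image is then treated as unknown).
     */
    static int classify(double[][] trainingWeights, double[] testProjection, double threshold) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int i = 0; i < trainingWeights.length; i++) {
            double d = distance(trainingWeights[i], testProjection);
            if (d < bestDistance) {
                bestDistance = d;
                best = i;
            }
        }
        return bestDistance <= threshold ? best : -1;
    }

    public static void main(String[] args) {
        double[][] eigenfaces = {{1, 0}, {0, 1}};              // toy 2-pixel eigenfaces
        double[][] trainingWeights = {{0, 0}, {10, 0}, {0, 10}};
        double[] testProjection = project(eigenfaces, new double[] {9, 1});
        System.out.println("match with threshold 5: " + classify(trainingWeights, testProjection, 5.0));
        System.out.println("match with threshold 1: " + classify(trainingWeights, testProjection, 1.0));
    }
}
```

Choosing the threshold is a tuning decision: too small and known users are rejected, too large and unknown faces are accepted as the nearest registered user.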
5. Testing
Introduction
Software testing is a critical element of software quality assurance and represents the ultimate
review of specification, design and coding. The increasing visibility of software as a system
element and the attendant costs associated with software failure are motivating forces for well-planned,
thorough testing. It is not unusual for a software development organization to spend between 30 and
40 percent of total project effort on testing. The system testing strategy for this system integrates
test case design techniques into a well-planned series of steps that result in the successful
construction of the software. It also provides a road map for the developer, the quality assurance
organization and the customer: a road map that describes the steps to be conducted as part of
testing, when these steps are planned and then undertaken, and how much effort, time and
resources will be required.
The test provisions are as follows.
System testing
Software Testing: Once coding is completed according to the requirements, the
quality of the software has to be tested. Although the purpose of testing is to uncover
errors in the software, it also demonstrates that software functions appear to be working as
per the specifications and that performance requirements appear to have been met. In addition, data
collected as testing is conducted provide a good indication of software reliability and some indication of
software quality as a whole. To assure software quality we conducted both White Box Testing
and Black Box Testing.
White Box Testing:
White Box Testing is a test case design method that uses the control structure of the
procedural design to derive test cases. As we are using a non-procedural language, there is very
little scope for White Box Testing. Wherever it was necessary, the control structures were
tested, and all of them passed with very few errors.
Black Box Testing:
Black Box Testing focuses on the functional requirements of the software. It enables us to
derive sets of input conditions that will fully exercise all functional requirements of a program.
Black Box Testing finds almost all errors: it finds interface errors, errors in
accessing the database, and some performance errors. In Black Box Testing we mainly used two
techniques: Equivalence Partitioning and Boundary Value Analysis.
Equivalence Partitions:
In this method we divide the input domain of a program into classes of data from which test cases are
derived. An equivalence class represents a set of valid or invalid states for an input condition, which
may be a specific value, a range, a set of related values or a Boolean condition.
The equivalence classes are:
* Input condition requires a specific value: two classes, specific and non-specific.
* Input condition requires a range: two classes, in range and out of range.
* Input condition specifies a member of a set: two classes, belongs to the set and does not belong to the set.
* Input condition is Boolean: two classes, valid and invalid.
Boundary Values Analysis:
Errors tend to occur at the boundaries of the input domain. In this
technique test cases are selected that exercise boundary values, i.e., values around the boundaries. With
the above two techniques, we eliminated almost all errors from the software and checked
numerous test values for each input. The results were satisfactory.
Flow of Testing
System testing is designed to uncover weaknesses that were not detected in the earlier
tests. The total system is tested for recovery and fallback after various major failures to ensure
that no data are lost. An acceptance test is done to establish the validity and reliability of the system. The
philosophy behind the testing is to find errors in the project.
Many test cases were designed with this in mind. The flow of testing is as follows.
Code Testing
This strategy examines the logic of the program. The code is checked for what it should do and how it
behaves under various conditions or combinations of input submitted for processing in the system, and
whether any overlaps occur during the processing. The syntax of the code is also tested here, and
syntax errors are corrected before further testing.
Unit Testing:
The first level of testing is called unit testing. Here the different modules are tested against the
specifications produced during the design of the modules. Unit testing is done to test the working
of individual modules with test oracles. It comprises a set of tests performed by an
individual programmer prior to integration of the units into a larger system. A program unit is
small enough that the programmer who developed it can test it in great detail. Unit testing
focuses first on the modules to locate errors. These errors are verified and corrected so that the
unit fits correctly into the project.
System Testing
The next level of testing is system testing and acceptance testing. This testing is done to check whether
the system meets its requirements and to examine the external behavior of the system. System testing
involves two kinds of activities:
Integration testing
Acceptance testing
Integration Testing
The next level of testing is called integration testing. Here many tested modules are
combined into subsystems, which are then tested. Test case data is prepared to check the control
flow of all the modules and to cover as many inputs to the program as possible. Situations such as how
the modules behave when no data is entered in a text box are also tested. This testing strategy
dictates the order in which modules must be available, and exerts a strong influence on the order in
which the modules must be written, debugged and unit tested. In integration testing, all the
modules / units on which unit testing has been performed are integrated and tested together.
Acceptance Testing:
This testing is performed last, by the user, to demonstrate that the implemented system satisfies its
requirements. The user gives various inputs to get the required outputs.
Specification Testing:
Specification testing is done to check whether the program does what it should do and how it
behaves under various conditions or combinations of input submitted for processing in the system, and
whether any overlaps occur during the processing.
Testing Objectives:
The following are the testing objectives:
• Testing is a process of executing a program with the intent of finding an error.
• A good test case is one that has a high probability of finding an as yet undiscovered error.
• A successful test is one that uncovers an as yet undiscovered error.
The above objectives imply a dramatic change in viewpoint. They run counter to the
commonly held view that a successful test is one in which no errors are found. Our objective is
to design tests that systematically uncover different classes of errors and do so with a minimum
amount of time and effort. If testing is conducted successfully, it will uncover errors in the
software. As a secondary benefit, testing demonstrates that software functions appear to be
working according to specification and that performance requirements appear to have been met.
In addition, data collected as testing is conducted provides a good indication of software reliability. Testing
cannot show the absence of defects; it can only show that software errors are present. It is
important to keep this statement in mind as testing is being conducted.
Testing principles:
Before applying methods to design effective test cases, a software engineer must
understand the basic principles that guide software testing.
• All tests should be traceable to customer requirements.
• Tests should be planned long before testing begins.
• Testing should begin “in the small” and progress towards testing “in the large”.
• Exhaustive testing is not possible.
Test Plan:
A test plan is a document that contains a complete set of test cases for a system, along
with other information about the testing process. The test plan should be written long before
testing starts.
The test plan identifies:
1. A task set to be applied as testing commences.
2. The work products to be produced as each testing task is executed.
3. The manner in which the results of testing are evaluated, recorded and reused when regression
testing is conducted.
In some cases the test plan is integrated with the project plan. In others the
test plan is a separate document. The test report is a record of the testing performed. It
enables the acquirer to assess the testing and its results.
Test cases
Test cases for login page
Sl no | Task | Expected result | Obtained result | Remarks
1 | Using valid username and password (image) | Successful authentication | As expected | Success
2 | Using invalid username | Authentication failed | As expected | Invalid user name
3 | Using invalid password (image) | Authentication failed | As expected | Username and password are not correct
4 | Without giving username and password | Authentication failed | As expected | Please enter user name and password
5 | Username and without password | Authentication failed | As expected | Password cannot be empty
Test cases for registration page
Sl no | Task | Expected result | Obtained result | Remarks
1 | Capture four images and register | Registration success | As expected | Success
2 | Capture three images and register | Should not allow to register | As expected | Register button is disabled if less than four images are captured.
3 | Without giving port number | Connection failed | As expected | Please specify port number
4 | Without selecting IP | Connection failed | As expected | IP has to be selected
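Rows 1 and 2 of the registration table are boundary-value cases around the four-image rule. A minimal sketch of how such cases could be automated is shown below; `RegistrationRule` and `canRegister` are hypothetical names standing in for whatever enables the Register button in the real application.

```java
final class RegistrationRule {

    /** Number of captured images required before registration is allowed. */
    static final int REQUIRED_IMAGES = 4;

    /** Mirrors the rule that the Register button is disabled with fewer than four images. */
    static boolean canRegister(int capturedImages) {
        return capturedImages >= REQUIRED_IMAGES;
    }

    public static void main(String[] args) {
        // Boundary values: just below the limit, exactly at the limit, and the empty case.
        int[] captured = {0, 3, 4};
        boolean[] expected = {false, false, true};
        for (int i = 0; i < captured.length; i++) {
            boolean actual = canRegister(captured[i]);
            System.out.printf("captured=%d expected=%b actual=%b %s%n",
                    captured[i], expected[i], actual, actual == expected[i] ? "PASS" : "FAIL");
        }
    }
}
```

Testing 3 and 4 rather than arbitrary values follows the boundary value idea: errors are most likely where the behavior is supposed to change.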
6. Snapshots
Layout
The layout contains two sections. The left section holds the web camera window. The right section shows the captured images for login and registration.
Web camera Window
This is a separate window which is created using a separate thread.
Register Window
The four images captured during registration are shown.
Login Screen
The image captured for login is shown in this window, together with the success message.
Login screen (login failure).
Login Screen and Web camera
The web camera window and the image captured during login are shown.
7. Conclusion
1. The user is authenticated not only with the username but also with the image of the user.
2. For the processing, some of the lines on the face are used so that the image can be identified from different angles.
3. The image processing is good enough to provide security for the website.
8. Future Enhancements
1. The project can be enhanced to process 3D images.
2. Authentication can be implemented by capturing a video clip of a person.
3. It can also be used to process a person's signature for authentication.
4. It can also be used in real-time applications.
5. Authentication can be embedded into a web application, which would be an added advantage for providing login to websites.