Synopsis of 3D Face recognition system


    Institute of Engineering and Technology, Lucknow

MAJOR PROJECT SYNOPSIS on

    3 DIMENSIONAL FACIAL RECOGNITION

Submitted in partial fulfillment of the requirement for the degree of Bachelor of Engineering in Computer Science

    Department of Computer Science and Engineering

    (Session 2013-14)

    Submitted by,

    Anubhav Shrivastava

    [1005210009]

    Computer Science and Engineering

    Project guide:

    Dr. Y N Singh


Acknowledgement:-

The moment of acknowledgement is one of pride and gives rise to a feeling of being cherished. I take this opportunity to express my sincere gratitude to all who contributed to making this work possible in time.

I am heartily indebted to Associate Professor Dr. Y N Singh for his patronage at all times. I also wish to express my deep sense of gratitude to Mr. Radheshayam for his continuous and tireless support and advice.

I sincerely thank the staff of the Computer Science and Engineering Department for the kind guidance and technical expertise extended to us in completing this project.

    Anubhav Shrivastava

(1005210009)


INDEX

    1) OBJECTIVE

    2) DESCRIPTION

    a) WORKING PRINCIPLE

    b) TECHNOLOGY USED

    c) RELATED WORK

3) APPLICATIONS

    4) HARDWARE AND SOFTWARE REQUIREMENTS

    5) REFERENCES


    OBJECTIVE

To develop a THREE DIMENSIONAL FACIAL RECOGNITION SYSTEM.

    DESCRIPTION

Face recognition is a computer application for automatically identifying or verifying a person from a digital image or a video frame from a video source. One way to do this is by comparing selected facial features from the image with a facial database.

At the access point, an image of someone's face is captured by a camera and matched against pre-stored images of the same person. Only if there is a match is access permitted, e.g. the door opens. Face recognition is also a crucial component of ubiquitous and pervasive computing, which aims at incorporating intelligence into our living environment and allowing humans to interact with machines in a natural way, just as people recognize each other without checking an ID or passport.

The fact that face recognition is an essential tool for interpreting human actions, human emotions, facial expressions, human behavior and intentions, and is also an extremely natural and non-intrusive technique, makes it an excellent choice for ambient intelligence applications.
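As a rough illustration of the access-control verification step described above, the sketch below compares a probe against the stored templates for the claimed identity; the feature extractor, the cosine-similarity measure, and the threshold value are illustrative assumptions, not details from this synopsis.

```python
import numpy as np

def verify(probe, enrolled, threshold=0.8):
    """Toy verification: compare a probe feature vector against the
    enrolled templates for the claimed identity; grant access only if
    the best cosine similarity clears a preset threshold."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    best = max(cosine(probe, t) for t in enrolled)
    return best >= threshold  # True -> permit access (e.g. open the door)
```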


WORKING PRINCIPLE

Our system operates in two stages: it first applies a set of neural network-based filters to an image, and then uses an arbitrator to combine the outputs. The filters examine each location in the image at several scales, looking for locations that might contain a face. The arbitrator then merges detections from individual filters and eliminates overlapping detections.

    TECHNOLOGY USED:

    NEURAL NETWORK

    STAGE 1

    A Neural Network-Based Filter

The first component of our system is a filter that receives as input a 20x20 pixel region of the image and generates an output ranging from 1 to -1, signifying the presence or absence of a face, respectively. To detect faces anywhere in the input, the filter is applied at every location in the image. To detect faces larger than the window size, the input image is repeatedly reduced in size (by subsampling), and the filter is applied at each size. This filter must have some invariance to position and scale. The amount of invariance determines the number of scales and positions at which it must be applied. For the work presented here, we apply the filter at every pixel position in the image, and scale the image down by a factor of 1.2 for each step in the pyramid.
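A minimal sketch of this scanning procedure follows. The 20x20 window and the 1.2 scale step come from the text; `filter_net` (the trained filter, returning a value in [-1, 1]) and `preprocess` (the lighting correction and histogram equalization described in the next paragraph, sketched after it) are assumed to exist.

```python
import numpy as np

def scan_pyramid(image, filter_net, window=20, scale_step=1.2):
    """Apply the 20x20 filter at every pixel position of every pyramid
    level; yield (x, y, scale) wherever the network reports a face."""
    scale = 1.0
    while min(image.shape) >= window:
        h, w = image.shape
        for y in range(h - window + 1):
            for x in range(w - window + 1):
                win = preprocess(image[y:y + window, x:x + window])
                if filter_net(win) > 0:           # positive output = face
                    yield x, y, scale
        # Next pyramid level: subsample the image by a factor of 1.2.
        scale *= scale_step
        ys = (np.arange(int(h / scale_step)) * scale_step).astype(int)
        xs = (np.arange(int(w / scale_step)) * scale_step).astype(int)
        image = image[ys][:, xs]
```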

First, a preprocessing step is applied to a window of the image. The window is then passed through a neural network, which decides whether the window contains a face. The preprocessing first attempts to equalize the intensity values across the window. We fit a function which varies linearly across the window to the intensity values in an oval region inside the window. Pixels outside the oval may represent the background, so those intensity values are ignored in computing the lighting variation across the face. The linear function approximates the overall brightness of each part of the window, and can be subtracted from the window to compensate for a variety of lighting conditions. Then histogram equalization is performed, which non-linearly maps the intensity values to expand the range of intensities in the window.
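The preprocessing might be sketched as follows; the plane fit over an oval mask and the histogram equalization follow the description above, while the exact mask proportions are an assumption.

```python
import numpy as np

def correct_lighting(window):
    """Fit a linear function a*x + b*y + c to the intensities inside an
    oval mask (background pixels outside the oval are ignored) and
    subtract it, compensating for the overall lighting variation."""
    h, w = window.shape
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    oval = ((xx - cx) / (w / 2.0)) ** 2 + ((yy - cy) / (h / 2.0)) ** 2 <= 1.0
    A = np.c_[xx[oval], yy[oval], np.ones(int(oval.sum()))]
    coeffs, *_ = np.linalg.lstsq(A, window[oval].astype(float), rcond=None)
    plane = coeffs[0] * xx + coeffs[1] * yy + coeffs[2]
    return window - plane

def equalize_histogram(window, levels=256):
    """Non-linearly remap intensities so they expand to the full range."""
    flat = window.ravel().astype(int)
    flat -= flat.min()
    cdf = np.cumsum(np.bincount(flat)) / flat.size
    return (cdf[flat] * (levels - 1)).reshape(window.shape)

def preprocess(window):
    return equalize_histogram(correct_lighting(window))
```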

[Figure: The basic face-detection algorithm: the input image pyramid, an extracted 20x20 pixel window, the preprocessing steps, and the neural network that classifies the window.]

The preprocessed window is then passed through a neural network. There are three types of hidden units: 4 which look at 10x10 pixel subregions, 16 which look at 5x5 pixel subregions, and 6 which look at overlapping 20x5 pixel horizontal stripes of pixels. Each of these types was chosen to allow the hidden units to detect local features. In particular, the horizontal stripes allow the hidden units to detect such features as mouths or pairs of eyes, while the hidden units with square receptive fields might detect features such as individual eyes, the nose, or corners of the mouth. Although the figure shows a single hidden unit for each subregion of the input, these units can be replicated. For the experiments which are described later, we use networks with two and three sets of these hidden units. The network has a single, real-valued output, which indicates whether or not the window contains a face.
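A compact sketch of this topology is shown below; the receptive-field layout (four 10x10 quadrants, sixteen 5x5 blocks, six overlapping 20x5 stripes) follows the text, while the random weights and tanh activations are placeholders rather than the trained network. This `filter_net` matches the signature assumed by the scanning sketch earlier.

```python
import numpy as np

rng = np.random.default_rng(0)

# Receptive fields over the 20x20 input window:
fields  = [(y, x, 10, 10) for y in (0, 10) for x in (0, 10)]        # 4 units
fields += [(y, x, 5, 5) for y in range(0, 20, 5)
                        for x in range(0, 20, 5)]                   # 16 units
fields += [(y, 0, 5, 20) for y in range(0, 18, 3)]  # 6 overlapping stripes

hidden_w = [rng.normal(0, 0.1, h * w) for (_, _, h, w) in fields]
out_w = rng.normal(0, 0.1, len(fields))

def filter_net(window):
    """Single real-valued output in (-1, 1); > 0 is read as 'face'."""
    hidden = np.tanh([wt @ window[y:y + h, x:x + w].ravel()
                      for (y, x, h, w), wt in zip(fields, hidden_w)])
    return float(np.tanh(out_w @ hidden))
```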

To train the neural network used in stage one to serve as an accurate filter, a large number of face and non-face images are needed. The images contained faces of various sizes, orientations, positions, and intensities. The eyes, tip of the nose, and corners and center of the mouth of each face were labeled manually. These points were used to normalize each face to the same scale, orientation, and position, as follows (a code sketch of the least-squares alignment step appears after the list):

1. Initialize F̄, a vector which will be the average positions of each labeled feature over all the faces, with the feature locations of the first face, F1.

2. The feature coordinates in F̄ are rotated, translated, and scaled so that the average locations of the eyes appear at predetermined positions in a 20x20 pixel window.

3. For each face i, compute the best rotation, translation, and scaling to align the face's features Fi with the average feature locations F̄. Such transformations can be written as a linear function of their parameters, so we can write a system of linear equations mapping the features from Fi to F̄. The least-squares solution to this over-constrained system yields the parameters of the best alignment transformation. Call the aligned feature locations F'i.

4. Update F̄ by averaging the aligned feature locations F'i for each face i.

5. Go to step 2.
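Step 3 admits a direct least-squares solution. Below is a sketch under the assumption that each face's features are an (N, 2) array of (x, y) coordinates; the predetermined eye positions of step 2 are omitted for brevity.

```python
import numpy as np

def best_fit_similarity(src, dst):
    """Least-squares rotation, scaling, and translation mapping the
    points in src onto dst. The transform sends (x, y) to
    (a*x - b*y + tx, b*x + a*y + ty); stacking the constraints for all
    N points gives an over-constrained linear system in (a, b, tx, ty)."""
    n = src.shape[0]
    A = np.zeros((2 * n, 4))
    A[0::2] = np.c_[src[:, 0], -src[:, 1], np.ones(n), np.zeros(n)]
    A[1::2] = np.c_[src[:, 1],  src[:, 0], np.zeros(n), np.ones(n)]
    params, *_ = np.linalg.lstsq(A, dst.reshape(-1), rcond=None)
    return params

def apply_similarity(params, pts):
    a, b, tx, ty = params
    x, y = pts[:, 0], pts[:, 1]
    return np.stack([a * x - b * y + tx, b * x + a * y + ty], axis=1)

def align_features(faces, n_iter=5):
    """Iterate steps 1-5: converges to an average feature layout f_bar
    and an aligned copy of every face's features."""
    f_bar = faces[0].astype(float)                          # step 1
    for _ in range(n_iter):
        aligned = [apply_similarity(best_fit_similarity(f, f_bar), f)
                   for f in faces]                          # step 3
        f_bar = np.mean(aligned, axis=0)                    # steps 4-5
    return f_bar, aligned
```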

The alignment algorithm converges within five iterations, yielding for each face a function which maps that face to a 20x20 pixel window. Fifteen face examples are generated for the training set from each original image, by randomly rotating the images (about their center points) up to 10 degrees, scaling between 90% and 110%, translating up to half a pixel, and mirroring. Each 20x20 window in the set is then preprocessed (by applying lighting correction and histogram equalization). The randomization gives the filter invariance to translations of less than a pixel and scalings of 20%. Larger changes in translation and scale are dealt with by applying the filter at every pixel position in an image pyramid, in which the images are scaled by factors of 1.2.

Practically any image can serve as a non-face example, because the space of non-face images is much larger than the space of face images. Instead of collecting the images before training is started, the images are collected during training, in the following manner (a code sketch of this bootstrapping loop appears after the steps):

1. Create an initial set of non-face images by generating 1000 random images. Apply the preprocessing steps to each of these images.

2. Train a neural network to produce an output of 1 for the face examples and -1 for the non-face examples. The training algorithm is standard error backpropagation with momentum. On the first iteration of this loop, the network's weights are initialized randomly. After the first iteration, we use the weights computed by training in the previous iteration as the starting point.

3. Run the system on an image of scenery which contains no faces. Collect sub-images in which the network incorrectly identifies a face (an output activation > 0).

4. Select up to 250 of these sub-images at random, apply the preprocessing steps, and add them into the training set as negative examples. Go to step 2.
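The loop might look like the sketch below; `net` is assumed to expose `fit`/`predict` methods, and `random_windows`, `scenery_windows`, and `preprocess` (applied here to a batch of windows) are hypothetical helpers standing in for the data sources described in the steps.

```python
import numpy as np

def bootstrap_train(net, faces, n_rounds=10, max_new_negatives=250):
    """Collect non-face examples during training: any scenery window the
    current network misclassifies becomes a new negative example."""
    negatives = preprocess(random_windows(1000))             # step 1
    for _ in range(n_rounds):
        x = np.concatenate([faces, negatives])
        y = np.concatenate([np.ones(len(faces)), -np.ones(len(negatives))])
        net.fit(x, y)                # step 2: backpropagation with momentum
        windows = scenery_windows()  # step 3: images containing no faces
        false_pos = windows[net.predict(windows) > 0]
        idx = np.random.permutation(len(false_pos))[:max_new_negatives]
        negatives = np.concatenate([negatives,
                                    preprocess(false_pos[idx])])  # step 4
    return net
```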

STAGE 2

    Merging Overlapping Detections and Arbitration

    In this section, we present two strategies to improve the reliability of the

    detector: merging overlapping detections from a single network and

    arbitrating among multiple networks.

    Merging Overlapping Detections

[Figure 3: Images with all the above-threshold detections indicated by boxes.]

[Figure 4: Example face images (the authors), randomly mirrored, rotated, translated, and scaled by small amounts.]

[Figure 5: During training, the partially-trained system is applied to images of scenery which do not contain faces (like the one on the left). Any regions in the image detected as faces (which are expanded and shown on the right) are errors, which can be added into the set of negative training examples.]

For each location and scale, the number of detections within a specified neighborhood of that location can be counted. If the number is above a threshold, then that location is classified as a face. The centroid of the nearby detections defines the location of the detection result, thereby collapsing multiple detections. In the experiments section, this heuristic will be referred to as "thresholding". If a particular location is correctly identified as a face, then all other detection locations which overlap it are


likely to be errors, and can therefore be eliminated. Based on the above heuristic regarding nearby detections, we preserve the location with the higher number of detections within a small neighborhood, and eliminate locations with fewer detections. Each detection at a particular location and scale is marked in an image pyramid, labelled the output pyramid. Then, each location in the pyramid is replaced by the number of detections in a specified neighborhood of that location. This has the effect of spreading out the detections. Normally, the neighborhood extends an equal number of pixels in the dimensions of scale and position. All detections contributing to a centroid are collapsed down to a single point. Each centroid is then examined in order, starting from the ones which had the highest number of detections within the specified neighborhood. If any other centroid locations represent a face overlapping with the current centroid, they are removed from the output pyramid. All remaining centroid locations constitute the final detection result.
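A sketch of these two heuristics over a list of raw (x, y, level) detections; the neighborhood radius and count threshold are illustrative parameters, not values fixed by the text.

```python
import numpy as np

def merge_detections(detections, radius=2, threshold=2):
    """Thresholding plus overlap elimination: keep the centroid of each
    cluster with enough nearby detections, then drop any centroid that
    overlaps a stronger (higher-count) one."""
    dets = np.asarray(detections, dtype=float)   # rows of (x, y, level)
    counts = np.array([np.sum(np.all(np.abs(dets - d) <= radius, axis=1))
                       for d in dets])
    centroids, strengths = [], []
    for d, c in zip(dets[counts >= threshold], counts[counts >= threshold]):
        near = np.all(np.abs(dets - d) <= radius, axis=1)
        centroids.append(dets[near].mean(axis=0))  # collapse to one point
        strengths.append(c)
    final = []
    for i in np.argsort(strengths)[::-1]:          # strongest first
        overlaps = any(np.all(np.abs(centroids[i] - f) <= radius)
                       for f in final)
        if not overlaps:
            final.append(centroids[i])
    return final
```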

    Arbitration among Multiple Networks

    To further reduce the number of false positives, we can apply multiple

    networks, and arbitrate between their outputs to produce the final decision.

    Each network is trained in a similar manner, but with random initial weights,

random initial non-face images, and permutations of the order of

    presentation of the scenery images. As will be seen in the next section, the

    detection and false positive rates of the individual networks will be quite

    close. However, because of different training conditions and because of self-selection of negative training examples, the networks will have different

    biases and will make different errors. Each detection at a particular position

and scale is recorded in an image pyramid, as was done with the previous

    heuristics. One way to combine two such pyramids is by ANDing them. This

    strategy signals a detection only if both networks detect a face at precisely

    the same scale and position. Due to the different biases of the individual

    networks, they will rarely agree on a false detection of a face. This allows

    ANDing to eliminate most false detections. Unfortunately, this heuristic can

    decrease the detection rate because a face detected by only one network will

    be thrown out. However, we will see later that individual networks can all

    detect roughly the same set of faces, so that the number of faces lost

    due to ANDing is small. Similar heuristics, such as ORing the outputs of two

networks, or voting among three networks, were also tried. Each of these

    arbitration methods can be applied before or after the thresholding


and overlap elimination heuristics. If applied afterwards, we combine the

    centroid locations rather than actual detection locations, and require them to

    be within some neighborhood of one another rather than precisely aligned.
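Assuming each network's detections are recorded as boolean arrays of identical shape (one per pyramid level), the ANDing, ORing, and voting heuristics reduce to element-wise operations:

```python
import numpy as np

def and_pyramids(pa, pb):
    """Detect only where BOTH networks fire at exactly the same scale
    and position; swap in np.logical_or for the ORing heuristic."""
    return [np.logical_and(a, b) for a, b in zip(pa, pb)]

def vote_pyramids(pyramids):
    """Majority vote among several networks at each scale and position."""
    return [np.stack(levels).sum(axis=0) > len(levels) // 2
            for levels in zip(*pyramids)]
```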

Arbitration strategies such as ANDing, ORing, or voting seem intuitively reasonable, but perhaps there are some less obvious heuristics that could perform better. To test this hypothesis, we applied a separate neural network to arbitrate among multiple detection networks. For a location of interest, the

    arbitration network examines a small neighborhood surrounding that

location in the output pyramid of

each individual network. For each pyramid, we count the number of detections in a 3x3 pixel region at each of three scales around the location of interest, resulting in three numbers for each detector, which are fed to the arbitration network. The arbitration network is trained to produce a positive

    output for a given set of inputs only if that location contains a face, and to

    produce a negative output for locations without a face.
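A sketch of how those inputs could be assembled; the coordinate rescaling between pyramid levels by the 1.2 factor is an assumption about the representation, and the trained arbitration network itself is omitted.

```python
import numpy as np

def arbitration_inputs(output_pyramids, x, y, level, scale_step=1.2):
    """For each detector's output pyramid, count detections in a 3x3
    region at three scales around (x, y, level): three numbers per
    detector, concatenated into the arbitration network's input."""
    feats = []
    for pyramid in output_pyramids:          # one output pyramid per network
        for l in (level - 1, level, level + 1):
            if 0 <= l < len(pyramid):
                s = scale_step ** (l - level)        # map coords to level l
                cy, cx = int(y / s), int(x / s)
                region = pyramid[l][max(cy - 1, 0):cy + 2,
                                    max(cx - 1, 0):cx + 2]
                feats.append(int(region.sum()))
            else:
                feats.append(0)              # level outside the pyramid
    return np.array(feats)
```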

    RELATED WORK:


Different face recognition techniques can be broadly classified into four categories:

1. Eigenfaces

2. Feature based

3. Hidden Markov Model

4. Neural network-based algorithms

The performance of implementations of different algorithms was evaluated on different face databases. Some of the recognized face databases are the CMU face database, the Olivetti Research Laboratory face database, and the MIT face database.

Eigenfaces for face representation were first used by Sirovich and Kirby, and later developed by Turk and Pentland for face recognition. Different techniques have been developed using neural networks; the implementation by Lawrence, Giles, Tsoi and Back showed good results. Taylor and Cootes used the Active Appearance Model to design a system for face identification. Some comparisons have been made between eigenfaces and fisherfaces, eigenfaces and feature-based techniques, and the like.

    APPLICATIONS


Machine recognition of human faces is used in a variety of civilian and law enforcement applications that require reliable recognition of humans.

Identity verification for physical access control in buildings or security areas is one of the face recognition applications. For high-security areas, a combination with card terminals is possible, so that a double check is performed, for example in airports, allowing crew and airport staff to pass through different control levels.

A smart home should be able to recognize the owners, their family, friends and guests, remember their preferences (from favorite food and TV program to room temperature), understand what they are saying, where they are looking, and what each gesture, movement or expression means, and use all these cues to facilitate everyday life.

HARDWARE REQUIREMENTS


Min 256 MB RAM

P2 and above processor

Hard disk: min 2 MB

Webcam

SOFTWARE REQUIREMENTS

Operating system: Windows XP, Vista

Development tool: .NET

File system: NTFS

    REFERENCES

    15

  • 8/12/2019 Synopsis of 3D Face recognition system

    15/15

1. T. M. Mitchell. Machine Learning. McGraw-Hill International Edition, 1997.

2. D. Pissarenko. Neural networks for financial time series prediction: Overview over recent research. BSc thesis, 2002.

3. L. I. Smith. A tutorial on principal components analysis, February 2002. URL: http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf (accessed November 27, 2002).

4. M. Turk and A. Pentland. Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1), 1991. URL: http://www.cs.ucsb.edu/~mturk/Papers/jcn.pdf (accessed November 27, 2002).

5. M. A. Turk and A. P. Pentland. Face recognition using eigenfaces. In Proc. of Computer Vision and Pattern Recognition, pages 586-591. IEEE, June 1991. URL: http://www.cs.wisc.edu/~dyer/cs540/handouts/mturk-CVPR91.pdf (accessed November 27, 2002).
