Modern Face Recognition with Deep Learning
Jothi Thilaga P (Assistant Professor, Department of CSE); Arshath Khan B, Jones A A, Krishna Kumar N (IV year B.E., CSE)
Ramco Institute of Technology, Rajapalayam, India
[email protected], [email protected]
Abstract - Facial recognition systems are commonly used for verification and security purposes, but their accuracy is still being improved. Errors in facial feature detection due to occlusions, pose, and illumination changes can be compensated for by the use of HOG descriptors. The most reliable way to measure a face is by employing deep learning techniques. The final step is to train a classifier that can take in the measurements from a new test image and tell which known person is the closest match. A Python-based application is developed to recognize faces under all these conditions.
Keywords—Histograms of Oriented Gradients, Deep Learning, Classifiers
I. INTRODUCTION
Low-cost sensors are available and several new techniques have been employed for face recognition, but efficiency and accuracy still need to be improved manifold. Central to the success of face recognition are the feature representation and the classification method. In this paper we focus on both techniques. Although face recognition errors have decreased over the past decades, and many new systems and organizations have adopted facial recognition for their work-related purposes, these systems have been shown to be sensitive to lighting, expression, occlusion, and aging, which increase error rates and deteriorate their performance in recognizing people.
Histograms of Oriented Gradients (HOGs) are image descriptors robust to 2D rotation, occlusion, and extreme lighting conditions. HOG descriptors have been successfully applied to face recognition. Deep learning techniques have been employed to distinguish people by training classifiers. By combining the two, we design a powerful face recognition system that works even in situations once considered unsuitable for employing facial recognition.
The rest of the paper is organized as follows.
Section II describes the related works which are
carried out in the related fields. Section III describes
the HOG, Deep learning and Classification. Section
IV describes the System Architecture. Section V
outlines how the system is implemented. Section VI
describes the conclusion and future work.
II. RELATED WORK
Vahid Kazemi et al. [1] researched one-millisecond face alignment with an ensemble of regression trees, which can locate the 68 landmarks on any face within milliseconds. They used two baselines for landmark selection: random feature selection and correlation-based feature selection. The training time of this method grows linearly with the number of training images used as input.
Mohsen Ghorbani et al. [3] built a robust face recognition system using HOG and LBP. In their work, errors in facial feature detection due to occlusions, pose, and illumination changes are rectified by extracting HOG descriptors from a regular grid. The fusion of HOG descriptors at different scales with LBP captures the structure important for face recognition.
III. METHODOLOGY
HOG relies on the idea that local object appearance and shape can often be characterized rather well by the distribution of local intensity gradients, even without precise knowledge of the corresponding gradient or edge positions. It is implemented by dividing the image window into small spatial regions and, for each region, accumulating a local histogram of gradient directions over the pixels of the region. All of these histogram entries are combined to form the representation.
There are numerous advantages to using the HOG feature. It captures gradient structure that is very characteristic of local shape, with a useful degree of invariance to local geometric and photometric transformations. The end result is that the original image is turned into a very simple representation that captures the basic structure of a face. The measurements that look so clear and distinguishing to us humans do not really make sense to a machine looking at individual pixels in an image.
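The per-cell histogram step described above can be sketched in a few lines of NumPy. This is a minimal illustration with assumed cell size and bin count, not the full Dalal-Triggs descriptor (no block normalization or gradient interpolation):

```python
import numpy as np

def hog_cell_histogram(image, cell_size=8, n_bins=9):
    """Toy HOG sketch: per-cell histograms of gradient orientations,
    weighted by gradient magnitude, concatenated into one vector."""
    image = image.astype(float)
    # Gradients along rows (y) and columns (x) via central differences.
    gy, gx = np.gradient(image)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees, as in standard HOG.
    orientation = np.degrees(np.arctan2(gy, gx)) % 180.0

    h, w = image.shape
    cells_y, cells_x = h // cell_size, w // cell_size
    hist = np.zeros((cells_y, cells_x, n_bins))
    bin_width = 180.0 / n_bins
    for cy in range(cells_y):
        for cx in range(cells_x):
            ys, xs = cy * cell_size, cx * cell_size
            mag = magnitude[ys:ys + cell_size, xs:xs + cell_size]
            ori = orientation[ys:ys + cell_size, xs:xs + cell_size]
            bins = np.minimum((ori // bin_width).astype(int), n_bins - 1)
            for b in range(n_bins):
                hist[cy, cx, b] = mag[bins == b].sum()
    # Concatenate all cell histograms into one feature vector.
    return hist.reshape(-1)

# A vertical step edge: all gradient energy lies in one orientation.
img = np.zeros((16, 16))
img[:, 8:] = 255.0
feature = hog_cell_histogram(img, cell_size=8, n_bins=9)
print(feature.shape)  # 4 cells x 9 bins = (36,)
```

Each cell votes for orientation bins in proportion to gradient magnitude, so the vector summarizes local shape while discarding exact pixel positions.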
Proceedings of the 2nd International Conference on Inventive Communication and Computational Technologies (ICICCT 2018)IEEE Xplore Compliant - Part Number: CFP18BAC-ART; ISBN:978-1-5386-1974-2
978-1-5386-1974-2/18/$31.00 ©2018 IEEE 1947
Research suggests that deep learning techniques can be employed in these cases, where measurements are required by the machine. The idea of reducing complicated raw data, such as a picture or a video, into a list of computer-generated numbers comes up a lot in machine learning (especially in language translation). For faces, we can train a Deep Convolutional Neural Network. Training a convolutional neural network to output face embeddings requires a lot of data and computing power. But once the network has been trained, it can generate measurements for any face, even ones it has never seen before. So all we need to do is run the face images through the trained network to get the 128 measurements for each face.
The last step is to search the database of known people for the person whose measurements are closest to those of our test image.
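Assuming each known person is stored as a 128-dimensional embedding, this closest-match search can be sketched as a nearest-neighbor lookup in Euclidean distance. The names, the random stand-in "embeddings", and the distance threshold below are all illustrative, not from the paper:

```python
import numpy as np

# Hypothetical database: name -> 128-d face embedding. Real embeddings
# would come from the trained network; random vectors stand in here.
rng = np.random.default_rng(0)
database = {name: rng.normal(size=128) for name in ["alice", "bob", "carol"]}

def closest_match(test_embedding, database, threshold=0.6):
    """Return the known person whose embedding is nearest in Euclidean
    distance, or None if even the best match is too far away."""
    best_name, best_dist = None, float("inf")
    for name, emb in database.items():
        dist = np.linalg.norm(emb - test_embedding)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None

# An embedding very close to bob's should match bob.
probe = database["bob"] + 0.01
print(closest_match(probe, database))  # bob
```

The threshold rejects faces that are not close to anyone in the database, so unknown people are not forced onto the nearest known name.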
IV. SYSTEM ARCHITECTURE
Face Recognition is a technology that is
employed in most organizations to identify the person
for security and verification purposes.
A server can be designed to take an image or a video stream as input and recognize faces within seconds. A database is used for authentication and real-time data storage for better performance.
In this paper, we developed a Python-based application that is robust and accurate in recognizing faces.
Fig.1 represents the architecture of the proposed system. It comprises several steps: face detection, landmark identification, extraction of faces, and finally matching the faces against the database.
Fig.1: Proposed System
At first, the input image, a still image of various persons, is subjected to face detection. Then the 68 landmarks of each detected face are estimated. Faces turned in different directions look different from a computer's perspective even when they belong to the same person; using these landmarks, such faces can be easily matched. Finally, the detected faces are directly compared, using deep learning, with the known faces that have already been trained and placed in our database.
A. HOG
HOG is one of the best techniques for analyzing the parameters that determine a face. One of the greatest features of recent cameras is face detection, which can automatically focus on faces in an image. Paul Viola and Michael Jones invented a method in 2001 that was fast enough to detect faces in real time, but more reliable solutions like HOG exist now. To detect the faces in an image we start by converting it to grayscale, because color data is not required to find faces.
We proceed by looking at every single pixel of the image together with the pixels that directly surround it. Our goal is to figure out how dark the current pixel is compared to its immediate neighbors. We then draw an arrow in the direction in which the image is getting darker. These arrows are called gradients. We end up replacing every pixel with a gradient.
But saving this detail for every pixel is not necessary. It is enough to know the basic flow of lightness and darkness, so that we get a basic pattern of the image. To accomplish this, we break the entire image up into small squares of 16x16 pixels each and replace the gradients in each square with the single gradient direction that was strongest across that square.
If we analyzed pixels directly, really dark and really light images of the same face would have totally different pixel values. But by considering only the direction in which brightness changes, both dark and light images end up with the exact same representation.
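This brightness invariance is easy to verify numerically: scaling an image by a constant changes every pixel value but leaves the gradient directions untouched. A small sketch with a synthetic image:

```python
import numpy as np

# The same scene at two exposure levels: pixel values differ by a
# constant factor, but the direction of brightness change does not.
rng = np.random.default_rng(1)
bright = rng.uniform(10.0, 255.0, size=(8, 8))
dark = bright * 0.25  # a much darker copy of the same image

def gradient_directions(image):
    gy, gx = np.gradient(image)
    return np.arctan2(gy, gx)

# Pixel values are totally different...
print(np.allclose(bright, dark))  # False
# ...but the gradient directions are identical, because scaling the
# image scales gx and gy equally and arctan2 only sees their ratio.
print(np.allclose(gradient_directions(bright),
                  gradient_directions(dark)))  # True
```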
B. Face Landmark Estimation
After isolating the faces in an image, the remaining problem is that faces turned in different directions look different to a computer. To account for the fact that differently turned faces may still belong to the same person, we use the face landmark estimation algorithm proposed by Vahid Kazemi and Josephine Sullivan [1]. The algorithm helps in establishing that faces turned towards
different directions may belong to the same person, even though they look different from a computer's perspective. The basic idea is to define 68 specific points on a face and train a machine learning algorithm to find these 68 landmarks in any image. After applying this algorithm, no matter how a face is turned, we are able to center the eyes and mouth.
Fig.2 represents the trained image: the given image, turned in different directions, is processed to identify the landmarks and to center the eyes and mouth.
Fig.2: HOG image
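One common way to center the eyes (a sketch, not necessarily the paper's exact procedure) is to compute a similarity transform that maps the two detected eye landmarks onto fixed canonical positions in the output crop. The canonical coordinates below are assumed for illustration:

```python
import numpy as np

def align_by_eyes(left_eye, right_eye, out_size=96):
    """Return a 2x2 rotation-plus-scale matrix and a translation that
    map the detected eye centers onto fixed canonical positions, so
    every aligned face has its eyes in the same place."""
    left_eye = np.asarray(left_eye, float)
    right_eye = np.asarray(right_eye, float)
    # Canonical eye locations in the output crop (assumed values).
    dst_left = np.array([0.3 * out_size, 0.35 * out_size])
    dst_right = np.array([0.7 * out_size, 0.35 * out_size])

    src_vec = right_eye - left_eye
    dst_vec = dst_right - dst_left
    scale = np.linalg.norm(dst_vec) / np.linalg.norm(src_vec)
    angle = np.arctan2(dst_vec[1], dst_vec[0]) - np.arctan2(src_vec[1], src_vec[0])
    c, s = scale * np.cos(angle), scale * np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    translation = dst_left - rot @ left_eye
    return rot, translation

# A face tilted 45 degrees: after the transform, both eyes land on
# the canonical horizontal line.
rot, t = align_by_eyes(left_eye=(10, 10), right_eye=(30, 30))
mapped_right = rot @ np.array([30.0, 30.0]) + t
print(np.round(mapped_right, 3))  # [67.2 33.6], i.e. the canonical right eye
```

In practice the transform would be applied to the whole image (e.g. as a warp), not just to the landmark points.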
C. Deep Learning
The easiest approach to face recognition is to directly compare the unknown faces with the known faces that have already been trained and placed in our database. We train a Deep Convolutional Neural Network to generate 128 measurements for a face. All we then need to do is run our face images through the network to get the 128 measurements for each face. We can train on any number of images, in different lighting conditions and positions, to make these 128 measurements more accurate and easier to match against unknown faces.
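One plausible reading of "more accurate" is that noise from lighting and pose averages out across many images of the same person. The sketch below uses synthetic embeddings standing in for network outputs to illustrate the effect; nothing here is from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

# A person's hypothetical "true" embedding, plus per-image noise
# standing in for lighting and pose variation.
true_embedding = rng.normal(size=128)
observations = [true_embedding + 0.3 * rng.normal(size=128) for _ in range(20)]

single = observations[0]
averaged = np.mean(observations, axis=0)

# Averaging over many images brings the estimate closer to the truth.
err_single = np.linalg.norm(single - true_embedding)
err_averaged = np.linalg.norm(averaged - true_embedding)
print(err_averaged < err_single)  # True
```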
D. Recognizing Faces
This is the easiest step in the entire process. All we have to do is find the person in our database of known people who has the closest measurements to our unknown face. This can be done with a machine learning algorithm.
We train a classifier that can take in the measurements from a new test image and tell which known person is the closest match. The output of the classifier is the name of a person.
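As a sketch of this step, the tiny classifier below learns one centroid per person from labeled 128-d embeddings and outputs the name whose centroid is closest. It is a simpler stand-in for a real classifier such as the SVM used in the implementation; all names and data are synthetic:

```python
import numpy as np

class NearestCentroidFaces:
    """Toy classifier: one centroid per person, prediction by nearest
    centroid in Euclidean distance."""

    def fit(self, embeddings, names):
        embeddings = np.asarray(embeddings)
        self.centroids = {}
        for name in set(names):
            rows = [e for e, n in zip(embeddings, names) if n == name]
            self.centroids[name] = np.mean(rows, axis=0)
        return self

    def predict(self, embedding):
        # Return the name whose centroid is closest to the embedding.
        return min(self.centroids,
                   key=lambda n: np.linalg.norm(self.centroids[n] - embedding))

# Synthetic training data: five noisy embeddings per person.
rng = np.random.default_rng(3)
alice, bob = rng.normal(size=128), rng.normal(size=128)
train_x = ([alice + 0.1 * rng.normal(size=128) for _ in range(5)]
           + [bob + 0.1 * rng.normal(size=128) for _ in range(5)])
train_y = ["alice"] * 5 + ["bob"] * 5

clf = NearestCentroidFaces().fit(train_x, train_y)
print(clf.predict(alice + 0.1 * rng.normal(size=128)))  # alice
```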
Fig.3 describes how the trained image is matched with an image present in the database. After checking the database, the system returns whether the particular image matches an image in the database or not.
Fig.3: Match the faces
V. IMPLEMENTATION
A Python application is developed for recognizing the faces of persons that pass by the system, or of images fed into the system by the admin. The user interface is designed so that the admin can decide whether a user may pass through the system or not, and this decision can be automated.
A. Face Detection
We detect faces using the HOG frontal face detector from dlib, an open-source library used for face detection.
Fig.4: Detection of faces
Fig.4 represents the detection of faces in a given sample input image using the HOG detector. Only the faces present in the image are detected and used for further processing.
B. Face Landmark Estimation
We implement the face landmark estimation algorithm to establish the landmarks of a face. The 68 landmarks are estimated mainly to identify images that are turned in different directions. The eyes, nose, and mouth of the detected face are identified even when the same person's image is turned in different directions.
Fig.5: Landmarks discovery
In Fig.5, the landmarks of the detected image are identified. Images of the same person posed at different angles can be subjected to landmark identification. This results in correctly matching the image in the database by avoiding the errors that occur in facial feature detection.
C. Machine Learning
We implement a deep convolutional neural network to train on the images and store the 128 measurements using OpenFace. OpenFace is a Python and Torch implementation of face recognition with deep neural networks.
Fig.6 represents the training of the images. This approach starts with a training set of labeled facial landmarks: images manually annotated with the (x, y)-coordinates of the regions surrounding each facial structure. From these, the probability distribution over distances between pairs of input pixels can be determined.
Fig.6: Training of the images
D. Face Recognition
We recognize a face by training a basic SVM classifier to tell which person is the closest match. The classifier takes in the measurements from a sample image and outputs the known person who is the closest match. This approach recognizes faces with good performance and accuracy.
VI. CONCLUSION AND FUTURE WORK
A robust face recognition system was built using Python for security and verification purposes; it can recognize faces independent of the prevailing conditions. The accuracy of face recognition can be improved by increasing the number of images used during training. Person identification using HOG techniques exhibits promising results.
In future, the same recognition system can be extended with a facial expression recognizer and text-to-audio features, which would be mainly useful for the visually challenged to identify people in an organization or other places.
REFERENCES
[1] Vahid Kazemi and Josephine Sullivan, "One Millisecond Face Alignment with an Ensemble of Regression Trees," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
[2] S. Happy and A. Routray, "Automatic Facial Expression Recognition Using Features of Salient Facial Patches," IEEE Transactions on Affective Computing, 2015.
[3] Mohsen Ghorbani, Alireza Tavakoli and Mohammed Mahdi Dehshibi, "HOG and LBP: Towards a Robust Face Recognition System," Tenth International Conference on Digital Information Management (ICDIM), 2015.
[4] Pranav Kumar, S. L. Happy and Aurobindo Routray, "A Real-time Robust Facial Expression Recognition System Using HOG Features," International Conference on Computing, Analytics and Security Trends, 2016.
[5] Michael Owajyan, Roger Achkar and Moussa Iskandar, "Face Detection with Expression Recognition Using Artificial Neural Networks," Middle East Conference on Biomedical Engineering, IEEE, 2016.
[6] Y. Sun, X. Wang and X. Tang, "Deep Convolutional Network Cascade for Facial Point Detection," IEEE Conference on Computer Vision and Pattern Recognition, 2013.
[7] A. Albiol, D. Monzo, A. Martin, J. Sastre and A. Albiol, "Face Recognition Using HOG-EBGM," 2008.
[8] Aniwat Juhong and C. Pintavirooj, "Face Recognition Based on Facial Landmark Detection," BMEiCON, 2017.
[9] Rajesh K M and Naveenkumar M, "An Adaptive-Profile Modified Active Shape Model for Automatic Landmark Annotation Using OpenCV," International Journal of Engineering Research in Electronic and Communication Engineering (IJERECE), Vol. 3, Issue 5, pp. 18-21, May 2016.
[10] Jan Erik Solem, "Programming Computer Vision with Python," First Edition, ISBN 13: 978-93-5023-766-3, July 2012.