Modern Face Recognition with Deep Learning
Jothi Thilaga P (Assistant Professor, Department of CSE); Arshath Khan B, Jones A A, Krishna Kumar N (IV year B.E., CSE)
Ramco Institute of Technology, Rajapalayam, India
[email protected], [email protected]
Abstract - Facial recognition systems are commonly used for verification and security purposes, but their accuracy is still being improved. Errors in facial feature detection due to occlusions, pose, and illumination changes can be compensated for by the use of HOG descriptors. The most reliable way to measure a face is by employing deep learning techniques. The final step is to train a classifier that can take in the measurements from a new test image and tell which known person is the closest match. A Python-based application is developed to recognize faces under all these conditions.
Keywords—Histograms of Oriented Gradients, Deep Learning, Classifiers
I. INTRODUCTION
Low-cost sensors are available and several new techniques have been employed for face recognition, but efficiency and accuracy still need to be improved manifold. Central to the success of face recognition are the feature representation and the classification method. In this paper we focus on both techniques. Although face recognition errors have decreased over the past decades, and many new systems and organizations have adopted facial recognition for their work-related purposes, these systems have been shown to be sensitive to lighting, expression, occlusion, and aging, which increase error rates and deteriorate their performance in recognizing people.
Histograms of Oriented Gradients (HOGs) are image descriptors robust to 2D rotation, occlusion, and extreme lighting conditions. HOG descriptors have been successfully applied to face recognition. Deep learning techniques have been employed to distinguish people by training classifiers. By combining the two, we design a powerful face recognition system that works even in situations once considered unsuitable for employing facial recognition.
The rest of the paper is organized as follows.
Section II describes the related works which are
carried out in the related fields. Section III describes
the HOG, Deep learning and Classification. Section
IV describes the System Architecture. Section V
outlines how the system is implemented. Section VI
describes the conclusion and future work.
II. RELATED WORK
Vahid Kazemi et al. [1] researched one-millisecond face alignment with an ensemble of regression trees, which can locate the 68 landmarks on any face within milliseconds. They used two baselines for landmark selection: random feature selection and correlation-based feature selection. The training time of this method grows linearly with the number of training images used as input.
Mohsen Ghorbani et al. [3] built a robust face recognition system using HOG and LBP. In their work, errors in facial feature detection due to occlusions, pose, and illumination changes are rectified by extracting HOG descriptors from a regular grid. The fusion of HOG descriptors at different scales with LBP captures the structure important for face recognition.
III. METHODOLOGY
HOG relies on the idea that local object appearance and shape can often be characterized rather well by the distribution of local intensity gradients, even without precise knowledge of the corresponding gradient or edge positions. It is implemented by dividing the image window into small spatial regions and, for each region, accumulating a local histogram of gradient directions over the pixels of the region. All of these histogram entries are combined to form the representation.
There are numerous advantages to using the HOG feature. It captures gradient structure that is very characteristic of local shape, with a useful degree of invariance to local geometric and photometric transformations. The end result is that the original image is turned into a very simple representation that captures the basic structure of a face. The measurements that look so clear and distinguishing to us humans do not really make sense to a machine looking at individual pixels in an image.
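The per-cell histogram step described above can be sketched in a few lines of NumPy. This is a minimal illustration with assumed cell size and bin count, not the full Dalal-Triggs descriptor (no block normalization or gradient interpolation):

```python
import numpy as np

def hog_cell_histogram(image, cell_size=8, n_bins=9):
    """Toy HOG sketch: per-cell histograms of gradient orientations,
    weighted by gradient magnitude, concatenated into one vector."""
    image = image.astype(float)
    # Gradients along rows (y) and columns (x) via central differences.
    gy, gx = np.gradient(image)
    magnitude = np.hypot(gx, gy)
    # Unsigned orientation in [0, 180) degrees, as in standard HOG.
    orientation = np.degrees(np.arctan2(gy, gx)) % 180.0

    h, w = image.shape
    cells_y, cells_x = h // cell_size, w // cell_size
    hist = np.zeros((cells_y, cells_x, n_bins))
    bin_width = 180.0 / n_bins
    for cy in range(cells_y):
        for cx in range(cells_x):
            ys, xs = cy * cell_size, cx * cell_size
            mag = magnitude[ys:ys + cell_size, xs:xs + cell_size]
            ori = orientation[ys:ys + cell_size, xs:xs + cell_size]
            bins = np.minimum((ori // bin_width).astype(int), n_bins - 1)
            for b in range(n_bins):
                hist[cy, cx, b] = mag[bins == b].sum()
    # Concatenate all cell histograms into one feature vector.
    return hist.reshape(-1)

# A vertical step edge: all gradient energy lies in one orientation.
img = np.zeros((16, 16))
img[:, 8:] = 255.0
feature = hog_cell_histogram(img, cell_size=8, n_bins=9)
print(feature.shape)  # 4 cells x 9 bins = (36,)
```

Each cell votes for orientation bins in proportion to gradient magnitude, so the vector summarizes local shape while discarding exact pixel positions.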
Proceedings of the 2nd International Conference on Inventive Communication and Computational Technologies (ICICCT 2018)IEEE Xplore Compliant - Part Number: CFP18BAC-ART; ISBN:978-1-5386-1974-2
978-1-5386-1974-2/18/$31.00 ©2018 IEEE 1947
Research suggests that deep learning techniques can be employed in these cases, where measurements are required by the machine. The idea of reducing complicated raw data, such as a picture or a video, into a list of computer-generated numbers comes up a lot in machine learning (especially in language translation). For faces, we can train a Deep Convolutional Neural Network. Training a convolutional neural network to output face embeddings requires a lot of data and computing power. But once the network has been trained, it can generate measurements for any face, even ones it has never seen before. So all we need to do is run the face images through the trained network to get the 128 measurements for each face.
The last step is to search the database of known people for the person whose measurements are closest to those of our test image.
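Assuming each known person is stored as a 128-dimensional embedding, this closest-match search can be sketched as a nearest-neighbor lookup in Euclidean distance. The names, the random stand-in "embeddings", and the distance threshold below are all illustrative, not from the paper:

```python
import numpy as np

# Hypothetical database: name -> 128-d face embedding. Real embeddings
# would come from the trained network; random vectors stand in here.
rng = np.random.default_rng(0)
database = {name: rng.normal(size=128) for name in ["alice", "bob", "carol"]}

def closest_match(test_embedding, database, threshold=0.6):
    """Return the known person whose embedding is nearest in Euclidean
    distance, or None if even the best match is too far away."""
    best_name, best_dist = None, float("inf")
    for name, emb in database.items():
        dist = np.linalg.norm(emb - test_embedding)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= threshold else None

# An embedding very close to bob's should match bob.
probe = database["bob"] + 0.01
print(closest_match(probe, database))  # bob
```

The threshold rejects faces that are not close to anyone in the database, so unknown people are not forced onto the nearest known name.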
IV. SYSTEM ARCHITECTURE
Face Recognition is a technology that is
employed in most organizations to identify the person
for security and verification purposes.
A server can be designed to take an image or a video stream as input and recognize faces within seconds. A database is used for authentication and real-time data storage for better performance.
In this paper, we developed a Python-based application that is robust and accurate in recognizing faces.
Fig.1 represents the architecture of the proposed system. It comprises several steps: face detection, landmark identification, extraction of faces, and finally matching the faces against the database.
Fig.1: Proposed System
At first, the input image, a still image of various persons, is subjected to face detection. Then the 68 landmarks of each detected face are estimated. Faces turned in different directions look different from a computer's perspective even when they belong to the same person; using these landmarks, such faces can be easily matched. Finally, the detected faces are directly compared, using deep learning, with the known faces that have already been trained and placed in our database.
A. HOG
HOG is one of the best techniques for analyzing the parameters that determine a face. One of the greatest features of recent cameras is face detection, which can automatically focus on faces in an image. Paul Viola and Michael Jones invented a method in 2001 that was fast enough to detect faces in real time, but more reliable solutions like HOG exist now. To detect the faces in an image we start by converting it to grayscale, because color data is not required to find faces.
We proceed by looking at every single pixel of the image together with the pixels that directly surround it. Our goal is to figure out how dark the current pixel is compared to its immediate neighbors. We then draw an arrow in the direction in which the image is getting darker. These arrows are called gradients. We end up replacing every pixel with a gradient.
But saving this detail for every pixel is not necessary. It is enough to know the basic flow of lightness and darkness, so that we get a basic pattern of the image. To accomplish this, we break the entire image up into small squares of 16x16 pixels each and replace the gradients in each square with the single gradient direction that was strongest across that square.
If we analyzed pixels directly, really dark and really light images of the same face would have totally different pixel values. But by considering only the direction in which brightness changes, both dark and light images end up with the exact same representation.
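This brightness invariance is easy to verify numerically: scaling an image by a constant changes every pixel value but leaves the gradient directions untouched. A small sketch with a synthetic image:

```python
import numpy as np

# The same scene at two exposure levels: pixel values differ by a
# constant factor, but the direction of brightness change does not.
rng = np.random.default_rng(1)
bright = rng.uniform(10.0, 255.0, size=(8, 8))
dark = bright * 0.25  # a much darker copy of the same image

def gradient_directions(image):
    gy, gx = np.gradient(image)
    return np.arctan2(gy, gx)

# Pixel values are totally different...
print(np.allclose(bright, dark))  # False
# ...but the gradient directions are identical, because scaling the
# image scales gx and gy equally and arctan2 only sees their ratio.
print(np.allclose(gradient_directions(bright),
                  gradient_directions(dark)))  # True
```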
B. Face Landmark Estimation
After isolating the faces in an image, the remaining problem is that faces turned in different directions look different to a computer. To account for the fact that differently turned faces may still belong to the same person, we use the face landmark estimation algorithm proposed by Vahid Kazemi and Josephine Sullivan [1]. The algorithm helps in establishing that faces turned towards
different directions may belong to the same person, even though they look different from a computer's perspective. The basic idea is to define 68 specific points on a face and train a machine learning algorithm to find these 68 landmarks in any image. After applying this algorithm, no matter how a face is turned, we are able to center the eyes and mouth.
Fig.2 represents the trained image: the given image, turned in different directions, is processed to identify the landmarks and to center the eyes and mouth.
Fig.2: HOG image
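One common way to center the eyes (a sketch, not necessarily the paper's exact procedure) is to compute a similarity transform that maps the two detected eye landmarks onto fixed canonical positions in the output crop. The canonical coordinates below are assumed for illustration:

```python
import numpy as np

def align_by_eyes(left_eye, right_eye, out_size=96):
    """Return a 2x2 rotation-plus-scale matrix and a translation that
    map the detected eye centers onto fixed canonical positions, so
    every aligned face has its eyes in the same place."""
    left_eye = np.asarray(left_eye, float)
    right_eye = np.asarray(right_eye, float)
    # Canonical eye locations in the output crop (assumed values).
    dst_left = np.array([0.3 * out_size, 0.35 * out_size])
    dst_right = np.array([0.7 * out_size, 0.35 * out_size])

    src_vec = right_eye - left_eye
    dst_vec = dst_right - dst_left
    scale = np.linalg.norm(dst_vec) / np.linalg.norm(src_vec)
    angle = np.arctan2(dst_vec[1], dst_vec[0]) - np.arctan2(src_vec[1], src_vec[0])
    c, s = scale * np.cos(angle), scale * np.sin(angle)
    rot = np.array([[c, -s], [s, c]])
    translation = dst_left - rot @ left_eye
    return rot, translation

# A face tilted 45 degrees: after the transform, both eyes land on
# the canonical horizontal line.
rot, t = align_by_eyes(left_eye=(10, 10), right_eye=(30, 30))
mapped_right = rot @ np.array([30.0, 30.0]) + t
print(np.round(mapped_right, 3))  # [67.2 33.6], i.e. the canonical right eye
```

In practice the transform would be applied to the whole image (e.g. as a warp), not just to the landmark points.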
C. Deep Learning
The easiest approach to face recognition is to directly compare the unknown faces with the known faces that have already been trained and placed in our database. We train a Deep Convolutional Neural Network to generate 128 measurements for a face. All we then need to do is run our face images through the network to get the 128 measurements for each face. We can train on any number of images, in different lighting conditions and positions, to make these 128 measurements more accurate and easier to match against unknown faces.
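One plausible reading of "more accurate" is that noise from lighting and pose averages out across many images of the same person. The sketch below uses synthetic embeddings standing in for network outputs to illustrate the effect; nothing here is from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

# A person's hypothetical "true" embedding, plus per-image noise
# standing in for lighting and pose variation.
true_embedding = rng.normal(size=128)
observations = [true_embedding + 0.3 * rng.normal(size=128) for _ in range(20)]

single = observations[0]
averaged = np.mean(observations, axis=0)

# Averaging over many images brings the estimate closer to the truth.
err_single = np.linalg.norm(single - true_embedding)
err_averaged = np.linalg.norm(averaged - true_embedding)
print(err_averaged < err_single)  # True
```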
D. Recognizing Faces
This is the easiest step in the entire process. All we have to do is find the person in our database of known people who has the closest measurements to our unknown face. This can be done with a machine learning algorithm.
We train a classifier that can take in the measurements from a new test image and tell which known person is the closest match. The output of the classifier is the name of a person.
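As a sketch of this step, the tiny classifier below learns one centroid per person from labeled 128-d embeddings and outputs the name whose centroid is closest. It is a simpler stand-in for a real classifier such as the SVM used in the implementation; all names and data are synthetic:

```python
import numpy as np

class NearestCentroidFaces:
    """Toy classifier: one centroid per person, prediction by nearest
    centroid in Euclidean distance."""

    def fit(self, embeddings, names):
        embeddings = np.asarray(embeddings)
        self.centroids = {}
        for name in set(names):
            rows = [e for e, n in zip(embeddings, names) if n == name]
            self.centroids[name] = np.mean(rows, axis=0)
        return self

    def predict(self, embedding):
        # Return the name whose centroid is closest to the embedding.
        return min(self.centroids,
                   key=lambda n: np.linalg.norm(self.centroids[n] - embedding))

# Synthetic training data: five noisy embeddings per person.
rng = np.random.default_rng(3)
alice, bob = rng.normal(size=128), rng.normal(size=128)
train_x = ([alice + 0.1 * rng.normal(size=128) for _ in range(5)]
           + [bob + 0.1 * rng.normal(size=128) for _ in range(5)])
train_y = ["alice"] * 5 + ["bob"] * 5

clf = NearestCentroidFaces().fit(train_x, train_y)
print(clf.predict(alice + 0.1 * rng.normal(size=128)))  # alice
```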
Fig.3 describes how the trained image is matched with an image present in the database. After checking the database, the system returns whether the particular image matches an image in the database or not.
Fig.3: Match the faces
V. IMPLEMENTATION
A Python application is developed for recognizing the faces of persons that pass by the system, or of images fed into the system by the admin. The user interface is designed so that the admin can decide whether a user may pass through the system or not, and this decision can be automated.
A. Face Detection
We detect faces using the HOG frontal face detector from dlib, an open-source library used for face detection.
Fig.4: Detection of faces
Fig.4 represents the detection of faces in a given sample input image using the HOG detector. Only the faces present in the image are detected and used for further processing.
B. Face Landmark Estimation
We implement the face landmark estimation algorithm to establish the landmarks of a face. The 68 landmarks are estimated mainly to identify images that are turned in different directions. The eyes, nose, and mouth of the detected face are identified even when the same person's image is turned in different directions.
Fig.5: Landmarks discovery
In Fig.5, the landmarks of the detected image are identified. Images of the same person posed at different angles can be subjected to landmark identification. This results in correctly matching the image in the database by avoiding the errors that occur in facial feature detection.
C. Machine Learning
We implement a deep convolutional neural network to train on the images and store the 128 measurements using OpenFace. OpenFace is a Python and Torch implementation of face recognition with deep neural networks.
Fig.6 represents the training of the images. This approach starts with a training set of labeled facial landmarks: images manually annotated with the (x, y)-coordinates of the regions surrounding each facial structure. From these, the probability distribution over distances between pairs of input pixels can be determined.
Fig.6: Training of the images
D. Face Recognition
We recognize a face by training a basic SVM classifier to tell which person is the closest match. The classifier takes in the measurements from a sample image and outputs the known person who is the closest match. This approach recognizes faces with good performance and accuracy.
VI. CONCLUSION AND FUTURE WORK
A robust face recognition system was built using Python for security and verification purposes; it can recognize faces independent of the prevailing conditions. The accuracy of face recognition can be improved by increasing the number of images used during training. Person identification using HOG techniques exhibits promising results.
In future, the same recognition system can be extended with a facial expression recognizer and text-to-audio features, which would be mainly useful for the visually challenged to identify people in an organization or other places.
REFERENCES
[1] Vahid Kazemi and Josephine Sullivan, "One Millisecond Face Alignment with an Ensemble of Regression Trees," IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.
[2] S. Happy and A. Routray, "Automatic Facial Expression Recognition Using Features of Salient Facial Patches," IEEE Transactions on Affective Computing, 2015.
[3] Mohsen Ghorbani, Alireza Tavakoli and Mohammed Mahdi Dehshibi, "HOG and LBP: Towards a Robust Face Recognition System," Tenth International Conference on Digital Information Management (ICDIM), 2015.
[4] Pranav Kumar, S. L. Happy and Aurobindo Routray, "A Real-time Robust Facial Expression Recognition System Using HOG Features," International Conference on Computing, Analytics and Security Trends, 2016.
[5] Michael Owajyan, Roger Achkar and Moussa Iskandar, "Face Detection with Expression Recognition Using Artificial Neural Networks," Middle East Conference on Biomedical Engineering, IEEE, 2016.
[6] Y. Sun, X. Wang and X. Tang, "Deep Convolutional Network Cascade for Facial Point Detection," IEEE Conference on Computer Vision and Pattern Recognition, 2013.
[7] A. Albiol, D. Monzo, A. Martin, J. Sastre and A. Albiol, "Face Recognition Using HOG-EBGM," 2008.
[8] Aniwat Juhong and C. Pintavirooj, "Face Recognition Based on Facial Landmark Detection," BMEiCON, 2017.
[9] Rajesh K M and Naveenkumar M, "An Adaptive-Profile Modified Active Shape Model for Automatic Landmark Annotation Using OpenCV," International Journal of Engineering Research in Electronic and Communication Engineering (IJERECE), Vol. 3, Issue 5, pp. 18-21, May 2016.
[10] Jan Erik Solem, "Programming Computer Vision with Python," First Edition, ISBN 13: 978-93-5023-766-3, July 2012.