Face Detection System


1. INTRODUCTION

1.1 Outline of a Typical Face Detection System:

1.1.1. The acquisition module:

This is the entry point of the face recognition process: it is the module where the face image under consideration is presented to the system. An acquisition module can obtain a face image from several different sources: the face image can be an image file located on a magnetic disk, it can be captured by a frame grabber and camera, or it can be scanned from paper with the help of a scanner.

1.1.2. The pre-processing module:

In this module, face images are normalized by means of early vision techniques and, if desired, enhanced to improve the recognition performance of the system. Some or all of these pre-processing steps may be implemented in a given face recognition system.

1.1.3. The feature extraction module:

After any necessary pre-processing, the normalized face image is presented to the feature extraction module in order to find the key features that will be used for classification. In other words, this module is responsible for composing a feature vector rich enough to represent the face image.


1.1.4. The classification module:

In this module, with the help of a pattern classifier, the extracted features of the face image are compared with the ones stored in a face library (or face database). Based on this comparison, the face image is classified as either known or unknown.

Principal component analysis (PCA), based on information-theory concepts, seeks a computational model that best describes a face by extracting the most relevant information contained in that face. The eigenfaces approach is a principal component analysis method in which a small set of characteristic pictures is used to describe the variation between face images. The goal is to find the eigenvectors (eigenfaces) of the covariance matrix of the distribution spanned by a training set of face images. Every face image is then represented by a linear combination of these eigenvectors.

Computing these eigenvectors directly is quite difficult for typical image sizes, but an approximation suitable for practical purposes is also presented. Recognition is performed by projecting a new image into the subspace spanned by the eigenfaces and then classifying the face by comparing its position in face space with the positions of known individuals.

The eigenfaces approach seems to be an adequate method for face recognition due to its simplicity, speed and learning capability. Experimental results are given to demonstrate the viability of the proposed face detection method.
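To make the eigenface computation and the projection step concrete, here is a minimal MATLAB sketch using the small-matrix trick mentioned above; the random training matrix and its dimensions are stand-ins for illustration, not the report's data.

% Minimal eigenfaces sketch. X is a stand-in training matrix whose
% columns are vectorized face images (here: random data, 40 "faces").
X = 255*rand(10304, 40);                   % hypothetical 112x92 images
meanFace = mean(X, 2);
A = X - repmat(meanFace, 1, size(X, 2));   % remove the mean face
L = A'*A;                                  % small MxM surrogate for the covariance
[V, D] = eig(L);                           % eigenvectors of A'A
U = A*V;                                   % eigenvectors of AA' (the eigenfaces)
for i = 1:size(U, 2)
    U(:, i) = U(:, i)/norm(U(:, i));       % normalize each eigenface
end
w = U'*A(:, 1);                            % weights of the first face in face space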


1.2 Definition:

Face detection is concerned with finding whether or not there are any faces in a given image (usually in gray scale) and, if present, returning the image location and content of each face. This is the first step of any fully automatic system that analyzes the information contained in faces (e.g., identity, gender, expression, age, race and pose). While earlier work dealt mainly with upright frontal faces, several systems have been developed that are able to detect faces fairly accurately with in-plane or out-of-plane rotations in real time. Although a face detection module is typically designed to deal with single images, its performance can be further improved if a video stream is available.

The advances of computing technology have facilitated the development of real-time vision modules that interact with humans in recent years. Examples abound, particularly in biometrics and human-computer interaction, as the information contained in faces needs to be analyzed for systems to react accordingly. For biometric systems that use faces as non-intrusive input modules, it is imperative to locate faces in a scene before any recognition algorithm can be applied. An intelligent vision-based user interface should be able to tell the attention focus of the user (i.e., where the user is looking) in order to respond accordingly. To detect facial features accurately for applications such as digital cosmetics, faces need to be located and registered first to facilitate further processing. It is evident that face detection plays an important and critical role in the success of any face processing system.

The face detection problem is challenging, as it needs to account for all possible appearance variations caused by changes in illumination, facial features, occlusions, etc. In addition, it has to detect faces that appear at different scales and poses and with in-plane rotations. In spite of all these difficulties, tremendous progress has been made in the last decade, and many systems have shown impressive real-time performance. The recent advances of these algorithms have also made significant contributions to detecting other objects such as humans/pedestrians and cars.

Operation of a Face Detection System:

Most detection systems carry out the task by extracting certain properties (e.g., local features or holistic intensity patterns) of a set of training images acquired at a fixed pose (e.g., upright frontal pose) in an off-line setting. To reduce the effects of illumination change, these images are processed with histogram equalization [3, 1] or standardization (i.e., zero mean, unit variance) [2]. Based on the extracted properties, these systems typically scan through the entire image at every possible location and scale in order to locate faces. The extracted properties can be either manually coded (with human knowledge) or learned from a set of data, as adopted in the recent systems that have demonstrated impressive results [3, 1, 4, 5, 2].
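As a rough illustration, the sketch below applies both of these normalizations to a grayscale image in MATLAB (histeq is in the Image Processing Toolbox); the file name is hypothetical.

% Two common illumination normalizations for training patches.
I = imread('face.bmp');                  % hypothetical grayscale input
Ieq = histeq(I);                         % histogram equalization
Id = double(I);
Istd = (Id - mean(Id(:)))/std(Id(:));    % zero mean, unit variance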

In order to detect faces at different scales, the detection process is usually repeated over a pyramid of images whose resolutions are reduced by a certain factor (e.g., 1.2) from the original one [3, 1]. Such procedures may be expedited when other visual cues (e.g., color and motion) can be accurately incorporated as pre-processing steps to reduce the search space. As faces are often detected across scales, the raw detected faces are usually further processed to combine overlapping results and remove false positives, either with heuristics (e.g., faces typically do not overlap in images) or with further processing (e.g., edge detection and intensity variance).
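A minimal MATLAB sketch of such a pyramid scan follows, assuming a hypothetical detect function, a 24x24 base window and an RGB input; the stride and file name are illustrative.

% Multi-scale scan over an image pyramid (factor 1.2 per level).
img = double(rgb2gray(imread('scene.jpg')));   % hypothetical input image
scale = 1;
while min(size(img)) >= 24
    for r = 1:4:size(img,1)-23                 % coarse 4-pixel stride
        for c = 1:4:size(img,2)-23
            win = img(r:r+23, c:c+23);
            % if detect(win) ... record [r c], scaled back up by 'scale'
        end
    end
    img = imresize(img, 1/1.2);                % shrink for the next level
    scale = scale*1.2;
end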

Numerous representations have been proposed for face detection, including pixel-based [3, 1, 5], parts-based [6, 4, 7], local edge features [8, 9], Haar wavelets [10, 4] and Haar-like features [2, 11]. While earlier holistic representation schemes are able to detect faces [3, 1, 5], the recent systems with Haar-like features [2, 12, 13] have demonstrated impressive empirical results in detecting faces under occlusion. A large and representative training set of face images is essential for the success of learning-based face detectors. From the set of collected data, more positive examples can be synthetically generated by perturbing, mirroring, rotating and scaling the original face images [3, 1]. On the other hand, it is relatively easier to collect negative examples by randomly sampling images that contain no faces [3, 1].


As face detection can be mainly formulated as a pattern recognition problem, numerous algorithms have been proposed to learn generic templates (e.g., eigenfaces and statistical distributions) or discriminant classifiers (e.g., neural networks, Fisher linear discriminant, sparse network of Winnows, decision trees, Bayes classifiers, support vector machines, and AdaBoost).

Typically, a good face detection system needs to be trained with several iterations. One common method to further improve the system is to bootstrap a trained face detector with test sets, and re-train the system with the resulting false positives and false negatives. This process is repeated several times in order to further improve the performance of the face detector. A survey on these topics can be found in the literature, and the most recent advances are discussed in the next section.

1.3 Recent Advances:

The AdaBoost-based face detector by Viola and Jones demonstrated that faces can be detected fairly reliably in real time (i.e., more than 15 frames per second on 320 by 240 images with desktop computers) under partial occlusion. While Haar wavelets had been used earlier for representing faces and pedestrians, they proposed the use of Haar-like features, which can be computed efficiently with an integral image. Figure 1 shows four types of Haar-like features that are used to encode the horizontal, vertical and diagonal intensity information of face images at different positions and scales. Given a sample image of 24 by 24 pixels, the exhaustive set of parameterized Haar-like features (at different positions and scales) is very large (about 160,000). Contrary to most prior algorithms that use one single strong classifier (e.g., neural networks or support vector machines), they used an ensemble of weak classifiers, each constructed by thresholding one Haar-like feature. The weak classifiers are selected and weighted using the AdaBoost algorithm. As there is a large number of weak classifiers, they presented a method to rank these classifiers into several cascades using a set of optimization criteria.
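The integral image that makes these features cheap to evaluate can be built with two cumulative sums. Below is a small MATLAB sketch that computes one and evaluates a single horizontal two-rectangle feature; the patch file name and rectangle coordinates are illustrative.

% Integral image: any rectangle sum needs only four array references.
I = double(imread('face.bmp'));               % hypothetical 24x24 patch
ii = zeros(size(I) + 1);                      % pad so (r-1, c-1) indexing is clean
ii(2:end, 2:end) = cumsum(cumsum(I, 1), 2);
rectsum = @(r1,c1,r2,c2) ii(r2+1,c2+1) - ii(r1,c2+1) - ii(r2+1,c1) + ii(r1,c1);
% Horizontal two-rectangle Haar-like feature: left block minus right block
f = rectsum(5,5,16,10) - rectsum(5,11,16,16);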


Within each stage, an ensemble of several weak classifiers is trained using the AdaBoost algorithm. The motivation behind the cascade of classifiers is that simple classifiers at the early stages can filter out most negative examples efficiently, and stronger classifiers at the later stages are only necessary to deal with instances that look like faces. The final detector, a 38-layer cascade of classifiers with 6,060 Haar-like features, demonstrated impressive real-time performance with a fairly high detection rate and a low false positive rate. Several extensions to detect faces in multiple views with in-plane rotation have since been proposed. An implementation of the AdaBoost-based face detector can be found in the Intel OpenCV library.

Despite the excellent run-time performance of the boosted cascade classifier, the training time of such a system is rather lengthy. In addition, the classifier cascade is an example of a degenerate decision tree with an unbalanced data set (i.e., a small set of positive examples and a huge set of negative ones). Numerous algorithms have been proposed to address these issues and extended to detect faces in multiple views. To handle the asymmetry between the positive and negative data sets, Viola and Jones proposed the asymmetric AdaBoost algorithm, which keeps most of the weights on the positive examples.

The AdaBoost algorithm is used to select a specified number of weak classifiers with the lowest error rates for each cascade, and the process is repeated until a set of optimization criteria (i.e., the number of stages, the number of features of each stage, and the detection/false positive rates) is satisfied. As each weak classifier is made of one single Haar-like feature, the process within each stage can be considered a feature selection problem. Instead of repeating the feature selection process at each stage, Wu et al. presented a greedy algorithm for determining the set of features for all stages before training the cascade classifier. With the greedy feature selection algorithm used as a pre-computing procedure, they reported that the training time of the classifier cascade with AdaBoost is reduced by 50 to 100 times. For learning in each stage (or node) within the classifier cascade, they also exploited the asymmetry between positive and negative data using a linear classifier, under the assumption that the classes can be modeled with Gaussian distributions. The merits and drawbacks of the proposed linear asymmetric classifier, as well as the classic Fisher linear discriminant, were also examined in their work. Recently, Pham and Cham proposed an online algorithm that learns asymmetric boosted classifiers with a significant gain in training time. An algorithm that aims to automatically determine the number of classifiers and stages for constructing a boosted ensemble has also been proposed. While a greedy optimization algorithm was employed in that work, Brubaker et al. proposed an algorithm for determining the number of weak classifiers and training each node classifier of a cascade by selecting operating points on a receiver operating characteristic (ROC) curve. They solved the optimization problem using linear programs that maximize the detection rate while satisfying constraints on the false positive rate.

Although the original four types of Haar-like features are sufficient to encode upright frontal face images, other types of features are essential to represent more complex patterns (e.g., faces in different poses). Most systems take a divide-and-conquer strategy in which a face detector is constructed for each fixed pose, thereby covering a wide range of angles (e.g., yaw and pitch). A test image is either sent to all detectors for evaluation, or to a decision module with a coarse pose estimator for selecting the appropriate trees for further processing. The ensuing problems are how the types of features are constructed, and how the most important ones are selected from a large feature space. More generalized Haar-like features have been defined in which the rectangular image regions are not necessarily adjacent and, furthermore, the number of such rectangular blocks is randomly varied. Several greedy algorithms have been proposed to select features efficiently by exploiting the statistics of features before training boosted cascade classifiers.

There are also other fast face detection methods that demonstrate promising results, including the component-based face detector using naive Bayes classifiers, face detectors using support vector machines, the Antiface method, which consists of a series of detectors trained with positive images only, and the energy-based method that simultaneously detects faces and estimates their pose in real time.

1.4 Quantifying Performance:

There are numerous metrics to gauge the performance of face detection systems, ranging from the detection frame rate, false positive/negative rates, number of classifiers, number of features, number of training images, training time and accuracy to memory requirements. In addition, the reported performance also depends on the definition of a “correct” detection result. Figure 2 shows the effects of detection results versus different criteria, and more discussion can be found in the literature.

The most commonly adopted method is to plot the ROC curve using the de facto standard MIT+CMU data set, which contains frontal face images. Another data set from CMU contains images with faces that vary in pose from frontal to side view. It has been noticed that although today's face detection methods have impressive real-time performance, there is still much room for improvement in terms of accuracy. The detected faces returned by state-of-the-art algorithms are often a few pixels off the “accurate” locations, which is significant as face images are usually standardized to 21 by 21 pixels. While such results are trade-offs between speed, robustness and accuracy, they inevitably degrade the performance of any biometric application that uses the contents of detected faces. Several post-processing algorithms have been proposed to better locate faces and extract facial features (when the image resolution of the detected faces is sufficiently high).


1.5 Applications

As face detection is the first step of any face processing system, it finds numerous applications in face recognition, face tracking, facial expression recognition, facial feature extraction, gender classification, clustering, attentive user interfaces, digital cosmetics and biometric systems, to name a few. In addition, most face detection algorithms can be extended to recognize other objects such as cars, humans, pedestrians and signs.

2. MATLAB


2.1 MATLAB deals with:

1. Basic flow control and the programming language

2. How to write scripts (main programs) with MATLAB

3. How to write functions with MATLAB

4. How to use the debugger

5. How to use the graphical interface

6. Examples of useful scripts and functions for image processing

After learning about MATLAB, we will be able to use it as a tool to help us with mathematics, electronics, signal and image processing, statistics, neural networks, control and automation.

2.2 MATLAB resources:

Language: a high-level matrix/vector language with
  scripts and main programs
  functions
  flow statements (for, while)
  control statements (if, else)
  data structures (struct, cells)
  input/output (read, write, save)
  object-oriented programming

Environment:
  command window
  editor
  debugger
  profiler (to evaluate performance)

Mathematical libraries:
  a vast collection of functions

API:
  call C functions from MATLAB
  call MATLAB functions from C

Scripts and main programs:

In MATLAB, scripts are the equivalent of main programs. The variables declared in a script are visible in the workspace and can be saved. Scripts can therefore take up a lot of memory if you are not careful, especially when dealing with images. To create a script, start the editor, write your code and run it.
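For example, a minimal script (saved under a hypothetical name such as firstplot.m) might look like this; after it runs, x and y remain visible in the workspace:

% firstplot.m -- a minimal example script
x = 0:0.1:2*pi;
y = sin(x);
plot(x, y)
title('A first MATLAB script')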

2.3 MATLAB Functions:

imread: Read images from graphics files.

Syntax:

A = imread(filename,fmt)

[X,map] = imread(filename,fmt)

[...] = imread(filename)

[...] = imread(...,idx) (TIFF only)

[...] = imread(...,ref) (HDF only)

[...] = imread(...,'BackgroundColor',BG) (PNG only)

2.4 Description:


A = imread(filename,fmt) reads a grayscale or truecolor image named filename into A. If the file contains a grayscale intensity image, A is a two-dimensional array. If the file contains a truecolor (RGB) image, A is a three-dimensional array.

[X,map] = imread(filename,fmt) reads the indexed image in filename into X and its associated colormap into map. The colormap values are rescaled to the range [0,1]. X and map are two-dimensional arrays.

[...] = imread(filename) attempts to infer the format of the file from its content. filename is a string that specifies the name of the graphics file, and fmt is a string that specifies the format of the file. If the file is not in the current directory or in a directory on the MATLAB path, specify the full pathname for a location on your system. If imread cannot find a file named filename, it looks for a file named filename.fmt. If you do not specify a string for fmt, the toolbox tries to discern the format of the file by checking the file header.

Format           File type
'bmp'            Windows Bitmap (BMP)
'hdf'            Hierarchical Data Format (HDF)
'jpg' or 'jpeg'  Joint Photographic Experts Group (JPEG)
'pcx'            Windows Paintbrush (PCX)
'png'            Portable Network Graphics (PNG)
'tif' or 'tiff'  Tagged Image File Format (TIFF)
'xwd'            X Windows Dump (XWD)

Table 2.1: Possible values for fmt.

Special Case Syntax:


TIFF-Specific Syntax:

[...] = imread(...,idx) reads in one image from a multi-image TIFF file. idx is an integer value that specifies the order in which the image appears in the file. For example, if idx is 3, imread reads the third image in the file. If you omit this argument, imread reads the first image in the file.

2.5 PNG-Specific Syntax:

The discussion in this section is only relevant to PNG files that contain transparent pixels. A PNG file does not necessarily contain transparency data. Transparent pixels, when they exist, are identified by one of two components: a transparency chunk or an alpha channel. The transparency chunk identifies which pixel values will be treated as transparent; e.g., if the value in the transparency chunk of an 8-bit image is 0.5020, all pixels in the image with the color 0.5020 can be displayed as transparent. An alpha channel is an array with the same number of pixels as are in the image, which indicates the transparency status of each corresponding pixel in the image (transparent or nontransparent).

Another potential PNG component related to transparency is the background color chunk, which (if present) defines a color value that can be used behind all transparent pixels. This section identifies the default behavior of the toolbox for reading PNG images that contain either a transparency chunk or an alpha channel, and describes how you can override it.


Case 1. You do not ask to output the alpha channel and do not specify a background color to use. For example,

[a,map] = imread(filename);
a = imread(filename);

If the PNG file contains a background color chunk, the transparent pixels are composited against the specified background color. If the file does not contain a background color chunk, the transparent pixels are composited against 0 for grayscale (black), 1 for indexed (the first color in the map), or [0 0 0] for RGB (black).

Case 2. You do not ask to output the alpha channel but you specify the background color parameter in your call. For example,

[...] = imread(...,'BackgroundColor',bg);

The transparent pixels are composited against the specified color. The form of bg depends on whether the file contains an indexed, intensity (grayscale), or RGB image. If the input image is indexed, bg should be an integer in the range [1,P], where P is the colormap length. If the input image is intensity, bg should be a value in the range [0,1]. If the input image is RGB, bg should be a 3-element vector whose values are in the range [0,1].

There is one exception to the toolbox's behavior of using your background color: if you set the background to 'none', no compositing is performed. For example,

[...] = imread(...,'Back','none');

Case 3. You ask to get the alpha channel as an output variable. For example,

[a,map,alpha] = imread(filename);
[a,map,alpha] = imread(filename,fmt);

No compositing is performed; the alpha channel is stored separately from the image (not merged into the image as in cases 1 and 2). This form of imread returns the alpha channel if one is present, and also returns the image and any associated colormap. If there is no alpha channel, alpha returns []. If there is no colormap, or the image is grayscale or truecolor, map may be empty.

2.6 HDF-Specific Syntax:

[...] = imread(...,ref) reads in one image from a multi-image HDF file. ref is an integer value that specifies the reference number used to identify the image. For example, if ref is 12, imread reads the image whose reference number is 12. (Note that in an HDF file the reference numbers do not necessarily correspond to the order of the images in the file. You can use imfinfo to match up image order with reference number.) If you omit this argument, imread reads the first image in the file.

Format  Variants
BMP     1-bit, 4-bit, 8-bit, and 24-bit uncompressed images; 4-bit and 8-bit run-length encoded (RLE) images
HDF     8-bit raster image datasets, with or without associated colormap; 24-bit raster image datasets
JPEG    Any baseline JPEG image (8-bit or 24-bit); JPEG images with some commonly used extensions
PCX     1-bit, 8-bit, and 24-bit images
PNG     Any PNG image, including 1-bit, 2-bit, 4-bit, 8-bit, and 16-bit grayscale images; 8-bit and 16-bit indexed images; 24-bit and 48-bit RGB images
TIFF    Any baseline TIFF image, including 1-bit, 8-bit, and 24-bit uncompressed images; 1-bit, 8-bit, 16-bit, and 24-bit images with packbits compression; 1-bit images with CCITT compression; also 16-bit grayscale, 16-bit indexed, and 48-bit RGB images
XWD     1-bit and 8-bit ZPixmaps; XYBitmaps; 1-bit XYPixmaps

Table 2.2: Types of images that imread can read.


Examples:

This example reads the sixth image in a TIFF file:

[X,map] = imread('flowers.tif',6);

This example reads the fourth image in an HDF file.

info = imfinfo('skull.hdf');

[X,map] = imread('skull.hdf',info(4).Reference);

This example reads a 24-bit PNG image and sets any of its fully transparent (alpha channel) pixels to red.

bg = [255 0 0];

A = imread('image.png','BackgroundColor',bg);

This example returns the alpha channel (if any) of a PNG image.

[A,map,alpha] = imread('image.png');

imshow: Display image

Syntax

imshow(I)

imshow(I,[low high])

imshow(RGB)

imshow(BW)

imshow(X,map)

imshow(filename)

himage = imshow(...)

imshow(..., param1, val1, param2, val2,...)


2.7 Description:

imshow(I) displays the grayscale image I.

imshow(I,[low high]) displays the grayscale image I, specifying the display range for I in [low high]. The value low (and any value less than low) displays as black; the value high (and any value greater than high) displays as white. Values in between are displayed as intermediate shades of gray, using the default number of gray levels. If you use an empty matrix ([]) for [low high], imshow uses [min(I(:)) max(I(:))]; that is, the minimum value in I is displayed as black, and the maximum value is displayed as white.

imshow(RGB) displays the truecolor image RGB.

imshow(BW) displays the binary image BW. imshow displays pixels with the value 0 (zero) as black and pixels with the value 1 as white.

imshow(X,map) displays the indexed image X with the colormap map. A colormap matrix may have any number of rows, but it must have exactly 3 columns. Each row is interpreted as a color, with the first element specifying the intensity of red light, the second green, and the third blue. Color intensity can be specified on the interval 0.0 to 1.0.

imshow(filename) displays the image stored in the graphics file filename. The file must contain an image that can be read by imread or dicomread. imshow calls imread or dicomread to read the image from the file, but does not store the image data in the MATLAB workspace. If the file contains multiple images, the first one is displayed. The file must be in the current directory or on the MATLAB path.


2.8 Remarks:

imshow is the toolbox's fundamental image display function, optimizing figure, axes, and image object property settings for image display. imtool provides all the image display capabilities of imshow but also provides access to several other tools for navigating and exploring images, such as the Pixel Region tool, the Image Information tool, and the Adjust Contrast tool. imtool presents an integrated environment for displaying images and performing some common image processing tasks.

Examples:

Display an image from a file:

X = imread('moon.tif');
imshow(X)


3. DEFINITIONAL ENTRIES

3.1 AdaBoost:

AdaBoost (short for Adaptive Boosting) is a machine learning algorithm formulated by Freund and Schapire that learns a strong classifier by combining an ensemble of weak (moderately accurate) classifiers with weights. The discrete AdaBoost algorithm was originally developed for classification using the exponential loss function and is an instance of the boosting family.
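To make the weight update concrete, here is a toy MATLAB sketch of the discrete AdaBoost loop using one-dimensional decision stumps; the data, candidate thresholds and number of rounds are made up for illustration.

% Toy discrete AdaBoost with decision stumps on 1-D data (illustrative).
X = [1 2 3 4 5 6]';  y = [1 1 1 -1 -1 -1]';   % toy samples, labels in {-1,+1}
thr = 0.5:1:6.5;  T = 3;
N = numel(y);  w = ones(N,1)/N;               % uniform initial sample weights
alpha = zeros(T,1);  pick = zeros(T,1);
for t = 1:T
    err = zeros(size(thr));
    for j = 1:numel(thr)                      % weighted error of each stump
        err(j) = sum(w .* ((2*(X < thr(j)) - 1) ~= y));
    end
    [e, j] = min(err);  pick(t) = thr(j);     % best weak classifier this round
    alpha(t) = 0.5*log((1 - e)/max(e, eps));  % its vote in the strong classifier
    ht = 2*(X < pick(t)) - 1;
    w = w .* exp(-alpha(t)*y.*ht);            % boost the weight of mistakes
    w = w/sum(w);
end
% Strong classifier: sign of sum over t of alpha(t)*h_t(x)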

3.2 Haar-like features:

Just as Haar wavelets were developed as basis functions to encode signals, the objective of two-dimensional Haar features is to collect local oriented intensity differences at different scales for representing image patterns. This representation transforms an image from pixel space to the space of wavelet coefficients with an over-complete dictionary of features. Such features have been used to represent face and pedestrian images. The Haar-like features, similar to Haar wavelets, compute local oriented intensity differences using rectangular blocks (rather than pixels), which can be computed efficiently with the integral image.

3.3 ROC curve:

An ROC (receiver operating characteristic) curve is a plot commonly used in machine learning and data mining for exhibiting the performance of a classifier under different criteria. The y-axis is the true positive rate and the x-axis is the false positive rate (i.e., false alarms). A point on the ROC curve shows the trade-off between the achieved true positive detection rate and the accepted false positive rate.
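The sketch below computes ROC points in MATLAB by sorting hypothetical detector scores and sweeping the decision threshold; the scores and labels are made up for illustration.

% ROC points by sorting scores and sweeping the threshold.
scores = [0.9 0.8 0.7 0.6 0.4 0.3];     % hypothetical detector outputs
labels = [1 1 0 1 0 0];                 % 1 = face, 0 = non-face
[s, order] = sort(scores, 'descend');
l = labels(order);
tpr = cumsum(l)/sum(l);                 % true positive rate (y-axis)
fpr = cumsum(1 - l)/sum(1 - l);         % false positive rate (x-axis)
plot([0 fpr], [0 tpr]);
xlabel('false positive rate'); ylabel('true positive rate');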


3.4 Classifier cascade:

In face detection, a classifier cascade is a degenerate decision tree where each node (decision stump) consists of a binary classifier. In [2], each node is a boosted classifier consisting of several weak classifiers. These boosted classifiers are constructed so that the ones near the root can be computed very efficiently at a very high detection rate with an acceptable false positive rate.

Typically, most patches in a test image can be classified as faces/non-faces using the simple classifiers near the root, and relatively few difficult ones need to be analyzed by nodes at deeper depths. With this cascade structure, the total computation of examining all scanned image patches can be reduced significantly.
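A hedged sketch of how such a cascade might be evaluated on a single image patch is given below (saved as a hypothetical runCascade.m); stageFun and stageThr stand in for the boosted stage classifiers and their thresholds, and are not the actual structure used in [2].

function isFace = runCascade(patch, stageFun, stageThr)
% Evaluate a cascade: every stage must accept the patch.
isFace = false;
for k = 1:numel(stageFun)
    if stageFun{k}(patch) < stageThr(k)
        return;                  % rejected early; most non-faces exit here
    end
end
isFace = true;                   % survived all stages: report a face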

Fig. 3.1: (a) Face images. (b) Non-face images.

Fig. 1: Four types of Haar-like features. These features appear at different positions and scales. The Haar-like features are computed as the difference of the dark and light regions. They can be considered as features that collect local edge information at different orientations and scales. The set of Haar-like features is large, and only a small number of them are learned from positive and negative examples for face detection.


Fig. 3.2: (a) Test image. (b) Detection results.

Fig. 3.2 shows that detection results depend heavily on the adopted criteria. Suppose all the sub-images in (b) are returned as face patterns by a detector. A loose criterion may declare all of them “successful” detections, while a stricter one may count some of them as non-faces.

gabor.m

This script contains the Gabor equation and is used to generate a Gabor filter based on some parameters.

create_gabor.m


This script uses gabor.m to generate forty 32x32 Gabor filters and save them in a cell array matrix called “G” and in a file named “gabor.mat”. This script will be invoked only once, unless we delete “gabor.mat”.

Fig: 3.3 Gabor filters in the time domain
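For illustration, one 32x32 complex Gabor kernel of the kind create_gabor.m might generate could be built as follows; the orientation, frequency and width values are assumptions, not the script's actual parameters.

% One 32x32 Gabor kernel: a Gaussian envelope times a complex sinusoid.
[x, y] = meshgrid(-15.5:15.5, -15.5:15.5);
theta = pi/4;  f0 = 0.2;  sigma = 5;           % hypothetical parameters
xr = x*cos(theta) + y*sin(theta);              % rotated coordinate for the carrier
g = exp(-(x.^2 + y.^2)/(2*sigma^2)) .* exp(1i*2*pi*f0*xr);
imshow(real(g), []);                           % view the real part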

main.m

The main menu and the only file you need to run the program.

createffnn.m

This function creates a feed-forward neural network with one hundred neurons in the hidden layer and one neuron in the output layer. The network is saved in “net.mat” for further use. To learn more about how to customize a neural network, see MATLAB Help > Neural Network Toolbox > Advanced Topics.
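A hedged sketch of what createffnn.m might do with the Neural Network Toolbox of that era follows; newff was the current API in MATLAB 7, and the input dimension and transfer functions here are assumptions.

% Feed-forward net: 100 hidden neurons, 1 output neuron.
P = rand(6480, 5);                       % hypothetical training inputs (one column per image)
net = newff(minmax(P), [100 1], {'tansig','logsig'});
save net net                             % store the untrained network in net.mat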


loadimages.m

This function prepares images for the training phase. All data from both the “face” and “non-face” folders are gathered in a large cell array. Each column represents the features of an image, which may or may not be a face. The rows are as follows:

Row 1: File name
Row 2: Desired output of the network corresponding to the feature vector
Row 3: Prepared vector for the training phase

This script also saves the database to a file named “imgdb.mat”, so we do not need to create the database more than once, unless we add or delete photos to/from the “face” and “non-face” folders. Every time we do this, after recreating the database, we should initialize and train the network again. This script uses “im2vec.m” to extract features from images and vectorize them for the database.
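A sketch of this three-row layout is shown below; the file names and targets are hypothetical, and im2vec is the project's feature extractor (described next).

% Build a small image database in the described three-row layout.
files = {'face1.bmp', 'nonface1.bmp'};       % hypothetical file names
targets = [1 0];                             % desired network outputs
imgdb = cell(3, numel(files));
for k = 1:numel(files)
    imgdb{1, k} = files{k};                  % Row 1: file name
    imgdb{2, k} = targets(k);                % Row 2: desired output
    imgdb{3, k} = im2vec(imread(files{k}));  % Row 3: feature vector
end
save imgdb imgdb                             % reuse without rebuilding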


im2vec.m

This function takes a 27x18 image and adjusts its histogram for better contrast. Then the image is convolved with the Gabor filters by multiplying the image by the Gabor filters in the frequency domain. The Gabor filters are stored in “gabor.mat”; to save time, they have been saved in the frequency domain beforehand. The results of the convolution of the image with each of the forty Gabor filters are concatenated to form a big 135x144 matrix of complex numbers. We only need the magnitude of the result; that is why “abs” is used.

A 135x144 matrix has 19,440 entries, which means the input vector of the network would have 19,440 values, a large amount of computation. So we reduce the matrix to one-third of its original size by deleting some rows and columns. Deleting is not the best way, but it saves more time compared to other methods like PCA. We should optimize this function as much as we can.

Fig: 3.4
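The frequency-domain convolution step could look roughly like this sketch; the patch file name and the padded FFT size are assumptions, and G is the filter bank that create_gabor.m stores in gabor.mat.

% Convolve a patch with the stored Gabor bank via pointwise spectral products.
load gabor                               % loads cell array G (40 filters, frequency domain)
I = imread('patch.bmp');                 % hypothetical 27x18 grayscale patch
I = double(histeq(I));                   % adjust contrast, then work in double
F = fft2(I, 32, 32);                     % zero-padded FFT to match the 32x32 filters
feat = cell(1, numel(G));
for k = 1:numel(G)
    feat{k} = abs(ifft2(F .* G{k}));     % product of spectra = convolution; keep magnitude
end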

Trainnet.m

This function trains the neural network and returns the trained network.


imscan.m

First Section:

Fig: 3.5

Second Section:

In this section the algorithm checks all potential face-containing windows and the windows around them using the neural network. The result is the output of the neural network for the checked regions.

Fig: 3.6


Third Section:

1- Filtering the above pattern for values above a threshold (xy_)

2- Dilating the pattern with a disk structuring element (xy_)

3- Finding the center of each region

4- Drawing a rectangle for each point

5- Final result


4. SOURCE CODE

4.1 How to run the program:

1- Copy all files and directories to MATLAB's work folder. (You may also create a folder there to avoid conflicts with other programs.)

2- Find the file named “main.m”.

3- Double-click on the file or type “main” in the command window. The first time it runs, the program will create three files automatically:

gabor.mat: contains a cell array matrix called “G”. Forty Gabor filters are stored in “G” in the frequency domain, each with a resolution of 32x32.

net.mat: the feed-forward neural network structure.

imgdb.mat: all images which are going to be used in training.

4- A menu will be shown. Click on “Train Network” and wait until the program trains your neural network.

5- Click on “Test on Photos”. A dialog box will appear. Select a .jpg photo. “Im1.jpg” is a small image which is good for a first run of the program. Your selected photo will be shown on the screen. You can maximize the window if you want.

6- Wait until the program detects some faces. During this phase you should see some activity on the selected photo.

4.2 Requirements:

1- MATLAB 7.0 or later

2- Image Processing Toolbox

3- Neural Network Toolbox


4.3 MATLAB Program:

% Face recognition by Santiago Serrano

clear all

close all

clc

% number of images on your training set.

M=40;

% Chosen std and mean.

% They can be any numbers close to the std and mean of most of the images.

um=100;

ustd=80;

% read and show image

S=[]; % img matrix

figure(1);

for i=1:M

str=strcat(int2str(i),'.bmp'); % builds the file name of the i-th training image

img=imread(str); % eval is unnecessary here

subplot(ceil(sqrt(M)),ceil(sqrt(M)),i)

imshow(img)

if i==3

title('Training set','fontsize',18)

end

drawnow;


[irow icol]=size(img); % get the number of rows (N1) and columns (N2)

temp=reshape(img',irow*icol,1); % creates a (N1*N2)x1 vector

S=[S temp]; % S is a N1*N2xM matrix after finishing the sequence

end

% Here we change the mean and std of all images. We normalize all images.

% This is done to reduce the error due to lighting conditions and background.

for i=1:size(S,2)

temp=double(S(:,i));

m=mean(temp);

st=std(temp);

S(:,i)=(temp-m)*ustd/st+um;

end

% show normalized images

figure(2);

for i=1:M

str=strcat(int2str(i),'.jpg');

img=reshape(S(:,i),icol,irow);

img=img';

imwrite(uint8(img),str); % cast to uint8 so imwrite writes 0-255 values correctly

subplot(ceil(sqrt(M)),ceil(sqrt(M)),i)

imshow(img)

drawnow;

if i==3

title('Normalized Training Set','fontsize',18)

end

end

% mean image

m=mean(S,2); % obtains the mean of each row instead of each column


tmimg=uint8(m); % converts to unsigned 8-bit integer. Values range from 0 to 255

img=reshape(tmimg,icol,irow); % takes the N1*N2x1 vector and creates an N1xN2 matrix

img=img';

figure(3);

imshow(img);

title('Mean Image','fontsize',18)

% Change image for manipulation

dbx=[]; % A matrix

for i=1:M

temp=double(S(:,i));

dbx=[dbx temp];

end

%Covariance matrix C=A'A, L=AA'

A=dbx';

L=A*A';

% vv are the eigenvectors of L

% dd are the eigenvalues of both L=dbx'*dbx and C=dbx*dbx';

[vv dd]=eig(L);

% Sort and eliminate those whose eigenvalue is zero

v=[];

d=[];

for i=1:size(vv,2)

if(dd(i,i)>1e-4)

v=[v vv(:,i)];

d=[d dd(i,i)];

end


end

%sort, will return an ascending sequence

[B index]=sort(d);

ind=zeros(size(index));

dtemp=zeros(size(index));

vtemp=zeros(size(v));

len=length(index);

for i=1:len

dtemp(i)=B(len+1-i);

ind(i)=len+1-index(i);

vtemp(:,ind(i))=v(:,i);

end

d=dtemp;

v=vtemp;

%Normalization of eigenvectors

for i=1:size(v,2) %access each column

kk=v(:,i);

temp=sqrt(sum(kk.^2));

v(:,i)=v(:,i)./temp;

end

%Eigenvectors of C matrix

u=[];

for i=1:size(v,2)

temp=sqrt(d(i));

u=[u (dbx*v(:,i))./temp];

end


%Normalization of eigenvectors

for i=1:size(u,2)

kk=u(:,i);

temp=sqrt(sum(kk.^2));

u(:,i)=u(:,i)./temp;

end

% show eigenfaces

figure(4);

for i=1:size(u,2)

img=reshape(u(:,i),icol,irow);

img=img';

img=histeq(img,255);

subplot(ceil(sqrt(M)),ceil(sqrt(M)),i)

imshow(img)

drawnow;

if i==3

title('Eigenfaces','fontsize',18)

end

end

% Find the weight of each face in the training set

omega = [];

for h=1:size(dbx,2)

WW=[];

for i=1:size(u,2)

t = u(:,i)';


WeightOfImage = dot(t,dbx(:,h)');

WW = [WW; WeightOfImage];

end

omega = [omega WW];

end

% Acquire new image

% Note: the input image must have a bmp or jpg extension.

% It should have the same size as the ones in your training set.

% It should be placed on your desktop

%InputImage = input('Please enter the name of the image and its extension \n','s');

InputImage = imread('1.bmp');

%InputImage = imread(strcat('D:\Documents and Settings\user\Desktop\face recognition\',InputImage));

figure(5)

subplot(1,2,1)

imshow(InputImage); colormap('gray');title('Input image','fontsize',18)

InImage=reshape(double(InputImage)',irow*icol,1);

temp=InImage;

me=mean(temp);

st=std(temp);

temp=(temp-me)*ustd/st+um;

NormImage = temp;

Difference = temp-m;

p = [];

aa=size(u,2);


for i = 1:aa

pare = dot(NormImage,u(:,i));

p = [p; pare];

end

ReshapedImage = m + u(:,1:aa)*p; %m is the mean image, u is the eigenvector

ReshapedImage = reshape(ReshapedImage,icol,irow);

ReshapedImage = ReshapedImage';

%show the reconstructed image.

subplot(1,2,2)

imagesc(ReshapedImage); colormap('gray');

title('Reconstructed image','fontsize',18)

InImWeight = [];

for i=1:size(u,2)

t = u(:,i)';

WeightOfInputImage = dot(t,Difference');

InImWeight = [InImWeight; WeightOfInputImage];

end

ll = 1:M;

figure(68)

subplot(1,2,1)

stem(ll,InImWeight)

title('Weight of Input Face','fontsize',14)

% Find Euclidean distance

e=[];

for i=1:size(omega,2)

q = omega(:,i);

DiffWeight = InImWeight-q;


mag = norm(DiffWeight);

e = [e mag];

end

kk = 1:size(e,2);

subplot(1,2,2)

stem(kk,e)

title('Euclidean distance of input image','fontsize',14)

MaximumValue=max(e) % maximum Euclidean distance

MinimumValue=min(e) % minimum Euclidean distance

5. FUTURE SCOPE


Face detection is the first step in a face recognition system.

General face recognition steps:

A) Face Detection
B) Face Normalization
C) Face Identification

In the future we are going to do our main project on a “FACE RECOGNITION SYSTEM”.


6. CONCLUSION

As face detection is the first step of any face processing system, it finds numerous applications in face recognition, face tracking, facial expression recognition, facial feature extraction, gender classification, clustering, attentive user interfaces, digital cosmetics and biometric systems, to name a few. In addition, most face detection algorithms can be extended to recognize other objects such as cars, humans, pedestrians and signs.

Face recognition has been successfully implemented using the eigenface approach. The eigenface approach to face recognition has been found to be a robust technique that can be used in security systems.

7. BIBLIOGRAPHY


S.No.  Book Name                                                             Author                                            Year
1.     Authenticated Key Exchange Secure Against Dictionary Attacks         M. Bellare, D. Pointcheval, P. Rogaway            2000
2.     Fingerprint Image Enhancement: Algorithm and Performance Evaluation  L. Hong, Y. Wan, and A. Jain                      1998
3.     A New Two-Server Approach for Authentication with Short Secrets      J. Brainard, A. Juels, B. Kaliski, and M. Szydlo  2003

References on the Web:

www.mathworks.com

http://www.analog.com

http://www.intechopen.com

http://ieeexplore.ieee.org
