CHAPTER 5
GENETIC ALGORITHM (GA) FOR FACIAL BIOMETRIC
SECURITY SYSTEM (BSS)
5.1 PREAMBLE
This chapter presents the genetic algorithm methodology for the facial
biometric security system used in this research work, and gives the
organization of the chapter. Section 5.2 discusses genetic algorithm
operations, including the parameters and steps of the genetic algorithm.
Section 5.3 presents the functions of the genetic algorithm, together with
the algorithm for genetic facial expression, facial communication and
genetic feature selection. Section 5.4 presents the facial expression
recognition framework; the recognition process is divided into five parts,
the first of which is the image capturing system and the collection of
images. The processing routines applied to the images are discussed in
Section 5.5, which also presents the transformation of secure features and
the removal of noise. Section 5.6 covers the main theme of this research,
the extraction of features, including geometric feature extraction and
facial expression behaviour. Section 5.7 discusses the template classifier,
along with the expressional class hierarchy and facial expression
interpretation. Finally, Section 5.8 discusses facial recognition of the
images, together with genetic feature selection and security using
invariant features for recognition rates.
5.2 GENETIC ALGORITHM OPERATIONS
The genetic algorithm is a search heuristic that mimics the process
of natural evolution. This heuristic is routinely used to generate useful
solutions to optimization and search problems. Genetic algorithms belong to
the larger class of Evolutionary Algorithms (EA), which generate solutions to
optimization problems using techniques inspired by natural evolution, such as
inheritance, mutation, selection and crossover. In a genetic algorithm, a
population of strings (called chromosomes or the genotype of the genome),
which encode candidate solutions (called individuals, creatures or
phenotypes) to an optimization problem, evolves toward better solutions.
Traditionally, solutions are represented in binary as strings of 0s and 1s, but
other encodings are also possible. The evolution usually starts from a
population of randomly generated individuals and happens in generations. In
each generation, the fitness of every individual in the population is evaluated,
multiple individuals are stochastically selected from the current population
(based on their fitness), and modified (recombined and possibly randomly
mutated) to form a new population. The new population is then used in the
next iteration of the algorithm. Commonly, the algorithm terminates when
either a maximum number of generations has been produced, or a satisfactory
fitness level has been reached for the population. If the algorithm has
terminated due to a maximum number of generations, a satisfactory solution
may or may not have been reached. The typical genetic algorithm requires a
genetic representation of the solution domain and a fitness function to
evaluate the solution domain.
The genetic algorithm is a model of machine learning which
derives its behaviour from a metaphor of some of the mechanisms of
evolution in nature. This is done by the creation within a machine of a
population of individuals represented by chromosomes, in essence a set of
character strings analogous to the base-4 chromosomes found in DNA. The
individuals represent candidate solutions to the optimization problem being
solved. In genetic algorithms, the individuals are typically represented by
n-bit binary vectors. The resulting search space corresponds to an n-
dimensional boolean space (Oliveira et al. 2002). It is assumed that the quality
of each candidate solution can be evaluated using a fitness function as shown
in Figure 5.1.
Figure 5.1 Cycle of Genetic Algorithm
A standard representation of the solution is as an array of bits.
Arrays of other types and structures can be used in essentially the same way.
The main property that makes these genetic representations convenient is that
their parts are easily aligned due to their fixed size, which facilitates simple
crossover operations. Variable length representations may also be used, but
crossover implementation is more complex in this case. Tree-like
representations are explored in genetic programming and graph-form
representations are explored in evolutionary programming.
The fitness function is defined over the genetic representation and
measures the quality of the represented solution. The fitness function is
always problem dependent. For instance, in the knapsack problem one wants
to maximize the total value of objects that can be put in a knapsack of some
fixed capacity. A representation of a solution might be an array of bits, where
each bit represents a different object, and the value of the bit 0 or 1 represents
whether or not the object is in the knapsack. Not every such representation is
valid, as the size of objects may exceed the capacity of the knapsack. The
fitness of the solution is the sum of values of all objects in the knapsack if the
representation is valid, or 0 otherwise. In some problems it is hard or even
impossible to define the fitness expression; in these cases, interactive
genetic algorithms are used. Once the genetic representation and the fitness
function have been defined, the genetic algorithm proceeds to initialize a
population of solutions randomly and then to improve it through repeated
application of the mutation, crossover, inversion and selection operators.
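The knapsack fitness function just described can be sketched as follows; the item values, weights and capacity below are illustrative assumptions, not data from this work:

```python
# Sketch of the knapsack fitness function: the sum of values of the
# selected objects, or 0 when the capacity is exceeded (invalid solution).
def knapsack_fitness(bits, values, weights, capacity):
    total_value = sum(v for b, v in zip(bits, values) if b)
    total_weight = sum(w for b, w in zip(bits, weights) if b)
    return total_value if total_weight <= capacity else 0

# Illustrative data: four objects, capacity 10.
values = [6, 5, 8, 9]
weights = [2, 3, 6, 7]
print(knapsack_fitness([1, 1, 0, 0], values, weights, capacity=10))  # 11
print(knapsack_fitness([1, 1, 1, 1], values, weights, capacity=10))  # 0
```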
Simple generational genetic algorithm pseudocode:
a. Choose the initial population of individuals
b. Evaluate the fitness of each individual in that population
c. Repeat on this generation until termination: (time limit,
sufficient fitness achieved, etc.)
i. Select the best-fit individuals for reproduction
ii. Breed new individuals through crossover and mutation
operations to give birth to offspring
iii. Evaluate the individual fitness of new individuals
iv. Replace least-fit population with new individuals
5.2.1 Parameters of Genetic Algorithm
There are two basic parameters of the genetic algorithm: the
crossover probability and the mutation probability.
Crossover probability: This determines how often crossover is
performed. If there is no crossover, offspring are exact copies of their
parents. If there is crossover, offspring are made from parts of both
parents' chromosomes. If the crossover probability is 100%, then all
offspring are made by crossover. If it is 0%, the whole new generation is
made from exact copies of chromosomes from the old population (but this
does not mean that the new generation is the same). Crossover is performed
in the hope that new chromosomes will contain good parts of old
chromosomes and will therefore be better. However, it is good to let some
part of the old population survive into the next generation.
Mutation probability: This determines how often parts of a
chromosome are mutated. If there is no mutation, offspring are taken
immediately after crossover (or copied directly) without any change. If
mutation is performed, one or more parts of a chromosome are changed. If
the mutation probability is 100%, the whole chromosome is changed; if it is
0%, nothing is changed. Mutation generally prevents the genetic algorithm
from falling into local extremes. Mutation should not occur very often,
because the GA would then effectively degenerate into random search.
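The two parameters can be sketched as operators over bit-string chromosomes; a minimal sketch, where the probability values are illustrative:

```python
import random

def crossover(parent1, parent2, crossover_prob=0.7):
    """Single-point crossover, applied with the given probability;
    otherwise the offspring are exact copies of the parents."""
    if random.random() < crossover_prob:
        point = random.randint(1, len(parent1) - 1)
        return (parent1[:point] + parent2[point:],
                parent2[:point] + parent1[point:])
    return parent1[:], parent2[:]

def mutate(chromosome, mutation_prob=0.01):
    """Flip each bit independently with a small probability."""
    return [1 - bit if random.random() < mutation_prob else bit
            for bit in chromosome]
```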
The genetic algorithm loops over an iteration process to make the
population evolve. Each iteration consists of the following steps:
1. SELECTION: The first step consists in selecting individuals
for reproduction. This selection is done randomly, with a
probability depending on the relative fitness of the individuals,
so that the best ones are chosen for reproduction more often
than the poor ones.
2. REPRODUCTION: In the second step, offspring are bred by
the selected individuals. For generating new chromosomes,
the algorithm can use both recombination and mutation.
3. EVALUATION: Then the fitness of the new chromosomes is
evaluated.
4. REPLACEMENT: During the last step, individuals from the
old population are killed and replaced by the new ones. The
algorithm is stopped when the population converges toward
the optimal solution.
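The fitness-proportional selection described in the first step is commonly implemented as roulette-wheel selection; a minimal sketch:

```python
import random

def roulette_select(population, fitnesses):
    """Fitness-proportional (roulette-wheel) selection: individuals with
    higher fitness are chosen for reproduction more often than poor ones."""
    total = sum(fitnesses)
    pick = random.uniform(0, total)
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return individual
    return population[-1]  # guard against floating-point rounding
```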
5.2.2 Steps of the Genetic Algorithm
The genetic algorithm is an iterative process, and each iteration is called
a generation. A chromosome length of 6 bits and a population size of 20 are
chosen in this research work. The selected chromosome is an approximate
solution.
The genetic algorithm process is described in the following steps:
1. Represent the problem variable domain as a chromosome of
fixed length and a population, with suitable crossover
probability and mutation probability.
2. Define a fitness function to measure the performance, or
fitness of an individual chromosome in the problem domain.
3. Randomly generate an initial population of chromosomes.
4. Calculate the fitness of each individual chromosome.
5. Select a pair of chromosomes for mating from the current
population. Parent chromosomes are selected with a
probability related to their fitness. Highly fit chromosomes
have a higher probability of being selected for mating
compared to less fit chromosomes.
6. Create a pair of offspring chromosomes by applying the
genetic operators of crossover and mutation.
7. Place the created offspring chromosomes in the new
population.
8. Repeat from step 5 until the size of the new chromosome
population becomes equal to the size of the initial population.
9. Replace the initial chromosome population with the new
population.
10. Go to step 4, and repeat the process until the termination
criterion is satisfied.
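The ten steps above can be sketched end to end, using the chromosome length of 6 bits and population size of 20 chosen in this work; the fitness function (counting 1-bits), the probability values and the fixed generation count are placeholder assumptions for illustration:

```python
import random

CHROM_LEN, POP_SIZE = 6, 20           # values chosen in this research work
CROSSOVER_PROB, MUTATION_PROB = 0.7, 0.05   # illustrative probabilities

def fitness(chrom):                   # placeholder fitness: count of 1-bits
    return sum(chrom)

def select(population):               # step 5: fitness-proportional selection
    total = sum(fitness(c) for c in population) or 1
    pick, running = random.uniform(0, total), 0.0
    for c in population:
        running += fitness(c)
        if running >= pick:
            return c
    return population[-1]

def breed(p1, p2):                    # step 6: crossover then mutation
    if random.random() < CROSSOVER_PROB:
        cut = random.randint(1, CHROM_LEN - 1)
        p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
    mut = lambda c: [1 - b if random.random() < MUTATION_PROB else b for b in c]
    return mut(p1), mut(p2)

# Steps 3-4: random initial population, fitness evaluated on demand.
population = [[random.randint(0, 1) for _ in range(CHROM_LEN)]
              for _ in range(POP_SIZE)]
# Steps 7-10: build a new population each generation until termination.
for generation in range(50):
    new_pop = []
    while len(new_pop) < POP_SIZE:
        new_pop.extend(breed(select(population), select(population)))
    population = new_pop[:POP_SIZE]
best = max(population, key=fitness)
```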
5.3 FUNCTIONS OF GENETIC ALGORITHM
Genetic algorithms are not too hard to program or understand, since
they are biologically based. Thinking in terms of real-life evolution may
help. The general algorithm for a GA is as follows:
Create a Random Initial State: An initial population is created
from a random selection of solutions (which are analogous to chromosomes).
This is unlike the situation for symbolic Artificial Intelligence (AI)
systems, where the initial state of a problem is already given.
Evaluate Fitness: A value for fitness is assigned to each solution
(chromosome) depending on how close it actually is to solving the problem
(thus arriving at the answer of the desired problem). (These 'solutions' are
not to be confused with 'answers' to the problem; think of them as possible
characteristics that the system would employ in order to reach the answer.)
Reproduce (Children Mutate): Those chromosomes with a higher
fitness value are more likely to reproduce offspring (which can mutate after
reproduction). The offspring is a product of the father and mother, whose
composition consists of a combination of genes from them (this process is
known as 'crossing over').
Next Generation: If the new generation contains a solution that
produces an output that is close enough or equal to the desired answer then
the problem has been solved. If this is not the case, then the new generation
will go through the same process as their parents did. This will continue until
a solution is reached.
To attenuate the illumination effect, the three eigenvectors with the
largest eigenvalues are removed, which improves performance. However,
there is no systematic way to determine which eigenvectors should be used.
The proposed genetic algorithm provides an automatic and systematic method
to select the eigenvectors to be used in the facial expression recognition
algorithm.
A number of multiobjective evolutionary algorithms have been
proposed, along with systematic comparisons of the various evolutionary
approaches to multiobjective optimization using carefully chosen test
functions. The idea behind the GA is that a ranking selection method is
used to emphasize good points and a niche method is used to maintain stable
subpopulations of good points. It differs from the simple genetic algorithm
only in the way the selection operator works; crossover and mutation remain
as usual. Before selection is performed, the population is ranked on the
basis of each individual's nondomination. The nondominated individuals
present in the current population are identified first. Then all these
individuals are assumed to constitute the first nondominated front in the
population and are assigned a large dummy fitness value (Oliveira et al.
2003).
The same fitness value is assigned to give an equal reproductive
potential to all these nondominated individuals. In order to maintain the
diversity in the population, these classified individuals are then shared with
their dummy fitness values. Sharing is achieved by performing selection
operation using degraded fitness values obtained by dividing the original
fitness value of an individual by a quantity proportional to the number of
individuals around it. After sharing, these nondominated individuals are
ignored temporarily to process the remaining population in the same way to
identify individuals for the second nondominated front. These new sets of
points are then assigned a new dummy fitness which is kept smaller than the
minimum shared dummy fitness of the previous front. This process is
continued until the entire population is classified into several fronts as shown
in Figure 5.2.
Figure 5.2 Flowchart of the Genetic Algorithm Functions
5.3.1 Algorithm for Genetic Facial Expression
A genetic algorithm facial expression model is developed and
evaluated experimentally. It is found that some small eigenvectors should
also be used as part of the basis for dimension reduction. The genetic
algorithm facial expression model is used to reduce the dimension and is
compared with the traditional Fisher Face method. This section also
presents the results of the various facial expressions using MATLAB. The
genetic algorithm for facial expression is described in the following steps:
BEGIN /* genetic algorithm */
Generate initial population;
Compute fitness of each individual;
WHILE NOT finished DO
BEGIN
Select individuals from the old generation
for mating;
Create offspring by applying
recombination and/or mutation
to the selected individuals;
Compute fitness of the new individuals;
Kill old individuals to make room for
new chromosomes and insert
offspring in the new generation;
IF population has converged
THEN finished := TRUE;
END
END
In the genetic algorithm, each chromosome is a binary string of
size 128, each element being either 0 or 1, since 128 facial expression
features describe an interest point. Let Cij denote the jth component of
chromosome i: Cij = 0 indicates that the jth feature should not be used,
whereas Cij = 1 indicates that it should be used. Initially, 10 interest
points are taken, together with the desired distances between each pair of
them. Let d(p, q) denote the desired distance between any two interest
points p and q. These desired distances are assumed to be 0 for any two
very similar interest points (such as one coming from one eye of a person
and another coming from the other eye in an image of the same person) and
1 for any two dissimilar interest points (the distances are Euclidean
distances normalized to 1). The difference between the desired distance of
two points and the distance calculated using only the features selected by
a chromosome should then approach zero in the ideal case, so the fitness
function is the negative of the difference between the desired distance of
two points and the distance calculated using only the selected features of
the chromosome.
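This fitness can be sketched as follows; the feature vectors and desired distances below are illustrative assumptions rather than data from this work:

```python
import math

def masked_distance(f1, f2, mask):
    """Euclidean distance between two feature vectors, using only the
    features the chromosome selects (mask element = 1)."""
    return math.sqrt(sum((a - b) ** 2
                         for a, b, m in zip(f1, f2, mask) if m))

def chromosome_fitness(mask, pairs):
    """Fitness = minus the total difference between the desired distances
    and the distances computed with only the selected features (higher is
    better; 0 is the ideal). `pairs` holds tuples
    (features_p, features_q, desired_distance)."""
    return -sum(abs(desired - masked_distance(fp, fq, mask))
                for fp, fq, desired in pairs)
```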
The experimental graph implies that the genetic algorithm geometric
face expression model offers two additional advantages: the optimal basis
for dimensionality reduction is derived from the genetic algorithm model,
and the computational efficiency is improved by adding a whitening
procedure after dimension reduction. Experimental results show that an
improvement of almost 30% over Fisher Face can be obtained, and the
results are encouraging.
5.3.2 Facial Communication
A person's face, especially the eyes, creates the most obvious and
immediate cues that lead to the formation of impressions. This section
discusses eyes and facial expressions and the effect they have on
interpersonal communication.
Eye contact is another major aspect of facial communication. Some
have hypothesized that this is rooted in infancy, as humans are one of the
few mammals that maintain regular eye contact with their mother while
nursing. Eye contact serves a variety of purposes: it regulates
conversations, shows interest or involvement, and establishes a connection
with others.
The face as a whole also indicates much about a person's mood.
Specific emotional states, such as happiness or sadness, are expressed
through a smile or a frown respectively. There are seven universally
recognized emotions shown through facial expressions: happiness, sadness,
fear, anger, surprise, disgust and contempt. Regardless of culture, these
expressions are the same. However, while a culture may recognize the same
emotion from a specific facial expression, the same intensity of emotion
may not be perceived.
The facial recognition algorithms identify faces by extracting
landmarks or features from an image of the subject’s face. For example an
algorithm may analyze the relative position, size and / or shape of the eyes,
nose, cheekbones and jaw. These features are then used to search for other
images with matching features. Other algorithms normalize a gallery of face
images and then compress the face data only saving the data in the image that
is useful for face detection. A probe image is then compared with the face
data. The facial expression recognition system developed in this research
adopts two techniques: 2D face recognition and the geometrics of facial
expressions.
5.3.3 Genetic Feature Selection
A facial model database is created by modifying a generic facial
model to customize each individual face given a front view and a side view of
one face. This approach is based on recovering the structure of selected
feature points in the face and then adjusting a generic model using these
control points to obtain the individualized facial model. Each individualized
facial model consists of 295 vertices. The face model database is generated
using 32 pairs of face images from 10 subjects. These source image pairs are
mainly chosen from the databases and some additional images are captured
from the local community.
For each subject there are two or three pairs of frontal and profile
images, taken under different imaging conditions, and they are used
periodically in an effective manner. The characteristic features of the
facial surface at each vertex of the individual model are labeled with one
of eight label types; the facial feature space is therefore represented
with a set of labels. A cubic approximation method is then used to estimate
the principal curvatures of each vertex on each model. The eight typical
curvature types, Convex Peak, Convex Cylinder/Cone, Convex Saddle, Minimal
Surface, Concave Saddle, Concave Cylinder/Cone, Concave Pit and Planar, are
categorized according to the relation of the principal curvatures.
Among the set of labels, only certain labels located in certain
regions are of interest. Some non-feature labels could be noise that blurs
the individual facial characteristics and results in redundancy in the
images. Therefore, a feature screening process is applied and the selected
features are retained for each individual. The selected individual facial
traits maximize the difference between different subjects while minimizing
the size of the feature space. In order to select the optimal features, the
face model is partitioned into 15 sub-regions based on their physical
structure, which allows for overlapping between some of the regions that
are similar to the region components.
Some of the sub-regions do not contribute to the recognition task,
and not all the vertices within one sub-region contribute to the
classification, so it is necessary to select the best set of vertex labels
and the best set of sub-regions for the recognition task. The purpose of
feature selection is to remove irrelevant, redundant or duplicate features
in that particular task, which may otherwise degrade the performance of
face classification. The genetic algorithm has been implemented
successfully to address these kinds of problems, so it is convenient to use
a genetic-algorithm-based method in this task, as it can select the
components that contribute the most to the face recognition task.
5.4 FACIAL EXPRESSIONAL RECOGNITION FRAMEWORK
The genetic feature value set of the template class is processed in
the recognition phase of the system. The template classes of the input
sample are matched against the multiple template classes available in the
training data. Once a template class match is obtained, the respective
image can derive the status of a valid recognition. Even then, a valid
recognition may sometimes have high precision and sometimes low precision.
Finally, the recognition rate is derived to show the performance level of
the developed facial expression recognition model. Figure 5.3 shows the
facial expression recognition framework of the system.
Figure 5.3 Facial Expression Recognition Framework
5.4.1 Image Capturing System
Images used for facial expression recognition are
static images or image sequences. An image sequence contains potentially
more information than a still image, because the former also depicts the
temporal characteristics of an expression. With respect to the spatial,
chromatic, and temporal dimensionality of input images, 2D monochrome
(gray scale) facial image sequences are the most popular type of pictures used
for automatic expression recognition. However, color images could become
prevalent in future, owing to the increasing availability of low cost color
image acquisition equipment, and the ability of color images to convey
emotional cues such as blushing. The image acquisition system comprises
the following components:
1. An electronic camera has a lens with a selectively adjustable
aperture control.
2. An amplifier coupled to the camera will receive image data
there from, and the amplifier has selectively adjustable gain
and offset controls.
3. A digitizer coupled to the amplifier will receive amplified
image data.
4. A processor coupled to the digitizer will receive digitized
image data there from, and the processor has outputs coupled
to the amplifier gain and offset controls for selectively
adjusting the same, and an output connected to the lens
aperture control for selectively adjusting the same.
5. The processor calculates the intensity entropy of the digitized
image data received from the digitizer, and provides outputs
selectively to the required amplifier gain and offset controls
and the lens aperture control, adjusting them so as to optimize
the acquired image information.
The image capturing system is arranged to optimize the information
in an acquired image. Parameters associated with the system, such as the
lens aperture, the lens focus and the image intensity, are adjusted. The
incoming image data are processed to determine the entropy of the image
and, with this information, the aperture can be optimized. By determining
the dynamic range of the scene, the black and white levels can be
identified, and the gain and offset applied to the image are adjusted to
minimize truncation distortion. Specular highlights can be detected by
calculating the ratio of changes in maximum and minimum intensities between
different but related images.
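The entropy measure used to drive these adjustments can be sketched as the Shannon entropy of the intensity histogram; a pure-Python sketch, where the 8-bit example intensities are illustrative:

```python
import math
from collections import Counter

def intensity_entropy(pixels):
    """Shannon entropy of the pixel-intensity distribution, in bits.
    Higher entropy means the acquired image carries more information."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A uniform intensity histogram maximizes entropy; a constant image
# (no information) has entropy zero.
print(intensity_entropy([0, 64, 128, 192]))  # 2.0
```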
The image acquisition toolbox preview window of MATLAB helps to
verify and optimize the parameters of the user-requested acquisition. It
instantly reflects any adjustments that are made to acquisition properties.
The image acquisition tool has a built-in preview window, and one can be
added to any application built with MATLAB.
Collecting Image Data: Image acquisition toolbox can
continuously acquire image data while processing the acquired data in
MATLAB. The toolbox automatically buffers acquired data into memory,
handles memory and buffer management, and enables acquisition from a
Region of Interest (ROI). Data can be acquired in a wide range of data types,
including signed or unsigned 8-, 16- and 32- bit integers and single- or
double- precision floating point. The toolbox supports any color space
provided by the image acquisition device, such as RGB, YUV or grayscale.
Raw sensor data in a Bayer pattern can be automatically converted into RGB
data. The toolbox supports any frame rate and video resolution supported by
your PC and imaging hardware.
5.5 PROCESSING ROUTING
Image preprocessing performs the task of signal conditioning, which
includes noise removal and normalization against variations in pixel
position (horizontal or vertical) or brightness, together with
segmentation, location, or tracking of the face or its parts. Expression
representation can be sensitive to translation, scaling and rotation of the
head in an image. To eliminate the effect of these unwanted
transformations, the facial image may be geometrically standardized prior
to classification. The normalization is usually based on references
provided by the eyes or nostrils. Segmentation is concerned with the
elimination of image portions that convey no relevant facial information,
and it helps in sharpening the images.
Face segmentation is often anchored on the shape, motion, color,
texture and spatial configuration of the face or its components. The face
location process gets the position and spatial extent of faces and other parts in
an image; it is typically based on segmentation results. However, robust
detection of faces or their constituents is difficult to attain in many
real-world settings. Tracking is often implemented as the location of the
face or its parts within an image sequence, whereby a previously determined
location is typically used for estimating the location in subsequent image
frames.
5.5.1 Transformation of Secure Features
The biometric features are easily accessible and hence
transformable on acquisition, and an abnormal transformation will lead to
the destruction of images, so the system needs to incorporate some
transformation information into the biometric feature to be able to use
these features correctly. Since the number of usable biometrics known to
date is limited, the system cannot afford to compromise them while they are
being used as a transformed image. To overcome these problems, the proposed
system follows the approach of cancelable biometrics, where the biometric
template is convolved with a 2D random signal to generate a transformed
feature of the image.
Using this approach (Nagar and Chaudhury 2006), both problems are
solved: without knowing the random signal one cannot recover the original
biometric feature, and one can easily discard an image transformation by
discarding the corresponding random signal. In this research, the system
converts the biometric template of length 255 into a matrix of size 15 * 17
and convolves it with a random transformation kernel of size 10 * 10 to
obtain the secure features of the face, as shown in Figure 5.4.
Figure 5.4 Secure Features Creation
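This construction can be sketched as a full 2D convolution of the 15 * 17 template matrix with a secret 10 * 10 random kernel; a pure-Python sketch, where the random values stand in for real template data:

```python
import random

def conv2d_full(a, k):
    """Full 2D convolution of matrix `a` with kernel `k`:
    the output size is (rows_a + rows_k - 1) x (cols_a + cols_k - 1)."""
    ra, ca, rk, ck = len(a), len(a[0]), len(k), len(k[0])
    out = [[0.0] * (ca + ck - 1) for _ in range(ra + rk - 1)]
    for i in range(ra):
        for j in range(ca):
            for u in range(rk):
                for v in range(ck):
                    out[i + u][j + v] += a[i][j] * k[u][v]
    return out

# Biometric template of length 255 reshaped to 15 x 17, convolved with a
# secret 10 x 10 random kernel to produce the secure (cancelable) feature.
template = [[random.random() for _ in range(17)] for _ in range(15)]
kernel = [[random.random() for _ in range(10)] for _ in range(10)]
secure = conv2d_full(template, kernel)
print(len(secure), len(secure[0]))  # 24 26
```

Discarding the kernel cancels the stored feature, since a new kernel yields an entirely different secure template.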
5.5.2 Noise Removal
Facial images taken with digital cameras pick up noise from a
variety of sources. Many further uses of these images require that the noise
will be (partially) removed for practical image feature retention. In
salt-and-pepper noise (sparse light and dark disturbances), pixels in the
facial image are very different in color or intensity from their
surrounding pixels; the defining characteristic is that the value of a
noise pixel bears no relation to the color of the surrounding pixels.
Generally this type of noise affects only a small number of image pixels.
When viewed, the image contains dark and white dots.
In Gaussian noise, each pixel in the image will be changed from its
original value by a (usually) small amount. A histogram, a plot of the amount
of distortion of a pixel value against the frequency with which it occurs,
shows a normal distribution of noise. The Gaussian (normal) distribution is
usually a good model, due to the central limit theorem that says that the sum
of different noises tends to approach a Gaussian distribution. In either case,
the noises at different pixels can be either correlated or uncorrelated. In many
cases, noise values at different pixels are modeled as being independent and
identically distributed, and hence uncorrelated.
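Both noise models can be simulated on a list of 8-bit pixel intensities; a minimal sketch, where the noise fraction and standard deviation are illustrative choices:

```python
import random

def add_salt_pepper(pixels, amount=0.05):
    """Set a small fraction of pixels to pure black (0) or white (255);
    the noisy value bears no relation to the original pixel."""
    noisy = list(pixels)
    for i in range(len(noisy)):
        if random.random() < amount:
            noisy[i] = random.choice([0, 255])
    return noisy

def add_gaussian(pixels, sigma=10.0):
    """Perturb every pixel by a (usually small) normally distributed
    amount, clamped to the 8-bit range."""
    return [min(255, max(0, round(p + random.gauss(0, sigma))))
            for p in pixels]
```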
5.5.3 Linear Smoothing Filters
Linear smoothing filters remove noise in the facial image by
convolving the original image with a mask that represents a low-pass filter
or smoothing operation. For this, the Gaussian mask comprises elements
determined by the Gaussian function. This convolution brings the value of
each pixel into closer harmony with the values of its neighbors. The
smoothing filter sets each pixel to the average value, or a weighted
average, of itself and its nearby neighbors; the Gaussian filter is just
one possible set of weights.
Linear smoothing filters tend to blur an image wherever pixel
intensity values are significantly higher or lower than those of the
surrounding neighborhood. Because of this blurring, linear filters are
seldom used in practice for noise reduction.
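A minimal sketch of such a linear smoothing filter, using an unweighted 3 x 3 mean mask (a Gaussian mask would simply use different weights); border pixels average over the neighbors that exist:

```python
def mean_filter(img):
    """Replace each pixel by the average of itself and its 3x3
    neighbors -- a simple linear low-pass (smoothing) operation."""
    rows, cols = len(img), len(img[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            vals = [img[x][y]
                    for x in range(max(0, i - 1), min(rows, i + 2))
                    for y in range(max(0, j - 1), min(cols, j + 2))]
            out[i][j] = sum(vals) / len(vals)
    return out
```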
5.5.4 Nonlinear Filters
The nonlinear (median) filter preserves facial image features
(shape, size and intensity). The median filter works as follows: first,
consider each pixel in the image; then sort the neighboring pixels into
order based on their intensities; and finally replace the original value of
the pixel with the median value from the list. The median filter selects
the closest of the neighboring values when a pixel's value is extreme in
its neighborhood, and leaves it unchanged otherwise, which suits
photographic applications.
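The three steps of the median filter can be sketched as follows (3 x 3 neighborhoods, which shrink at the image borders):

```python
def median_filter(img):
    """Replace each pixel by the median of its 3x3 neighborhood,
    removing impulse (salt-and-pepper) noise while preserving edges."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Sort the neighborhood intensities and take the middle value.
            vals = sorted(img[x][y]
                          for x in range(max(0, i - 1), min(rows, i + 2))
                          for y in range(max(0, j - 1), min(cols, j + 2)))
            out[i][j] = vals[len(vals) // 2]
    return out
```

A single extreme pixel (for example a 255 spike on a flat background) is replaced by the neighborhood median, while a genuine edge shared by several neighbors survives.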
5.6 EXTRACTION OF FEATURES
Feature extraction converts pixel data into a higher-level
representation of the shape, motion, color, blur, texture and spatial
configuration of the face or its components. The representation so formed
is used for subsequent expression categorization. Feature extraction
generally reduces the dimensionality of the input space. The reduction
procedure is intended to retain the essential information, which possesses
high discrimination power and high stability. Such a reduction in
dimensionality helps to mitigate the 'curse of dimensionality'. Geometric,
kinetic, potential and statistical or spectral-transform-based features are
often used as alternative representations of the facial expression prior to
classification.
5.6.1 Geometric Feature Extraction
An efficient, local-image-based approach for the extraction of
instantaneous facial features and the recognition of facial expressions
from 2D image sequences is presented. The algorithm uses edge projection
analysis for feature extraction and creates a dynamic temporal
representation of the face and other parts, followed by classification
through a feed-forward network with one hidden layer.
A novel transform for extracting the lip region from color face
images, based on Gaussian modeling of skin and lip color, is proposed with
this system. This helps in the image transformation, and the proposed lip
transform for color images thus results in better extraction of the lip
region in the feature extraction stage. The algorithm achieves an accuracy
of 90.0% for facial expression recognition from grayscale image sequences
in a well-organized and efficient manner.
The geometric feature extraction system uses integral projections
of the edge map of the face image to extract facial features. Let I(x, y)
be the input image. The vertical and horizontal projection vectors over the
rectangle [x1, x2] * [y1, y2] are defined along their respective axes as
V(x), the sum of I(x, y) over y in [y1, y2], and H(y), the sum of I(x, y)
over x in [x1, x2].
A typical human face follows a set of anthropometric standards,
which have been utilized to narrow the search of a particular facial feature to
smaller regions of the face.
5.6.2 Algorithmic Steps
a. An approximate bounding box for the feature is obtained
using the anthropometric standards.
b. The Sobel edge map is computed to obtain edges at the boundaries of
the features.
c. The integral projections along the vertical axis (x) and the horizontal axis
(y) are calculated and the results are shown on the edge map.
d. Median filtering followed by Gaussian smoothing is applied to
smooth the projection vectors obtained with the proposed system.
A higher value of the projection vector at
a particular point indicates a higher probability of occurrence of
the feature. The relative probability E(i) of the i-th region
containing the feature is calculated, and the region with
maximum E(i) gives the vertical extent of the region
containing the feature; a similar approach on the vertical projection
V(x) gives the horizontal extent.
e. The bounding box so obtained is processed further to get an
exact binary mask of the feature.
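The steps above can be sketched in NumPy; the Sobel kernels and projection definitions are standard, while the toy image, box coordinates and the moving-average smoothing (a stand-in for the median-plus-Gaussian smoothing the thesis describes) are assumptions for illustration.

```python
import numpy as np

def sobel_edges(img):
    """Approximate gradient magnitude with 3x3 Sobel kernels."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    gx = np.zeros((h, w)); gy = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            win = p[y:y + 3, x:x + 3]
            gx[y, x] = (win * kx).sum()
            gy[y, x] = (win * ky).sum()
    return np.hypot(gx, gy)

def projections(edge, box):
    """Integral projections inside box=(x1, x2, y1, y2):
    V(x) sums edge strength down each column, H(y) across each row."""
    x1, x2, y1, y2 = box
    roi = edge[y1:y2, x1:x2]
    return roi.sum(axis=0), roi.sum(axis=1)   # V(x), H(y)

def smooth(v, k=3):
    """Moving-average smoothing of a projection vector."""
    return np.convolve(v, np.ones(k) / k, mode="same")

# Toy image: a dark horizontal bar (an "eyebrow") on a bright background.
img = np.full((20, 20), 200.0)
img[8:10, 4:16] = 20.0
edge = sobel_edges(img)
V, H = projections(edge, (0, 20, 0, 20))
row = int(np.argmax(smooth(H)))   # peak of H(y) marks the feature's rows
```

The peak of the smoothed horizontal projection falls on the transition rows of the bar, which is how the vertical extent of a feature is localized before the binary mask is extracted.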
5.6.3 Features of Eyebrow
The approximate bounding box is the top half of the face. Within this
generic box, horizontal Sobel edges are used to compute a
bounding box containing the eye and the eyebrow of the given image. The
segmentation algorithm is then applied in this bounding box to isolate the
eyebrow.
The eyebrow is segmented from the eye using the fact that the eye occurs
below the eyebrow and its edges form closed contours, obtained by applying
the Laplacian of Gaussian operator at zero threshold. The contours are filled, and the
resulting image containing the masks of the eyebrow and the eye is morphologically
filtered by horizontally stretched elliptic structuring elements. Of the two
largest filled regions, the region with the higher centroid is chosen as the mask of
the eyebrow.
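The final selection step, choosing the higher of the two largest filled regions, can be sketched as follows. This is a minimal sketch assuming the contours have already been filled into a binary mask; the connected-component labelling and the toy mask are illustrative, not the thesis implementation.

```python
import numpy as np
from collections import deque

def components(mask):
    """4-connected component labeling by breadth-first flood fill."""
    h, w = mask.shape
    labels = np.zeros((h, w), int)
    cur = 0
    for sy in range(h):
        for sx in range(w):
            if mask[sy, sx] and not labels[sy, sx]:
                cur += 1
                q = deque([(sy, sx)])
                labels[sy, sx] = cur
                while q:
                    y, x = q.popleft()
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if 0 <= ny < h and 0 <= nx < w \
                           and mask[ny, nx] and not labels[ny, nx]:
                            labels[ny, nx] = cur
                            q.append((ny, nx))
    return labels, cur

def pick_eyebrow(mask):
    """From the two largest filled regions, keep the one with the
    higher centroid (smaller mean row index), i.e. the eyebrow."""
    labels, n = components(mask)
    sizes = [(labels == i).sum() for i in range(1, n + 1)]
    top2 = np.argsort(sizes)[-2:] + 1
    cy = {i: np.where(labels == i)[0].mean() for i in top2}
    return labels == min(cy, key=cy.get)

# Toy mask: a thin bar (eyebrow) above a rounder blob (eye).
mask = np.zeros((20, 20), bool)
mask[3:5, 5:15] = True     # eyebrow
mask[8:13, 7:13] = True    # eye
brow = pick_eyebrow(mask)
```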
102
Lip Features: The approximate bounding box is the lower half of
the face. In colored images, lip pixels differ significantly from skin pixels
in the YCbCr color space. The colored image is therefore preprocessed to
produce a pronounced demarcation between the lip and the surrounding skin regions.
For grayscale images, no such preprocessing is needed.
The genetic algorithm calculates edge maps on the transformed
image as well as on the original image. Edges for the lips occur in both the
horizontal and the vertical direction along their axes. In the bounding box
computed by the genetic algorithm, closed contours are obtained by applying
the Laplacian of Gaussian operator at zero threshold. These contours are filled
and morphologically filtered using elliptic structuring elements to get a binary
mask for the lips.
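The lip/skin demarcation in YCbCr space can be sketched crudely as a threshold on the Cr (red-difference) channel; the thesis describes a Gaussian-modeling lip transform, so the conversion below uses the standard BT.601 coefficients while the threshold and the toy colours are assumptions for illustration only.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """ITU-R BT.601 RGB -> YCbCr conversion (inputs in 0..255)."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def lip_emphasis(rgb, cr_thresh=180.0):
    """Crude lip/skin demarcation: lip pixels are redder than skin,
    so they show a markedly higher Cr component."""
    cr = rgb_to_ycbcr(rgb)[..., 2]
    return cr > cr_thresh

# Toy patch: skin-coloured background with a strongly red "lip" stripe.
patch = np.zeros((6, 6, 3))
patch[...] = (224, 172, 140)      # skin-like RGB
patch[3:5, 1:5] = (200, 40, 60)   # lip-like RGB
mask = lip_emphasis(patch)
```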
Nose Features: The approximate bounding box for the nose lies
between the eyes and the mouth. The genetic algorithm uses vertical Sobel
edges to compute the vertical position of the nose, which is used as a reference
point on the face.
Facial Expression on Behaviour: The system developed works on
the basic principles of the facial action coding system, which measures all visible
facial movements. The facial action coding system would ideally differentiate every
change in muscular action, but it is limited to what a coder can reliably
discriminate when movements are inspected repeatedly, in stopped and
slowed motion. It does not measure invisible changes (for example, certain
changes in muscle tonus) or vascular and glandular changes produced by the
autonomic nervous system.
Limiting facial action coding system measurement to visible
movement is consistent with an interest in those behaviours which may be
social signals, usually detected during social interactions. The facial action coding
system can be applied to any reasonably detailed visual record of facial
behaviour. If the technique were to measure invisible or Autonomic Nervous
System (ANS) activity, it would be limited to situations where sensors were
attached (for example, EMG electrodes) or special sensing and recording
methods were used (for example, thermography).
The primary goal in adopting the facial action coding system for the face
recognition system was comprehensiveness: a technique that could measure
all possible visibly discriminable facial actions. Comprehensiveness was
important because many of the fundamental questions about the universality and
nature of facial expressions cannot be answered if only a subset of behaviours
is measurable. The facial action coding system was derived from an analysis of
the anatomical basis for facial movement. A comprehensive system was
obtained by discovering how each muscle of the face acts to change visible
appearance. With this knowledge it is possible to analyze any
facial movement into anatomically based, minimal action units.
Geometry on Expressions: The geometric intensity of facial
expressional emotions has been studied and analyzed to obtain an
effective, flexible and objective method for the facial recognition system.
The result of this approach has been demonstrated on
various expressions such as happiness, sadness, fear, anger, surprise and disgust
at various levels of intensity. The approach is also able to associate a pixel-wise
shape value with an expression change, based on the expansion or
contraction of the corresponding region.

This pixel-wise association makes it possible for the method
to quantify even subtle differences on a region-wise basis, for expressions
at every level of intensity. This is important for any
facial expression analysis, as a single number quantifying the whole face is of
limited significance: various regions of the face undergo different
changes for the same expression of emotion.
5.7 TEMPLATE CLASSIFIER
Expression categorization is performed by a classifier, which often
specifies a model of the pattern distribution in the proposed system,
coupled to a decision procedure. A wide range of classifiers, covering
parametric as well as non-parametric techniques, has been applied to the
automatic expression recognition problem. The two main types of
classes used in facial expression recognition are action units and the
prototypic facial expressions.
The six prototypic expressions relate to the emotional states of
happiness, sadness, fear, anger, surprise and disgust. However, it has been
noted that the variation in complexity and meaning of expressions covers far
more than these six expression categories. Moreover, although many
experimental expression recognition systems use prototypic expressions as
output categories, such expressions occur relatively infrequently, and fine
changes in one or a few discrete face parts communicate emotions and
intention.
Action Units (AUs) are the atomic elements associated with visible
facial movement or its associated deformation; an expression typically results
from the agglomeration of several action units. The 46 action units are described
in the facial action coding system.

Sometimes action units and prototypic expression classes are both
used in a hierarchical recognition system: for example, categorization into
action units can be used as a low level of expression classification, followed
by a high-level classification of action unit combinations into basic
expression prototypes.
5.7.1 Expressional Class Hierarchy
The proposed system is generic in its scope, and may be applied to
a number of application areas. However, to provide a more concrete example,
the demonstration focuses on the use of the system for recognizing
expressions. Using the facial action coding system it is possible to determine
particular muscle movements and investigate which of these correspond to
which action unit.
Wrinkles, for instance, can help to detect some of the muscle movements
that can be related to particular action units. The key motivation for
this example is to demonstrate how a genetic algorithm based approach may
be used to analyze facial images and automatically classify them into
particular types of expressions. There are therefore two key aspects being
considered here:
a) A mechanism to find particular facial features, and
b) Associating these features with an ontology describing
expressions.
In this case, (a) is achieved through the use of one or more
MATLAB agents trained to analyze images for particular features, and
(b) is achieved by one or more application agents aggregating the
responses from the MATLAB gene function to make a deduction.
MATLAB filters to detect the open and closed positions of the mouth on a face
were used in the experiments discussed. More filters will be added to the
system in future to help automate the recognition of facial expressions.
The aim of the filters used in this experiment is to detect horizontal
wrinkles in the upper part of the face (on the forehead), vertical wrinkles on
the lower part of the face (on the cheeks) and diagonal wrinkles on the lower
part of the face. Only the first is explained, since the other two work in a
similar way. To detect horizontal wrinkles related to muscle movements, two
images are needed: the one used for analysis (the facial image) and another one
showing the same person with a neutral expression. Eye
positions are needed for both images as well; they are used to align both images,
which must also be scaled to the same size.
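The thesis implements these filters in MATLAB; the following is a hedged Python sketch of the underlying idea, comparing horizontal edge energy in the (already aligned and equally scaled) forehead region of the expression image against the neutral image. The region coordinates, the energy measure and the threshold factor are assumptions.

```python
import numpy as np

def horizontal_edge_energy(img):
    """Energy of horizontal structures: summed absolute vertical
    derivative, which responds strongly to horizontal wrinkle lines."""
    return np.abs(np.diff(img.astype(float), axis=0)).sum()

def forehead_wrinkles(expr, neutral, region, factor=1.5):
    """Flag horizontal wrinkles when the forehead region of the
    expression image carries markedly more horizontal edge energy
    than the same region of the neutral image."""
    y1, y2, x1, x2 = region
    e = horizontal_edge_energy(expr[y1:y2, x1:x2])
    n = horizontal_edge_energy(neutral[y1:y2, x1:x2])
    return e > factor * max(n, 1.0)

# Toy data: the neutral forehead is flat; the expression forehead
# has two dark horizontal lines (wrinkles).
neutral = np.full((10, 10), 180.0)
expr = neutral.copy()
expr[2, :] = 60.0
expr[4, :] = 60.0
region = (0, 6, 0, 10)   # hypothetical forehead rows/columns
has_wrinkles = forehead_wrinkles(expr, neutral, region)
```

The vertical and diagonal wrinkle filters would follow the same pattern with the derivative taken along the other axes.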
The system model describes a computer vision system for observing
facial motion by using an optimal-estimation optical flow method coupled
with a geometric and a physical (muscle) model describing the facial
structure. The proposed method produces a reliable parametric representation
of the face's independent muscle action groups as well as an accurate estimation
of facial motion.
Previous efforts at the analysis of facial expression have been based on the
facial action coding system, a representation developed to allow
human psychologists to code expressions from static pictures. To avoid the use of
this heuristic coding scheme, the proposed system uses the computer
vision system to probabilistically characterize facial motion and muscle
activation and to locate the mispositioning of muscles in an experimental
population. From these experiments a new, more accurate representation
of human facial expressions is derived.
The system uses this new representation for recognition in two
different ways. The first method uses the physics-based model directly, by
recognizing expressions through comparison of estimated muscle activations.
The second method uses the physics-based model to generate a spatio-temporal
motion energy template of the whole face for each expression. These
simple, biologically plausible motion energy ‘templates’ are then used for
recognition. Both methods show substantially greater accuracy at expression
recognition than has previously been achieved.
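The second, template-based method can be sketched as follows: accumulate inter-frame motion into one energy map per sequence and classify by the nearest stored template. The energy measure, the distance metric and the toy templates are assumptions, not the thesis implementation.

```python
import numpy as np

def motion_energy(frames):
    """Accumulate absolute inter-frame differences into a single
    motion-energy map for the sequence."""
    f = np.asarray(frames, float)
    return np.abs(np.diff(f, axis=0)).sum(axis=0)

def classify(template_bank, frames):
    """Nearest-template recognition: pick the expression whose stored
    motion-energy template is closest (Euclidean) to the observed map."""
    obs = motion_energy(frames)
    return min(template_bank,
               key=lambda name: np.linalg.norm(template_bank[name] - obs))

# Toy bank: a "smile" moves the lower face, a "surprise" the upper face.
lower = np.zeros((8, 8)); lower[5:, :] = 1.0
upper = np.zeros((8, 8)); upper[:3, :] = 1.0
bank = {"smile": lower, "surprise": upper}

# A two-frame sequence whose motion is concentrated in the lower face.
f0 = np.zeros((8, 8))
f1 = f0.copy(); f1[5:, :] = 1.0
label = classify(bank, [f0, f1])
```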
5.7.2 Facial Expression Interpretation
Some automatic facial expression analysis systems found in the
literature attempt to directly interpret observed facial expressions in terms of
basic emotions. Recently, a few systems use rules or facial expression
dictionaries to translate coded facial actions into emotion categories.
The proposed model follows the first approach, but for more advanced
expression interpretation the facial action coding framework can also be used.
The facial action coding system, a scheme for describing all visually
distinguishable facial movements, has been frequently
referred to in recent literature. It is based on the enumeration of all action
units on a face that cause facial movements. There are 46 such action units
in FACS that account for changes in facial expression. Researchers have used
FACS as the basis for their expression recognition research, and systems have been
developed that specifically recognize individual action units or
action unit combinations (about 7000 combinations in number).
However, discovering rules that relate action units to emotional
states, that is, happiness, sadness, fear, anger, surprise and disgust, is difficult,
since the relation cannot be defined by any regular mathematical function. This is where
the gene features, obtained by mapping geometric dimensions to expressional
variations, come into play. The problem is the great number of possible facial
action combinations: about 7000 genetic combinations have been identified
within the facial expressional framework. This means that, despite the large
output space the system has to handle, the recognition rate would probably be of
appreciable value.
5.8 FACIAL RECOGNITION RATE
Facial expression recognition is processed as multiple blocks
arranged sequentially, with the output of each block feeding the
next. The major blocks of the system are the Image Capturing
System, Preprocessing, Extraction of Feature, Template Classifier and
Recognition blocks. The image capturing system block acquires the sample input
image and stores it in the image database as a file structure. Training samples
of the images are kept in the collective image file folder. However, the
acquired image is not clear and smooth enough for image processing operations
such as feature extraction, segmentation and registration.
The acquired image is fed into the preprocessing block. Preprocessing
consists of two stages: noise filtering and registration of
the image. In the noise filter, geometric-dimensional noise with respect to
edges and feature seeds is removed. The image obtained from the noise filter
is fed into the registration sub-block of the system. The registration sub-block
stores the input image with the fineness of the properties (geometric and
expressional dimensions) required for facial expression recognition. The
registered image can be used as an intermediate template reserved for achieving
effective feature extraction on any given facial image.
The registered image from the preprocessing block is sent as input
to the feature extraction block. There, the geometric dimensions are first
obtained from the registered image. The geometric
dimensions are compared with the expressional invariants, such as happiness,
sadness, fear, anger, surprise and disgust, to evaluate the spatial dimensional
variations. With these variants, the geometric features are mapped to the specific
expression of the sample input image. This map is then adapted to the
genetic value sets, from which the template classifier is obtained.
The genetic feature value set of the template class is processed in
the recognition phase of the system. The template class of the input sample is
matched against the multiple template classes available in the training data. Once
a matching template class is obtained, the respective image can be assigned the
status of a valid recognition. Even then, a valid recognition may have high
precision or, at times, low precision. Finally, the recognition rate is also
derived to show the performance level of the developed facial expression
recognition system.
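The matching and recognition-rate steps can be sketched as a nearest-template match over genetic feature value sets; the feature vectors, the distance metric and the labels below are hypothetical values for illustration only.

```python
import numpy as np

def match_template(sample, training):
    """Match a sample's genetic feature value set against the stored
    template classes; return the label of the nearest template."""
    return min(training,
               key=lambda label: np.linalg.norm(
                   np.asarray(training[label]) - np.asarray(sample)))

def recognition_rate(test_set, training):
    """Fraction of test samples whose matched template class equals
    their true expression label."""
    hits = sum(match_template(feat, training) == true
               for feat, true in test_set)
    return hits / len(test_set)

# Hypothetical gene feature value sets per expression class.
training = {"happiness": [0.9, 0.1], "sadness": [0.1, 0.9]}
tests = [([0.85, 0.15], "happiness"),
         ([0.2, 0.8], "sadness"),
         ([0.6, 0.4], "sadness")]   # a deliberately ambiguous sample
rate = recognition_rate(tests, training)
```

The ambiguous third sample is matched to the wrong class, illustrating how a valid-looking recognition can still have low precision while the overall rate stays appreciable.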