Real-time Face Recognition & Detection Systems
Real-time Face Recognition & Detection Systems
Presented by Suvadip Shome, under the guidance of Prof. S. K. Mondal
Topics Covered
1. Brief History
2. Definitions
3. Methods
4. Demo
5. References
Definitions
What is Face Detection? Given an image, determine whether it contains any human face. If it does, find the location and size of each human face in the image. It is a classification between face and non-face.
[Figure: detection results for Training_1.jpg]
Why is Face Detection Important?
- First step for any automatic face recognition system.
- First step in many Human-Computer Interaction and man-machine interaction systems:
  • Expression recognition
  • Cognitive state / emotional state recognition
- First step in many surveillance systems.
- Tracking: the face is a highly non-rigid object.
- A step towards Automatic Target Recognition (ATR) or generic object detection/recognition.
WHAT IS FACE RECOGNITION?
“Face Recognition is the task of identifying an already detected face as a KNOWN or UNKNOWN face, and in more advanced cases, telling exactly WHOSE face it is!”
FACE DETECTION → FEATURE EXTRACTION → FACE RECOGNITION
Difference between Face Detection and Face Recognition
FD: only two classes, face or non-face.
FR: multiple classes, one for each individual to be recognized: one person vs. all the others. Classification is based on face shape and the form of the eyes, nose, mouth, etc. The FR process requires FD as its first step.
Methods
Methods for Face Detection
- Knowledge-based methods: encode what constitutes a typical face, e.g. the relationships between facial features.
- Feature-invariant approaches: aim to find structural features of a face that exist even when pose, viewpoint, or lighting conditions vary.
- Template matching: several standard patterns are stored to describe the face as a whole or the facial features separately.
- Appearance-based methods: the models are learned from a set of training images that capture the representative variability of faces.
Three goals & a conclusion
1. Feature computation: what features, and how can they be computed as quickly as possible?
2. Feature selection: select the most discriminating features.
3. Real-timeliness: must focus on potentially positive areas (those that contain faces).
4. Conclusion: presentation of results and discussion of detection issues.
How did Viola & Jones deal with these challenges?
Three solutions
- Feature computation: the "integral image" representation
- Feature selection: the AdaBoost training algorithm
- Real-timeliness: a cascade of classifiers
Feature Computation
Overview | Integral Image | AdaBoost | Cascade
Features: can a simple feature (i.e. a single value) indicate the existence of a face?
All faces share some similar properties:
- The eyes region is darker than the upper cheeks.
- The nose bridge region is brighter than the eyes.
This is useful domain knowledge. Encoding it requires:
- Location and size: the eyes and nose bridge regions
- Value: darker / brighter
Rectangle features: Value = ∑(pixels in black area) − ∑(pixels in white area). There are three types (two-, three-, and four-rectangle features); Viola & Jones used two-rectangle features. A feature value is, for example, the difference in brightness between the white and black rectangles over a specific area.
Each feature is tied to a particular location in the sub-window, and each feature may have any size.
Why features instead of raw pixels? Features encode domain knowledge, and feature-based systems operate faster.
Rapid computation of rectangular features: using the integral image representation we can compute the value of any rectangular sum (part of a feature) in constant time.
For example, the integral sum inside rectangle D can be computed as ii(d) + ii(a) − ii(b) − ii(c), since
ii(a) = A, ii(b) = A + B, ii(c) = A + C, ii(d) = A + B + C + D,
so D = ii(d) + ii(a) − ii(b) − ii(c).
Two-, three-, and four-rectangle features can be computed with 6, 8, and 9 array references respectively. As a result, feature computation takes less time.
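The integral-image trick above can be sketched in a few lines of plain Python (a minimal illustration; the function and variable names are my own, not from the slides):

```python
def integral_image(img):
    """Build ii where ii[y][x] = sum of all pixels above-left of (x, y)."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]  # 1-pixel zero border
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # running row sum + the column sum accumulated above
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum over any rectangle with only 4 array references:
    D = ii(d) + ii(a) - ii(b) - ii(c)."""
    b, r = top + height, left + width
    return ii[b][r] + ii[top][left] - ii[top][r] - ii[b][left]
```

For example, with `img = [[1, 2], [3, 4]]`, `rect_sum(integral_image(img), 0, 0, 2, 2)` returns the full-image sum 10 in constant time, regardless of rectangle size.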
Feature Selection: the AdaBoost Algorithm
Feature selection problem: too many features.
In a 24×24 sub-window there are ~160,000 features (all possible combinations of orientation, location, and scale of the feature types). It is impractical to compute all of them (computationally expensive).
We have to select a subset of relevant, informative features to model a face.
Hypothesis: "A very small subset of features can be combined to form an effective classifier."
How? SOLUTION: the AdaBoost algorithm, which separates relevant from irrelevant features.
AdaBoost stands for "Adaptive Boosting". It constructs a "strong" classifier as a linear combination of weighted simple "weak" classifiers:
strong classifier = Σ_t α_t·h_t(x) (thresholded), where the h_t are the weak classifiers, the α_t are their weights, and x is the image.
AdaBoost characteristics:
- Features as weak classifiers: each single rectangle feature may be regarded as a simple weak classifier.
- An iterative algorithm: AdaBoost performs a series of rounds, each time selecting a new weak classifier.
- Weights are applied over the set of example images: during each iteration, each example/image receives a weight determining its importance.
AdaBoost - Getting the idea…
Given: example images labeled +/−. Initially, all weights are set equally.
Repeat T times:
- Step 1: choose the most efficient weak classifier; it will become a component of the final strong classifier. (Problem! Remember the huge number of features…)
- Step 2: update the weights to emphasize the examples that were incorrectly classified. This forces the next weak classifier to focus on "harder" examples.
The final (strong) classifier is a weighted combination of the T "weak" classifiers, each weighted according to its accuracy.
H(x) = 1 if Σ_{t=1..T} α_t·h_t(x) ≥ ½·Σ_{t=1..T} α_t, and 0 otherwise.
AdaBoost feature selection problem: on each round there is a large set of possible weak classifiers (each simple classifier consists of a single feature). Which one should be chosen?
Choose the most efficient one: the classifier that best separates the examples, i.e. has the lowest weighted error. The choice of a classifier thus corresponds to the choice of a feature.
At the end, the "strong" classifier consists of T features.
Conclusion: AdaBoost searches for a small number of good classifiers/features (feature selection) and adaptively constructs a final strong classifier, taking into account the failures of each chosen weak classifier (via the weight updates). AdaBoost is therefore used both to select a small set of features and to train a strong classifier.
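The loop above can be sketched as a toy AdaBoost on 1-D threshold "stumps" (a minimal illustration with made-up data, not the Viola-Jones feature set; all names are mine). It follows the slide's recipe: pick the lowest-weighted-error weak classifier, then down-weight the correctly classified examples.

```python
import math

def stump(x, th):
    # weak classifier: predicts 1 ("face") when feature value x crosses th
    return 1 if x >= th else 0

def train_adaboost(xs, ys, thresholds, T):
    n = len(xs)
    w = [1.0 / n] * n                       # equal initial weights
    strong = []                             # (alpha_t, threshold_t) pairs
    for _ in range(T):
        # Step 1: pick the weak classifier with the lowest weighted error
        def weighted_error(th):
            return sum(wi for wi, x, y in zip(w, xs, ys)
                       if stump(x, th) != y)
        th = min(thresholds, key=weighted_error)
        err = min(max(weighted_error(th), 1e-10), 1 - 1e-10)
        beta = err / (1 - err)
        alpha = math.log(1 / beta)          # alpha_t = log(1 / beta_t)
        strong.append((alpha, th))
        # Step 2: down-weight correct examples so the next round focuses
        # on the "harder" ones; then re-normalize
        w = [wi * (beta if stump(x, th) == y else 1.0)
             for wi, x, y in zip(w, xs, ys)]
        total = sum(w)
        w = [wi / total for wi in w]
    return strong

def classify(strong, x):
    # strong classifier: 1 iff weighted votes reach half the total weight
    vote = sum(a for a, th in strong if stump(x, th) == 1)
    return 1 if vote >= 0.5 * sum(a for a, _ in strong) else 0
```

For example, `train_adaboost([0, 1, 2, 3], [0, 0, 1, 1], [0.5, 1.5, 2.5], 3)` learns to separate the two classes at the 1.5 threshold.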
A cascade of classifiers (for real-timeliness)
Now we have a good face detector: we can build a 200-feature classifier. Experiments showed that a 200-feature classifier achieves:
- a 95% detection rate,
- a 0.14×10⁻³ false-positive rate (1 in 14,084),
- scanning of all sub-windows of a 384×288-pixel image in 0.7 seconds (on an Intel PIII 700 MHz).
Is more better? Adding features gains classifier performance but loses CPU time.
Verdict: good and fast, but not enough. Competitors achieve close to a 1-in-1,000,000 FP rate, and 0.7 s/frame IS NOT real-time.
Training a cascade of classifiers
Strong classifier definition:
h(x) = 1 if Σ_{t=1..T} α_t·h_t(x) ≥ ½·Σ_{t=1..T} α_t, and 0 otherwise,
where α_t = log(1/β_t) and β_t = ε_t / (1 − ε_t), with ε_t the weighted error of the t-th weak classifier.
Keep in mind: competitors achieved a 95% TP rate and a 10⁻⁶ FP rate. These are the goals; the final cascade must do better!
Given the goals, to design a cascade we must choose:
- the number of layers in the cascade (strong classifiers),
- the number of features of each strong classifier (the T in the definition),
- the threshold of each strong classifier (the ½·Σ_{t=1..T} α_t term in the definition).
Optimization problem: can we find the optimum combination?
A simple framework for cascade training
Viola & Jones suggested a heuristic algorithm for cascade training: it does not guarantee optimality, but it produces an "effective" cascade that meets the previous goals.
Manual tweaking: the overall training outcome is highly dependent on the user's choices:
- select f_i (maximum acceptable false-positive rate per layer),
- select d_i (minimum acceptable true-positive rate per layer),
- select F_target (target overall FP rate),
- possibly repeat a trial-and-error process for a given training set.
Until F_target is met, add a new layer; until the f_i and d_i rates are met for that layer, increase the feature count and train a new strong classifier with AdaBoost, determining the layer's rates on a validation set.
Viola & Jones algorithm:
- The user selects f, the maximum acceptable false-positive rate per layer, and d, the minimum acceptable detection rate per layer.
- The user selects the target overall false-positive rate F_target.
- P = set of positive examples; N = set of negative examples.
- F_0 = 1.0; D_0 = 1.0; i = 0
- While F_i > F_target:
  - i++; n_i = 0; F_i = F_{i−1}
  - While F_i > f × F_{i−1}:
    - n_i++
    - Use P and N to train a classifier with n_i features using AdaBoost.
    - Evaluate the current cascaded classifier on a validation set to determine F_i and D_i.
    - Decrease the threshold for the i-th classifier until the current cascaded classifier has a detection rate of at least d × D_{i−1} (this also affects F_i).
  - If F_i > F_target, evaluate the current cascaded detector on the set of non-face images and put any false detections into the set N.
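Once trained, *applying* the cascade is simple; the point is that most sub-windows are rejected by the first cheap stages. A minimal sketch (the stage callables here are stand-ins for the trained strong classifiers, not real ones):

```python
def cascade_classify(stages, window):
    """Run a window through each cascade stage in order.

    Each stage is a callable returning True ("maybe face") or False
    ("reject"). A window must survive every stage to be accepted.
    """
    for stage in stages:
        if not stage(window):
            return False   # early rejection: most sub-windows exit here
    return True            # survived all stages -> face candidate
```

Usage with dummy stages: `cascade_classify([lambda w: w > 0, lambda w: w > 5], 10)` accepts, while a window failing any single stage is rejected immediately without evaluating the later, more expensive stages.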
Training phase (classifier cascade framework):
Training set (sub-windows) → integral representation → feature computation → AdaBoost feature selection → cascade trainer.
Detection: each sub-window passes through Strong Classifier 1 (cascade stage 1), Strong Classifier 2 (cascade stage 2), …, Strong Classifier N (cascade stage N); if it survives every stage, a FACE is IDENTIFIED.
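The sub-windows fed to the cascade come from scanning the image at many positions and scales. A hypothetical sketch of that scan loop (the parameters and the `cascade_passes` callable are my own illustrative stand-ins, not the slides' values):

```python
def scan(image_w, image_h, cascade_passes, base=24, scale_step=1.25, shift=1):
    """Slide a square window over the image at several scales.

    cascade_passes(x, y, size) stands in for the trained cascade applied
    to the sub-window at (x, y) with side `size`.
    """
    detections = []
    size = base
    while size <= min(image_w, image_h):
        # step grows with the window so larger scales are scanned coarser
        step = max(1, int(shift * size / base))
        for y in range(0, image_h - size + 1, step):
            for x in range(0, image_w - size + 1, step):
                if cascade_passes(x, y, size):
                    detections.append((x, y, size))
        size = int(size * scale_step)
    return detections
```

With a dummy predicate accepting only the window at (0, 0) of side 24, `scan(48, 48, ...)` returns exactly that one detection.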
Face Recognition Process
PCA ALGORITHM
STEP 1: Convert each image of the training set to an image vector
- A training set consists of M images in total.
- Each image is of size N × N.
- Each N × N image is reshaped into a single N² × 1 vector T_i in the face vector space.
STEP 2: Normalize the face vectors
1. Calculate the average face vector U of the M training vectors T_i.
2. Subtract the mean (average) face vector from EACH face vector to get the normalized face vectors: Ø_i = T_i − U.
E.g., if T_1 = (a_1, a_2, a_3, …)ᵀ and U = (m_1, m_2, m_3, …)ᵀ, then Ø_1 = (a_1 − m_1, a_2 − m_2, a_3 − m_3, …)ᵀ.
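Steps 1 and 2 can be sketched with numpy (a toy example: the random "images" and the small sizes are stand-ins; variable names follow the slides, with T_i the face vectors, U the mean face, and A the matrix of normalized vectors):

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 4, 6                          # image side N, training-set size M
images = rng.random((M, N, N))       # M stand-in training images, N x N

# STEP 1: flatten each image into an N^2 x 1 column vector T_i
T = images.reshape(M, N * N).T       # columns of T are the vectors T_i

# STEP 2: compute the average face U and normalize each vector
U = T.mean(axis=1, keepdims=True)    # mean over the M training faces
A = T - U                            # columns of A are phi_i = T_i - U
```

By construction the normalized vectors sum to zero across the training set, which is what makes A suitable for the covariance computation in the next step.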
STEP 3: Calculate the eigenvectors (eigenvectors represent the variations in the faces)
To calculate the eigenvectors, we need the covariance matrix C:
C = A·Aᵀ, where A = [Ø_1, Ø_2, Ø_3, …, Ø_M] is N² × M.
C is therefore (N² × M)·(M × N²) = N² × N²: a very huge matrix with N² eigenvectors.
But we need to find only K of those N² eigenvectors, where K < M.
E.g., if N = 50 and K = 100, we need to find 100 eigenvectors out of 2500 (i.e. N²): VERY TIME CONSUMING.
SOLUTION: "DIMENSIONALITY REDUCTION", i.e. calculate the eigenvectors from a covariance matrix of reduced dimensionality:
New C = Aᵀ·A, which is (M × N²)·(N² × M) = M × M: a much smaller matrix with only M eigenvectors.
STEP 4: Calculate the eigenvectors v_i of the reduced M × M covariance matrix; each v_i has dimension M × 1.
STEP 5: Select the K best eigenfaces, such that K ≤ M, that can represent the whole training set
- The selected K eigenfaces MUST be in the ORIGINAL dimensionality of the face vector space.
STEP 6: Convert the lower-dimensional eigenvectors to the original face dimensionality
u_i = A·v_i
where u_i is the i-th eigenvector in the higher-dimensional space (N² × 1, e.g. 2500 × 1) and v_i is the i-th eigenvector in the lower-dimensional space (M × 1, e.g. 100 × 1). The K selected eigenfaces are chosen from among these u_i.
STEP 7: Represent each face image as a linear combination of all K eigenvectors
face ≈ (mean face) + w_1·u_1 + w_2·u_2 + w_3·u_3 + … + w_K·u_K
We can say that each face image contains some proportion of every one of these eigenfaces. The weights are stacked into a vector:
Ω = (w_1, w_2, …, w_K)ᵀ
Calculating the weight of each eigenface
The weight is the projection of a face's normalized vector Ø onto each eigenface:
w_i = u_iᵀ·Ø
For example: w_1 = u_1ᵀ·Ø, w_2 = u_2ᵀ·Ø.
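Computing the full weight vector Ω for one face is a single matrix-vector product. A toy sketch (the random orthonormal "eigenfaces" and the sizes are illustrative stand-ins, not a real training result):

```python
import numpy as np

rng = np.random.default_rng(2)
N2, K = 16, 3
# stand-in for K orthonormal eigenfaces u_1..u_K (columns)
eigenfaces, _ = np.linalg.qr(rng.random((N2, K)))
phi = rng.random((N2, 1))            # a normalized face vector Ø

# w_i = u_i^T . phi for every i at once, stacked into a K x 1 vector
Omega = eigenfaces.T @ phi
```

Each entry `Omega[i]` equals the dot product of the i-th eigenface with Ø, exactly the per-weight formula above.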
Recognizing an unknown face
1. Convert the input image to a face vector T.
2. Normalize the face vector: Ø = T − U, i.e. subtract the mean face component-wise: (a_1 − m_1, a_2 − m_2, a_3 − m_3, …)ᵀ.
3. Project the normalized face onto the eigenspace to get the weight vector of the input image: Ω = (w_1, w_2, …, w_K)ᵀ.
4. Calculate the distance between the input weight vector and all the weight vectors of the training set: ε_i = ‖Ω − Ω_i‖², i = 1…M.
5. Is the smallest distance below the threshold ∂? If YES, the input is RECOGNIZED AS the matching training face; if NO, it is an UNKNOWN FACE.
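The final decision step can be sketched as follows (names and toy numbers are mine; `stored` holds the training weight vectors Ω_i):

```python
import numpy as np

def recognize(omega, stored, threshold):
    """Return the index of the best-matching training face, or None.

    eps_i = ||omega - omega_i||^2 for every training weight vector;
    the face is recognized only if the smallest distance is below the
    threshold, otherwise it is reported as unknown.
    """
    dists = [float(np.sum((omega - oi) ** 2)) for oi in stored]
    best = int(np.argmin(dists))
    return best if dists[best] < threshold else None
```

For example, with stored vectors [(0, 0), (1, 1)] and threshold 0.5, an input near (0.9, 1.1) is recognized as face 1, while an input far from both, such as (5, 5), comes back as None (unknown face).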
Demo
References:
- www.google.com
- www.fec.rec.org
- www.wikipedia.com
- http://sebastianraschka.com/Articles/2014_pca_step_by_step.html
- M. Lam, H. Yan, "An analytic-to-holistic approach for face recognition based on a single frontal view," IEEE Trans. Pattern Anal. Mach. Intell. 20 (1998) 673-686.
- Zhang, "Automatic adaptation of a face model using action units for semantic coding of videophone sequences," IEEE Trans. Circuits Systems Video Technol. 8 (6) (1998) 781-795.