
A Real-Time Multi Face Detection Technique Using Positive-Negative Lines-of-Face Template

Yuichi Hori, Kenji Shimizu, Yutaka Nakamura and Tadahiro Kuroda

Department of Electrical Engineering, Keio University, Japan

Abstract

This paper describes a real-time multi-face detection technique for color video sequences. A 3D rational skin color model and a positive-negative lines-of-face template are proposed to improve the signal-to-noise ratio (SNR) in face detection. A Steady State Genetic Algorithm (SSGA) is employed to detect lines-of-face over the entire image. The hardware architecture is optimized for high-speed operation and small hardware resources. An experimental system is implemented on a Field Programmable Gate Array (FPGA) using only 40k gates for logic and 240k gates for memory. It detects 6 faces in real time (30 fps, i.e. every 33 ms) in 320x240-pixel (QVGA) color video sequences. A detection rate of 98% is achieved on 89 images containing 205 faces from daily scenes.

1. Introduction

A large variety of face detection techniques have been proposed, and they can be classified into two types [1].

The first is the feature-based approach [2], which relies on knowledge of the features of human faces. It is vulnerable to cluttered noise. The second is the image-based approach [3], which relies on a statistical model obtained with a variety of learning methods. It requires a long calculation time because a window must be scanned over the image to evaluate correlation, which hinders real-time face detection.

This paper describes a novel real-time multi-face detection technique. To improve the SNR and to reduce computation time, a 3D rational skin color model and a positive-negative lines-of-face template are proposed.

SSGA [5,6], which can be pipelined easily, is adopted for the global search. An experimental system using an FPGA is developed, and it is demonstrated that this system achieves real-time operation.

2. Face Detection Algorithm

Figure 1 shows an overview of the proposed face detection algorithm, which consists of three major operations: (i) image preprocessing to emphasize the edges of faces; (ii) lines-of-face detection by SSGA; and (iii) face decision by locating the lips.

3. Image Pre-Processing

3.1. 3D Color Model for Skin Tone Extraction

For effective skin tone extraction, an appropriate skin color model must be chosen. We introduce a rational skin tone model that takes the color characteristics of the image input system into account.

In the image input system, a lens first focuses the light, a Charge Coupled Device (CCD) then converts the light into electronic RGB signals, and finally the RGB signals are converted to 8-bit R, G, and B digital signals. Light reflected from an object of a given color is thus converted to R, G, and B digital values that are proportional to the luminance Y of the reflected light. Cb and Cr are calculated from R, G, and B. Hence Cb and Cr should also be proportional to Y.

However, in the high-Y range the RGB signals saturate, so Cb and Cr are no longer proportional to Y. We therefore choose the skin color model shown in Fig. 2. In Fig. 2, the gray dots show the skin color cluster extracted from sample images, and the 4 rectangular parallelepipeds, one for each range of Y, represent the proposed 3D color model in the YCbCr color space.
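A minimal sketch of such a Y-dependent box model is given below. It only illustrates the idea of one CbCr rectangle per Y range. The box boundaries in `SKIN_BOXES`, the function names, and the use of the full-range ITU-R BT.601 conversion are all assumptions made for illustration; the paper gives the actual boxes only graphically in Fig. 2.

```python
# Hypothetical boxes: (y_min, y_max, cb_min, cb_max, cr_min, cr_max).
# These numbers are placeholders for illustration, NOT values from the paper.
SKIN_BOXES = [
    (0, 64, 110, 128, 128, 150),
    (64, 128, 100, 125, 135, 165),
    (128, 192, 90, 120, 140, 175),
    (192, 256, 100, 125, 135, 165),
]


def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB to YCbCr (full-range BT.601, assumed here)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_skin(r, g, b, boxes=SKIN_BOXES):
    """Return True if the pixel falls inside the CbCr box of its Y range."""
    y, cb, cr = rgb_to_ycbcr(r, g, b)
    for y_min, y_max, cb_min, cb_max, cr_min, cr_max in boxes:
        if y_min <= y < y_max:
            return cb_min <= cb <= cb_max and cr_min <= cr <= cr_max
    return False
```

Because each Y range has its own box, the model can follow the skin cluster as it bends away from proportionality at high Y, while each pixel test remains a handful of comparisons.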

Figure 1. Face detection algorithm. A color image passes through image preprocessing ((A) skin tone extraction, (B) edge detection), lines-of-face detection ((C) entire image search by SSGA), and face decision ((D) decide whether a candidate is a face or not).

$$\mathrm{fitness}(I_k) = \frac{\sum_{j=1}^{n} f\left(x^{*}_{kj},\, y^{*}_{kj}\right)}{n \times L} \qquad (2)$$

Figure 2. Proposed 3D color model for skin tone extraction. Cb and Cr depend on Y.

3.2. Edge Detection and Blurring

Edge detection and blurring are applied to the image obtained by skin tone extraction (Fig. 1 (A)). This emphasizes the edges of faces for face detection.
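As a rough illustration of the blurring part of this step, the sketch below spreads every edge pixel into a peak of height L that decays by 1 per pixel inside a 13x13 window. Several details are assumptions: the binary edge map is taken as given (for example, from a thresholded 5x5 Laplacian), the distance is measured as the Chebyshev distance, overlapping peaks are combined by taking the maximum, and the function name `blur_edges` is hypothetical.

```python
def blur_edges(edge_mask, L=3, window=13):
    """Blur a binary edge map: edge pixels get L, neighbours get L - distance.

    edge_mask is a list of rows of 0/1 values. The distance metric
    (Chebyshev) and the max-combination of overlaps are assumptions.
    """
    radius = window // 2
    h, w = len(edge_mask), len(edge_mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not edge_mask[y][x]:
                continue
            # Stamp a decaying peak centred on this edge pixel.
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        value = L - max(abs(dy), abs(dx))
                        if value > out[yy][xx]:
                            out[yy][xx] = value
    return out
```

The blurred image gives the template matcher a gradient to climb: a template that is slightly misplaced still scores above zero, which helps a stochastic search such as a GA converge on the edge.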

First, a Laplacian filter with a 5x5 window is applied to extract the edges of the skin tone regions (Fig. 1 (B)). Then, a 13x13 window is scanned over the whole image. If there is an edge at the center of the window, the value of the center pixel is set to L. Surrounding pixels are set to L - i (i = 1, 2, ..., L) depending on their distance from the center pixel, which yields a blurred image like Fig. 1 (B).

4. Positive-Negative Lines-of-Face Template

This section describes a positive-negative lines-of-face template used in template matching to efficiently detect the edges of faces in the entire image.

Figure 3 shows the proposed positive-negative lines-of-face template, which is composed of 30 points. Black points form the positive template, which evaluates the existence of lines-of-face. Gray points form the negative template, which evaluates the nonexistence of lines-of-face.

Figure 3. Proposed positive-negative lines-of-face template composed of 30 points. Legend: positive template, negative template, no template placed.

We use a semi-ellipse composed of the positive template points to detect the edges of faces. In addition, negative template points are laid outside the positive template. There are no negative template points at the bottom of the template because this is the neck area. Sometimes the chin edge does not appear because the chin and the neck have the same color, so we place the positive template points sparsely there. Results of lines-of-face detection are shown in Fig. 4.

5. Steady State Genetic Algorithm

To extract the edges of faces by matching the proposed template to the image, a Genetic Algorithm (GA) [4] is employed as the global search procedure. Among the variety of GA techniques, we employ SSGA [5].

The simple GA sorts the individuals to select those that survive to the next generation. In contrast, SSGA does not sort. Genetic operations such as selection, crossover, and mutation are performed N (population size) times in each generation. The time required for these operations in software is proportional to N, whereas the time for the sort operation is proportional to N^2. With a large population, SSGA therefore runs faster than the simple GA.
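The sort-free update can be sketched as follows. This is a generic steady-state step, not the authors' exact operator set: binary tournament selection, one-point crossover, per-bit mutation, and "replace a random individual only if the child is better" are all assumptions chosen because none of them needs the population to be sorted. The function name `ssga_step` and its parameters are hypothetical.

```python
import random

GENE_BITS = 27  # gene length used in this work


def ssga_step(population, scores, fitness_fn, rng,
              mutation_rate=1.0 / GENE_BITS):
    """One steady-state update: create one child, replace at most one member."""
    n = len(population)

    def tournament():
        # Binary tournament: compare two random members, no sorting needed.
        a, b = rng.randrange(n), rng.randrange(n)
        return population[a] if scores[a] >= scores[b] else population[b]

    p1, p2 = tournament(), tournament()

    # One-point crossover on the bit string.
    cut = rng.randrange(1, GENE_BITS)
    low_mask = (1 << cut) - 1
    child = (p1 & ~low_mask) | (p2 & low_mask)

    # Independent bit-flip mutation.
    for bit in range(GENE_BITS):
        if rng.random() < mutation_rate:
            child ^= 1 << bit

    # Replace a randomly chosen member only if the child scores higher.
    child_score = fitness_fn(child)
    victim = rng.randrange(n)
    if child_score > scores[victim]:
        population[victim] = child
        scores[victim] = child_score
```

Each step touches a constant number of individuals, so N steps cost time proportional to N, and every stage (select, cross, mutate, evaluate, replace) can feed the next without waiting for a global sort.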

In a hardware implementation, the sorting in the simple GA causes pipeline stalls. Because SSGA has no sorting, the whole process can be pipelined, which improves hardware performance drastically [6].

5.1. Gene Coding

Template matching for face detection has 4 parameters: the barycentric coordinates x and y, the rotation angle θ, and the magnification M. To apply these parameters to SSGA, we define the gene code as g = (x, y, θ, M).

In this work, the input image size is QVGA, so x is 9 bits long and y is 8 bits long. Both θ and M are 5 bits long, giving 32 steps of rotation and 32 steps of magnification with a fixed aspect ratio. In total, the gene is 27 bits long.
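The bit budget above (9 + 8 + 5 + 5 = 27) can be illustrated with a small pack/unpack sketch. The field order inside the 27-bit word and the function names are assumptions; the paper specifies only the field widths.

```python
X_BITS, Y_BITS, THETA_BITS, M_BITS = 9, 8, 5, 5  # widths from the text


def encode_gene(x, y, theta, m):
    """Pack (x, y, theta, m) into one 27-bit integer (field order assumed)."""
    assert 0 <= x < (1 << X_BITS)
    assert 0 <= y < (1 << Y_BITS)
    assert 0 <= theta < (1 << THETA_BITS)
    assert 0 <= m < (1 << M_BITS)
    gene = x
    gene = (gene << Y_BITS) | y
    gene = (gene << THETA_BITS) | theta
    gene = (gene << M_BITS) | m
    return gene


def decode_gene(gene):
    """Unpack a 27-bit gene into (x, y, theta, m)."""
    m = gene & ((1 << M_BITS) - 1)
    gene >>= M_BITS
    theta = gene & ((1 << THETA_BITS) - 1)
    gene >>= THETA_BITS
    y = gene & ((1 << Y_BITS) - 1)
    gene >>= Y_BITS
    x = gene & ((1 << X_BITS) - 1)
    return x, y, theta, m
```

Note that 9 and 8 bits cover 0..511 and 0..255, slightly more than the 320x240 image, so some codes fall outside the frame; how such genes are handled is not stated in the paper. The full code space has 2^27 = 134,217,728 points, which is why an exhaustive scan is avoided in favor of a GA.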

5.2. Fitness Evaluation

To evaluate fitness, the template is first transformed by an affine transformation whose parameters are decoded from the gene code. The transformed template is placed on the target image, and the pixel values are summed over all points of the template. The fitness of an individual is defined from this sum of pixel values by Eq. (2), where I_k is an individual, n is the number of points in the template, L is the maximum value of a blurred pixel described in Section 3.2, and f(x*_kj, y*_kj) is the blurred pixel value under the j-th transformed template point.
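The evaluation of Eq. (2) can be sketched as below. The sketch follows the equation literally, summing the blurred pixel values under the transformed template points and normalizing by n × L. It does not model how negative template points enter the score, since that is not spelled out in the equation. The mapping from the 5-bit θ and M codes to an angle and a scale factor is also left to the caller, and treating out-of-image points as 0 is an assumption. Function names are hypothetical.

```python
import math


def transform_point(px, py, cx, cy, angle, scale):
    """Rotate and scale a template point, then move it to centre (cx, cy)."""
    c, s = math.cos(angle), math.sin(angle)
    return (cx + scale * (c * px - s * py),
            cy + scale * (s * px + c * py))


def fitness(blurred, template, cx, cy, angle, scale, L):
    """Normalized sum of blurred pixel values under the template (Eq. 2).

    blurred is a list of rows; template is a list of (px, py) offsets
    relative to the template centre. Points outside the image add 0.
    """
    h, w = len(blurred), len(blurred[0])
    total = 0
    for px, py in template:
        tx, ty = transform_point(px, py, cx, cy, angle, scale)
        xi, yi = int(round(tx)), int(round(ty))
        if 0 <= xi < w and 0 <= yi < h:
            total += blurred[yi][xi]
    return total / (len(template) * L)
```

With this normalization the score lies in [0, 1] for non-negative pixel values, so it can be compared directly against a fixed threshold.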

If the fitness exceeds a threshold TH_GA, the area framed by the template is determined to be a face. An area that has been detected once is deleted from the search space.

6. Face Decision

The positive-negative template may produce false positives on face-like shapes, so it is essential to decide whether an extracted area is a face or not.

For the face decision, we detect the lips in a detected area. A lip area has a high Cr value in the YCbCr color space. If a detected area contains a region whose Cr exceeds a threshold, the area is decided to be a face. The Cr threshold is set relative to the Cr histogram of the skin area.
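One way to picture a relative Cr threshold is sketched below. The paper does not give the rule that derives the threshold from the histogram, so everything specific here is an assumption: the threshold is taken as the mean Cr of the skin pixels plus a fixed `offset`, and the area is accepted if at least `min_count` pixels exceed it. The function name and both parameters are hypothetical.

```python
def has_lip_region(cr_values, offset=15, min_count=5):
    """Accept an area if enough skin pixels have Cr well above the mean.

    cr_values: Cr values of the skin pixels in the candidate area.
    offset, min_count: illustrative parameters, not taken from the paper.
    """
    if not cr_values:
        return False
    mean_cr = sum(cr_values) / len(cr_values)
    threshold = mean_cr + offset  # threshold relative to this area's histogram
    high = sum(1 for v in cr_values if v > threshold)
    return high >= min_count
```

A relative threshold of this kind adapts to lighting and skin tone, because it depends on the Cr distribution of the candidate area itself and not on a global constant.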

In general, the Cr histogram of the skin area in a noise region is sharply peaked, whereas that of a face area includes a portion with relatively high Cr.

7. Simulation Results

7.1. Condition

To evaluate the accuracy of the proposed method, we examined 89 QVGA color images. These images contain 205 faces and include indoor, outdoor, daytime, and night scenes, taken with and without photoflash, among others.

This experiment uses the positive-negative template shown in Fig. 3, and SSGA uses 200 individuals in total. We set L = 3 and TH_GA = 0.85.

7.2. Detection Rate

The experimental results obtained by software simulation are shown in Table 1. The detection rate and the error rate are calculated as follows.

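$$\text{Detection Rate (\%)} = \frac{\text{Number of detected faces}}{\text{Number of correct faces}} \times 100 \qquad (3)$$

$$\text{Error Rate (\%)} = \frac{\text{Number of false positives}}{\text{Number of correct faces}} \times 100 \qquad (4)$$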
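Equations (3) and (4) translate directly into code. The sketch below also back-computes the approximate counts implied by the reported percentages for 205 correct faces. The paper reports only the percentages, so these counts are approximations, and rounding may hide the exact values. The function names are hypothetical.

```python
def detection_rate(num_detected, num_correct):
    """Eq. (3): detected faces as a percentage of correct faces."""
    return 100.0 * num_detected / num_correct


def error_rate(num_false_positives, num_correct):
    """Eq. (4): false positives as a percentage of correct faces."""
    return 100.0 * num_false_positives / num_correct


NUM_CORRECT = 205  # faces in the 89 test images

# Approximate counts implied by the reported 98 % and 18 % (not reported
# by the authors; derived here by rounding).
implied_detected = round(0.98 * NUM_CORRECT)
implied_false_positives = round(0.18 * NUM_CORRECT)
```

Both rates share the same denominator, the number of correct faces, so the error rate is not bounded by 100 % and is not the complement of the detection rate.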

Table 1 shows that the proposed approach achieves a higher detection rate and a lower error rate than the conventional positive-only template. This result shows the advantage of our approach in face extraction. Some detection results are shown in Fig. 5.

Table 1. Face detection rate.

Template                        Detection rate    Error rate
Conv. positive-only template    77 %              29 %
Positive-negative template      98 %              18 %

7.3. Speed Evaluation

Figure 6 shows the histogram of the number of evaluations needed to detect all faces in the images. The number of evaluations until all faces in an image were detected was 70,000 when only the positive template was used, whereas it was about 20,000 with the positive-negative template, that is, roughly 3.5 times fewer. This shows that the proposed approach detects faces much faster than conventional positive-template-only methods.

Figure 4. Comparison between (b) positive-only template and (c) proposed positive-negative template. (a) Original image. (b) 7 false positives are detected. (c) Only 2 false positives are detected.

Figure 6. Histogram of the number of evaluations required to detect all correct faces, for the positive-negative template and the positive-only template. Horizontal axis: number of evaluations (x1,000); vertical axis: frequency.

8. Experimental System

An FPGA-based experimental system, shown in Fig. 7, was built to demonstrate real-time operation of our method.

Image preprocessing and SSGA were implemented in the FPGA, which runs at 100 MHz. The FPGA required about 40k gates for logic and about 240k gates for memory. The face decision method was implemented on a Hitachi SH-3 microcontroller. We confirmed that this system can detect 6 faces in real time.
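As a rough, back-of-the-envelope check of the real-time claim (not a figure reported by the authors), the 100 MHz clock and the 30 fps frame rate give the cycle budget per frame. Dividing by the roughly 20,000 evaluations of Section 7.3 gives an average budget per fitness evaluation, under the assumptions that the software evaluation count carries over to the hardware and that a single SSGA pipeline handles all evaluations:

$$\frac{100 \times 10^{6}\ \text{cycles/s}}{30\ \text{frames/s}} \approx 3.3 \times 10^{6}\ \text{cycles/frame}, \qquad \frac{3.3 \times 10^{6}\ \text{cycles/frame}}{2 \times 10^{4}\ \text{evaluations/frame}} \approx 167\ \text{cycles/evaluation}$$

Under these assumptions, a 30-point template leaves several cycles per template point on average, which is consistent with the reported real-time operation.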

Figure 7. Experimental FPGA system.

9. Conclusions

An efficient technique for real-time face detection is presented in this paper. An experimental system is developed to demonstrate that real-time face detection is achieved by the proposed method.

The major contributions of this work are:

• The positive-negative lines-of-face template, composed of only 30 points, achieves a high detection rate.

• SSGA is employed for the hardware implementation of the global search, which drastically improves hardware performance.

Future work is to develop an LSI (Large Scale Integrated circuit) that performs the algorithm proposed in this paper.

References

[1] M.-H. Yang, “Detecting Faces in Images: A Survey”, IEEE Trans. on Pattern Analysis and Machine Intelligence, 2002.

[2] A. Lanitis, “Locating Facial Features Using Genetic Algorithms”, Proc. of the International Conference on Digital Signal Processing, 1995.

[3] E. Osuna, “Training Support Vector Machines: An Application to Face Detection”, Proc. of Conference on Computer Vision and Pattern Recognition, 1997.

[4] J. H. Holland, “Adaptation in Natural and Artificial Systems”, University of Michigan Press, 1975.

[5] G. Syswerda, “A Study of Reproduction in Generational and Steady-State Genetic Algorithms”, Foundations of Genetic Algorithms, 1990.

[6] B. Shackleford, “A High-Performance, Pipelined, FPGA-Based Genetic Algorithm Machine”, Journal of Genetic Programming and Evolvable Machines, 2001.

Figure 5. Detection results.
