UNIVERSITY OF MURCIA (SPAIN)
ARTIFICIAL PERCEPTION AND PATTERN
RECOGNITION GROUP
Estimating 3D Facial Pose in Video with Just Three Points
Ginés García Mateos, Alberto Ruiz García (Dept. de Informática y Sistemas)
P.E. López-de-Teruel, A.L. Rodríguez, L. Fernández (Dept. Ingeniería y Tecnología de Computadores)
University of Murcia - SPAIN
ESTIMATING 3D FACIAL POSE IN VIDEO WITH JUST THREE POINTS
G. García, A. Ruiz, P.E. López, A.L. Rodríguez, L. Fernández
3DFP'2008, Anchorage, June 2008
Introduction (1/3)
• Main objective: to develop a new method to estimate the 3D pose of the head of a human user:
– Estimation through a video sequence
– Working with the minimum necessary information: a 2D location of the face
– A very simple method, without training, running in real-time: fast processing
– Under realistic conditions: robust to facial expressions, light, movements
– Robustness preferred to accuracy
Introduction (2/3)
• 3D pose estimation using 3D tracking…
[Figure: examples of existing 3D tracking approaches: Active Appearance Model, shape & texture models, cylindrical models, 3D morphable mesh]
Image sources:
http://www.lysator.liu.se/~eru/research/
http://www.merl.com/projects/3Dfacerec/
www.cs.bu.edu/groups/ivc/html/research_list.php
http://cvlab.epfl.ch/research/body
Introduction (3/3)
• In short, we want to obtain something like this:
• The result is a 3D location (x, y, z) and a 3D orientation (roll, pitch, yaw): 6 D.O.F.
Index of the presentation
• Overview of the proposed method
– 2D facial detection and location
– 2D face tracking
• 3D facial pose estimation
– 3D position
– 3D orientation
• Experimental results
• Conclusions
Overview of the Proposed Method
• The key idea: separate the problems of 2D tracking and 3D pose estimation.
• By introducing some assumptions and simplifications, pose is extracted with very little information.
• The proposed 3D pose estimator could use any 2D facial tracker.
Pipeline: 2D face detection → 2D face tracking → 3D pose estimation
2D Face Detection, Location and Tracking Using I.P.
• We use a method based on integral projections (I.P.), which is simple and fast.
• Definition of I.P.: average of gray levels of an image along rows and columns.
• Vertical projection: PVi : [ymin, ..., ymax] → R, given by $PV_i(y) := \overline{i(\cdot, y)}$, the average of i(x, y) over x.
• Horizontal projection: PHi : [xmin, ..., xmax] → R, given by $PH_i(x) := \overline{i(x, \cdot)}$, the average of i(x, y) over y.
[Figure: a face image i(x, y) with its vertical projection PV(y) and its horizontal projection PH(x)]
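The definition of integral projections can be illustrated with a minimal Python sketch. It assumes a grayscale image stored as a list of rows (`image[y][x]`); the function names are hypothetical and are not taken from the authors' implementation.

```python
def vertical_projection(image):
    """PV(y): mean gray level of row y, averaged over all x."""
    return [sum(row) / len(row) for row in image]


def horizontal_projection(image):
    """PH(x): mean gray level of column x, averaged over all y."""
    height = len(image)
    # zip(*image) iterates over the columns of the row-major image.
    return [sum(column) / height for column in zip(*image)]


# Tiny 2x2 example image (2 rows, 2 columns).
tiny = [[0, 100],
        [50, 150]]
print(vertical_projection(tiny))    # [50.0, 100.0]
print(horizontal_projection(tiny))  # [25.0, 125.0]
```

Each projection reduces the 2D image to a 1D signal, which is why the detection, location and tracking steps that follow can be fast.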
2D Face Detection with I.P.
Global view of the I.P. face detector (figure labels: PVface, PHeyes):
Input image → Step 1. Vertical projections by strips → Step 2. Horizontal projection of the candidates → Step 3. Grouping of the candidates → Final result
2D Face Detection with I.P.
• To improve the results, we combine two face detectors into a combined detector.
Face Detector 1 (look for candidates) → Face Detector 2 (verify face candidates) → Final detection result
Detectors combined: Haar + AdaBoost [Viola and Jones, 2001] and Integral Projections [Garcia et al, 2007]
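The candidate-then-verify scheme can be sketched generically in Python. This is only an illustration under stated assumptions: both detectors are passed in as plain callables, and the names `combined_detector`, `find_candidates` and `verify` are hypothetical, not part of any real library or of the authors' code.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # hypothetical (x, y, width, height) rectangle


def combined_detector(
    image: object,
    find_candidates: Callable[[object], List[Box]],
    verify: Callable[[object, Box], bool],
) -> List[Box]:
    """Run detector 1 to propose candidates, keep those detector 2 accepts."""
    return [box for box in find_candidates(image) if verify(image, box)]


# Usage with stand-in detectors (placeholders, not real face detectors).
def fake_candidates(image):
    return [(0, 0, 24, 24), (40, 40, 24, 24)]


def fake_verify(image, box):
    return box[0] == 0  # accept only the first rectangle


faces = combined_detector(None, fake_candidates, fake_verify)
```

Either detector can play either role, so swapping the two callables gives the other combination order.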
2D Face Detection with I.P.
ROC curves on UMU FaceDB (737 img./853 faces) [Garcia et al, 2007]
[Figure: two ROC plots, % detected faces vs. % false positives, for the detectors IP+Haar, Haar+IP, Haar, NeuralNet, IntProj, TemMatch and Cont]
Table (one column per detector):
Det. ratio at F.P. = 0.5%: 84.2% | 91.8% | 88.6% | 39.0% | 24.8% | 88.6% | 96.1%
Time (PIV 2.6 GHz): 85 ms | 293 ms | 2338 ms | 389 ms | 120 ms | 97 ms | 296 ms
2D Face Location with I.P.
Global view of the 2D face locator:
Input image and face → Step 1. Orientation estimation → Step 2. Vertical alignment → Step 3. Horizontal alignment → Final result
[Figure: projections before and after alignment: PVeyes(y) and PV'eyes(y); PVface(y) and PV'face(y) with the model MVface(y); PHeyes(x) and PH'eyes(x) with the model MHeyes(x)]
2D Face Location with I.P.
Location accuracy of the 2D face locator
Av. time (PIV 2.6 GHz): IntProj 1.7 ms; NeuralNet 323.6 ms; EigenFeat 20.5 ms
2D Face Tracking with I.P.
[Diagram: tracking loop with the blocks "Initial face detection & location", "Prediction of new position", "Face relocation", "Motion model update" and "Frame t+1"; the outcomes are "Correct tracking" and "Lost face"]
FACE TRACKING: Step 0. Prediction → Step 1. Vertical alignment → Step 2. Horizontal alignment → Step 3. Orientation estimation
[Figure: projections before and after alignment: PVface(y) and PV'face(y); PHeyes(x) and PH'eyes(x); PVeyes(y) and PV'eyes(y)]
2D Face Tracking with I.P.
• Sample result of the proposed tracker.
320x240 pixels, 312 frames at 25fps, laptop webcam
(e1x, e1y) = location of left eye; (e2x, e2y) = right eye; (mx, my) = location of the mouth
3D Facial Pose Estimation
• In theory, 3 points should be enough to solve the 6 degrees of freedom (if focal length and face geometry are known).
• But…
• Location errors are high in the mouth for non-frontal faces.
• Some assumptions are introduced to avoid the effect of this error.
3D Facial Pose Estimation
• Fixed-body assumption: the user's body is fixed and only the head moves → 3D position is estimated in the first frame; 3D orientation in the following frames.
• A simple perspective projection model is used to estimate 3D position.
3D Position Estimation
• f: focal length (known)
• (cx, cy): tracked center of the face
cx = (e1x+e2x+mx)/3
cy = (e1y+e2y+my)/3
• p = (px, py, pz): 3D position of the face
[Figure: perspective projection model with the center of projection at (0,0,0), axes X, Y and Z, the image plane at distance f containing (cx, cy), and the point p = (px, py, pz) at depth pz]
3D Position Estimation
• We have:
cx/f = px/pz ; cy/f = py/pz
• Where:
cx = (e1x+e2x+mx)/3 ; cy = (e1y+e2y+my)/3
• So:
px = (e1x+e2x+mx)/3 · pz/f
py = (e1y+e2y+my)/3 · pz/f
• The depth of the face, pz, is computed with: pz = f·t/r, where r is the apparent face size* and t is the real size.
* For more information, see the paper.
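The position equations translate directly into a few lines of Python. The sketch below is an illustration, not the authors' code: the function name `estimate_position` is hypothetical, and it assumes image coordinates are measured relative to the principal point, with f and r in pixels and t in the unit wanted for the output (for example metres).

```python
def estimate_position(e1, e2, m, f, t, r):
    """Return (px, py, pz) from the eyes e1, e2 and the mouth m.

    e1, e2, m: (x, y) image points, assumed relative to the principal point.
    f: focal length in pixels; t: real face size; r: apparent face size in pixels.
    """
    # Tracked center of the face: centroid of the three points.
    cx = (e1[0] + e2[0] + m[0]) / 3.0
    cy = (e1[1] + e2[1] + m[1]) / 3.0
    # Depth from the ratio of real to apparent size: pz = f * t / r.
    pz = f * t / r
    # Back-project the centroid: cx / f = px / pz and cy / f = py / pz.
    px = cx * pz / f
    py = cy * pz / f
    return px, py, pz


# Example values chosen for illustration only.
position = estimate_position((0, 30), (60, 30), (30, 0), f=500.0, t=0.15, r=75.0)
```

Under the fixed-body assumption this would be evaluated once, in the first frame.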
Estimation of Roll Angle
• Roll angle can be approximately associated with the 2D rotation of the face in the image:
$\mathrm{roll} = \arctan\left(\frac{e_{2y} - e_{1y}}{e_{2x} - e_{1x}}\right)$
• This equation is valid in most practical situations, but it is not precise in all cases.
[Figure: four sample frames with roll = -43.7º, roll = -2.8º, roll = 15.9º and roll = 34.6º]
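As a small worked example, the roll estimate can be computed from the two eye locations. This sketch uses `math.atan2` rather than a plain arctangent of the quotient; that is a substitution made here to avoid a division by zero when both eyes share the same x coordinate, and it gives the same value as the slide's formula whenever e2x > e1x. The function name is hypothetical.

```python
import math


def estimate_roll(e1, e2):
    """Roll in degrees from the eye locations e1 = (e1x, e1y), e2 = (e2x, e2y)."""
    dy = e2[1] - e1[1]
    dx = e2[0] - e1[0]
    # atan2(dy, dx) equals arctan(dy / dx) for dx > 0 and is defined for dx == 0.
    return math.degrees(math.atan2(dy, dx))


# Eyes on a horizontal line give zero roll.
level = estimate_roll((100, 120), (160, 120))
```

The sign of the result depends on whether the image y axis points up or down, which the slides do not state.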
Estimation of Pitch and Yaw
• The head-neck system can be modeled as a robotic arm, with 3 rotational DOF.
[Figure: head-neck model in orthographic, top and front views, with axes X, Y and Z, the rotations roll, pitch and yaw, and the dimensions a, b and c]
• In this model, any point of the head lies on a sphere → its projection is related to pitch and yaw.
[Figure: image axes Xi, Yi with a circle of radius ri, the initial point (dx0, dy0) and the current point (dxt, dyt)]
Estimation of Pitch and Yaw
• rw: radius of the sphere where the center of the eyes lies: rw = sqrt(a² + c²)
• ri: radius of the circle where that sphere is projected: ri = rw·f/pz
• (dx0, dy0): initial center of eyes.
• (dxt, dyt): current center of eyes: ((e1x+e2x)/2, (e1y+e2y)/2)
[Figure: three panels in image axes Xi, Yi, each showing the circle of radius ri: initial frame (pitch = 0, yaw = 0) with (dx0, dy0); instant t = 1 with (dx1, dy1); instant t = 2 with (dx2, dy2)]
Estimation of Pitch and Yaw
• In essence, we have a problem of computing altitude and latitude for a given point in a circle.
• The center of the circle is:
(dx0, dy0 − a·f/pz)
• So we have:
$\mathrm{pitch} = \arcsin\left(\frac{dy_t - (dy_0 - a \cdot f/p_z)}{r_i}\right) - \arcsin(a/c)$
• And:
$\mathrm{yaw} = \arcsin\left(\frac{dx_t - dx_0}{r_i \cdot \cos(\mathrm{pitch} + \arcsin(a/c))}\right)$
[Figure: image axes Xi, Yi with the circle of radius ri, the initial point (dx0, dy0) and the current point (dxt, dyt)]
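The pitch and yaw formulas can be transcribed into Python as follows. This is a sketch that follows the slide equations as written; the function name, the clamping of the arcsine arguments (added here to absorb small tracking noise) and the example values are assumptions and do not come from the presentation. It also assumes a, c and pz share one unit, f is in pixels, a ≤ c, and the image axes follow the slide's convention.

```python
import math


def estimate_pitch_yaw(dx0, dy0, dxt, dyt, a, c, f, pz):
    """Return (pitch, yaw) in radians for the current center of the eyes."""
    rw = math.hypot(a, c)          # rw = sqrt(a^2 + c^2)
    ri = rw * f / pz               # radius of the projected circle, in pixels
    offset = math.asin(a / c)      # constant term arcsin(a/c)
    center_y = dy0 - a * f / pz    # circle center is (dx0, dy0 - a*f/pz)

    def clamp(v):
        # Keep the arcsine argument inside [-1, 1] (an added safeguard).
        return max(-1.0, min(1.0, v))

    pitch = math.asin(clamp((dyt - center_y) / ri)) - offset
    # Note: the denominator vanishes if pitch + offset reaches +/-90 degrees.
    yaw = math.asin(clamp((dxt - dx0) / (ri * math.cos(pitch + offset))))
    return pitch, yaw


# Illustrative values: a = 0.03, c = 0.4, pz = 1.0 (same unit), f = 500 px.
pitch, yaw = estimate_pitch_yaw(160, 120, 260, 120, a=0.03, c=0.4, f=500.0, pz=1.0)
```

With these example values the point moves 100 px horizontally on a circle whose effective radius is 200 px, so the yaw comes out at about 30 degrees while the pitch stays close to zero.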
Experimental Results (1/7)
• Experiments carried out:
– Off-the-shelf webcams.
– Different individuals.
– Variations in facial expressions and facial elements (glasses).
• Studies of robustness and efficiency, and comparison with a projection-based 3D estimation algorithm.
• On a Pentium IV at 2.6 GHz: ~5 ms file reading, ~3 ms tracking, ~0.006 ms pose estimation
Experimental Results (2/7)
• Sample input video: bego.a.avi
320x240 pixels, 312 frames at 25fps, laptop webcam
Experimental Results (3/7)
• 3D pose estimation results
320x240 pixels, 312 frames at 25fps, laptop webcam
Experimental Results (4/7)
[Figure: estimated pitch vs. frame number (0 to 400 frames), comparing the proposed method with the projection-based method]
Experimental Results (5/7)
• Range of working angles…
• Approx. ±20º in pitch and ±40º in yaw.
• The 2D tracker is not explicitly prepared for profile faces!
Experimental Results (6/7)
• With glasses and without glasses
Experimental Results (7/7)
• When the fixed-body assumption does not hold
• Body/shoulder tracking could be used to compensate for body movement.
Conclusions (1/3)
• Our purpose was to design a fast, robust, generic and approximate 3D pose estimation method:
– Separation of 2D tracking and 3D pose.
– Fixed-body assumption.
– Robotic head model.
• 3D position is computed in the first frame.
• 3D orientation is estimated in the rest of frames.
• Estimation process is very simple, and avoids inaccuracies in the 2D tracker.
Conclusions (2/3)
• Future work: using the 3D pose estimator in a perceptual interface.
Conclusions (3/3)
• The simplifications introduced lead to several limitations of our system, but in general…
• Human anatomy of the head/neck system could be used in 3D face trackers.
• The human head cannot move independently of the body!
• Taking advantage of these anatomical limitations could simplify and improve current trackers.
Last
• This work has been supported by the projects Consolider Ingenio-2010 CSD2006-00046 and TIN2006-15516-C04-03.
• Sample videos:
http://dis.um.es/~ginesgm/fip
• Grupo PARP web page:
http://perception.inf.um.es/
Thank you very much