
1

Face Synthesis

M. L. Gavrilova


2

Outline

Face Synthesis
From Modeling to Synthesis
Facial Expression Synthesis
Conclusions


3

Face Synthesis

Synthesizing photorealistic images of human faces has long been a fascinating yet difficult problem in computer graphics.

Here, the term “face synthesis” refers both to the synthesis of still images and to the synthesis of facial animations. The two are closely related: the technique of synthesizing facial expression images can be used directly for generating facial animations, and most facial animation systems involve the synthesis of still images.


4

Face Synthesis

Face synthesis has many interesting applications.

In the film industry, people would like to create virtual human characters that are indistinguishable from real ones.

In games, people have been trying to create human characters that are interactive and realistic.

There are commercially available products that allow people to create realistic-looking avatars for use in chat rooms, e-mails, greeting cards, and teleconferencing.

Many human-machine dialog systems use realistic-looking human faces as the visual representation of the computer agent that interacts with the human user.


5

Face Modeling from an Image Sequence

Face modeling from an image sequence involves three steps: image matching, structure from motion, and model fitting.

First, two or three relatively frontal views are selected, and image matching algorithms are used to compute point correspondences, either with dense matching techniques such as optical flow or with feature-based corner matching.

Second, one computes the head motion and the 3D structure of the tracked points.

Finally, a face model is fitted to the reconstructed 3D points. Different types of face model representations have been used, including parametric surfaces, linear classes of face scans, and linear deformation vectors.
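As a rough illustration of this three-step pipeline, the sketch below uses OpenCV to match corners between two frames, recover the relative head motion, and triangulate the matched points. The model-fitting step is only stubbed out, and all parameter values (corner counts, camera intrinsics K) are placeholder assumptions, not values from the slides.

```python
import cv2
import numpy as np

def reconstruct_points(img1, img2, K):
    """Steps 1-2: match corners, estimate head motion, triangulate 3D points."""
    g1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
    g2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

    # Step 1: feature-based corner matching (corners tracked by optical flow).
    corners = cv2.goodFeaturesToTrack(g1, maxCorners=500,
                                      qualityLevel=0.01, minDistance=7)
    tracked, status, _ = cv2.calcOpticalFlowPyrLK(g1, g2, corners, None)
    p1 = corners[status.ravel() == 1].reshape(-1, 2)
    p2 = tracked[status.ravel() == 1].reshape(-1, 2)

    # Step 2: structure from motion -- relative pose, then triangulation.
    E, mask = cv2.findEssentialMat(p1, p2, K, method=cv2.RANSAC)
    _, R, t, mask = cv2.recoverPose(E, p1, p2, K, mask=mask)
    P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
    P2 = K @ np.hstack([R, t])
    X = cv2.triangulatePoints(P1, P2, p1.T, p2.T)
    return (X[:3] / X[3]).T  # N x 3 reconstructed points

# Step 3 (not shown): fit a deformable face model to the N x 3 point cloud.
```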


6

Face Modeling from an Image Sequence

Liu et al. developed a face modeling system that allows an untrained user with a personal computer and an ordinary video camera to create and instantly animate his or her own face model.

After the matching is done, they use both the corner points from the image matching and the five feature points clicked by the user to estimate the camera motion.


7

Face Modeling from an Image Sequence

Shan et al. proposed an algorithm, called model-based bundle adjustment, that combines motion estimation and model fitting into a single formulation. Their main idea was to use the model space directly as the search space.
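One schematic way to write the contrast (my notation, not the slides'): classical bundle adjustment searches over camera poses and unconstrained 3D points, while the model-based version replaces the free points with points generated by the face model's parameters. Here x_ij is the observed projection of point i in frame j and pi is the camera projection.

```latex
% Classical bundle adjustment: poses (R_j, t_j) and free 3D points M_i.
\min_{\{R_j,t_j\},\,\{M_i\}} \sum_{i,j}
    \bigl\| x_{ij} - \pi(R_j M_i + t_j) \bigr\|^2

% Model-based bundle adjustment: the points are generated by the face
% model, M_i = S_i(\alpha), so the search space is the model space itself.
\min_{\{R_j,t_j\},\,\alpha} \sum_{i,j}
    \bigl\| x_{ij} - \pi\bigl(R_j S_i(\alpha) + t_j\bigr) \bigr\|^2
```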


8

Face Modeling from an Image Sequence

On the top are the front views, and on the bottom are the side views.

On each row, the one in the middle is the ground truth, on the left is the result from the traditional bundle adjustment, and on the right is the result from the model-based bundle adjustment.

The result of the model-based bundle adjustment is much closer to the ground truth mesh.


9

Face Modeling from 2 Orthogonal Views

A number of researchers have proposed creating face models from two orthogonal views: one frontal view and one side view.

The frontal view provides the information along the horizontal and vertical axes, and the side view provides depth information.

The user needs to manually mark a number of feature points on both images. The feature points are typically placed around the facial features, including the eyebrows, eyes, nose, and mouth. The more feature points, the better the model, but one needs to balance the amount of manual work required from the user against the quality of the model.
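A minimal sketch of how the two views complement each other, assuming the views are already scale-aligned and the same landmarks are marked in the same order in both images (all names here are hypothetical):

```python
import numpy as np

def landmarks_3d(frontal_xy, side_zy, scale=1.0):
    """Combine marked landmarks from two orthogonal views into 3D points.

    frontal_xy: (N, 2) x, y positions marked on the frontal view
    side_zy:    (N, 2) z (depth), y positions marked on the side view
    """
    frontal_xy = np.asarray(frontal_xy, dtype=float)
    side_zy = np.asarray(side_zy, dtype=float)
    x = frontal_xy[:, 0]
    # Average the y coordinate, which both views observe.
    y = 0.5 * (frontal_xy[:, 1] + side_zy[:, 1])
    z = side_zy[:, 0] * scale           # depth comes only from the side view
    return np.stack([x, y, z], axis=1)  # (N, 3)
```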


10

Face Modeling from a Single Image

Liu developed a fully automatic system to construct 3D face models from a single frontal image.

The system first uses a face detection algorithm to find the face and then a feature alignment algorithm to locate the facial features. Assuming an orthogonal projection, it fits a 3D face model using a linear space of face geometries. Given that face detection and feature alignment systems already exist, this system is simple to implement.

The main drawback of this system is that the depth of the reconstructed model is in general not accurate. For small head rotations, however, the model is recognizable.
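A minimal sketch of the linear-space fitting step under an orthographic-style projection, assuming a mean geometry and a deformation basis are available (e.g., from scanned faces). The names and the plain least-squares solve are my illustration, not the paper's exact formulation:

```python
import numpy as np

def fit_linear_face(landmarks_2d, mean_shape, basis, landmark_idx):
    """Fit coefficients alpha so the projected model matches 2D landmarks.

    landmarks_2d: (L, 2) detected feature positions
    mean_shape:   (V, 3) mean face geometry
    basis:        (K, V, 3) linear deformation basis
    landmark_idx: indices of the model vertices for each landmark
    """
    # Orthographic-style projection keeps only x and y.
    mean_xy = mean_shape[landmark_idx, :2].ravel()            # (2L,)
    A = basis[:, landmark_idx, :2].reshape(len(basis), -1).T  # (2L, K)
    b = landmarks_2d.ravel() - mean_xy
    alpha, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mean_shape + np.tensordot(alpha, basis, axes=1)    # (V, 3)
```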


11

Example of model generation

Figure (top) shows an example where the left is the input image and the right is the feature alignment result. Figure (middle) shows the different views of the reconstructed 3D model. Figure (bottom) shows the results of making expressions for the reconstructed face model.


12

Outline

Face Synthesis
Face Modeling
Facial Expression Synthesis
Conclusions


13

Facial Expression Synthesis

Physically Based Facial Expression Synthesis

One of the early physically based approaches is the work by Badler and Platt, who used a mass-and-spring model to simulate the skin. They introduced a set of muscles; each muscle is attached to a number of vertices of the skin mesh. When a muscle contracts, it generates forces on the skin vertices, thereby deforming the skin mesh. A user generates facial expressions by controlling the muscle actions.

Waters introduced two types of muscles: linear and sphincter. The lips and eye regions are better modeled by sphincter muscles. To gain finer control, an influence zone is defined for each muscle, so that a muscle's influence diminishes as vertices get farther from the muscle attachment point.
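A much-simplified sketch in the spirit of a linear muscle with an influence zone (the linear falloff and all parameters are my simplification, not Waters' exact model): each vertex inside the zone is pulled along the muscle's direction, with the pull fading with distance from the attachment point.

```python
import numpy as np

def apply_linear_muscle(vertices, attachment, direction, contraction,
                        influence_radius):
    """Pull mesh vertices along a muscle's direction, with radial falloff.

    vertices:    (V, 3) skin mesh vertices
    attachment:  (3,) point where the muscle attaches to the skin
    direction:   (3,) unit vector of muscle contraction
    contraction: scalar muscle activation in [0, 1]
    """
    out = vertices.copy()
    d = np.linalg.norm(vertices - attachment, axis=1)
    inside = d < influence_radius
    # Influence diminishes linearly toward the zone boundary.
    falloff = 1.0 - d[inside] / influence_radius
    out[inside] += contraction * falloff[:, None] * direction
    return out
```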

Morph-Based Facial Expression Synthesis

Given a set of 2D or 3D expressions, one can blend these expressions to generate new ones. This technique, called morphing or interpolation, was first reported in Parke's pioneering work. Beier and Neely developed a feature-based image morphing technique to blend 2D images of facial expressions. Bregler et al. applied the morphing technique to mouth regions to generate lip-synch animations.
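The blending itself can be as simple as a convex combination of example expressions; a minimal sketch for the 3D case (geometry only, with hypothetical example meshes in vertex correspondence):

```python
import numpy as np

def morph(expressions, weights):
    """Blend example expression meshes with convex weights.

    expressions: (E, V, 3) array of E example meshes in correspondence
    weights:     (E,) nonnegative weights
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # keep the combination convex
    return np.tensordot(weights, expressions, axes=1)  # (V, 3)
```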


14

Facial Expression Synthesis

Expression Mapping

Expression mapping (also called performance-driven animation) has been a popular technique for generating realistic facial expressions, and it applies to both 2D and 3D cases. Given an image of a person's neutral face and another image of the same person's face with an expression, the positions of the face features (e.g., eyes, eyebrows, mouth) in both images are located either manually or through some automated method. The feature-position differences are then mapped onto a target face's features, and the target image is warped accordingly to produce the expression.
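A minimal 2D sketch of the geometric part of expression mapping, assuming the feature points are already in correspondence and leaving the image warp to any standard feature-based warping routine (names hypothetical):

```python
import numpy as np

def map_expression(src_neutral, src_expr, dst_neutral):
    """Transfer feature motion from a source face to a target face.

    Each argument is an (N, 2) array of corresponding feature points.
    Returns the target's feature positions under the mapped expression.
    """
    motion = src_expr - src_neutral  # feature displacement on the source
    # The target image is then warped so its features move to these positions.
    return dst_neutral + motion
```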

Noh and Neumann developed a technique to automatically find a correspondence between two face meshes based on a small number of user-specified correspondences. They also developed a new motion mapping technique. Instead of directly mapping the vertex difference, this technique adjusts both the direction and magnitude of the motion vector based on the local geometries of the source and target model.

Liu et al. proposed a technique to map one person’s facial expression details to a different person. Facial expression details are subtle changes in illumination and appearance due to skin deformations. The expression details are important visual cues, but they are difficult to model.


15

Facial Expression Synthesis

Expression ratio image (ERI). Left: neutral face. Middle: expression face. Right: expression ratio image. The ratios of the RGB components are converted to colors for display purposes.

Mapping a smile to the Mona Lisa's face. Left: neutral face. Middle: result from geometric warping. Right: result from ERI.
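The idea behind the ratio image, sketched in code: a bare-bones version assuming the three images are already pixel-aligned by the geometric warp (the epsilon guard and clipping are my additions):

```python
import numpy as np

def apply_eri(src_neutral, src_expr, dst_neutral, eps=1e-3):
    """Map expression details via an expression ratio image (ERI).

    All inputs are float RGB images in [0, 1], already aligned so that
    corresponding pixels cover the same facial location.
    """
    # The per-pixel ratio captures illumination changes due to skin deformation.
    ratio = src_expr / np.maximum(src_neutral, eps)
    # Multiplying the ratio onto the target adds the wrinkle/crease shading.
    return np.clip(dst_neutral * ratio, 0.0, 1.0)
```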


16

Geometry-Driven Expression Synthesis

Mapping expressions to statues. A. Left: original statue. Right: result from ERI. B. Left: another statue. Right: result from ERI (Z. Zhang, MSR).


17

Geometry-Driven Expression Synthesis

To increase the space of all possible expressions, the face is subdivided into a number of subregions. For each subregion, the geometry associated with that subregion is used to compute the subregion texture image. The final expression is then obtained by blending these subregion images together. The figure gives an overview of the system.

Geometry-driven expression synthesis system.
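A minimal sketch of the final blending step, assuming each subregion image has already been synthesized and comes with a soft (feathered) mask. Normalizing the masks to sum to one at every pixel avoids seams; this normalization is my choice, not necessarily the paper's exact scheme:

```python
import numpy as np

def blend_subregions(images, masks, eps=1e-6):
    """Blend per-subregion texture images into one expression image.

    images: (S, H, W, 3) synthesized subregion images
    masks:  (S, H, W) soft masks, nonzero inside each subregion
    """
    images = np.asarray(images, dtype=float)
    masks = np.asarray(masks, dtype=float)
    weights = masks / (masks.sum(axis=0) + eps)       # normalize per pixel
    return (weights[..., None] * images).sum(axis=0)  # (H, W, 3)
```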


18

Geometry-Driven Expression Synthesis

The function MotionPropagationFeaturePointSet propagates the motion of a subset of the feature points to the rest of the feature point set; its pseudocode definition is shown on the slide.


19

Geometry-Driven Expression Synthesis

a. Feature points. b. Face region subdivision


20

Geometry-Driven Expression Synthesis

Example images of the male subject.


21

Geometry-Driven Expression Synthesis

In addition to expression mapping, Zhang et al. applied their techniques to expression editing.

They developed an interactive expression editing system that allows a user to drag a face feature point while the system interactively displays the resulting image with expression details.

The figure shows some of the expressions generated by the expression editing system.

Expressions generated by the expression editing system.


22

3D synthesis

Pipeline (flow diagram on the slide): placement of control points on the target pictures and the generic face mesh → Sibson coordinates computation → deformation with control points.


23

System Description

Placement of control points: very important for the final synthesis result; only the vertices inside are deformable.

1. How many points?
2. Where to put those points?

• Important features must be controlled in a specific way (such as the corners of the eyes, the nose, the mouth, etc.)
• Include as many facial vertices as possible
• Acceptable for the points to be marked manually
• Keep the computational expense in mind


24

System Description

18 control points: the fewest number needed to indicate the important features, such as the eyes, eyebrows, nose, mouth, etc.


25

System Description

28 morphing zones. Advantages: increased computational efficiency; local morphing effects, preventing incorrect deformation effects from spreading from other areas.


26

System Description

What do we know? The original 3D positions of the control points in the generic face mesh.
What do we get? The displacements of these control points.


27

System Description

Sibson Coordinates Computation:

DFFD displacement relation: the displacement of a mesh vertex $p$ is interpolated from the displacements of the control points that are its Sibson (natural) neighbors,

$$\Delta p = \sum_i u_i \, \Delta P_i,$$

where $u_i$ are the Sibson coordinates of $p$ with respect to the control points $P_i$.

To strengthen or alleviate the displacement effects of the different Sibson neighbors, different weights $w_i$ are applied to the Sibson coordinates. The final weighted DFFD relation is

$$\Delta p = \frac{\sum_i w_i u_i \, \Delta P_i}{\sum_i w_i u_i}.$$


28

System Description

2D Sibson coordinates computation:

$$u_i = \frac{\operatorname{Area}\bigl(V'(p) \cap V(P_i)\bigr)}{\operatorname{Area}\bigl(V'(p)\bigr)}$$

3D Sibson coordinates computation:

$$u_i = \frac{\operatorname{Vol}\bigl(V'(p) \cap V(P_i)\bigr)}{\operatorname{Vol}\bigl(V'(p)\bigr)}$$

Here $V(P_i)$ is the Voronoi cell of control point $P_i$, and $V'(p)$ is the Voronoi cell that the query point $p$ would occupy if it were inserted into the diagram.
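Given precomputed Sibson coordinates, applying the weighted DFFD relation takes only a few lines; a minimal sketch (computing the Sibson coordinates themselves requires a Voronoi/natural-neighbor library and is omitted):

```python
import numpy as np

def weighted_dffd_displacement(u, w, control_displacements):
    """Displace one mesh vertex from its Sibson neighbors.

    u: (K,) Sibson coordinates of the vertex w.r.t. its K neighbors
    w: (K,) per-neighbor weights strengthening/alleviating each influence
    control_displacements: (K, 3) displacements of the control points
    """
    wu = w * u
    return (wu[:, None] * control_displacements).sum(axis=0) / wu.sum()
```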


29

System Description

Deformation with control points (illustrated on the slide).


30

Synthesis Results

Implemented in Matlab.
Input images, generic face mesh, and texture all from the FaceGen platform.
Generic face mesh: 7,177 vertices; 6,179 facets.
Input image size: 400x477 pixels, 96 dpi.
Experiments on an Intel Pentium 4 CPU (2.0 GHz) with 256 MB of RAM: 180 s.


31

Synthesis Results


32

Conclusions for Face Synthesis

One problem is how to generate face models with fine geometric details. Many 3D face modeling techniques use some type of model space to constrain the search, thereby improving robustness. The resulting face models in general lack geometric details such as creases and wrinkles. Geometric details are important visual cues for human perception: with them, models look more realistic, and personalized face models become more recognizable to human users. Geometric details can potentially improve computer face recognition performance as well.

Another problem is how to handle non-Lambertian reflections. The reflection of human facial skin is approximately specular when the angle between the view direction and the lighting direction is close to 90°. Therefore, given any face image, it is likely that there are some points on the face whose reflection is not Lambertian. It is desirable to identify the non-Lambertian reflections and treat them with different techniques during relighting.

How to handle facial expressions in face modeling and face relighting is another interesting problem. Can we reconstruct 3D face models from expression images? One would need a way to identify and undo the skin deformations caused by the expression. To apply face relighting techniques to expression images, we would need to know the 3D geometry of the expression face to generate correct illumination for the areas with strong deformations.


33

Future Work

Conclusion: a method for 3D facial model synthesis is proposed. With two orthogonal views of an individual's face as input, 18 feature points are defined on the two images. Then a Voronoi-based interpolation technique, DFFD, is used to deform a generic face mesh to fit the input face images. With the synthesized facial models, the same animation technique can be used to generate individual facial animation.

Future work:
1. Increase the number of control points and automate feature point extraction (image segmentation, edge detection techniques, etc.).
2. Analyze real facial expression video data and construct a common facial expression database to drive the animation.


34

References

S. Z. Li and A. K. Jain. Handbook of Face Recognition. 2005.
Y.-L. Tian, T. Kanade, and J. Cohn. Evaluation of Gabor-wavelet-based facial action unit recognition in image sequences of increasing complexity. In Proc. of IEEE Int. Conf. on Automatic Face and Gesture Recognition, 2002.
Z. Wen and T. Huang. Capturing subtle facial motions in 3D face tracking. In Proc. of Int. Conf. on Computer Vision, 2003.
J. Xiao, T. Kanade, and J. Cohn. Robust full-motion recovery of head by dynamic templates and re-registration techniques. In Proc. of Int. Conf. on Automatic Face and Gesture Recognition, 2002.
Z. Liu. A fully automatic system to model faces from a single image. Microsoft Research technical report, 2003.
Z. Liu, Y. Shan, and Z. Zhang. Expressive expression mapping with ratio images. In SIGGRAPH 2001.
Q. Zhang, Z. Liu, B. Guo, and H. Shum. Geometry-driven photorealistic facial expression synthesis. In SCA 2003.