Computer Animation (INFOMCANIM)

Computer Animation Lecture 4 Facial Animation

Transcript of Computer Animation (INFOMCANIM)

Page 1: Computer Animation (INFOMCANIM)

Computer Animation – Lecture 4

Facial Animation

Page 2: Computer Animation (INFOMCANIM)

Facial animation

• It is a difficult task for computer animators

• Faces are too familiar to us

• They are unique

• They have a complex structure

Page 3: Computer Animation (INFOMCANIM)

Application areas

• Cartoon animation
  • Requires exaggerated expressions

• Realistic character animation
  • Must adhere to the constraints of realistic human anatomy

• Telecommunications and HCI
  • Must be computationally efficient

Page 4: Computer Animation (INFOMCANIM)

History

• 1972 – Frederic Parke: Pioneering work on facial animation

• 1977 – Ekman & Friesen: Facial Action Coding System

• 1979 – Parke: Parametric models

• 1981 – Platt & Badler: Physically-based muscle-controlled face model

• 1985 – Tony de Peltrie: First animated film with facial expressions

• 1987 – Magnenat-Thalmann: Rendez-vous à Montréal

• 1987 – Lewis & Parke: Automatic speech synchronization

• 1987 – Waters: Muscle model

• 1987 – Magnenat-Thalmann: Abstract Muscle Action procedures

• 1988 – Hill: Automatic speech synchronization

• 1991 – Kalra: SMILE multilevel animation system (pseudo muscles)

• 1996 – Pelachaud: Speech co-articulation

• 1998 – MPEG-4: Facial model and animation coding standard

• 1999 – Blanz & Vetter: Principal components for faces

• 1999 – Voice Puppetry: Animate any face using voice

• 2000s – Facial animation used in many films: Star Wars, The Matrix, Shrek, The Lord of the Rings

• 2006 – SoftImage FaceRobot: First commercial software for facial animation

• ..

• 2021 – Epic Games MetaHumans: First realistic facial animation as part of a game engine (Unreal)

(Images: Tony de Peltrie, Rendez-vous à Montréal, SoftImage FaceRobot, MetaHumans)

Page 5: Computer Animation (INFOMCANIM)

History

(Overview diagram of facial animation approaches:)

• Parameterization
  • FACS – Ekman 1978: describes expressions in terms of muscle deformation
  • MPA – Kalra 1993: visual result of muscle contraction
  • MPEG-4 – 1999: feature point control

• Deformation
  • Interpolation – Parke 1972, 1974
  • Based on muscles – Platt et al. 1981, Waters et al. 1987, Terzopoulos et al. 1990
  • Pseudo muscles – FFD ~1986, Spline ~1988, Feature points 2000, Expression cloning 2001
  • Image-based techniques ~1992 – Pighin 1998, Cosatto & Graf 2000

Page 6: Computer Animation (INFOMCANIM)

Meet Mike: Siggraph 2017

https://www.fxguide.com/featured/real-time-mike/

Page 7: Computer Animation (INFOMCANIM)

Virtual faces in games: Current state

Hellblade: Senua’s Sacrifice, Uncharted 4

Pushing the boundaries of the uncanny valley

Page 8: Computer Animation (INFOMCANIM)

Virtual faces in games: Current state

https://www.fxguide.com/quicktakes/epic-games-announces-metahuman/
https://www.youtube.com/watch?v=fXWjaNHYUl4 (Dynamixyz + MetaHumans)

Page 9: Computer Animation (INFOMCANIM)

Autonomous Facial Animation

Autonomous facial animation based on the socio-emotional states

BabyX: https://www.youtube.com/watch?v=yzFW4-dvFDA
SoulMachines: https://www.soulmachines.com/

MACH: My automated coach

Page 10: Computer Animation (INFOMCANIM)

Human head

• Skull, Facial muscles, Skin, Eyes, Teeth, Tongue

• 7 bones in the skull and 15 small bones in the nasal and oral cavity

• Facial muscles:
  • Muscles of facial expression
  • Muscles of the jaw
  • Tension/relaxation of facial skin

• Muscles connect:
  • Two bones, bone and skin/muscle, or two different skin/muscle regions

Page 11: Computer Animation (INFOMCANIM)

Types of facial muscles

• Sphincters: contract radially towards a center point, e.g. orbicularis oris, orbicularis oculi

• Linear (parallel) muscles: contract longitudinally towards their origin, e.g. levator labii sup., zygomaticus minor/major

• Sheet muscles: composed of several linear muscles side-by-side, e.g. frontalis

Page 12: Computer Animation (INFOMCANIM)

Mechanical properties of skin

• Skin composed of various layers with different elastic and viscous characteristics

• Skin has a highly non-linear stress-strain curve:
  • Low stress: low resistance against deformation
  • High stress: sharp increase in resistance

Page 13: Computer Animation (INFOMCANIM)

Do we need to know all these?

It depends on the level of detail you want to achieve; at minimum, we need to know the shape, structure and position of the facial components and their interactions

Page 14: Computer Animation (INFOMCANIM)

Methods

• Generating the face

• Deforming the face

• Animating the face

• Rendering the face

Page 15: Computer Animation (INFOMCANIM)

Facial Animation Techniques

• Performance-driven
  • Transfer the performance of a human actor to a synthetic face model

• Synthetic motion
  • From text, audio, or defined by an artist

• Complete script vs. interactive animation

Page 16: Computer Animation (INFOMCANIM)

Two levels of facial animation

• Dynamics of motion (temporal domain)
  • Feature point coordinates
  • Muscle contractions
  • Action units

• Surface deformation (spatial domain)
  • Displacement of vertices of a high-resolution mesh
  • Generating wrinkles

Page 17: Computer Animation (INFOMCANIM)

1) Dynamics of motion

• Linear interpolation, blending, morphing

• Segmental interpolation: different interpolation values for different regions (eyes, mouth)

Page 18: Computer Animation (INFOMCANIM)

2) Surface deformation

• Shape interpolation
  • Interpolation of the entire face

• Parametric models
  • Perform facial animation using a set of controllers that manipulate (local) regions/features

• Physics-based animation
  • Represent and manipulate expressions based on physical characteristics of skin tissue and muscles

• Learning-based animation
  • Record key frames from data and morph for smooth transitions between key frames

Page 19: Computer Animation (INFOMCANIM)

Shape Interpolation Methods

• One of the most popular methods in practice is to use shape interpolation

• Several different key expressions are sculpted ahead of time

• The key expressions can then be blended on the fly to generate a final expression

• One can interpolate the entire face (happy to sad) or more localized zones (left eyelid, brow, nostril flare…)

Page 20: Computer Animation (INFOMCANIM)

Shape Interpolation

• Shape interpolation allows blending between several pre-sculpted expressions to generate a final expression

• It is a very popular technique, as it ultimately can give total control over every vertex if necessary

• However, it tends to require a lot of set up time

• It goes by many names:
  • Morphing
  • Morph Targets
  • Multi-Target Blending
  • Vertex Blending
  • Geometry Interpolation
  • etc.

Page 21: Computer Animation (INFOMCANIM)

Shape Interpolation Algorithm

• To compute a blended vertex position:

• The blended position is the base position plus a contribution from each target whose DOF value is greater than 0

• To blend the normals, we use a similar equation:

$\mathbf{v} = \mathbf{v}_{base} + \sum_i \Phi_i\,(\mathbf{v}_i - \mathbf{v}_{base})$

$\mathbf{n} = \mathbf{n}_{base} + \sum_i \Phi_i\,(\mathbf{n}_i - \mathbf{n}_{base})$

(where $\Phi_i$ is the DOF value of target $i$)
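A minimal numpy sketch of this blending rule (variable names are illustrative, not from the lecture): each target with a non-zero DOF value adds a weighted offset from the base shape, and the blended normals are renormalized afterwards.

    import numpy as np

    def blend_shapes(base_verts, base_normals, targets, weights):
        """targets: list of (target_verts, target_normals); weights: per-target DOF values."""
        v = base_verts.copy()
        n = base_normals.copy()
        for (t_v, t_n), w in zip(targets, weights):
            if w > 0.0:                       # only targets with DOF > 0 contribute
                v += w * (t_v - base_verts)
                n += w * (t_n - base_normals)
        n /= np.linalg.norm(n, axis=1, keepdims=True)   # renormalize blended normals
        return v, n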

Page 22: Computer Animation (INFOMCANIM)

Shape Interpolation and Skinning

• Usually, the shape interpolation is done in the skin’s local space

• In other words, it’s done before the actual smooth skinning computations are done

• We use a simple layered approach:
  • Skeleton Kinematics
  • Shape Interpolation
  • Smooth Skinning

Page 23: Computer Animation (INFOMCANIM)

Skeleton, Morph, & Skin Data Flow

• Skeleton kinematics: each joint's local matrix is a function of its DOFs, and world matrices are accumulated down the hierarchy

$L_{jnt} = L(\phi_1, \phi_2, \ldots)$
$W_{jnt} = L_{jnt} \cdot W_{parent}$

• Shape interpolation (morph DOFs $\Phi_1 \ldots \Phi_M$):

$\mathbf{v}' = \mathbf{v}_{base} + \sum_i \Phi_i\,(\mathbf{v}_i - \mathbf{v}_{base})$
$\mathbf{n}' = \mathbf{n}_{base} + \sum_i \Phi_i\,(\mathbf{n}_i - \mathbf{n}_{base})$

• Smooth skinning (skeleton DOFs $\Phi_{M+1} \ldots \Phi_{M+N}$, skin weights $w_i$):

$\mathbf{v}'' = \sum_i w_i\,\mathbf{v}' \cdot B_i^{-1} \cdot W_i$
$\mathbf{n}'' = \sum_i w_i\,\mathbf{n}' \cdot B_i^{-1} \cdot W_i$

• Output: the final blended and skinned $\mathbf{v}'', \mathbf{n}''$ per vertex
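A rough Python sketch of this evaluation order, assuming the per-joint matrices B_i^-1 · W_i and the skinning weights are already available (all names are illustrative):

    import numpy as np

    def morph_then_skin(base_verts, targets, morph_weights, skin_weights, joint_mats):
        # 1) shape interpolation in the skin's local space
        v = base_verts.copy()
        for t, w in zip(targets, morph_weights):
            v += w * (t - base_verts)
        # 2) smooth skinning: v'' = sum_i w_i (v' . B_i^-1 . W_i), row-vector convention
        v_h = np.hstack([v, np.ones((len(v), 1))])       # homogeneous coordinates
        out = np.zeros_like(v_h)
        for i, M in enumerate(joint_mats):               # M = B_i^-1 @ W_i (4x4)
            out += skin_weights[:, i:i + 1] * (v_h @ M)
        return out[:, :3]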

Page 24: Computer Animation (INFOMCANIM)

Target Storage

• Morph targets can take up a lot of memory. This is a big issue for video games, but less of a problem in movies.

• The base model is typically stored in whatever fashion a 3D model would be stored internally (verts, normals, triangles, texture …)

• The targets, however, don’t need all of that information, as much of it will remain constant (triangles, texture …)

• Also, most target expressions will only modify a small percentage of the verts

• Therefore, the targets really only need to store the positions and normals of the vertices that have moved away from the base position (and the indices of those verts)
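One possible sparse layout for such a target, storing only the verts that moved (a sketch; the field names are illustrative):

    from dataclasses import dataclass
    import numpy as np

    @dataclass
    class MorphTarget:
        indices: np.ndarray    # (k,)   indices of the verts that moved
        positions: np.ndarray  # (k, 3) target positions of those verts
        normals: np.ndarray    # (k, 3) target normals of those verts

    def add_weighted_target(verts, normals, base_verts, base_normals, tgt, w):
        """Accumulate one sparse target's weighted delta into verts/normals (in place)."""
        i = tgt.indices
        verts[i] += w * (tgt.positions - base_verts[i])
        normals[i] += w * (tgt.normals - base_normals[i])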

Page 25: Computer Animation (INFOMCANIM)

Colors and Other Properties

• In addition to interpolating the positions and normals, one can interpolate other per-vertex data:
  • Colors
  • Alpha
  • Texture coordinates
  • Auxiliary shader properties

Page 26: Computer Animation (INFOMCANIM)

Vascular Expression

• Vascular expression is a fancy term to describe blushing and other phenomena relating to the color change in the face

• Adding subtle changes in facial color can help improve realism

• This can be achieved by blending a color value at every vertex (along with the position and normal)

• Alternatively, one could use a blush texture map controlled by a blended intensity value at each vertex

Page 27: Computer Animation (INFOMCANIM)

Wrinkles

• Every vertex stores an auxiliary property indicating how wrinkled that area is
  • On the base model, this property would probably be 0 in most of the verts, indicating an unwrinkled state
  • Target expressions can have this property set at or near 1 in wrinkled areas

• When facial expressions are blended, this property is blended per vertex just like the positions and normals

• For rendering, this value is used as a scale factor on a wrinkle texture map that is blended with the main face texture

• Even better, one could use a wrinkle bump map or displacement map
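A toy sketch of that last blend (in practice this would live in a fragment shader; names are illustrative): the blended per-vertex wrinkle value scales how much of the wrinkle texture is mixed into the main face texture.

    def shade(base_color, wrinkle_color, wrinkle_amount):
        # wrinkle_amount is the per-vertex property blended like positions and normals (0..1)
        return tuple((1.0 - wrinkle_amount) * b + wrinkle_amount * w
                     for b, w in zip(base_color, wrinkle_color))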

Page 28: Computer Animation (INFOMCANIM)

Why did we need a parameterization system?

• What do we have?
  • Complex mesh topology
  • Different mesh for each character

• What do we want?
  • Quick and easy facial animation design
  • Animation working on any character
  • Running in real-time applications
  • Reduced data for animation (for storage and network applications)

• One solution
  • Create a topology-independent parameterization
  • Standardization

Page 29: Computer Animation (INFOMCANIM)

Parameterization standards

• 3 main parameterization systems
  • 1978: Facial Action Coding System (FACS)
  • 1993: Minimal Perceptible Actions (MPA)
  • 1999: MPEG-4 parameterization for facial animation

Page 30: Computer Animation (INFOMCANIM)

Facial Action Coding System (FACS)

• A system to taxonomize human facial movements

• Originally developed by Swedish anatomist Carl-Herman Hjortsjö (1969)

• Adopted by Paul Ekman in 1978
  • Ekman was named one of the 100 most influential people in the world by TIME Magazine

• A major update was done by Ekman and Friesen in 2002

http://www.paulekman.com

Page 31: Computer Animation (INFOMCANIM)

Facial Action Coding System (FACS)

• Action Units (AUs)

Emotion Action Units

Happiness 6+12

Sadness 1+4+15

Surprise 1+2+5B+26

Fear 1+2+4+5+7+20+26

Anger 4+5+7+23

Disgust 9+15+16

Contempt R12A+R14A

Page 32: Computer Animation (INFOMCANIM)

MPEG-4 Facial Animation Parameters

• Part of the MPEG-4 Face and Body Animation (FBA) International standard (ISO 14496) since 1999

• Developed by the Moving Picture Experts Group (MPEG) to virtually represent humans and humanoids, for low-bitrate compression and transmission.

Page 33: Computer Animation (INFOMCANIM)

MPEG-4 Facial Animation Parameters (FAPs)

• FDPs (Facial Definition Parameters): the face is defined by 84 feature points
  • Used for constructing the 3D face geometry

• FAPs (Facial Animation Parameters): each FDP is moved by FAPs
  • There are 68 FAPs

• FAPUs (Facial Animation Parameter Units): distances between key facial features on a specific face
  • Used to scale FAPs to any face
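A hedged sketch of how FAPUs make a FAP stream face-independent: FAPUs are measured on the target face's neutral pose, and each received FAP value (expressed in FAPU units) is multiplied by the corresponding FAPU to get a displacement in that model's units. The feature-point names below are illustrative; the /1024 normalization follows the usual MPEG-4 convention for distance-based FAPUs, but this is not a complete implementation of the standard.

    def compute_fapus(fp):
        """fp: dict of 3D feature-point positions measured on the neutral face."""
        def dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(fp[a], fp[b])) ** 0.5
        return {
            "ES":  dist("left_eye_centre", "right_eye_centre") / 1024.0,      # eye separation
            "MNS": dist("nose_bottom", "mouth_centre") / 1024.0,              # mouth-nose separation
            "MW":  dist("mouth_left_corner", "mouth_right_corner") / 1024.0,  # mouth width
        }

    def fap_to_displacement(fap_value, fapu_name, fapus):
        """Convert a FAP value received in FAPU units into model units for this face."""
        return fap_value * fapus[fapu_name]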

Page 34: Computer Animation (INFOMCANIM)

Robot Head animated with FAP animation – MPEG-4 to FACS conversion

http://www.zerrinyumak.com/?page_id=17

Page 35: Computer Animation (INFOMCANIM)

Parametric models: direct parameterization

• Functional mapping consists of interpolations, affine transformations, translations and generative procedures applied to a subset of the surface control points

(Diagram: Parameters → Functional mapping → Facial attributes)

Page 36: Computer Animation (INFOMCANIM)

Parametric models: high-level parameterization

(Diagram: Emotion parameters and Viseme parameters → Low-level control → Facial attributes)

Page 37: Computer Animation (INFOMCANIM)

Speech Animation

• Animation and sound are built automatically using phonemes and visemes
  • Text → phoneme → viseme → animation parameters

• Animation is based on the building blocks of speech called phonemes
  • Text-to-speech (TTS) systems convert text into phonemes

• There exists a viseme for each phoneme (or for a set of phonemes); visemes are the visual counterparts of phonemes

(Example visemes: ii, ou, aa)
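A toy sketch of this pipeline; the phoneme and viseme tables are hypothetical stand-ins for a real TTS front end and viseme set.

    # map phonemes (from a TTS system) to visemes, and visemes to animation parameters
    PHONEME_TO_VISEME = {"AA": "aa", "IY": "ii", "UW": "ou", "M": "closed", "B": "closed"}
    VISEME_PARAMS = {
        "aa":     {"jaw_open": 0.8, "lip_round": 0.1},
        "ii":     {"jaw_open": 0.2, "lip_round": 0.0},
        "ou":     {"jaw_open": 0.4, "lip_round": 0.9},
        "closed": {"jaw_open": 0.0, "lip_round": 0.3},
    }

    def phonemes_to_keyframes(phonemes):
        """phonemes: list of (label, start_time) pairs produced by a TTS system."""
        return [(t, VISEME_PARAMS[PHONEME_TO_VISEME[p]]) for p, t in phonemes]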

Page 38: Computer Animation (INFOMCANIM)

Co-articulation

• Phoneme – diphone – triphone

• In actual speech production, boundaries between these discrete units are blurred (co-articulation)

• Important in animation so that smooth transitions can be generated

• Look at the previous, present and next phoneme to determine the mouth position
  • Co-articulation may affect up to five positions before and after the current phoneme

Page 39: Computer Animation (INFOMCANIM)

Co-articulation models

• Rule-based: Pelachaud [91]
  • Clusters and ranks phoneme lip shapes based on how deformable they are
  • Deformability refers to the extent to which the lip shape for a phoneme cluster can be modified by surrounding phonemes
  • Ranges from least deformable (e.g. f, v) to most deformable (e.g. s and m)

• Also depends on speech rate
  • A person talking slowly moves her lips more than a person speaking rapidly

• Relaxation-contraction time
  • Whether each action has time to contract after the previous phoneme and relax before the next phoneme

Page 40: Computer Animation (INFOMCANIM)

Co-articulation models

• Dominance and Blending - Cohen and Massaro [93]

• Each phoneme segment has an associated target set of facial control parameter values and a dominance function

Page 41: Computer Animation (INFOMCANIM)

Co-articulation models

• Actual parameter value at a frame is determined by blending the dominance functions for that parameter using a weighted average
  • N is the number of phoneme segments in the utterance
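A sketch of this weighted average in the spirit of Cohen & Massaro: the parameter value at time t is F(t) = Σ_i D_i(t) T_i / Σ_i D_i(t), where T_i is the target value of phoneme segment i and D_i its dominance function. The exponential falloff below is only indicative of the general shape, not the paper's exact dominance model.

    import math

    def dominance(t, centre, alpha=1.0, theta=4.0):
        """Dominance of a phoneme segment centred at `centre`, evaluated at time t."""
        return alpha * math.exp(-theta * abs(t - centre))

    def parameter_at(t, segments):
        """segments: list of (target_value, segment_centre_time), one per phoneme."""
        num = sum(dominance(t, c) * target for target, c in segments)
        den = sum(dominance(t, c) for _, c in segments)
        return num / den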

Page 42: Computer Animation (INFOMCANIM)

Learning co-articulation and emotion from data

• Derive a generative model of expressive facial motion that incorporates emotion control, while maintaining accurate lip-synching.

• The emotional content of the input speech can be manually specified by the user or automatically extracted from the audio signal using a Support Vector Machine classifier.

Yong Cao, Wen C. Tien, Petros Faloutsos, and Frédéric Pighin. 2005. Expressive speech-driven facial animation. ACM Trans. Graph. 24, 4 (October 2005), 1283-1302.

Page 43: Computer Animation (INFOMCANIM)

Learning co-articulation and emotion from data

• P: phoneme label

• C: trajectories of the prosody features

• M: compressed anime motion

• E: Emotion label

Page 44: Computer Animation (INFOMCANIM)

JALI: An Animator-Centric Viseme Model for Expressive Lip-Synchronization

• http://www.dgp.toronto.edu/~elf/jali.html (Siggraph, 2016)

Page 45: Computer Animation (INFOMCANIM)

Expressive Speech

• Different animation layers can be used to create an expressive speech animation, such as emotions, head rotation and eye blinking/gaze

• Requires blending of different animations at different layers realistically

• Requires a high-level control mechanism to define the timing of each expression

(Animation layers: eye blinking and rotation, head rotation, emotions, phonemes, sound)

Page 46: Computer Animation (INFOMCANIM)

Blending Speech and Emotions

▪ Co-articulation

Viseme parameters

<viseme id="4" filename="sp_viseme_4.ex" spread="2.0" scale="0.8" weight="1.0" in="50" out="50"/>

▪ Generation of emotions

Stretched to the duration of speech

Intensity is set to the intensity of the max emotion in the emotional state vector received from the emotion engine

Attack-sustain-decay-release envelope
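A simplified sketch of such an envelope (attack and release only, so a reduced version of the attack-sustain-decay-release idea; the phase proportions are illustrative):

    def emotion_intensity(t, duration, peak, attack=0.15, release=0.2):
        """Envelope stretched to the speech duration, peaking at the strongest emotion."""
        a, r = attack * duration, release * duration
        if t < a:                        # attack: ramp up
            return peak * t / a
        if t > duration - r:             # release: ramp down
            return peak * (duration - t) / r
        return peak                      # sustain

    # peak would be max(emotional_state_vector) received from the emotion engine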

Page 47: Computer Animation (INFOMCANIM)

Facial Animation Control

• Creating facial animation from tagged text

<begin_gesture id="raise_eyebrows"/>I don't know what that means,<end_gesture id="raise_eyebrows"/> but <begin_gesture id="lower_eyebrows"/>you do.<end_gesture id="lower_eyebrows"/>

Page 48: Computer Animation (INFOMCANIM)

Facial Animation Languages

• Provides higher-level control of animation

• Integrated with an animation player

• Mostly XML based

• VHML: Virtual Human Markup Language

• AMPL: Affective Presentation Markup Language

• FML: Face Modeling Language

• AML: Avatar Mark-up Language

Page 49: Computer Animation (INFOMCANIM)

Behavior Mark-Up Language (BML)

Page 50: Computer Animation (INFOMCANIM)

Behavior Mark-up Language

• Six animation phases, delimited by seven sync points:
  • start, ready, stroke-start, stroke, stroke-end, relax, end

• Behaviors are synchronized by assigning a sync point of one behavior to a sync point of another behavior

Page 51: Computer Animation (INFOMCANIM)

Physics-based models

• Represent and manipulate expressions based on physical characteristics of skin tissue and muscles

• Skin is viscoelastic in its response to:
  • stress: the force or load
  • strain: the deformation or stretch

• Viscoelastic response to stress/strain:
  • Elastic properties:
    • Returns to rest shape when the load is removed
    • Model: spring
  • Viscous properties:
    • Energy is absorbed
    • Model: damper

Page 52: Computer Animation (INFOMCANIM)

Mass-Spring Networks

• Common technique for simulating the dynamics of skin

• Vertices = mass points, edges = springs

• Lagrangian equations of motion are integrated over time using numerical algorithms
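A minimal explicit-Euler step for such a network (a sketch only; practical systems use better integrators and add muscle forces, volume constraints, etc.):

    import numpy as np

    def mass_spring_step(x, v, springs, rest_lengths, k, mass, dt, damping=0.98):
        """x, v: (N, 3) positions and velocities; springs: list of (i, j) vertex pairs."""
        f = np.zeros_like(x)
        for (i, j), L0 in zip(springs, rest_lengths):
            d = x[j] - x[i]
            L = np.linalg.norm(d)
            fs = k * (L - L0) * d / max(L, 1e-9)   # Hooke's law along the spring
            f[i] += fs
            f[j] -= fs
        v = damping * (v + dt * f / mass)          # integrate velocities, then positions
        x = x + dt * v
        return x, v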

Page 53: Computer Animation (INFOMCANIM)

Vector muscles (Waters, 1987)

• Simulation of muscle effect with vector representation of skin forces

• Good realism

• Less computation than physical representation of skin layers
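A very simplified, illustrative take on the idea (not Waters' exact formulation): vertices inside a cone of influence around the muscle vector are pulled toward the muscle origin, with angular and radial falloff.

    import numpy as np

    def vector_muscle(verts, origin, insertion, contraction, cone_angle=0.7):
        axis = insertion - origin
        axis_len = np.linalg.norm(axis)
        axis_dir = axis / axis_len
        out = verts.copy()
        for idx, p in enumerate(verts):
            d = p - origin
            r = np.linalg.norm(d)
            if r < 1e-9 or r > axis_len:
                continue                                  # outside radial range
            cos_a = float(np.dot(d / r, axis_dir))
            if cos_a < np.cos(cone_angle):
                continue                                  # outside the cone of influence
            radial = np.cos(0.5 * np.pi * r / axis_len)   # fades out toward the muscle end
            out[idx] = p - contraction * cos_a * radial * (d / r)
        return out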

Page 54: Computer Animation (INFOMCANIM)

Multi-layer springs (Terzopoulos and Waters, 1995)

• Representation of the skin with a multi-layer mass-spring system

• Each layer corresponds to an anatomical layer

• Computation of realistic skin deformation according to muscle contractions

• Computationally expensive

Page 55: Computer Animation (INFOMCANIM)

Automatic determination of facial muscle activations from motion capture data (Sifakis et al., 2006)

• Anatomical face model controlled by
  • muscle activations
  • kinematic bone degrees of freedom

• Mapping motion capture parameters to the control parameters of a physical face model

• Physemes: motion units for phonemes containing muscle activation signals

Page 56: Computer Animation (INFOMCANIM)

Pseudo muscle deformation systems

• Pseudo muscles: related to the visual effect of muscle contraction rather than the muscle contraction itself

• Geometrical approaches
  • Free Form Deformation (Kalra, 1992)
  • Spline pseudo muscles (Forsey, 1990)
  • Radial basis functions (Noh and Neumann, 2000)

• Advantages
  • Easier control
  • Less computation required, more efficient for real-time applications

• Disadvantages
  • Mesh deformation is often less realistic

Page 57: Computer Animation (INFOMCANIM)

Free Form Deformation (Kalra, 1992)

• Combination of FFD and region interpolation to animate face

• For each group of muscles, an FFD is created

• Linear interpolation between regions

• No precise control of muscles and skin effects

Kalra, P., Mangili, A., Thalmann, N. M. and Thalmann, D. (1992), Simulation of Facial Muscle Actions Based on Rational Free Form Deformations. Computer Graphics Forum, 11: 59–69

Page 58: Computer Animation (INFOMCANIM)

Spline pseudo muscles (Forsey, 1990)

• Splines offer smooth and flexible deformations when compared to free form deformations.

• Muscle effects are controlled by patch splines.

• Since there is little data to control, it is computationally more efficient.

Page 59: Computer Animation (INFOMCANIM)

Radial Basis Functions (Noh and Neumann, 2000)

• Deformation controlled by a feature point on the model

• Independent of mesh topology

• Intuitive manipulation

• Each feature point animates a part of the model (neighboring vertices)

Jun-yong Noh, Douglas Fidaleo, and Ulrich Neumann. 2000. Animated deformations with radial basis functions. In Proceedings of the ACM symposium on Virtual reality software and technology (VRST '00). ACM, New York, NY, USA, 166-174
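A sketch of RBF-driven deformation with a Gaussian kernel (a standard formulation, not necessarily the paper's exact one): displacements specified at a few feature points are interpolated to every mesh vertex.

    import numpy as np

    def rbf_deform(verts, feature_pts, feature_disp, sigma=1.0):
        """verts: (V, 3); feature_pts: (F, 3); feature_disp: (F, 3) displacements."""
        def phi(r):
            return np.exp(-(r / sigma) ** 2)
        # solve for coefficients so the interpolant matches the given displacements
        D = np.linalg.norm(feature_pts[:, None, :] - feature_pts[None, :, :], axis=2)
        coeffs = np.linalg.solve(phi(D), feature_disp)            # (F, 3)
        # evaluate the interpolant at every mesh vertex
        Dv = np.linalg.norm(verts[:, None, :] - feature_pts[None, :, :], axis=2)
        return verts + phi(Dv) @ coeffs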

Page 60: Computer Animation (INFOMCANIM)

Image-based deformation

• Morphing between photographic images
  • Blendshape interpolation
  • Interpolating between static photos
  • Texture manipulation

• Vascular expressions
  • Skin color changes according to emotions
  • Most notable work: Kalra & Magnenat-Thalmann 1994

• Often used for special effects in movies
  • The Lord of the Rings, The Matrix, Star Wars

• Photorealistic results

Page 61: Computer Animation (INFOMCANIM)

Image-based deformation

• An early representative is Beier & Neely in 1992

• 1998: Pighin et al. produced highly realistic facial expressions

- Thaddeus Beier and Shawn Neely. 1992. Feature-based image metamorphosis. In Proceedings of the 19th annual conference on Computer graphics and interactive techniques (SIGGRAPH '92), James J. Thomas (Ed.). ACM, New York, NY, USA, 35-42

- Frédéric Pighin, Jamie Hecker, Dani Lischinski, Richard Szeliski, and David H. Salesin. 1998. Synthesizing realistic facial expressions from photographs. In Proceedings of the 25th annual conference on Computer graphics and interactive techniques (SIGGRAPH '98). ACM, New York, NY, USA, 75-84

Page 62: Computer Animation (INFOMCANIM)

Blendshapes

• Originated in industry
  • The Curious Case of Benjamin Button, King Kong, The Lord of the Rings, Final Fantasy

• Became a subject of academic research relatively recently

• Linear combination of facial expressions

Page 63: Computer Animation (INFOMCANIM)

Vector-matrix expression of a blendshape model

• Consider a facial model composed of 100 blendshapes

• Each having 10000 control vertices

$\mathbf{f} = \mathbf{B}\,\mathbf{w}$, where $\mathbf{f}$ is the resulting face (a 30000-vector of stacked vertex coordinates), $\mathbf{B}$ is the 30000 × 100 matrix whose columns are the blendshapes, and $\mathbf{w}$ is the vector of blend weights
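The same thing as a numpy sketch (random placeholder data; weights chosen to sum to 1, as is usual for whole-face targets):

    import numpy as np

    B = np.random.rand(30000, 100)   # columns = blendshapes, flattened (x, y, z) per vertex
    w = np.zeros(100)
    w[[6, 12]] = [0.3, 0.7]          # activate two expressions; weights sum to 1
    f = B @ w                        # resulting face
    face = f.reshape(10000, 3)       # back to (vertices, xyz) for rendering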

Page 64: Computer Animation (INFOMCANIM)

Delta blendshapes
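In the delta form the columns store offsets from the neutral face rather than whole faces, so an all-zero weight vector gives the neutral pose. Continuing the sketch above (placeholder data, names illustrative):

    import numpy as np

    f0 = np.random.rand(30000)            # neutral face, flattened
    B = np.random.rand(30000, 100)        # whole-face targets as before
    w = np.zeros(100)
    w[[6, 12]] = [0.3, 0.7]

    B_delta = B - f0[:, None]             # each column becomes (target - neutral)
    f = f0 + B_delta @ w                  # equals B @ w whenever the weights sum to 1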

Page 65: Computer Animation (INFOMCANIM)

Constructing Blendshapes

• Conceptually simple but difficult to construct
  • e.g. The Lord of the Rings had 675 targets

• A skilled artist deforms a base mesh
  • Scanned from a real actor or a sculpted model
  • 3D reconstruction from images

Page 66: Computer Animation (INFOMCANIM)

Animating with blendshapes

• Animating means specifying the weights

• Keyframe-animation (controlling the sliders)

• Performance-driven

• Direct-manipulation

Page 67: Computer Animation (INFOMCANIM)

Performance-driven animation

• 3D motion capture-data

• Model-based tracking of video (2D or depth camera)

• PCA-basis vs blendshape basis

Lance Williams, Performance-driven facial animation, Siggraph 1990

Page 68: Computer Animation (INFOMCANIM)

Learning controls for blend shape animation from 3D motion capture data

• Proposes an automatic technique that extracts a set of parameters from a blend shape model

Pushkar Joshi, Wen C. Tien, Mathieu Desbrun, and Frédéric Pighin. 2003. Learning controls for blend shape based realistic facial animation. In Proceedings of the 2003 ACM SIGGRAPH/Eurographics symposium on Computer Animation (SCA '03). Eurographics Association, Aire-la-Ville, Switzerland, Switzerland, 187-192.

Page 69: Computer Animation (INFOMCANIM)

Learning controls for blend shape animation from 3D motion capture data

• When the source motion to match is available in the form of 3D motion capture, this is a constrained linear problem that can be solved with quadratic programming

Minimize $\sum_{i=1}^{m} \Big\| \mathbf{m}_i - \sum_{j=1}^{n} w_j\,\mathbf{b}_{ij} \Big\|^2$

• $\mathbf{m}_i$: position of motion marker $i$; $\mathbf{b}_{ij}$: the corresponding point on blendshape $j$

• m is the number of motion markers

• n is the number of blendshapes
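A sketch of solving this with a bounded least-squares solver in place of a general QP (scipy's lsq_linear; the array layouts are illustrative):

    import numpy as np
    from scipy.optimize import lsq_linear

    def solve_blend_weights(marker_positions, blendshape_markers):
        """marker_positions: (m, 3); blendshape_markers: (n, m, 3), the marker
        positions reproduced by each of the n blendshapes."""
        n = blendshape_markers.shape[0]
        A = blendshape_markers.reshape(n, -1).T        # (3m, n)
        b = marker_positions.reshape(-1)               # (3m,)
        res = lsq_linear(A, b, bounds=(0.0, 1.0))      # keep weights in [0, 1]
        return res.x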

Page 70: Computer Animation (INFOMCANIM)

Learning controls for blend shape animation from 3D motion capture data

• Solving this system is equivalent to orthogonally projecting the motion onto the set of blendshapes

• It does not have an exact solution, since the motions can be more expressive than what the blendshape model allows

• To produce a mesh that follows the motion precisely, the vertices are translated by the residual between the captured motion and its projection onto the blendshapes

• The residual, known only for a small set of points, is interpolated to the rest of the facial mesh using radial basis functions

Page 71: Computer Animation (INFOMCANIM)

Model-based tracking

• Online modelling for real-time facial animation (Siggraph 2013) (online using Kinect)

• https://www.youtube.com/watch?v=DBoChIFrj2c

• Displaced Dynamic Expression Regression for Real-time Facial Tracking and Animation (Siggraph 2014) (using a single video camera)

• https://www.youtube.com/watch?v=mAGEiv3UNEU

• Dynamic 3D Avatar Creation from Hand-held Video Input (Siggraph 2015) (using a mobile phone camera)

• https://www.youtube.com/watch?v=6zP0E2atshw

• FaceShift markerless motion capture (commercial)

• https://www.youtube.com/watch?v=24qUFDdZAG8

Page 72: Computer Animation (INFOMCANIM)

Direct manipulation blendshapes

• Inverse kinematics for facial animation

• The artist directly moves points on the surface and the software must solve for the underlying weights or parameters

J.P. Lewis (Weta Digital) and Ken Anjyo (OLM Digital), Direct Manipulation Blendshapes, IEEE Computer Graphics & Applications, 2010
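A hedged sketch of the inverse problem behind direct manipulation: given a few pinned surface points, solve for weights that reproduce them while staying close to the current weights. This is a Tikhonov-regularized least squares in the spirit of Lewis & Anjyo, not their exact formulation.

    import numpy as np

    def direct_manipulation(A, target, w_current, lam=0.1):
        """A: (3p, n) blendshape displacements at the p pinned points;
        target: (3p,) desired displacements; w_current: (n,) current weights."""
        n = A.shape[1]
        lhs = A.T @ A + lam * np.eye(n)    # normal equations with damping toward w_current
        rhs = A.T @ target + lam * w_current
        return np.linalg.solve(lhs, rhs)   # minimizes |Aw - target|^2 + lam |w - w_current|^2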

Page 73: Computer Animation (INFOMCANIM)

References

• Frederic Parke and Keith Waters: Computer Facial Animation, Second Edition, CRC press, 2008.

• MIRALab, University of Geneva, Course Notes on Facial Animation

• Steve Rotenberg, CSE 169, Facial Expressions, Course Notes

• Facial Modelling and Animation, Siggraph 2004 Course Notes

• Performance-driven Facial Animation, Siggraph 2006 Course Notes

• Practice and Theory of Blendshape Facial Models, Eurographics State-of-the-art, 2014