Face Animation Overview with Shameless Bias Toward MPEG-4 Face Animation Tools Dr. Eric Petajan...


Page 1: Face Animation Overview with Shameless Bias Toward MPEG-4 Face Animation Tools

Dr. Eric Petajan
Chief Scientist and Founder
face2face animation, inc.
eric@f2f-inc.com

Page 2: Computer-generated Face Animation Methods

• Morph targets/key frames (traditional)
• Speech articulation model (TTS)
• Facial Action Coding System (FACS)
• Physics-based (skin and muscle models)
• Marker-based (dots glued to face)
• Video-based (surface features)

Page 3: Morph targets/key frames

• Advantages
  – Complete manual control of each frame
  – Good for exaggerated expressions
• Disadvantages
  – Hard to achieve good lipsync without manual tweaking
  – Morph targets must be downloaded to terminal for streaming animation (delay)

Page 4: Speech articulation model

• Advantages
  – High level control of face
  – Enables TTS
• Disadvantages
  – Robotic character
  – Hard to sync with real voice

Page 5: Facial Action Coding System

• Advantages
  – Very high level control of face
  – Maps to morph targets
  – Explicit specification of emotional states
• Disadvantages
  – Not good for speech
  – Not quantified

Page 6: Physics-based

• Advantages
  – Good for realistic skin, muscle and fat
  – Collision detection
• Disadvantages
  – High complexity
  – Must be driven by high level articulation parameters (TTS)
  – Hard to drive with motion capture data

Page 7: Marker-based

• Advantages
  – Can provide accurate motion data from most of the face
  – Face models can be animated directly from surface feature point motion
• Disadvantages
  – Dots glued to face
  – Dots must be manually registered
  – Not good for accurate inner lip contour or eyelid tracking

Page 8: Video-based

• Advantages
  – Simple to capture video of face
  – Face models can be animated directly from surface feature motion
• Disadvantages
  – Must have good view of face

Page 9: What is MPEG-4 Multimedia?

• Natural audio and video objects
• 2D and 3D graphics (based on VRML)
• Animation (virtual humans)
• Synthetic speech and audio

Page 10: Samples versus Objects

• Traditional video coding is sample based (blocks of pixels are compressed)
• MPEG-4 provides visual object representation for better compression and new functionalities
• Objects are rendered in the terminal after decoding object descriptors

Page 11: Object-based Functionalities

• User can choose display of content layers
• Individual objects (text, models) can be searched or stored for later use
• Content is independent of display resolution
• Content can be easily repurposed by provider for different networks and users

Page 12: MPEG-4 Object Composition

• Objects are organized in a scene graph
• Scene graphs are specified using a binary format called BIFS (based on VRML)
• Both 2D and 3D objects, properties and transforms are specified in BIFS
• BIFS allows objects to be transmitted once and instanced repeatedly in the scene after transformations

Page 13: MPEG-4 Operation Sequence

[Diagram: four stages laid out along a TIME axis]

• Terminal setup
  – communications bit-rate
  – available RAM and disk
  – available MIPS
  – accelerators for graphics, etc.
  – display resolution
• Initial download
  – geometry
  – textures
  – articulated face model
• Incremental or streaming data download
  – newly exposed geometry and textures
  – audio data
• Time stamped control parameter stream
  – view position
  – object instancing/destruction
  – face animation parameters

Page 14: MPEG-4 Decoder Functional Architecture

[Block diagram; the recoverable block labels are listed below]

• MPEG-4 bitstream (input)
• System layer (bitstream)
  – content addressing
  – synchronization
  – scalability
  – error correction
• Video/image decoders
  – H.263
  – MPEG-2
  – JPEG
• Audio decoder
  – low bit-rate speech
  – 64 kbps AAC
• Audio synthesizer/processor
  – MIDI/Structured audio
  – 3D processor
• 2D/3D geometry decoder
  – Polygonal meshes
  – Segmentation masks
• Cached data
  – geometry
  – textures
  – articulated figures (faces)
  – video clips
  – audio clips
  – FAP codebooks
• System layer, compositing and rendering
  – user input
  – user content mods
  – user POV mods
• Display
• User input
• Level of detail control

Page 15: Faces are Special

• Humans are hard-wired to respond to faces
• The face is the primary communication interface
• Human faces can be automatically analyzed and parameterized for a wide variety of applications

Page 16: MPEG-4 Face and Body Animation Coding

• Face animation is in MPEG-4 version 1
• Body animation is in MPEG-4 version 2
• Face animation parameters displace feature points from neutral position
• Body animation parameters are joint angles
• Face and body animation parameter sequences are compressed to low bitrates

Page 17: Neutral Face Definition

• Head axes parallel to the world axes
• Gaze is in direction of Z axis
• Eyelids tangent to the iris
• Pupil diameter is one third of iris diameter
• Mouth is closed and the upper and lower teeth are touching
• Tongue is flat, horizontal with the tip of tongue touching the boundary between upper and lower teeth

Page 18: Face Feature Points

[Figure: numbered face feature points (groups 2.x through 11.x) drawn on front and side views of the head with x, y, z axes, plus detail panels labeled Right eye and Left eye (3.x), Tongue (6.x), Mouth (8.x), Nose and Teeth (9.x). Legend: feature points affected by FAPs; other feature points.]

Page 19: Face Animation Parameter Normalization

• Face Animation Parameters (FAPs) are normalized to facial dimensions
• Each FAP is measured as a fraction of neutral face mouth width, mouth-nose distance, eye separation, or iris diameter
• 3 head and 2 eyeball rotation FAPs are Euler angles

Page 20: Neutral Face Dimensions for FAP Normalization

[Figure: neutral face annotated with the dimensions MW0, MNS0, ENS0, ES0 and IRISD0.]
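A minimal sketch of the normalization idea from Pages 19 and 20: a FAP value is turned into a displacement by scaling it with a unit derived from the model's own neutral-face dimensions. It assumes that one FAP unit is 1/1024 of the corresponding neutral-face distance, a detail of the published standard that these slides do not state. The dimension values and function names are hypothetical.

```python
# Neutral-face dimensions measured on one particular model (hypothetical
# values, expressed in that model's own length units).
NEUTRAL_DIMENSIONS = {
    "MW0": 60.0,     # mouth width
    "MNS0": 20.0,    # mouth-nose separation
    "ENS0": 40.0,    # eye-nose separation
    "ES0": 64.0,     # eye separation
    "IRISD0": 12.0,  # iris diameter
}

# Assumption (not stated on the slides): one FAP unit is 1/1024 of the
# corresponding neutral-face distance.
FAPU_DIVISOR = 1024.0


def fap_units(neutral):
    """Return the size of one FAP unit for each neutral-face dimension."""
    return {name: length / FAPU_DIVISOR for name, length in neutral.items()}


def fap_to_displacement(fap_value, dimension, neutral=NEUTRAL_DIMENSIONS):
    """Convert a normalized FAP amplitude into a displacement in model units."""
    return fap_value * fap_units(neutral)[dimension]


# The same FAP value moves a feature point proportionally further on a face
# whose mouth-nose separation is twice as large.
small = fap_to_displacement(256, "MNS0")
large = fap_to_displacement(256, "MNS0", {**NEUTRAL_DIMENSIONS, "MNS0": 40.0})
print(small, large)
```

Because the encoder only ever sends the normalized value, the same stream can drive faces of different proportions.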

Page 21: FAP Groups

Group                                               Number of FAPs
1: visemes and expressions                          2
2: jaw, chin, inner lowerlip, cornerlips, midlip    16
3: eyeballs, pupils, eyelids                        12
4: eyebrow                                          8
5: cheeks                                           4
6: tongue                                           5
7: head rotation                                    3
8: outer lip positions                              10
9: nose                                             4
10: ears                                            4
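The group table can be held as a small lookup, which also makes it easy to check that the counts add up to the 68 parameters mentioned on Page 49. The sketch below additionally assumes that FAPs are numbered consecutively in group order (FAPs 1 and 2 in group 1, FAPs 3 to 18 in group 2, and so on); the slides do not say this, and the function name is hypothetical.

```python
# Group number -> (description, number of FAPs), copied from the table.
FAP_GROUPS = {
    1: ("visemes and expressions", 2),
    2: ("jaw, chin, inner lowerlip, cornerlips, midlip", 16),
    3: ("eyeballs, pupils, eyelids", 12),
    4: ("eyebrow", 8),
    5: ("cheeks", 4),
    6: ("tongue", 5),
    7: ("head rotation", 3),
    8: ("outer lip positions", 10),
    9: ("nose", 4),
    10: ("ears", 4),
}

TOTAL_FAPS = sum(count for _, count in FAP_GROUPS.values())


def group_of_fap(fap_number):
    """Return the group of a FAP, assuming consecutive numbering by group."""
    if not 1 <= fap_number <= TOTAL_FAPS:
        raise ValueError(f"FAP number out of range: {fap_number}")
    upper = 0
    for group, (_, count) in FAP_GROUPS.items():
        upper += count  # highest FAP number belonging to this group
        if fap_number <= upper:
            return group


print(TOTAL_FAPS, group_of_fap(48))
```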

Page 22: Lip FAPs

• Mouth closed if sum of upper and lower lip FAPs = 0

Page 23: Face Model Independence

• FAPs are always normalized for model independence
• FAPs (and BAPs) can be used without MPEG-4 systems/BIFS
• Private face models can be accurately animated with FAPs
• Face models can be simple or complex depending on terminal resources

Page 24: MPEG-4 BIFS Face Node

• Face node contains FAP node, Face scene graph, Face Definition Parameters (FDP), FIT, and FAT
• FIT (Face Interpolation Table) specifies interpolation of FAPs in terminal
• FAT (Face Animation Table) maps FAPs to Face model deformation
• FDP information includes face feature point positions and texture map

Page 25: Face Model Download

• 3D graphical models (e.g. faces) can be downloaded to the terminal with MPEG-4
• 3D model specification is based on VRML
• Face Animation Table (FAT) maps FAPs to face model vertex displacements
• Appearance and animation of downloaded face models are exactly predictable
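To make the idea of a table that maps FAPs to vertex displacements concrete, here is a minimal sketch. It assumes a purely linear table with one displacement vector per affected vertex per unit of FAP value; the slides do not describe the actual table format, so the data layout, the function name and the sample numbers are all hypothetical.

```python
def apply_fat(neutral_vertices, fat, faps):
    """Displace neutral vertices according to a (hypothetical) linear table.

    neutral_vertices: list of (x, y, z) tuples for the neutral face
    fat:  {fap_name: {vertex_index: (dx, dy, dz) per unit of FAP value}}
    faps: {fap_name: value} for the current frame
    """
    deformed = [list(vertex) for vertex in neutral_vertices]
    for fap_name, value in faps.items():
        # FAPs without a table entry leave the mesh untouched.
        for index, direction in fat.get(fap_name, {}).items():
            for axis in range(3):
                deformed[index][axis] += value * direction[axis]
    return [tuple(vertex) for vertex in deformed]


# Two-vertex toy mesh: only vertex 1 is affected, moving down in y.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
fat = {"open_jaw": {1: (0.0, -0.25, 0.0)}}
frame = apply_fat(vertices, fat, {"open_jaw": 8})
print(frame)
```

Since the table travels with the model, every terminal applying the same FAP stream produces the same vertex positions, which is the predictability the slide refers to.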

Page 26: FAP Compression

• FAPs are adaptively quantized to desired quality level
• Quantized FAPs are differentially coded
• Adaptive arithmetic coding further reduces bitrate
• Typical compressed FAP bitrate is less than 2 kilobits/second

Page 27: FAP Predictive Coding

[Block diagram: FAP(t) enters a summing node (+, −), followed by the quantizer Q and the Arithmetic Coder, which outputs the Bitstream. A feedback loop through Q⁻¹ and a Frame Delay feeds the subtracting input of the summing node.]
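The predictive loop in the diagram can be sketched as follows for a single FAP track. This is an illustration under stated assumptions, not the codec from the standard: it uses a fixed uniform quantizer step, starts the prediction at 0, and leaves out the arithmetic coder (the integer symbols are what would be passed to it). The function names are hypothetical.

```python
def encode_fap_track(values, step):
    """Quantize the difference between each value and the decoder's
    reconstruction of the previous frame."""
    symbols = []
    previous = 0  # reconstructed value held in the frame delay
    for value in values:
        symbol = round((value - previous) / step)  # Q
        symbols.append(symbol)
        previous += symbol * step                  # Q^-1, added to prediction
    return symbols


def decode_fap_track(symbols, step):
    """Rebuild the FAP track by accumulating dequantized differences."""
    values = []
    previous = 0
    for symbol in symbols:
        previous += symbol * step
        values.append(previous)
    return values


track = [0, 10, 25, 30, 30, 18]
symbols = encode_fap_track(track, step=4)
decoded = decode_fap_track(symbols, step=4)
print(symbols, decoded)
```

Because the encoder predicts from the reconstructed value rather than the original one, the quantization error stays within half a step per frame instead of accumulating over time.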

Page 28: Face Analysis System

• MPEG-4 does not specify analysis systems
• face2face face analysis system tracks nostrils for robust operation
• Inner lip contour estimated using adaptive color thresholding and lip modeling
• Eyelids, eyebrows and gaze direction

Page 29: Nostril Tracking

Nostrils are detected only if:
• At least 75% of nostril window area is skin color as indicated by RGB skin color table
• After RGB thresholding nostril window, at least 15% of area is subthreshold (nostril)
• Min/max constraints are met for nostril width, height, gap, center spacing, and orientation in thresholded projection domain

[Figure: nostril window with its horizontal and vertical nostril projections and the projection threshold; the labeled dimensions are gap, width, height and spacing.]
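The three acceptance tests can be written as a single predicate. In this sketch the 75% and 15% figures come from the slide, while the function name, the argument layout and the min/max limits are hypothetical; computing the fractions and measurements from the image is outside its scope.

```python
def nostrils_detected(skin_fraction, dark_fraction, measurements, limits):
    """Apply the three nostril acceptance tests listed on the slide.

    skin_fraction: fraction of nostril-window pixels classified as skin color
    dark_fraction: fraction of window pixels below the RGB threshold
    measurements:  {name: value} for width, height, gap, spacing, orientation
    limits:        {name: (minimum, maximum)} for the same names
    """
    if skin_fraction < 0.75:   # window must be mostly skin
        return False
    if dark_fraction < 0.15:   # enough dark (nostril) pixels after thresholding
        return False
    # Every geometric measurement must fall inside its allowed range.
    return all(low <= measurements[name] <= high
               for name, (low, high) in limits.items())


# Hypothetical limits and measurements, in pixels and degrees.
limits = {"width": (4, 20), "height": (2, 12), "gap": (2, 15),
          "spacing": (8, 40), "orientation": (-30, 30)}
measured = {"width": 10, "height": 6, "gap": 5, "spacing": 18, "orientation": 4}
print(nostrils_detected(0.80, 0.20, measured, limits))
```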

Page 30: Inner Lip Contour Estimation

[Flowchart; steps in order]

• Detect mouth closure
• Train horizontal mouth threshold array when mouth closed
• Apply threshold array to mouth region
• Locate teeth by color and position
• Form inner lip contour around inner mouth and teeth pixels

Page 31: FAP Estimation Algorithm

• Head scale is normalized based on neutral mouth (closed mouth) width
• Head pitch is approximated based on vertical nostril deviation from neutral head position
• Head roll is computed from smoothed eye or nostril orientation depending on availability
• Inner lip FAPs are measured directly from the inner lip contour as deviations from the neutral lip position (closed mouth)
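Two of these steps reduce to short formulas. The sketch below assumes that roll is taken as the angle of the line through two tracked points (eye or nostril centers) in image coordinates, and that scale normalization means dividing a measured length by the neutral mouth width; neither formula is given on the slide. The function names are hypothetical and the smoothing step is omitted.

```python
import math


def head_roll_degrees(right_point, left_point):
    """Angle of the line through two (x, y) image points, in degrees.

    The sign convention depends on the image coordinate system, which is an
    assumption here.
    """
    dx = left_point[0] - right_point[0]
    dy = left_point[1] - right_point[1]
    return math.degrees(math.atan2(dy, dx))


def normalized_length(length_px, neutral_mouth_width_px):
    """Express a measured length as a fraction of the neutral mouth width."""
    return length_px / neutral_mouth_width_px


print(head_roll_degrees((100, 200), (160, 200)))
print(normalized_length(30, 60))
```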

Page 32: FAP Sequence Smoothing

[Figure: two plots of FAP value (axis from -500 to 200) against time (1/30 sec), each showing the lower_t_midlip and raise_b_midlip tracks.]
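The slides do not say which filter is used to smooth the FAP tracks. As a stand-in, the sketch below applies a centered moving average to one track sampled at 1/30 sec intervals; the function name and the window radius are hypothetical.

```python
def smooth(track, radius=1):
    """Centered moving average; the window shrinks at the ends of the track."""
    smoothed = []
    for i in range(len(track)):
        window = track[max(0, i - radius): i + radius + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# A single-frame spike is spread over its neighbors.
print(smooth([0, 0, 90, 0, 0]))
```

A wider radius removes more tracking jitter but also blurs fast lip movements, so the radius is a trade-off rather than a fixed choice.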

Page 33: MPEG-4 Visemes and Expressions

• A weighted combination of 2 visemes and 2 facial expressions for each frame
• Decoder is free to interpret effect of visemes and expressions after FAPs are applied
• Definitions of visemes and expressions using FAPs can also be downloaded
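One way to picture the weighted combination is as a linear blend of two shapes, each defined as a set of FAP values. This is only a sketch: it assumes a weight between 0 and 1 and a simple linear mix, whereas the slide leaves the interpretation to the decoder. The shape definitions and the function name are hypothetical.

```python
def blend(shape_a, shape_b, weight_a):
    """Linearly mix two shapes given as {fap_name: value} dictionaries.

    weight_a is the share of shape_a (0.0 to 1.0); FAPs missing from a shape
    are treated as 0, i.e. the neutral position.
    """
    names = set(shape_a) | set(shape_b)
    return {
        name: weight_a * shape_a.get(name, 0)
        + (1 - weight_a) * shape_b.get(name, 0)
        for name in names
    }


# Hypothetical FAP definitions for two mouth shapes.
open_vowel = {"open_jaw": 200}
spread_lips = {"open_jaw": 0, "stretch_l_cornerlip": 100}
mixed = blend(open_vowel, spread_lips, 0.25)
print(mixed)
```

The same function would serve for mixing two expressions, since both are described as FAP sets.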

Page 34: Visemes

viseme_select   phonemes     example
0               none         na
1               p, b, m      put, bed, mill
2               f, v         far, voice
3               T, D         think, that
4               t, d         tip, doll
5               k, g         call, gas
6               tS, dZ, S    chair, join, she
7               s, z         sir, zeal
8               n, l         lot, not
9               r            red
10              A:           car
11              e            bed
12              I            tip
13              Q            top
14              U            book
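The viseme table translates directly into a phoneme-to-viseme lookup, for example to drive a face from the phoneme output of a TTS system. The phoneme symbols and viseme numbers below are copied from the table; the function name and the choice to map unknown symbols to viseme 0 (none) are assumptions.

```python
# viseme_select -> phonemes, copied from the table.
VISEME_PHONEMES = {
    1: ["p", "b", "m"],
    2: ["f", "v"],
    3: ["T", "D"],
    4: ["t", "d"],
    5: ["k", "g"],
    6: ["tS", "dZ", "S"],
    7: ["s", "z"],
    8: ["n", "l"],
    9: ["r"],
    10: ["A:"],
    11: ["e"],
    12: ["I"],
    13: ["Q"],
    14: ["U"],
}

# Inverted table: phoneme symbol -> viseme_select.
PHONEME_TO_VISEME = {
    phoneme: viseme
    for viseme, phonemes in VISEME_PHONEMES.items()
    for phoneme in phonemes
}


def visemes_for(phonemes):
    """Map phoneme symbols to viseme_select values (0 for unknown symbols)."""
    return [PHONEME_TO_VISEME.get(phoneme, 0) for phoneme in phonemes]


print(visemes_for(["b", "e", "d"]))
```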

Page 35: Facial Expressions

expression_select   expression name   textual description
0                   na                na
1                   joy               The eyebrows are relaxed. The mouth is open and the mouth corners pulled back toward the ears.
2                   sadness           The inner eyebrows are bent upward. The eyes are slightly closed. The mouth is relaxed.
3                   anger             The inner eyebrows are pulled downward and together. The eyes are wide open. The lips are pressed against each other or opened to expose the teeth.
4                   fear              The eyebrows are raised and pulled together. The inner eyebrows are bent upward. The eyes are tense and alert.
5                   disgust           The eyebrows and eyelids are relaxed. The upper lip is raised and curled, often asymmetrically.
6                   surprise          The eyebrows are raised. The upper eyelids are wide open, the lower relaxed. The jaw is opened.

Page 36: Free Face Model Software

• Wireface is an OpenGL-based, MPEG-4 compliant face model
• Good starting point for building high quality face models for web applications
• Reads FAP file and raw audio file
• Renders face and audio in real time
• Wireface source is freely available

Page 37: Body Animation

• Harmonized with VRML H-Anim spec
• Body Animation Parameters (BAPs) are humanoid skeleton joint Euler angles
• Body Animation Table (BAT) can be downloaded to map BAPs to skin deformation
• BAPs can be highly compressed for streaming

Page 38: Body Animation Parameters (BAPs)

• 186 humanoid skeleton Euler angles
• 110 free parameters for use with downloaded body surface mesh
• Coded using same codecs as FAPs
• Typical bitrate for coded BAPs is 5-10 kbps

Page 39: Body Definition Parameters (BDPs)

• Humanoid joint center positions
• Names and hierarchy harmonized with VRML/Web3D H-Anim working group
• Default positions in standard for broadcast applications
• Download just BDPs to accurately animate unknown body model

Page 40: Faces Enhance the User Experience

• Virtual call center agents
• News readers (e.g. Ananova)
• Story tellers for the child in all of us
• eLearning
• Program guide
• Multilingual (same face different voice)
• Entertainment animation
• Multiplayer games

Page 41: Visual Content for the Practical Internet

• Broadband deployment is happening slowly
• DSL availability is limited and cable is shared
• Talking heads need high frame-rate
• Consumer graphics hardware is cheap and powerful
• MPEG-4 SNHC/FBA tools are matched to available bandwidth and terminals

Page 42: Visual Speech Processing

• FAPs can be used to improve speech recognition accuracy
• Text-to-speech systems can use FAPs to animate face models
• FAPs can be used in computer-human dialogue systems to communicate emotions, intentions and speech, especially in noisy environments

Page 43: Video-driven Face Animation

• Facial expressions, lip movements and head motion transferred to face model
• FAPs extracted from talking head video with special computer vision system
• No face markers or lipstick are required
• Normal lighting is used
• Communicates lip movements and facial expressions with visual anonymity

Page 44: Automatic Face Animation Demonstration

• FAPs extracted from camcorder video
• FAPs compressed to less than 2 kbits/sec
• 30 frames/sec animation generated automatically
• Face models animated with bones rig or fixed deformable mesh (real-time)

Page 45: [Two embedded QuickTime video clips; not reproduced in the transcript.]

Page 46: What is easy, solved, or almost solved

• Can we do photorealistic non-animated face models? YES
• Can we do near-real-time lip sync'ing that is indistinguishable from a human? NO

Page 47: What is really hard

• Synthesizing human speech and facial expressions
• Hair

Page 48: What we have assumed someone else is solving

• Graphics acceleration
• Video camera cost and resolution
• Multimedia communication infrastructure

Page 49: Where we need help

• We have a face with 68 parameters but we need the psychologists to tell us how to drive it autonomously
• We need to embody our agents into graphical models that have a couple of thousand parameters to control gaze, gesture, body language, and do collision detection -> NEED MORE SPEED

Page 50: Core functionality of the face

• Speech
  – Lips, teeth, tongue
• Emotional expressions
  – Gaze, eyebrow, eyelids, head pose
• Non-verbal communication
• Sensory responsivity
• Technical requirements
  – Framerate
  – Synchronization
  – Latency
  – Bitrate
  – Spatial resolution
  – Complexity
• Common framework with body
• Interaction
• Different faces should respond similarly to common commands
• Accessible to everyone

Page 51: Interaction with other components

• Language and discourse
  – Phoneme to viseme mapping
  – Given/new
• Action in the environment
• Global information
  – Emotional state
  – Personality
  – Culture
  – World knowledge
  – Central time-base and timestamps

Page 52: Open questions

• Central vs peripheral functionality
• Degree of interface commonality
• Degree of agent autonomy
• What should the VH be capable of?