TDLC Workshop, August 2009 – Social Interaction Network Day
FACS: Facial Action Coding System
CERT: Computer Expression Recognition Toolbox
The Facial Action Coding System (Ekman & Friesen, 1978)
How do humans detect facial actions?
• Relative movement of parts of the face (motion based; video)
• Wrinkles and furrows (texture based; photo)
• Shape of parts of the face (texture based)
[Images: neutral face vs. brow raise; fear brow = AU 1+2+4]
Video examples
• Show ELAN clip of TDLC_30_MysteryBox (motion enhances human detection)
• Show ELAN clip of TDLC_24_MysteryBox (interpreting low intensity action)
• Handout contains a list of numerical codes for the most common facial action units.
• Briefly try producing: AU 1+2, 4, 7, 51, 53, 55, 61, 63, 12, 20, 26.
[Images: brow actions – AU 1 alone; AU 4 alone; AU 1+4; AU 1+2; AU 1+2+4; AU 5 alone]
Target Upper Face AUs
Target Lower Face AUs
[Images: AU 10, 12, 14, 20]
Basic Emotions
• Anger
• Disgust
• Fear
• Joy
• Sadness
• Surprise
DFAT Examples
Pain Actions
Real Pain with CERT
Warning
• Do not try to FACS code your own data.
• Real expressions are typically complex combinations of AUs at various intensities.
• Subtle differences not mentioned here.
C.E.R.T.
Please hold while I open the CERT GUI.
Computer Expression Recognition Toolbox (CERT)
[Diagram: CERT architecture – filter bank → feature selection → machine learning (one binary classifier per AU) → FACS outputs (AU 1, AU 2, …, AU 46) → meta labels, dynamics]
Developed at the Machine Perception Laboratory
Face detection
• Compaq dataset: 5,000 face images collected from the web and segmented by hand.
• 8,000 non-face images collected from the web.
• 30 frames/second (160×120 images, 2.1 GHz).
• 90% detection rate at a one-in-a-million false-positive rate on the CMU test set; equivalent to Viola & Jones.
• Source code available at http://mplab.ucsd.edu
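The MPLab detector's source is at the URL above; for a quick experiment, OpenCV's bundled Viola-Jones-style Haar cascade behaves comparably. A minimal sketch, assuming OpenCV is installed and "frame.png" is a stand-in input image:

```python
# Viola-Jones-style face detection with OpenCV's bundled Haar cascade,
# used here as a stand-in for the MPLab detector. File name and
# detection parameters are illustrative assumptions.
import cv2

cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

frame = cv2.imread("frame.png")               # hypothetical input frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

# scaleFactor/minNeighbors trade detection rate against false positives.
for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.1,
                                             minNeighbors=5,
                                             minSize=(24, 24)):
    cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
```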
Frontal ±10 degrees; Procrustes alignment using 4 points
Automatic registration
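A minimal sketch of the 4-point Procrustes (similarity) alignment step, using Umeyama's closed form; the particular landmarks and template coordinates below are illustrative assumptions, not CERT's actual values:

```python
# 4-point similarity (Procrustes) alignment via Umeyama's closed form:
# find scale s, rotation R, translation t minimizing
# sum_i ||s R p_i + t - q_i||^2.
import numpy as np

def procrustes_align(src, dst):
    """src, dst: (4, 2) arrays of corresponding landmarks."""
    mu_s, mu_d = src.mean(0), dst.mean(0)
    src_c, dst_c = src - mu_s, dst - mu_d
    cov = dst_c.T @ src_c / len(src)
    U, S, Vt = np.linalg.svd(cov)
    D = np.diag([1.0, np.sign(np.linalg.det(U @ Vt))])  # no reflections
    R = U @ D @ Vt
    s = np.trace(np.diag(S) @ D) / src_c.var(0).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t

# Hypothetical landmarks (e.g., eye and mouth corners) and template:
detected = np.array([[80, 95], [140, 93], [90, 170], [132, 172]], float)
template = np.array([[30, 40], [66, 40], [36, 72], [60, 72]], float)
s, R, t = procrustes_align(detected, template)
aligned = (s * (R @ detected.T)).T + t   # landmarks in template frame
```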
Training data: large datasets (over 10,000 images from over 300 subjects)
Expert FACS coding
Combined posed and spontaneous datasets:
• DFAT (Cohn-Kanade)
• Ekman-Hager
• MMI (Pantic et al.)
• D005, D006, and D007 (Frank et al.)
Training set size: 20,000-subject smile database
[Plot: performance vs. training set size (100 to 10,000 examples), approaching 90%; Whitehill et al.]
POFA+noise
Gabor representation
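A minimal sketch of a Gabor magnitude representation, computed with quadrature-pair filters; the orientation and wavelength grid is a generic assumption, not CERT's published filter-bank configuration:

```python
# Minimal Gabor magnitude representation of a registered face patch.
# The 8-orientation / 3-wavelength grid is an illustrative assumption.
import cv2
import numpy as np

def gabor_features(patch, n_orient=8, wavelengths=(4, 8, 16)):
    feats = []
    for lam in wavelengths:
        for k in range(n_orient):
            theta = k * np.pi / n_orient
            # Quadrature pair (phases 0 and pi/2) gives the magnitude.
            k_re = cv2.getGaborKernel((21, 21), 0.5 * lam, theta, lam, 0.5, 0)
            k_im = cv2.getGaborKernel((21, 21), 0.5 * lam, theta, lam, 0.5,
                                      np.pi / 2)
            re = cv2.filter2D(patch, cv2.CV_32F, k_re)
            im = cv2.filter2D(patch, cv2.CV_32F, k_im)
            feats.append(np.hypot(re, im).ravel())
    return np.concatenate(feats)   # one long feature vector per patch
```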
SVM Classifiers
• 19 AUs with over 100 examples
• Unilaterals (left or right) for 3 AUs
• Fear, distress, smiles, blinks
• Yaw, pitch and roll
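A minimal sketch of the one-binary-classifier-per-AU setup with scikit-learn; the linear kernel and C value are assumptions, and X/au_labels are placeholder names:

```python
# One binary SVM per action unit over Gabor feature vectors; the margin
# (signed distance from the hyperplane) later serves as graded evidence.
import numpy as np
from sklearn.svm import SVC

def train_au_detectors(X, au_labels):
    """X: (n_frames, n_features); au_labels: dict AU id -> binary labels."""
    detectors = {}
    for au, y in au_labels.items():
        detectors[au] = SVC(kernel="linear", C=1.0).fit(X, y)
    return detectors

def au_evidence(detectors, X):
    """Frame-by-frame SVM margins, one stream per AU."""
    return {au: clf.decision_function(X) for au, clf in detectors.items()}
```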
Action Unit Recognition Performance

AU       Name                  Posed  Spont
1        Inner brow raise      .97    .89
2        Outer brow raise      .95    .82
4        Brow lower            .94    .74
5        Upper lid raise       .96    .79
6        Cheek raise           .92    .90
7        Lids tight            .91    .78
9        Nose wrinkle          .99    .87
10       Upper lip raise       .95    .79
12       Lip corner pull       .99    .92
14       Dimpler               .90    .77
15       Lip corner depress    .97    .86
17       Chin raise            .95    .80
18       Lip pucker            .83    .72
20       Lip stretch           .91    .62
23       Lip tighten           .85    .66
24       Lip press             .94    .75
25       Lips part             .96    .72
26       Jaw drop              .88    .71
1, 1+4   Distress brow         .94    .70
1+2+4    Fear brow             .95    .63
Mean                           .93    .77
Performance: area under the ROC (A′) = fraction correct on a 2-alternative forced choice; an unbiased sensitivity measure.
[Plot: ROC curve, hit rate vs. false-alarm rate (0 to 1); A′ = area under the curve; equal-error point marked]
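Since A′ equals 2AFC accuracy, it can be computed directly as the fraction of (positive, negative) pairs the score ranks correctly; a minimal sketch:

```python
# A' computed directly as 2AFC accuracy: the fraction of
# (positive, negative) score pairs ranked correctly, ties counting half.
import numpy as np

def a_prime(scores, labels):
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    greater = (pos[:, None] > neg[None, :]).mean()
    ties = (pos[:, None] == neg[None, :]).mean()
    return greater + 0.5 * ties

scores = np.array([0.9, 0.8, 0.4, 0.35, 0.1])
labels = np.array([1, 1, 0, 1, 0])
print(a_prime(scores, labels))   # 0.833..., identical to the ROC area
```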
The SVM margin (CERT output) predicts human AU intensity label
[Plots: SVM margin vs. frame number for AU 4, AU 7, AU 9, …]
Correlation:
• Varies by subject
• Ranges from r = .34 to r = .93
• Mean r = .63
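A minimal sketch of how such per-subject correlations could be computed from margin time series and frame-aligned human intensity codes (all names are placeholders):

```python
# Per-subject Pearson correlation between the CERT margin time series
# and frame-aligned human intensity codes (e.g., A-E mapped to 1-5).
import numpy as np

def margin_intensity_corr(margins, intensities):
    """Both: dict subject_id -> 1-D arrays aligned frame by frame."""
    return {subj: np.corrcoef(m, intensities[subj])[0, 1]
            for subj, m in margins.items()}

# rs = margin_intensity_corr(margins, intensities)
# np.mean(list(rs.values()))   # the slides report a mean of r = .63
```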
Dynamic classifier output on 3.5 minutes of video of spontaneous behavior. Action unit 12 = zygomatic major (lip corner pull).
[Plot: evidence of AU 12 vs. video frame number, with human-coded episodes labeled A-E]
* Human AU codes (A-E): apex, onset-offset interval, low-to-high intensity
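Episodes like A-E can be recovered from the evidence stream by thresholding; a minimal onset/apex/offset segmentation sketch, where the threshold and smoothing window are assumptions:

```python
# Threshold a smoothed AU-evidence stream into (onset, apex, offset)
# episodes. The zero threshold and 15-frame window are assumptions.
import numpy as np

def find_episodes(evidence, thresh=0.0, smooth=15):
    s = np.convolve(evidence, np.ones(smooth) / smooth, mode="same")
    above = s > thresh
    edges = np.diff(above.astype(int))
    onsets = np.flatnonzero(edges == 1) + 1
    offsets = np.flatnonzero(edges == -1) + 1
    if above[0]:                      # stream starts inside an episode
        onsets = np.r_[0, onsets]
    if above[-1]:                     # stream ends inside an episode
        offsets = np.r_[offsets, len(s)]
    return [(on, on + int(np.argmax(s[on:off])), off)
            for on, off in zip(onsets, offsets)]
```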
Predicting self‐report of emotion
[Plot: dynamics of CERT output]
Surprise Expressions
[Plots: CERT output vs. frame number for Subjects 1-4; legend: * AU 1, o AU 2, AU 5]
Disgust Expressions
[Plots: CERT output vs. frame number for Subjects 1-4; legend: * AU 4, o AU 7, AU 9]
Testing: MLR Weighted Temporal Windows
Classifier for Real vs Fake Pain
[Diagram: 19 facial action channels (AU 1, AU 2, AU 4, …, AU 28), frame by frame → Gabor-like filter applied to a 500-frame window → statistics within episode → SVM (RBF) → real pain / fake pain / no pain]
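A minimal sketch of the episode-statistics stage feeding the RBF SVM; the particular statistics below (means, variability, inter-AU correlations) are illustrative stand-ins for the study's temporally filtered features:

```python
# Episode-level features over the 19 AU channels, then an RBF SVM.
import numpy as np
from sklearn.svm import SVC

def episode_features(au_streams):
    """au_streams: (n_frames, 19) CERT outputs for one episode."""
    corr = np.corrcoef(au_streams.T)        # 19x19 inter-AU correlations
    return np.concatenate([au_streams.mean(0),
                           au_streams.std(0),
                           corr[np.triu_indices(19, k=1)]])

# X = np.stack([episode_features(ep) for ep in episodes])
# y = ...  # 1 = real pain, 0 = faked pain
# clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
```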
Conclusions from pain study
• Automated expression coding identified similar facial actions to previous human coding studies for genuine pain and posed pain.
• Dynamical information, such as the variability of the actions' durations or the correlations between actions, was necessary for good classification performance.
• A learned classifier outperformed naïve humans at discriminating fake from real pain expressions, based on 1 minute of video of each:
  – 52% correct (human) for fake/real
  – 89% correct (CERT) for fake/real
TDLC study
• Judy Reilly, Marni Bartlett, Gwen Littlewort, Linda Phan, Grace Kang, and colleagues.
• 30 typically developing (TD) children aged 4-9 years, each doing a 45-minute session that includes many tasks designed to provoke facial expressions through imitation, natural social interaction, and games.
• Longitudinal study, building a database.
• Rapidly yields huge quantities of CERT output data.
• Need a variety of temporal-dynamics measures to apply to CERT outputs, such as event detection, measures of coordination and temporal integration, block statistics, and correlation distributions (see the sketch below).
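A minimal sketch of two such measures, block statistics and a per-block coordination (correlation) distribution; the block length is an assumption:

```python
# Block statistics and per-block coordination over long CERT streams.
# The 900-frame block (about 30 s at 30 fps) is an assumption.
import numpy as np

def block_stats(au_stream, block=900):
    """Summarize one AU channel as per-block mean, std, and max."""
    n = len(au_stream) // block
    blocks = au_stream[:n * block].reshape(n, block)
    return np.c_[blocks.mean(1), blocks.std(1), blocks.max(1)]

def coordination(a, b, block=900):
    """Distribution of per-block correlations between two AU channels."""
    n = min(len(a), len(b)) // block
    return np.array([np.corrcoef(a[i * block:(i + 1) * block],
                                 b[i * block:(i + 1) * block])[0, 1]
                     for i in range(n)])
```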
Show video: subject 30, mystery box, item 2.
Head direction-activity plot
[Plots: active child vs. calm child]
Finding a needle in a haystack – trajectory of synchronized and sequentially phased actions for one subject. 4 seconds of activity.
Comparing coordination of actions in an older and a younger child during the latency period of a mystery-box task: trajectories of frown and smile during a moment of “recognition”.
Social referencing? Returning gaze to the adult interviewer as puzzlement fades.