Nao Tech Day


Transcript of Nao Tech Day

Page 1: Nao Tech Day

Expressive Gestures for NAO

NAO TechDay, 13/06/2012, Paris

Quoc Anh Le- Catherine Pelachaud

CNRS, LTCI, Telecom-ParisTech, France

Page 2: Nao Tech Day

Le Quoc Anh & Catherine Pelachaud

Motivation


Importance of expressive gestures [Li et al., 2009]

• Communicating messages

• Expressing affective states

Relation between gesture and speech [Kendon, 2004]

• Two aspects of the same process of utterance

• They complement and supplement each other

Believability and life-likeness

• The robot should communicate in a human-like way (emotion, personality, etc.) [Fong, 2003]

Page 3: Nao Tech Day


Objectives

Generate communicative gestures for the Nao robot

• Integrated within an existing virtual-agent platform

• Nonverbal behaviors described symbolically

• Synchronization of gestures and speech

• Expressivity of gestures

GVLEX project (Gesture & Voice for Expressive Reading)

• The robot tells a story expressively

• Partners: LIMSI (linguistic aspects), Aldebaran (robotics), Acapela (speech synthesis), Telecom ParisTech (expressive gestures)


Page 4: Nao Tech Day


State of the art

Several recent initiatives:

• Salem and Kopp (2012): the ASIMO robot, the MAX virtual-agent framework, gestures described with MURML.

• Holroyd and Rich (2011): the Melvin robot, motion scripts in BML, simple gestures, feedback events to synchronize gestures and speech.

• Ng-Thow-Hing et al. (2010): the ASIMO robot, gesture selection, synchronization between gestures and speech.

• Nozawa et al. (2006): motion scripts in MPML-HP, the HOAP-1 robot.

Our system focuses on expressivity and on synchronizing gestures with speech, using a common, SAIBA-compliant platform [Kopp, 2006] for both Greta and Nao.


Page 5: Nao Tech Day


Our methodology

• Gestures are described with a symbolic language (BML)

• Gestural expressivity parameters (amplitude, fluidity, power, repetition, speed, stiffness, …)

• Gestures elaborated from a storytelling video corpus [Martin et al., 2009]

• The animation is executed by translating symbolic descriptions into joint values

Page 6: Nao Tech Day


Problem and Solution

Using a common framework to control both virtual and physical agents raises several problems:

• Different degrees of freedom

• Limited movement space and speed

Solution:

• Use the same representation language
  - Same algorithm for selecting and planning gestures
  - Different algorithms for creating the animation

• Elaborate one gesture repository for the robot and another one for the Greta agent

• Specify gesture movement space and velocity per agent

Page 7: Nao Tech Day


Steps

1. Build a library of gestures from a corpus of storytelling videos: the gesture shapes need not be identical across the human, the virtual agent, and the robot, but they have to convey the same meaning.

2. Use the GRETA system to generate gestures for Nao, following the SAIBA framework:
   - Two representation languages: FML (Function Markup Language) and BML (Behavior Markup Language)
   - Three separate modules: plan communicative intents, select and plan gestures, and realize gestures


[Diagram: the SAIBA pipeline — Text → Intent Planning → (FML) → Behavior Planning → (BML) → Behavior Realizer. In the GRETA system, a BML Behavior Realizer drives each agent.]

Page 8: Nao Tech Day


Global diagram


[Diagram: FML → BML → keyframes. Drawing on the lexicons (Nao Lexicon, Greta Lexicon), the pipeline performs gesture selection, planification of gesture duration, synchronization with speech, and modification of gesture expressivity.]

Page 9: Nao Tech Day


Nao Lexicon

Gesture velocity specification (minimal duration, Fitts's law)

Position (from\to) |     000      |     001      |     002      |     010      | …
       000         |      0       | 0.15:0.18388 | 0.25:0.28679 | 0.166:0.2270 | …
       001         | 0.15:0.18388 |      0       | 0.19:0.19552 | 0.147:0.2754 | …
       002         | 0.25:0.28679 | 0.19:0.19552 |      0       | 1.621:0.3501 | …
       010         | 0.166:0.2270 | 0.147:0.2754 | 1.621:0.3501 |      0       | …
        …          |      …       |      …       |      …       |      …       | …

Gesture annotations [Martin et al., 2009]

Gesture Lexicon

Gesture space specification
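The minimal durations above can be read against Fitts's law, which predicts movement time from travel distance and target width. A hedged sketch of such a velocity check — the coefficients `FITTS_A`/`FITTS_B`, the function names, and the sample numbers are illustrative assumptions, not the authors' values:

```python
import math

FITTS_A = 0.05   # seconds, startup offset (assumed)
FITTS_B = 0.12   # seconds per bit of difficulty (assumed)

def fitts_duration(distance: float, target_width: float) -> float:
    """Fitts's law: MT = a + b * log2(D / W + 1)."""
    index_of_difficulty = math.log2(distance / target_width + 1.0)
    return FITTS_A + FITTS_B * index_of_difficulty

def min_phase_duration(distance: float, target_width: float,
                       table_minimum: float) -> float:
    """Never schedule a movement faster than both the robot's measured
    minimum (from the lexicon table) and the Fitts prediction."""
    return max(table_minimum, fitts_duration(distance, target_width))
```

A planner would query `min_phase_duration` for each pair of symbolic hand positions before committing to a phase timing.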

Page 10: Nao Tech Day


Gesture Specification

Gesture->Phases->Hands (wrist position, palm orientation, hand shape,...) [Kendon, 2004]

Only stroke phases are specified; the other phases are generated automatically by the system.

<gesture id="greeting" category="ICONIC" min_time="1.0" hand="RIGHT">
  <phase type="STROKE-START" twohand="ASSYMMETRIC">
    <hand side="RIGHT">
      <vertical_location>YUpperPeriphery</vertical_location>
      <horizontal_location>XPeriphery</horizontal_location>
      <location_distance>ZNear</location_distance>
      <hand_shape>OPEN</hand_shape>
      <palm_orientation>AWAY</palm_orientation>
    </hand>
  </phase>
  <phase type="STROKE-END" twohand="ASSYMMETRIC">
    <hand side="RIGHT">
      <vertical_location>YUpperPeriphery</vertical_location>
      <horizontal_location>XExtremePeriphery</horizontal_location>
      <location_distance>ZNear</location_distance>
      <hand_shape>OPEN</hand_shape>
      <palm_orientation>AWAY</palm_orientation>
    </hand>
  </phase>
</gesture>

An example for a greeting gesture

Page 11: Nao Tech Day


Synchronization of gestures with speech

The stroke phase coincides with or precedes the emphasized words of the speech [McNeill, 1992]

The timing of the gesture's stroke phase is specified by sync points

Page 12: Nao Tech Day


Synchronization of gestures with speech

Algorithm:

• Compute the preparation phase

• Delete the gesture if there is not enough time

• Add a hold phase to fit the gesture's planned duration

• Co-articulation between consecutive gestures:
  - If there is enough time, insert a retraction phase (i.e., return to the rest position)
  - Otherwise, go from the end of the stroke directly to the preparation phase of the next gesture

[Timeline diagram: each gesture runs start → stroke-start (S-start) → stroke-end (S-end) → end; consecutive gestures share the transition from one stroke's end to the next gesture's preparation.]
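The scheduling rules above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the `Gesture` fields, the phase names, and the `travel_time` lookup (a stand-in for the robot's velocity table) are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Gesture:
    stroke_start: float  # sync point: stroke must start on the emphasized word
    stroke_end: float
    start_pos: str       # symbolic hand position at stroke start
    end_pos: str         # symbolic hand position at stroke end

def travel_time(a: str, b: str) -> float:
    """Stand-in for the minimal-duration lookup between two hand positions."""
    return 0.0 if a == b else 0.4

def schedule(gestures, rest_pos="rest"):
    """Plan phase timings for gestures sorted by stroke_start."""
    planned = []
    free_from, cur_pos = 0.0, rest_pos
    for i, g in enumerate(gestures):
        prep = travel_time(cur_pos, g.start_pos)
        if g.stroke_start - prep < free_from:
            continue  # not enough time to prepare: delete the gesture
        phases = [("preparation", g.stroke_start - prep, g.stroke_start),
                  ("stroke", g.stroke_start, g.stroke_end)]
        nxt = gestures[i + 1] if i + 1 < len(gestures) else None
        retract = travel_time(g.end_pos, rest_pos)
        if nxt is not None and (nxt.stroke_start
                                - travel_time(rest_pos, nxt.start_pos)
                                < g.stroke_end + retract):
            # co-articulation: hold, then go straight to the next preparation
            phases.append(("hold", g.stroke_end, g.stroke_end))
            free_from, cur_pos = g.stroke_end, g.end_pos
        else:
            phases.append(("retraction", g.stroke_end, g.stroke_end + retract))
            free_from, cur_pos = g.stroke_end + retract, rest_pos
        planned.append(phases)
    return planned
```

With two strokes close together, the first gesture ends in a hold and flows into the second's preparation; the last gesture retracts to rest.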

Page 13: Nao Tech Day


Gestural Expressivity vs. Affective States

A set of gesture dimensions [Hartmann, 2005]

• Spatial Extent (SPC): Amplitude of gesture movement

• Temporal Extent (TMP): Speed of gesture movement

• Power (PWR): Acceleration of gesture movement

• Fluidity (FLD): Smoothness and Continuity

• Repetition (REP): Number of stroke phases in a gesture movement

• Stiffness (STF): Tension/Flexibility

Example [Mancini, 2008]

Affective state | SPC  | TMP  | FLD  | PWR
Sadness         | Low  | Low  | High | Low
Happiness       | High | High | High | High
Anger           | High | High | Low  | High
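The qualitative table can be read as presets over the numeric expressivity scales defined on the following slides. A minimal sketch: the specific magnitudes (±0.6) are assumptions; only the Low/High directions come from the table [Mancini, 2008].

```python
# Map affective states to expressivity parameters in [-1, 1]
# (neutral behavior = all zeros). Magnitudes are illustrative.
EXPRESSIVITY_PRESETS = {
    "sadness":   {"SPC": -0.6, "TMP": -0.6, "FLD":  0.6, "PWR": -0.6},
    "happiness": {"SPC":  0.6, "TMP":  0.6, "FLD":  0.6, "PWR":  0.6},
    "anger":     {"SPC":  0.6, "TMP":  0.6, "FLD": -0.6, "PWR":  0.6},
}

def expressivity_for(state: str) -> dict:
    """Return expressivity parameters for an affective state; unknown states
    fall back to neutral (all zeros)."""
    neutral = {"SPC": 0.0, "TMP": 0.0, "FLD": 0.0, "PWR": 0.0}
    return EXPRESSIVITY_PRESETS.get(state, neutral)
```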

Page 14: Nao Tech Day


Spatial Extent (SPC)

A real number in the interval [-1 .. 1]

• Zero corresponds to a neutral behavior

• -1 corresponds to small, contracted movements

• 1 corresponds to wide, large movements

Preserving the meaning of the gesture:

• Each gesture distinguishes modifiable from unmodifiable dimensions

• Example: for a negation gesture, the vertical position is fixed
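One way to realize SPC while respecting unmodifiable dimensions is to scale each modifiable axis of the wrist target around the neutral pose. A sketch under assumed coordinates; `apply_spc` and the axis names are hypothetical, not the authors' implementation:

```python
def apply_spc(target, neutral, spc, fixed_dims=()):
    """Scale a wrist target around the neutral pose by SPC in [-1, 1].
    target/neutral: dicts mapping axis name -> position.
    Axes listed in fixed_dims carry the gesture's meaning and are untouched."""
    scaled = {}
    for axis, value in target.items():
        if axis in fixed_dims:
            scaled[axis] = value  # meaning-critical dimension: left unchanged
        else:
            # spc = 0 -> unchanged; spc = 1 -> amplitude doubled;
            # spc = -1 -> collapsed onto the neutral pose
            scaled[axis] = neutral[axis] + (value - neutral[axis]) * (1.0 + spc)
    return scaled

# For a negation-like gesture, widen horizontally but keep the vertical fixed:
wide = apply_spc({"x": 0.3, "y": 0.5}, {"x": 0.0, "y": 0.2},
                 spc=1.0, fixed_dims=("y",))
```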

Page 15: Nao Tech Day


Temporal Extent (TMP)

A real number in the interval [-1 .. 1]

• Zero corresponds to a neutral behavior

• Slow if the value is negative

• Fast if the value is positive

Page 16: Nao Tech Day


Power (PWR)

A real number in the interval [-1 .. 1]

• Zero corresponds to a neutral behavior

• More powerful movements correspond to higher acceleration

Affects hand shape (from closed to open):

• More relaxed/open if the value is negative

• A fist corresponds to 1

Affects the duration of stroke-phase repetitions

Page 17: Nao Tech Day


Fluidity (FLD)

A real number in the interval [-1 .. 1]

• Zero corresponds to a neutral behavior

• Higher values yield smooth, continuous execution of movements

• Lower values create discontinuities in the movements

Not yet implemented for Nao

Page 18: Nao Tech Day


Repetition (REP)

Number of stroke-phase repeats in a gesture movement

Tendency toward rhythmic repeats of specific movements

Each stroke coincides with an emphasized word or words of the speech

Page 19: Nao Tech Day


Animation Computation & Execution

Schedule and plan gesture phases

Compute expressivity parameters

Translate symbolic descriptions into joint values

Execute animation

• Send timed key positions to the robot using the available APIs

• The animation is obtained by interpolating between joint values with the robot's built-in proprietary procedures.
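The "timed key-positions" step maps naturally onto NAOqi's `ALMotion.angleInterpolation(names, angleLists, timeLists, isAbsolute)` call. The helper below and the sample keyframe values are illustrative assumptions, not taken from the authors' lexicon:

```python
def keyframes_to_naoqi(keyframes):
    """Convert {joint: [(time_s, angle_rad), ...]} into the three parallel
    lists that ALMotion.angleInterpolation expects."""
    names, angle_lists, time_lists = [], [], []
    for joint, frames in keyframes.items():
        names.append(joint)
        time_lists.append([t for t, _ in frames])
        angle_lists.append([a for _, a in frames])
    return names, angle_lists, time_lists

# Illustrative keyframes for a right-arm raise (joint names are real Nao
# joints; the angles/timings are made up for the example):
keyframes = {
    "RShoulderPitch": [(0.6, 0.4), (1.0, -0.8), (1.5, -0.8)],
    "RElbowRoll":     [(0.6, 0.3), (1.0, 1.0), (1.5, 1.0)],
}
names, angles, times = keyframes_to_naoqi(keyframes)

# On the robot (requires the NAOqi SDK and a reachable robot):
# from naoqi import ALProxy
# motion = ALProxy("ALMotion", robot_ip, 9559)
# motion.angleInterpolation(names, angles, times, True)  # absolute angles
```

The robot's controller then interpolates between these timed joint values, which is exactly the built-in procedure the slide refers to.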

Page 20: Nao Tech Day


Example

<bml>
  <speech id="s1" start="0.0">
    <!-- French: "And the third one says angrily: I'm very hungry!" -->
    \vce=speaker=Antoine\ \spd=180\
    Et le troisième dit en colère: \vce=speaker=AntoineLoud\ \spd=200\
    <tm id="tm1"/>J'ai très faim!
  </speech>
  <gesture id="hungry" start="s1:tm1" end="start+1.5" stroke="0.5">
    <PWR.value>1.0</PWR.value>
    <SPC.value>0.6</SPC.value>
    <TMP.value>0.2</TMP.value>
    <FLD.value>0</FLD.value>
    <STF.value>0</STF.value>
    <REP.value>0</REP.value>
  </gesture>
</bml>

<lexicon>
  <gesture id="hungry" category="BEAT" hand="BOTH">
    <phase type="STROKE" twohand="SYMMETRIC">
      <hand side="RIGHT">
        <vertical_location>YCenterCenter</vertical_location>
        <horizontal_location>XCenter</horizontal_location>
        <location_distance>ZMiddle</location_distance>
        <hand_shape>CLOSED</hand_shape>
        <palm_orientation>INWARD</palm_orientation>
      </hand>
    </phase>
  </gesture>
</lexicon>

<bml>
  <speech id="s1" start="0.0">
    <!-- French: "And the third one says sadly: I'm very hungry!" -->
    \vce=speaker=Antoine\ \spd=180\
    Et le troisième dit tristement: \vce=speaker=AntoineSad\ \spd=90\
    <tm id="tm1"/>J'ai très faim!
  </speech>
  <gesture id="hungry" start="s1:tm1" end="start+1.5" stroke="0.5">
    <PWR.value>-1.0</PWR.value>
    <SPC.value>-0.3</SPC.value>
    <TMP.value>-0.2</TMP.value>
    <FLD.value>0</FLD.value>
    <STF.value>0</STF.value>
    <REP.value>0</REP.value>
  </gesture>
</bml>

The same gesture prototype, rendered with different expressivity values: anger (first example) vs. sadness (second example).

Page 21: Nao Tech Day


Video demo: Nao tells a story

Page 22: Nao Tech Day


Conclusion


• A gesture model is designed and implemented for Nao, taking into account the physical constraints of the robot.

• Common platform for both virtual agent and robot

• Expressivity model

Future work

• Create gestures with different emotional colour and personal style

• Validate the model through perceptual evaluations


Page 23: Nao Tech Day


Acknowledgment


This work has been funded by the ANR GVLEX project.

It is supported by members of the TSI laboratory, Telecom-ParisTech.