
Heriot-Watt University Research Gateway

Sound Emblems for Affective Multimodal Output of a Robotic Tutor: A Perception Study

Citation for published version: Hastie, H, Dente, P, Küster, D & Kappas, A 2016, Sound Emblems for Affective Multimodal Output of a Robotic Tutor: A Perception Study. in Proceedings of the 18th ACM International Conference on Multimodal Interaction. Association for Computing Machinery, pp. 256-260. https://doi.org/10.1145/2993148.2993169

Digital Object Identifier (DOI): 10.1145/2993148.2993169

Link: Link to publication record in Heriot-Watt Research Portal

Document Version: Peer reviewed version

Published In: Proceedings of the 18th ACM International Conference on Multimodal Interaction

Publisher Rights Statement: © ACM, 2016. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of ICMI 2016, http://doi.acm.org/10.1145/2993148.2993169

General rights: Copyright for the publications made accessible via Heriot-Watt Research Portal is retained by the author(s) and/or other copyright owners and it is a condition of accessing these publications that users recognise and abide by the legal requirements associated with these rights.

Take down policy: Heriot-Watt University has made every reasonable effort to ensure that the content in Heriot-Watt Research Portal complies with UK legislation. If you believe that the public display of this file breaches copyright please contact [email protected] providing details, and we will remove access to the work immediately and investigate your claim.

Download date: 15 May 2021


Sound Emblems for Affective Multimodal Output of a Robotic Tutor: A Perception Study

Helen Hastie
School of Mathematical and Computer Sciences
Heriot-Watt University, Edinburgh, UK
[email protected]

Pasquale Dente, Dennis Küster, Arvid Kappas
Department of Psychology and Methods, Jacobs University, Bremen, Germany
p.dente, d.kuester, [email protected]

ABSTRACT

Human and robot tutors alike have to give careful consideration as to how feedback is delivered to students to provide a motivating yet clear learning context. Here, we performed a perception study to investigate attitudes towards negative and positive robot feedback in terms of perceived emotional valence on the dimensions of ‘Pleasantness’, ‘Politeness’ and ‘Naturalness’. We find that, indeed, negative feedback is perceived as significantly less polite and pleasant. Unlike humans, who can leverage various paralinguistic cues to convey subtle variations of meaning and emotional climate, robots are presently much less expressive. However, they have one advantage: they can combine synthetic robotic sound emblems with verbal feedback. We investigate whether these sound emblems, and their position in the utterance, can be used to modify the perceived emotional valence of the robot feedback. We discuss this in the context of an adaptive robotic tutor interacting with students in a multimodal learning environment.

CCS Concepts

• Human-centered computing → Natural language interfaces; Auditory feedback; Empirical studies in interaction design

Keywords

Human-robot interaction, multimodal output, speech synthesis, synthesized sounds

1. INTRODUCTION

In the classroom, teachers use multiple ways and modalities to embed knowledge and provide feedback to students. Similarly, robotic tutors should use a range of modalities including language, gestures and non-verbal cues. These non-verbal modalities can be used to emphasize, extend or even replace the language output produced by a robot. In addition, non-verbal cues can make the learning experience more novel and enjoyable for the students and potentially help compensate for the uncanny valley effect.

There is much discussion in the literature on the pedagogical effectiveness of both negative and positive feedback [32]. There is, however, a consensus that whatever feedback is given, it needs to be carefully phrased [21] and communicated in a clear and unambiguous fashion [13].

The study described here has three stages. The first stage investigates whether negative feedback is perceived as having lower emotional valence in terms of ‘Pleasantness’, ‘Politeness’ and ‘Naturalness’. The second stage investigates whether, if this is the case, the addition of emotive sound emblems can be used to subtly adjust the valence of the feedback. Thirdly, we investigate the effect of the positioning of the sound emblems: they may be better placed at the end of an utterance, thus emulating punctuation, or at the beginning, where the emblem might be perceived as a spontaneous affect burst followed by a verbal elaboration [31].

This work is in the context of an affective robotic or virtual tutor. It has been shown that understanding affect in a learning environment is key to aiding and assisting student progression [22]. While it is understood that too much emotion is bad for rational thinking, recent findings suggest that so too is too little: intelligent functioning is hindered when basic emotions are missing in the brain [22]. Whilst progress is being made on robots that can manipulate facial expressions, gestures and language to portray emotion, some of the more affordable robotic platforms, such as the NAO robot, have limited facial expressiveness. In addition, endowing virtual agents with natural facial emotions can be computationally expensive and time-consuming to program. The work discussed here is important to the field as it gives insights into the use of sound emblems, a language- and platform-independent mechanism for expressing affect in a multimodal system.

2. EMPATHIC ROBOT TUTOR

The robotic tutor was designed as part of the FP7 Emote project (http://www.emote-project.eu) to be an empathic tutor that helps students aged between 11 and 13 years exercise their map-reading skills as part of the geography school curriculum. The set-up is multimodal in that a map-related task is presented to them on a large interactive touch-table or tablet [2]. The students are presented with a series of tasks, which can be solved by using their skills in reading directions, measuring distances and identifying symbols on a map. For instance, a sample task would be “Go North West for 550 meters and find a museum”. Figure 1 gives the architecture, whereby the robot uses sensors to determine the user’s affect and learner model, and adapts its interaction accordingly through the Interaction Manager [8]. The system generates personalised pedagogical feedback that is aligned with the user’s affect; for example, the robot gives hints and encouraging remarks if the user is getting frustrated. Here, we discuss a perception study that has informed the generation of this affective feedback as part of the Multimodal Output Manager module. All modules and other related resources are available from the Emote project website.
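This adaptive loop can be pictured in a few lines of code. The sketch below is purely illustrative and uses hypothetical names (infer_affect, select_feedback); it is not the Emote implementation, which is available from the project website.

```python
def infer_affect(sensor_readings):
    """Classify sensor input into one of the four broad affective
    states used by the system (hypothetical placeholder logic)."""
    # A real classifier would operate on multimodal sensor data.
    return "frustrated"


def select_feedback(affect, answer_correct):
    """Choose feedback valence, adding hints and encouragement
    when the user appears frustrated."""
    valence = "positive" if answer_correct else "negative"
    give_hint = (affect == "frustrated")
    return {"valence": valence, "give_hint": give_hint}


def interaction_step(sensor_readings, answer_correct):
    # Sensors -> inferred affect -> Interaction Manager decision ->
    # Multimodal Output Manager (speech, gesture, sound emblem).
    affect = infer_affect(sensor_readings)
    return select_feedback(affect, answer_correct)


print(interaction_step(sensor_readings={}, answer_correct=False))
# -> {'valence': 'negative', 'give_hint': True}
```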

Figure 1: Architecture of the Emote system, whereby the Multimodal Output Manager is informed by this study

3. PREVIOUS WORK

The ultimate goal of the design of socially intelligent robots is the development of systems capable of communicating and interacting with humans at a human-like level of proficiency. This includes the use of natural-sounding human voices or other types of sounds to portray emotional feelings [9]. In previous studies, a positive effect has been observed in robot-child interaction when the NAO robot expresses emotion through voice, posture, whole-body poses, eye colour, and gestures [33]. Other child-robot interaction studies have indicated that empathy facilitates interaction [2], and that human-based expressions can be successfully implemented by robots [18]. Finally, automatic feedback has been closely examined for tutoring systems [6]. However, little work has been done on non-human sounds and the effects they have on interaction, learning, and how the robot is perceived.

Some of the research in this field has relied on sounds that are rich in music-like qualities, e.g. [9], who produced a small set of five sounds to represent speech acts. Other researchers have focused on simpler sounds, such as linear modifications of pitch or duration of monophonic sounds [14, 16, 15]. Conceptually, much of the research in this field has been inspired by the sound design of science fiction movies. Movie portrayals of robots such as R2-D2 and WALL-E are reflected in recent approaches to the creation of non-verbal sounds by several authors as a means to replace or augment synthetic speech lacking paralinguistic cues [16, 23, 24]. The perception of abstract sounds, like other perceptual processes in psychology, can be expected to depend on further contextual information. As argued by [23], simple non-linguistic utterances can be used to shift the burden of interpretation to the human because such sounds have less semantic content than spoken language. Questions therefore arise as to whether verbal utterances with similar intended valence provide the necessary context to align with the intended valence of the sound emblem, and ultimately whether sound emblems, when combined with verbal utterances, can be used to shape the emotional climate of interaction between robots and humans.

4. METHOD

120 participants (28 female, M_Age = 31.02, SD_Age = 8.03) from 35 countries were recruited on Crowdflower's online platform (http://www.crowdflower.com). Each participant listened to and rated a set of 5 utterances randomly picked from the set of 30 stimuli, and was paid approximately US$1.50 on completion of each set. A total of approximately 5,500 ratings were collected.
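As a minimal sketch of this assignment procedure (the exact crowdsourcing logic is not given in the paper, and the stimulus identifiers here are placeholders):

```python
import random

STIMULI = [f"stimulus_{i:02d}" for i in range(1, 31)]  # the 30 stimuli


def draw_rating_set(rng=random):
    """Randomly pick 5 of the 30 stimuli for one paid rating set."""
    return rng.sample(STIMULI, k=5)


print(draw_rating_set())
```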

4.1 The Stimuli

The stimuli were composed of either a text-to-speech (TTS) generated utterance paired with a sound emblem, or a TTS utterance presented in isolation. Half of the utterances were negative tutor feedback and half positive feedback. Both sets of wording were chosen by learning and teaching experts and provide varying syllable length and varying levels of valence, e.g. from the high positive valence “Way to go!” to the more sedate “You got it right”. The TTS used was the Jack voice, a Received Pronunciation (RP) male voice from Cereproc (http://www.cereproc.com). The exact wording is given below:

• Positive: “Amazing Job”, “Excellent!”, “Way to go!”, “Wow!”, “You got it right”

• Negative: “No that’s 15 metres”, “That’s incorrect”, “OK, so that’s not quite right”, “No that’s a telephone”, “That solution is wrong unfortunately”

Each of the above-listed 10 utterances was matched with a sound emblem of similar intended valence. There were 3 versions of each of these utterances: one with the emblem before the utterance, one with the emblem after, and one with the speech-only utterance. The sound emblems are discussed in detail in the following section; they were not evaluated in isolation in this study as they have been previously validated for valence, as reported in [11]. Each stimulus was evaluated on three 7-point Likert scales, in order to measure perceived ‘Pleasantness’, ‘Naturalness’, and ‘Politeness’ (the dependent variables).
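The resulting design is simply the cross of the 10 wordings with the 3 emblem-position conditions. The sketch below reconstructs it from the lists above; the condition labels are ours, not the authors'.

```python
POSITIVE = ["Amazing Job", "Excellent!", "Way to go!", "Wow!",
            "You got it right"]
NEGATIVE = ["No that's 15 metres", "That's incorrect",
            "OK, so that's not quite right", "No that's a telephone",
            "That solution is wrong unfortunately"]
POSITIONS = ["emblem_before", "emblem_after", "speech_only"]

# 10 utterances x 3 position conditions = 30 stimuli in total.
stimuli = [(valence, text, position)
           for valence, texts in (("positive", POSITIVE),
                                  ("negative", NEGATIVE))
           for text in texts
           for position in POSITIONS]

assert len(stimuli) == 30
```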

‘Pleasantness’ was designed to be used as an estimate of emotional valence, to test whether negative robot feedback is perceived as more unpleasant than positive feedback. We decided to investigate ‘Politeness’ as politeness strategies are commonly employed in one-to-one tutoring interactions and have been shown to both enhance and inhibit learning [21]. Both the ‘Naturalness’ and ‘Politeness’ scales have been used in a number of perceptual tests evaluating the output of multimodal output modules and natural language generation systems, in particular to inform surface realisation for different user personalities [19] and preferred styles [3].



Figure 2: Digital version of the Affect Grid adapted from [27] as a single-item measure of valence and arousal. In this example, a judgment (indicated by the cross) has been made on the 9x9 grid, indicating a response of ‘7’ for valence and ‘8’ for arousal

4.2 The Emblems

Sound emblems are short nonverbal utterances that are conceptually separate from vocal “affect bursts” in that they are on the “pull” end of Scherer’s [29, 30] push-pull dimension. Scherer derived this conceptual distinction on the basis of prior work on the acoustic origin of facial expressions and nonverbal behavior [20, 10, 4], wherein the facial musculature “pushes” the upper vocal tract to produce “raw affect bursts”, e.g. laughter [31]. At the other end of this continuum are pull effects that require the shaping of an expression to conform to socio-cultural models [30]. Vocal emblems, e.g. “Yuck!”, are therefore conceptually much closer to verbal interjections than to raw affect bursts [31].

In the present research, we studied nonverbal sound emblems rather than verbal interjections or affect bursts because we aimed to interlink these emblems with the verbal utterances of a robotic tutor that can be seen as standing in a cultural tradition of robots such as R2-D2 and others that have made prominent appearances in movies. These types of representations have shaped collective cultural expectations about the sounds to be made by small humanoid-looking robots. We thus utilize the term sound “emblem” to emphasize the conceptual distance from raw affect bursts, as well as to denote the explicit and culturally shaped signaling aims for which they were originally created.

The sound emblems were selected from BEST, a publicly available sound database [11]. The BEST sounds were generated by 9 expert performers on an iPad via the Bebot app [1], a touch sound synthesizer. The database has been validated by 194 unique judges from 41 different countries. This evaluation comprised judgments for discrete emotions using the Differential Emotions Scale [7], as well as judgments for valence and arousal obtained on the basis of dimensional valence-arousal circumplex models [25, 26, 34]. To avoid subject fatigue during these validation studies, sounds were randomly distributed across 4 subsets, and each individual sound was evaluated by at least 40 subjects using a digital version of the 9x9 Affect Grid (see Figure 2), adapted from [27] to provide a fast and user-friendly single-item measure of valence and arousal. The Affect Grid is a frequently used single-item measure [28] that has been independently validated and found to be valuable in situations where more time-consuming measures would not be feasible [12].

Figure 3: Sound selection with BEST-IDs of positive and negative stimuli from the BEST dataset as a function of their valence and arousal ratings. Error bars represent standard error of the mean (SEM)

Five negative and five positive sound emblems were selected from the BEST database (see Figure 3) and assigned randomly to the respective negative and positive TTS utterances. The selection followed two criteria: the duration had to be shorter than 2s, and the sound had to score high (positive) vs. low (negative) on valence. After the selection, the sound files (TTS + acoustic emblems) were equalized and mixed with SoX.
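For illustration, stitching one such stimulus together with SoX might look as follows. This is a sketch only: the paper does not give the exact invocations, and all file names are hypothetical.

```python
import subprocess


def build_stimulus(emblem_wav, tts_wav, out_wav, emblem_first=True):
    """Concatenate a sound emblem and a TTS utterance into one stimulus.

    Listing several input files makes SoX concatenate them in order;
    --norm peak-normalises the output so levels are comparable.
    """
    parts = [emblem_wav, tts_wav] if emblem_first else [tts_wav, emblem_wav]
    subprocess.run(["sox", "--norm", *parts, out_wav], check=True)


# Hypothetical file names for the 'emblem before' condition:
build_stimulus("emblem_pos_01.wav", "tts_way_to_go.wav",
               "stim_pos_01_before.wav", emblem_first=True)
```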

5. RESULTS

We tested three hypotheses for each of the three rating scales (dimensions): I. (‘Valence’) whether there is indeed a significant difference between the ratings of what we defined as ‘positive’ and ‘negative’ stimuli in terms of our three dimensions of ‘Pleasantness’, ‘Politeness’ and ‘Naturalness’; II. (‘Emblem Addition’) whether adding an emblem reduces or increases the values on the three dimensions; and III. (‘Position Factor’) whether the position of the sound emblem, before or after the utterance, produces a significant difference in the judges’ attributions. In order to test the above-mentioned hypotheses, we carried out a repeated-measures ANOVA for each of the rating scales (dependent variables) separately. Furthermore, we conducted a series of post-hoc contrasts. For this purpose, all contrasts were computed using the Dunnett-Tukey-Kramer (DTK) pairwise multiple comparison test, adjusted for unequal variances and unequal sample sizes [17].
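In Python, the per-scale repeated-measures ANOVA could be sketched as below. This is illustrative only: the paper cites the DTK R package [17] for the post-hoc contrasts rather than a Python tool, and the file and column names here are hypothetical.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Hypothetical long-format table: one row per participant x stimulus,
# with the rating and the within-subject factor 'position'
# (sound_before / sound_after / utterance_only).
ratings = pd.read_csv("ratings.csv")

# One repeated-measures ANOVA per rating scale (dependent variable);
# replicate ratings are averaged within each cell.
for depvar in ["pleasantness", "politeness", "naturalness"]:
    result = AnovaRM(ratings, depvar=depvar, subject="participant",
                     within=["position"], aggregate_func="mean").fit()
    print(depvar, result)
```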

Figure 4 shows the rating results for the three dimensions. First, with regard to Hypothesis I, the results confirm that for ‘Valence’ there is a significant difference (p < 0.05) between the ratings of negative feedback and positive feedback on two of the three dimensions: ‘Pleasantness’, F(1, 119) = 4.92, and ‘Politeness’, F(1, 119) = 4.46. In other words, positive feedback is considered significantly more polite and pleasant than negative feedback, regardless of whether it contains a sound emblem or not. There were no significant effects for ‘Naturalness’; that is to say, negative and positive stimuli are considered to have the same ‘Naturalness’, which is to be expected given that the TTS engine and selection of sound emblems are the same across stimuli.


Figure 4: Mean ratings on a 7-point Likert scale for ‘Pleasantness’, ‘Politeness’ and ‘Naturalness’. Means and standard deviations are given

With regard to Hypotheses II and III, for all three dimensions the ANOVA showed a main effect (p < 0.05) of the ‘Position Factor’ (3 levels: ‘sound before’, ‘sound after’, ‘utterance only’; F(2, 238) = 28.88 for ‘Pleasantness’, F(2, 238) = 22.17 for ‘Politeness’, and F(2, 238) = 32.24 for ‘Naturalness’). The DTK post-hoc comparison shows that, for all three dimensions, the significant contrasts (p < 0.05) are between ‘utterance only’ vs. ‘sound after’ and between ‘utterance only’ vs. ‘sound before’, but not between ‘sound after’ vs. ‘sound before’. Therefore, for our participants, the stimuli presented in the ‘utterance only’ condition are perceived as more pleasant, natural and polite than the stimuli that pair an utterance with a sound emblem in either position. In other words, adding an emblem to an utterance (be it positive or negative) significantly reduces ‘Pleasantness’, ‘Naturalness’ and ‘Politeness’. For positive feedback, adding an emblem in either position does not increase the valence (i.e. make it more positive); in fact, surprisingly, it has the opposite effect and makes the feedback perceived as significantly more negative on all three dimensions (p < 0.05).

6. DISCUSSION AND FUTURE WORK

Our study suggests that, even if every attempt is made to make negative feedback sound pleasant and polite, it will still be deemed less so than positive feedback. This work can therefore be used to inform the design of robotic multimodal output, allowing the system designer to be aware of the emotional valence induced by the various types of feedback so as to define strategies to mitigate any negative consequences. Whilst the results concerning the use of sound emblems are somewhat unexpected, it may be that the binary negative/positive categorisation is too coarse, and that a more fine-grained and sophisticated pairing is required in terms of utterance content and length along with the intended valence of the sound emblems.

In addition, it is possible that users are so unused to hearing sound emblems of this type in everyday conversation that they were not sure how to judge them. This lack of general consensus on how to interpret the emblems is reflected in the standard deviations given in Figure 4, which are greater for those utterances containing a sound emblem. Furthermore, it should be noted that, while participants in the present study were informed that they would be listening to robot speech, they were given no concrete visual representation of the robot. Therefore, it is possible that basic differences in perceived ‘Naturalness’ between ‘with sound emblem’ and ‘without sound emblem’ stimuli may have been partially responsible for the overall reductions in ‘Pleasantness’ for the composite stimuli. In an embodied presentation situation, the sounds might thus have been perceived as more congruent with the TTS than was the case in the present online study.

One advantage of sound emblems is that they are language independent. However, further studies would need to investigate cultural differences along the lines of the studies reported in [5]. In addition, future work involves repeating the study with the target age group of children and with the physical embodiment of the robot.

In the Emote robotic tutor system, the multimodal output is currently categorised as binary negative or positive, and the choice of this feedback is triggered by the inferred affect of the user in terms of four broad affective states: frustrated, excited, bored and neutral. Future work would involve making the system adaptive at a more fine-grained level, both in terms of the affect it infers from the user and the multimodal output it generates. The study described here moves the field closer to such highly sensitive affective systems.

7. ACKNOWLEDGMENTS

This work was supported by the European Commission (EC) and was funded by the EU FP7 ICT-317923 project Emote. The authors are solely responsible for the content of this publication. It does not represent the opinion of the EC, and the EC is not responsible for any use that might be made of data appearing therein. We would like to acknowledge the other Emote partners for their collaboration in developing the system described here.


8. REFERENCES

[1] R. Black. Bebot - Robot Synth. Normalware, 2014.

[2] G. Castellano, A. Paiva, A. Kappas, R. Aylett, H. Hastie, W. Barendregt, F. Nabais, and S. Bull. Towards empathic virtual and robotic tutors. In Proc. of AIED, 2013.

[3] N. Dethlefs, H. Cuayahuitl, H. Hastie, V. Rieser, and O. Lemon. Cluster-based prediction of user ratings for stylistic surface realisation. In Proc. of EACL, 2014.

[4] P. Ekman and W. V. Friesen. The repertoire of nonverbal behavior: Categories, origins, usage, and coding. Semiotica, 1(1):49–98, 1969.

[5] H. Elfenbein and N. Ambady. On the universality and cultural specificity of emotion recognition: A meta-analysis. Psychological Bulletin, 128(2):203–235, 2002.

[6] A. C. Graesser, K. Wiemer-Hastings, P. Wiemer-Hastings, and R. Kreuz. AutoTutor: A simulation of a human tutor. Cognitive Systems Research, 1(1):35–51, 1999.

[7] C. E. Izard. The Psychology of Emotions. Springer Science & Business Media, 1991.

[8] S. Janarthanam, H. Hastie, A. Deshmukh, R. Aylett, and M. E. Foster. A reusable interaction management module: Use case for empathic robotic tutoring. In Proc. of SemDial, 2015.

[9] E.-S. Jee, Y.-J. Jeong, C. H. Kim, and H. Kobayashi. Sound design for emotion and intention expression of socially interactive robots. Intelligent Service Robotics, 3(3):199–206, 2010.

[10] H. G. Johnson, P. Ekman, and W. V. Friesen. Communicative body movements: American emblems. Semiotica, 15(4):335–354, 1975.

[11] A. Kappas, D. Kuester, P. Dente, and C. Basedow. Simply the B.E.S.T.! Creation and validation of the Bremen Emotional Sounds Toolkit. Poster presented at the 1st International Convention of Psychological Science, Amsterdam, the Netherlands, 2015.

[12] W. Killgore. The Affect Grid: A moderately valid, nonspecific measure of pleasure and arousal. Psychological Reports, 83(2):639–642, 1998.

[13] A. N. Kluger and A. DeNisi. The effects of feedback interventions on performance: A historical review, a meta-analysis, and a preliminary feedback intervention theory. Psychological Bulletin, 119:254–284, 1996.

[14] T. Komatsu. Can we assign attitudes to a computer based on its beep sounds? In Proc. of the Affective Interactions: The Computer in the Affective Loop Workshop at Intelligent User Interfaces 2005, 2005.

[15] T. Komatsu, K. Kobayashi, S. Yamada, K. Funakoshi, and M. Nakano. Augmenting expressivity of artificial subtle expressions (ASEs): Preliminary design guideline for ASEs. In Proc. of the 5th Augmented Human International Conference, 2014.

[16] T. Komatsu and S. Yamada. How does the agents' appearance affect users' interpretation of the agents' attitudes: Experimental investigation on expressing the same artificial sounds from agents with different appearances. Intl. Journal of Human-Computer Interaction, 27(3):260–279, 2011.

[17] M. K. Lau. DTK: Dunnett-Tukey-Kramer pairwise multiple comparison test adjusted for unequal variances and unequal sample sizes. R package.

[18] I. Leite, G. Castellano, A. Pereira, C. Martinho, and A. Paiva. Modelling empathic behaviour in a robotic game companion for children: An ethnographic study in real-world settings. In Proc. of HRI, 2012.

[19] F. Mairesse and M. Walker. Personality generation for dialogue. In Proc. of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), 2007.

[20] J. J. Ohala. The acoustic origin of the smile. The Journal of the Acoustical Society of America, 68(S1):S33, 1980.

[21] N. K. Person, R. J. Kreuz, R. A. Zwaan, and A. C. Graesser. Pragmatics and pedagogy: Conversational rules and politeness strategies may inhibit effective tutoring. Cognition and Instruction, 13(2):161–188, 1995.

[22] R. W. Picard, S. Papert, W. Bender, B. Blumberg, C. Breazeal, D. Cavallo, T. Machover, M. Resnick, D. Roy, and C. Strohecker. Affective learning — a manifesto. BT Technology Journal, 22(4):253–269, 2002.

[23] R. Read and T. Belpaeme. How to use non-linguistic utterances to convey emotion in child-robot interaction. In Proc. of the 7th Annual ACM/IEEE International Conference on Human-Robot Interaction, pages 219–220. ACM, 2012.

[24] R. Read and T. Belpaeme. Situational context directs how people affectively interpret robotic non-linguistic utterances. In Proc. of the 2014 ACM/IEEE International Conference on Human-Robot Interaction, pages 41–48. ACM, 2014.

[25] J. A. Russell. A circumplex model of affect. Journal of Personality and Social Psychology, 39(6):1161–1178, 1980.

[26] J. A. Russell. Emotion, core affect, and psychological construction. Cognition and Emotion, 23(7):1259–1283, 2009.

[27] J. A. Russell, A. Weiss, and G. A. Mendelsohn. Affect Grid: A single-item scale of pleasure and arousal. Journal of Personality and Social Psychology, 57(3):493–502, 1989.

[28] Y. I. Russell and F. Gobet. Sinuosity and the Affect Grid: A method for adjusting repeated mood scores. Perceptual and Motor Skills, 114(1):125–136, 2012.

[29] K. R. Scherer. On the symbolic functions of vocal affect expression. Journal of Language and Social Psychology, 7(2):79–100, 1988.

[30] K. R. Scherer. Affect bursts. In Emotions: Essays on Emotion Theory, pages 161–196, 1994.

[31] M. Schröder. Experimental study of affect bursts. Speech Communication, 40(1):99–116, 2003.

[32] P. Sharp. Behavior modification in the secondary school: A survey of students' attitudes to rewards and praise. 9:109–112, 1985.

[33] M. Tielman, M. Neerincx, J.-J. Meyer, and R. Looije. Adaptive emotional expression in robot-child interaction. In Proc. of the 2014 ACM/IEEE International Conference on Human-Robot Interaction, pages 407–414. ACM, 2014.

[34] M. Yik, J. A. Russell, and J. H. Steiger. A 12-point circumplex structure of core affect. Emotion, 11(4):705–731, 2011.