
ORIGINAL PAPER

The Effect of Visual vs. Verbal Stimuli on Gesture Production

Autumn B. Hostetter • Catherine J. Skirving

Published online: 26 April 2011
© Springer Science+Business Media, LLC 2011

J Nonverbal Behav (2011) 35:205–223
DOI 10.1007/s10919-011-0109-2

A. B. Hostetter (&) • C. J. Skirving
Department of Psychology, Kalamazoo College, 1200 Academy Street, Kalamazoo, MI 49006, USA
e-mail: Autumn.Hostetter@kzoo.edu

Abstract  Previous research suggests that speakers gesture more when they are describing information learned in a visuo-spatial format than when describing information learned in a verbal format. The present study examined whether this difference in gesture is due to decreased verbal accessibility when information has been learned visuo-spatially or to the decreased accessibility of mental images when information has been learned verbally. Participants described the events of stories learned in two conditions. In one condition, the participants heard a verbal description twice. In the other condition, the participants heard the verbal description once and then watched an animated cartoon of the events. We found increased representational gestures when speakers had watched the cartoon in addition to hearing its description. Importantly, this increase in gesture does not appear to be due to differences in speech accessibility or output, suggesting that speakers gesture when they have a clear image in mind and not just when they do not have ready access to verbal labels.

Keywords  Gesture · Speech · Imagery

Introduction

Speakers often accompany their speech with hand and arm movements. While some of these movements, referred to as beat gestures, emphasize the importance or stress of certain words, others actually depict important aspects of a speaker's meaning. These movements that depict meaning are often referred to as representational gestures because they create a physical representation of the speaker's meaning (McNeill 1992). Not surprisingly, representational gestures occur most frequently with speech about topics that can be the most easily represented in physical space, namely spatial or motor topics (e.g., Alibali 2005; Feyereisen and Havard 1999; Krauss 1998). This co-occurrence of representational gestures and spatio-motor topics could arise for several reasons.

One possibility is that speakers produce representational gestures with spatial and motor topics because the gestures help speakers communicate clearly about these topics. For example, Beattie and Shovelton (1999) found that representational gestures are particularly good at communicating information about the relative size and location of objects. Further, in a recent meta-analysis, Hostetter (2011) found that gestures are more communicatively effective when they accompany speech about motor topics than when they accompany speech about abstract topics. It seems then that speakers may use representational gestures specifically when they are talking about spatial or motor topics because it is precisely in these speaking situations that gestures are the most beneficial to communication.

While tenable, the above communicative explanation is unlikely to explain fully the co-occurrence of representational gestures with spatial and motor topics. For instance, Alibali et al. (2001) found that speakers produced representational gestures more frequently with speech about spatial content regardless of whether their listener could see the gestures (see also Pine et al. 2010). Although it is possible that speakers simply get in the habit of producing gestures and imagine listeners in their head even when none are visibly present (de Ruiter 2000; Fridlund et al. 1992), it is also possible that speakers continue to gesture with spatial and motor content even in the absence of a visible audience because the gestures are not meant to be communicative. Indeed, Bavelas and colleagues (Bavelas et al. 2002, 2008) have shown that speakers design gestures to be more communicatively effective (e.g., the gestures are smaller and more redundant with speech) when a listener will see the gesture than when a listener will not see the gesture. Thus, while speakers may produce more elaborate or particularly communicative gestures when the gestures will actually communicate, it is unlikely that they always produce gestures for communicative reasons.

In fact, there is increasing evidence that, in addition to any communicative role they may play, gestures also reflect speakers' current cognitive states (e.g., Alibali et al. 1999; Kita and Ozyurek 2003; Wagner et al. 2004).

One prominent set of theories proposes that speakers may actually produce representational gestures as a way of facilitating their own speaking processes. For example, Krauss and colleagues (Krauss et al. 2000; Rauscher et al. 1996) have proposed that representational gestures might serve as cross-modal primes that serve to facilitate lexical access to spatial words and phrases. Alternatively, Kita and colleagues (Alibali et al. 2000; Kita 2000; Melinger and Kita 2007) have proposed that representational gestures might be an effective means of breaking a spatial and holistic image down into component pieces that can be described in the linear stream of speech. Both of these theories predict that speakers should produce more gestures when speech production is difficult than when it is easy.

There is evidence supporting this prediction. Speakers produce more representational gestures when they are speaking extemporaneously, for example, compared to when their speech has been rehearsed (Chawla and Krauss 1994). Further, representational gestures are more prevalent when speakers are describing figures that are difficult to name than when they are describing figures that are easy to name (Graham and Argyle 1975; Morsella and Krauss 2004). Finally, speakers also gesture more when there are more options about what to say than when there are fewer options (Hostetter et al. 2007; Kita and Davies 2009; Melinger and Kita 2007). Taken together, this evidence suggests that speakers produce […] motor topics. Although there is little direct evidence regarding the clarity of the verbal representation that underlies spatial and motor descriptions, it does seem likely that speakers form a mental image of spatial and motor information that might not be accompanied by a well-planned verbal representation (e.g., Reichle et al. 2000). This reliance on mental images over verbal representations may lead to increased speech production difficulties when describing spatial and motor information. For example, Bortfeld et al. (2001) found that speakers produced speech that was more disfluent when describing abstract spatial pictures, for which there are few clear words available, than when describing photographs of children, for which there is a more readily available vocabulary. Further, Roberts et al. (2009) found that speakers tended to produce more filled pauses and other speech disfluencies when describing a topic that contained a lot of spatial and motor information (e.g., a sport) than when describing a topic that contained less spatial and motor information (e.g., their job). Thus, it is possible that speakers have particular difficulty describing spatial and motor topics, and that this increased difficulty leads to increased gesture production.

Under this view, speakers should produce representational gestures with spatial topics less frequently when they have ready access to appropriate verbal labels than when they do not. Hostetter and Hopkins (2002) tested this hypothesis by comparing the representational gestures (which they called "lexical movements") produced by speakers who retold the events of an animated cartoon that they watched to those produced by speakers who retold the same events after reading about them. They found that speakers who had read the written description produced fewer gestures than speakers who had watched the cartoon. They interpreted these results as suggesting that speakers produce representational gestures when they do not have ready access to the words necessary to express their meaning.

However, there is also an alternative explanation for the Hostetter and Hopkins (2002) findings. Speakers who watched the animated cartoon not only had a weaker verbal representation of the events than speakers who read the description, but they also had a stronger spatial representation. Thus, the increase in representational gestures in the cartoon condition may be due not to more difficult lexical access and verbal planning, but to the presence of richer, clearer mental images in the mind of the speaker.

This latter interpretation is compatible with several theories that propose that representational gestures arise from mental images (e.g., de Ruiter 2000; Kita and Ozyurek 2003). One recently proposed theory, the Gesture as Simulated Action (GSA) framework, situates gesture within a larger embodied cognitive system to explain how it is that mental images give rise to representational gestures (Hostetter and Alibali 2008). In line with embodied theories of cognition (e.g., Barsalou 1999; Glenberg 1997), the GSA framework proposes that speakers frequently run neural simulations of the motor and spatial events they describe. These neural simulations utilize the same areas of motor and visual cortex that are involved in actually producing or viewing the described events. When these simulations and the corresponding activity in the motor cortex are particularly strong, the simulation is overtly expressed in the form of a representational gesture. Under this view, the correspondence between representational gestures and speech about spatial or motor topics is due to the strong action simulations that are often involved when speakers talk about imagistic topics.

For example, consider a speaker who is talking about a triangle. According to the GSA […] as she automatically activates information about how to physically interact with, reach for, or touch the triangle (Goodale 2001; Tucker and Ellis 1998). As a result of this motor cortex activation, the speaker might produce a representational gesture that enacts this motor simulation. However, if the speaker does not actually visualize the triangle she is talking about, then her visual cortex, and in turn her motor cortex, will not be activated, and a representational gesture will not be produced.

The claim at the center of the GSA framework (and other theories that propose an imagistic source of gesture) is thus that representational gestures occur when speakers describe events that they are thinking about imagistically. In contrast, when speakers describe events by activating verbal codes, without activating the images that are associated with those verbal codes, they should produce fewer representational gestures. According to this view, the results reported in Hostetter and Hopkins (2002) are due to the strong reliance on visual images when speakers have watched the cartoon, rather than to the decreased access to verbal codes. The actual events that were viewed in the cartoon were visualized and mentally recreated by the speaker, thus leading to stronger simulations and more representational gestures. Of course, it is likely that some events are visualized by speakers even when they have only read about the events, as there is much research to suggest that language comprehension can also rely on perceptual simulations (Glenberg and Kaschak 2002; Spivey et al. 2005; Zwaan et al. 2002). However, the simulations involved in thinking about a verbal description may occur less frequently and involve weaker motor activation than the simulations involved in thinking about perceived images. Indeed, Willems et al. (2010) found that more areas of primary motor cortex were activated when participants imagined themselves performing a particular action than when they merely read a word describing the action.

This is a different explanation for the Hostetter and Hopkins (2002) results than that proposed by theories that focus on the role of representational gestures in facilitating speech production. The two views are not incompatible, and indeed, Hostetter and Hopkins may have observed the combined effect of having both a weak verbal representation and a strong spatial representation after watching the cartoon. However, the GSA framework predicts that even when verbal labels are readily accessible in both conditions, speakers will still produce more representational gestures when they have a strong spatial representation of the events than when they do not. The purpose of the present study is to test this claim.

Toward this aim, speakers in the present study described the events that occurred in animated cartoons that they had learned in two different conditions. In both conditions, participants first learned about the events by hearing a verbal description. In the verbal condition, they then heard this verbal description a second time before retelling the events. In the spatial condition, they watched an animated cartoon of the events before retelling them. In this way, participants were exposed to the appropriate verbal labels and linguistic structures for describing the events in both conditions, and they also had two exposures to the events in both conditions. However, in the spatial condition, they also actually saw images corresponding to the events, while in the verbal condition they did not. The GSA framework predicts that this exposure to images will lead speakers to produce more representational gestures in the spatial condition than in the verbal condition.

The GSA framework makes no predictions regarding the effect of the present manipulation on beat gestures (non-representational gestures that appear to serve a communicative […] speech is more likely to be repaired in one condition over the other. Note that these two views actually make completely opposite predictions about which condition should see the most beats. If beat gestures are associated with strong verbal representations, we might expect more beats in the verbal condition. If beats are associated with verbal repairs during speech production, we might expect more beats in the spatial condition, particularly if there is evidence that speakers have more difficulty producing speech in that condition.

Method

Participants

Thirty-seven undergraduate students (22 female) at Kalamazoo College volunteered to participate in exchange for extra credit in their General Psychology course. All participants were native English speakers, with 73% self-identifying as Caucasian, 10% as African-American, 5% as Asian-American, and 5% as multi-racial. The remaining 5% preferred not to specify their ethnicity.

Stimuli and Materials

Four cartoon clips from the German series "Die Sendung mit der Maus" were shown in the spatial trials of the experiment. The clips ranged in length from 34 to 75 s, and each clip portrayed a big orange mouse and a little blue elephant interacting with each other and their environment. Music and sound effects accompanied the events in each cartoon, but there was no spoken dialogue. Cartoons were presented in color on a Macintosh iBook computer with a 14-inch screen.

A verbal description was written for each cartoon. To create these verbal descriptions, four research assistants viewed the cartoons and wrote a description of each. These descriptions were then combined into a single composite verbal description for each cartoon by the lead investigator. Any event that was included in at least two of the descriptions was included in the composite. Further, for each event that was included in the composite, the lead investigator chose the wording from the descriptions that she perceived as the clearest. Five undergraduate research assistants then reviewed the composite descriptions, and small wording changes were implemented based on their feedback regarding clarity of expression. The final composite descriptions ranged from 125 words to 201 words in length. A fifth cartoon was used as the basis for a practice story. A composite description was also created for this cartoon that was 92 words long. The complete verbal descriptions can be seen in the "Appendix".

Procedure

Participants arrived individually for a study about spatial memory and communication, but they did not know that gestures were the primary interest. Participants were greeted by one of five undergraduate female experimenters. All five experimenters followed a precise script when interacting with participants, and each experimenter tested between 3 and 11 participants.

Experimenters first read the practice story out loud twice and asked the participants to retell it in their own words, as if they were telling it to someone who had never heard it before. Following the participants' description, the experimenters gave feedback about the level of detail and memory that was expected during the experiment. Typically, this simply involved assuring the participants that they had done a good job and encouraging them to continue in a similar manner for the remaining stories. Occasionally, when a participant gave a particularly terse description, the experimenters simply reminded him or her that the goal was to explain the story to someone who had never heard it before, so including a few more details would be helpful.

The participants then completed the four experimental trials. Approximately half of the participants (n = 19) completed the first two trials in the spatial condition and the second two in the verbal condition. The remaining participants experienced the reverse order. Additionally, the order of the four stories was counterbalanced across participants. The four stories were divided into two pairs that each contained one long story (>60 s) and one short story (<60 s); these pairs then alternated between being presented first or second in the trial sequence. Thus, there were four possible Condition × Story orders to which participants were randomly assigned. Distribution of participants across these four orders was approximately equal, with nine or ten participants in each order. Further, the four orders were also randomized across experimenters; each experimenter administered each of the four Condition × Story orders with similar frequency.
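To make the counterbalancing concrete, the following is a minimal Python sketch (not part of the original materials) that generates the four Condition × Story orders described above. The story names come from Table 1, but which stories formed the long/short pairs is not stated in the text, so the pairings below are assumptions.

```python
# Illustrative sketch of the four Condition x Story orders described above.
# The long/short story pairings are hypothetical: the paper does not state
# which two stories formed each pair.
pair_a = ["Pancake", "Banana"]   # assumed pair: one long and one short story
pair_b = ["HighBar", "Pottery"]  # assumed pair: one long and one short story

orders = []
for condition_order in (("spatial", "verbal"), ("verbal", "spatial")):
    for story_pairs in ((pair_a, pair_b), (pair_b, pair_a)):
        trials = []
        # Two trials in the first condition, then two in the second.
        for condition, pair in zip(condition_order, story_pairs):
            trials.extend((condition, story) for story in pair)
        orders.append(trials)

for i, order in enumerate(orders, start=1):
    print(f"Order {i}: {order}")
```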

In the verbal trials, the experimenters read the description out loud twice to the participant, as they had done in the practice trial. They paused briefly between the two readings, saying "now I will read the story a second time." Following the second reading, participants retold the story in their own words. In the spatial trials, the experimenters first read the verbal description of the cartoon to the participant, exactly as they did for the first portion of the verbal trials. Following this first reading, the experimenters paused and said "now I will show you a cartoon of the story" and pressed play on the computer to begin the cartoon. After viewing the cartoon, the participants retold the events of the story in their own words, just as they had done in the verbal trials. The experimenter remained oriented toward the participant at all times and gave brief nods or smiles of understanding throughout all four of the participants' descriptions.

Before each description, the experimenter pressed "Record" on an audio-recorder. There was also a hidden video camera that recorded the participants' descriptions and accompanying gestures. The camera was hidden so that participants did not become self-conscious about their gestural behavior or become suspicious that gesture was the purpose of the study. The use of a hidden camera is in line with other studies investigating speech-accompanying gestures (e.g., Alibali et al. 2001; Hostetter et al. 2007). All participants were debriefed about the interest in gesture and the presence of the video camera at the end of their participation, and all consented to have their video data included in the study. The entire procedure lasted about 20–25 min. These methods were approved by the Institutional Review Board at Kalamazoo College.

Data Coding

The descriptions were transcribed verbatim and broken into clauses. We coded each clause […] and compared the verb each participant used to describe each key event to the verb that had been used to describe that event in the verbal description. Tense changes were considered to be the same verb, but all other changes were coded as different. Table 1 displays the key events from each story, as well as examples of descriptions that used the same verb and descriptions that used a different verb.

We also counted the number of descriptive words used in each description. Descriptive words were defined as words that modified a noun or verb, either describing size (e.g., small, little, big), color (e.g., blue, green, orange), speed (e.g., fast, slowly), position (e.g., back pocket, right side), emotional state (e.g., frustrated, mad), or how something happened or sounded (e.g., honking noise; fixed magically). These descriptive words were then further divided into descriptive words that described visual or spatial properties, which included size, color, speed, and position words, and descriptive words that described non-visual properties, which included descriptions of emotions, noises, or events.

In addition to coding speech content, we also counted the number of speech disfluencies that occurred in each description. We recognized five types of speech disfluencies: unfilled pauses, in which the speaker paused his or her speech for one second or more; filled pauses such as um, uh, or er; restarts, in which the speaker stopped mid-clause and restarted with a new syntactic frame (e.g., "and then the elephant came- well the mouse was doing flips"); repairs, in which the speaker changed a word within a syntactic frame (e.g., "and then he was spin- flipping over the bar"); and repetitions, in which the speaker repeated a word or part of a word (e.g., "the the mouse came in").

The manual gestures accompanying each description were also noted and coded as representational or beat by one primary coder. Although this coder had acted as an experimenter for a subset of the participants, she was unable to discern the experimental condition of each description during coding. Hand movements that conveyed semantic information were coded as representational gestures. For example, a movement where the speaker's right hand was held in a loose fist that flipped up at the wrist along with the words "he tried to flip the pancake" was coded as a representational gesture. Representational gestures are often divided into smaller categories, including deictic gestures, which denote objects or locations in space, and metaphoric gestures, which depict abstract ideas. Deictic gestures were rare in this dataset but were coded as representational when they did occur (e.g., "the mouse came in" with a point to the left). Metaphoric gestures did not occur in this data set.

Simple movements that did not convey semantic information were coded as beat gestures. For example, a simple up and down movement with the word waste in the clause "there was a wastepaper basket" was coded as a beat gesture.

Table 1  Key events from each of the four stories

Story   | Key event                                     | Example description with same verb              | Example description with different verb
--------|-----------------------------------------------|-------------------------------------------------|------------------------------------------------------
Banana  | Mouse tosses the banana in the air            | He tosses it in the air                         | He throws it in the air
Banana  | The peel bounces back out of the basket       | The peel bounces back out                       | And the peel comes back out of the basket
Banana  | He turns the basket upside down               | Then turns the wastepaper basket over           | Then he flips it over
HighBar | The mouse starts to swing back and forth      | And then swings back and forth slowly at first  | And starts moving back and forth
HighBar | The elephant applauds                         | And the little elephant um applauds by stomping | The elephant claps with both his back and front feet
HighBar | The man skips underneath the bar              | Um and he skips under the bar                   | And walks underneath the gymnastics bar
Pancake | Tries to flip the pancake                     | He tries to flip the pancake                    | He decides to try turning it over
Pancake | It lands on top of the mouse's face           | And this time it lands on his face              | But instead it falls on his face
Pancake | The mouse signals for the elephant to back up | And um signals for the elephant to move back    | He tells the elephant to back up
Pottery | He places a bucket full of clay on the bench  | And places a bucket of clay behind the wheel    | He sets the bucket on the bench
[…]

Reliability

A second coder randomly selected 14 participants and compared the verbs used by these participants to describe each of the key events against those used in the verbal description. Reliability with the primary coder was nearly perfect (Cohen's κ = .99). One verbal description from each of 14 participants (approximately 11% of the data) was also randomly chosen for coding reliability of speech disfluencies and reliability for coding spatial and non-spatial descriptors. Reliability was 82% for coding unfilled pauses and 89% for coding other speech disfluencies (filled pauses, restarts, repairs, and repetitions). Reliability was 86% for recognizing descriptive terms and 100% for classifying each descriptor (N = 93) as spatial or non-spatial.

To establish reliability for gesture coding, one description from 14 participants was randomly chosen and coded by a second coder. Agreement was 89% for segmenting the stream of manual activity into distinct gestures. Agreement was 95% for classifying each gesture (N = 172) as representational or beat (Cohen's κ = .84). For all reliability coding, the second coder was blind to the experimental condition of each description.
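For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Cohen's kappa can be computed with scikit-learn. The coder labels are invented for illustration; the study's gesture reliability set contained 172 gestures.

```python
# Minimal illustration of the chance-corrected agreement statistic (Cohen's
# kappa) used in the reliability analyses. Labels here are invented; the
# study's gesture reliability set contained 172 gestures.
from sklearn.metrics import cohen_kappa_score

coder_1 = ["rep", "rep", "beat", "rep", "beat", "rep", "rep", "beat"]
coder_2 = ["rep", "rep", "beat", "rep", "rep", "rep", "rep", "beat"]

kappa = cohen_kappa_score(coder_1, coder_2)
print(f"Cohen's kappa = {kappa:.2f}")
```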

Results

The main hypothesis in the present study is that speakers will produce more representational gestures when they have seen visual images of the events they describe than when they have only heard the events described verbally. We predict that this effect will occur because of the stronger spatial representations formed by participants after watching the cartoon, rather than because of any differences in the speech produced in the two conditions. Thus, before analyzing the gestures produced in each condition, we examined whether the speech produced in the two conditions was comparable. We then used the results of the speech analyses to introduce appropriate controls into the gesture analyses that ensure that any differences in gestures are not the byproduct of differences in speech. For all analyses reported here, experimenter, condition order, and story order were initially included as additional between-subjects factors. There were no significant main effects or interactions involving these factors […]

Analysis of Speech

Previous research has found that speakers produce more words when describing events they have learned by watching a cartoon than when describing events they have learned by reading a verbal description (Parrill et al. 2010). As a result, the absolute number of gestures produced by speakers may be greater after watching a cartoon simply because the increased amount of speech affords more opportunities to gesture. We compared the average number of words spoken by participants in the two conditions with a paired samples t test and found a significant difference at the .10 level, t(36) = 1.80, p = .08 (two-tailed). Although this difference is not significant at the conventional .05 level, it does suggest caution in interpreting any difference in gesture frequency. Because speakers produced marginally more words after watching the cartoon (M = 153.60, SD = 42.78) than after hearing the description (M = 143.07, SD = 19.57), they may simply have more opportunities to gesture with the longer descriptions that follow the cartoon. We will therefore limit our gesture analyses to comparisons of gesture rates per 100 words.
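The rate normalization described above is simple arithmetic, and the condition comparison is a paired samples t test; the sketch below illustrates both with SciPy. It is not the authors' analysis code, and all data values are invented.

```python
# Minimal sketch of the rate normalization described above: raw gesture
# counts are divided by description length and scaled to a per-100-word rate.
# All numbers are invented for illustration.
from scipy.stats import ttest_rel

def rate_per_100_words(n_gestures: int, n_words: int) -> float:
    """Gestures per 100 words for one description."""
    return 100.0 * n_gestures / n_words

# Per-participant rates in each condition (hypothetical values).
spatial = [rate_per_100_words(g, w) for g, w in [(10, 154), (12, 160), (8, 148)]]
verbal = [rate_per_100_words(g, w) for g, w in [(7, 143), (9, 150), (6, 139)]]

# The paper compares condition means with a paired samples t test.
t_stat, p_value = ttest_rel(spatial, verbal)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```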

In addition to producing more total speech after watching the cartoon, it is also possible that speakers were more likely to describe events that they had not heard described in the verbal description after watching the cartoon. The cartoon includes many details, such as the pace of the background music and the relative sizes and positions of the characters on the screen, that were not included in the description read to the participants. Although these details are generally unimportant to the events in the story, participants may have been more likely to talk about them after watching the cartoon than after only hearing the description, and this could have implications for the gesture analyses. We compared the average number of extra events described by participants in the two conditions with a paired samples t test. Speakers did in fact describe more extra events per story in the spatial condition (M = 1.60, SD = 1.92) than in the verbal condition (M = .24, SD = .38), t(36) = 4.10, p < .001. Thus, it is possible that any difference in gesture rates between the two conditions could be the result of the increase in extra events being described after watching the cartoon, rather than the direct effect of seeing the images in the cartoon. We will therefore analyze gesture rates in two ways: once based on the complete descriptions produced by participants and once with the extra events and corresponding gestures removed from each participant's descriptions.

The above analyses address the amount of speech produced in each condition and suggest that there are systematic differences across conditions. Is the quality and content of speech also different across conditions? Speakers may have been more likely to use descriptive terms in their descriptions when they had seen the cartoon than when they had not. Although the verbal descriptions did include words that described both spatial characteristics of the events, such as the size and color of the characters, and non-spatial characteristics, such as the sound of the elephant's noise or that the mouse was frustrated, these characteristics may have been more salient in the speakers' minds when they had seen the cartoon than when they had only heard the events described. If speakers were more likely to include descriptive terms in their descriptions when they saw the cartoon than when they did not, any difference in gesture rates could be due to this difference in descriptive content.

To examine this possibility, we calculated the proportion of words in each participant's […] repeated measures analysis of variance (ANOVA). Speakers in both conditions produced a higher proportion of words describing spatial characteristics (M = .016, SD = .009) than non-spatial characteristics (M = .003, SD = .002), F(1, 36) = 83.51, p < .001. Importantly, however, there was no effect of condition, F(1, 36) = .95, p = .34, and no Condition × Type interaction, F(1, 36) = 1.15, p = .29. This suggests that speakers were not more likely to use spatial or non-spatial descriptor terms when they had seen the events in the cartoon than when they had only heard them described.

Although the descriptions do not differ in descriptive content across the conditions, the content of participants' speech may still differ in other meaningful ways across the conditions. Specifically, does the content differ in terms of lexical accessibility, which is known to affect representational gesture rates (e.g., Rauscher et al. 1996)? In an attempt to prime speakers in both conditions with appropriate verbal labels and reduce speech production demands, we exposed participants to a verbal description of the events in both conditions. However, in the verbal condition, participants heard these labels twice, while in the spatial condition, they only heard them once. While this ensured that participants had an equal number of exposures to the events in both conditions, it is possible that the verbal labels were more accessible when participants had heard the description twice in the verbal condition than when they heard it only once in the spatial condition. As a result, speakers may have had a more difficult time describing the events in the spatial condition because the verbal labels were less accessible, and this increased difficulty in speech production could lead to increased gesture production. To address this possibility, we examined whether speakers did in fact have an easier time describing the stories in the verbal condition than in the spatial condition in two ways. First, we considered the rate of speech disfluencies produced in each condition. Second, we considered how likely speakers were to repeat verbatim the verbs used in the initial verbal description for several key events in each story.

Speech disfluencies are frequently taken as a sign that speakers are having trouble finding the words or grammatical structures needed to convey their message (e.g., Schachter et al. 1991), and the production of speech disfluencies often co-occurs with the production of representational gestures (Christenfeld et al. 1991). If speakers did have a more difficult time describing the events in the spatial condition than in the verbal condition, then this should be evident in the rate of speech disfluencies produced in the two conditions. We calculated the number of speech disfluencies produced per 100 words in each condition and compared them with a paired samples t test. There was no significant difference, t(36) = .11, p = .91. Speakers produced just as many disfluencies per 100 words when describing stories they had heard described twice in the verbal condition (M = 3.03, SD = 2.51) as they did when describing stories they had only heard described once in the spatial condition (M = 2.97, SD = 3.03). This suggests that any increase in gesture rates in the spatial condition is not the result of increased speech production difficulty in that condition.

We next considered how likely speakers were to repeat the exact verb they had heard in the initial verbal description in their own description. We chose to focus on verbs because there is much evidence that speakers are particularly likely to gesture when describing motor events (e.g., Feyereisen and Havard 1999; Hostetter and Alibali 2007). Thus, speakers might be particularly likely to gesture when they are coming up with a verb for a […] We compared the proportion of key events described with the same verb in the two conditions with a paired samples t test. Speakers repeated the same verb used in the initial description more frequently when they had heard that description twice in the verbal condition (M = .59, SD = .18) than when they had heard the description only once in the spatial condition (M = .47, SD = .20), t(36) = 2.69, p = .01. Put another way, speakers were more likely to deviate from the verbs used in the verbal description they heard when they had also seen the cartoon than when they had only heard the verbal description.

This suggests that speakers were choosing words more extemporaneously in the spatial condition than in the verbal condition. Although the analysis of speech disfluencies suggests that speakers did not have any more trouble coming up with the novel words in the spatial condition than they did remembering the exact words in the verbal condition, it is possible that the act of generating new verbal labels in the spatial condition could lead to increased gesture production. For example, previous research has shown that speakers gesture more when speaking extemporaneously than when speaking from a rehearsed script (Chawla and Krauss 1994). To account for this potential confound, we will also consider whether any difference in gesture rates across conditions is stronger when speakers are using a different verb to describe an event than when they are using the same verb they heard in the verbal description.

Analysis of Gesture

Gesture rates per 100 words were analyzed with a 2 (condition: spatial vs. verbal) × 2 (type of gesture: representational vs. beat) repeated measures ANOVA with both condition and type of gesture as repeated measures. A main effect of condition was found, F(1, 36) = 9.19, p < .01, as well as a main effect of type of gesture, F(1, 36) = 49.29, p < .001. The interaction between condition and type of gesture was also significant, F(1, 36) = 5.88, p < .05. As predicted, speakers produced more representational gestures per 100 words when they had seen the cartoon version of the events (M = 6.51, SD = 4.74) than when they had only heard the cartoon's events read to them (M = 4.62, SD = 4.53), but the rate of beat gestures produced was comparable between the conditions (spatial: M = 1.32, SD = 1.81; verbal: M = 1.09, SD = 1.56). See Fig. 1.

[Fig. 1  Average gesture rates per 100 words produced in the spatial and verbal conditions. Error bars represent standard errors of the means.]

However, as noted above in the speech analyses, speakers were more likely to mention events that they had not heard about in the verbal description when they had also seen the cartoon. It is possible, then, that speakers produced more representational gestures after watching the cartoon because they were mentioning more events that they had not heard verbally described. To rule out the possibility that the description of extra events in the spatial condition is driving the increased gesture rates, a second 2 (condition: spatial vs. verbal) × 2 (type: representational vs. beat) repeated measures ANOVA was conducted, this time limiting the analysis to gestures that accompanied speech about events that had been included in the initial verbal description heard by participants. The pattern of results mirrored those from the analysis of the complete dataset. There were main effects of condition, F(1, 36) = 10.13, p < .01, and type of gesture, F(1, 36) = 48.08, p < .001, as well as an interaction between condition and type of gesture, F(1, 36) = 7.61, p < .01. Even when the analysis is limited to speech about the events that had been explicitly mentioned in the verbal description, speakers still produced representational gestures at a higher rate when they had seen the cartoon (M = 6.65, SD = 4.91) than when they had not […]

Finally, the speech analyses also revealed that speakers were more likely to use the same verb they had heard in the verbal description when describing stories they had heard described twice in the verbal condition than when describing stories they had heard described only once in the spatial condition. Thus, it is possible that the increased representational gestures in the spatial condition are due to the more extemporaneous speech produced in that condition rather than to the experience of seeing the images. To examine this possibility, we calculated the representational gesture rates per 100 words produced by each speaker as they described the key events in the spatial condition and in the verbal condition. We calculated a separate rate within each condition for clauses in which speakers used the same verb as the verb they had heard in the initial description and for clauses in which speakers used a different verb. We then compared these rates in a 2 (condition: spatial vs. verbal) × 2 (verb: same vs. different) repeated measures ANOVA. If the increase in gesture rates in the spatial condition is due primarily to increased deviation from the words heard in the description, then there should be a Condition × Verb interaction, such that the effect of condition is stronger when speakers are using different verbs than when they are using the same verbs. However, this was not the case, F(1, 33) = .27, p = .61. Speakers gestured at a higher rate in the spatial condition than in the verbal condition regardless of whether they were using the same verb they had heard previously or a different verb, as evidenced by the main effect of condition, F(1, 33) = 6.48, p = .02. There was no main effect of verb, F(1, 33) = .60, p = .45. See Fig. 2.

[Fig. 2  Average representational gesture rates per 100 words produced with clauses about key events in the spatial and verbal conditions. Error bars represent standard errors of the means.]

Discussion

The present experiment provides evidence that speakers gesture more when they have seen […]

First and foremost, Hostetter and Hopkins (2002) only exposed participants to appropriate verbal labels in one condition. Thus, it is difficult to know whether their results were due to the lack of verbal labels in the spatial condition or to the presence of spatial images in the spatial condition. The present study exposed participants in both conditions to appropriate verbal labels, thereby reducing this confound. Speakers gesture more when they have seen images that correspond to events than when they have not, even when they have also heard the events described verbally.

Further, Hostetter and Hopkins (2002) did not consider or compare the speech produced by speakers in the two conditions, leaving the possibility that the differences they observed in gesture production were the byproduct of differences in speech. Specifically, if speakers generally talk more after viewing a cartoon than after reading a description, then they may simply have more opportunities to gesture when they are describing a cartoon they have seen. Indeed, the present analysis aligns with other recent work (Parrill et al. 2010) and suggests that speakers do talk more when they have seen a visual image than when they have only read or heard a description. Thus, in the present analysis, we controlled for the number of words produced in each condition by comparing gesture rates rather than raw gesture frequency. The significant difference in gesture rates per 100 words suggests that speakers gesture more after seeing images than after hearing verbal descriptions, even when the total amount of speech produced is controlled for.

This finding conflicts with a recent report by Parrill et al. (2010), in which they found that when the number of words produced by speakers is controlled, gesture rates are not different between speakers who have seen a cartoon and speakers who have only read about the cartoon's events. There are several potential reasons for the discrepancy, not all of which will be discussed here. However, perhaps the most prominent difference between the two procedures is that the present study used a within-subjects design while Parrill et al. used a between-subjects design. There are large individual differences in gesture rates across speakers (e.g., Hostetter and Alibali 2007), which can make it difficult […]

In addition to controlling for the amount of speech produced across conditions, the present study also considered and controlled for differences in the content of the speech produced in the two conditions. Specifically, extraneous spatial details such as the relative size and location of the characters are only apparent from watching the cartoon, and such details may be particularly likely to be accompanied by representational gestures. Importantly, however, describing these extra events in the spatial condition does not seem to be driving the overall difference in gesture rates, as the pattern emerges even when the analysis was limited to clauses that detailed events from the original description.

Limiting the analysis to clauses that describe the same general events in both conditions does not remove the possibility that the specific words speakers use to describe those events differ in content or accessibility between the conditions. We found no evidence that the general content or accessibility of words in the two conditions differed; speakers produced descriptive terms and speech disfluencies at comparable rates in the two conditions. However, we did find that speakers were more likely to reproduce verbatim the verbs they heard in the verbal description when they had heard that description twice than when they heard it only once. Although this could affect gesture rates, we found the same increased rate of gestures in the spatial condition regardless of whether speakers were using the same verb they had heard in the original description or a different verb. This suggests that the difference in gesture rates across conditions is not the result of speakers generating new verbs to describe the events in the spatial condition.

By carefully analyzing the speech produced in the two conditions, we have been able to rule out the possibility that the difference in representational gesture rates across conditions is purely the result of speech quantity, content, or accessibility. If the increase in gesture rates after watching a cartoon cannot be explained by differences in verbal accessibility or speech production, then how can it be explained? According to image-based theories of gesture production (de Ruiter 2000; Kita and Ozyurek 2003; Hostetter and Alibali 2008), speakers produce representational gestures when they are forming and activating mental images in the interest of speaking. The GSA framework (Hostetter and Alibali 2008) proposes that gestures occur because the formation of mental images activates visual and motor areas in the speaker's brain, and this activation is overtly expressed in the form of representational gestures as the speaker talks.

Under this view, speakers are particularly likely to produce representational gestures when they describe spatial and motor events because these events are particularly likely to be simulated and imagined by the speakers. This does not rule out the possibility that representational gestures are also particularly beneficial to communication about spatial and motor topics. However, in the present paradigm, the communicative demands were identical in the two conditions, making it difficult to explain the difference in gesture rates observed here based solely on a communicative explanation. Instead, it seems that speakers gesture with spatial and motor topics, at least in part, because their representations of those topics are particularly conducive to gesture.

The GSA framework contends that it is specifically the motor and perceptual simulations that occur with imagery that lead to increased gesture in the visual condition. However, the results may be explained in slightly different terms as well. For example, it is possible that the speakers simply knew the material more thoroughly after viewing the events in addition to hearing them than when they had only heard about them. Although we […] In fact, according to Dual Coding theory (Paivio 1991), imagery does lead to better memory for stimuli than does verbal rehearsal alone.

However, Dual Coding theory is not entirely at odds with the explanation given by the GSA framework. Rather, the GSA framework and the embodied theories of cognition that it draws from provide an explanation for why memory traces are likely stronger after seeing an image than after seeing a word; speakers have stronger memory traces for stimuli when they have engaged their motor and perceptual systems during the encoding of the stimuli than when they have not. Thus, according to the GSA framework, part of the reason why speakers gesture more when they have seen an image is because their mental representation of the event is stronger; however, the reason that it is stronger is because the image activates perceptual and motor systems in the brain (see Hostetter and Alibali 2010 for more on this point). Clearly, the present behavioral study cannot directly test the claim that images do draw on perceptual and motor systems in the brain. However, neuroimaging studies have supported this claim (e.g., Willems et al. 2010) and thus lend credence to the idea advanced here that imagery leads to gesture production because of the motor and perceptual simulations that it evokes.

The results obtained here may also be explained as behavioral mimicry. Chartrand and Bargh (1999) have shown that individuals frequently mimic each other's behavior, and that doing so leads to smoother interactions and more positive impressions of one another. In the present paradigm, it is possible that speakers who have an image of a character in their mind may mimic the character's actions as a means of understanding the character's behaviors and actions. While this behavioral mimicry explanation uses different terminology from the GSA framework to describe the effect obtained here, the conceptual basis of the two explanations is actually quite similar. Under both views, speakers engage their motor systems as a means of understanding the characters and events they observe.

It is also possible that the effect obtained here was the result of experimenter expectancy effects. The experimenters in the present study were not blind to the hypothesis of the study or to the participants' experimental condition as they were retelling each story. However, there are several reasons to doubt that the present findings are the result of the experimenters consciously or unconsciously priming the participants to gesture more in the spatial condition. First, the experimenters all followed the same prepared script, which included specific instructions about both verbal and nonverbal behavior. Before running the first participant, they rehearsed the script several times, and the primary investigator gave each of them feedback on their adherence to the script. Second, there are no effects of experimenter in the analyses, suggesting that if the experimenters were unconsciously changing their behavior in the spatial condition in a way that evoked more gesture, all five were doing it to the exact same degree. This seems unlikely. Finally, the effect of condition was specific to representational gestures. If the experimenters were unconsciously doing something to encourage more gesture in the spatial condition, it is unclear what they might have done that would specifically encourage higher rates of representational gestures, rather than higher rates of every kind of gesture. Thus, although an experimenter expectancy effect cannot be definitively ruled out, it seems like a less parsimonious explanation than the explanation that viewing the cartoon affected gesture rates, as viewing the cartoon was the most obvious consistent difference between the conditions. Future research is needed to investigate the extent to which experimenter expectancy effects might be […]

Any theory that suggests that gestures arise from mental images (e.g., de Ruiter 2000; Kita and Ozyurek 2003) rather than from verbal or propositional codes is compatible with the present findings. In contrast, Butterworth and Hadar (1989) propose that gestures arise from semantic representations of words in the lexicon. Krauss et al. (2000) propose that gestures arise from elementary spatial features that are part of semantic concepts, rather than from an image per se. It is unclear how either of these proposals would explain the present findings, as the concepts being described in both conditions were identical. The only difference is that in one case, the concepts have been represented imagistically, while in the other, they have not. Thus, the present findings are not just evidence for the GSA framework, but also for any theory that claims an imagistic source of gesture. Indeed, the GSA framework is not an alternative to other theories that propose that gestures stem from images; rather, it is an attempt to explain more specifically why it is that mental images lead so naturally to representational gestures.

In conclusion, the present study overcomes several limitations of the Hostetter and

Hopkins (2002) study and suggests that speakers do gesture more when they are

describing events that they have witnessed visually than when describing events they have

only heard about. Although there are some differences in the speech produced by speakers

when they describe something they have seen versus something they have only heard

about, the increase in gesture rates occurs even when these differences in speech are

controlled. It seems that gestures occur, at least in part, when speakers have seen the

information they are describing.

Acknowledgments   This work was initiated as part of Catherine J. Skirving’s senior individualized project

at Kalamazoo College. We thank Justine Dolorfino, Emily Patzelt, Andrea Potthoff, and Carmen Wilson for

their assistance with data collection and transcription. We also thank Keith Moreno for assistance with coding unfilled pauses, and Chelsea Baumgarten, Andrea Potthoff, and Erin Sullivan for their comments on

the manuscript.

Appendix

Pancake

The mouse is making a pancake in a pan on the stove. He tries to flip the pancake by

tossing it out of the pan and into the air, but it falls on the ground. The mouse picks up the

pancake and continues to cook it. The mouse flips the pancake and it falls on the floor again. The mouse then continues cooking the pancake, flips it again, and it lands on the

floor for the third time. The mouse picks up the pancake and begins to cook it again. The

mouse flips the pancake for a fourth time and it lands on top of the mouse’s face. He slides

the pancake off his face and back into the pan. The mouse then whistles for the elephant

who enters the room. The mouse takes another pan off of the wall and hands it to the

elephant to hold in his trunk. The mouse signals for the elephant to back up a little bit,

which the elephant does. The mouse then flips the pancake and it lands in the elephant’s

pan. The elephant flips the pancake back to the mouse’s pan, and they continue to happily

flip the pancake back and forth.

High Bar


The mouse jumps up, grabs the high bar, and swings back and forth slowly, until he gets more momentum and is spinning in complete circles around the

bar. The elephant enters and watches the mouse spinning. Then the mouse dismounts and

the elephant applauds by stomping his front and back feet back and forth. The elephant

walks up to the bar and jumps up trying to grab it. He finally gets high enough to grab it

with his trunk, and the bar bends in the middle. The mouse lets out a noise of frustration and tries to straighten the bar. A little green man with a tall hat enters and skips underneath

the bar. His hat touches the bar and it is magically fixed. The elephant laughs at the mouse.

Banana

The mouse pulls a banana out of his back pocket. He tosses it in the air, catches it, and then

peels it. He throws the peel into a nearby basket, but as he takes a bite of the banana, the

peel bounces out of the basket and comes back to hit him in the face. He tosses the peel

towards the basket again, but again as he takes a bite of the banana, the peel bounces back 

out of the basket and onto his face. He finishes eating his banana and walks over to the

basket. He places the peel inside the basket and then turns the basket upside down. As the

mouse walks away, the basket lifts up to reveal the elephant’s four blue feet. The elephant

makes a honking noise from under the basket and follows the mouse out of the scene.

Pottery

The mouse walks up to a potter’s wheel. He places a bucket full of clay on the bench

behind the wheel and then sits down on the bench next to it. The mouse takes some clay out

of the bucket and places it on the wheel. He spins the wheel with his foot, and the clay flies off the wheel and hits the wall on his right. The mouse puts more clay on the wheel and

spins it again with his foot. The clay again flies off the wheel and this time hits the wall on

his left. The mouse then sits on the wheel himself and begins to spin around. He spins so

fast that he flies off of the wheel to his left.

Practice: Circus

The big orange mouse is dressed up like the ringmaster at a circus. He has a hoop that he

waves back and forth several times. Then the little blue elephant comes out dressed up like

a clown. The mouse holds out the hoop, and the elephant jumps through it. He stands up on his back legs as though taking a bow. Then the elephant jumps up on a small pedestal and

does a handstand. He bounces back to upright and there are noises in the background as

though an audience is cheering.

References

Alibali, M. W. (2005). Gesture in spatial cognition: Expressing, communicating, and thinking about spatial

information. Spatial Cognition and Computation, 5, 307–331.
Alibali, M. W., Bassok, M., Solomon, K., Syc, S. E., & Goldin-Meadow, S. (1999). Illuminating mental

representations through speech and gesture.   Psychological Science, 10, 327–333.


Bavelas, J., Gerwing, J., Sutton, C., & Prevost, D. (2008). Gesturing on the telephone: Independent effects of 

dialogue and visibility.  Journal of Memory and Language, 58, 495–520.

Bavelas, J., Kenwood, C., Johnson, T., & Phillips, B. (2002). An experimental study of when and how

speakers use gestures to communicate.  Gesture, 2, 1–18.

Beattie, G., & Shovelton, H. (1999). Do iconic hand gestures really contribute anything to the semantic

information conveyed by speech? An experimental investigation. Semiotica, 123, 1–30.
Bortfeld, H., Leon, S. D., Bloom, J. E., Schober, M. F., & Brennan, S. E. (2001). Disfluency rates in
conversation: Effects of age, relationship, topic, role, and gender. Language and Speech, 44(2),

123–147.

Butterworth, B., & Hadar, U. (1989). Gesture, speech, and computational stages: A reply to McNeill.
Psychological Review, 96, 168–174.

Chartrand, T. L., & Bargh, J. A. (1999). The chameleon effect: The perception-behavior link and social

interaction. Journal of Personality and Social Psychology, 76 , 893–910.

Chawla, P., & Krauss, R. M. (1994). Gesture and speech in spontaneous and rehearsed narratives.  Journal of 

Experimental Social Psychology, 30, 580–601.

Christenfeld, N., Schachter, S., & Bilous, F. (1991). Filled pauses and gestures: It’s not coincidence.  Journal

of Psycholinguistic Research, 20, 1–10.

de Ruiter, J. P. (2000). The production of gesture and speech. In D. McNeill (Ed.), Language and gesture (pp. 284–311). Cambridge: Cambridge University Press.

Feyereisen, P., & Havard, I. (1999). Mental imagery and production of hand gestures while speaking in

younger and older adults.  Journal of Nonverbal Behavior, 23, 153–171.

Fridlund, A. J., Kenworthy, K. G., & Jaffey, A. K. (1992). Audience effects in affective imagery: Replication and extension to dysphoric imagery. Journal of Nonverbal Behavior, 16(3), 191–212.

Glenberg, A. M. (1997). What memory is for.  Behavioral and Brain Sciences, 20, 1–55.

Glenberg, A. M., & Kaschak, M. P. (2002). Grounding language in action.  Psychonomic Bulletin & Review,

9, 558–565.

Goodale, M. A. (2001). Different spaces and different times for perception and action.   Progress in Brain

 Research, 134, 313–331.

Graham, J. A., & Argyle, M. (1975). A cross-cultural study of the communication of extra-verbal meaning

by gestures. International Journal of Psychology, 10, 57–67.
Hostetter, A. B. (2011). When do gestures communicate? A meta-analysis. Psychological Bulletin, 137,

297–315.

Hostetter, A. B., & Alibali, M. W. (2007). Raise your hand if you’re spatial: Relations between verbal and

spatial skills and representational gesture production.  Gesture, 7 , 73–95.

Hostetter, A. B., & Alibali, M. W. (2008). Visible embodiment: Gestures as simulated action.  Psychonomic

 Bulletin & Review, 15, 495–514.

Hostetter, A. B., & Alibali, M. W. (2010). Language, gesture, action! A test of the Gesture as Simulated

Action framework.  Journal of Memory and Language, 63, 245–257.

Hostetter, A. B., Alibali, M. W., & Kita, S. (2007). I see it in my hands’ eye: Representational gestures

reflect conceptual demands.  Language and Cognitive Processes, 22, 313–336.

Hostetter, A. B., & Hopkins, W. D. (2002). The effect of thought structure on the production of lexical

movements. Brain and Language, 82, 22–29.
Kita, S. (2000). How representational gestures help speaking. In D. McNeill (Ed.), Language and gesture

(pp. 162–185). Cambridge: Cambridge University Press.

Kita, S., & Davies, T. S. (2009). Competing conceptual representations trigger co-speech representational

gestures. Language and Cognitive Processes, 24(5), 761–775.

Kita, S., & Ozyurek, A. (2003). What does cross-linguistic variation in semantic coordination of speech and

gesture reveal? Evidence for an interface representation of thinking and speaking.  Journal of Memory

and Language, 48, 16–32.

Kosslyn, S. M. (2005). Mental images and the brain.   Cognitive Neuropsychology, 22, 333–347.

Krauss, R. M. (1998). Why do we gesture when we speak?  Current Directions in Psychological Science, 7 ,

54–60.

Krauss, R. M., Chen, Y., & Gottesman, R. F. (2000). Lexical gestures and lexical access: A process model.

In D. McNeill (Ed.), Language and gesture (pp. 261–283). Cambridge: Cambridge University Press.
McNeill, D. (1992). Hand and mind: What gestures reveal about thought. Chicago: University of Chicago Press.


Paivio, A. (1991). Dual coding theory: Retrospect and current status.  Canadian Journal of Psychology, 45,

255–287.

Parrill, F., Bullen, J., & Hoburg, H. (2010). Effects of input modality on gesture-speech integration. Journal

of Pragmatics, 42, 3130–3137.

Pine, K. J., Gurney, D. J., & Fletcher, B. (2010). The semantic specificity hypothesis: When gestures do not

depend upon the presence of a listener. Journal of Nonverbal Behavior, 34, 169–178.
Rauscher, F., Krauss, R. M., & Chen, Y. (1996). Gesture, speech, and lexical access: The role of lexical

movements in speech production.   Psychological Science, 7 , 226–231.

Reichle, E. D., Carpenter, P. A., & Just, M. A. (2000). The neural bases of strategy and skill in sentence-

picture verification.   Cognitive Psychology, 40, 261–295.

Roberts, P. M., Meltzer, A., & Wilding, J. (2009). Disfluencies in non-stuttering adults across sample

lengths and topics.  Journal of Communication Disorders, 42, 414–427.

Schachter, S., Christenfeld, N., Ravina, B., & Bilous, F. (1991). Speech disfluency and the structure of 

knowledge. Journal of Personality and Social Psychology, 60, 362–367.

Spivey, M. J., Richardson, D. C., & Gonzales-Marquez, M. (2005). On the perceptual-motor and image-

schematic infrastructure of language. In D. Pecher & R. A. Zwaan (Eds.),  Grounding cognition: The

role of perception and action in memory, language, and thinking (pp. 246–281). Cambridge: Cambridge University Press.
Tucker, M., & Ellis, R. (1998). On the relations between seen objects and components of potential actions.

 Journal of Experimental Psychology: Human Perception and Performance, 24, 830–846.

Wagner, S., Nusbaum, H., & Goldin-Meadow, S. (2004). Probing the mental representation of gesture: Is

hand-waving spatial?  Journal of Memory and Language, 50, 395–407.

Willems, R. M., Hagoort, P., & Casasanto, D. (2010). Body-specific representations of action verbs: Neural

evidence from right- and left-handers.   Psychological Science, 21, 67–74.

Zwaan, R. A., Stanfield, R. A., & Yaxley, R. H. (2002). Language comprehenders mentally represent the

shapes of objects.   Psychological Science, 13, 168–171.
