Journal of Memory and Language 45, 337–367 (2001)
doi:10.1006/jmla.2000.2770, available online at http://www.academicpress.com on

0749-596X/01 $35.00
Copyright © 2001 by Academic Press
All rights of reproduction in any form reserved.

A Theory of Sentence Memory as Part of A General Theory of Memory

John R. Anderson, Raluca Budiu, and Lynne M. Reder

Carnegie Mellon University

We describe an ACT-R model for sentence memory that extracts both a parsed surface representation and a propositional representation. In addition, if possible for each sentence, pointers are added to a long-term memory referent which reflects past experience with the situation described in the sentence. This system accounts for basic results in sentence memory without assuming different retention functions for surface, propositional, or situational information. There is better retention for gist than for surface information because of the greater complexity of the surface representation and because of the greater practice of the referent for the sentence. This model’s only inference during sentence comprehension is to insert a pointer to an existing referent. Nonetheless, by this means it is capable of modeling many effects attributed to inferential processing. The ACT-R architecture also provides a mechanism for mixing the various memory strategies that participants bring to bear in these experiments. © 2001 Academic Press

Key Words: sentence memory; ACT-R theory; surface information; propositional information; situational information; inferential processing.

In his 1998 book, Kintsch writes: “We don’t need a special theory of sentence memory: If we understand sentence comprehension (the CI theory) and recognition memory (the list-learning literature), we have all the parts we need for a sentence recognition model” (p. 263). CI is Kintsch’s construction-integration theory (Kintsch, 1988, 1998) and he adopts Gillund and Shiffrin’s (1984) SAM model of memory to account for sentence memory. In this article we argue for a conclusion that has a similar spirit, which is that the established results on sentence memory also follow from the ACT-R cognitive architecture (Anderson & Lebiere, 1998). ACT-R bears similarity to SAM but is a more complete theory of cognition because it contains a model of cognitive control. As such we can directly embed in it a theory of sentence comprehension. Because of some of the architectural commitments of ACT-R, the theory of sentence comprehension is somewhat different than Kintsch’s and closer to what is characterized as the minimalist hypothesis of sentence processing (McKoon & Ratcliff, 1992, 1995).

This article demonstrates, even more strongly than has Kintsch, that there is nothing special about sentence memory. An important novel conclusion from this theory is that there are not different retention functions for the three forms of memory that have been postulated to encode information about a sentence (e.g., Fletcher, 1994; Graesser, Singer, & Trabasso, 1994; Kintsch, 1998): surface code (exact words and syntax), textbase (propositions asserted in the text), and situation model (inferences contributed from long-term memory). A single retention function contrasts with a frequent assumption (e.g., Anderson, 1974, 2000; Brainerd & Reyna, 1995; Kintsch, Welsch, Schmalhofer, & Zimny, 1990) that the superficial surface information is more rapidly forgotten than the propositional information, which is in turn forgotten more rapidly than the situation information. However, we do not challenge the concept of the three levels of representation, although in keeping with ACT-R’s minimalist leanings, we offer a somewhat Spartan interpretation of what the situation information amounts to.
In this article we present ACT-R models for a number of sentence memory tasks that emphasize different subsets of these three representations. In each case we present models that actually perform in real time the tasks described in the literature. These models can be run and inspected by going to the Published Models link at http://act.psy.cmu.edu. The real-time nature of these models is significant because constraints on processing time force the models in the direction of minimalist encoding. In ACT-R each production rule applies serially and requires a minimum of 50 ms and often more. When we apply ACT-R to sentence processing we find there is just not enough time, at normal reading or listening rates, to do more than a minimal number of inferences.

This research was supported by Grant N00014-96-1-0491 from the Office of Naval Research. We thank Alex Petrov and Charles Brainerd for their comments on earlier drafts of this article. Address correspondence and reprint requests to John R. Anderson, Department of Psychology, Carnegie Mellon University, Pittsburgh, PA 15213. E-mail: ja1@cmu.edu, ralucav1@cmu.edu, or [email protected].

We chose to model data sets that would directly test two critical aspects of the ACT-R theory: its retention assumptions and its assumptions about the speed of production rules. Some of the data sets (Anderson, 1972, 1974; Reder, 1982; Schustack & Anderson, 1979) that we model are ones gathered from our own laboratories and in these cases the models that we describe are ACT-R implementations of what are essentially the models that we already proposed prior to the development of ACT-R. In these cases we show that the earlier proposed models are consistent with the general ACT-R architecture. We also model other researchers’ data sets (Bower, Black, & Turner, 1979; Zimny, 1987). Although we do not know these data sets as well as our own, they were chosen because they serve to test significant aspects of the theory. This article begins with a description of the ACT-R architecture, a minimal model for sentence processing and representation, and the underlying architectural assumptions that control the behavior of the model.

THE ACT-R THEORY

General Architectural Commitments

The basic assumption throughout the development of the ACT theory (e.g., Anderson, 1976, 1983, 1993; Anderson & Lebiere, 1998) has been that human cognition emerges through an interaction between a procedural memory and a declarative memory. The basic units of knowledge in procedural memory are productions and the basic units of knowledge in declarative memory are chunks. Since we want to make the point that the ACT-R assumptions we are using for sentence memory apply generally throughout cognition, we first illustrate them with respect to mathematics. For instance, consider a student in the midst of solving the following multicolumn addition problem:

      336
    + 848
    -----
        4

The next production to apply might be:

    IF   the goal is to add n1 and n2 in a column
         and n3 can be retrieved as the sum of n1 and n2
    THEN set as a subgoal to write n3 in that column.

This production would retrieve the following chunk from declarative memory encoding the fact that the sum of 3 and 4 is 7:

    fact
       isa      addition fact
       addend1  three
       addend2  four
       sum      seven

and embellish the goal with the information that 7 is the number that should be written out. Then other productions would apply that might deal with things like processing the carry into the column. The basic premise of the ACT-R theory is that cognition unfolds as a sequence of such production-rule firings where each rule can retrieve chunks from declarative memory to transform the goal state.
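As a rough illustration of this production-and-chunk cycle, the following Python sketch encodes addition facts as chunks and applies the column-addition production to a goal. It is not the ACT-R implementation: the names `AdditionFact`, `DECLARATIVE_MEMORY`, `retrieve_sum`, and `add_column_production`, and the dictionary used for the goal, are assumptions made for this example, and carries are ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AdditionFact:
    """A declarative chunk of type 'addition fact' (hypothetical encoding)."""
    addend1: int
    addend2: int
    total: int  # the chunk's 'sum' slot; renamed because sum is a Python builtin


# Hypothetical declarative memory holding every single-digit addition fact.
DECLARATIVE_MEMORY = [
    AdditionFact(a, b, a + b) for a in range(10) for b in range(10)
]


def retrieve_sum(n1: int, n2: int) -> Optional[AdditionFact]:
    """Return the chunk whose addends match n1 and n2, or None if none exists."""
    for chunk in DECLARATIVE_MEMORY:
        if chunk.addend1 == n1 and chunk.addend2 == n2:
            return chunk
    return None


def add_column_production(goal: dict) -> Optional[dict]:
    """IF the goal is to add n1 and n2 in a column and n3 can be retrieved
    as their sum, THEN set a subgoal to write n3 in that column.
    Returns the new goal, or None when the production does not match."""
    if goal.get("task") != "add-column":
        return None
    fact = retrieve_sum(goal["n1"], goal["n2"])
    if fact is None:
        return None
    return {"task": "write", "column": goal["column"], "value": fact.total}


# The tens column of 336 + 848: add 3 and 4 (the carry is left to other productions).
goal = {"task": "add-column", "column": "tens", "n1": 3, "n2": 4}
print(add_column_production(goal))
# prints {'task': 'write', 'column': 'tens', 'value': 7}
```

The production only transforms the goal; handling the carry from the ones column would be the job of further productions, as the text notes.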

One of the major trends in the ACT theory development from ACT* (Anderson, 1983) to the current ACT-R (Anderson & Lebiere, 1998) has been a firmer commitment to the temporal grain size at which cognition unfolds. Each production rule in ACT-R takes at least 50 ms to fire and almost never much more than 500 ms. Thus, we have bounded the time scale to an order of magnitude and we will shortly describe the factors that determine just how long a production rule takes in the 50- to 500-ms range. The ACT-R theory is also committed to the proposal that only one production rule can fire at a time. These commitments to a temporal grain size and serial production-rule firing place severe constraints on a theory of linguistic processing because ACT-R must complete all the steps needed to comprehend a sentence in the short time typically allocated to sentence processing.

Representational Commitments

Another significant constraint on the proposed theory of language processing is that it must incorporate the theory of declarative representation that was articulated in the theory of list memory (Anderson et al., 1998) and was elaborated in the theory of analogy (Salvucci & Anderson, 2001). In the theory of serial memory, declarative chunks are used to encode the position of an element in a higher structure. Thus, a sequence like “392 714 856” would be encoded in the hierarchical graph structure depicted in Fig. 1. Each node and link in this figure is a chunk. While the nodes contain no structural information (e.g., the leaf 3 in the graph is a chunk that encodes the digit 3, with no information about it being part of this list), the links are more complex (for simplicity, in Fig. 1 we only show the structure of two link chunks; the other links are similar, though). As Fig. 1 shows, together with pointers to the nodes that they connect (the parent and child slots), the link chunks maintain information about the position of the child within the parent group. For instance, 9, the child of Group1, occupies the second position in Group1 and this information is recorded in the slot role of the link that connects 9 and Group1. Also, in order to be able to keep track of different lists, it was important to have a context slot in each link chunk and in this way identify to which list a given link should be associated. Individual declarative chunks in ACT-R can be forgotten or confused with others, and these chunk-based processes produce many of the error patterns associated with serial memory (Anderson & Matessa, 1997).

Salvucci and Anderson (2001) elaborated and generalized this representation to account for the semantic effects found in the analogy literature. Thus, to model the famous solar system analogy (Gentner, 1983), they represented arguments to a proposition like “The planets revolve around the Sun” with a number of chunks like:

    Chunk82
       isa       semantic-chunk
       parent    revolves
       child     Sun
       role      center
       referent  revolution
       context   solar-system.

This chunk encodes the fact that the Sun serves the center role in that proposition.¹ Another chunk would be used to encode that the planets serve the role of revolving objects. Thus, there is a separate link chunk for each argument of the proposition. Note also that Salvucci and Anderson added a new piece of information to the link chunk: the referent slot which points to the more general concept of the motion of revolution. Sometimes it may be useful to think of the referent as the prototype of the particular instance that is represented. In Salvucci and Anderson’s model, the referent served to guide the analogy process. It can be also used to guide metaphor comprehension and other semantic interpretation processes (see Budiu, in preparation). The chunks we use to encode propositional information in sentences are basically identical to the chunks introduced by Salvucci and Anderson. The referent link is important to our theory of situation memory.

¹ The actual names used to refer to the slots have been changed from those used by Salvucci and Anderson to facilitate current exposition.

It is worth noting that this representation takes what commonly had been thought of as a single proposition (e.g., “the planet revolves around the Sun”) or a single group “(3 2 9)” and fragments it into multiple ACT-R chunks. This fragmentation proved useful in list memory to account for phenomena such as transposition errors. It also proved useful in the theory of analogy to explain how a participant analyzes the components of an analogical mapping. In the case of sentence memory, this assumption has implications for fragmentary sentence recall and we test these implications in this article.

Representation of Sentential Information

We propose a representation for the syntactic structure of a sentence (Fig. 2a) similar to the
list representation (Fig. 1) and a representation for its propositional structure (Fig. 2b) that is basically identical to the semantic representation developed in the Salvucci and Anderson (2001) model. Thus, for the sentence Bob paid the waiter, the syntactic representation is an encoding of the actual parse tree of the sentence. The nodes in this tree are either words like Bob, paid, and the-waiter or nonterminals like NP1, VP1, V81, NP2, and Sentence1. The null element in the verb phrase encodes potential verb auxiliaries. As before, the links are more complex chunks containing structural information. The labels of the links represent the syntactic roles that the children play within the parents (for instance, Bob is the head of NP1, which is the first argument of Sentence1). As in the solar system representation, the link chunks also encode a referent, whose value denotes a more general concept (e.g., the link connecting NP1 and Bob has the referent NP to denote that it is an instance of a noun phrase structure). The context slot in the link representation keeps track of the current sentence.

FIG. 1. The encoding of a serial list into a set of chunks from Anderson, Bothell, Lebiere, and Matessa (1998). Each link and node in the graph reflects a chunk.

FIG. 2. A comparison of the syntactic encoding of a sentence (a) with its propositional encoding (b).

Similarly, the semantic structure of the sentence is encoded as a tree whose nodes are concepts or propositions and whose links represent relationships among these concepts (see Fig. 2b). Thus, the link between the concept *BOB* and the chunk Proposition-4 encodes the fact that *BOB* is the agent of Proposition-4. The referent slot records that the relationship encoded is an instance of paying in a restaurant script. All the links in the representation of this proposition can have the referent slot pointing to this referent. In general, the referent slot is filled with a pointer to some analogous past experience or generalization from past experience. Note that our “semantic representation” in Fig. 2b might better be termed a “gist representation.” It collapses, for instance, any semantic distinction between an active or passive sentence. Its essential feature is that it reduces the detail of the sentence down to its core meaning.

Again, because the links contain all the structural information, their retrieval will be critical for sentence recall. Note that there are more chunks (in terms of both nodes and links, but links will be our primary interest) in the syntactic encoding (8 links) than in the propositional encoding (3 links). The discrepancy is even greater in the case of the passive sentence The waiter was paid by Bob, where the syntactic encoding has 10 links, while the propositional still has only 3. This greater difference in the number of chunks accounts for the apparent superior memory for propositional information because fewer things have to be retrieved to reconstruct the proposition than the syntax. The exact surface structure in Fig. 2a and the exact propositional structure in Fig. 2b depend on representational assumptions that might be questioned but the general principle is that the gist representation will be a smaller representation encoding only significant aspects of the original sentence. Thus, the model is committed to the prediction of poorer memory for surface structure, not because of worse retention of the individual chunks, but because there are more chunks. The more chunks there are, the more likely it is that something will be lost with delay. Ability to recognize the exact sentence depends on all of the elements being present in the surface representation. While the model predicts better memory for the meaning, it is not inconsistent with the observation that surface memory can be improved by manipulations that focus attention on surface details (e.g., Keenan, MacWhinney, & Mayhew, 1977; Murphy & Shapiro, 1994). ACT-R predicts that memory for any chunk, syntactic or semantic, will be enhanced by greater processing. However, the theory does predict inferior surface memory in the absence of special processing.
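The argument that more chunks mean more opportunities for loss can be made concrete with a small calculation. The Python sketch below assumes, purely for illustration, that every link chunk is retained independently with the same probability; the independence assumption, the function name `p_all_links_retained`, and the value 0.9 are not taken from the model. Only the link counts (3, 8, and 10) come from the text.

```python
def p_all_links_retained(p_link: float, n_links: int) -> float:
    """Probability that every one of n_links chunks is still retrievable,
    assuming (hypothetically) independent retention with probability p_link."""
    return p_link ** n_links


P_LINK = 0.9  # hypothetical per-link retention probability, identical for all links

for label, n_links in [
    ("propositional encoding", 3),
    ("syntactic encoding, active sentence", 8),
    ("syntactic encoding, passive sentence", 10),
]:
    print(f"{label}: {p_all_links_retained(P_LINK, n_links):.3f}")
# prints:
# propositional encoding: 0.729
# syntactic encoding, active sentence: 0.430
# syntactic encoding, passive sentence: 0.349
```

With identical retention for each individual chunk, the full surface structure is still recovered less often than the gist, simply because it requires more chunks.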

Figure 3 is an attempt to illustrate the larger structure that is created when story sentences get attached to referents, in this case propositions from a restaurant script. The big boxes, labeled “Story” and “Restaurant,” represent the organizing units that are pointed to by the context slots of the individual chunks encoding the links that make up the two sets of propositions. The smaller boxes reflect the individual propositions that are pointed to by the parent slots. The elements within the proposition boxes are pointed to by the child slots. The arrows reflect referent slots pointing from the chunks to the referent proposition. This representation illustrates that the participant might not be able to find referents for all the propositions in the story and that there might not be story propositions corresponding to all the propositions in the referent. While the referent propositions in this example come from a classic Schank and Abelson (1977) script, there is nothing in the model that requires this. The referents could come from another story, for instance. The sources of the referent just need to be some well-encoded structure in declarative memory that contains propositions that can be put in correspondence with the propositions in the story. Our concept of a referent is similar to Sanford and Garrod’s (1998) scenario and our use of the referents is similar to their scenario mapping except that they do not build up a separate propositional representation.

FIG. 3. A representation of the chunks in a story and their connections to the propositions in a referent.

The representation in Fig. 3 illustrates some of the potential for inferences based on these referent links to prior knowledge. Suppose a participant can retrieve just one chunk from a story proposition (say the one for order in Proposition-2) but this has a referent link. Then the participant can use this referent link to retrieve the corresponding proposition. Furthermore, the participant can use the arguments in the referent proposition to infer the arguments in the story proposition (for instance, that a meal was ordered). Being more adventuresome, participants might also guess that other propositions in the script occurred in the story even if these propositions are not pointed to.

Sentence Processing

We now turn to describing productions that perform three tasks during sentence processing: deriving a parse of the sentence, building a propositional representation, and trying to identify a referent for the proposition. This model makes almost no effort at elaboration, i.e., embellishment of the ideas in the sentence. The reason for this is that the model is constrained to fit the data from experiments where participants are reading stories at the rate of at least a couple of words per second. This implies no more than a few hundred milliseconds to process each word and therefore constrains what can be accomplished in that time. The one bit of embellishment that the model will do is try to find a referent for the sentence. Of course, when participants are given more time to study they often engage in extensive inference and elaboration. Indeed, we have argued elsewhere (Anderson & Reder, 1979; Reder, 1979) that such elaboration can have significant consequences for their memory of the sentences. However, it turns out that we do not need to make such assumptions in order to account for a number of classic results about inference in sentence memory. Rather, they can be explained simply by the use of referents.
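The time pressure behind this minimalist stance can be expressed as a simple budget. The Python sketch below combines the reading rate mentioned here with the 50-ms minimum production time stated earlier to bound how many serial production firings fit into the processing of one word. The function names are hypothetical, and the result is only an upper bound, since productions often take longer than the minimum.

```python
MIN_PRODUCTION_MS = 50  # minimum duration of one production firing, from the text


def ms_per_word(words_per_second: float) -> float:
    """Time available for each word at a given reading rate."""
    return 1000.0 / words_per_second


def max_productions_per_word(words_per_second: float) -> int:
    """Upper bound on serial production firings per word (illustrative helper)."""
    return int(ms_per_word(words_per_second) // MIN_PRODUCTION_MS)


print(max_productions_per_word(2.0))  # prints 10 (500 ms per word)
print(max_productions_per_word(4.0))  # prints 5 (250 ms per word)
```

Even under this optimistic bound, only a handful of firings are available per word, which leaves little room for elaborative inference.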

The parsing model we use is essentially a scaled-down version of the ACT-R model developed by Lewis (1999) for simulating comprehension effects. It assumes that, with each word processed, the participant retrieves the syntactic category of the word and uses that knowledge to integrate the word into a syntactic parse of the sentence. Lewis’s work is more concerned with sentence complexity and garden-path effects than are we, and he models these effects by retrieval of declarative fragments of the parse tree. We assume the participant is only parsing simple sentences without significant ambiguities or syntactic complexities. Our model builds up a propositional representation as it builds up the parse tree. When the propositional representation is complete, it will attempt to retrieve the referent. Elsewhere (Budiu & Anderson, 2000) we have argued that in at least some situations participants are also retrieving a referent for the sentence before they finish reading it. As a simplification, we postpone retrieval of a referent until the end of the sentence, but it is not essential to the model.

Figure 4 shows how the propositional and semantic representation is built when the ACT-R model processes the active sentence Bob paid the-waiter. The noun phrases are hyphenated to represent the assumption that the determiner–noun combination is processed as one encoding. This is roughly consistent with eye movement data (Just & Carpenter, 1987) and serves to eliminate any differences between processing of phrases like the-waiter and Bob.

For each word, there is a cycle of three productions which fire: Read-word, taking 100 ms to encode the current word; Retrieve-Type, taking about 50 ms to retrieve the syntactic category of the word; and a variable third production that actually uses this information to appropriately augment the syntactic and semantic structures. To illustrate, at the beginning of the sentence, after reading the word Bob and retrieving the fact that Bob is a noun, the model builds up the parts of the syntactic tree and of the semantic representation corresponding to Bob. For the syntactic tree, the model creates new nodes (NP1 and Sentence1) to denote that it is dealing with a new sentence and a new noun phrase and also new links to relate these nodes (namely, a link which encodes that Bob is the head of the new noun phrase NP1 and a link which records that NP1 is the first argument of the sentence Sentence1). For the semantic representation, the model builds a new node (Proposition-4) corresponding to the new proposition and then it creates a link between Proposition-4 and the meaning of Bob (denoted *BOB* in the figure). The model is biased to believe that initial nouns are agents, so this link is labeled agent. The context slot of this link is filled with the value experiment, and the referent link is left unset to reflect the fact that we postpone the retrieval of a referent until the end of the sentence. The process repeats for each new word, with the category of the word and the state of the tree influencing which productions fire. When the end of sentence is reached, the model looks for a long-term memory referent which has a semantic structure similar to the semantic structure it has just built. The relatively long latency (465 ms) at the end of the sentence reflects the time for separate productions to set up the retrieval, retrieve the referent, and modify the semantic chunks with the referent.

Figure 5 shows how the model comprehends the passive sentence The-waiter was paid by Bob. The process is very similar to the one for the active sentence: At first, the model considers the initial noun The-waiter as an agent. Only after it recognizes that the auxiliary plus the verb make the sentence a passive does it update the representation to reflect that the concept *WAITER* is a patient. To perform the update, the model takes a little more time because it needs to retrieve the link between Proposition1 and *WAITER* in order to be able to change the old agent label to a patient one. As before, the processing of the sentence ends both with the retrieval of a referent and with the updating of the links in the semantic representation so that they point to the retrieved referent.
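The bookkeeping on semantic link chunks described for the passive sentence can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the model's code: the class `LinkChunk`, the helpers `relabel` and `attach_referent`, and the referent name `paying-in-restaurant` are hypothetical, while the slot names (parent, child, role, referent, context) and the value experiment follow the text.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class LinkChunk:
    """One link chunk of the semantic representation (hypothetical encoding)."""
    parent: str                     # the proposition the link belongs to
    child: str                      # the concept filling the role
    role: str                       # e.g., "agent" or "patient"
    context: str                    # identifies the current list or story
    referent: Optional[str] = None  # left unset until the end of the sentence


def relabel(link: LinkChunk, new_role: str) -> LinkChunk:
    """Return a copy of the link with its role changed (agent to patient)."""
    return replace(link, role=new_role)


def attach_referent(links: List[LinkChunk], referent: str) -> List[LinkChunk]:
    """Point every link of a proposition at the retrieved referent."""
    return [replace(link, referent=referent) for link in links]


# The initial noun of the passive sentence is first treated as an agent.
waiter = LinkChunk(parent="Proposition1", child="*WAITER*",
                   role="agent", context="experiment")

# Once the passive is recognized, the link is retrieved and relabeled.
waiter = relabel(waiter, "patient")

# At the end of the sentence the links are updated with the referent;
# "paying-in-restaurant" is a made-up name for the script proposition.
links = attach_referent([waiter], "paying-in-restaurant")
print(links[0].role, links[0].referent)
# prints: patient paying-in-restaurant
```

Using immutable chunks with copy-on-update is a design choice of this sketch only; it keeps the relabeling step explicit, mirroring the extra retrieval the model needs before it can change the label.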

The traces in Figs. 4 and 5 display the time taken by the productions. We now present the equations that determined these timings.

ACT-R’s Subsymbolic Assumptions

To this point we have largely described ACT-R as a symbolic theory in which discrete productions are fired and discrete chunks are retrieved. However, underlying ACT-R is a subsymbolic layer of continuously varying quantities that determine which productions and chunks are selected, if any, and the latency for each chunk’s retrieval. Processing at the subsymbolic level is controlled by quantities called activations in the case of declarative memory and utilities in the case of procedural memory. Also, while the computation at the symbolic level is serial, the computation at the subsymbolic level is parallel. Underlying the firing of a single production is a large amount of parallel activation computation and parallel utility computation.

FIG. 4. Time frames in the parsing of the active sentence, “Bob paid the waiter.”

The activation of a chunk is determined by its base level and its associations to elements in the current context. The following equation describes the level of activation, $A_i$, of a chunk $i$ in terms of its base-level activation, $B_i$, that reflects its past history of encodings (as defined below) as well as the strengths of association, $S_{ji}$, to elements $j$ in the goal that send it additional activation:

$$A_i = B_i + \sum_j W_j S_{ji}. \qquad \text{Activation Eq. (1)}$$

The base-level activation varies with the frequency and recency of use according to the following equation:

$$B_i = \ln \sum_{j=1}^{n} t_j^{-d}, \qquad \text{Base-Level Learning Eq. (2)}$$

where $t_j$ is the time since the $j$th use of the chunk and $d$ is a parameter controlling activation decay. As developed in Anderson (1982) and extensively tested in Anderson, Fincham, and Douglass (1999), this equation predicts both the power law of learning (Newell & Rosenbloom, 1981) and the power law of forgetting (Wickelgren, 1972). For current purposes, the summation in this equation implies that the more a chunk is used, the stronger will be its encoding. The decay function $t_j^{-d}$ implies that the base-level activation will decay with time. Elsewhere (e.g., Anderson & Lebiere, 1998; Anderson & Reder, 1999) we have elaborated a theory of strength of associative activation [the $\sum_j W_j S_{ji}$ in Activation Eq. (1)], relating it to things like the fan effect; however, for current purposes it is enough to assume that this produces a boost for elements associated to the goal. The base-level learning equation above is at the heart of the applications reported in this article that are concerned with the retention of a sentence over various delays. We model data assuming there is one decay constant $d$ for both syntactic and propositional information about the sentence. Furthermore, taking the strong commitment from other ACT-R models (Anderson & Lebiere, 1998) we have fixed this decay constant at .5. This is one instantiation of our claim that all levels of information about the sentence have the same memory properties.

FIG. 5. Time frames in the parsing of the passive sentence, "The waiter was paid by Bob."

The activations are noisy quantities and fluctuate around their expected values. A chunk can be retrieved if its activation value is above a threshold $\tau$. The probability of retrieving a chunk with expected activation $A$ is given by the following equation:

$$\text{Probability} = \frac{1}{1 + e^{-(A - \tau)/s}}, \qquad \text{Retrieval Probability Eq. (3)}$$

where $s$ reflects the noise in the activation values and is related to the variance, $\sigma^2$, of the noise by the equation $s = \sqrt{3}\,\sigma/\pi$. The activation, $A$, of a chunk is also related to the time to retrieve it by the following equation:

$$\text{Time} = F e^{-A}, \qquad \text{Retrieval Time Eq. (4)}$$

where $F$ is the latency scale factor.

The preceding equations describe the subsymbolic part of ACT-R's declarative memory. The procedural memory also has subsymbolic aspects. When there is a set of productions that can apply, ACT-R chooses among them according to how well they have performed in the past. The measure of production performance is called utility. There is one such quantity associated with each production and it is calculated as $PG - C$, where $P$ is the probability with which the production has led to a successful completion in past attempts, $C$ is the average amount of time that it took to reach completion, and $G$ is the value of successfully achieving the goal. The parameters $P$ and $C$ are based on past experience² with the production, while $G$ is a parameter to be estimated. ACT-R selects the production with the highest utility value, but because of noise in these utilities, there is only a probability that any production $i$ will be selected and this is given by the following equation:

$$\text{Probability of choosing } i = \frac{e^{E_i/t}}{\sum_j e^{E_j/t}}, \qquad \text{Conflict Resolution Eq. (5)}$$

where $E_i$ is the utility of production $i$ and the summation in the denominator is over the productions, $j$, that currently match the goal. This is a softmax rule which tends to select the best production. The parameter $t$ reflects the noise in the estimation of production utility and is related to the variance, $\sigma^2$, of this noise by the equation $t = \sqrt{6}\,\sigma/\pi$. The units of utility are seconds and throughout this article we use a constant estimate for $t$ of .05 s. One theme in a number of the models that we describe is that there are multiple strategies for answering questions about sentences and that participants choose among these strategies according to their experienced utilities.

² In the simulations $P$ is set to the actual probability of success in the simulation and $C$ to the actual processing time it took the simulation.

Summary

We have now described the basics of the ACT-R theory and the general representation and processing assumptions. We have also described a model for sentence processing within the theory. The important assumptions for purposes of testing the theory are (a) a minimal processing of the sentence which derives a parse tree, a propositional representation, and a referent if one can be found and (b) the same retention function for all information. We have yet to describe how the model deals with the memory tests, as this depends on the specifics of the particular experiment's testing procedure. However, data from the experiments will be modeled assuming either a direct effort to retrieve information from the sentence encoding or an effort to use the referent, if there is one, to infer an answer for the memory task.
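The subsymbolic equations of this section, Eqs. (1) through (5), can be sketched in a few lines of Python. This is only an illustrative sketch, not the ACT-R implementation: the function and argument names are our own, the defaults for $d$, $F$, and $t$ are the values quoted in the article, and all times are assumed to be positive numbers in seconds.

```python
import math


def base_level(times_since_use, d=0.5):
    """Base-Level Learning Eq. (2): B = ln(sum_j t_j^-d), with each t_j > 0."""
    return math.log(sum(t ** -d for t in times_since_use))


def activation(base, weights, strengths):
    """Activation Eq. (1): A = B + sum_j W_j * S_ji."""
    return base + sum(w * s for w, s in zip(weights, strengths))


def retrieval_probability(a, tau, s):
    """Retrieval Probability Eq. (3): logistic in (A - tau) with noise s."""
    return 1.0 / (1.0 + math.exp(-(a - tau) / s))


def retrieval_time(a, latency_scale=0.30):
    """Retrieval Time Eq. (4): Time = F * exp(-A)."""
    return latency_scale * math.exp(-a)


def choice_probabilities(utilities, t=0.05):
    """Conflict Resolution Eq. (5): softmax over utilities E_i with noise t."""
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp((e - top) / t) for e in utilities]
    total = sum(weights)
    return [w / total for w in weights]
```

For example, `base_level([10.0])` is larger than `base_level([120.0])`, so a chunk used once 10 s ago is more active, and by Eq. (4) faster to retrieve, than one used once 120 s ago. Subtracting the maximum utility inside `choice_probabilities` leaves the probabilities unchanged and only avoids overflow when $t$ is small.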

THE EXPERIMENT MODELS

Table 1 lists the experiments that are modeled in this article and the parameter estimates for these experiments. We start with a model for an experiment described in Anderson (1974) that is concerned with the processing of surface and propositional information. Next, we discuss an experiment by Anderson (1972) that addresses the issue of whether a single proposition is really fragmented into a number of separate chunks as assumed by the ACT-R model. This is the only model that looks at sentence recall measures rather than sentence recognition measures. In our model for the data from this experiment we make extensive use of situational referents. We also make extensive use of situational referents to model plausibility and recognition judgements in Reder (1982). That experiment was primarily concerned with latency measures. We adapt that model to account for a similar experiment by Zimny (1987), which is concerned with probability of recognizing sentences. The Reder model is also adapted to account for data from Schustack and Anderson (1979) showing that sometimes situational referents can result in increased ability to recognize studied sentences. This model is in turn adapted to account for results from Bower, Black, and Turner (1979) showing that situational referents can sometimes result in poorer discrimination of target sentences. At the end of this article we return to the issue of the stability of the parameter estimates. All of these models are available by following the "Published Models" link from the ACT-R home page (http://act.psy.cmu.edu). The interested reader may inspect the details of these models, observe them run, and check their behavior with other parameter settings.

TABLE 1

The Experimental Models and Parameter Estimates

| Parameter | Anderson (1974) | Anderson (1972) | Reder (1982) | Zimny (1987) | Schustack & Anderson (1979) | Bower, Black, & Turner (1979) |
|---|---|---|---|---|---|---|
| Latency Scale ($F$) | 0.30 s | As Anderson (1974) | As Anderson (1974) | As Anderson (1974) | As Anderson (1974) | As Anderson (1974) |
| Time to Read a Word | 0.10 s | As Anderson (1974) | As Anderson (1974) | As Anderson (1974) | As Anderson (1974) | As Anderson (1974) |
| Intercept | 0.65 s | Not used | 0.85 s | As Reder (1982) | As Reder (1982) | As Reder (1982) |
| Utility Noise ($t$) | 0.05 s | Not used | As Anderson (1974) | As Anderson (1974) | Not used | Not used |
| Activation Noise ($s$) | Not used | 0.2 | As Anderson (1972) | As Anderson (1972) | As Anderson (1972) | As Anderson (1972) |
| Retrieval Threshold ($\tau$) | Not used | 0.3 | As Anderson (1972) | As Anderson (1972) | −0.05 | As Schustack & Anderson (1979) |
| Slip Probability | Not used | Not used | 0.12 | 0.24 | 0.125 | As Schustack & Anderson (1979) |
| Goal Value ($G$) | Not used | Not used | 34 | 10.5 | Not used | Not used |
| Guess Latency | Not used | Not used | 0.80 s | As Reder (1982) | As Reder (1982) | As Reder (1982) |
| Model Unique | | p(ref) = .20, .39; $A$ = 0.25 | p(Plausible) = .90; p(guess) = .06 | | | Plaus rated 3.5; Seen rated 6.0 |
| Number of experiment-specific parameters | 4 | 5 | 5 | 2 | 3 | 3 |
| Latency $R^2$ | .991 | - | .954 | - | - | - |
| Accuracy $R^2$ | - | .992 | .859 | .923 | .999 | .995 |

Anderson (1974): Surface versus Propositional Representations

Anderson (1974) reported an experiment in which participants studied sentences either in the active voice or passive voice and then had to judge whether active or passive test probes were implied by these sentences. The foils switched the roles of the agent and object. Thus, the original study sentence might be either The-sailor shot the-painter or The-painter was shot by the-sailor and the participants would later be asked to judge whether a test probe followed from the studied sentence. For either of the sentences a true sentence would be either that sentence or the other form. For either of the sentences foils could be either active or passive, as in The-painter shot the-sailor or The-sailor was shot by the-painter.

Thus, the trials could be classified by the voice of the study sentence (active or passive), the voice of the probe sentence (active or passive), and whether the probe sentence was a target (true) or a foil. Participants were tested either immediately after reading the study sentence or at a 2-min delay. Figure 6 displays the results from these two conditions. The positive judgments in the immediate condition show a strong interaction between the voice of the studied sentence and the probe sentence, with participants much faster for targets for which the voices match. The data at a delay are quite different and show a large effect of the voice of the test sentence, with participants taking longer for passives.

At the time this experiment was published, these data were taken as evidence for more rapid forgetting of the surface form of the sentence than of the propositional form. The analysis was basically as follows: Immediately, participants had access to a surface trace and made their judgements on the basis of that, producing a rapid response when there was an exact match of form. This surface trace decayed with delay and the participant was left with the propositional trace that did not encode the voice of the studied sentence. There was a large effect of the voice of the probe sentence at delay because participants had to comprehend the sentence to match propositional traces and passives take longer to comprehend (compare Figs. 4 and 5). The ACT-R model fit to the data in Fig. 6 largely reproduces the account in Anderson (1974) but it does not assume a differential forgetting of the two traces. Still it does a good job of fitting the effect of delay because of the differential complexity of surface and proposition traces (see Fig. 2).

FIG. 6. Results from Anderson (1974) and ACT-R predictions in bold lines.

Figure 7 is a schematic representation of the model we implemented, which is essentially the model described in the original Anderson (1974) article. Figure 7 also gives the range of times for each step, which vary with delay and voice of the sentences. The actual ACT-R model can be accessed at the "Published Models" link at the ACT-R website. Here we just review its basic logic. The model chooses between a verbatim and propositional strategy. If it chooses the verbatim strategy it never parses the probe sentence but rather immediately retrieves a surface trace from memory that contains the first noun phrase of the probe sentence. Then it checks to see whether the retrieved sentence and the probe sentence match on first noun phrase, verb auxiliary, and verb. As in Anderson (1974), it is assumed that the participant never reads the second noun phrase, as all probes in the experiment can be judged without the second noun. In fact, the model in Fig. 7 only checks for verb auxiliary and does not read the main verb if there is an auxiliary. The model starts out with a response index set to yes and switches it should the subjects mismatch or the verb auxiliaries mismatch. When judging a passive transformation of an active studied sentence or vice versa, both subject and verb auxiliary will mismatch and the response index will be switched twice, from yes to no and back to yes. Such sentences take longer to judge, not because of this response switching per se, but because of the more complex processes of retrieving the target sentence. The noun used to retrieve the sentence in step 2 will be the first noun in the probe but the second noun in the retrieved sentence. When the participant has to retrieve the subject of the memorized sentence in step 3 this will be different than the noun retrieved in step 2 and so there is not a benefit of a recent retrieval.³
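The response-index logic of the verbatim strategy can be illustrated with a minimal Python sketch. It covers only the yes/no switching described above, not the retrievals or their timing, and the function name and the reduction of each sentence to a (first noun, is-passive) pair are our own simplifying assumptions.

```python
def verbatim_judgment(probe_subject, probe_is_passive,
                      studied_subject, studied_is_passive):
    """Return True ("yes") if the probe is judged to follow from the studied sentence.

    Hypothetical simplification: a sentence is reduced to its first noun
    and whether it carries a passive auxiliary.
    """
    response = True  # the response index starts at "yes"
    if probe_subject != studied_subject:
        response = not response  # switch on a subject mismatch
    if probe_is_passive != studied_is_passive:
        response = not response  # switch on a verb-auxiliary mismatch
    return response


# Studied: "The-sailor shot the-painter" (first noun the-sailor, active).
print(verbatim_judgment("the-painter", True, "the-sailor", False))   # passive target
print(verbatim_judgment("the-painter", False, "the-sailor", False))  # active foil
```

A passive target mismatches on both checks, so the index is switched twice and ends at yes; a foil mismatches on exactly one check and ends at no.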

If the participants adopt the propositional strategy they must first comprehend the probe sentence and this comprehension will show a large effect of whether the sentence is active or passive. Having done this, the probe proposition can be more economically matched to the memory representation. In all, four chunks must be retrieved from the propositional representation to complete the matching: one to first retrieve the proposition and three to match the agent, verb, and object (these are the chunks encoding links in Fig. 2). In contrast, seven to nine chunks need to be retrieved from the verbatim representation: two to four to retrieve the sentence (depending on whether the studied sentence was active or passive) and five to match the subject and verb auxiliary. This reflects the differential complexity of the surface versus propositional representations in Fig. 2. For every chunk in the propositional representation there are two or more chunks that need to be retrieved in the verbatim representation. Moreover, there are fewer cues for retrieving the chunk in the case of the verbatim representation. In checking that the elements of the retrieved proposition match the probe proposition, each chunk can be cued with both the retrieved proposition and the concept (e.g., Proposition-4 and *Waiter* in Fig. 2b). In contrast, there is only one cue available for each retrieval in checking the verbatim representation because of the extra intervening layer of syntactic phrase structure (NP1, VP1, V′1, and NP2 in Fig. 2a). In summary, there are fewer retrievals in the case of the propositional representation and more sources of activation [$j$'s in the Activation Eq. (1)] to guide these retrievals. Therefore, participants are faster at retrieving the propositional structure.

³ For example, when using the verbatim strategy, if the probe sentence is "The-sailor shot the-painter," the model looks for any surface representation involving the-sailor (step 2 in Fig. 7). If the studied sentence "The-painter was shot by the-sailor" is retrieved, the-painter will have to be retrieved from this sentence to compare to the-sailor (step 3 in Fig. 7).

FIG. 7. The model derived from Anderson (1974) which describes the processing of the sentences.

Table 2 summarizes the comparison of the verbatim and propositional strategies when run through the simulation described above. The propositional strategy requires an initial parsing but places less demand on memory. The initial parsing takes .80 s for actives and 1.31 s for passives, for an average of 1.05 s. This parsing time does not vary with delay but the matching time does because it involves retrieving more or less active studied information from memory. In the immediate condition, the matching takes an average of 0.67 s. In the delay condition the matching takes an average of 0.94 s. Thus, the effect of delay for the propositional strategy is to increase the retrieval time by 0.27 s. In addition to the parsing and retrieval times there is an "intercept time," which is the time to initially detect the probe and generate a response and is estimated as 0.65 s. These intercept times also apply to the verbatim strategy. The verbatim strategy avoids the 1.05-s parsing cost but has a greater matching cost. The matching costs are 1.01 s in the immediate condition and 2.25 s in the delayed condition.

Putting the component times together (intercept, matching, parsing), the model predicts 1.66 s for the verbatim strategy versus 2.37 s for the propositional strategy in the immediate condition and 2.90 s versus 2.64 s in the delayed condition. These times influence choice between the two strategies through the Conflict Resolution Eq. (5) given above. The different costs in time result in completely different tendencies to select the verbatim strategy: 100% in the immediate condition and 0% in the delayed condition. The reader can confirm these percentages by substituting these times (negatively weighted) into the Conflict Resolution Eq. (5) and using the value of $t$ = .05 s, which is the noise estimate throughout this article.
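The substitution suggested above can be carried out in a short Python sketch. It assumes, following the "negatively weighted" remark, that each strategy's utility is simply the negative of its total time, so that any shared $PG$ term cancels in the two-option form of Conflict Resolution Eq. (5); the variable and function names are our own.

```python
import math

T_NOISE = 0.05    # utility noise t in seconds, as given in the text
INTERCEPT = 0.65  # seconds
PARSING = 1.05    # seconds, propositional strategy only
MATCHING = {      # matching times in seconds, from the text
    "immediate": {"verbatim": 1.01, "propositional": 0.67},
    "delayed": {"verbatim": 2.25, "propositional": 0.94},
}


def total_times(condition):
    """Total predicted time per strategy: intercept + matching (+ parsing)."""
    m = MATCHING[condition]
    return {
        "verbatim": INTERCEPT + m["verbatim"],
        "propositional": INTERCEPT + PARSING + m["propositional"],
    }


def p_verbatim(condition, t=T_NOISE):
    """Two-option softmax, Eq. (5), with utility taken as -time (an assumption)."""
    times = total_times(condition)
    # E_verbatim - E_propositional = T_propositional - T_verbatim
    advantage = times["propositional"] - times["verbatim"]
    return 1.0 / (1.0 + math.exp(-advantage / t))


for condition in ("immediate", "delayed"):
    print(condition, total_times(condition), p_verbatim(condition))
```

Under these assumptions the verbatim strategy is chosen with probability above .999 in the immediate condition and below .01 in the delayed condition, which rounds to the 100% and 0% reported in the text.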

In addition to the $t$ parameter, the other parameters estimated for this experiment were as follows: intercept time = 0.65 s, $F$ parameter in the latency time equation = 0.30 s, and time to read a word = 0.10 s.

Thus, in total there are four parameters and, except for the intercept, they are held constant throughout the article. The intercept and word reading times are reasonable in absolute terms. The $F$ parameter and the expected-gain noise $t$ are both in the ballpark of other estimates in ACT-R modeling (e.g., Anderson & Lebiere, 1998). The overall correlation between theory and data is .996, which compares to the correlation of .976 reported by Anderson (1974) for a model with more parameters.

The good fit of this model derives in large part from the good fit of the model in Anderson (1974), since Fig. 7 is adapted from that article. The substantial parameter reduction reflects the fact that ACT-R was able to unify many things which the other model had to estimate separately, such as probabilities of verbatim strategy in various conditions and changes in processing time with delay. The slightly better fit of the model reflects the fact that this unification captured some subtle trends in the data that were ignored in the original model. The two key elements to the unification that ACT-R provides are the theory of activation decay built into the Base-Level Learning Eq. (2) and the theory of strategy selection built into Conflict Resolution Eq. (5), which determined which branch was followed in Fig. 7.

The basic insight is that the difference between results from using the verbatim and propositional representations is not a consequence of inherent differences in their retention properties. Differences are observed between verbatim and gist information because the verbatim representation encodes each word in the hierarchical parse structure of the sentence while the propositional representation encodes the essence of the sentence (at least for purposes of this experiment) in a more compact (fewer chunks) form. This compactness means fewer and more efficient retrievals. When we look at the experiment of Zimny (1987), which used accuracy measures with longer delays, we also see that the more compact representation means that fewer things can be lost to forgetting.

TABLE 2

Analysis of Strategy Selection in Anderson (1974)

| | Immediate (seconds) | Delayed (seconds) |
|---|---|---|
| Verbatim Strategy | | |
| Matching Time | 1.01 | 2.25 |
| Intercept | 0.65 | 0.65 |
| Total | 1.66 | 2.90 |
| Propositional Strategy | | |
| Parsing | 1.05 | 1.05 |
| Matching Time | 0.67 | 0.94 |
| Intercept | 0.65 | 0.65 |
| Total | 2.37 | 2.64 |
| Difference | −0.71 | 0.26 |
| Probability of Verbatim | 100% | 0% |

Anderson (1972): All-or-None versus Fragmentary Recall

Representational complexity in the previous experiment was measured in terms of the number of chunks it took to encode the propositional representation and the syntactic representation. These representations, with separate chunks for each term, might strike the reader as quite fragmented. For instance, Kintsch (1974) or Anderson (1983) would treat the proposition in Fig. 2b as one unit rather than three separate chunks. Such a fragmented representation implies that we should observe fragmentary sentence recall such that some but not all of the concepts from the proposition might be recalled. There is clearly fragmentary recall of propositional information, as was documented in Anderson (1972). There has been some controversy over the magnitude of this partial recall, with R. C. Anderson (1974) dismissing it as insignificant while others developed special theories to account for it (Jones, 1978). Figure 8 plots the data from Anderson (1972) and illustrates the half-empty, half-full nature of this debate. The figure plots number of concepts recalled from sentences consisting of four (Experiment 1) or five (Experiments 2 and 3) concepts. In the case of four concepts, the sentences were of the form "In the park the hippie touched the debutante." And in the case of five concepts Anderson (1972) used sentences like "In the park the hippie touched the debutante at night."⁴

⁴ In some experiments Anderson (1972) used other five-concept sentences but these were the ones we used in all of the simulations.

If a sentence has $n$ concepts and one concept is used to cue recall of the sentence, there are $2^{n-1}$ possible patterns of recall including all remaining items recalled, no items recalled, and various possibilities of partial recall. The data in Fig. 8 are plotted in terms of the proportion of trials in which various patterns occurred with zero to four concepts recalled. Except in the case of zero items recalled or total recall, there are multiple possible patterns of partial recall. Figure 8a plots the proportion of each possible pattern for a given number of words recalled. Figure 8b plots the total proportion of all patterns for a given number of words recalled. In all of these experiments about 60% of trials resulted in total failure of recall. The real interest lies in the distribution of the remaining data in terms of the probability of a particular pattern of items being recalled as a function of the number of items in the pattern. With the exception of recalling nothing, the event of recalling all elements is much more frequent than any other specific recall pattern (see Fig. 8a); however, there are many possible patterns of partial recall and the total frequency of all of these patterns of partial recall is about double the frequency of perfect recall (see Fig. 8b). The probabilities of partial recall were 24, 26, and 29% in the three experiments while total recall was 12, 10, and 18%. Thus, partial recall is clearly a prominent aspect of recall despite a disproportionate tendency to recall everything.

FIG. 8. Proportion of recall of various sentence patterns from Anderson (1972) and ACT-R predictions. (a) The mean proportion of each pattern with the specified number of words recalled. (b) The total proportion of all patterns with the specified number of words recalled.

Figure 8 also displays the predicted recall patterns by ACT-R according to the Retrieval Probability Eq. (3). Because the surface structure was unlikely to be available at the delays used in these experiments (about 10 min), the model we produced only used the propositional representations like those in Fig. 2b. The model depends both on the propositional encoding and on the referent pointed to by the propositional chunks, but first we discuss what can be achieved by just the propositional encoding. The propositional representation by itself produces a certain all-or-none character in the recall. The probe consists of a single word and, to begin recall, the participant must retrieve the chunk that contains the probe concept. From this chunk, the participant can retrieve the proposition, which is necessary for the recall of the remaining terms. Thus, conditional on retrieval of the chunk encoding the probe, the probability of the various recall patterns satisfies the binomial formula $p^m \times (1 - p)^n$, where $p$ is the probability of recalling a chunk encoding that a term occurred in the proposition, $m$ is the number of other terms recalled, and $n$ is the number not recalled.⁵ However, before any term can be retrieved from the proposition, it is necessary to retrieve the chunk connecting the probe term to the proposition. The probability of retrieving this probe chunk is $p$. Thus, this model predicts that the probability of retrieving $m$ elements and failing on $n$ is:

$$\begin{cases} p \times p^{m}(1 - p)^{n} = p^{m+1}(1 - p)^{n} & \text{if } m > 0 \\ (1 - p) + p(1 - p)^{n} & \text{if } m = 0, \end{cases}$$

where the first $p$ in the first line reflects the retrieval of the probe chunk giving the proposition and the first $1 - p$ in the second line reflects the failure to get to the proposition. Interestingly, Ross and Bower (1981) found that a mathematical model such as the one given above does a good job in predicting recall of unrelated word sets. However, such a model cannot predict the pattern of recall from sentences. It can predict the high frequency of zero elements recalled but not the high frequency of all elements recalled. This model predicts recall patterns that correlate −.135 with the data in Fig. 8a (when we exclude the data points for zero items recalled), in contrast to the .995 correlation exemplified by the ACT-R model that we used.

⁵ Throughout this discussion we derive the predictions for specific patterns of recall (i.e., Fig. 8a) from which the prediction for total frequency (i.e., Fig. 8b) of all patterns can be derived.

The successful ACT-R model involves an important embellishment. It assumes that at study there is a certain probability that participants are able to retrieve a referent for the target sentence. So, given "The hippie touched the debutante in the park," the participant might retrieve an episode from the movie Hair as the referent. If ACT-R can retrieve a chunk that links a probe word to a studied proposition and the chunk contains a pointer to a referent proposition, it can use this proposition to infer what the other terms were (see Fig. 3). Thus, the probability of recalling $m$ and not recalling $n$ in the new model is:

$$\begin{cases} p\,\bigl(R + (1 - R)\,p^{m}\bigr) & \text{if } m = \max \\ (1 - R)\,p^{m+1}(1 - p)^{n} & \text{if } 0 < m < \max \\ (1 - p) + (1 - R)\,p\,(1 - p)^{n} & \text{if } m = 0, \end{cases}$$

where $p$ is the probability of retrieving a chunk encoding that a term is in the studied proposition and $R$ is the probability of finding a referent at study. This implies better recall for the sentence if participants are encouraged to find referents for the sentence. Experiment 3 contained a test of this proposal: Participants were asked to imagine a referent for the sentence and recall was higher in that experiment. As Fig. 8 shows, the major impact of this manipulation is on the frequency with which participants can retrieve all the elements (10% for Experiment 2 vs 18% for Experiment 3).
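The two recall models above can be compared with a brief Python sketch. The function names are hypothetical, and the sketch assumes at least one term besides the probe; setting $R = 0$ recovers the propositional-only model. The values $p$ = .44, $R$ = .20, and $R$ = .39 are the estimates the article reports for these experiments.

```python
from math import comb


def pattern_probability(m, n, p, r=0.0):
    """Probability of one specific pattern: m other terms recalled, n missed.

    p: probability of retrieving a chunk; r: probability that a referent
    was found at study. Assumes m + n >= 1.
    """
    if n == 0:  # every remaining term recalled (m = max)
        return p * (r + (1 - r) * p ** m)
    if m > 0:   # partial recall
        return (1 - r) * p ** (m + 1) * (1 - p) ** n
    return (1 - p) + (1 - r) * p * (1 - p) ** n  # nothing recalled


def totals_by_count(num_other_terms, p, r=0.0):
    """Summed probability over all patterns with m terms recalled, m = 0..N."""
    big_n = num_other_terms
    return [comb(big_n, m) * pattern_probability(m, big_n - m, p, r)
            for m in range(big_n + 1)]


if __name__ == "__main__":
    settings = [("Exp. 1", 3, 0.20), ("Exp. 2", 4, 0.20), ("Exp. 3", 4, 0.39)]
    for label, n_terms, r in settings:
        with_referent = totals_by_count(n_terms, p=0.44, r=r)
        without = totals_by_count(n_terms, p=0.44)
        print(label, round(with_referent[-1], 3), round(without[-1], 3))
```

With these estimates the sketch gives total-recall probabilities of about .12, .10, and .18 for the three experiments, in line with the 12, 10, and 18% quoted in the text, whereas with $R = 0$ total recall stays below .04. In both models the totals over all recall counts sum to 1.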

Three parameters were estimated to fit the model. There was a probability, $R$, of finding a referent estimated at .20 for the nonimagery Experiments 1 and 2 and at .39 for the imagery Experiment 3. There is $p$, the probability of retrieving a studied chunk, which was estimated at .44. However, this probability cannot be directly set in ACT-R but results from the setting of three other parameters: the activation of the chunk ($A$), the threshold ($\tau$), and the activation noise ($s$) according to Retrieval Probability Eq. (3). Based on prior models (e.g., Lebiere, 1998) we set $s$ = 0.20. We chose $\tau$ to be 0.3, consistent with the model for the next experiment (by Reder, 1982) that we model. To get a retrieval probability of .44 we estimated $A$ to be .25, just under the threshold.
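As a quick check of this arithmetic, the quoted values can be substituted into Retrieval Probability Eq. (3). The Python snippet below is only a worked example, with a function name of our own choosing.

```python
import math


def retrieval_probability(a, tau, s):
    """Retrieval Probability Eq. (3)."""
    return 1.0 / (1.0 + math.exp(-(a - tau) / s))


# A = .25, tau = 0.3, s = 0.20 are the values quoted in the text.
p = retrieval_probability(a=0.25, tau=0.3, s=0.20)
print(round(p, 2))  # prints 0.44
```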

In addition to providing an excellent fit, this model provides an interesting perspective on sentence memory and all-or-none recall. In the model, perfect recall depends on finding a referent for the sentence in past experience, not on any inherent "Gestalt" properties of a proposition. One consequence of using a referent is that participants may not always recall the same words but rather similar-meaning words. For instance, while "park" may be in the sentence it might really be a "forest" in the referent and so "forest" will be recalled. R. C. Anderson (1974) reports that about 20% of all words recalled are not the actual words studied but rather are semantically related to the studied words. Graesser (1978) similarly reports that intrusions (which are a minority of the errors, the majority being omissions) tend to be semantically related.

Reder (1982): Retrieval versus Inference

There are two ways that one can decide that a sentence about a story is true if one has established a referent for the whole story. One is to try to directly retrieve it (its surface encoding or its propositional encoding). The other is to infer the sentence from other sentences that can be recalled. Thus, even if we cannot directly recall that Bob ate the meal, if he went to a restaurant, ordered a meal, and paid the bill we might be willing to infer that the meal was consumed. Reder (1982) has referred to such a judgement as a "plausibility judgement" and noted that in most real-life situations people are asked to judge what they believe to be true and not to judge what was literally stated. Other researchers (e.g., Graesser & Zwaan, 1995; Kintsch, 1998) have taken such inferences as indicating the creation of a situation model, which involves embellishing the stated material with a mental representation of the situation implied by the material. A significant issue in the literature on text memory is how many of these inferences are made during normal reading of the text and how many are made only when tested. Because of the architectural commitments of ACT-R, we are committed to the position that few inferences can be made at study if study occurs at normal reading or listening rates. In our model, those few inferences generated during reading involved adding a pointer from the chunks encoding the proposition to a past referent. This referent link enables inferences at the time of test.

We first test the ACT-R model of such inferences with Reder’s (1982) experiments. These experiments looked at the transition from retrieval-based judgments to plausibility-based judgments over time. In her task, participants read stories and then had to judge either whether sentences were explicitly presented as part of the story (in the recognition condition) or whether they were plausible (in the plausibility condition). Reder’s stories consisted of complex, free-form sentences. To simplify the syntactic processing, we presented ACT-R with stories consisting of subject-verb-object sentences like “Bob entered the-restaurant,” “Bob ordered the-meal,” “The-waiter delivered the-meal,” and “Bob ate the-meal.” Then ACT-R was tested either with sentences it had studied, like “Bob entered the-restaurant,” or sentences which were consistent with the script, like “Bob left the-restaurant,” or in the plausibility condition with sentences that did not fit the script, like “Bob delivered the-meal.”
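The difference between the two judgments can be made concrete with a small sketch. It is not the ACT-R model: retrieval here is deterministic, with no decay, latency, or noise, and the script contents, the role bindings in `ROLES`, and every function name are hypothetical choices made only for illustration. It shows each studied sentence stored with a pointer to a role-level script proposition, so that recognition consults the studied sentences while plausibility consults the script.

```python
# Hypothetical restaurant script: role-level propositions (agent role, verb, object).
RESTAURANT_SCRIPT = {
    ("customer", "entered", "the-restaurant"),
    ("customer", "ordered", "the-meal"),
    ("waiter", "delivered", "the-meal"),
    ("customer", "ate", "the-meal"),
    ("customer", "left", "the-restaurant"),
}

# Assumed binding of story characters to script roles (not given in the paper).
ROLES = {"Bob": "customer", "The-waiter": "waiter"}

# The four studied sentences quoted in the text, as subject-verb-object triples.
STUDIED = [
    ("Bob", "entered", "the-restaurant"),
    ("Bob", "ordered", "the-meal"),
    ("The-waiter", "delivered", "the-meal"),
    ("Bob", "ate", "the-meal"),
]


def referent_of(sentence):
    """Map a subject-verb-object sentence onto a role-level proposition."""
    subject, verb, obj = sentence
    return (ROLES.get(subject, subject), verb, obj)


def study(sentences):
    """Store each sentence with a pointer to its script referent, if one exists."""
    memory = {}
    for sentence in sentences:
        referent = referent_of(sentence)
        memory[sentence] = referent if referent in RESTAURANT_SCRIPT else None
    return memory


MEMORY = study(STUDIED)


def recognized(probe):
    """Recognition condition: was this exact sentence studied?"""
    return probe in MEMORY


def plausible(probe):
    """Plausibility condition: is the probe's referent part of the script?"""
    return referent_of(probe) in RESTAURANT_SCRIPT
```

With these definitions, “Bob left the-restaurant” is not recognized but is judged plausible, and “Bob delivered the-meal” is rejected under both judgments, which is the pattern of targets and foils described above.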

Participants were tested either immediately after reading the story (which Reder interpreted as a 120-s delay), after 20 min, or after 2 days. Figure 9 displays the latencies for the old (studied) sentences (which were targets in both the recognition and the plausibility condition), for plausible new sentences (which were foils in the recognition condition and targets in the plausibility condition), and for implausible sentences (which were foils in the plausibility condition).⁶

⁶ The data in Figs. 9 and 10 are the average of Reder’s two experiments.

With longer delays between reading the story and test, participants showed large increases in latencies in the recognition condition but a decrease in latencies in the plausibility condition. Figure 10 displays the error data, which show a large increase in error rates for recognition judgments and relatively constant error rates for plausibility judgments.

FIG. 9. Latency data from Reder (1982) and ACT-R predictions. Data are plotted for the two types of judgments (Recog vs Plaus) and type of sentence (Old, New, and Implausible).

FIG. 10. Error rates from Reder (1982) and ACT-R predictions. Data are plotted for the two types of judgments (Recog vs Plaus) and type of sentence (Old, New, and Implausible).

The ACT-R model for this experiment is a simplified version of the model offered in Reder (1982). Reder’s model assumed that participants could judge sentences by either a retrieval strategy or an inference strategy. The retrieval strategy in ACT-R was implemented by the same recognition model (see Fig. 7) that we used for modeling Anderson (1974). The inference strategy involved retrieving the referent of the story (in the preceding example this would be a proposition in the restaurant script) and seeing if the test proposition was stored in the same script. In the plausibility condition the model either (1) tried retrieval first and only switched to inference if it could not retrieve the sentence; or (2) tried the inference strategy first, in which case it just omitted retrieval. The inference strategy is faster because of the strong encoding of the referent propositions but is somewhat less accurate because some studied sentences might not be judged as plausible (because they are not stored as part of the script for that participant) but could be retrieved. Reder (1982) also assumed that participants mixed strategies in the recognition condition; however, for simplicity the ACT-R model always tried retrieval in this condition and never plausibility.

In modeling the effect of delay we assumed that the immediate condition represented a 120-s delay, the 20-min delay condition 1200 s, and the 2-day delay 5000 s. The 2-day delay value is taken from other research (e.g., Anderson, Fincham, & Douglass, 1999; McBride & Dosher, 1997) showing that decay dramatically slows after the experimental session is over and can be modeled by a slowing of the clock. The 5000-s estimate is based on Anderson, Fincham, and Douglass, who showed that each day after the experimental session is approximately equivalent to half an hour in the experiment. This may reflect the decrease in interference when the participant leaves the context of the experiment.

ACT-R allows us to model how participants will shift between strategies in the plausibility condition. Table 3 presents an analysis of the relative utilities of the two strategies at various delays. As can be seen, at all delays the inference strategy has a latency advantage. This is because the participants avoid searching for sentences, a search which will be futile for the three-fourths of the probes that do not involve studied sentences. This advantage slightly increases with delay. The retrieval strategy has a slight accuracy advantage for judging plausibility on those trials involving a studied sentence because sometimes participants did not judge nonpresented plausible sentences as plausible. We estimated that only 90% of these sentences would be judged plausible by the plausibility strategy.⁷ On the other hand, every retrieved sentence is judged plausible. The accuracy advantage for retrieval is small because there is only a 10% advantage for only the one-quarter of the probes that had been studied, and this only occurs if the studied sentence can be retrieved. This advantage reduces with time because a smaller proportion of the studied sentences can be retrieved. Thus, the retrieval strategy has an advantage in terms of probability (P) of a correct answer, while the inference strategy has an advantage in terms of the time (C) to produce an answer. As described with respect to Conflict Resolution Eq. (5), these factors are combined into a net utility that is calculated as PG − C. The value estimated for G is 34.⁸ Table 3 also shows the differences in utility, which lead to the differential choice of strategies according to the Conflict Resolution Equation (5) with the t parameter estimated at .05 as in the model for Anderson (1974). These probabilities are given in the final line of Table 3.

⁷ In an immediate test Reder found that participants are 10% more likely to judge a sentence as plausible if it has been presented.

TABLE 3

Analysis of Strategy Selection in the Plausibility Condition of Reder (1982)

                              120 s    20 min    2 days

Retrieval Strategy
  Accuracy (P)                 .861      .861      .855
  Mean Time (C)                2.89      3.02      3.07
  Utility (PG − C)            26.38     26.25     26.01

Inference Strategy
  Accuracy (P)                 .842      .842      .842
  Mean Time (C)                2.31      2.36      2.44
  Utility (PG − C)            26.32     26.27     26.19

Difference in Utility          0.06     −0.03     −0.18
Probability of Retrieval        .78       .37       .03

Note. G = 34.
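The arithmetic behind Table 3 can be checked with a short sketch. It assumes that Conflict Resolution Eq. (5), which is not reproduced in this section, has the usual ACT-R form in which a strategy with utility U_i is chosen with probability exp(U_i/t) divided by the sum of exp(U_j/t) over strategies. The accuracy and time values are copied from Table 3; the function names are illustrative only.

```python
import math

G = 34.0   # value of the goal, from the note to Table 3
T = 0.05   # noise parameter t of the Conflict Resolution Equation

# (accuracy P, mean time C in seconds) for each strategy, copied from Table 3
TABLE_3 = {
    "120 s":  {"retrieval": (0.861, 2.89), "inference": (0.842, 2.31)},
    "20 min": {"retrieval": (0.861, 3.02), "inference": (0.842, 2.36)},
    "2 days": {"retrieval": (0.855, 3.07), "inference": (0.842, 2.44)},
}


def net_utility(p, c, g=G):
    """Net utility PG - C of a strategy."""
    return p * g - c


def choice_probabilities(utilities, t=T):
    """Assumed form of Eq. (5): exp(U_i / t) / sum_j exp(U_j / t)."""
    top = max(utilities.values())  # subtract the maximum for numerical stability
    weights = {name: math.exp((u - top) / t) for name, u in utilities.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


def strategy_mix(delay):
    """Return (utilities, choice probabilities) for one delay condition."""
    utilities = {name: net_utility(p, c) for name, (p, c) in TABLE_3[delay].items()}
    return utilities, choice_probabilities(utilities)
```

Calling `strategy_mix("120 s")` returns utilities that agree with the tabled 26.38 and 26.32 to within rounding. Because t is small relative to the utility differences, the choice probabilities are very sensitive to the second decimal place of the utilities, so probabilities computed from the rounded table entries only approximate the final line of Table 3. They do reproduce the qualitative shift from mostly retrieval at 120 s to almost exclusively inference at 2 days.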


The attempt to judge a sentence can end in one of three ways: ACT-R is unable to retrieve any proposition (studied or script), ACT-R can retrieve a proposition that mismatches the probe sentence, or ACT-R can retrieve a proposition that matches. If it matches, the ACT-R model responds yes; if it mismatches it responds no; if no proposition is retrieved the model guesses between yes and no with equal likelihood. We estimated that participants took .8 s to make that guess but we did not model these guessing processes. We also estimated a .85-s intercept time, which is .2 s longer than in the model for Anderson (1974). This extra time probably reflects the extra time to comprehend the more complex sentences that Reder used. We also fit Reder’s error data and to do this we had to estimate a probability of making a slip and giving the unintended response, which we estimated to be .12. We achieved a correlation of .977 for latency and .927 for error rates with 5 parameters estimated (see Table 1). These are comparable to the fits reported in Reder (1982), who estimated 20 parameters but also fit other aspects of the data we did not. The two parsimonies achieved by the ACT-R model are that it does not need to estimate separate latency and accuracy parameters for the different delays and it does not have to estimate separate probabilities of strategy selection for the different delays.
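The response rule just described is simple enough to state as code. The sketch below is an illustration rather than the ACT-R implementation: it takes the outcome of retrieval as given, represents propositions as arbitrary comparable values, and assumes (the text does not say) that a slip can also reverse a guessed response. The `rng` argument is a hypothetical hook that makes the random draws replaceable.

```python
import random

SLIP_PROBABILITY = 0.12  # slip rate estimated for the Reder (1982) model


def judge(retrieved, probe, rng=random):
    """Return True for a "yes" response and False for a "no" response.

    retrieved: the proposition retrieved from memory, or None if retrieval failed.
    probe:     the proposition expressed by the test sentence.
    rng:       any object with a random() method returning a float in [0, 1).
    """
    if retrieved is None:
        # Nothing retrieved: guess yes or no with equal likelihood.
        intended = rng.random() < 0.5
    else:
        # Match -> yes, mismatch -> no.
        intended = (retrieved == probe)
    # With probability .12 the unintended response is given (assumed to apply
    # to guesses as well as to retrieval-based responses).
    if rng.random() < SLIP_PROBABILITY:
        return not intended
    return intended
```

Under this rule the probability of a “yes” is .88 after a matching retrieval, .12 after a mismatching retrieval, and .50 when nothing is retrieved, since a slip applied to a fair guess leaves it fair.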

The basic insight of this simulation is that we can achieve the inferential capacities associated with situation models by simply storing a pointer to an existing knowledge structure. The previous simulation of Anderson (1972) had shown that this can also serve as the basis for the all-or-none character of recall. The subsequent simulations will show how this mechanism can produce some of the other effects associated with inferential memory. This situational or script information is better retained than the studied propositional information because it has received more practice in the past and not because of different retentive properties. We claim that equivalent practice would convey the same retentiveness on the studied propositions.

⁸ G was not estimated in the model (Table 2) for Anderson (1974) because accuracy remained at ceiling over the short period of that experiment and so did not differ between the two strategies.
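The claim that practice, rather than a different retention function, protects the referent can be illustrated numerically. The sketch assumes the standard ACT-R base-level learning equation, in which a chunk's activation is the log of the sum of t^(−d) over the times t since each of its past uses; that equation is not reproduced in this section. The practice histories are invented for illustration: one use per day for 500 days for the referent, and a single encoding for the studied proposition. The test delays of 120 s and 5000 s are the effective delays used for the Reder (1982) simulation.

```python
import math

D = 0.5        # decay parameter d, the value the paper reports for all simulations
DAY = 86_400.0  # seconds in a day

# Hypothetical history (not from the paper): the referent proposition was used
# once a day on each of the 500 days preceding the experiment.
REFERENT_USES_BEFORE_STUDY = [k * DAY for k in range(1, 501)]


def base_level_activation(ages, d=D):
    """Log of the summed power-law-decayed strengths of past uses.

    ages: seconds elapsed since each past use of the chunk (all positive).
    """
    return math.log(sum(age ** -d for age in ages))


def studied_activation(delay):
    """A proposition encoded exactly once, `delay` seconds before the test."""
    return base_level_activation([delay])


def referent_activation(delay):
    """The well-practiced referent, tested `delay` seconds after study."""
    return base_level_activation([age + delay for age in REFERENT_USES_BEFORE_STUDY])


def loss(activation, start=120.0, end=5000.0):
    """Drop in activation between the shortest and longest effective delays."""
    return activation(start) - activation(end)
```

With the same d for both chunks, `loss(studied_activation)` is close to two units of activation whereas `loss(referent_activation)` is a small fraction of a unit, because the referent's uses are already old and numerous. Giving the studied proposition the same history would give it the same retention, which is the point made in the text.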

The standard assumption in the literature has been that participants will use the most specific representations if available and only use the more inferential if the others are not available. However, the ACT-R model, like Reder (1982), makes the choice among representations strategic. Participants will tend to use whichever representation has the highest net utility. Reder (1987) showed that participants’ choice between the retrieval and inference strategies will change depending on which strategy has been locally successful.

Zimny (1987): Surface versus Propositional versus Situational Information

Zimny (1987; reported in Kintsch et al., 1990, who also report a CI model for the experiment) conducted an experiment that had considerable similarity to that of Reder (1982; also Reder, 1976, 1979) but which focused on accuracy of judgments rather than latency. Zimny looked at sentence memory just after reading a story, 40 min after studying the story, 2 days after, or 4 days after. Participants were presented with verbatim sentences, paraphrases (which were identical propositionally to the studied sentences), inferences, or novel unrelated sentences. Unlike the judgments in Reder (1982), Zimny’s participants were asked to discriminate verbatim sentences from all other sentences including paraphrases. Figure 11 shows the proportion accepted from the four categories of probe sentences as a function of delay. Participants more rapidly lose the ability to discriminate verbatim sentences from paraphrases than they lose the ability to discriminate between studied propositions and inferences. We decided to adapt the two-strategy model that we used for Reder (1982) to make the verbatim judgments in this experiment. We assumed that participants were selecting among the following strategies.

1. Retrieval strategy: Try to retrieve a verbatim trace (e.g., Fig. 2a) to match the sentence. Only if this fails go on to retrieve a propositional trace (e.g., Fig. 2b). If no such trace can be retrieved assume the sentence was not studied. This strategy will reject inferences and unrelated sentences since there are no traces of these sentences. It will reject paraphrases if either a mismatching verbatim trace can be retrieved or the propositional trace cannot be retrieved. It will reject verbatim sentences only if neither the verbatim nor the propositional trace can be retrieved.

2. Inference strategy: Simply determine if the sentence is part of the script. This strategy will accept all sentences except novel unrelated sentences.

We estimate that the shortest delay was 60 s. At this delay, the retrieval strategy will enjoy greater success in discriminating verbatim sentences (which is the participants’ task) but will also take longer to execute since the chunks formed to encode the study sentence are weaker than the referent chunks. As time passes, however, the accuracy advantage of the retrieval strategy disappears as memories decay and their latency cost increases, just as the retrieval strategy lost relative to the inference strategy in the simulation of Reder (1982). Table 4 presents an analysis of the relative utility of these strategies comparable to Table 3 for Reder (1982). The value of G estimated in this experiment was 10.5. The fact that it is lower than the G from Reder (1982), which was 34, is interpreted as Zimny’s participants placing less emphasis on verbatim accuracy than did Reder’s participants on the accuracy of their plausibility judgments.⁹ These net utilities can be converted into probability of choice through the Conflict Resolution Equation using the same value of the noise parameter t of .05 that was used in the earlier models. We also estimated a probability .24 of slipping and producing the wrong response. The overall correlation with the data is .956.

⁹ The value of G is really being constrained to produce a 50% strategy mix at the 40-min delay.

As with the Reder model, this model illustrates how participants’ choice among strategies is determined by the relative availability of the memory structures. The verbatim structure is the most fragile because it is the most complex and the situation referent is most permanent because it has been well practiced before the experiment. There are no inherent differences in the traces set down in the experiment. It is interesting to note in Fig. 11 that, according to the theory, even acceptance of inferences should start dropping after 4 days. This trend is only slightly apparent in the data but eventually this would happen as participants come to completely forget the stories that they have studied and so forget the connections of the story to the referent. Also note that initially the model accepts few inferences and a reduced number of paraphrases. This is because initially the model is predominantly using the verbatim strategy, which rejects paraphrases and inferences. This initial blocking of intrusions by the verbatim trace is similar to the proposal of Brainerd, Reyna, and Kneer (1995), who find that a verbatim trace can block false alarms. They also find that this effect decreases with delay.

FIG. 11. Results from Zimny (1987) and ACT-R predictions. Data are plotted for the two types of judgments (Recog vs Plaus) and type of sentence (Old, New, and Implausible).
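The acceptance rules of the two strategies assumed for the Zimny task (the retrieval strategy and the inference strategy) can be written out as a pair of predicates. This is a sketch of the decision logic only: whether each trace is retrievable is passed in as a boolean instead of being derived from activation, slips are omitted, and the function and argument names are hypothetical. For a paraphrase probe, `verbatim_retrieved` means that the verbatim trace of the corresponding studied sentence was retrieved, which necessarily mismatches the paraphrase.

```python
PROBE_TYPES = ("verbatim", "paraphrase", "inference", "unrelated")


def retrieval_strategy(probe_type, verbatim_retrieved, propositional_retrieved):
    """Accept (True) or reject (False) a probe as a verbatim studied sentence."""
    if probe_type not in PROBE_TYPES:
        raise ValueError(f"unknown probe type: {probe_type!r}")
    if probe_type in ("inference", "unrelated"):
        # No trace of these sentences was stored, so nothing can be retrieved.
        return False
    if probe_type == "paraphrase":
        # Rejected if a mismatching verbatim trace is retrieved,
        # or if the propositional trace cannot be retrieved.
        return (not verbatim_retrieved) and propositional_retrieved
    # Verbatim probe: rejected only if neither trace can be retrieved.
    return verbatim_retrieved or propositional_retrieved


def inference_strategy(probe_type):
    """Accept every probe that is part of the script, i.e. all but unrelated ones."""
    if probe_type not in PROBE_TYPES:
        raise ValueError(f"unknown probe type: {probe_type!r}")
    return probe_type != "unrelated"
```

Written this way, it is easy to see that when the verbatim trace is unavailable the retrieval strategy accepts a verbatim probe and its paraphrase under exactly the same condition, namely that the propositional trace is retrieved.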

After reading an earlier draft of this article, Charles Brainerd asked us to consider whether this model predicts the pattern of dependencies reported in an extensive series of sentence memory studies of children and adults (Reyna & Kiernan, 1994, 1995; Kiernan, 1993; Lim, 1993). Those experiments asked participants to try to discriminate among verbatim sentences, paraphrases, and inferences just as in the Zimny experiment. Of interest was how performance varied between immediate recall and delayed recall (often a week later). On immediate memory tests acceptance rates for verbatim sentences were stochastically independent of acceptance rates for paraphrases and inferences but the acceptance rates for paraphrases and inferences were positively correlated. On the delayed test, acceptance rates for all three types of sentences were stochastically dependent.

We examined the issue of stochastic independence in the Zimny simulation and how the predictions of the ACT-R model would depend on the strategy. In the case of the retrieval strategy, the model produces a dependence between the acceptance of verbatim sentences and paraphrases because both will be accepted if there is a propositional trace and no verbatim trace to reject the paraphrase. This means that in the absence of a verbatim trace either both will be accepted or neither will. However, in the immediate condition of the Zimny experiment, since the propositional trace is almost always present, this source of covariation is removed. In the immediate condition, verbatim sentences are rejected only if the participant slips and slips are random events, uncorrelated with anything else.

The inference strategy produces a dependence between the recall of all three types of sentences because they depend on finding a successful referent. We assumed in our model of the Zimny data that participants always succeeded in finding a referent at study but to the extent that they did not, there would be stochastic dependence. Since participants only adopt an inference strategy at delay this predicts the observed stochastic dependencies at delay. In summary, the ACT-R model seems generally consistent with the reported patterns of stochastic dependencies. It produces dependencies between all types of sentences except for verbatim sentences in the immediate condition whose acceptance rates are at a maximum.

TABLE 4

Analysis of Strategy Selection in Zimny (1987)

                          Immediate (60 s)    40 min    2 days    4 days

Verbatim Strategy
  Accuracy (P)                  .90             .79       .60       .51
  Time (C)                     2.71            3.04      3.21      3.21
  Utility (PG − C)             4.75            3.78      2.56      2.11

Inference Strategy
  Accuracy (P)                  .67             .67       .67       .67
  Time (C)                     2.27            2.42      2.51      2.60
  Utility (PG − C)             3.90            3.78      3.70      3.57

Difference in Utility          0.85            0.00     −1.14     −1.46
Probability of Verbatim        1.00            0.49      0.00      0.00

Note. G = 10.5.

Schustack and Anderson (1979): Sentences with Referents versus Sentences without Referents

As seen in the previous models, ACT-R can produce inferential recall simply by adding a pointer from chunks encoding the studied proposition to an existing proposition in a referent context such as a script. There is no attempt to copy over the structures from the referent to add explicit inferences to the sentence or story representation. As we saw in the model for Anderson (1972), this can improve memory because one can use the referent proposition to recall the sentence. However, the referent pointer also creates the potential for just guessing any proposition in the referent even if it is not pointed to by a chunk from the memory experiment. Anderson (1972) did not use sentences with known referents and thus guessing could not be assessed. We now consider two studies that explicitly manipulated the availability of known referents.

The experimental literature is not consistent on whether memory is enhanced for referent-consistent material. The best way to assess this issue is with a recognition memory paradigm in which participants are tested with referent-consistent sentences that came from the story and referent-consistent sentences that did not. Improved memory would be reflected in greater discriminability, poorer memory in worse discriminability, and a “guessing bias” in the form of a greater tendency to accept referent-consistent sentences, whether they occurred or not. We describe below an experiment by Bower, Black, and Turner (1979) that can be interpreted as showing poorer discriminability and bias. However, first we describe an experiment by Schustack and Anderson (1979) that can be interpreted as showing increased discriminability as well as increased bias.

Schustack and Anderson (in an elaboration of Sulin & Dooling, 1974) had participants study stories about fictional figures that had parallels to well-known public figures. Thus, they might be told that Yoshida Ichiro was a Japanese politician of the 20th century who was “responsible for intensifying his country’s involvement in a foreign conflict” and other such facts consistent with the American president Lyndon Johnson.¹⁰ In the experimental condition participants were told about the parallel and were reminded at test. They were asked to identify sentences which they had studied. They were tested with sentences that they had studied and that were true of the parallel as well as sentences that they had not studied and that were true of the parallel. Participants achieved 87.9% hits on the targets while showing only 17.9% false alarms to related foils. In one control condition they were not informed about a parallel at study or test and achieved 67.3% hits and 13.6% false alarms. Perhaps a better control was one in which they were given the name of a nonanalogous public figure at study and test; here they achieved 71.6% hits and 12.6% false alarms. In terms of d′ and bias measures, participants who studied and judged the sentences with a referent had d′ values of 2.09 in the experimental condition versus 1.55 and 1.72 in the two control conditions. In terms of bias, the value of β was .77 for the experimental condition versus 1.67 and 1.63 in the control conditions (where values less than 1 indicate a tendency to say “yes” while values greater than 1 indicate a tendency to say “no”). Figure 12 graphically represents these data, averaging together the two control conditions (which is referred to as no-referent). Thus, participants were better when they had an appropriate referent. Another experiment also established that they had to have the referent given both at study and at test to enjoy this benefit.

¹⁰ Note that these analogies are not scripts in the Schank and Abelson sense but reflect the more general sense of referents in our model.
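The d′ and β values quoted above follow from the hit and false-alarm rates under the usual equal-variance Gaussian signal detection model. The sketch below assumes that model, which the text does not spell out, and uses only the Python standard library. Here d′ is z(hits) − z(false alarms), and β is the likelihood ratio at the criterion, exp((z_FA² − z_H²)/2).

```python
from math import exp
from statistics import NormalDist


def d_prime_and_beta(hit_rate, false_alarm_rate):
    """Equal-variance signal detection measures from hit and false-alarm rates.

    Both rates must lie strictly between 0 and 1.
    """
    z = NormalDist().inv_cdf            # inverse of the standard normal CDF
    z_hit = z(hit_rate)
    z_fa = z(false_alarm_rate)
    d_prime = z_hit - z_fa              # discriminability
    beta = exp((z_fa ** 2 - z_hit ** 2) / 2)  # likelihood ratio at the criterion
    return d_prime, beta


# Hit and false-alarm rates reported for Schustack and Anderson (1979)
CONDITIONS = {
    "experimental (referent)": (0.879, 0.179),
    "control, no parallel": (0.673, 0.136),
    "control, nonanalogous figure": (0.716, 0.126),
}

MEASURES = {name: d_prime_and_beta(h, fa) for name, (h, fa) in CONDITIONS.items()}
```

For the experimental condition this gives a d′ of about 2.09 and a β of about 0.77, matching the reported values, and the control d′ values round to 1.55 and 1.72. The control β values come out above 1, as reported, indicating a bias toward “no” in those conditions.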

The ACT-R model we have presented provides a basis for enhanced memory when there is a referent because it stores a pointer to the referent proposition. Just as in the model for recall in Anderson (1972), participants can use the referent proposition to reconstruct the sentence when they cannot directly recall it. This referent-based recall can be further enhanced if we assume that participants have some tendency to accept any proposition in the referent structure, not just the one pointed to in the referent slot. The former process is responsible for the better memory while the latter process is responsible for the bias.

In adapting the ACT-R model of the Reder task for this experiment, we estimated three parameters. One was the retrieval threshold τ [see Retrieval Probability Eq. (3)], which was set to −0.05. The second parameter was the slip parameter, which was .125. The third was the probability of accepting the probe if it was part of the referent’s history but not connected to a studied proposition. This was .06 and reflects the bias to accept related sentences. The d′ values are 2.10 for the experimental conditions and 1.71 for the controls and the β values are .82 for the experimental condition and 1.65 for the controls. Under any parameter setting the model would predict greater bias and discriminability in the referent condition. Given that ACT-R predicts the qualitative result, its good quantitative fit is not surprising, as there are three parameters and four data points. Thus, the most important result is the qualitative conclusion that ACT-R predicts a discriminability advantage for the referent condition in this paradigm. We use the parameter estimates from this experiment to predict the next.

FIG. 12. Percentage acceptance of targets and foils from Schustack and Anderson (1979) and ACT-R predictions.

Bower, Black, and Turner (1979): Single versus Multiple Uses of Scripts

Although Schustack and Anderson (1979) presented a situation in which providing a referent improved recognition accuracy, an experiment described by Bower, Black, and Turner (1979) reversed this result. In their experiment, participants studied one, two, or three stories involving the same script such as visiting a health professional. Their participants were asked to give recognition ratings of sentences on a scale from 1 to 7 (1 = high confidence rejection, 4 = guessing, 7 = high confidence acceptance). Figure 13 displays the recognition ratings for targets, script-related foils, and script-unrelated foils. The recognition ratings for studied sentences and unrelated foils did not vary much as a function of the number of stories studied. On the other hand, the ratings for script-related foils increased from 3.91 to 4.62 to 4.81 for one, two, and three stories, respectively. It is worth noting about the design that the probability that the foils appeared in another story varied with number of stories: 0% for one story, 50% for two stories, and 100% for three stories.

We attempted to fit these data with the same model and parameters that were used for Schustack and Anderson (1979). This required finding a way for ACT-R to give confidence measures. While we could have developed a more elaborate theory of confidence judgments and have done so elsewhere (Anderson, Bothell, Lebiere, & Matessa, 1998b), it would be a digression to do so here. Therefore, we simply assumed that participants assigned a mean rating of 1.0 to unrecognized script-unrelated sentences, 3.5 to unrecognized script-related sentences, and 6.0 to script-related sentences that they thought they recognized. Otherwise the model and parameters were the same as for Schustack and Anderson. As Fig. 13 illustrates, the model did a good job of reproducing these data (the correlation is r = .998). The model produced an increasing effect of number of stories on related foil acceptance because a proposition studied in one story can be accepted as a foil in another story. As an example of how this can happen, suppose the participant has studied one restaurant story that includes “Dan ordered the-meal” and another restaurant story that includes “Bob ate the-meal.” In the structure of the Bower et al. materials, the “ordered the-meal” proposition would not be studied with Bob and the “ate the-meal” proposition would not be studied with Dan. Then the participant was tested with “Bob ordered the-meal.” The participant can find a referent pointer from the-meal to the “person orders the-meal” in the restaurant script because of the story studied about Dan. Retrieving a referent proposition serves as the basis for accepting the probe proposition just as it had in the previous models. The conclusion from this model and the one for Schustack and Anderson is that use of a script sentence in one story makes it available both for correct recognition in that story and for false recognition in other script-related stories. It is worth understanding why Bower et al. found poorer discriminability while Schustack and Anderson found increased discriminability. Bower et al. used foils from other stories, which produced increased false alarms. On the other hand, they did not have a condition like Schustack and Anderson where there was no recognizable referent. It is in this condition that targets are more poorly recognized. In summary, if a referent is used for a single story it conveys a benefit on that story relative to conditions in which the story has no referent or the referent is also used for other stories.

FIG. 13. Mean ratings for targets and foils from Bower, Black, and Turner (1979) and ACT-R predictions.
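The rating assumption used for the Bower, Black, and Turner data amounts to a two-point mixture, which the following sketch makes explicit. The three mean ratings are the ones stated in the text; treating script-unrelated sentences as never recognized, the function names, and reading the mapping backwards from an observed mean are simplifying assumptions made here for illustration.

```python
RATING_RECOGNIZED = 6.0             # script-related sentence thought to be recognized
RATING_UNRECOGNIZED_RELATED = 3.5   # unrecognized script-related sentence
RATING_UNRECOGNIZED_UNRELATED = 1.0  # unrecognized script-unrelated sentence


def expected_rating(p_recognized, script_related):
    """Mean 1-to-7 rating implied by a probability of recognizing the probe."""
    if not script_related:
        # Assumption of this sketch: unrelated sentences are never recognized.
        return RATING_UNRECOGNIZED_UNRELATED
    return (p_recognized * RATING_RECOGNIZED
            + (1.0 - p_recognized) * RATING_UNRECOGNIZED_RELATED)


def implied_acceptance(mean_rating):
    """Invert the mixture for a script-related probe: rating -> P(recognized)."""
    span = RATING_RECOGNIZED - RATING_UNRECOGNIZED_RELATED
    return (mean_rating - RATING_UNRECOGNIZED_RELATED) / span
```

Read backwards under this mapping, the observed mean ratings of 3.91, 4.62, and 4.81 for script-related foils would correspond to acceptance probabilities of roughly .16, .45, and .52 for one, two, and three stories. These are implications of the assumed mapping, not values reported in the paper.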

CONCLUSIONS

It is not a trivial matter that one can implement models of sentence memory in a cognitive architecture. This is because the architecture comes with certain commitments that are not present when building a model from scratch. ACT-R has commitments about the nature of the retention function which are at odds with commonly held beliefs about the differential forgetting of different types of sentence information. It also has a commitment to serial processing at the symbolic level which might seem at odds with evidence about inferential processing. Thus, success in this modeling enterprise constitutes a significant test of the architecture. Also, since this architecture models cognition in multiple domains (Anderson & Lebiere, 1998) our success provides support for the view that there is nothing special about sentence processing or sentence memory. Finally, the architecture can bring new integration to a domain like sentence memory by explaining the selection among the various strategies that a participant might bring to bear in recalling a sentence. Basically, participants tend to choose the strategy that delivers the best combination of high accuracy and short processing times and the best strategy can change with delay (basically, the point made in Reder, 1988).

The significance of modeling these six experiments somewhat depends on the consistency of parameter estimates. The decay parameter d was kept at .5 throughout all simulations as it is in all ACT-R models (Anderson & Lebiere, 1998) and as it has been estimated in an extensive empirical investigation (Anderson, Fincham, & Douglass, 1999). The rest of the parameters are displayed in Table 1. With two exceptions the common parameters are remarkably consistent. Both exceptions are associated with the Zimny model that dealt with verbatim memory judgments at very long delays. The G parameter, measuring the value of accuracy, was lower by a factor of 3 and the slip probability was higher by a factor of 2. Our model for this task was built on the assumption that the latencies for the memory judgments could be predicted from the model for the Reder task. However, since no latency data are available it was not possible to check these assumptions.¹¹ A qualification on the generality of the conclusions here is that our model only has been developed to apply to simple and unambiguous sentences. It is an open question how well it will generalize to more complex sentence forms.

¹¹ Actually, we have since learned from Zimny (personal communication) that her study involved a word-by-word presentation procedure with 300 ms/word and participants took less than a second after presentation of the sentence to make their judgments. This yields total times comparable to those produced in Table 4 by the Reder model, but the different procedure suggests our extrapolation of the Reder model to her task will be only approximate.

Our model has numerous similarities to the fuzzy trace model of Reyna and Brainerd (1995). Like that theory we assume two traces, a verbatim trace and a propositional trace, and that participants vary in their preference for using the two traces. However, unlike Reyna and Brainerd, the ACT-R model does not assume differential decay, although the verbatim trace is harder to reinstate at a delay because it is more complex. The ACT-R model also offers a systematic basis for deciding which strategy participants will prefer.

An important consequence of the model's parameter commitments was minimal inferential processing. Like other theorists (Graesser, Singer, & Trabasso, 1994; McKoon & Ratcliff, 1992), we acknowledge that, given enough time, people can elaborate what they are studying with a great many inferences. Indeed, we (Anderson & Reder, 1979; Reder, 1979) have argued that in many conditions where participants are trying to remember material they elaborate richly on the material with great consequences for their memory. However, what is striking to us is that such elaborations are not necessary to account for much of the data. By simply establishing a pointer to a referent, the participant can both enhance memory for the target material and prime retrieval of related material. It is not necessary to make explicit inferences by mapping over the information to the current context. Not only would the generation of such inferences be time consuming but, unless we want to attribute special mnemonic properties to these inferences, they would be unlikely to be successfully retrieved at delay. The way to get such strong inferential effects in memory at delay is to count on well-established referents already in long-term memory.

Graesser, Singer, and Trabasso (1994) lay out a set of different types of inferences that might be made during comprehension and they classify different comprehension theories according to which of those inferences a given theory claims that participants make. It is worth reviewing how our own model stands with respect to this set of inferences. The ACT-R model builds chunks that represent the role of the arguments in the sentence. This might require resolving the referent of a noun or pronoun or deciding the role of an argument, which Graesser et al. call local coherence inferences. However, our model will not build inferences if they reflect new propositions that require new chunks. The only inferential elaboration postulated by our model is the tagging of the chunks representing the proposition with a pointer to a referent. This might also be viewed as in the service of building local coherence. Except as implicit in the referent link, our model does not make goal inferences, causal inferences, inferences of implicit arguments, or any of the other inferences that Graesser et al. list.

Verification latency has been used to determine what inferences a participant has made. If a participant recognizes an inference as fast as a stated proposition, the assumption is often made that the inference must have been made while the sentences are studied. While disagreeing on just what inferences are made, Graesser, Singer, and Trabasso (1994) and McKoon and Ratcliff (1992) agree that such latency measures are not strong evidence that the inference has been drawn during initial reading. This is a point that was made earlier (Reder, 1979). This is because postcomprehension processes cannot be ruled out. The ACT-R model presented here illustrates this point. Even though the inference is not generated at study, participants can sometimes verify an inference faster than a stated sentence because the referent is more strongly encoded than the sentence and so its components can be more rapidly retrieved.

Much of the research on different inference types has used a word priming methodology (e.g., Long, Golding, & Graesser, 1992; Magliano, Baggett, Johnson, & Graesser, 1993). If it can be shown that words appearing in certain inferences can be recognized more rapidly, it is assumed that these inferences were made during comprehension. Research has documented that words from certain kinds of inferences are likely to be primed, particularly if the participants are of high knowledge (e.g., Long et al., 1992; Long & Golding, 1993). We think these results can be understood within the current theory in terms of the probability that the participants have referent experiences for the stories studied and the probability that these referents have the inferences represented as part of them. If the referent experience can be found and the relevant inference is strongly associated to the referent, spread of activation will cause these terms to be primed as a consequence of the comprehension process. For instance, a favorite story of Graesser and his colleagues involves a story about a dragon kidnapping the daughter of a Czar. Presumably participants will vary in the amount of prior experience they have had with dragon stories and what facts are represented in their dragon stories. Participants who know a lot about dragon stories are more likely to have a strongly encoded referent in memory that enables spread of activation to highly associated concepts. Thus, in our view, this research need not indicate that the inferences are explicitly drawn; only that they are available from the referents. This view is consistent with the recent research on memory-based text processing (e.g., Cook, Halleran, & O'Brien, 1998; Gerrig & McKoon, 1998) that shows that, rather than making explicit inferences, participants just prime relevant background information.

Two of the experiments we modeled (Anderson, 1972, 1974) involved sentences that were presented out of a prose context while the other experiments involved sentences that were presented in the context of coherent stories. The difference in our treatment of these two classes of experiments was the availability of a referent. We assume that the effect of a coherent story is to help establish a referent for the sentence. Such a referent enables the inferential processing that tends to be more substantial for sentences presented in a coherent context.

It is worth comparing the ACT-R model with Kintsch's construction-integration (CI) model, which similarly integrates sentence processing with a general theory of cognition. Kintsch emphasizes the notion of different types of representations and, unlike ACT-R, does attribute different mnemonic properties to them. Nonetheless, he represents the text and the situation model in terms of propositions and our propositional representation can be basically seen as an incorporation of his representational theory into ACT-R's general chunk-based, declarative structure. Kintsch emphasizes the idea that a separate situation model is created for the current text in contrast to our simpler addition of pointers to an existing referent. The representations postulated by Kintsch are usually created through a hand simulation of a set of rules and so there is not a strong commitment to the processing time for individual steps of comprehension. In contrast, it is ACT-R's commitment to processing time that forces us to a minimalist position. The CI model assumes a spreading activation process at study that operates over a network of propositions to converge on asymptotic values that play an important role in determining the long-term memory familiarity of the propositions, which, in turn, influences recognition judgment. In contrast, activation in ACT-R [Activation Eq. (1)] operates at test to directly determine recognition judgments. Sentence recognition itself is modeled in Kintsch's theory as a familiarity judgment in which the probe evokes some global familiarity response as a function of the strengths of associations to elements in the probe. This is explicitly an importation of the Gillund and Shiffrin (1984) SAM memory model. Our model is quite sensitive to strengths of association but attempts to explicitly retrieve the elements of the original proposition rather than make a global judgment.

In general terms, it can be said that the two models use similar concepts in different ways. ACT-R paints a picture of remembering a sentence that is much more discrete (i.e., discrete steps due to sequential production firing) and Spartan than the one painted by CI. Nonetheless, at least in the case of the Zimny data, the two theories result in roughly equivalent predictions. The Zimny data set is well chosen for the purposes of establishing that ACT-R can offer a competitive account in the domain of language processing where the CI theory has had its most extensive application. However, it is not well chosen to provide a discriminating test of the two theories. The account of the results depends on the existence of three types of representation, an assumption common to both models and basically forced by the data. From the ACT-R perspective, the most critical predictions concern the details of the time course of processing and the CI theory has not been developed for such predictions. In contrast, the CI model has been elaborated to account for priming and inference effects that we have not addressed. It would be a good idea to develop both models toward tasks that address issues in common. Until this is done we cannot make strong claims about the real differences between the two theories or their relative merits. However, given that we have advanced the ACT-R theory here, we should say what attracts us to its account: It is committed to the moment-by-moment steps of processing such that it does all tasks from input of the words at study to the production of memory responses at test.

In conclusion, this research has three major implications for sentence memory research: (1) It is not necessary to assume different retention functions for different types of information, (2) it is possible to produce rich inferential effects without extensive elaborations or parallel threads of processing, and (3) the choice among different ways of answering a memory probe is strategic in response to the relative utilities of these strategies.

REFERENCES

Anderson, J. R. (1972). A stochastic model of sentence memory. Doctoral dissertation, Stanford University.

Anderson, J. R. (1974). Verbatim and propositional representation of sentences in immediate and long-term memory. Journal of Verbal Learning and Verbal Behavior, 13, 149–162.

Anderson, J. R. (1974). Retrieval of propositional information from long-term memory. Cognitive Psychology, 5, 451–474.

Anderson, J. R. (1976). Language, memory, and thought. Hillsdale, NJ: Erlbaum.

Anderson, J. R. (1982). Acquisition of cognitive skill. Psychological Review, 89, 369–403.

Anderson, J. R. (1983). The architecture of cognition. Cambridge, MA: Harvard Univ. Press.

Anderson, J. R. (1993). Rules of the mind. Hillsdale, NJ: Erlbaum.

Anderson, J. R. (2000). Cognitive psychology and its implications (5th ed.). New York: Worth.

Anderson, J. R., Bothell, D., Lebiere, C., & Matessa, M. (1998). An integrated theory of list memory. Journal of Memory and Language, 38, 341–380.

Anderson, J. R., Fincham, J. M., & Douglass, S. (1999). Practice and retention: A unifying analysis. Journal of Experimental Psychology: Learning, Memory, and Cognition, 1120–1136.

Anderson, J. R., & Lebiere, C. (1998). The atomic components of thought. Mahwah, NJ: Erlbaum.


Anderson, J. R., & Matessa, M. P. (1997). A production system theory of serial memory. Psychological Review, 104, 728–748.

Anderson, J. R., & Reder, L. M. (1999). The fan effect: New results and new theories. Journal of Experimental Psychology: General, 128, 186–197.

Anderson, J. R., & Reder, L. M. (1979). An elaborative processing explanation of depth of processing. In L. S. Cermak & F. I. M. Craik (Eds.), Levels of processing in human memory (pp. 385–404). Hillsdale, NJ: Erlbaum.

Anderson, R. C. (1974). Substance recall of sentences. Quarterly Journal of Experimental Psychology, 26, 530–541.

Bower, G. H., Black, J. B., & Turner, T. J. (1979). Scripts in memory for text. Cognitive Psychology, 11, 177–220.

Brainerd, C. J., Reyna, V. F., & Kneer, R. (1995). False-recognition reversal: When similarity is distinctive. Journal of Memory and Language, 34, 157–185.

Budiu, R. (in preparation). The role of background knowledge in sentence and discourse processing. Doctoral dissertation, Carnegie Mellon University.

Budiu, R., & Anderson, J. R. (2000). Integration of background knowledge in language processing: A unified theory of metaphor understanding, Moses illusions, and text memory. In Proceedings of the Third International Conference on Cognitive Modeling (pp. 50–57). Groningen, The Netherlands: Universal Press.

Cook, A. E., Halleran, J. G., & O'Brien, E. J. (1998). What is readily available during reading? A memory-based view of text processing. Discourse Processes, 26, 109–130.

Fletcher, C. R. (1994). Levels of representation in memory for discourse. In M. A. Gernsbacher (Ed.), Handbook of psycholinguistics (pp. 589–608). San Diego, CA: Academic Press.

Gerrig, R. J., & McKoon, G. (1998). The readiness is all: The functionality of memory-based text processing. Discourse Processes, 26, 67–86.

Gillund, G., & Shiffrin, R. M. (1984). A retrieval model for both recognition and recall. Psychological Review, 91, 1–67.

Graesser, A. C. (1978). Tests of a holistic chunking model of sentence memory through analyses of noun intrusions. Memory & Cognition, 6, 527–536.

Graesser, A. C., Singer, M., & Trabasso, T. (1994). Constructing inferences during narrative text comprehension. Psychological Review, 101, 375–395.

Graesser, A. C., & Zwaan, R. A. (1995). Inference generation and the construction of situation models. In C. A. Weaver, S. Mannes, & C. R. Fletcher (Eds.), Discourse comprehension: Strategies and processing revisited (pp. 117–139). Hillsdale, NJ: Erlbaum.

Jones, G. V. (1978). Tests of a structural theory of the memory trace. British Journal of Psychology, 69, 351–367.

Keenan, J. M., MacWhinney, B., & Mayhew, D. (1977). Pragmatics in memory: A study of natural conversation. Journal of Verbal Learning and Verbal Behavior, 16, 549–560.


Kiernan, B. J. (1993). Verbatim memory and gist extraction in elementary-school children with impaired language skills. Unpublished doctoral dissertation, University of Arizona.

Kintsch, W. (1988). The use of knowledge in discourse processing: A construction-integration model. Psychological Review, 95, 163–182.

Kintsch, W. (1998). Comprehension: A paradigm for cognition. New York: Cambridge Univ. Press.

Kintsch, W., Welsch, D. M., Schmalhofer, F., & Zimny, S. (1990). Sentence memory: A theoretical analysis. Journal of Memory and Language, 29, 133–159.

Lebiere, C. (1998). Cognitive arithmetic. In J. R. Anderson & C. Lebiere (Eds.), The atomic components of thought (pp. 297–342). Mahwah, NJ: Erlbaum.

Lewis, R. L. (1999, March). Attachment without competition: A race-based model of ambiguity resolution in limited working memory. Paper presented at the CUNY Sentence Processing Conference, New York.

Lim, P. L. (1993). Meaning versus verbatim memory in language processing: Deriving inferential, morphological, and metaphorical gist. Unpublished doctoral dissertation, University of Arizona.

Long, D. L., & Golding, J. M. (1993). Superordinate goal inferences: Are they automatically generated during comprehension? Discourse Processes, 16, 55–73.

Long, D. L., Golding, J. M., & Graesser, A. C. (1992). The generation of goal related inferences during narrative comprehension. Journal of Memory and Language, 5, 634–647.

Magliano, J. P., Baggett, W. B., Johnson, B. K., & Graesser, A. C. (1993). The time course of generating causal antecedent and causal consequence inferences. Discourse Processes, 16, 35–53.

McBride, D. M., & Dosher, B. A. (1997). A comparison of forgetting in an implicit and explicit memory task. Journal of Experimental Psychology: General, 126, 371–392.

McKoon, G., & Ratcliff, R. (1992). Inference during reading. Psychological Review, 99, 440–466.

McKoon, G., & Ratcliff, R. (1995). The minimalist hypothesis: Directions for research. In I. C. A. Weaver, S. Mannes, & C. R. Fletcher (Eds.), Discourse comprehension: Essays in honor of Walter Kintsch (pp. 97–116). Hillsdale, NJ: Erlbaum.

Murphy, G. L., & Shapiro, A. M. (1994). Forgetting of verbatim information in discourse. Memory & Cognition, 22, 85–94.

Newell, A., & Rosenbloom, P. S. (1981). Mechanisms of skill acquisition and the law of practice. In J. R. Anderson (Ed.), Cognitive skills and their acquisition (pp. 1–56). Hillsdale, NJ: Erlbaum.

Reder, L. M. (1976). The role of elaborations in the processing of prose. Unpublished doctoral dissertation, University of Michigan.

Reder, L. M. (1979). The role of elaborations in memory for prose. Cognitive Psychology, 11, 221–234.


Reder, L. M. (1982). Plausibility judgments vs. fact retrieval: Alternative strategies for sentence verification. Psychological Review, 89, 250–280.

Reder, L. M. (1987). Strategy selection in question answering. Cognitive Psychology, 19, 90–138.

Reder, L. M. (1988). Strategic control of retrieval strategies. In G. Bower (Ed.), The psychology of learning and motivation (Vol. 22, pp. 227–259). New York: Academic Press.

Reyna, V. F., & Brainerd, C. J. (1995). Fuzzy-trace theory: An interim synthesis. Learning and Individual Differences, 7, 1–75.

Reyna, V. F., & Kiernan, B. (1994). The development of gist versus verbatim memory in sentence recognition: Effects of lexical familiarity, semantic content, encoding instructions, and retention interval. Developmental Psychology, 30, 178–191.

Reyna, V. F., & Kiernan, B. (1995). Children's memory and interpretation of psychological metaphors. Metaphor and Symbolic Activity, 10, 309–331.

Ross, B. H., & Bower, G. H. (1981). Comparisons of models of associative recall. Memory and Cognition, 9, 1–16.

Salvucci, D. D., & Anderson, J. R. (1998). Tracing eye movement protocols with cognitive process models. In Proceedings of the Twentieth Annual Conference of the Cognitive Science Society (pp. 923–928). Hillsdale, NJ: Erlbaum.

Salvucci, D. D., & Anderson, J. R. (2001). Integrating analogical mapping and general problem solving: The path-mapping theory. Cognitive Science, 25, 67–110.

Sanford, A. J., & Garrod, S. C. (1998). The role of scenario mapping in text comprehension. Discourse Processes, 26, 159–190.

Schank, R. C., & Abelson, R. (1977). Scripts, plans, goals, and understanding. Hillsdale, NJ: Erlbaum.

Schustack, M. W., & Anderson, J. R. (1979). Effects of analogy to prior knowledge on memory for new information. Journal of Verbal Learning and Verbal Behavior, 18, 565–584.

Sulin, R. A., & Dooling, D. J. (1974). Intrusion of a thematic idea in retention of prose. Journal of Experimental Psychology, 103, 255–262.

Wickelgren, W. A. (1972). Trace resistance and the decay of long-term memory. Journal of Mathematical Psychology, 9, 418–455.

Zimny, S. T. (1987). Recognition memory for sentences from a discourse. Unpublished doctoral dissertation, University of Colorado, Boulder, CO.

(Received June 13, 2000)
(Revision received October 16, 2000)
Published online August 22, 2001