DC. rCSRTR-236 - ERIC · DOCUMENT RESUME. ED 215 313. CS 006 607. AUTHOR. Bruce, Bertram TITLE...
Transcript of DC. rCSRTR-236 - ERIC · DOCUMENT RESUME. ED 215 313. CS 006 607. AUTHOR. Bruce, Bertram TITLE...
DOCUMENT RESUME.
ED 215 313 CS 006 607
AUTHOR . Bruce, BertramTITLE HWIM: A Computer Model of Language Comprehension and
Production.INSTITUTION - Bplt, Beranek and Newman, Inc., Cambridge, Mass.;
. u y oReading.
SPONS'AGENCY Advanced Research Projects Agency (DOD), Washington,D.C.; National Inst. of Education (ED), Washington,DC.
REPORT NO rCSRTR-236PUB DATE Mar 82CONTRACT 400-\76-0116; ONR-N000-14-75-C-0533NOTE 37p.
EDRS PRICE MF01/PCO2 Plus Postage.DESCRIPTORS *Comprehension; *Computational Linguistics; *Computer
,Programs; *Computers; Discourse Analysis; *Language. Processing; Models; Progrdm Descriptions; Program
- Development; *Programing Languages; ReadingResearch
ABSTRACTA computer natural-language system called HWIM (Hear
What I,Mean), is described in this report. Noting that the system4. accepts either typed or spoken inputs and produces both typed- and,..spoken,responses, the report proposes HWIM as an example of a
Telatilielytomplete language system ,illustrating how'the manycomponents of languasae.processing interact. The report focuses anHWIM's d4coui.se processing capabilities and on the integration -ofdilVerse types of knowledge for the' purpose ,Of comprehending andproducing natural language. 01)
0
4a
. . '.ir.
.
**********************x********.********************0*'****o***.t ,* Reproductions supplied by EDRS are.the.best that ,can. be-made *
* from'the original. document. - *
*************************,*************"******.*************************,
'
CENTER FOR THE STUDY OF READING$
Technical Report No. 236
HWIM: A COMPUTER MODEL OFLANGUAGE COMPREHENSION AND PROUCTION
Bertram Bruce
Bolt Beranek and Newman Inc.
March 1982
University of4illinoisti at Urbana-Champaign
51 Gerty'Drive,
Champaign, Illinois 61820
U yS. DEPARTMENT OF EDUCATIONNATIONAL INSTITUTE OF EDUCATION
EDUCATIONAL RESOURCES INFORMATION
CENTER IERICI
This document has been reproduced arreceived from the person or organizationortgina ling it
, Minor changes have been made to improvereproduction quality
Pomp of view or opinions stated in this document do not necessartly rep! esenpotticial MEposition or policy
or
BBN Report too. 4800Bolt Beranek and Newman Inc.
.50 Moulton Street.Cambr'idge, Mas'sachusetts 02238
.e
Bill Woods, Bonnie Webber, Scott Fertig, Andee Rubin, Ron Brachm'an,Marilyn Adams; Phil Cohen, Allan Collins, and MartyRingle contributeduseful suggeWons and criticisms to this paper. Cindy Hunt helped inthe preparation. The system development and mubh of the writing was'supported by the Advanced. Research Projects Agency of the Departmentof Defense and was monitoled by5ONR under Contract No. NO0014-75-C-0533.Additional work on the pap4;itself was supported by the NationalInstitute of Education under Contract No. HEW=NIE-C-76-0116. VJ,ews
and conclusions contained here are those of the author and should notbe interpreted as representing the official opinion or policy of.1/4DARPA,NIE, the U.S, Gerernment, or any Other person pr agency connected withthem.
r
$
2
0
*
EDITORTABOARD
Paul Scse aid' Jim.MosenthalCo-td itors
Harry -Blanchard
Nancy Bryant
Larry Colker
Avon rismore
Roberta Fetrara.
Anne Hay
Asghar Iran-Aejad
Jill LaZansky
Ann Myers4.
Kathy Starr
Cindy Steinberg,
Willi Tirre
Paul Wilsbn4
Mi:ChaellqiVens,. Editorial Assistant),*
eN
,f4
f
\ .
HWIM: A Computer Model of Language
Comprehension and Production
People design computer programs to understand natural,
Language for two.reasons. One is to make computers more useful
.
by facilitating_ communication between the computer and computer,
users. The other is that, the 'design of such programs can, inforuk:
, \theor ies.
of human language use. The program Olen becomes a model-."-
for human language comprehension or production. Decisions made
in the design process in order to enhance efficiency, or sifnply
to make the program succeed, suggest -chaiacteristics Of thei
processes. humans carry out when performing similar.taskt. Al.se,'
t......- ftthe extent to which a theory of language case actuallyit"works".
when it is implemented as a computer model, is one measure of'1 its'
correctness, Moreover, the act_ of expressing a-theory infa-,
-
correctness,°
computer program forces us to be explicit and unambiguous about1(
\what the theory is. By examining the development, the strdc'ture,'
.
and the operation of' computer programs that successfuily'use.c.
. . -r--1 natural language, one may gain insights into human use of natural
,
language, including reading and writing', speaking and listeLng.0"
v--
r This paper discusses some general issues of. language'4. .
.
comprehension and production through examination of a completel
natural language system called "HWIM" (for "Hear what I mean40.
HWIM was -developed over a, five year period a6 -the Bolt Beranek .
4 1
.
Q n.
A CoMputer Model
and Newman speech understanding system. It was designed to
understand natural language (typed or spoRen); to answer
questions, perform palculationp, and maintain a data base; and to
respond in natural language (typed and spoken). Both its inputs \
and outputs used a, relatively rich grammar with ('a 1000 wordL
.vocabulary. Utterances were assumed to be part of an on-going
dialogue so that HWIM.had to have a model of bdth the discourse
apd the user. This paper focuses on the component of the system'
embodying semantic and pragmatic knowledge. -A fuller treatment-.
of the sitem 'can be found in Bruce (ih press) and in Woods,
Bates, Brown, 'Bruce, Cook, .Klovstad, Makhoul, Nash-Webber,
Schwartz, , Wolf, and Zue (Note 9). For discussions of other
speech understanding systems, see Erman, Hayes-Roth, Lesser, and
-RedUy --(1*30);41Lea (1980) ; and Walker (1978) .
9
HWIM's Job Responsibilities
HWIM was designed to serve as an assistant for the manager,
of a travel budget. The task of. he system was to assist the
travel budgdt manager, helping to .record the trips taken or
propose and to poduce such summary information as the totalo
money 0.1ocated to various budget items. In order to carry outAll*
its duties, HWIM new to converse with the travel budget ,
manager.. ghough their conversations were simplified'relatii; to.
natural-inter-personal conversation, the:design of the system can
t
10(
1HWIM:, A Computer' Model
4
help in formulating more -genera]: models of language
understa'ndirg. Some, salient features of HWIM are the following:
o the maintenance of dynamic models of the disbourse,
the task, the user, and specific facts abouf the
worldI
o the use of such knowledge to constrain npOgsible,
,
interpretations of. utterances, thus increasing the -
likelihood 'of successful, understanding.. ,
o the use of tbe'same ,knowledge in geherating both
t.
, .
p written ,and. spoken responses; a forMalism for
r'' expressing response generation rules which provides,
_
a framework for the application, of semantic and.
.... .
t
,discourse level knowledge
.480,.$ 6
e I.o facilities for making -the, systm's own khowledge.,
.
state known to the travel budget manager ,, .
o inference mechanisms and data base structures-that
,facilitate freer expression of commands and
'questions Sy, the travel budget manager
.
. , ,.
In this= section we see(
a',trace,of HWIM1g,procesEing. ',theTh.
4. , .
sentenc'es.ihown here' would normally occur as part of an ongoing
Steps in Processing
0
6
4
HWM: A Computer Model
5
'.dialogue. While their. interpretations and resulting responses
might besignificantly different in,context,.'What :Can be seen
here the flow of control within the system, the types of
interpretations produced, the inferenC'e' Capabilities, and
response generation. The structure of HWIM is shown in Figure 1
(anc) discussed further 'in Appendix A).
'Whpn HWIM erigage4s 4n a text dialogue, the system .reduces to
just two processes, Trip and Syntax. In that mode Syntax is a, ,-
_subl3rocegs to Trip, -though it may itself invoke Trip as its own'
gubpr cess to evaluate tests on arcs in the grammar. To .see one%posgib &flow of control, (Figure 2), consider the Sentence -
/
Enter a,trip for Jerry Wolf to New YOrk.
Trip reads the input and does a morphoi.ogical.' analysis of each,
word. It then calls Syntax to arse and interpret the Sentence.I e
During processing, Syntax may enco nter a test, e.g.', "Does , the
'name 'Jerry' Wolf' denote a know person?", which is to be
performed by Trip. ,Control passes to Trip ich answers 'yes,"
and then, back to Syntax" again. When the parse is Oimplete,
control returns to Trip., Ambiguity, in the input can 'cause a
.question to be formu.lated for the travel budget manajCr. The
managerJs'response would cause a return,to Syntax, and so on to,
the end of the s-e,ssion.'4
.
t
e9
4
)
a
if
t
I
&
a
wit
- .
HWIM: A Computer Model
r
4."
6'
SEMANTIC NETWORK
BASE
DICTIONARY
INTERPRETER TRIP
4119;41110WDICTIONARY
EXPANDER
,
EXPANDED DICTIONARY"
IL' SPEECH
imrtODERSTANDIgGs.
GRAMMA R
..
VERIFIER
PHONOLOGICAL
RULES
1
SPEECH OUTPUT
PARAMEA ICREPRESENTATION
OF SPEECH-SIGNAL
TALKER
e.
.
Figure 1. The seructure "of -HWIM.
1
r-,.
4..8
119
W
SPEECH INPUT- ..
4
l
i
9
Is
7) -04
sI
1
1. "Enter a trip for .
Jerry Wolf to New York."
Z. PARSE
((ENTER A TRIP...))
FIVTIM: A Compute r Model
7
dr.
6. "Ok."
igurt e 2.
3. GOODNAME?
(WERRY'WOLFLY
5. (FOR: --
(BUILD: TRIP --) )
One possible flow of control for a dialogueinte rchange.
-4
4
HWIM: A Computer Model
8
To see the steps,HWIM tak n reading, comprehending, and
then respOnding to some- text, imagine that' the travel budget,
nager types:
When did Bill go to Mexico?
The program makes a fist pass on the'Enput, converting the words'
as typed into words as they appear in its dictionary and removing'* e 1
' punctuation. For example, "$23.16" would become "twenty-three
dollar-s and sixteen cent-s". In this case the modified word
list is simply
.(WHEN DID BILL GO TO MEXICO)
For spoken inputs HWIM has to consider, many possible-word
lists beCause the inpiA is ambiguous. This could be viewed, as
. analogous to problems faced by a person in reading (see
Rumelhart, 1977; WOods, 1980b) . For typed' inputs; HWIM is a
perfect decoder; thus, only one word list needs to be considered.
In order to simplify the discussion, the example here will focus
on a single word list.
The parser in HWIM uses a pragmatic grammar which contains
significant static knowledge of' the travel,budget management
domain. Collapsing the task specifiC knowledge into the gramm
was viewed as an 'expedient to gain computational (processing).
.
efficiepcy. It helped to constrain, parses early and .enabled
s'f
Ot
HWIM: A Computer 'Model
9
A
risimultaneous parsing and semantic interpretation. This is in
contrast with the two phase method used in the LUNAR parser%
(Woods, Kaplan & 'NashWebber, Note -10) which produced purely,
syntactic parse trees that were subsequently given to. a semantic
interpreter' to tansforit into executabld interpretations. More
recently, techniques such a& the use of .cascaded ATN's (Woods,4-
1980a) as in the RUS parser (Bobrqm,4_Note 2)_ have been developed
to gain much of the efficiency of pragmatic ors semantic grammar
while maintaining- a clean separation between,general syntactic0
loowledge and task specific knowledge.
For the example sentence, the parser derives ,the 'syntactic,
structure:
S. Q -"
QADV WHEN
SUBJ NPR BILL,
AUX. TNS PAST
VOICE ACTIVE
VP V GO
PP PREP. TO
NP N ?R MEXICO'
ADV THEN
Af
Taking advantage of the. semantic knowledge in the grammar, we get,
t V..
;
in addition:
HWTM: A Computer, Model,
[FOR: THE A(/00.9 / (FIND: LOCATION (COUNTRY 'MEXICO))
10
: T ; ,(FOR: 'THE A0008 /*JFIND: PERSON JVIRSTNAME 'BILL))
6T ; (FOR: ALL A09104/ \-
(FIND: TRIP (TRAVELER A0908)
*(DESTINATION .A0009)
(TIME (BEFORE NOW)))
: T ; (OUTPUT: (GET: A0010 'TIME]
The interpretation is, a quahtifthd expression which can (almost)
-be directly executed. Translated it says, roughly: :1
.
Find the location which has the Country , name ' ,,
-,,
/ ..
"Mexico" and call it A0009. Find the person whose first',
mane is "Bill" and call him A0008. Find in turn, all
trips. whose traveler is A0008, whose' .destinatio is
A00b9, and which occurred prior to now. At each is
enumerated,' label it A0010 'and output its specific,4
.
time.4
The T's which appear in the' fnterpretation are there in Place of.
potential restrictions which might have been applied to the°
,1-location, person, or trip classes being sealched.
1Before the interpretation can be executed it undergoes an
optiii.zation (See Reiter, 1976) and modification ,to match the
't ,
r
°
.HWIM: A Computer Model
11
data base represientations. In this example ,there is no entityicalled "LOCATION' n the Semantic netx.ork data base which serves
, the purpose implied in the ..interpretation. Theresvare' cities,states, coasts, countries, etc. and there"LOCATION", but, no concept with instances at we might expect. Theinterpretation thus references a fictitious mode. The optipizerrecognizes this and modifies the inte rpretation so that a searchis --x-fone among all the possible "locations ". . In HWIM, this
is a link called
modification is speciftre to the fictitious node, that has benreferenced. More recent knowledge representation systems, such
as KL-ONE would formalize,the ability to form. abstractions suchas . "location" from. the more specialized concepts (Brachman,
Bobrow, Cohen, Klovstad, Webber, & Woods, Vote 3):
The optimized itterpreitation is placed on a queue of
to
-be- executed commands. The queue represents a firstt reSteptowards a "demand model of discourse" (see next section) . Inprinciple .it can contain previous queries or commeands,from the
,
speaker as well as, system initiated 'co ands. However, most
utterances9 translate into single commands Which are directly-.
executed.
-The (FOR:*--) expression above, is ,a LISP form which can . be
evaluated directly. The FOR: function manages quantification inthe expected way. Here it looks for a unique Ideation and a
/
HWIM: A Computer Model
12
uaique person ,which match the respectiv descriptions. There is a
single place whose country name is "Mexico," i.e., the country,
Mexico. however; there are 8 items in the data base representing
people whose first name is "Bill." Given the apparent
inconsistency between the data base and the command, FOR: is
forced to take the initiative of the dialogue and ask the manager
for help. The response generation program produces the question:
4
I55-7517 meanBilly Diskin, Bill Huggi*ns,4Bill
Levison, Bill Patrick, Bill Plummer, Bill Russell, Bill
Merriam, or Bill Woods?
When the system asks a question (a4,in Alhfs example) it
expects an answer. This means that otherwise "incomplete"
utterances may be accepted. Here the parser will allow a name aS
a complete utterance. In this context, the person types a secondto,
sentence:
Bill Woods.
This sentence undergoes a pre-pass to produce the word list
(BILL WOODS)
which the parser inEeprets as
(! THE A0011 PERSON ((FIRSTNA BILL)(LASTNAME WOODS)))
14
.:
The optimized interpretation is17 .
e .0r
HWIM: A Computer Model
(IOTA 1 A00 11./ (F/ND: PERSON4
(FIRSTNAME 'BILL)(LASTNAME 'WOODS),1,: T )
This interpretation uses the'IOTA operator, a function that
13'
. 4
returns the one item in the class defined by the (FIND: - -)
expresion which meets th4 restriction, T (i.e., no re,striction).
In this case there is .exactly one item matching the description,
namely BILL WOODS.
. ..
At this point, we are n the middle of what Schegloff (1972), '
.,has called an "insertron sequence." In'the piOcess of answering
'the user's question, 'the ,system began executing a4.
FOR:, txpresiion. EvalUating 'the FOR,k expression required
information that was obtainable only by asking the ,user a
question. HWIM's.question to the user, plus' the 'user's answer
constitute an il;csetpn sequence with respect to the primary'
question - answer sequence. The execution of the °FOR: expression
is' suspend&1 for the duration of the 'insertion sequence, but cah
be resumed. when' the /1.1.-.eded information is processed. Having nowresume.'I
4 %a unique :VISTINATION and a unique TRAVELER, FOR: can find all
.
- trips whose 1. is (BEFORE NOW) and for each; output its TIME. 0
.11
Finding 'the trips, checking TRAVELER, DESTINATION, and TIME
can 'require corisiderablesearch' in the data base. For example,
I
4
,r
HWIM: A Computer,Model
14
the DESTINATION of a TRIP is computed from the DESTI-NATIONs pf
its component LEG/OF/TRJPs. The DESTINATION of a LEG a /TRIP may
have to be computed from the PURPOSE of the LEG /OF/TRIP,
attending specific ,,.conference.. Information about trip
budgets; conferences, etc. is never assumed to be compleee s.ince
it is not So in the real world. Furthermore, the range
askable question& precludes cOmPluting in advance all the implicit
information. (The procedures .that do these compdtations are
giscuss&Lin Woods et al., Note '9.) In this example there is
'only one trip thatjOtches the description and.its,tdme is
printed out by the response generation 'programs:
Bill Woods's trj.pl from Boston, Massachusetts to
Juarez, Mexico was from October 15th to 17th, 1975.
-- Discourse Model%
In any conversation there are' ,many 'forces operating to
determine what will be said next, including the ,memories, the
intentions, and the perceptions of the participants in the., )
..
conversation` (Bruce, 1980, Note 4). A discourse model/ is an
idealized representation of such forces as they are used by a,
speaker in. constructing ,an utterance and by a listener in.,-
understanding one. . (See also ReichrmaA, 1978; Stansfield, 1974;
Deutsch, 1974; Sidner, Note 8).
;
16
1
i.
HWIM: A Computer Model
'15
The discourse ,context is an important factor in even the
restricted domain of persona- computer communication, but in many
_interactive natural language Understanding systeffis there, is no
explicit "discourse molder". °-Mostartificialproblem domains are
deliberately constructed so that only a few interaction modes are
°Sensible.. For example, th.e person may only ask questions;,the
°system may only answer the questions. This greatly simplifies
the determination 40e the speaker's intent, and hence, the full.
understanding of the utterance. There is also no need too
maintain a data base of'patticipants in a conversation (together
with their presumed pUrpose,! attitudes,- model's- of the 'world',
etc.) when the only other participant .in the-convesation is "the '-\
USER.". Even in those cases where the problem d f'nition permits
'a general discourse and tie systeM to cd with that
generality, the means o coping are rarely deli eated (Bruce,
19756..
Incremental simulations (see Woods & MakhOul 1973) of
dialogues between a travel budget' manager and HW M (a system,
builder -played the part of an ideal HWIM) suggested a discourse
model in which, at any point, the manager could b se :n as being
in one of several States, e.g. , trying to deter ine the
consequences of a proposed trip, examining the :St to of the
budget, br entering new trip information. We considered, several-'
forMulations of a discourse model in which these states would be
4.
HWIM: A Computer Model
16
made explicitand used to advantage.
One 'such formulation involved the concepts of m'odes of
interation and, intents% (Bruce, 1975c) . An .intent is the assOmea
purpose behind an ,utterance. Patterns of intents such as,
user,- enter-new-information-
system-apoint-out-contradiction
user-ask-questiron
system-answer-question
use ing-tangesystem- confirm - change,
.
constitute the modes of interaction.
k
. An augmented transition network (ATN) grammar was used to
represent same f the common 'modes of interact -ion found in travel'
budget management dialogues. Then Na modified ATN' parser 'was
written that steps: through the graMmer on the basis of the input,. ,r
sentence structure and the then-current State of the data base.
At any givents state the parser can predict the most likely next
intent and hence such things. as the. class of likely verbs for the
next utterance. This model was not used in the final version of,
the system, primarily because it was too rigid to function alone .ada discourse model.
Another formulation of the .user,/discourse model involves the
18
O
-HWIM: A 'Computer Model.
17
notion of demands made by participants in the dialogue. (Both of
these formulations are discussed more fully .;r1 Bruce, r975b).
Demands include _such things as unanswered questions and
contradictions which have been pointed out. This latter
formulation allows us to model how one computation of a response
can be pushed down, while a whole dialogue takes place to obtain
missing information, and how a computation can spawn subsequent-
expectations or digressions. Some elements of this deld model,
together with other aspects of the discourse model, are explained
below:`.6
(1) Demands: These are demands for service of some 6,,or''t
'made upon the systemby the user or by the system .itself. An
active unanswered question is a typical demand with high
priority. The fact that some questions cannot be -answerdd
without more ingbrmation leads to an embedding of modes of
-interaction. 'Demands of loWer priority include such things as a
notice by the system that the manager is over his budget. Such a
notice might not be communicated until: after direct questions had -
been answered.I 4.
'ea(2)' Counter - demands: These are questions the system has
explicitly or implicitly asked tq- user (e.g., "This trip !dill
put you over budget" implicitly asks the user to review the
:"
budget, canceling budget ?items or adjusting cost -estimatet) .
6'
A
HWIM: A Computer Model
18
While it should- not hold on to these as long as it does to
demands, nor expect too strongly that they will jbe met, the.0*
system can' reasonably expect that most counter- demands will be
resOlveCl'in Some may. Phis is an additional influence on the
discourse structure.
(3) Current discourse state:, The discourse area of the data
baSe also contains an assortment of items that define the current
discourse context, .including:
o LOCATION, a pointer to the current location of the
speaker, e.g., the city
.
o TIME, a pointer to the current time and date
o SPEAKER, a pointer to the current speaker
o the last mentioned person, place, time, t-eip,
budget, Conference, etc.
There is also a representation of the current TOPIC (see Figure. s \
3). This is the active focus of attention in thia dialogue. It'1
could be the actual budget, a hypothetical~ budget, a particular
trip, orja conference. The current topic is used as an anchor
point for resolving definite references and decidion hoW much
detail to give in responses. It cap also generate' certain modes
of-Jilteraation. For example, if the manager says "Enter a trip,". ,
4,0
11,
DISCOURSESTATE
A
INSTANCE/OF
HWIM: A Computer' Model
19
. )
SUBTOOIC SUBJECT-
SUBJECT
Figure 3. A discourse state.
r
4
21 s.
INSTANCE/OF
INSTANCE/OF
a,
NC
's
.1 I . . 4e 0, -. I.. .. \
., .
. , HWIM: A Cbmputer- Model0 ,,
' !.. 20.
. the 'system notes that_ the current 'topic' haS changed to an
I
..
incompletely described trip. This results in demands that cause.. /
,standard fill- qn questions to be asked.' -If lie. manager wants to.. .. .
,.
,
complete the trip descriytion later/ then the completiliof the., ,2t 4 o
P - 1 z','trip description becomes a low priority demand:. ',,,
,. . .
The system has a praitive one-queue implementation ot'the ..-..
' -'. ,.
, . 6-
"demand model. This queue contains executable procedures which4%,
represent the speake'r's previous queries and commands as. weld. as
commands initiated by the syStim to examine the consequences of
sits actions, give information to the user, or ch'eck for_data bass. , ,
consistency. These procedureg are pelatea by functional
dependencies and relative prioriti*es. The major .types Of demands
are' the following:
V
o, DO: means execute the specified c mmand.
0" TEST: means evaluate the form and answer "no", if NIT/
or ".yes" dtherwdse.
RESPOND: means give the user some information (which
may or May not be part of an 'answer to a direct
quvry).
4
o' PREVENT.: means monitor; for a subsequent, possible
action and block its .normal, execution (asi.r1:"Do not
allow more than three trips to Europe."}.
. "..
2
r-
a.
r-
HWIM: A.Computer Model
21
Understanding
The process of understanding written natural language. ,
i'requires the integration of diverse knowledge sources. An op mad.
prdCdss must Apply various types &-t-knowledge.in a balance hat
avoids over or undenLconstraining the set-of potential account
of the input.
Ultimately the understanding process should produce a. - - 2.
_ Meaning representation jor the natural language input, whe,t,her.
, -the. input be written or spoken.. This representation May be
. 1 .produced directly or via some number of contingent knowledge
....., .'
,
structure's (Bobrow El.-Brown, 1975). The exact form of these,
structures .vpries from one system to,another, but tyPically
includea such thingsas'parse trees.
In HWIM, integration of knowledge. sources for linguistic
L_pcocessing- is. accomplished by means of °(1) an ATN grammar (see.4.
,
,.muchFigure that incorporates much of the syntactic, semantic,
. and
pragmaticknowledge i' .the system, '(2) a semantic network (see
stash- Webber, 1975Y that represents remaining linguistic and world
knowledge, and (3) tests on arcs in the grammar used to link the
two representations. These structurps are used to produce the4
account of the input that serves: as a representation of its4
meaning.
..,
r
A
/
I
OA
;
JUMP ,
.
HWIM.: A Computer Model
/
A
. 22
(C..
.
POP
-AINTERP=
(I THE-LOCATION-)
NPR
\.6s
0 CALL TRIP FORK TO VERIFY THAT THIS MATCHESp .."
NI._
"ill' 4...
/
A-
.s,
"\.
..,-
cJigure 4. A small'poition of the pragmatic grammar. .
.........-----.... 0
274
..-
A
-7
HWIM ,A Computer Model
23 c.
The decision to include semantic knowledge-in a. grammar has
'both.its merits and' its defrcts. On the one hand there is an
obvious- advan age in terms of efficiency when semantic knowledge'
+Cali be applied early ondin the most direct way to constrain .theh' r-
possible interpretations_ of an utterance. Moreover, one avoids
the seemingly u'nnecessary step of building a syntactic., model,4
.
such as a parse tree, for an utterance. Ala of this points to
the desirability of a 'undfied linguistic processor as exists .in
HWIM (see also Erman, Hayes-Roth, Lesser, & Reddy, 1980;,Burton&
Brown, Note 5) . On the 'other hand, the fact that we could
incorporate semantic and pragmatic:knowledge in the grammar' and
Use that knowledge as if'it were fundamentally no different fr4om,
syntactic knowledge was a consequence of having a limited -domain
of- discourse. No computer natufal' language system has' yet
approached the variety and complexity that natural human4
'cbmniunication exhibits. As semantic and pragmatic knowledge
becomes more 'complex,' more.' fluid; and more' intricately
interconnected, it not clear to what extentthe pragmatic
gramm'ar approach will' work.
Despite the inclusion of non-syntactic knowledge, the
pragmatic grammar is not a comprete knowledge source by itselfN
for linguistic processing. It is, however, closely linked 'to the
semantic network via tests on the arcs These tests allow the
sqrammar to capture more volatile coniptraints on the input, such
HOIM: .A Computer Model
, 24
as those provided by the discourse model or the factual data
base. .For example:
1
o Is. "Jerry Wogfas" a valid .name?-0" #0
0 .Is 4an Francisco,.California" place?
o What words can 'go with "speech", as a project
descriptoi (e.g., "understarVinIg")? ir /o isDoes "What s the registration fee?" make sense in
this context?
o. Does "How much is in the budget?" make -sense in this
Context?
o Dcre-"When is that meeting?" Bake sense in thiS
context?(7?'
Does "November 15th" make sense in this context?
3
With the grammar encoding semantic and pragmatic, as well as
syntaC tic information, it is.possible for_the parser to build a
procedural" "meaning" representation as well as a purelysyntactic
one. The meaning repfesentations are in a command languAge whose
-expreSsions are atomic, consisting pf, a-functional operator
- applied to arguments, dr compound, making;u.se of a duantification
operator (FOR:) applied to another' expression. TIN operators in.,;
2t
O
HWIM: A Computer Model
25 .
atomic expressions specify operations to be performed on, the data'
Uase or interactions with the user.
The parser builds interpretations by accumulating in
registers the semantic head, quantifier, and links of the nodes
being, described in the sentence (see Woods et al., Note 9, for a
more.complete description) . For exempt, the sentence
go to the ASA meeting.
yields an interpretation in the command language (Section
COMMAND) of the form
(FOR: THE A0018 / (FIND: MEETING (SPONSOR 'ASA)) : T ;
(FOR: 1 A0019 / (BUILD: TRIP
(TO/ATTEND A0018)
(TRAVELER SPEAKER)
(TIME (AFTER NOW)))
: T ; T))'0
Thiq interpretation
constituent describin
is built up in the following way.
er son i,sfoundat.---the-.-:st-a-r--t-7-o-f--the
sentence. The. parser transforms, the pronoun. "I" into. the
link-node pair .(TRAVELE SPEAKER) and returns this as the
interpretation of that constituent. The word "will" adds (TIME
(AFTER NOW)) to the list of link-node pairs being accumulated.
(The grammir does not accept constructions like "will have gone,"
HWIM: A Computer Model
26
so "will" can always be interpreted'as marking a future event.)
The word "go," in the context of our travel domain, .indicates
that a trip is being discussed. Next, the constituent' "the ASA
meeting" produces
(TO/ATTEND (1 THE Y MEETING ((SPONSOR ASA)))).
(The ! indicates that as FOR: expression-Lwill have to be built as
part of the interpretation.) The top level of the grammar has
thus accumulated the link-node pairs
(TIME (AFTER NOW))
(TRAVELER SPEAKER)
(TO/ATTEND (! THE Y MEETING ((SPONSOR 'ASA))))\.
with the semantic head TRIP. The appropriate action (in this case
a BUILD:) is created, to produce
(BUILD: TRIPgo (TO/ATTEND (! THE Y MEETING ((SPONSOR,'ASA))))
(TRAVELER SPEAKER)(TIME (AFTER NOW)))
O
Finally, the necessary quantificational expressions are
expanded around the BUILD: expressions
Response Generation
Once a satisfactory theory for an utterance has been
constructed, it must be acted upon by the system. Regardless of
28.
.fr
O
.0_ HWIM:' A Computer Model
27
the type of action taken, some appropriate response-should also
be made. From the speaker's point of'view-the response should be
explicit, 'concise, and easy to. understand. he response may-1
affect1/4the speaker's way of describing entities in dOlpain asvs
well as illuminate the capabilities and operation of the system..,
° The effects of generated responsesre many. , In a domain
suchu as travel budget management there are many objects without
stanard names, e.g. , "the budget item for two trips to a West
Coast conference". Names chosen (constructed) by HWIM provide a
handle for the manager to use and thus strongly influence the
ways the objects are referred to- subsequently. Responses can
also indicate the capabilities of the program. tareful
construction of responses to exhibit' exaCtiy the knowledge
structures handled' by HWIM gives the managei the legitimateIN
confidence to pursue just the.paths ,which rely on those
structures: A response can also show the system's focus of
attention. the manager can thus get -a better idea, not only of
the facts 'she or he. seeks, but alo of how WelltheSY-SteliSA
understanding and wheremore-clarificatiod May be useful..4,
Types of Knowledge Used in HWIM
order to carry out.its role eiffeCtively, HWIM -must have. .
large amount o knowledge in a readily accessible form. This
knowledge is needed-for understanding both spoken utterances -and
a
HWIM: A Computer
28
written sentences, carrying out commands, answering questions,
and generating responses. HWIM's success in performing these
tasks is evidence that the types of knowledge embodied in the
system' are sufficient foriatural communication, at least at the.
level shown by sample dialogues. The-history of the development
of HWIM (N41,Webber & Bruce, 1976) suggest's that each. category
of khowledge is also necessary for natural communication to take-c.
place. The kinds of knowledge used by-HWIM can be categorized as*
follows:a
o Acoustic-Phonetic forms--Knowledg
their relation to aqodstic paramete
Phonological rules--KnoWledge of
and pronunciation variations describable by rules. .
phonemes and
honeme clusters !
o Lexical forms--Knowledge 'of words, their
inflections, parts of speech and phonemic spellings.
For example, "entered" is the past form of "to
enter".
A 0 Sententi41, forms--Knowledge. of1 p,
grammatica
structures at the phrase, clause, and sentencer .
level. For example, a sentence may start with a
dommand verb, such as "schedule:."
o _II-iscourse form--Knokg.edge of idealized discourse,
II
41
HWIM: A Computer Model
./"Q4g.,- what type of utterance is likely to
produced i a given s e of the discourse:
Seman is -- Knowledge of how words are related and.
used and structurally conveyed meaning. This is
uted fn parsing and in constructing responses: For
example, the benefactive case for'schedule:should be
29
'filled by a person who will be taking the trip being
scheduled.
"1:3 World--Knowledge of specific projects,' trips,
budgets and conferences. Whereas HWIM's semantic
knowledge is essentially fixed, its world knowledge
changes every time a trip is taken or money isIy
shifted from one budget item to another.
o On-Going discourse -- Knowledge of the current
discourse state, e.g., 'the current ptopic and the
objects available for anaphoric reference. This
knowledge is used to relm! constraints °in the
pragmatic grammar. For example, if a conference is.
under, discussion, then "What is the registration
fee?" is a meaningfdl utterance. If dot, then the
speaker would have to say something like, "What is
the registration fee for the ACL-conference?"
4)
1
t.
HWIM: FIComputer Model
30
,o Parameters of the communiaative'situationHWIM's
operation is a function of the modality in which it
operates;' speech of text input ,and speech or text
s output. In general, communicative situations can
vary along.a number f dimensions (see'. Rubin, 1980)
and a successful communicator must-use knowledge of
the situation to interpret and respond
appropriately.
'o User-Knowledge about each p iblv travel budget
manager, what groups .they belong to, and what they
may know about the'data base. For example, only
certain users may be allowed to modify budget
totals--
o Self-,Knowledge about the system's own knowledge
((often .called "meta-knoWledgeq. One example is
that data on trips and budgets is marked to indicate
-whether it came frol4 the travel budget manager or
was computer by HWIM. Anotha is that HWIM knows
what elements constitutfa.a complete description of
any object, such as a tr ipp Thus it knows when itse.
OWQ knowledge is incomplete.
o Task--Knowledge of the riask' dom'ain, e.g., that a
manager is Concerned with maintaining an accurate
tL
a HWIM: A Computer Model
31
record of all trips, taken or planned, and with
staying within the budget.
o Strategic--KTwledge'concerning the representation
and. use of other knowledge. This knowledge that
would be needed in even a "blank" systeml'i.e., one
that had no word- or domain-specific knowledge.
Conclusion4
One reason for building a computer model of language
understanding is that in the course of designing and debugging a
computer program, one must resolve theoretical question4rabout
details of the process that can be glossed over in less
procedure; oriented models. For example, desijners of computer ,
models of speech act generation (Cohen & Perrault, 1979) have
increased the prdcision of speech act definitions. A second
reason for computer models is that the task 6f the system can be
defined to require cdmplette use of language (wfthin a restricted
. domain, of course). . Thus,-one can See what components, each
repreSenting .aspects of language theory, are needed-and how they
might interact. Winograd's (1972) blocks world program, a system
which accepted typed questions,. commandsA , and assertions4..,
performed actions,' and generated natural' language responses,
could'' be put in the latter category. There is value in both of
these approaches; in fact, they tend to -complement each other,
0
350
.
HWIM: AComputer Model
32*
with the second showing What can and cannot be done and the first
addressing specific design questions.4
g in both categories, but its principal value probably,
lies more in 'the second till the first. To some extent, it
representsa synthesis of a number of lines, of research in areas
such as parsing, inference,- data base design, and generation. It
shows what can be dbne in a fairly, ompletz system' that
understands'natural language (typed or spoken), answers questions
and performs various actions, and responds 'in natural language .
(typed, and spoken).
blems that arose in the design of HWIM 'were precursors of
those t are central issues in AI research today. For example,
the speech act issues for HWIM are similar to thOse studied 'by
Cohen (Note 6), Cohen .and Perrault (1979), and A114n (Nlote 1).
Questions bf knowledge representation closely related to those
fAced in HWIM have been.puvsued by Bobrow, and Winograd,. (1977),
,,,Brachman (1979), and Fahlman (1979). (Also see Brachman &' Smith,,
J.980). Language gerneration in a discourse context similai to
HWIM'sas been studied by Mdbonald (Note 7), for instance. &
Finally, the issue of interactions among syntax, semantics, ,ond
pragmatics As crucial in the work of many; inqluding Schank and
Abelson (1977) ,. Woods (1980a) , and Bobrowi (Note 2). The
characteristics of HWIM reflect the goal of natural communication
34
HWIM: A Computer Mod4el
33.
,t
between a person and a computer assistant. Even in its limited. rs
domain, it illustrates the extent to which natural Communication' v. ,,.
s,r,
depends upon diverse kinds of knOwledge in bath communicants.,
. 1The structure of HWIM can. provide a 'useful framework for
.
''.
obtaining abetter undefstanding of:rtatuxal communication.,
35
33.WM' A computer Model
34
Reference
. 1.1. Allen, J. A plan- ased Approach, to speech act iecognition.. . az. 7
3(Tech. Rep. No. 13J/79) .; Tororlto: tYn pre r.s i tY of Toronto,Department of Computer ScienCe, January 19,79.
-
2. Bobrow, R. J.
no
The RtJS system . (BBN Report- No. 3878).Cambridge, Mass.: Bolt Beranek and Newman Inc.,
3. Brachm R. j:; Bobrow, R., Cohen, P., Klovstad, J. , Webber,B. L., & Woods, -W. A. . Research in natuAl languageunderstanding Annual re.Rort (1 Sept. 78-31 August 19.79)'.
, .04. Bruce, B. et. Belief systems and language understanding BBN-
Report No. 297J). Cambridge, Mass. Bolt Beranek And
Ne man, Inc. , 1975.- (a)
5t: Burton, R., & Brown,. J. S. Semantic grammar:, A Techique/ . .
p* for ponstr uc tin,g natural language:interfacesto, instructionalsystems (1EQ3N: Report No. 3587.) . Cambridge, Mass. Belt
.
.Beranek and Newman Inc., May 1977...-
0 . J6. Cohen, P. R. On knowing what to 'say: Planning speech acts
(Tech. Rep; NQ 118) . Toronto: Department of CoMpti,ter
Science, University 4of Torono, 1978.
a6
..HWIM: A Computer Model/
35
7. McDonald, D. 'Natural language production as a process of
decision making under constraint (MU AI Lab Tech. Rep.).
Cambridge, Mass.: Massachusetts Institute of Technology,
148o,Ms'
8. Sidner, C. L. Towards a computational theory of definite
anaphora comprehension-in English discourse (MIT AI LablTech..c
. awiro.
Rep. No. 537). Cambridge, Mass.: Massachusetts Institute of
Technology, 1979.
114,
9. Woods, W. A., Bates, M. Brown, G., Bruce, B., Cook, C.,
Klustad, J., Makhoul,rJ., Nash-Webber, B., Schwartz, R.
Wolf, J., Zue, V. Speech understanding systems: Final
technical progress report (30 October 1974--29 October 1976)..
Cambridge, Bdlt Beranek and Newman Inc., 1976.
/
Woods, W.A., Kaplan, R. M.; CNadh-Webber,. B. L. The, lunar
science natural language inforqation system: Final report
(BBN Report No. 2378) Cdmbridge, Mass.: Bolt Beranek and
- Newman, Inc., 1972.
a'
37
a
,HWIM: A Computer Model
36
Ref 4
Bobrow, D. G., & Winograd, T. An overview of ERL, a knowledge
representation language. Cognitive Science, 1977, 3, 3-46.I
row, R. J., & Brown, J. S. Systematic understanding:
Synthesis, analysis, and, contingent knowledgein specialized
undeKstanding systems. rri D. G. Bobrow & A. Collins (Eds.)
Representation and understanding: Studies in cognitive
science. New York: Academit Press,'1975.
Brachman, R. J. On the epistemological status of semantic
networks. In N. V. Findler (Ed.), Associative
networks: The representation and use of knowledge in
computers. New York: Academic Press, 1979.
'Brachmaril-R. J., & Smith; B. C. SIGART Newsletter, Special{ Issue
on Knowledge Representation, No. 70, February 1980.
BrUce, B.C. Pragmatics in speech un Proceedimgs
Fourth International Joint Conference on Artificial
Intelligence, TbiliSi, USSR, 1975.(b)
Bruce,- B. C. Discourse models and language comprehension.
American Journal for IClomputaional Linguistics 1975, 35,
19- 35. {c)
HWIZT: A 'Compute r Model
37
Bruce, B.C. Analysis of interacting planS as a guide to the
uqderstanding of story struCtuge. Poetics, 1980, 9,
29'5 -311.
Bruce, B. C. Natural communication between person and computer.
In W. Lehnert and M. Ringlel(Eds:), Strategies for natural
language processing. -Hillsdale, N.J.: Erlbaum, in press.
Cohen, P. R., &Perrault, C. R. Eletents of a plan-based theoryc
of speech acts. ,Cognitive Science, 1979, 3, 177-212.
Deutsch, B. G.The structure of task oriented dialogues. In1.- OF
. ,L.D. Erman (Ed.), Proceedings of\ the IEEE Symposium on
Speech Recognition, Pittsburgh, PA: Carnegie Mellon,
University, 1974.
A
Erman, L. D., Hayes-Roth, F.,lLesser, V. R., & Reddy, D. R. The
Hearsay-II speech-understanding system: Integrating
knowledge to resolve uncertainty. Computing Surveys, 1980,
12 213-253. \\,!,
Fahlman, S. E. NETL: A system for representing and usingcereal-world knowledge. Cambridge,, Mass.: MIT Press, 1979.
.
4ba, W. A. (Ed.) Trends in speech recognition. Englewood
Cliffs, N.J.: Prentice -Hall, ,1980.
0
3
HWIM: A Computer MOdelr 38
Nash-Webber, B. L. The role of semantics in automatic speech
understa4nding. In D. g./ abbrow & A. Cpiiins (Eds.) ,
Representation and understanding: Studies in cognitive
science. New York: Academic Press, 1975.
Nash-Webber, B., & Brice, B. C. Evolving uses of nowledge'in a
speech understanding system. Proceedings Of the COLING-76
Conference, Ottawa, Canada, June 1976.
Reichman, R. Conversational coherency. Cognitive Science, 1978,
2, 283-327.
Reiter, R. Query optimization for question - answering systems.
Proceedings Hof the COLING-76 Cclriference, Ottawa, Canada,-
' June 1976,
Rubin, A. D. A theoretical taxonomy of the differences between
oral and written language. In R. J. 'Spiro, B. C. Bruce, and
W. F. Brewer (Eds.), Theoretical issues in ileading
comprehension. Hillsdale, N.J.: -Erlbaum, 1980.
Rumelhart, D. E. Towards an interactive model of reading. In
S. Dormic (Ed.1, Attenti.cn and performance VI. Hillsdale,
N.J.: Erlhaum, 1977.
Schank, R. C., ,4 Abelson,. R. P. Scripts, plans, goals and
understanding; An inquiry into human knowledg. structures.
Hillsdale, N.J. Erlbaum,`1977...
4y
. ,
t
6 HWIM' A Computer Model
41.
39
Schegloff, E. A. Notes .on a conversational practice:
Formulating place. In D. Sudnow Ed.), Studies in socialI a
interaction. New York: Free Press, 1972. .
-Stansfield, 0", L. Programming a dialogue .teaching situation.
Unpublished doctoral disstrtation, Univefsity of Edinburgh,
1974.
Walker, D. E. (Ed.) Understanding spoken language. New
ElSevier°North-Holland, 1978.
Winograd, T. Understanding natural language. New York: Academic
Press, 1972.7,
Wobds, W. A.." Cascaded ATN Grammar'S. American Journal of
Compueational Linguistics, 1980, 6, 1-12. (h)
Woods, W. A. Multiple theory formation in high-leVel perception.
In R. J. Spiro, B. C. Bruce, & W. F. Brewer (Eds.),
Theoretical issues in re ding compreheh'sion. Hillsdale,
N)J.: 'Erlbaum, 19.80. (b)
Woods, W. A., & Makhoul, J, Mechanical inference problems in
'continuous4,speech* understanding. Proceedings of the Third
.e,//
,
Jnternatfonad Joint Conference on 'Artificial Intelligence,.
Stanford,, Calif., 1973.
41 .55'
o °
HWIM: A Computer Model
APPENDIX A
The Structure.of HWIM
40
The structure of HWIM is shown in Figure 1. Components of
the system are shown in ovals with thick arrows representing flow
of control between components. Major data structures are shown
in clouds with thin arrows indicating data flow. The system is
implemented on 'TENEX, a virtual memory time-sharing system for
the DEC PDP-10. Each component exists in a separate TENEX
proceqs (called a "fork"). Among the reasons for this
multiple-process job structure .(see Woods, et al., 1976) are that
most of the components ark so large, in terms of program and data
40structure requirements, that they cannot exist in the same TENEX
address space with another component. .
The Trip component is the one primarily discussed in 'this
paper. It controls the system and is the major component for
acting on andresponding to an utterance: If it`' is given a typed
input, it calls Syntax to parse the sentence. Otherwise it calls
Spillich Understanding Control.`I
, Most of the other components perform dingle functions:
Dictionary Expander expands a dictionary of baseform
pronunciations using a set.of phonological rules (applied once at
system loadup time) . Real Time Signal Acquisition' (RTIME)
42'
HWIM: A Computer Model
41
,
digitizes and stores the speech signal. Signal Processing (PSA)
converts the speech signal into a parametric representation.
Acoustic-Phonetic N:R§ssTizer (APR) operates on the parametric
represenq,tion of the speech signal to produce a set of phonetic
_hypotheses represented in the segment lattice. Lexical Retrieval
searches the expanded diCtionary and matches pronunclotions
against portions of the segment lattice. Verifier generates an
idealized spectral representation of a word (or words), using a
speech synthesis-by-rule program and matchesit against a region.
.
of the parametric representation of the speech.signal. Syntax4
.judges the grammaticality'of a given word sequence; predicts
possible extensions at each end of the'sequence; builds a formal
representation of an utterance. Interpreter (present in an eatly
version only) takes a syntactic representation of an utterance
and builds a' procedural representation of the meaning. Talker
generates speech from phoneme-prosodic cue strings.
IP
Q._