
Semantic Authoring for Blissymbols Augmented Communication

Using Multilingual Text Generation

Thesis submitted in partial fulfillment

of the requirements for the degree of

DOCTOR OF PHILOSOPHY

by

Yael Netzer

Submitted to the Senate of

Ben-Gurion University of the Negev

November 2006

Beer-Sheva


Semantic Authoring for Blissymbols Augmented Communication

Using Multilingual Text Generation

Thesis submitted in partial fulfillment

of the requirements for the degree of

DOCTOR OF PHILOSOPHY

by

Yael Netzer

Submitted to the Senate of

Ben-Gurion University of the Negev

Approved by the advisor

Approved by the Dean of the Kreitman School of Advanced Graduate Studies

November 2006

Beer-Sheva


This work was carried out under the supervision of Dr. Michael Elhadad

In the Department of Computer Science

Faculty: Natural Sciences


Acknowledgment

During the course of life, we meet people who become significant to us and they change life in a meaningful

way. I feel lucky that I met my advisor, Michael Elhadad, from whom I learned about Natural Language

Processing and Natural Language Generation in particular. I thank Dr. Elhadad for his cleverness and

kindness. Michael agreed to enter the AAC research field with me and he cooperated with my excitement

about it. I admire his ability to translate thoughts into solvable problems, his patience and most of all his

belief in me, that kept me working.

I thank Yoav Goldberg for the implementation of the Bliss lexicon - no one would have done it better,

and Ofer Biller for the development of SAUT. Meetings of the NLP group in Ben-Gurion University were

always a joy, especially the discussions on music afterwards with Meni Adler and Oren Hazai.

The Department of Computer Science in Ben-Gurion University in Beer-Sheva hosted me for the last

15 years (for all of my studies), so it was one of the most stable things in my life - I especially thank

Prof. Abraham Melkman and Prof. Klara Kedem for their sincere concern for me, Dr. Mayer Goldberg for

answering my Lisp queries, dear Dr. Tzachi Rosen for the useful discussions and his true friendship, Ami

Berler for the coffee breaks, and Valerie Glass for being my friend and assisting me with the formalities of

the University. The lab people were always helpful.

I thank Prof. Nomi Shir for teaching me linguistics and for her loving attitude, and Dr. Judy Wine for

introducing me to the AAC world in her course in Shaare Zedek.

The remarkable personality of my late grandmother, Dr. Puah Menczel, and the devotion of my mother

Dvorah and my sister Ruti to society were the initial motivation for my drifting into the AAC field and

I’m grateful for that.

I thank my beloved sons Guy, Eitan, and Daniel for being such inspiring language users, and especially

Daniel who taught me not to take the acquisition and usage of language for granted.

My sisters Chana and Ruti, my brother Yosef, and especially my parents Dvorah and Ehud were always

available for me with love and support and I am grateful.


This work is dedicated with love

to my parents

Ehud and Dvorah


Contents

Abstract ix

List of Figures xiii

List of Tables xiv

List of Abbreviations xv

1 Introduction 1

1.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.3 Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.4 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

2 Background 7

2.1 The need for communication - AAC . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.1.1 What is Augmentative and Alternative Communication? . . . . . . . . . . . . 8

2.1.2 Who Needs AAC – Disability Types . . . . . . . . . . . . . . . . . . . . . . . 10

2.1.3 A Brief History of AAC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

2.1.4 AAC Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.2 Speeding up Communication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2.2.1 Natural Language Processing and AAC . . . . . . . . . . . . . . . . . . . . . 20

2.2.2 Language Techniques for Assistive Systems . . . . . . . . . . . . . . . . . . . 21

2.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30


3 Objectives 32

3.1 Generation from Telegraphic Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

3.2 Generation as Semantic Authoring . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

4 Usage Scenario 39

4.1 Maintaining a View of Context . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

4.2 Argument Structure Specification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41

4.3 Referring Expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

4.4 Lexical Choice and Syntactic Realization . . . . . . . . . . . . . . . . . . . . . . . . . 43

4.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

5 System Architecture 46

5.1 Infrastructure Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

5.2 Flow of Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

5.2.1 Changing Displays Dynamically . . . . . . . . . . . . . . . . . . . . . . . . . . 49

5.2.2 Lexical Choice and Syntactic Realization . . . . . . . . . . . . . . . . . . . . 52

5.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

6 Natural Language Generation and Syntactic Realization 56

6.1 Natural Language Generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

6.1.1 The Architecture of an NLG System . . . . . . . . . . . . . . . . . . . . . . . 57

6.1.2 Multilingual Generation (MLG) . . . . . . . . . . . . . . . . . . . . . . . . . . 59

6.1.3 AAC as an MLG Application . . . . . . . . . . . . . . . . . . . . . . . . . . . 60

6.2 The Syntactic Realizer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

6.2.1 Input for Surface realization module . . . . . . . . . . . . . . . . . . . . . . . 62

6.3 HUGG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

6.3.1 FUF/SURGE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

6.3.2 SURGE input of a clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

6.3.3 Main Issues in Hebrew Generation . . . . . . . . . . . . . . . . . . . . . . . . 66

6.3.4 Hebrew Clause . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

6.3.5 Subjectless Clauses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

6.3.6 Existential, equative, possessive, and attributive clauses . . . . . . . . . . . . 68


6.3.7 Morphology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70

6.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

7 Lexical resources 72

7.1 Lexicons in NLG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

7.1.1 Levin’s verb classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

7.1.2 Online Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

7.1.3 Choice of Lexical Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

7.2 Bliss Lexicon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

7.2.1 Overview on Blissymbolics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

7.2.2 The Design of the Bliss Lexicon . . . . . . . . . . . . . . . . . . . . . . . . . . 85

7.2.3 Bliss Lexicon Software Development . . . . . . . . . . . . . . . . . . . . . . . 87

7.3 Using Lexical Resources for the System Lexical Chooser . . . . . . . . . . . . . . . . 88

7.4 Integrating a Large-scale Reusable Lexicon for NLG . . . . . . . . . . . . . . . . . . 90

7.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

8 Communication Boards 94

8.1 The SAUT Semantic Authoring Tool . . . . . . . . . . . . . . . . . . . . . . . . . . 94

8.1.1 Conceptual Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

8.1.2 Authoring Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

8.1.3 The SAUT Editor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

8.2 Bliss Communication Board . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

8.3 Implementing a Communication Board . . . . . . . . . . . . . . . . . . . . . . . . . . 102

8.4 The Processing Method - Adopting the SAUT Technique . . . . . . . . . . . . . . . 104

8.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

9 Comparison with Existing NLG-AAC Systems 107

9.1 Blisstalk . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

9.2 compansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

9.3 Transforming Telegraphic Language to Greek . . . . . . . . . . . . . . . . . . . . . . 111

9.4 pvi Intelligent Voice Prosthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

9.5 cogeneration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115


9.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

10 Evaluation 118

10.1 Evaluation of NLG systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

10.2 Evaluation of AAC systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

10.3 Evaluation of our System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

10.4 Evaluating SAUT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

10.4.1 User Experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

10.4.2 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

10.5 Evaluating Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

10.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

11 Contributions and future work 131

11.1 Bliss symbols lexicon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

11.2 HUGG . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

11.3 Integration of a large-scale, reusable lexicon with a natural language generator . . . 133

11.4 SAUT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133

11.5 Communication Board . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

11.6 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134

Bibliography 137


Abstract

This work presents a new approach to generating messages in an augmentative and alternative

communication system, in the context of natural language generation.

Background

The field of Augmentative and Alternative Communication (AAC) is concerned with studying

methods of communication that can be added to natural communication (speech and writing), es-

pecially when an individual lacks some of the skills to achieve it. An AAC system is defined as an

“integrated group of components, including the symbols, aids, strategies, and techniques used by

individuals to enhance communication.” [ASHA, 1991]. In the absence of an oral ability, symbols of

various types are presented on a display (or a communication board). Communication is conducted

by the sequential selection of symbols on the display, until it can be interpreted and pronounced

by the partner of the interaction. If technology is present, artificial voice is used.

Natural language generation (NLG) is a subfield of Natural Language Processing (NLP). The term

NLG refers to the process of generating utterances in a spoken language from another representation

of data, based on linguistic resources. For all applications, the generated text can be produced in

various languages, leading to multilingual generation (MLG), which aims to generate text in several

languages from one source of information, without using translation.

Objectives

This work presents a novel way to generate full sentences from a sequence of symbols, using NLG


techniques and the notion of dynamic displays [Porter, 2000].

In this work, we investigate ways to exploit natural language generation (NLG) techniques for

designing communication boards or dynamic displays for AAC users.

The purpose of this work is to design an NLG symbols-to-text system for AAC purposes.

Previous work on NLG-AAC has adopted a technique of first parsing a telegraphic sequence,

then regenerating a full sentence in natural language. The main difficulty with this method is that,

when parsing a telegraphic sequence of words or symbols, many of the cues used to recover the

structure of the text, and hence the meaning of the utterance, are missing; the absence of many

pragmatic clues makes semantic parsing of telegraphic input even harder.
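The parsing difficulty can be made concrete with a toy sketch (the function and role inventory below are invented for illustration, not the thesis implementation): without function words, inflections, or reliable word order, a telegraphic sequence such as "apple eat John" admits several role assignments, and only world knowledge selects the plausible one.

```python
# Toy illustration of telegraphic ambiguity (not the thesis implementation).
# Without case marking or word-order cues, "apple eat John" admits several
# role assignments; a parser must guess which argument is the agent.

from itertools import permutations

def candidate_readings(content_words, verb, roles=("agent", "patient")):
    """Enumerate possible role assignments for a telegraphic sequence."""
    args = [w for w in content_words if w != verb]
    readings = []
    for perm in permutations(args):
        readings.append({"verb": verb, **dict(zip(roles, perm))})
    return readings

print(candidate_readings(["apple", "eat", "John"], "eat"))
# Both the reading with "apple" as agent and the one with "John" as agent
# are produced; nothing in the sequence itself rules the first one out.
```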

The main question we address in this dissertation is whether generation is possible, not through

the process of parsing and regeneration, but through a controlled process of authoring, where each

step in the selection of symbols is controlled by the input specification defined for the linguistic

realizer.

In addition, we address the need to implement a wide coverage lexicon, which will not restrict

the system to a small vocabulary. We investigate how a reusable, wide coverage lexicon can be

integrated with existing syntactic realizers and within the AAC usage scenario.

The third aspect we address is multilingual (English/Hebrew) generation. In a continuation

of our previous work ([Dahan-Netzer and Elhadad, 1998a], [Dahan-Netzer and Elhadad, 1998b],

[Dahan-Netzer and Elhadad, 1999]) – the aim is to develop a system that can generate text in both

Hebrew and English from the same sequence of symbols.

We have chosen Bliss symbols as the input language of the communication board. Bliss is an

iconic language which is used worldwide by AAC users. Bliss is composed of a set of approximately

200 atomic meaning-carrying symbols. The rest of the symbols (approximately 2500) are

combinations of these atomic symbols. This compositionality is a very important characteristic of

Bliss as a language, and we designed a lexicon which captures the strong connection between the

meaning and the form of the symbols. We investigate how the explicit, graphic meaning of words

can be used in the process of language generation.

Finally, a practical objective of our work is to provide Bliss tools for Hebrew speakers. Most

software developed for Bliss worldwide (either commercial or experimental) cannot be used by

Hebrew-speaking users. We have developed a set of tools (lexicon, composition) to work with

Hebrew Bliss as part of this research.


Contributions

This project is built on a set of tools, which were developed separately,

then integrated into the AAC system.

The underlying process of message generation is based on layered lexical knowledge bases (LKB)

and an ontology. Each LKB adds necessary information to the overall lexical knowledge. The main

developments of this work are the Bliss Lexicon and an English verbs lexicon.

We designed and implemented the Bliss symbols lexicon for both Hebrew and English. The

lexicon can be used either as a stand-alone lexicon for reference or as part of an application. The

design of the lexicon takes advantage of the unique properties of the language. Technically, only

a set of atomic shapes is physically drawn while combined symbols are generated automatically,

following the symbol’s entry in a database that was constructed from the Hebrew and English Bliss

Dictionaries. The lexicon was implemented in a way that allows searches by text (a word), by

semantic components (e.g., "all symbols that contain a wheel"), or by graphic form (e.g.,

"all symbols that contain a circle").
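A minimal sketch of these three lookup modes (the data model and entries below are hypothetical, not the actual lexicon implementation):

```python
# Hypothetical Bliss lexicon sketch: each word maps to its semantic
# components and to the atomic graphic shapes its symbol is drawn from,
# supporting the three search modes described above.

LEXICON = {
    # word: (semantic components, graphic shapes) -- illustrative entries
    "wheelchair": ({"chair", "wheel"}, {"circle", "square"}),
    "sun":        ({"sun"},            {"circle"}),
    "house":      ({"house"},          {"square"}),
}

def by_word(word):
    """Textual search: look up a single word."""
    return LEXICON.get(word)

def by_component(component):
    """Semantic search, e.g. 'all symbols that contain a wheel'."""
    return [w for w, (comps, _) in LEXICON.items() if component in comps]

def by_shape(shape):
    """Form search, e.g. 'all symbols that contain a circle'."""
    return [w for w, (_, shapes) in LEXICON.items() if shape in shapes]

print(by_component("wheel"))   # ['wheelchair']
print(by_shape("circle"))      # ['wheelchair', 'sun']
```

Keeping components and shapes as separate indexed sets is what lets one query either the meaning side or the graphic side of a symbol independently.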

We have integrated a large-scale, reusable verbs lexicon with FUF (Functional Unification

Formalism) [Elhadad, 1991] / SURGE (a comprehensive generation grammar of English written

in FUF) [Elhadad and Robin, 1996] as a tactical component, so that the knowledge is encoded in the

lexicon and can be reused, and so as to automate, to some extent, the development of the lexical

realization component in a generation application.

The integration of the lexicon with FUF/SURGE also brings other benefits to message genera-

tion, including the possibility of accepting a semantic input at the level of WordNet synsets, the

production of lexical and syntactic paraphrases, the prevention of non-grammatical outputs, reuse

across applications, and wide coverage.

An additional component of the system's infrastructure is the syntactic realizer. HUGG (Hebrew

Unification Grammar for Generation) is a syntactic realizer (SR) for Hebrew, implemented

with FUF. HUGG inputs are designed to be as similar as possible to the inputs of the English SR

SURGE.
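The design goal of keeping HUGG inputs parallel to SURGE inputs can be illustrated with a toy sketch: one abstract input specification consumed unchanged by two language-specific realizers. The dictionary-style input and the tiny lexicons below are invented; real FUF functional descriptions are far richer.

```python
# Sketch of one abstract specification feeding two realizers
# (illustrative only; not the FUF/SURGE/HUGG input format).

fd = {"process": "eat", "agent": "girl", "affected": "apple"}

EN = {"eat": "eats", "girl": "the girl", "apple": "an apple"}
HE = {"eat": "ochelet", "girl": "ha-yalda", "apple": "tapuach"}  # transliterated

def realize(fd, lex):
    # Both toy "realizers" consume the same functional description;
    # only the language-specific lexicon differs.
    return f'{lex[fd["agent"]]} {lex[fd["process"]]} {lex[fd["affected"]]}'

print(realize(fd, EN))  # the girl eats an apple
print(realize(fd, HE))  # ha-yalda ochelet tapuach
```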

The core of the processing machinery of the AAC message generation system is based on SAUT

(Semantic AUThoring Tool) [Biller, 2005] [Biller et al., 2005] – an authoring system for logical forms


encoded as conceptual graphs (CG). The system belongs to the family of WYSIWYM (What You

See Is What You Mean) text generation systems: logical forms are entered interactively and the

corresponding linguistic realization of the expressions is generated in several languages. The system

maintains a model of the discourse context corresponding to the authored documents.
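A minimal data structure in the spirit of conceptual graphs, rendered in a linear CG-like notation (the class names and rendering are illustrative, not the SAUT encoding):

```python
# Toy conceptual-graph structure: a head concept plus labeled relations
# to other concepts, printed in a linear CG-style notation.

from dataclasses import dataclass

@dataclass
class Relation:
    name: str       # relation label, e.g. "agnt", "ptnt"
    target: str     # target concept label

@dataclass
class Graph:
    head: str            # head concept, e.g. "Eat"
    relations: list

    def linear(self):
        # Render as [Head]-(rel)->[Target]-(rel)->[Target]...
        out = f"[{self.head}]"
        for r in self.relations:
            out += f"-({r.name})->[{r.target}]"
        return out

g = Graph("Eat", [Relation("agnt", "Girl"), Relation("ptnt", "Apple")])
print(g.linear())  # [Eat]-(agnt)->[Girl]-(ptnt)->[Apple]
```

In a WYSIWYM editor, a structure like this is what the user builds interactively, while the realizer renders it back as text in each target language.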

The overall purpose of this work is the development of an AAC system, namely a dynamic

(virtual) communication board for Bliss users. The communication board we designed is inspired

both by the semantic authoring technique as implemented in SAUT and by dynamic displays

as studied by [Burkhart, 2005].

The symbols displayed on the screen at each step depend on the context of the previously entered

symbols. For example, if the previous symbol denotes a verb which requires an instrumental theme,

only symbols that can function as instruments are presented on the current display. The general

context of each utterance or conversation can be determined by the user, thereby narrowing the

diversity of symbols displayed.
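The filtering step described here can be sketched as follows (the role inventory and vocabulary are invented for illustration, not the system's actual ontology):

```python
# Sketch of context-dependent display filtering: after a verb is chosen,
# only symbols that can fill its next expected semantic role are shown.

VERB_ROLES = {"cut": ["agent", "patient", "instrument"]}

SYMBOLS = {
    # symbol: roles it can fill (illustrative)
    "knife":    {"instrument"},
    "scissors": {"instrument"},
    "bread":    {"patient"},
    "girl":     {"agent", "patient"},
}

def next_display(verb, filled_roles):
    """Return the symbols that can fill the verb's next unfilled role."""
    pending = [r for r in VERB_ROLES[verb] if r not in filled_roles]
    if not pending:
        return []  # argument structure complete; nothing to offer
    role = pending[0]
    return sorted(s for s, roles in SYMBOLS.items() if role in roles)

print(next_display("cut", ["agent", "patient"]))  # ['knife', 'scissors']
```

Because candidates are drawn from the verb's argument structure rather than from the whole vocabulary, each display stays small and every selectable symbol yields a well-formed input for the realizer.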

Finally, we review evaluation strategies for both NLG and AAC systems. Both fields struggle

with similar issues to define evaluation metrics that can be reproduced and can drive system

improvement in a predictable manner.

We present two aspects of the evaluation of the AAC system we developed: first, we performed

a user evaluation of the coverage, efficiency, and usability of the semantic authoring approach, as

implemented in the SAUT system; second, we established a detailed evaluation scenario of the

potential rate of data entry of the system by analyzing a small corpus of Bliss sentences.

Keywords: Natural Language Generation, Augmentative and Alternative Communication, Lexical

resources, Blissymbols Language, Semantic Authoring, Dynamic Display.


List of Figures

2.1 PCS board . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.2 Rebus board . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2.3 Comparison of symbols of concrete objects [CallCentre, 1998] . . . . . . . . . . . . . 18

2.4 Comparison of symbols of abstract concepts [CallCentre, 1998] . . . . . . . . . . . . 19

2.5 Minspeak© changes in meaning of apple . . . . . . . . . . . . . . . . . . . . . . . . 20

3.1 DynaVox© sentence starters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

4.1 VerbNet entry for Play . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

4.2 Bliss sequences for to be yeS verbs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

5.1 General architecture and flow of information . . . . . . . . . . . . . . . . . . . . . . . 47

5.2 System Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

5.3 Ontology fragment for the concepts: pan, breakfast, girl, egg, serve . . . . . 50

5.4 Ontology fragment of relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51

5.5 The display after the choice of the to play symbol . . . . . . . . . . . . . . . . . . . 52

6.1 A fragment of Hspell database for the word celev (dog) . . . . . . . . . . . . . . . . . 70

7.1 Wordnet entry for the word girl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

7.2 VerbNet entries for make - build-26.1 and watch . . . . . . . . . . . . . . . . . 78

7.3 FrameNet entry of the verb abstain . . . . . . . . . . . . . . . . . . . . . . . . . . 79

7.4 ComLex entry of the verb abstain . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

7.5 Hebrew and Bliss Medical Words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

7.6 Example for Bliss symbol types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83


7.7 Usages of Pointers for Meaning Selection . . . . . . . . . . . . . . . . . . . . . . . . . 84

7.8 Example: mind, minds, brain, thoughtful, think, thought, will think. . . . . . . . . 85

7.9 Semantic modifiers: much, intensifier, opposite. . . . . . . . . . . . . . . . . . . . . . 86

7.10 Hebrew vs. English Representation of Symbols . . . . . . . . . . . . . . . . . . . . . 87

7.11 Hierarchy of Bliss Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

7.12 A snapshot of the Bliss Lexicon Web Application . . . . . . . . . . . . . . . . . . . . 89

7.13 Lexicon entry for the verb appear . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

7.14 VerbNet make - build-26.1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

8.1 Architecture of the SAUT System . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

8.2 Linear representation of a Conceptual Graph . . . . . . . . . . . . . . . . . . . . . . 96

8.3 Snapshot of editing state in the SAUT system . . . . . . . . . . . . . . . . . . . . . . 98

9.1 The preferred semantic structure for the input Apple eat John . . . . . . . . . . . . 110

10.1 Output of LAM [Hill et al., 2001] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123


List of Tables

10.1 Learning time measures of recipe writing in SAUT . . . . . . . . . . . . . . . . . . . 125

10.2 Translation vs. Semantic Authoring time. . . . . . . . . . . . . . . . . . . . . . . . . 126

10.3 Accuracy percentage of four documents written in SAUT . . . . . . . . . . . . . . . 127

10.4 Error analysis in subjects’ generated documents. . . . . . . . . . . . . . . . . . . . . 127

10.5 Sentences vs. SAUT representation, number of words . . . . . . . . . . . . . . . . . . 130


List of Abbreviations

AAC Augmentative and Alternative Communication

CG Conceptual Graphs

FD Functional Description

EVCA English Verb Classes and Alternations

FUF Functional Unification Formalism

HUGG Hebrew Unification Grammar for Generation

LKB Lexical Knowledge Bases

MLG Multilingual Generation

NLG Natural Language Generation

NLP Natural Language Processing

PCS Picture Communication Symbols

SAUT Semantic AUthoring Tool

SR Surface (Syntactic) Realizer

SURGE Surface Realizer for Generation of English


Chapter 1

Introduction

The greatest problem in communication is the illusion that it has been accomplished. - George

Bernard Shaw

This work presents a new approach to generating messages in an augmentative and alternative

communication system, in the context of natural language generation.

1.1 Background

Acquiring language is a complicated process that may last a lifetime, and the use of

language keeps developing throughout life. For the great majority of human beings,

communication via natural language is a self-evident act, but this is not the case for everyone. People

who suffer from severe language impairments lack the ability to express themselves and cannot

achieve various forms of communication.

The field of Augmentative and Alternative Communication (AAC) is concerned with studying

methods of communication that can be added to natural communication (speech and writing),

especially when an individual lacks some of the skills to achieve it. An AAC system is defined as

an “integrated group of components, including the symbols, aids, strategies, and techniques used

by individuals to enhance communication.” [ASHA, 1991]

Research of this area includes psychology, medicine, speech therapy, engineering, and education.

AAC devices refer to either manual or automated tools, and include all devices that, in some way,

support the process of production or understanding of spoken or written utterances [Langer and


Newell, 1997]: message generation devices, text simplification devices, TV subtitle generators, sign

language interpretation and reading aids for vision-impaired people.

An aided communication system is the actual device that a person uses to communicate with

his environment (a person may use more than one such system at different times). In the absence of

a verbal ability, symbols of various types are presented on a display (or a communication board).

Communication is conducted by the sequential selection of symbols on the display, which are then

interpreted by the partner in the interaction. If synthesis-speech technology is present, artificial

voice is used.

Natural Language Processing (NLP) is the field of computer science that studies how linguistic

knowledge can help develop text-based applications, such as machine translation, text summarization,

expert systems, and document production.

Natural language generation (NLG) is a subfield of Natural Language Processing (NLP), a field

lying at the intersection of Computer Science, Linguistics, and Cognitive Sciences. The term NLG

refers to the process of generating utterances in a spoken language from another representation of

data, based on linguistic resources.

NLG techniques are finding a growing range of applications. Systems where vast volumes of

data require expert interpretation can exploit NLG so that the data is summarized and explained in

spoken language. The use of NLG is, in general, (1) to make data understandable (expert systems,

reports) and (2) to produce routine documents that must be updated often.

In some applications, NLG fulfils part of the overall requirement and the NLG techniques are

combined with other NLP aspects: Machine Translation (MT) ([Dorr et al., 1998] [Temizsoy and

Cicekli, 1998]) and automatic summarization ([Barzilay et al., 1999], [Hovy and Lin, 1998]).

For all applications, the generated text can be produced in various languages, leading to

multilingual generation (MLG), which aims to generate text in several

languages from one source of information, without using translation.

NLP and AAC approach the use of language from two very different points of view, but

have much in common. Both fields search for ways to produce

language by non-natural means, and for ways to make text easier to understand when the ability to

understand it is impaired or absent.


Using NLP techniques for AAC purposes (NLP-AAC for short) has developed as a field of research

over the last decade, and a few dedicated workshops have been organized (for instance, at the ACL

conference, 1997). A special issue on the subject was published by the Journal of Natural Language

Engineering in 1998. Several systems that integrate NLG techniques in aided communication sys-

tems have been developed in the past ([McCoy et al., 1998] [Vaillant and Checler, 1995] [Karberis

and Kouroupetroglou, 2002] [Copestake, 1997]).

This work presents a novel way to generate full sentences from a sequence of symbols, using

NLG techniques and the notion of dynamic displays [Porter, 2000].

1.2 Motivation

In this work, we investigate ways to exploit natural language generation (NLG) techniques for

designing communication boards or dynamic displays for AAC users.

The scenario we consider is the following: an AAC user selects a sequence of symbols; his

partner then reads out the sequence and utters a natural language sentence. We interpret this

scenario as a typical natural language generation process: content planning is performed by the

AAC user and content is expressed by the sequence of selected symbols; linguistic realization is

performed by the interlocutor.

The purpose of this work is to design an NLG symbols-to-text system for AAC purposes. In the

design of an AAC system, the main motivation is to provide the user with a communication tool

that enables a high rate of communication with as wide an expressive power as possible. Another way

to consider the task we address is to compare it to the task of expanding telegraphic-style input

to fully articulated language – with function words (determiners, prepositions) and proper handling

of morphology (such as inflections, plural markers, etc.). In this way, NLG techniques save the

user avoidable keystrokes and produce more fluent output. Moreover, the process of message

generation is incremental, i.e. a partial linguistic representation is displayed to the user after each

choice of a symbol is made. This incremental method, along with the immediate feedback, can

be used to aid the user not only in generating grammatical utterances, but also in the process of

planning a message that will be well understood by his companion and will correctly represent the

communication goal.1

1 I thank the anonymous reviewer for this remark.
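The telegraphic-to-fluent expansion described above can be sketched in a few lines. This is a minimal illustration only: the vocabulary and the article rule are invented for this example, and the actual system uses a full NLG realizer (FUF/SURGE), not ad-hoc rules like these.

```python
# Toy expansion of telegraphic symbol input into a fuller sentence.
# NOUNS_NEEDING_DET and the article rule are invented illustrations.
NOUNS_NEEDING_DET = {"apple", "ball", "book"}
VOWELS = "aeiou"

def expand(tokens):
    """Expand a telegraphic token list, e.g. ['I', 'want', 'apple']
    -> 'I want an apple.' Adds an indefinite article before bare
    singular nouns; leaves other tokens unchanged."""
    out = []
    for tok in tokens:
        if tok in NOUNS_NEEDING_DET:
            article = "an" if tok[0] in VOWELS else "a"
            out.append(article + " " + tok)
        else:
            out.append(tok)
    return " ".join(out) + "."

def incremental_feedback(selections):
    """Partial realization shown to the user after each symbol choice."""
    return [expand(selections[:i + 1]) for i in range(len(selections))]

print(incremental_feedback(["I", "want", "apple"]))
# ['I.', 'I want.', 'I want an apple.']
```

The point of the sketch is the feedback loop: a (partial) sentence is re-realized after every selection, so the user sees the effect of each symbol immediately.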


Previous works on NLG-AAC systems ([Vaillant, 1997], [Copestake, 1997], [McCoy et al., 1998]

for example) have adopted a technique of first parsing a telegraphic sequence, then re-generating

a full sentence in natural language. The initial message is of a telegraphic nature because it

lacks the main cues of morphological and syntactic structure that exist in natural language. As a

consequence, reconstruction of the intended meaning is made difficult. Deep semantic and lexical

knowledge sources are required to recover the meaning. Such resources are not readily available

in general and, as a result, systems with only a reduced vocabulary have been demonstrated. The

main difficulty in this method is that when parsing a telegraphic sequence of words or symbols,

many of the hints that are used to capture the structure of the text and accordingly the meaning

of the utterance are missing.

Moreover, as an AAC device is used not only for typing text, but also for real-time conversations,

the interpretation of the utterance relies to a large extent on pragmatics – such as the time of a

mentioned event, omitted syntactic roles, and references to the immediate environment. The

need to recover such pragmatic clues makes the semantic parsing of telegraphic style even harder.

1.3 Objectives

The main question we address in this dissertation is whether generation is possible, not through

the process of parsing and regeneration, but through a controlled process of authoring, where each

step in the selection of symbols is controlled by the input specification defined for the linguistic

realizer.

In addition, we address the need to implement a wide coverage lexicon, which will not restrict

the system to a small vocabulary. We investigate how a reusable, wide coverage lexicon can be

integrated with existing syntactic realizers and within the AAC usage scenario.

The third aspect we address is multilingual (English/Hebrew) generation. In a continuation

of our previous work ([Dahan-Netzer and Elhadad, 1998a], [Dahan-Netzer and Elhadad, 1998b],

[Dahan-Netzer and Elhadad, 1999]), the aim is to develop a system that can generate text in both

Hebrew and English from the same sequence of symbols.

We have chosen Bliss symbols as the input language of the communication board. Bliss is an

iconic language which is used world-wide by AAC users. Bliss is composed of a set of approxi-

mately 200 atomic meaning-carrying symbols. The rest of the symbols (approximately 2500) are a


combination of these atomic symbols. This compositionality is a very important characteristic of

Bliss as a language and we designed a lexicon which captures the strong connection between the

meaning and the form of the symbols. We investigate how the explicit, graphic meaning of words

can be used in the process of language generation.
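To illustrate the kind of compositionality described above, the following sketch represents combined symbols as lists of atomic components, so that only atoms need to be drawn. The atom and entry names are invented examples, not actual Bliss lexicon entries.

```python
# Hedged sketch: combined symbols are compositions of atomic,
# meaning-carrying shapes. All names below are invented examples.
ATOMIC = {"person", "wheel", "house"}

LEXICON = {
    # combined symbol -> ordered list of atomic components (invented)
    "driver": ["person", "wheel"],
    "vehicle": ["wheel", "house"],
}

def components(symbol):
    """Atoms map to themselves; combined symbols are looked up."""
    if symbol in ATOMIC:
        return [symbol]
    return LEXICON.get(symbol, [])

def render(symbol):
    """Only atoms are 'physically drawn'; a combined symbol is rendered
    by laying out its atoms (here, just naming them in order)."""
    return "+".join(components(symbol))

print(render("driver"))  # person+wheel
```

Because the composition is stored rather than the drawing, the same structure serves both rendering and semantic lookup.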

Finally, a practical objective of our work is to provide Bliss tools for Hebrew speakers. When

Bliss was adopted for use in Israel, a decision was made to write Bliss symbols from right to left as

in the Hebrew writing system, and consequently to invert the display of the symbols (or at least

of most of them). As a result, most software developed in the world for Bliss (either commercial

or experimental) could not be used by Hebrew-speaking users. We have developed a set of tools

(lexicon, composition) to work with Hebrew Bliss as part of this research.

1.4 Contributions

This project is built on a set of tools which have been developed separately and

then integrated into the AAC system.

The underlying process of message generation is based on layered lexical knowledge bases (LKB)

and an ontology. Each LKB adds necessary information. The main developments of this work are

the Bliss Lexicon and an English verbs lexicon.

We designed and implemented the Bliss symbols lexicon for both Hebrew and English. The

lexicon can be used either as a stand-alone lexicon for reference or as part of an application. The

design of the lexicon takes advantage of the unique properties of the language. Technically, only

a set of atomic shapes is physically drawn while combined symbols are generated automatically,

following the symbol’s entry in a database that was constructed from the Hebrew and English Bliss

Dictionaries. The lexicon was implemented in a way that allows searches by text (a

word), by semantic components (e.g., “all symbols that contain a wheel”), or by form (e.g.,

“all symbols that contain a circle”).
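The three search modes can be sketched as follows; the tiny in-memory records are invented for illustration and do not reflect the actual database schema.

```python
# Invented toy records: each entry carries its gloss, its semantic
# components, and the graphic shapes it is drawn from.
LEXICON = [
    {"word": "car",   "components": {"wheel", "enclosure"}, "shapes": {"circle", "square"}},
    {"word": "sun",   "components": {"sun"},                "shapes": {"circle"}},
    {"word": "house", "components": {"enclosure"},          "shapes": {"square"}},
]

def by_word(word):
    """Textual search: look a symbol up by its gloss."""
    return [e["word"] for e in LEXICON if e["word"] == word]

def by_component(component):
    """Semantic search: 'all symbols that contain a wheel'."""
    return [e["word"] for e in LEXICON if component in e["components"]]

def by_shape(shape):
    """Form search: 'all symbols that contain a circle'."""
    return [e["word"] for e in LEXICON if shape in e["shapes"]]

print(by_component("wheel"))  # ['car']
print(by_shape("circle"))     # ['car', 'sun']
```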

We have integrated a large-scale, reusable verbs lexicon with FUF/SURGE [Elhadad, 1991]

[Elhadad and Robin, 1996] as a tactical component, so that the knowledge encoded in the lexicon can be

reused and the development of the lexical realization component can be partly automated

in a generation application.

The integration of the lexicon with FUF/SURGE also brings other benefits to generation,


including the possibility of accepting a semantic input at the level of WordNet synsets, the

production of lexical and syntactic paraphrases, the prevention of non-grammatical output, reuse

across applications, and wide coverage.

An additional component of the system’s infrastructure is the syntactic realizer. HUGG is a

syntactic realizer (SR) for Hebrew generation, implemented with FUF. HUGG inputs are designed

to be as similar as possible to the inputs of the English SR SURGE.

The core of the processing machinery of the AAC message generation system is based on SAUT

[Biller, 2005] [Biller et al., 2005] – an authoring system for logical forms encoded as conceptual

graphs (CG). The system belongs to the family of WYSIWYM (What You See Is What You Mean)

text generation systems: logical forms are entered interactively and the corresponding linguistic

realization of the expressions is generated in several languages. The system maintains a model of

the discourse context corresponding to the authored documents.

The overall purpose of this work is the development of an AAC system, namely a dynamic

(virtual) communication board for Bliss users. The communication board we designed is inspired

both by the semantic authoring technique as implemented in SAUT and by dynamic displays

as studied by [Burkhart, 2005].

The symbols displayed on the screen at each step depend on the context of the previously entered

symbols. For example, if the previous symbol denotes a verb which requires an instrumental theme,

only symbols that can function as instruments are presented on the current display. The general

context of each utterance or conversation can be determined by the user, therefore narrowing the

diversity of symbols displayed.
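A minimal sketch of this context-dependent filtering follows, with invented verbs, roles, and symbol categories; the actual system derives such constraints from its lexical knowledge bases rather than from hand-written tables like these.

```python
# Invented illustration: after a verb is selected, only symbols that
# satisfy its thematic restrictions are offered on the next display.
VERB_FRAMES = {
    "cut": {"instrument"},  # 'cut' expects an instrumental theme
    "eat": {"food"},
}

SYMBOL_CATEGORIES = {
    "knife": "instrument",
    "scissors": "instrument",
    "apple": "food",
    "happy": "feeling",
}

def next_display(previous_verb, board):
    """Keep only symbols whose category fits the verb's expected roles;
    with no known restriction, show the whole board."""
    allowed = VERB_FRAMES.get(previous_verb)
    if allowed is None:
        return list(board)
    return [s for s in board if SYMBOL_CATEGORIES.get(s) in allowed]

BOARD = ["knife", "scissors", "apple", "happy"]
print(next_display("cut", BOARD))  # ['knife', 'scissors']
print(next_display("eat", BOARD))  # ['apple']
```

Filtering the board this way reduces the number of candidate symbols at each step, which is exactly what makes a dynamic display faster to use than a static one.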

Finally, we review evaluation strategies for both NLG and AAC systems. Both fields struggle

with similar issues to define evaluation metrics that can be reproduced and can drive system

improvement in a predictable manner.

We present two aspects of the evaluation of the AAC system we developed: we first performed

a user evaluation of the coverage, efficiency and usability of the semantic authoring approach, as

implemented in the SAUT system. We established a detailed evaluation scenario of the potential

rate of data entry of the system by analyzing a small corpus of Bliss sentences.


Chapter 2

Background

“Communication is the essence of life”

2.1 The need for communication - AAC

For most people, communicating through language is an obvious act, performed effortlessly

and naturally. Communication is required for various reasons, such as interaction, expressing needs

and desires, expressing knowledge, and inventing new ideas. Although communication between

human beings can be achieved with body gestures, facial expressions, or written messages – it is

mostly achieved with spoken words.

However, this is not the case for everyone. Estimates of the prevalence of severe language

impairment across the world suggest that approximately one percent of the world population

suffers from severe communication impairment1 [Beukelman and Mirenda, 1998, pp. 4-5].

People with severe language impairments cannot use language in a natural way, and must use

additional augmentative techniques in order to communicate. In some cases, people use facial

gestures or sign language (as do deaf people). In other cases, additional devices are required, either

hi-tech or low-tech (non-electronic devices) such as communication boards.

The study of augmentative and alternative communication and the use of communication boards

is a relatively new field of research and practice, a field that involves speech pathologists, physical

1 Figures vary between 0.8% and 1.2% in the USA, down to 0.12% in Australia. The variation may be due to differing definitions of severe language impairment: the American figures probably include deaf people, who are not considered AAC users, while the Australian figures exclude adults with acquired language disabilities such as aphasia.


and occupational therapists, assistive technology engineers, teachers, psychologists, medical experts,

and the social services.

2.1.1 What is Augmentative and Alternative Communication?

Augmentative and alternative communication (AAC) is concerned with studying methods of com-

munication that can be added to natural communication (speech and writing), especially when

an individual lacks some of the skills to achieve it. An AAC system is defined as an “integrated

group of components, including the symbols, aids, strategies, and techniques used by individuals

to enhance communication” [ASHA, 1991].

The objectives of an AAC system have been specified in a variety of manners. [Beukelman and

Mirenda, 1998] analyze communication with respect to the participants’ goals, the interaction con-

tent, scope and rate, and the participants’ tolerance for communication breakdown. For example,

they categorize the participants’ goals into these categories:

• Express one’s needs and/or wants

• Transfer information

• Achieve social closeness

• Meet social etiquette

When designing an AAC system, these aspects provide a way to evaluate the effectiveness of the

system and define its scope (what is the goal of the interaction, how fast it should be produced,

who the partner is, etc.).

[Porter, 2000] lists the requirements that an AAC intervention (i.e., the use of an AAC system

in a specific interaction setting) needs to fulfill to meet the communication needs:

• Intelligibility – the AAC system provides access to sufficient vocabulary to enable commu-

nication and to stimulate the further development of the interaction.

• Specificity – the AAC system provides access to vocabulary related to the current context.

• Efficiency – the AAC system provides easy and fast access to the vocabulary, overcoming

specific motor or physical difficulty.


• Autonomy – the AAC system provides the possibility to initiate an interaction with minimal

aid from a peer.

• Social value – the AAC system enables communication in different environments and with

different people.

[McCoy et al., 2001] refine these criteria to design an evaluation grid for AAC systems. When

comparing AAC systems, or comparing an AAC system with a non-assisted environment, the

following measures can quantify the quality of the system:

Intelligibility -

• better ability to express oneself

• more fluent (natural) conversation

• more natural interactions

Efficiency -

• faster communication

• fewer keystrokes

Social value -

• longer turns

• perception of communicative competence

[McCoy et al., 2001] also list the long-range consequences of the use of an AAC device on the

language impaired participant:

• Development of interaction skills

• Development of literacy skills

• Development of turn-taking skills

• Socialization

• Personal opportunities because of improved communication abilities


• Communicative competence

Different AAC techniques target different objectives. As a consequence, the environment of a

user must be engineered with a combination of devices, each enabling different forms of communi-

cation: social chatting requires a pool of socializing utterances with fast access; writing devices are

slower to operate but provide more expressive language. Different devices are used during bathing

time or at night time, when in hospital, or when shopping [McCoy et al., 2001].

2.1.2 Who Needs AAC – Disability Types

There are various physical and/or cognitive reasons for the disabilities and impairments of language

and speech. [Beukelman and Mirenda, 1998] define a dichotomy between developmental disabilities

and acquired physical disabilities:

Developmental Disabilities

Cerebral Palsy (CP) is a developmental neuromotor disorder that is a result of a nonprogressive

abnormality of brain development. CP is most commonly spastic, i.e., increased muscle tone

causes certain degrees of dysfunction of the limbs, but may otherwise be characterized by

abrupt, involuntary movements of the extremities, or be rigid or atonic. 60-70% of children

with CP have some degree of mental retardation. Approximately half of them suffer from

visual impairments; some from hearing loss and seizures.

Speech is affected too: dysarthria (the inability to use speech muscles) is very common, and

other speech problems exist that are caused by muscle dysfunction. Some speech disorders

are connected to mental retardation, hearing status, and acquired helplessness.

Mental retardation is characterized by significantly low intellectual skills, accompanied by

possibly limited communication, self-care, and other social skills. It may be defined by the

level of support a person needs [Beukelman and Mirenda, 1998, p. 250], assuming that appropriate

support can impact the abilities of individuals to live in a community. It is very

likely to be accompanied by other disabilities.

Developmental apraxia of speech (or childhood dyspraxia): children with articulation errors

and difficulty with volitional or imitative production of speech may also suffer from slowness

in motor skill development or mental retardation, probably due to neurological impairments.


Autism and Pervasive Developmental Disorders (PDD). The three main characteristics of

autism are impairments in social interaction, impairments in communication, and restricted and

stereotypical patterns of behavior. The range of communication skills among autistic individuals is

wide: from an absence of communication skills to good ones (as in Asperger’s syndrome).

Acquired physical disabilities

Amyotrophic Lateral Sclerosis (ALS) is a progressive degenerative disease of unknown etiol-

ogy involving the motor neurons of the brain and the spinal cord. As the disease progresses,

patients completely lose their ability to speak.

Other brain diseases, such as Multiple Sclerosis and Guillain-Barré Syndrome, may cause dysarthria.

In Parkinson’s disease, the speech disorder first affects voice and intonation, but speech intelligibility

may later be reduced or totally lost.

Spinal Cord Injury or Brain-Stem Stroke (cerebrovascular accidents) may cause a temporary

or a permanent loss of the ability to speak. Writing or the use of keyboards may also be

disabled due to the physical condition.

Aphasia is the state of a person’s inability to comprehend or generate language, due to a brain

stroke, an injury, a brain tumor, or other diseases.

Most forms of language impairment are associated with motor limitations. AAC has, therefore,

traditionally focussed on easing the selection of words or symbols from lists of pre-selected items.

2.1.3 A Brief History of AAC

AAC as a field of practice is continuously affected by social changes, psychological theories and by

the development of technology.

There are several aspects of AAC which have changed through time, including the identification

of the target population (who merits AAC intervention), assessment (the decision of when to

intervene), means (what tools are to be used), and evaluation of AAC interventions.

Starting from the 1950s and 1960s, growing awareness of human rights led to efforts to increase

the integration of persons with disabilities into society. In these early years, assessment by speech

therapists was limited to individuals with particular skills, such as the ability to imitate sounds


and comprehend and learn a spoken language. This restriction was due to a failure to distinguish

between language and speech disabilities.

Practitioners started teaching sign language to individuals with disabilities, in addition to deaf

people. This approach had the benefit of enabling fast communication. However, sign language

is not understood by most people, cognitive impairments affect the quality of the language, and

sign language requires accurate signaling, which in many cases was not possible since people with

speech disabilities often also suffer from motor impairments.

Through the 70s, public schools of many countries were legally obliged to accept all children

with disabilities. This legal action encouraged more significant efforts on finding solutions for non-

speaking children. However, the professional attitude was to wait until it was certain that the

person would not acquire spoken language, and to require prerequisite skills, before a person was

considered a viable candidate for AAC services [Hourcade et al., 2004]. The focus of aided

communication intervention was on the pragmatic aspect, i.e., recognizing that the ability to

communicate involves not only knowledge of a language but also mastery of its functions. Aided

communication still mostly used sign language, gestures, and picture symbols, but usually not in combination.

Symbol sets such as Rebus [Beukelman and Mirenda, 1998] and Blissymbols [McDonald, 1982]

were developed. Electronic devices were introduced, such as message printing devices and scanning

devices. Augmentative and alternative communication methods were still restricted to people with

good cognitive skills or without severe motor disabilities.

Throughout the 80s, assessment of AAC for individuals was based on the Communication

Needs Model: the primary goal was to reduce an individual’s unmet communication needs. Speech

therapists first identified an individual’s current communication needs and then, the degree to

which those needs were being met. At first, candidacy for AAC was determined considering one’s

cognitive abilities, age, and motor-oral abilities. If a decision for intervention was taken, an aided

or unaided communication device was chosen, and finally goals of communication were determined.

With the development of computers, new means of communication became possible: voice

prosthesis, pointing devices, and assisting software. It was also understood that in order to enable

good communication, both aided and unaided techniques must be available for each individual,

regardless of his cognitive and physical disabilities, and that there are no prerequisites

for aided communication assessment.

Contemporary assessment is mostly characterized by the Participation Model, a model which


assumes that each individual is entitled to and can achieve enhanced communication. Following this

model, therapists identify the individual’s patterns of communication throughout the day and in

different contexts, and then assess future communication needs. The AAC system for an individual

is the overall solution – the set of devices and strategies – answering her needs [Hourcade et al., 2004].

Starting about twenty years ago [Hunnicutt, 1986], side by side with the development of Natural

Language Processing (NLP) research, many novel approaches were introduced to the field. These

will be discussed later in Section 2.2.2.

2.1.4 AAC Techniques

An aided communication system is the actual device that a person uses to communicate with his

environment (a person may use more than one such system at different times). In the absence of

an oral ability, symbols of various types are presented on a display (or a communication board).

Communication is conducted by the sequential selection of symbols on the display, which are then

interpreted by the partner in the interaction. If technology is present, artificial voice is used.

AAC devices are characterized by three aspects [Hill and Romich, 2002]:

1. Selection method

2. Input language

3. Output medium

In a computerized system, as [McCoy and Hershberger, 1999] mention, a processing method

aspect is added to this list. This method refers to the process which creates the output once symbols

are entered.

Each individual’s condition determines which method will be chosen for each of these aspects. Fitting

the right communication device is done by a team of professionals: a speech pathologist, an

occupational therapist, an engineer, and others.

This section elaborates on these three aspects – first the possible selection methods, i.e., the

physical choice of symbols on the communication board; then the types of input languages that are

commonly used for aided communication and the various considerations in their initial choice; and

finally the output devices that are in use. The processing methods are discussed later in Sections


2.2.2 and 8.4, and Chapter 9.

Selection methods are strongly connected to the person’s cognitive and physical abilities, and

are affected by the device used to communicate.

Selection can be either direct or assisted. Direct selection is achieved by pointing with a finger or

using physical pressure on a display, a keyboard, or a touchscreen. It is also possible with eye-gaze

or with the use of an alternative pointing device such as a light-generating device on the head or an

eye-gaze tracking device. If direct selection is not possible, scanning techniques are possible with

the aid of a peer, by using a set of switches, or by an auditory scan. The display or the keyboard

is scanned at an adjustable rate and selection is made when the speaker indicates that the desired

symbol has been reached. Effective scanning techniques are crucial, since this selection process is much

slower than direct selection.
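A back-of-envelope comparison (not from the thesis) illustrates why the scanning strategy matters: counting the highlight steps needed to reach a symbol under linear scanning versus row-column scanning.

```python
# Illustrative step counts only; real scanning times also depend on
# the scan rate and on the user's switch-activation delay.

def linear_steps(index):
    """Highlight steps to reach the symbol at 0-based `index` when
    every cell of the board is scanned one by one."""
    return index + 1

def row_column_steps(row, col):
    """Steps with row-column scanning: scan down to the target row,
    select it, then scan across to the target column."""
    return (row + 1) + (col + 1)

# Reaching the last symbol of a 6x6 board (position 35 = row 5, col 5):
print(linear_steps(35))        # 36
print(row_column_steps(5, 5))  # 12
```

Even on this small board, row-column scanning reaches the worst-case symbol in a third of the steps, which is why the choice of scanning technique has a large effect on communication rate.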

If speech intelligibility is good enough and if technology is available, voice recognition is very

useful, although it may be restricted to a small set of words and phrases.

Selection methods are tied with the type of display that is used. As displays vary in many ways,

so do the parameters of symbol selection: number of items presented, size of the display, as well as

symbols and their orientation.

Size is affected by the particular visual, motor, and cognitive abilities of the user – for example,

the motor skills required for direct selection – and by the space that is available in his environment

[Woltosz, 1997].

Displays may be static or dynamic. Static displays consist of a board where the presented

symbols do not change automatically. They “provide a fixed set of symbols which are mechanically

affixed to an underlying layer of plastic or paper material.” [Woltosz, 1997] In dynamic displays,

the symbols displayed may change automatically in response to the use of the device.2

Low-tech paper static displays have the advantage of being cheap, easily made and highly

portable. Electronic static devices offer more novel uses such as semantic interpretation of symbol

sequence, lower cost, and mobility. Dynamic devices have the benefits of a wider vocabulary and a

decision-based usage of the display as opposed to a memory-based usage that may be cognitively

more demanding. Selecting the device is affected by the various factors that characterize AAC

2 Although the term dynamic display refers to electronic devices in most cases (for example, http://www.augcominc.com/articles/7_2_1.html), it is also used for booklet-style carton displays, as in Porter’s system [Porter, 2000].


decisions – the special needs of the communicator. For a motor disability, which affects accuracy of

selection but not the language skills, a dynamic display which offers a reduced number of symbols

at any given time may be more appropriate than a static display with a large number of small

symbols. However, if selection itself is at a very slow rate, and navigation between displays creates

an additional load, a static display may be more useful [Woltosz, 1997].

Nowadays, there are several off-the-shelf computerized devices both for dynamic and static dis-

plays. Some are dedicated devices, such as DynaVox, and some take advantage of laptop computers

as a basis for the communication device (Mayer-Johnson’s Speaking Dynamically©, Don-Johnston

Talk-About©).

Considerations in designing the layout of dynamic displays, especially for young children, are

described in detail by [Burkhart, 2005] and [Porter, 2000]. The main idea is to allow the com-

municator access to as wide a vocabulary as possible while keeping every presentation simple and

easy to use. This is achieved by easing access to each page using category buttons, as well as other

browsing options (next page, main menu, and similar). Another important issue is to leave space

for newly acquired words, to provide easy access to the alphabet, and to consider the overall positioning of all symbols.

The input language for AAC purposes varies across countries, kinds of disability, and special

characteristics of the individual (who may use, for instance, both spelling methods and symbolic

displays).

A significant taxonomy for symbols is the scale of transparency. Transparent symbols have an

immediately recoverable referent (icons), while opaque symbols require knowledge in order to understand

their meaning (e.g., written language). In between these two extremes there are translucent

symbols [Beukelman and Mirenda, 1998], symbols that are not readily guessed without additional

information. The trade-off between expressibility and transparency is significant and affects the

decision of which symbol system should be used, usually according to the speaker’s abilities.

I will not discuss here unaided symbols, which refer to body and facial gestures and signs, or

aided tangible symbols (real or miniature objects).

Aided representational symbols [Beukelman and Mirenda, 1998] refer to two-dimensional sym-

bols in various levels of abstraction (or transparency).

Representational symbols are further divided into:


1. photographs - colored, black and white;

2. line drawing symbols

Within line-drawing symbols, the most common of all are the Picture Communication Symbols

(known as PCS).

Figure 2.1: PCS board

The PCS symbol system (from Mayer-Johnson Co.) is a line-drawing set of symbols, which

is accompanied by software (BoardMaker©) (Figure 2.1). PCS has a set of 3,900 symbols and

continues to develop intensively. PCS is used effectively by pre-school children without cognitive

disabilities and by adults with cognitive disabilities. The use of PCS seems to be acquired more

quickly than that of Blissymbols [Beukelman and Mirenda, 1998, p. 59].

Rebus symbols are also line-drawing symbols (Figure 2.2). To date, there are 7,000 symbols,3

either colored or black-and-white, covering a vocabulary of over 20,000 words. Originally, the idea

of the symbols was to represent homophone words with the same symbol (i.e., the symbol for not

stands for both not and knot [Beukelman and Mirenda, 1998, p. 60]); however, this method is

no longer applied.4 Rebus symbols are about as easy to learn as PCS, or slightly harder.

3 http://www.widgit.com/symbols/about_symbols/widgit_rebus.htm

4 http://www.widgit.com/symbols/about_symbols/literacy/02.htm


Figure 2.2: Rebus board

Other line-drawn symbol systems are Picsyms, DynaSyms, and Pictogram Symbols, all of which are translucent to some extent and easier to learn than Blissymbols.

We have chosen Bliss symbols as the symbol set in the suggested implementation. The reasons for this choice are (a) an immediate need for Bliss symbol tools for use in Hebrew, as reported by Israeli practitioners; and (b) the internal structure of Bliss symbols can be exploited efficiently, for instance, in the search process. Bliss symbols are composed of meaning-carrying atoms, and therefore the graphical representation can be linked directly to the semantic representation. However, the architecture of the system does not depend entirely on Bliss, and another symbol set can be used, provided each symbol is linked to the matching concept in the lexicon.

An extended review of Bliss symbols is found in Section 7.2.1.

Figures 2.3 and 2.4 exemplify various representations of concrete and abstract words in the above-mentioned symbol sets.

On the opaque side of the transparency scale are the orthographic symbols – written language, Morse code, Braille text, and phonemic symbols.

There are several possible outputs for communication displays. Electronic devices (VOCA -

Voice Output Communication Aid) have either

1. digital output – i.e., recorded utterances.

2. synthesized output – a text-to-speech system.

Digital output has the benefit of being more personal but requires recording in advance. Synthesized output is flexible but may not be as pleasant for the user.


Figure 2.3: Comparison of symbols of concrete objects [CallCentre, 1998]

2.2 Speeding up Communication

A very important aspect of an AAC device is the rate of communication it enables for its user. A normal spoken conversation averages 150-250 words per minute, but an AAC user produces fewer than 15 words per minute under most circumstances, and in many cases only 2 to 8 words per minute. Therefore, a major aim in designing communication tools is to find methods to enhance the communication rate.

Measuring rate enhancement is a complex matter since it is affected by various factors, which

vary among the individuals who use the systems. [Beukelman and Mirenda, 1998] lists the following

factors:

• Linguistic cost (average number of selections)

• Motor act index (number of keystrokes)

• Time or duration of message production.

• Cognitive processing time that is needed to make the selections.

• Productivity and clarity indices (i.e., measures of which meaning may be encoded and how

well it is encoded).

Three main factors of rate measurement include [Hill et al., 2001]:


Figure 2.4: Comparison of symbols of abstract concepts [CallCentre, 1998]

1. language representation method usage.

2. selection rate

3. errors

The first factor is measured by the number of words generated per minute, and enhancement is measured as the ratio between the word rate achieved with and without the enhancement technique [Hill et al., 2001].

One option to enhance communication is to use message encoding. Encoding can be done by letters, letter-category, alpha-numeric, or numeric encoding; for example, with letter encoding, Please open the door for me can be encoded as OD.

Methods used for abbreviation expansion are elaborated in section 2.2.2.

A more sophisticated encoding method is iconic encoding, as realized in Minspeak's semantic compaction [Baker, 1984]. In this system, which contains 128 basic symbols, the same symbol can be used for various meanings, as context determines; symbol sequences are prestored in an electronic device so that when a sequence is chosen the corresponding vocal output is produced.

For instance, apple by itself does not have any meaning, but apple + rainbow refers to the word red, apple + house means grocery, and time + apple means what time do we eat


Figure 2.5: Minspeak c© changes in meaning of apple

(see Fig. 2.5). It is important to note that this system was intended for vocal output since the

complicated encoding system may not be understandable by a conversation partner [CallCentre,

1998].

2.2.1 Natural Language Processing and AAC

Natural Language Processing (NLP) is the field of computer science that studies how linguistic knowledge can help develop text-based applications, such as machine translation, text summarization, expert systems, and document production.

NLP research and application development can be divided into three related subfields:

1. Natural Language Understanding (NLU) – understanding the meaning of a given text

2. Natural Language Generation (NLG) – generating text representing a given meaning

3. Language-transformation (such as machine translation) – transforming a given text into an-

other textual representation

Underlying all three of these objectives, several lower-level tasks are required by all NLP appli-

cations:

Part of Speech (POS) tagging consists of assigning to each word in a text its part of speech (a

label such as verb, noun, pronoun, preposition). POS tagging is a crucial task for further levels


of text processing, such as shallow parsing (or chunking), i.e., identifying phrases in a given text. Attachment resolution is needed to ensure correct syntactic parsing, i.e., finding the syntactic structure of a sentence. Anaphora resolution (finding antecedents of referring expressions) and word sense disambiguation (finding the most likely sense of a given word) pave the way to semantic parsing, i.e., understanding the meaning of the text.

NLP applications can be viewed from another perspective – how linguistic knowledge is encoded

and how it is acquired. Some applications rely heavily on statistical information and machine

learning techniques, and produce a program with opaque information encoding. Some statistical

applications produce a set of rules which is readable and understandable to a human reader. Non-

statistical methods (also called symbolic) rely on hand-written encoding of rules.

2.2.2 Language Techniques for Assistive Systems

NLP techniques have been used for AAC applications to enhance the rate of communication and

extend the range of expressions that can be generated. The key applications include message

generation, abbreviation expansion, word prediction and text simplification.

Enhancements brought by NLP techniques focus first of all on reducing, as much as possible, the number of characters typed by the user.

[Boissiere, 2003] defines the coding principle, and accordingly distinguishes three aspects of

writing assistive systems:

• User’s point of view (with reference to the coding principle)

1. Abbreviation expansion – the user memorizes a set of abbreviated words and rewriting

rules

2. Word prediction with a list of possible words

3. Word prediction with letter guessing

• Designer’s point of view – how syntactic, statistical, lexical, and semantic knowledge sources are used to improve the coding principle

• Combined view


Word Prediction

Word prediction aims at easing word insertion in text software by guessing the next word to be written, or by offering the user a list of possible words.

A similar process happens naturally in a human conversation between an AAC-user and a

speaking partner. In such a situation, a speaking partner is most likely to predict the word that

is to be said by using her knowledge about language and the context of the conversation [Garay-

Vitoria and Abascal, 2004].

The main purpose of word prediction is to speed up typing, but it can also help dyslexic people

in reducing writing errors. This field of research has seen a surge of interest with the development

of mobile phones (with their limited keyboard) and of handheld devices.

An example for word prediction in a given state of text insertion is as follows:

I play b

The system may offer the following words:

be born ball baseball brand

Now, if the next letter inserted is a, then the system narrows the offered words to:

ball baseball basketball baglama balalaika
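The prefix-narrowing step in this example can be sketched with a frequency-ranked lexicon, in the spirit of the unigram method discussed below. The toy lexicon and its frequencies are illustrative assumptions, not data from any cited system.

```python
# Sketch of unigram word prediction: candidates matching the typed
# prefix are ranked by corpus frequency. Toy frequencies, for
# illustration only.
LEXICON = {"ball": 120, "baseball": 95, "basketball": 80,
           "be": 500, "born": 60, "brand": 40,
           "balalaika": 2, "baglama": 1}

def predict(prefix, k=5):
    """Return up to k lexicon words starting with prefix, most frequent first."""
    matches = [w for w in LEXICON if w.startswith(prefix)]
    return sorted(matches, key=lambda w: -LEXICON[w])[:k]
```

Typing a further letter simply calls `predict` again with the longer prefix, narrowing the candidate list as in the example above.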

The strategies taken in word prediction software are either to complete the currently typed word, for example by recalculating probabilities with each new character inserted, or to offer, in a pop-up menu, a choice of words the user probably meant to write, given the letters or words already typed.

The process of prediction itself is made by using the following knowledge sources:

1. Statistical information - from unigrams, i.e., probabilities of isolated words, to more complex language models such as Markov models. The most common method in prediction applications is the unigram (see references in [Garay-Vitoria and Abascal, 2004]).

2. Syntactic knowledge - considering part-of-speech tags and phrase structures. Syntactic knowledge can be statistical in nature or can be based on hand-coded rules [Garay-Vitoria and Abascal, 1997].

3. Semantic knowledge can be used by assigning categories to words and finding a set of rules


which constrain the possible candidates for the next word. This method is not widely used in

word prediction, mostly because it requires complex hand coding or may be time consuming

and inefficient for real-time requirements [Garay-Vitoria and Abascal, 1997].

All methods require lexical data. Such data can be acquired from corpora, along with word frequencies, and from lexical databases (which may be incorporated into the system). A word-prediction lexicon usually includes word frequencies; it may also include part-of-speech and semantic data. Lexicons must be adaptable, e.g., updated with the user’s vocabulary, and should be organized efficiently (linear vs. tree structure, with the trade-off of insertion cost) [Garay-Vitoria and Abascal, 2004].
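The tree-structured alternative mentioned here is commonly a trie, where completing a prefix costs time proportional to the prefix length plus the number of matches, while insertion is more expensive than appending to a flat list. The class below is a minimal sketch of that design choice, not code from any cited system.

```python
# Sketch of a tree-structured (trie) prediction lexicon storing word
# frequencies at word-final nodes.
class TrieNode:
    def __init__(self):
        self.children = {}
        self.freq = None  # word frequency if a word ends at this node

class Lexicon:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word, freq):
        # Walk (and create) one node per character.
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.freq = freq

    def complete(self, prefix):
        """All (word, freq) completions of prefix, most frequent first."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return []
            node = node.children[ch]
        out = []
        def walk(n, acc):
            if n.freq is not None:
                out.append((acc, n.freq))
            for ch, child in n.children.items():
                walk(child, acc + ch)
        walk(node, prefix)
        return sorted(out, key=lambda pair: -pair[1])
```

Updating the lexicon with the user's own vocabulary amounts to further `insert` calls, which is where the insertion-cost trade-off appears.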

In languages with a rich inflectional morphology, a statistical method based on frequencies only

is not efficient, and a wide variety of syntactic knowledge is required [Boissiere, 2003] [Garay-

Vitoria and Abascal, 1997]. A mixed approach which involves language models with part of speech

information is more appealing and has been implemented in various systems (see references in

[Boissiere, 2003]). For instance, [Garay-Vitoria and Abascal, 1997] present a system where, at the beginning of a sentence, words are predicted using a statistical model, and later on, parsing of the partial sentence is used to predict further words. Another possible method uses two prediction steps – first of a root and then of its possible inflections [Garay-Vitoria and Abascal, 2004].

Syntactic approaches require a set of linguistic tools such as POS taggers and lemmatizers,

which are not available in all languages. Statistical methods are based on learning parameters from

large corpora. This is problematic when the language that is written with the aid of the word

prediction system is of a different style than the training data (which is, in most cases, obtained

from newspapers).

Since the personal language used may be very different from the one on which the model was based, systems must have a good strategy for handling unseen words or word sequences (backoff models).

Some word predictors build their language model on-line and are updated as the user enters

more text. This strategy is an effective way to balance the mismatch with “off the shelf” language

models, but it suffers from the limited amount of data available to construct the individual language

model.


There are several heuristics which are claimed to reduce the number of keystrokes significantly:

1. Recency promotion either by increasing statistical parameters of recently seen words, or by

managing a file of the words used most recently.

2. The trigger and target method, where certain words can be used as a trigger to the possible

presence of another word within some distance.

3. Capitalization of proper nouns and at the beginning of sentences.

4. Inflecting words where needed (based on syntactic knowledge).

5. Writing compounds (in languages with rich compounding like German or Dutch).

The drawbacks of the word prediction method lie mostly in the need to take an overt action to verify the system’s selection. Typing is therefore not a fluent task and may impose a cognitive load [Shieber and Baker, 2003].

Evaluation of word prediction systems considers keystroke savings, time savings, and cognitive overload (length of the choice list vs. accuracy). A predictor is considered adequate if its hit ratio remains high while the required number of selections decreases [Garay-Vitoria and Abascal, 2004].

Word prediction can save approximately 50% of the keystrokes (detailed analysis can be found,

for instance in [Garay-Vitoria and Abascal, 2004]).
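Keystroke savings, the evaluation measure used above, can be stated as a one-line formula; the function name below is an illustrative assumption.

```python
# Keystroke savings: the fraction of keystrokes spared relative to
# typing every character of the message in full.
def keystroke_savings(chars_full, keystrokes_used):
    """1.0 means every keystroke saved; 0.0 means no saving at all."""
    return 1.0 - keystrokes_used / chars_full
```

For example, a 20-character message entered with 10 selections gives a savings of 0.5, the roughly 50% figure cited above.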

Abbreviation Expansion

Another option to enhance communication is to use message encoding. Encoding can be done by

letters, letter-category, alpha-numeric, or numeric encoding.

For example: the letter encoding for Please open the door for me can be OD.

This is a very natural way to increase the communication rate. The naive and primary method for text expansion is a pre-defined look-up table, typically defined by the user. This technique requires memorizing the codes and maintaining the look-up table, and may cause cognitive overload [Moulton et al., 1999]. An ideal system would allow the user to generate abbreviations with no cognitive load or extra cost in keystrokes, would handle spelling and typing errors, and would allow the use of new words in the lexicon.

[Shieber and Baker, 2003] suggest a system that applies both prediction and compression of text

insertion, using a human-centered compression method. This is accomplished by allowing the user


to drop all vowels in a word (except for initial letters), as well as dropping one letter in consecutive

duplicate consonants.
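The two compression rules just described can be sketched directly. Edge cases (e.g., the treatment of 'y', or of duplicates created by vowel removal) are assumptions of this sketch, not details reported by [Shieber and Baker, 2003].

```python
# Sketch of the human-centered compression rule: drop all vowels except
# a word-initial one, then drop one letter of any run of consecutive
# duplicate letters.
VOWELS = set("aeiou")

def compress(word):
    if not word:
        return word
    # Keep the initial letter unconditionally; drop later vowels.
    kept = word[0] + "".join(c for c in word[1:] if c not in VOWELS)
    # Collapse consecutive duplicates to a single letter.
    out = []
    for c in kept:
        if not out or out[-1] != c:
            out.append(c)
    return "".join(out)
```

On the worked example below, `compress("example")` yields `exmpl` and `compress("words")` yields `wrds`, matching the compressed sentence in the text.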

A language model was learned on the Wall Street Journal corpus, in four stages: construction

of an n-gram model of the corpus (basically, this means that the model can predict the likelihood

of any sequence of n words), translating the language model to the compressed version of words,

taking care of unknown words, and handling numbers.

For example, consider the sequence: < An >< example >< of >< NUM >< words >. The

sequence gets a probability by applying the language model. It is then converted to a sequence

of characters and the unknown number is inserted – resulting in an example of 5 words. The

words are then compressed – an exmpl of 5 wrds. In this example, the last word could also be the abbreviation of wards; however, probabilities are assigned to each possible sequence, and the algorithm finds the most likely source. The system’s reduction in keystrokes, measured in characters, was 26.5%, with a low error rate of 3%.

Symbols and Prediction

In addition to word prediction, several systems use prediction for sequences of symbols.

[Waller and Jack, 2002] used word prediction methods for translating Bliss symbols into English.

Based on the idea of language-independent word prediction [Claypool et al., 1998], a system for language-independent translation from Bliss to English was developed, integrating the translation module into a Bliss word-processor [Andreasen et al., 1998].

For this purpose, two dictionary files were created: a word association dictionary, containing

tri-gram information from a given corpus (a source text-file), and a file with information for each

Bliss symbol: the symbol translation in English, synonyms, and possible inflections of the word.

The word association dictionary contains balanced binary trees of three levels: each word is a node in a balanced binary tree and functions as the root of a second binary tree containing the words found to follow it in the text; each of these nodes, in turn, is the root of a third-level binary tree for the third word of a sequence. Each node also stores the frequency of its word.

Once these files are created, translation proceeds as follows: given a Bliss sequence to translate,

the program consults the Bliss dictionary and retrieves all synonyms/inflections for the given word.

For every possible sequence, using a Markov language model, the association dictionary is searched

and the probability of the sequence is calculated. However, because of the telegraphic nature of


the Bliss utterance, for a given sequence AB, the trees are searched for A Y B sequences as well.

For example, if the given input is: boy + to go – assuming lad is a synonym of boy and going is

a possible inflection of to go, possible strings that are computed are: boy + is + going or lad is

going.
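The candidate-scoring step of this translation scheme can be sketched as follows: each slot of the telegraphic sequence contributes its synonyms and inflections, gap words are inserted between symbols (the A Y B search above), and the expansion with the best corpus count wins. The trigram counts and the helper name are illustrative assumptions; the real system stores counts in the three-level trees just described rather than a flat table.

```python
# Sketch of choosing among candidate English expansions of a Bliss
# sequence by trigram counts. Toy counts, for illustration only.
from itertools import product

TRIGRAMS = {("boy", "is", "going"): 7, ("lad", "is", "going"): 2}

def best_expansion(options):
    """options: one list of synonym/inflection candidates per slot
    (gap words such as 'is' appear as their own single-option slot).
    Returns the highest-scoring sequence, or None if none was seen."""
    scored = [(TRIGRAMS.get(tuple(seq), 0), seq) for seq in product(*options)]
    score, seq = max(scored)
    return list(seq) if score > 0 else None
```

For the boy + to go example, `best_expansion([["boy", "lad"], ["is"], ["going", "go"]])` selects the corpus-preferred boy is going.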

Evaluation on source text files of different sizes shows that, for a 1,000,000-word file, the shorter the sentences, the better the translation: sentences of up to 5 words were translated well, while longer sentences produced more mistakes. Better results may be achieved with different source files.

An additional symbol prediction system, CABA2L (Composition Assistant for Bliss Augmentative Alternative Language) [Gatti and Matteucci, 2005], predicts Bliss symbols and is claimed to reduce by 60% the time required to produce a message. The system’s approach is statistical/semantic, using a discrete implementation of an Auto-Regressive Hidden Markov Model (AR-HMM), whose hidden states are the semantic category of the previous Bliss symbol. All Bliss symbols were assigned one of six grammatical categories, and each category is further subcategorized by semantics. Subcategories may share a logical connection, but substantive subcategories, for instance, are specified only if they have a parallel verbal category (for instance, food and feeding; there is no substantive animal category since there is no corresponding verbal category).

CABA2L is integrated into BLISS2003, a communication program centered on the Bliss language. CABA2L receives the last symbol entered from BLISS2003 and calculates the four most

likely symbols to be chosen next. These symbols are presented in a separate pane, and scanning of

symbols starts from this pane.

Tests with users of BLISS2003 showed a time reduction of 60%, a very short time needed for

adjustment to the system, and no significant delays of system calculations.

Text Simplification

Aphasia refers to a loss of communication skills in adults as a result of a stroke, brain tumor, degenerative disease, or head injury. Disabilities may be in comprehending language (Wernicke’s aphasia) or producing language (Broca’s aphasia), as well as in reading and writing. A total loss of reading skills is called alexia; partial reading disorders are called acquired dyslexia. Most aphasic people display difficulties in sentence comprehension. The most acute problems for aphasic patients are:


1. comprehension of sentences with multiple verbs and their functional argument structures.

2. the tendency to read sentences in an SVO (Subject-Verb-Object) order makes passive clauses

problematic – especially when the meaning allows a reverse reading of the clause (i.e., a cake

was eaten by the boy will be understood correctly, but not the man was slapped by the woman)

[Canning et al., 2000].

3. Anaphora resolution.

Text simplification is a language-transformation task in NLP research. The purpose is to

rephrase a given text to make it comprehensible by aphasic readers while preserving the origi-

nal meaning of the text. Complex syntactic structures and non-frequent words are identified, and

text is generated anew with simpler syntactic structure and more frequent words.

There are several text simplification systems such as PSET (Practical Simplification of English

Text) [Carroll et al., 1998], SYSTAR [Canning et al., 2000], and ENDOCRINE [Liben-Nowell,

2000].

The typical architecture of a text simplification system is as follows:

Analyzer Syntactic analyzer and a partial disambiguator.

Simplifier Generator of text in simpler structures.

The Analyzer is composed of three main modules, structured in a pipeline:

1. a lexical tagger;

2. a morphological analyzer - inflectional analysis of words in the text given the part of speech;

3. a parser - builds a syntactic tree and marks words with their grammatical relations.

The resources used by the system include a lexicon of places, organizations, institutes, and the like, for named-entity recognition. Quoted text and headlines are not simplified.

As was mentioned above, there are two tasks of simplification: lexical simplification – i.e., use of

more frequent or unsophisticated words, and syntactic simplification – i.e., transforming complex

syntactic structures into simpler ones. The aim of the system is not to summarize but to simplify

the source text, thus keeping it as cohesive as possible. Cohesion is kept in two ways: (1) ensuring that resolved anaphors are not replaced if the original noun phrase (NP) appears


previously within the sentence, and (2) replacing original sentence-opening anaphors with NPs to

maintain the text style.

In the SYSTAR system, while regenerating the simplified text, cohesion is kept by filling elided

NPs after splitting compound sentences, and preserving the tense, mood, and aspect of passive

sentences [Canning et al., 2000].

The architecture of the simplifier also consists of three modules in a pipe-line:

1. Anaphora resolution (while considering context)

2. Splitting compound sentences and transforming passive to active sentences (single sentence

processing)

3. Replacement of some resolved anaphors.

Simplification is done using a set of rules: a given sentence is unified with the left-hand pattern of a rule, and when a match occurs, it is transformed following the rule’s right-hand pattern.

Anaphors are resolved and replaced when a pronoun from a given set (he, she, him, her, they, his, hers, their) occurs in a sentence. The resolution is based on the CogNIAC system [Baldwin, 1995], which returns a set of possible antecedents for a given pronoun. A sequence of rules is then applied: (1) coreference information (gender, number, type); (2) subject/object pronouns pick subject/object antecedents, respectively; (3) pronouns with unknown grammatical function pick the most recent antecedent; (4) recency (a window of up to two sentences). A pronoun is replaced only if its antecedent was detected in a previous sentence.
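A rule cascade of this kind can be sketched as a sequence of filters over candidate antecedents. The candidate representation, feature names, and the subset of rules shown (agreement and recency) are illustrative assumptions; the full CogNIAC rule set is richer.

```python
# Sketch of a CogNIAC-style rule cascade: each rule filters the
# candidate antecedents, and later rules see only the survivors.
def resolve(pronoun_gender, pronoun_number, candidates):
    """candidates: dicts with 'np', 'gender', 'number', and 'recency'
    (0 = most recent). Returns the chosen antecedent NP, or None."""
    # Rule: agreement in gender and number (coreference information).
    survivors = [c for c in candidates
                 if c["gender"] == pronoun_gender
                 and c["number"] == pronoun_number]
    # Rule: recency, within a window of up to two sentences.
    survivors = [c for c in survivors if c["recency"] <= 2]
    if not survivors:
        return None
    # Fallback: pick the most recent surviving antecedent.
    return min(survivors, key=lambda c: c["recency"])["np"]
```

A caller would then replace the pronoun with the returned NP only if the antecedent came from a previous sentence, as described above.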

In [Liben-Nowell, 2000], the syntactic simplifier is based on a set of rules written by linguists and applied to parse trees.

The lexical simplifier [Carroll et al., 1998] finds the set of synonyms for a given word in WordNet and, taking into account the simplification level required for the user, consults the Oxford Psycholinguistic Database [Quinlan, 1992] for the synonyms’ Kucera-Francis frequency. Finally, the most frequent word is chosen. Since true disambiguation requires

deep semantic analysis of text, in cases of ambiguity, the assumption is that frequent words won’t

have to be replaced and less frequent words tend to be less ambiguous.
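The choose-the-most-frequent-synonym step reduces to a maximization over a frequency table. In the sketch below, a plain dictionary stands in for both the WordNet synonym lookup and the Kucera-Francis counts; the entries and names are illustrative assumptions.

```python
# Sketch of lexical simplification: replace a word with its most
# frequent synonym. Toy frequency and synonym tables stand in for
# WordNet and the Oxford Psycholinguistic Database.
FREQ = {"buy": 420, "purchase": 55, "acquire": 70}
SYNONYMS = {"purchase": ["buy", "acquire"]}

def simplify_word(word):
    """Return the most frequent word among the word and its synonyms."""
    candidates = [word] + SYNONYMS.get(word, [])
    return max(candidates, key=lambda w: FREQ.get(w, 0))
```

Note that an already-frequent word is its own best candidate and is left in place, consistent with the assumption above that frequent words need not be replaced.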


The simplifier was evaluated module by module: anaphora resolution achieved 60% recall and 84% precision; splitting of complex sentences achieved 100% recall and 88% precision; and 70% of passive clauses were correctly converted to active. User evaluations show that the simplified text shortens reading time.

Message Generation

Although most systems described above can be considered message generation devices, I include in this category systems that are not used to enhance or ease the text typing process, but are used as a computerized communication board with symbols, letters, or words, capable of generating a full sentence from a partial input sequence.

This section describes only sentence retrieval systems; Section 8.4 and Chapter 9 discuss in depth those systems that use natural language generation (NLG) techniques for message generation.

A meaningful use for a sentence retrieval system is storytelling. It is very important to give the AAC user not only the ability to express his needs and wants, or to react to what was said to him, but also to initiate a conversation or a social chat, as well as to tell his own stories.

The slow rate of communication and the limited range of symbols on a display can narrow this

ability. The use of story-telling systems encourages users to take a more active part in discussions,

and improves literacy skills, especially when the messages are edited online by the user [Waller

et al., 2000b].

[Waller et al., 2000a] have addressed this issue in their research, developing Talk:About, a commercial system (produced by the Mayer-Johnson company), which enables a user to store personal stories in advance and allows these stories to be quickly edited and retrieved during a conversation. Selected sentences are vocalized with a speech synthesizer. Stories are categorized by topics or people, and a list of possible stories is offered based on frequency and the history of storytelling. The system includes Quick:Chat (a Don Johnston, Inc. product) for quick access to commonly used phrases, and Co:Write, word prediction software.

[Pennington et al., 1998] discusses SchemaTalk [Vanderheyden et al., 1996], software designed to access large amounts of prestored text in an efficient and intuitive manner to speed up communication in predictable conversations, based on psychological research. The program offers a set of stereotypical conversation schemas with slots to be filled, and for each slot a set of predefined fillers. For example, a schema for “buying food at a store” would include the following template:


I want to buy <filler:food-item>. The <filler:food-item> would then be associated to a list

of candidate filler words for this slot.
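The schema mechanism can be sketched as a template with named slots, each carrying its prestored filler list. The data layout and function name below are illustrative assumptions about SchemaTalk's design, not its actual implementation.

```python
# Sketch of a SchemaTalk-style schema: a template with slots, each
# associated with a prestored list of candidate fillers the user
# selects from. Filler words are illustrative.
SCHEMA = {
    "template": "I want to buy {food_item}.",
    "fillers": {"food_item": ["bread", "milk", "apples"]},
}

def realize(schema, choices):
    """choices maps each slot name to the index the user selected."""
    values = {slot: options[choices[slot]]
              for slot, options in schema["fillers"].items()}
    return schema["template"].format(**values)
```

Selecting the second filler for the food-item slot, `realize(SCHEMA, {"food_item": 1})`, yields the full sentence "I want to buy milk."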

[Vanderheyden et al., 1996] report an evaluation of SchemaTalk with two subjects – an AAC user and a speaking partner – in a simulated job interview. The evaluation shows an increase, which grew over the course of the study, in both the number of words per turn (from an average of 10 to 22.6 for the AAC user) and the overall speech rate (from 4.5 to 5.3 words per minute).

2.3 Summary

NLP and AAC relate to the use of language from two very different points of view, but have much in common. Both fields seek ways to produce language by non-natural means, and to make language easier to understand when the ability to understand it is damaged or absent.

AAC systems can be evaluated along three main dimensions: intelligibility (how the system helps users be understood and understand conversations), efficiency (how the system speeds up communication to overcome physical and cognitive impairments), and social value (how the system enables users to participate in social interactions).

AAC systems vary according to the target population, the communication goals, the technology

used, the input language, its layout, the selection methods, the output, and the processing methods

in computerized systems.

NLP systems also vary in their goals, knowledge representation, and processing methods. Pro-

cessing refers either to different levels of understanding and analyzing text, or to the production of

texts for various communication goals.

This work investigates the potential of using NLP techniques to improve AAC systems. NLP techniques are used (i) to enhance communication via prediction or expansion, (ii) to generate full messages, and (iii) to simplify text.

We focus on the field of Natural Language Generation and, specifically, on the construction of

a dynamic display to generate messages using the Bliss symbolic language, for both Hebrew and

English speakers.

The key objectives of this work are to improve intelligibility by providing an explicit representation of the semantic content of the interaction, and to improve efficiency by exploiting semantic


representation to generate well-formed messages taking into account linguistic knowledge.

The next chapter presents in detail the objectives of this research and the approach of using

semantic authoring for message generation in the AAC context.


Chapter 3

Objectives

In this work we investigate ways to exploit natural language generation (NLG) techniques for

designing communication boards or dynamic displays for AAC users.

The scenario we consider is the following: an AAC user selects a sequence of symbols; his

partner then reads out the sequence and utters a natural language sentence. We interpret this

scenario as a typical natural language process: content planning is performed by the AAC user and

content is expressed by the sequence of selected symbols; linguistic realization is performed by the

interlocutor.

We use NLG techniques to produce utterances automatically from the sequence of symbols,

while the content determination is done by the AAC user.

Previous works on NLG-AAC systems ([Vaillant, 1997], [Copestake, 1997], [McCoy et al., 1998],

for example) have adopted a technique of first parsing a telegraphic sequence, then re-generating

a full sentence in natural language. The initial message is of a telegraphic nature because it

lacks the main cues of morphological and syntactic structure that exist in natural language. As a

consequence, reconstruction of the intended meaning is made difficult. Deep semantic and lexical

knowledge sources are required to recover the meaning. In general, such resources are not readily

available and, as a result, systems with only a reduced vocabulary have been demonstrated.

The main question we address in this thesis is whether generation is possible, not through the

process of parsing and regeneration, but through a controlled process of authoring, where each step

in the selection of symbols is controlled by the input specification defined for the linguistic realizer.

In addition, we address the need to implement a wide coverage lexicon, which will not restrict


the system to a small vocabulary. We investigate how a reusable, wide coverage lexicon can be

integrated with existing syntactic realizers and within the AAC usage scenario.

The third aspect we address is multilingual (English/Hebrew) generation. In a continuation

of our previous work ([Dahan-Netzer and Elhadad, 1998a], [Dahan-Netzer and Elhadad, 1998b],

[Dahan-Netzer and Elhadad, 1999]) – the aim is to develop a system that can generate text in both

Hebrew and English from the same sequence of symbols.

We have chosen Bliss symbols as the input language of the communication board. Bliss is an

iconic language which is used world-wide by AAC users. Bliss is composed of a set of approximately 200 atomic meaning-carrying symbols. The rest of the symbols (approximately 2500) are

combinations of these atomic symbols. This compositionality is a very important characteristic of

Bliss as a language, and we designed a lexicon which captures the strong connection between the

meaning and the form of the symbols. We investigate how the explicit, graphic meaning of words

can be used in the process of language generation.

Finally, a practical objective of our work was to provide Bliss tools for Hebrew speakers. When

Bliss was adopted for use in Israel, a decision was taken to write Bliss symbols from right to left, as in the Hebrew writing system, and consequently to invert the display of the symbols (or at least most of them). As a result, most software developed in the world for Bliss (either commercial or experimental) could not be used by Hebrew-speaking users.¹ As part of this research, we have

developed a set of tools (lexicon, composition) to work with Hebrew Bliss.

3.1 Generation from Telegraphic Input

Existing NLG systems for AAC purposes share a common architecture: a telegraphic input sequence

is first parsed, and then a grammatical sentence that represents the message correctly is generated.

The main difficulty in this method is that when parsing a telegraphic sequence of words or

symbols, many of the hints that are used to capture the structure of the text and, accordingly, the

meaning of the utterance, are missing.

Moreover, as an AAC device is used not only for typing text, but also for real-time conversations,

the interpretation of the utterance relies to a large extent on pragmatics – such as the time of a mentioned event, omitted syntactic roles, and references to the immediate environment.

¹See the report by Judy Seligman-Wine: http://www.blissymbolics.org/canada/pg10 30th.htm


In [McCoy et al., 1994], a set of examples from real therapist–AAC user conversations is given: each example is a pair that includes the user's telegraphic utterance and the therapist's full sentence interpretation (as confirmed by the user). The paper analyzes the syntactic and semantic

inferences that were made by the therapist. These data were used when designing the Compansion

system [McCoy et al., 1998].

For example:

S: <girl> <make> <in> <pan> <egg> <breakfast>

T: Girl will make the eggs in the pan for breakfast

In this example, the therapist (marked T) added the future tense, the plural number on egg, and the preposition for.

In other examples, the original word order was changed by the therapist (1), a missing agent (2) or verb (3) was inferred, and conjunctions (3) and even content words (4) were added.

(1)

S: <boy> <table> <dusting> <grandmom> <floor> <sweep>

T: Boy is dusting the table and the grandmom is sweeping the floor.

(2)

S: <wash> <clothes>

T: They are washing clothes.

(3)

S: <toys>

T: They have toys.

(4)

S: <girl> <make> <bed> <boy> <help> <girl> <bed>

T: The girl makes up the bed and the boy helps the girl make up the bed.

(5)

S: <girl> <help> <clothes> <up>

T: Girl clothes up. She’s hanging the clothes up.

The main questions at stake are: how good must a semantic parser be in order to reconstruct the full structure of the sentence, and are the pragmatic gaps in the given telegraphic utterances recoverable?

In order to answer the questions, we must investigate the knowledge and inference tools that


can serve the purpose:

1. Rich lexical information

2. Data representation and unification tools

3. View of context

Since telegraphic text contains mostly content words and lacks the function words and morphological inflections that are used to identify a word's part of speech, the most reasonable method for parsing the utterance is to use dependency relations, and therefore data structures that support such dependencies.

Such methods were used in several works dealing with translating telegraphic text, mostly

military messages. In [Grishman and Sterling, 1989], the parsing grammar was enriched with

rules which allowed omitting prepositions. However, this method considerably increases structural

ambiguities (since all NPs can be interpreted as PPs as well) and, therefore, requires both rich

semantic coding in the lexicon and good scoring functions [Lee et al., 1997].

Rich lexical knowledge is needed to identify the possible dependencies in a given utterance,

i.e., to find the predicate and to apply constraints, such as selectional restrictions to recognize its

arguments.

In the sequence <girl> <make> <in> <pan> <egg> <breakfast>, the animate girl is most likely the agent of the make process, and the egg is its theme. But in structurally similar sentences, recovering the semantics of the process and the possible relations between its arguments can be more complicated:²

(1)

<to-be> <teacher>

There is a teacher.

(2)

<to-be> <teacher> <Dina>

The teacher is Dina.

(3)

<to-be> <teacher> <grumpy>

The teacher is grumpy.

²The star notation means that the sentence following it is ungrammatical or semantically ill-formed.


(4)

<to-be> <teacher> <room>

The teacher is in the room.

There is a teacher in the room.

<to-be> <table> <room>

The table is in the room.

<to-be> <room> <table>

The table is in the room.

*The room is in the table.

<to-be> <book> <table>

The book is on the table.

*The book is in the table.

*The table is in the book.

The verb to-be can be used for several semantic purposes: existential (1), equative (2), attributive (3), and locative (4). Furthermore, locative relations are recovered from the nature of the located object and the location, and very particular attributes of the location must be known in advance (such as surface, container, size) (4). For this purpose, a rich ontology, which supplies relevant features (not necessarily lexical), is required in addition to the lexicon.
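To make this requirement concrete, here is a minimal sketch of how container/surface attributes could drive the choice of locative preposition and argument order for <to-be> <X> <Y>. The concepts and attributes below are invented for the example and are not the ontology used in this work:

```python
# Illustrative sketch only: concepts and their location attributes are invented.
ONTOLOGY = {
    "room": "container",   # things can be *in* a room
    "table": "surface",    # things can be *on* a table
    "book": None,          # not usable as a location here
    "teacher": None,
}
RANK = {"container": 2, "surface": 1, None: 0}

def locate(x, y):
    """Resolve <to-be> <x> <y> into a locative sentence, letting the
    'stronger' location concept act as the location regardless of order."""
    rx, ry = RANK[ONTOLOGY[x]], RANK[ONTOLOGY[y]]
    if rx == ry:
        return None  # no locative reading recoverable from these attributes
    located, location = (y, x) if rx > ry else (x, y)
    prep = "in" if ONTOLOGY[location] == "container" else "on"
    return f"The {located} is {prep} the {location}."

print(locate("table", "room"))   # The table is in the room.
print(locate("room", "table"))   # The table is in the room.
print(locate("book", "table"))   # The book is on the table.
```

Note how the ranking, not the input order, decides which argument is the location – which is exactly why *The room is in the table* is never produced.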

Inferring missing verbs depends heavily on the context of the utterance. While shopping for food, the message <cheese> could be interpreted as “Let's buy cheese”; but during lunchtime at school it may be “I have cheese in my sandwich.”

On the more immediate question of context, a system must also supply defaults and make inferences about the deictic properties of the references in the message. The choice of verb tense and of noun definiteness depends on the history of the conversation as well as on the immediate happenings in the conversation context.

It is crucial to understand these obstacles in order to conclude that fully automatic generation based on parsing alone is not possible (as such a process cannot be fed with all the context variables that can be perceived by the senses).

The question, therefore, is how text can be generated in an optimal way despite these obstacles, while enhancing communication rate, keeping the process easy, and allowing wide expressive possibilities.


Figure 3.1: Dynavox© sentence starters

3.2 Generation as Semantic Authoring

Our approach to the generation process in an AAC context is based on the scenario of semantic

authoring. In this method, each step of input insertion is controlled by a set of constraints and

rules, which are drawn from an ontology. The system offers, at each step, only possible complements

to a small set of concepts. Generation is an incremental process and the full utterance’s input for

the syntactic realizer is revised with each step taken. If the final input is not complete, missing

constituents are given default values – either syntactic (such as pronouns) or using a set of pre-

defined participants. The system also preserves a view of context by an underlying management of

references – both to entities that were mentioned in the conversation and to propositions in general.

Following the paradigm of dynamic displays as introduced by Gayle Porter in the Dynavox© communication board (see Figure 3.1), we also allow sentence starters such as I'm going to or I'd

like to. We view such sentence templates as pre-defined partial semantic structures.

This approach avoids the difficulty of semantic parsing by constructing a semantic structure

explicitly while constructing the input sequence incrementally. It combines three aspects into an

integrated approach for the design of an AAC system:

• Semantic authoring drives a natural language realization system and provides rich semantic input.

• The communication board is updated on the fly as the authoring system requires the user to

select options.

• Ready-made inputs, corresponding to predefined pragmatic contexts, are made available to

the user as semantic templates.
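As a toy illustration of this control loop (all concepts, role restrictions, and defaults below are invented for the example and do not reflect the system's actual ontology), each selection narrows the set of symbols offered next, and unfilled roles fall back to defaults at realization time:

```python
# Toy authoring sketch: ontology, types, and defaults are illustrative only.
ONTOLOGY = {"eat": {"actor": "animate", "theme": "food"},
            "play": {"actor": "animate"}}
TYPES = {"girl": "animate", "boy": "animate", "cheese": "food", "bread": "food"}
DEFAULTS = {"animate": "they", "food": "something"}

def offer(concept, filled):
    """For each unfilled role, list the symbols the board would display."""
    return {role: sorted(s for s, t in TYPES.items() if t == need)
            for role, need in ONTOLOGY[concept].items() if role not in filled}

def realize(concept, filled):
    """Rough linearization; unfilled roles are given default referents."""
    args = {r: filled.get(r, DEFAULTS[t]) for r, t in ONTOLOGY[concept].items()}
    parts = [args.get("actor", ""), concept, args.get("theme", "")]
    return " ".join(p for p in parts if p)

print(offer("eat", {"actor": "girl"}))      # {'theme': ['bread', 'cheese']}
print(realize("eat", {"theme": "cheese"}))  # they eat cheese
```

The point of the sketch is the division of labor: the ontology constrains what can be offered, while the defaults guarantee that a partial selection still realizes to a complete utterance.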

In the following chapters, we investigate each of the problems raised by message generation

with Bliss symbols using semantic authoring. We cover the definition of the input language, the

organization of the communication board, the definition and acquisition of the required ontology,

the use of a large-scale lexicon, and, eventually, the evaluation of the effectiveness of this approach

to (1) speed up communication and (2) extend the range of expressible content that an AAC system

can support.


Chapter 4

Usage Scenario

We present in this chapter sample interaction scenarios between users and the AAC system we have

designed. These usage scenarios illustrate the requirements that the system must meet.

Overall, the system appears to the user as a dynamic communication board, which shows

symbols in Bliss and produces fluent output in either Bliss, English, or Hebrew. As the user selects

new symbols, the communication board is re-organized to ease the selection of further symbols.

The sequence of symbols entered by the user is “translated” into fluent language incrementally –

so that at each stage, output text appears on the screen.

The main display is initialized in three possible manners: (1) most frequently used words, (2)

most frequent shapes/symbols (e.g., person, water, activity), (3) possible scenarios (e.g., school,

home, family). The board is initialized with a selection of candidate Bliss symbols and of references

to entities that are likely to be useful in the interaction (we call these “participants”). Each user

can tailor the initial set of participants for different scenarios. In addition, defaults can be specified for different settings. Defaults provide values for attributes such as the tense or the mood of

the clauses. Defaults need not be repeated for each sentence, but can be specified only once at the

beginning of the session. They can be overridden in specific sentences (but at the cost of extra

typing). Defaults can be changed while editing.

To understand the manner in which our system works, we show how we deal with the following

issues:

1. Choosing an initial board.


2. Setting defaults - tuning participants and speech acts for various environments for the user

(school, home, shopping).

3. Selecting symbols and changing the board accordingly.

4. Generating text: adding function words, dealing with morphology, aggregation, and referring

expressions.

Assume the desired sentence is “A girl is making eggs in the pan for breakfast.”¹

The first word to be chosen can be either an action or a noun. Once the symbol <girl> is

chosen, the display changes and shows only symbols of verbs that have an animate agent. Since

the list may include more symbols than can be shown – they are sorted according to frequencies

(most frequent symbols are shown first).
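This filtering step can be sketched as follows; the concept hierarchy, verb restrictions, and frequency counts below are invented for illustration. A verb is kept only if the chosen symbol, or one of its ancestors in the hierarchy, satisfies the verb's agent restriction, and survivors are sorted by frequency:

```python
# Illustrative data: a tiny concept hierarchy and verb table.
PARENT = {"girl": "female", "female": "person", "person": "animate"}
VERBS = {  # verb -> (required agent type, corpus frequency)
    "make": ("animate", 310),
    "sleep": ("animate", 120),
    "rain": ("weather", 40),
}

def ancestors(concept):
    """The concept itself plus all of its super-concepts."""
    chain = [concept]
    while concept in PARENT:
        concept = PARENT[concept]
        chain.append(concept)
    return chain

def candidate_verbs(subject):
    types = set(ancestors(subject))
    ok = [(freq, v) for v, (agent, freq) in VERBS.items() if agent in types]
    return [v for _, v in sorted(ok, reverse=True)]  # most frequent first

print(candidate_verbs("girl"))  # ['make', 'sleep']
```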

If the food/drink category is chosen, the symbols that are shown are filtered by the constraint

of being labelled as food/drink. The food/drink category is made available as a selection if the

situation home or shopping is activated.

At any point while entering the input, it is possible to return to the main display and choose

symbols from other categories.

The determiner a will be generated by the grammar since an instance of the word girl was not

used previously in the interaction.

Once the verb make is chosen, the display will now show categories of nouns that can act as

the theme to this verb. Again, if the context food/drink was already chosen, then the symbols

displayed will be from this category.

Next, the system offers sentence complements that are realized as circumstantials or adverbials such as where and what for.

Consider the following descriptions of photos, given by a Bliss user:²

Pablo and I are playing. We are watching TV

The following sections describe how the system provides the tools to generate the desired sentences.

¹This example is deliberately different from the example presented in the previous chapter, since the input insertion is done differently, as will be explained.

²Examples taken from http://www.blissymbolics.org/canada/readingroom/english/text/filip contents.htm?Innhold.x=18&Innhold.y=32


4.1 Maintaining a View of Context

References (entities, symbols) that will be used frequently in the discourse are located on the board

and are given default properties. Entities mentioned in previous utterances are shown as well, and

in subsequent utterances these may be matched to complete an utterance if it has missing entities,

using selectional restriction constraints.

The symbols that appear on the board are not just words, but internally, they are connected

to a semantic representation, including attributes and types from an underlying ontology. In the

rest of the section, we present these semantic structures as expressions in the Conceptual Graphs

(CG) formalism (which is described in further chapters).

The user first selects the “participants” to the discourse s/he wants to introduce. This is

done by selecting entities from the default pane and adding them to the “participants” context.

If two participants are selected, the output shows a conjunction structure in the text pane. For

example, the selected participants are described by the following semantic description (using the

CG notation):

[Boy: #I]- (Name) --> [Word: "Felipe"]
         - (Age)  --> [12]

[Boy: #Pablo]- (Brother-of) --> [Boy: #I]
             - (Name)       --> [Word: "Pablo"]
             - (Age)        --> [10]

And the generated text will be Pablo and I.

Note how internally, the system encodes participants as complex conceptual graphs – and not

as words. The graphs displayed above indicate that Pablo is the brother of Felipe, who is now

speaking. Reference planning determines that the forms “Pablo” and “I” are appropriate in the

current discourse context to refer to these two entities.

4.2 Argument Structure Specification

After choosing the participants, the main pane shows symbols referring to verbs (activities) that

require a Boy or one of the super-concepts of Boy (up to the semantic type Animate in the

underlying ontology) as one of their arguments.

The search for candidate activities is driven by the semantic type of the selected participants (in


WordNet Senses: play(5 34)

Thematic Roles: Actor1[+animate], Actor2[+animate]

Frames:

Intransitive (+ with-PP) "Brenda met with Molly."
Actor1 V Prep(with) Actor2

Intransitive (plural subject) "The committee met."
Actor1[+plural] V

Simple Reciprocal Alternation Intransitive "Brenda and Molly met."
Actor1 and Actor2 V

With Preposition Drop Alternation "Anne met Cathy."
Actor1 V Actor2

Verbs in same (sub)class: [consult, meet, play, visit]

Figure 4.1: VerbNet entry for play

our example Boy). Since there may be many such verbs, they are filtered and ordered by frequency

of usage, or by context. If the symbol to play is chosen, the text ’Pablo and I are playing’ will be

generated immediately, using the right inflection for the verb ’to be’ and the default progressive

tense.

The verb ’to play’ belongs to the ’meet’ class of verbs [Levin, 1993]. This class defines possible

syntactic alternations (e.g., the Understood Reciprocal Object Alternation) (see Figure 4.1). This

information is encoded in the lexicon of the system and drives the generation of the sentence in

one of the possible syntactic structures:

I played with Pablo,

Pablo and I played (with each other).

The second option is chosen for the default realization; however, the system can provide the

other alternation by a single push of a button.
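The mechanics of producing an alternation can be sketched as a direct token substitution over frame patterns shaped like those of Figure 4.1. This is only an illustration; the actual system realizes the chosen alternation through the syntactic realizer, not by string substitution:

```python
# Sketch: realizing alternations of the 'meet' class from frame patterns.
FRAMES = {
    "reciprocal": "Actor1 and Actor2 V",
    "with-PP": "Actor1 V Prep(with) Actor2",
}

def realize(frame, actor1, actor2, verb):
    out = []
    for tok in FRAMES[frame].split():
        if tok == "Actor1":
            out.append(actor1)
        elif tok == "Actor2":
            out.append(actor2)
        elif tok == "V":
            out.append(verb)
        elif tok.startswith("Prep("):
            out.append(tok[5:-1])  # Prep(with) -> with
        else:
            out.append(tok)
    return " ".join(out)

print(realize("reciprocal", "Pablo", "I", "played"))  # Pablo and I played
print(realize("with-PP", "I", "Pablo", "played"))     # I played with Pablo
```

Because both frames share the same thematic roles, switching between alternations amounts to selecting a different frame over the same semantic input, which is what the single push of a button does.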

Once all participants required for the verb subcategorization structure are given, the system offers

to generate circumstantial adjuncts (such as location and time).

4.3 Referring Expressions

In a subsequent reference to Pablo, either a pronoun will be used (he) or, in case of possible

ambiguity, the phrase my brother will be generated. We use the algorithm of [Reiter and Dale, 1992]

for choosing among the possible forms of referring expressions (pronoun, full definite expression,

partial definite expression, one-anaphora). This algorithm relies on discourse context information,

which we maintain in the form of the list of CGs referred to in the previous discourse.
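A simplified sketch of that decision, in the spirit of the cited algorithm rather than a faithful implementation: a pronoun is used when no other salient entity shares the referent's gender, and otherwise a distinguishing description such as my brother is generated (the entity records below are illustrative):

```python
def refer(entity, salient):
    """entity and salient entries: dicts with 'id', 'gender', 'description'.
    Returns a pronoun when unambiguous, else a distinguishing description."""
    rivals = [e for e in salient
              if e["id"] != entity["id"] and e["gender"] == entity["gender"]]
    if not rivals:
        return {"m": "he", "f": "she"}[entity["gender"]]
    return entity["description"]

pablo = {"id": "pablo", "gender": "m", "description": "my brother"}
felipe = {"id": "felipe", "gender": "m", "description": "the speaker"}
print(refer(pablo, [pablo]))          # he
print(refer(pablo, [pablo, felipe]))  # my brother
```

The discourse context maintained as CGs plays the role of the `salient` list here: it supplies the competing referents against which ambiguity is checked.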


The system also performs aggregation – and combines utterances into a single one following the

algorithm defined in [Shaw, 1995]. For example, for the above sentences – the system can generate

one clause: Pablo and I are playing and watching television.
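A minimal sketch of the subject-sharing case (the cited algorithm covers many more patterns than this): two clauses with the same subject are merged, and a shared auxiliary is elided in the second verb phrase:

```python
def aggregate(clause1, clause2):
    """Clauses are (subject, verb_phrase) pairs; returns the merged clause,
    or None when the subjects differ and no aggregation applies."""
    subj, vp1 = clause1
    subj2, vp2 = clause2
    if subj != subj2:
        return None
    w1, w2 = vp1.split(), vp2.split()
    if w1[0] == w2[0]:  # shared auxiliary ('are') is elided in the second VP
        w2 = w2[1:]
    return f"{subj} {' '.join(w1)} and {' '.join(w2)}"

print(aggregate(("Pablo and I", "are playing"),
                ("Pablo and I", "are watching television")))
# Pablo and I are playing and watching television
```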

4.4 Lexical Choice and Syntactic Realization

Internally, Bliss symbols are mapped to conceptual graphs. As symbols are entered, the graphs

are joined to form a larger graph depicting a complex entity or a situation. Each time the graph

under construction is modified (by typing a new symbol or modifying one of the symbols in the

concept), the full generation chain is re-executed. This happens with no noticeable delay – the

output sentence is just updated, in English and/or Hebrew.

Concepts are associated to a lexical entry in the lexical chooser. A default lexeme is specified

in the lexicon, together with synonyms. In most cases, the default lexeme is selected by the lexical

chooser, but sometimes collocation constraints override the default. For example, for the selected

symbols to see television, we generate “We are watching television” and not “We are seeing

television.”
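This override can be sketched as a per-lexeme collocation table consulted before the default lexeme is emitted; the table below is an invented fragment, not the system's lexicon:

```python
# Illustrative lexical-chooser fragment: default lexeme plus collocations.
LEXICON = {
    "see": {"default": "see", "collocations": {"television": "watch"}},
}

def choose_verb(concept, theme):
    """Return the collocation-specific lexeme if one matches the theme,
    otherwise the default lexeme for the concept."""
    entry = LEXICON[concept]
    return entry["collocations"].get(theme, entry["default"])

print(choose_verb("see", "television"))  # watch
print(choose_verb("see", "girl"))        # see
```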

To sum up – the user proceeds to produce the two sentences according to the following steps:

1. An initial board is populated according to the “family” scenario. Symbols corresponding to family members become accessible in the “Participants” pane.

2. Select the symbols Pablo and I in the “participants” pane. The expression “Pablo and I” appears in the output pane.

3. Possible actions are presented in the main board. Candidate actions are selected based on selectional restrictions, frequency information, and the current scenario. The clause “Pablo and I are playing” appears in the output pane. The tense is selected by default (as specified by the current scenario).

4. A new sentence is started. The action see is selected. The subject for this action is matched by looking up the context – and the group “Pablo and I” is provided as a default. It is now rendered in text as “we” – since the reference is now recoverable. The sentence “we are seeing” appears in the output pane.

5. Possible complements for see are proposed on the board. We search the ontology for concepts that match the selectional restriction of see (encoded in the VerbNet lexicon as the WordNet synset stimulus). We narrow the search according to the current scenario (family).

6. The participant TV is now selected. The lexical chooser adapts the sentence to “we are watching TV” based on a collocational constraint.

Let us see how to deal with the set of sentences given in the previous chapter.

Figure 4.2: Bliss sequences for to be / yeS verbs

(1)

<to-be> <teacher>

There is a teacher.

(2)

<to-be> <teacher> <Dina>

The teacher is Dina.

(3)

<to-be> <teacher> <tall>

The teacher is tall.

(4)

<to-be> <teacher> <room>

The teacher is in the room.

There is a teacher in the room.


<to-be> <table> <room>

The table is in the room.

<to-be> <room> <table>

The table is in the room.

*The room is in the table.

<to-be> <book> <table>

The book is on the table.

*The book is in the table.

*The table is in the book.

In the case of using Bliss symbols as the input language, the process is somewhat simpler. First,

there are distinct symbols for some of the possible relations that <to-be> represents. There is/are

is a distinct symbol from to be or to have (see Figure 4.2).

In the case of spatial relations such as on or in, the relation itself is available on the display

and the generation process generates the complete sentence accordingly. However, our ontology is

still not specific enough to recognize whether an object has the properties of being a container or

a surface.

4.5 Summary

As we have shown in this chapter, the generation of a message is done incrementally and without the need for parsing. At each step during message production, the system processes an internal representation of the ontological and lexical data and generates a partial sentence, and the display is updated according to the context and to selectional restrictions. The system uses default values based on pre-defined settings and performs referring expression planning.

The next chapter presents the system architecture and the flow of information among the

components of the system.


Chapter 5

System Architecture

A typical system of natural language generation contains two main components with distinct (but

strongly connected) functions: content planning addresses the question of what to say, i.e.,

producing content, and surface realization determines how to say this content (see Section 6.1).

In a system that automatically generates text from a telegraphic symbolic message (an AAC-NLG system), content determination is, in practice, performed by the speaker, and surface realization is performed automatically by the system (see Figure 5.1).

This chapter presents the structure of our system by presenting in turn:

1. The infrastructure upon which the system is built, including a set of lexical databases, real-

ization grammars, and ontologies.

2. The User Interface presented to the AAC user.

3. The internal process of generating a message.

5.1 Infrastructure Development

The construction of this project is based on a set of tools, which have been developed separately,

then integrated into the AAC tool:

Figure 5.1: General architecture and flow of information

• Lexicons:

1. Bliss lexicon

2. Integrated verbs lexicon

• Ontology - concepts and relations database; we developed an ontology acquisition tool using

online lexical resources.

• SURGE/HUGG - English/Hebrew syntactic realizer for natural language generation.

• Semantic Authoring Tool (SAUT) - a platform to design semantic authoring tools, where

the user edits a semantic representation and is presented with realtime feedback in natural

language.

Each of these tools has been used in different contexts than the AAC system we present here,

and has been evaluated separately. The details of the tools will be presented in the following

chapters in turn. In this chapter, we explain the overall flow of data from one component to the

next within the AAC usage scenario.

The Bliss lexicon we have designed and built lists the symbols (which we interpret as the atomic conceptual units of the system) that are available in our system, with their graphical representation (see Section 7.2.2). The lexicon comes with graphical tools to display or create new


symbols, a search engine to retrieve symbols given parts of the symbol or their translation in English or Hebrew, and an engine to compute semantic relations among symbols based on their shared structure.

The integrated verbs lexicon combines structural and semantic information from several sources

(the WordNet Lexical Database [Miller, 1995], English Verb Classes and Alternations (EVCA)

[Levin, 1993] and the COMLEX syntax dictionary [Grishman and Sterling, 1989]). The various

sources have been merged and we have built a single, rich lexicon of English verbs. This lexicon

has been formatted as an extension to the SURGE realization grammar for English [Elhadad, 1992]

(see Section 7.4). We use this rich source of knowledge also as part of the ontology acquisition

method.

We developed an ontology which serves as the basis for the semantic authoring process. The

ontology includes a hierarchy of concepts and the information it encodes interacts with the con-

ceptual graphs processing performed as part of content determination and the lexical chooser. The

ontology is described together with the other lexical knowledge bases in Chapter 7. We developed a

semi-manual ontology acquisition tool which relies on the lexical knowledge databases WordNet

and VerbNet [Kipper et al., 2000]. This module is presented in detail in Section 7.3.

In order to allow output text in both English and Hebrew, we have extended the development

of HUGG [Dahan-Netzer, 1997] – a syntactic realizer for Hebrew. Chapter 6 presents in detail the

process of natural language generation and the extensions of HUGG.

Finally, chapter 8 describes SAUT [Biller, 2005], [Biller et al., 2005] – a system for semantic authoring – and the use of the SAUT technique in a communication board, with additional design decisions for the unique properties of AAC usage, such as initializing the display and setting conversation defaults.

5.2 Flow of Information

In this section, we describe how data flows from the user input, the various knowledge sources used

by the system and through the various processing components.

The initial display is set based on the user's configuration and the desired situation of use. The user begins her conversation by choosing a symbol (either from the main display or by

pressing a hyperlink key which leads to a specific domain display, then choosing a symbol). Once


Figure 5.2: System Architecture

a symbol is chosen from the display, the system looks up the symbol in the ontology and retrieves

its information. A fragment of the ontology is shown in Figure 5.3. Each entry in the concepts

ontology contains the object’s name, a string that is later used for lexical choice (including its

synonyms), its immediate parent in the hierarchy, and a binary value that indicates whether it is an internal node, i.e., one with no symbol to represent it. The hierarchical information was retrieved

from WordNet (see chapter 7 on the acquisition process). If a symbol represents a predicating

concept, it includes information about the outgoing relations (e.g., the serve-6 concept in Figure

5.3). A fragment of the relations ontology is shown in Figure 5.4.

Once the user has selected a symbol, the system creates an object of the concept type that was

chosen. Figure 5.5 shows the display after the choice of the to play symbol.

After the object is created, the system triggers two processes:

1. Change of display.

2. Surface realization of the partial (or complete) utterance.

5.2.1 Changing Displays Dynamically

For each symbol chosen, in accordance with its type (action, object, attribute, preposition), a new

display is structured. If the symbol is an activity, i.e., a concept with relations, then the new


<concept name="Object-1" inpstr="object" synonyms="physical_object" hidden="yes" />
<concept name="Living_thing-1" inpstr="living_thing" synonyms="animate_thing" parent="Object-1" hidden="yes" />
<concept name="Organism-1" inpstr="organism" synonyms="being" parent="Living_thing-1" hidden="yes" />
<concept name="Person-1" inpstr="person" synonyms="individual, someone, somebody, mortal, human, soul" parent="Organism-1" hidden="yes" />
<concept name="Female-2" inpstr="female" synonyms="female_person" parent="Person-1" hidden="yes" />
<concept name="Girl-2" inpstr="girl" synonyms="female_child, little_girl" parent="Female-2" />
<concept name="Entity-1" inpstr="entity" hidden="yes" />
<concept name="Object-1" inpstr="object" synonyms="physical_object" parent="Entity-1" hidden="yes" />
<concept name="Artifact-1" inpstr="artifact" synonyms="artefact" parent="Object-1" hidden="yes" />
<concept name="Instrumentality-3" inpstr="instrumentality" synonyms="instrumentation" parent="Artifact-1" hidden="yes" />
<concept name="Implement-1" inpstr="implement" parent="Instrumentality-3" hidden="yes" />
<concept name="Utensil-1" inpstr="utensil" parent="Implement-1" hidden="yes" />
<concept name="Kitchen_utensil-1" inpstr="kitchen_utensil" parent="Utensil-1" hidden="yes" />
<concept name="Cooking_utensil-1" inpstr="cooking_utensil" synonyms="cookware" parent="Kitchen_utensil-1" hidden="yes" />
<concept name="Pan-1" inpstr="pan" synonyms="cooking_pan" parent="Cooking_utensil-1" />
<concept name="Substance-1" inpstr="substance" synonyms="matter" parent="Entity-1" hidden="yes" />
<concept name="Food-1" inpstr="food" synonyms="nutrient" parent="Substance-1" hidden="yes" />
<concept name="Foodstuff-2" inpstr="foodstuff" synonyms="food_product" parent="Food-1" hidden="yes" />
<concept name="Egg-2" inpstr="egg" synonyms="eggs" parent="Foodstuff-2" />
<concept name="Nutriment-1" inpstr="nutriment" synonyms="nourishment, nutrition, sustenance, aliment, alimentation, victuals" parent="Food-1" hidden="yes" />
<concept name="Meal-1" inpstr="meal" synonyms="repast" parent="Nutriment-1" hidden="yes" />
<concept name="Breakfast-1" inpstr="breakfast" parent="Meal-1" />
<concept name="serve-6" inpstr="serve" parent="Act" showProperties="no">
  <description descriptionNumber="0.2" primary="Transitive" secondary="implicit Theme" xtag="0.2"/>
  <requiredOutRelations>
    <rel name="Agnt" />
    <rel name="Theme" />
    <rel name="Recipient-animate" />
  </requiredOutRelations>
</concept>
<concept name="play-1" inpstr="play" parent="Act" showProperties="no">
  <description descriptionNumber="0.2" primary="Intransitive" secondary="+ with-PP" xtag="0.2"/>
  <requiredOutRelations>
    <rel name="Actor1" />
    <rel name="Actor2" />
  </requiredOutRelations>
</concept>

Figure 5.3: Ontology fragment for the concepts: pan, breakfast, girl, egg, serve


<relation name="Agnt" display="Agent" hidden="yes" srcType="Act" dstType="Living" unique="yes" />
<relation name="Attr" display="Attribute" hidden="no" srcType="Object" dstType="Attribute" unique="no" />
<relation name="Actor1" display="Actor1" hidden="no" srcType="Act" dstType="Living" unique="yes" />
<relation name="Actor2" display="Actor2" hidden="no" srcType="Act" dstType="Living" unique="yes" />
<relation name="Benef" display="Beneficiary" hidden="yes" srcType="Act" dstType="Living" unique="yes" />
<relation name="Cmpl" display="Completion" hidden="no" srcType="TemporalProcess" dstType="Physical" unique="yes" />
<relation name="ActLoc" display="LocationOfAct" hidden="no" srcType="Act" dstType="Location" unique="yes" />
<relation name="Dest" display="Destination" hidden="no" srcType="Act" dstType="Location" unique="yes" />
<relation name="Dur" display="Duration" hidden="no" srcType="Act" dstType="Interval" unique="yes" />
<relation name="Untl" display="Until" hidden="no" srcType="Act" dstType="Situation" unique="yes" />
<relation name="Triger" display="triger" hidden="no" srcType="Act" dstType="Situation" unique="yes" />

Figure 5.4: Ontology fragment of relations

display will show symbols that are compatible with the selectional restrictions of its relations. For

instance, if the symbol to play is chosen, the system looks up the following concept in the ontology:

<concept name="play-1" inpstr="play" parent="Act" showProperties="no">
  <description descriptionNumber="0.2" primary="Intransitive" secondary="+ with-NP" xtag="0.2"/>
  <requiredOutRelations>
    <rel name="Actor1" />
    <rel name="Actor2" />
  </requiredOutRelations>
</concept>

The entry indicates that two relations must be instantiated to build a valid conceptual graph

(Actor1 and Actor2).

The system looks up these relations in the relations ontology and retrieves the information which determines that the relations are connected to concepts of type living:


Figure 5.5: The display after the choice of the to play symbol

<relation name="Actor1" display="Actor1" hidden="no"

srcType="Act" dstType="Living" unique="yes" />

<relation name="Actor2" display="Actor2" hidden="no"

srcType="Act" dstType="Living" unique="yes" />

Therefore, the next display shows only symbols whose concepts are of type living (in Figure 7.14, only girl matches this restriction).

Once a choice is made, the system generates the next display according to the selectional

restrictions of the second relation.
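The look-up-and-filter step just described can be sketched as follows; the in-memory tables mirror the ontology fragments of Figures 5.3 and 5.4, but the function and data-structure names are illustrative assumptions, not the system's actual code.

```python
# Sketch of dynamic display filtering (assumed names, not the system's code).
# CONCEPTS and RELATIONS mirror the ontology fragments in Figures 5.3 and 5.4.

CONCEPTS = {
    "Girl-2": {"type": "Living"},
    "Pan-1":  {"type": "Object"},
    "Egg-2":  {"type": "Object"},
    "play-1": {"type": "Act", "required_out": ["Actor1", "Actor2"]},
}

RELATIONS = {
    "Actor1": {"srcType": "Act", "dstType": "Living", "unique": True},
    "Actor2": {"srcType": "Act", "dstType": "Living", "unique": True},
}

def next_display(chosen, instantiated):
    """Show only the symbols compatible with the first unfilled relation."""
    for rel in CONCEPTS[chosen].get("required_out", []):
        if rel not in instantiated:
            wanted = RELATIONS[rel]["dstType"]
            return [n for n, c in CONCEPTS.items() if c["type"] == wanted]
    return []  # all required relations are already filled

# After "to play" is chosen, only Living concepts remain on the display:
print(next_display("play-1", instantiated=[]))  # ['Girl-2']
```

The same filter is re-applied after each choice, which is how the display for the second relation is obtained.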

Since each symbol/word can be modified with an adjective, a prepositional phrase, or an adverb,

each display contains an additional hyperlink button that leads to property symbols. Choosing

this button adds a relation arc to the conceptual graph, and once a property symbol is chosen, the

display changes following the same process.

5.2.2 Lexical Choice and Syntactic Realization

Each time a symbol is chosen, the system converts the current expression to a conceptual graph (CG) and maps the CG to a FUF Functional Description (FD), which serves as input to the lexical


chooser; lexical choice and syntactic realization are performed, and feedback is provided in English

or Hebrew.

If the symbols chosen so far are I and to play, the conceptual graph built is:

[Play]-(Actor1)->[Person:{I}]

This CG is transformed into an FD of the appropriate form and is unified with the lexical

chooser, using the information on the verb play as embedded in the concept representation:

<description descriptionNumber="0.2" primary="Intransitive"

secondary="+ with-NP" xtag="0.2"/>

The intransitive structure is chosen since there is only one participant given, and the resulting

string generated is I play.

However, once Pablo is chosen as the second actor relation, the CG is complete:

[Play]-

(Actor1)->[Person:{I}]

(Actor2)->[Person:Pablo]

The system consults the lexical chooser again and unifies the given input with the verb’s possible

syntactic structures following its alternation, in this case:

alternation alternation-of-verb-play-simple_reci_intrans

[struct with-np] [struct subj-and-np-v]

The [STRUCT WITH-NP] argument structure means that the input specification that is given to

the syntactic realizer will be of the form:

((struct with-np)
 (cat clause)
 (proc ((type accompaniment)
        (lex "play")))
 (partic ((located ((cat personal-pronoun)
                    (person first)))
          (location ((cat pp)
                     (prep ((lex "with")))
                     (np ((cat proper)
                          (lex "Pablo"))))))))


This syntactic alternation indicates that the clause I play with Pablo can be generated. Alternatively, following the other alternation available for the verb play in its current sense, the structure (STRUCT SUBJ-AND-NP-V) can be chosen as well, with the final output Pablo and I play. In the GUI of the system, a button can switch the generation of the clause from one argument structure to the next, according to the alternations supported by the verb.
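The switch between the two argument structures can be sketched as follows; this is a toy realization under assumed names, standing in for the actual unification with the lexical chooser and SURGE:

```python
# Illustrative sketch of switching between the two argument structures licensed
# by the alternation of "play"; the structure labels follow the thesis, but the
# code itself is an assumption, not the system's implementation.

def realize(cg, struct):
    """Realize a complete play-CG under one of its two argument structures."""
    a1, a2 = cg["Actor1"], cg["Actor2"]
    if struct == "with-np":           # accompaniment PP: "I play with Pablo"
        return f"{a1} play with {a2}"
    if struct == "subj-and-np-v":     # conjoined subject: "Pablo and I play"
        return f"{a2} and {a1} play"
    raise ValueError(f"unsupported argument structure: {struct}")

cg = {"pred": "play", "Actor1": "I", "Actor2": "Pablo"}
print(realize(cg, "with-np"))        # I play with Pablo
print(realize(cg, "subj-and-np-v"))  # Pablo and I play
```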

Next, the system offers the opportunity to add sentence modifiers such as time and location

and other possible circumstances.

Once the utterance is complete, the done button is pressed and the final sentence is generated. The sentence is generated with reference to previous utterances, i.e., the system handles referring expressions and performs aggregation (see Sections 8.1 and 4.3). To this end, the system maintains a data structure encoding the entities which are referred to in each clause. As the discourse proceeds, the discourse context is updated with the conceptual graph representation of each entity that is mentioned. This context representation is used by the reference planning module to determine whether further references are to be realized as pronouns, definite noun phrases, or partial descriptions.
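The pronominalization decision driven by the discourse context can be sketched as follows; the context encoding, field names, and pronoun table are assumptions for illustration, not the system's actual data structures.

```python
# Minimal sketch of the reference-planning decision: an entity already in the
# discourse context is pronominalized on later mentions. Field names and the
# pronoun table are assumptions, not the system's encoding.

PRONOUNS = {("Person", "female"): "she", ("Person", "male"): "he"}

def refer(entity, context):
    """Return a referring expression and update the discourse context."""
    if entity["id"] in context:  # already mentioned: pronominalize
        return PRONOUNS.get((entity["type"], entity.get("gender")), "it")
    context.add(entity["id"])    # first mention: full (indefinite) description
    return f"a {entity['lex']}"

context = set()
girl = {"id": "Girl-2#1", "type": "Person", "gender": "female", "lex": "girl"}
print(refer(girl, context))  # a girl
print(refer(girl, context))  # she
```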

5.3 Summary

Two aspects of the system’s architecture were discussed in this chapter: the underlying components

that compose the system, and the internal process of generating a message.

The main knowledge sources of our system are the lexicons (Bliss, English, and Hebrew), the ontology (derived from lexical resources), and the syntactic realization grammars (SURGE for English and HUGG for Hebrew).

The flow of information in our system is typical of an NLG system, driven by a semantic

input interactively authored by the user. Bliss symbols are entered together with a minimal input

syntax. Internally, a semantic structure corresponding to the intended meaning is constructed in

the Conceptual Graph formalism (CG). The CG is then mapped to a lexicalized structure in English

or Hebrew using a lexical choice module. The structure is then realized into a fluent sentence using

a realization grammar. The discourse context is maintained as new utterances are entered, and

reference planning and aggregation are performed on each utterance during the generation process,

thus improving the fluency of the conversation, and speeding up the selection of entities to which


past discourse has already referred.

In the next chapter, we provide more details on the Natural Language Generation components

of the system, with special attention to the Multilingual Generation aspect and introduce our

contribution to NLG in Hebrew.


Chapter 6

Natural Language Generation and

Syntactic Realization

In face-to-face communication, a speaking partner and a speech-disabled partner use paper-based displays with lists of symbols. The speech-disabled partner selects a sequence of symbols (by

pointing at them), and the speaking partner interprets and pronounces the desired sentences out

loud, according to the symbols that are chosen by the AAC user, while adding function words and

inflecting verbs and nouns following the syntax of the spoken language.

With computerized AAC systems with textual or vocalized output (dedicated devices or software on a personal computer), the speaker aims to reach autonomous communication. He must, however, explicitly choose all symbols, including morphological inflections, function words, and prepositions, in order to get a full grammatical sentence. This may not be possible for those who lack literacy skills, and in any case it requires additional keystrokes and slows the communication rate.

Pre-stored sentence retrieval [Waller et al., 2000a] is a method which aims to avoid this burden.

However, the sentence retrieval method suffers from a restricted pool of utterances and limits the

user’s ability to express himself. It has, therefore, limited applicability (it is most useful when quick

responses or fluent conversation are required [Vanderheyden and Pennington, 1998]).

Natural language generation (NLG) techniques can be used to generate full sentences from

telegraphic-style messages. Merging this capability within AAC presents an attractive route of

investigation.

This chapter surveys the field of NLG and introduces our own contribution. We present the


typical architecture of an NLG system and methods used for multilingual generation. Section 6.2

focuses on the syntactic realizer, which is responsible for producing the linear form of the words.

Section 6.3 presents our implementation of HUGG, a unification-based grammar for the generation

of Hebrew. HUGG is the first available realization grammar for Hebrew. In the next chapter, we

present our further contribution in NLG in the form of a reusable large-scale lexicon for generation.

6.1 Natural Language Generation

Natural language generation (NLG) is a subfield of Natural Language Processing (NLP), studying the process of language production from a non-linguistic representation of data. The NLG process can

be viewed ([Reiter and Dale, 2000]) as goal-driven communication: the production of an utterance

in natural language is the attempt to satisfy a set of communicative goals of the speaker. The

generation process consists of making a series of decisions – starting from planning the content and

ending with lexical and syntactic decisions.

Use of NLG techniques is growing in various fields. For instance, systems which deal with vast volumes of data that require expertise to interpret and restate in natural language are good candidates for NLG. The main uses of NLG are (1) to make data understandable (expert systems, reports), and (2) to produce routine documents that must be updated often.

In some NLP applications, NLG techniques complement other NLP tasks, such as Machine

Translation (MT) ([Dorr et al., 1998] [Temizsoy and Cicekli, 1998]) or automatic summarization

([Barzilay et al., 1999], [Hovy and Lin, 1998]).

In all these applications, the generated text can be in various languages, leading to multilingual generation (MLG). MLG systems generate text in several languages from a single source of information, without using translation.

6.1.1 The Architecture of an NLG System

Traditional NLG systems address the following tasks: content planning (content determination and document structuring) and surface realization (lexicalization, aggregation, referring expression generation, and finally syntactic realization) [Reiter and Dale, 2000].

The content planner includes several sub-modules:


Content determination is the module that decides which information should be communicated in a text. The decision depends on the communication goals of the intended text, on the intended reader of the text (expert reader, children, etc.), the size of the text, and the nature of the underlying information.

Document Structuring is the process of ordering and structuring the chosen information in a text, such as deciding where to put paragraph boundaries and determining the rhetorical structure.

The surface realization module contains the following sub-processes:

Lexicalization is the process where content words are chosen to represent the meanings that should be conveyed. This process may first aggregate data into meaning components (Conceptual Lexicalization) and then find the words in the target language to express them (Expressive Lexicalization).

Aggregation can be performed at various stages of linguistic generation (in addition to the

conceptual lexicalization) – several concepts can be expressed in a single word, two sentences can

be aggregated into one (if, for example, they differ in subject only).

Referring Expression Generation is the process of determining how to produce a reference

to an entity that should be mentioned in the utterance.

The syntactic realizer generates the final linear form of words, handles morphology, and is responsible for the syntactic correctness of the uttered output. This process is elaborated further below.

All of these tasks are generally organized in a pipeline architecture:

Document Planner - content determination, document structuring, and conceptual lexicalization.

Microplanner - responsible for expressive lexicalization, linguistic aggregation, and referring expression generation.

The two stages produce a text specification which is realized by the syntactic realizer. Section 6.2 further elaborates on the syntactic realizer.
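Schematically, the pipeline can be rendered as function composition, where each stage consumes the previous stage's output; the stage bodies below are placeholders (assumptions), only the data flow between stages is the point.

```python
# Schematic sketch of the NLG pipeline; stage bodies are placeholders, not
# the actual modules of any system.

def document_planner(goals):
    # content determination, document structuring, conceptual lexicalization
    return {"doc_plan": goals}

def microplanner(doc_plan):
    # expressive lexicalization, aggregation, referring-expression generation
    return {"text_spec": doc_plan}

def syntactic_realizer(text_spec):
    # morphology, function words, constituent ordering
    return f"realized({text_spec['text_spec']['doc_plan']})"

def generate(goals):
    """Each stage consumes the previous stage's output."""
    return syntactic_realizer(microplanner(document_planner(goals)))

print(generate("greet-partner"))  # realized(greet-partner)
```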


6.1.2 Multilingual Generation (MLG)

Writing documents in different languages in parallel is a common task. This is done daily for

weather reports or software manuals. Automatic MLG refers to the production of documents in

several languages from a single database (and not translating from one source language to other

target languages). The production of technical manuals was found to be an effective application of

MLG ([VanderLinden and Scott, 1995] [Paris and Linden, 1996]).

A central question in MLG relates to the representation of the input to the syntactic realizer

(SR). This representation corresponds to the “interlingua” used in machine translation (MT).

Several recent systems have explored this issue in some depth (WYSIWYM [Scott et al., 1998],

Drafter [Paris and Vander Linden, 1996], kpml [Bateman, 1997], UNITRAN [Dorr, 1994]). The

question can be rephrased as – what is the highest level of information that can be common to

all languages? These questions concern the interface between the knowledge source (an ontology

for instance) and the lexicon (mapping from terms and concepts to lexemes) with reference to the

various languages.

In [Stede, 1996], multilingual generation is viewed as a paraphrasing problem in a single language. This work, however, addresses the lexical level only, and does not define a unified input specification for the SR. Since not all multilingual generation systems depend on a stable ontology, and the knowledge used is often specific to the application (as in [Callaway et al., 1999]), a more 'shallow' approach is often needed – that is, the interlingua must be established at a level closer to the observed syntactic level of the various languages.

In [Dahan-Netzer and Elhadad, 1999] [Dahan-Netzer and Elhadad, 1998b], we have established

an input representation for the generation of Hebrew/English noun phrases, starting from the same

input structure but with different lexemes only. The methodology we pursued there was to express

the syntactic form of the noun phrase in the two languages as a set of minimal distinctions (for

example: does the noun phrase include a compound construct - smixut - in Hebrew? does the

determiner express a vague or an exact quantity, etc.?). We then analyzed the knowledge required

to make these decisions. The role of the input specification to the SR is to provide the minimal

set of answers that can guide the generation process. By comparing the set of decisions required

for Hebrew and English NPs, we were able to produce a compact set of semantic features which

provides answers to the decisions required by both languages in their various syntactic forms.


MLG systems aim to be as domain-independent as possible (since development of such systems

is expensive), but are usually applied to a narrow domain, since the design of the interlingua refers

to domain information. MLG systems share a common architecture consisting of the following

modules:

• A language-independent underlying knowledge representation: knowledge represented as AI

plans [Rosner and Stede, 1994] [Delin et al., 1994], [Paris and Vander Linden, 1996], knowledge

bases (or ontologies) such as OWL, the Penman Upper-model, and other (domain-specific)

concepts and instances [Rosner and Stede, 1994].

• Micro-structure planning (rhetorical structure) - language independent - is usually done by

human writers using the MLG application GUI.

• Sentence planning - different languages can express the same content in various rhetorical structures, and planning must take this into consideration: either by avoiding the tailoring of structure to a specific language [Rosner and Stede, 1994] or by taking advantage of knowledge about different realizations of rhetorical structures in different languages at the underlying representation [Delin et al., 1994].

• Lexical and syntactic realization resources (e.g., English PENMAN/German NIGEL in [Rosner

and Stede, 1994])

6.1.3 AAC as an MLG Application

Our approach in this work is to consider the AAC application of message generation as an MLG

application: symbols are entered and are interpreted as a semantic specification of the intended

meaning. From this point on, we apply MLG techniques to translate the semantic specification

into several languages – a fully specified Bliss sequence, English, and Hebrew. The English and

Hebrew versions of the message are intended for the communication partner (who may not be fluent

in Bliss), and the three versions of the message are intended as a feedback tool for the disabled

partner producing the message, to confirm the validity of his input.

We call this approach semantic authoring – that is, our tool provides an environment where the

user can specify a semantic expression of the intended message interactively and in context.


As an MLG system, our system [Biller, 2005] [Biller et al., 2005] also includes similar modules.

We have chosen to use Conceptual Graphs as an interlingua for encoding document data [Sowa,

1987]. We use existing generation resources for English – SURGE [Elhadad, 1992] for syntactic

realization and the lexical chooser described in [Jing et al., 2000] and the HUGG grammar for

syntactic realization in Hebrew (see [Dahan-Netzer, 1997] and below). For micro-planning, we

have implemented the algorithm for reference planning described in [Reiter and Dale, 1992] and

the aggregation algorithm described in [Shaw, 1995]. The NLG components rely on the C-FUF

implementation of the FUF language [Kharitonov, 1999] [Elhadad, 1991], which is fast enough to be used interactively in real time for every editing modification of the semantic input.

6.2 The Syntactic Realizer

Syntactic realizers are best characterized by the structure of their input. The input for SR varies

from a pure syntactic structure (RealPro [Lavoie and Rambow, 1997]) to, at the other extreme,

semantic inputs founded on a generic ontology (called an upper model [Bateman, 1997]). We use

an intermediate approach, implemented in the fuf/surge [Elhadad, 1993] environment.

Syntactic realizers are also distinguished by their theoretical basis: some are dedicated to a single theory, like RealPro on MTT [Mel'cuk and Pertsov, 1987], or kpml on SFL [Halliday, 1994]. surge draws mostly on SFL theory, but also on descriptive grammars [Quirk et al., 1985] and other theories such as MTT and HPSG [Pollard and Sag, 1987]. Nitrogen [Langkilde and Knight, 1998] is based on an n-gram model learned from corpus analysis, but it still relies on the SFL approach for the characterization of its input language (it is used in [Dorr et al., 1998]).

The syntactic realizer we use as a model, as a basis for extension, and as a development environment (both a theoretical and an implementation framework) is the English syntactic realizer surge [Elhadad and Robin, 1996], implemented in FUF [Elhadad, 1993]. surge is a reusable grammar – it provides a compositional input specification language and defaults, determines function words based on functional descriptions, orders constituents, performs morphological processing, and handles syntactic pronominalization.


6.2.1 Input for the Surface Realization Module

The syntactic realizer (SR) is the component that maps an input set of communication goals into a

natural language utterance. The input contains knowledge, possibly at various levels of abstraction,

of a linguistic phrase.

Making a syntactic realizer available for many applications with different needs requires it to allow a flexible input specification, without a commitment to a single lexicon or ontology. This flexibility allows one to plug the SR into a system that might provide its own knowledge sources. In natural language generation, surface realizers are the front-end modules that convert an abstract semantic representation into a linguistic utterance. Several plug-in syntactic realization components are available for English sentence generation: surge, implemented in FUF [Elhadad and Robin, 1996]; NITROGEN, which uses a statistical model of lexical collocations and syntactic relations [Langkilde and Knight, 1998]; RealPro, which is based on the MTT formalism; and nigel, which evolved from the penman project [Mann, 1983]. nigel evolved into the multilingual text generation workbench kpml [Bateman, 1997].

6.3 HUGG

In a text generation system, a syntactic realizer is the last module of the process and is responsible

for adding function words, controlling the linear order of the words in the utterance, and handling

morphology. The design of a syntactic realizer depends heavily on the type of input that is given

to it by preceding modules in the process. A basic assumption in developing an input specification is to keep syntactic knowledge inside the syntactic realizer, i.e., the input should be as semantic as possible so that preceding modules can be as language-independent as possible and relatively free of re-encoding linguistic knowledge. The motivation here is both to enable multilingual generation from the start and to allow non-linguist experts to design generation systems.

HUGG (Hebrew Unification Grammar for Generation) is a syntactic realizer for the generation

of Hebrew. We have developed HUGG as a Hebrew version of SURGE [Elhadad and Robin, 1996].

Our objective in designing the HUGG input specification was to keep the input given to the Hebrew SR as similar as possible to that of the parallel English SR, SURGE, with the exception of language-specific lexemes. We have found that, although meaningful differences exist between Hebrew and English, it is possible to use the input as it was defined for SURGE with minor changes, usually by raising the


level of abstraction in the specification. We have reviewed some of these phenomena in the Noun

Phrase syntax in a previous work [Dahan-Netzer, 1997].

In this work, we have expanded the HUGG grammar to the clause level and found that by using

the transitivity system, as defined for verbs in SURGE, we can take care of various phenomena of

the Hebrew clause, especially the use of the copula.

6.3.1 FUF/SURGE

FUF

In this formalism, all linguistic knowledge is represented as a set of features called a functional description (FD). Each feature is a pair [a:v], composed of a unique attribute a and a value v, which is either an atomic symbol, an FD, or a path (a pointer to another feature in the overall FD, indicating that the two features must share the same value at all times). If an attribute is not present in an FD, this is equivalent to it being present with the value nil.

The process of unification is defined on two FDs and, unlike structural unification, is not based

on size or order of the terms being unified. Basically, unification X∪Y is the smallest FD containing

both X and Y, while preserving the FD requirement that the same attribute cannot take contradictory values.
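The unification operation can be illustrated over FDs encoded as nested dictionaries; this simplified sketch omits paths and the special FUF values described below, and is an assumption for exposition, not the FUF implementation.

```python
# Simplified FD unification over nested dicts (paths, ANY/NONE/GIVEN omitted):
# the result is the smallest FD containing both inputs, and unification fails
# on contradictory atomic values. FAIL is a sentinel, not a FUF notion.

FAIL = object()

def unify(x, y):
    """Smallest FD containing both x and y; FAIL on contradictory atoms."""
    if x == y:
        return x
    if isinstance(x, dict) and isinstance(y, dict):
        out = dict(x)
        for attr, val in y.items():
            merged = unify(out[attr], val) if attr in out else val
            if merged is FAIL:
                return FAIL
            out[attr] = merged
        return out
    return FAIL  # two different atomic values cannot unify

fd1 = {"cat": "clause", "proc": {"lex": "play"}}
fd2 = {"proc": {"type": "accompaniment"}}
print(unify(fd1, fd2))
# {'cat': 'clause', 'proc': {'lex': 'play', 'type': 'accompaniment'}}

print(unify({"cat": "clause"}, {"cat": "np"}) is FAIL)  # True
```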

A grammar in FUF is a meta-FD – a set of FDs with additional features. These features control

unification, and further processing of the FD: CSET lists the immediate linguistic constituents of

the FD, for further unification. Pattern constrains the linear order of the constituents, and ALT

(ALTernation) allows non-deterministic decisions. Unification of an (ALT list-of-FDs) with an FD is first attempted with the first FD of the list; if it does not succeed, unification proceeds to the next one, and if all fail, the unification fails as well. Some special values are possible, such as ANY, which forces a value to be instantiated by the end of the unification process; NONE, which indicates that an attribute cannot have any value other than NIL; and GIVEN, which requires an attribute to have a value different from NIL at the time of unification.

Further elaborations of the FUG formalism in FUF include types, the usage of FSET, and the special feature CAT (see [Elhadad, 1991]). In practice, we use CFUF [Kharitonov, 1999], a time-efficient implementation of FUF.


SURGE

SURGE is a comprehensive, domain-independent portable syntactic realizer for the generation of

English, written in FUF. SURGE draws its linguistic sources mostly from the systemic-functional

theory [Halliday, 1994], but incorporates other linguistic theories such as HPSG [Pollard and Sag,

1987] and MTT [Mel’cuk and Pertsov, 1987] as well.

The input specification of SURGE was designed in consideration of the overall text generation

process and especially with reference to the preceding process of lexical choice. The input FD contains

linguistic constituents with functional attributes which mark their function in the overall

context, such as process and participants. Each constituent includes a special attribute cat

which indicates the syntactic category of its head.

6.3.2 SURGE input of a clause

In SURGE, linguistic constituents are labelled by their thematic role. Nuclear roles refer to the

process described by the clause and to its participants, and therefore depend on the type of process

described. Satellite roles are the adverbials that describe where/when/why/how the process

happened, and do not depend on the process type.

The Clause sub-grammar is composed of several orthogonal systems:

• Transitivity system - the ideational system - maps thematic roles into core syntactic roles.

• Voice system - the textual system - handles possible syntactic alternations that change

the order and function of core syntactic roles.

• Mood system - the interpersonal system - handles variations that are affected by the commu-

nication goal of the utterance - i.e., interrogative, declarative, imperative clauses - or by its

syntactic function (matrix or relative clause).

• Adverbial system - responsible for the ordering of satellite constituents of the clause.

The transitivity system is based on a basic dichotomy of verbs into simple and composite

processes. Simple processes can be events (such as material, mental, and verbal) or relations

(ascriptive, possessive, temporal, spatial).


cat           clause
process       type      [type]
              lex       [verb]
              tense     [tense]
              polarity  [polarity]
participants  agent     [agent]
              affected  [affected]

Composite processes involve both an event and a relation, following Fawcett’s unified analysis

of three-role thematic structures as a causal superposition of two two-role structures sharing a

common role [Fawcett, 1987]:

cat           clause
process       type      [type]
              lex       [verb]
              tense     [tense]
              polarity  [polarity]
participants  agent     [agent]
              affected  [1] [affected]
              possessor [1]
              possessed [possessed]

An additional approach to designing inputs for SURGE is to use a lexical process. This approach

allows the input to define a process not in terms of transitivity, but through subcategorization.

In this approach, which is based on dependency grammars such as Meaning-Text Theory (MTT) [Mel’cuk and

Pertsov, 1987] and follows HPSG, a lexical head subcategorizes its constituents (SUBCAT for

short) and determines their order.


The input in this case has the following structure:

cat        clause
process    type      lexical
           lex       [verb]
           tense     [tense]
           polarity  [polarity]
lex-roles  role1     [role1]
           role2     [role2]

We have further elaborated the possible inputs for SURGE to allow syntactic inputs as well:

cat         clause
verb        lex  [verb]
synt-roles  subject  cat  [syn-cat]
                     lex  [lex]
            object   cat  [syn-cat]
                     lex  [lex]
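For concreteness, the three input styles can be contrasted side by side as nested structures. They are sketched here as Python dictionaries whose attribute names follow the FDs above; the encoding and the lexical items (eat, child, apple, the category np) are illustrative, not SURGE's actual Lisp syntax.

```python
# Transitivity-based input: participants labelled by thematic role.
transitivity_input = {
    "cat": "clause",
    "process": {"type": "material", "lex": "eat", "tense": "present"},
    "participants": {"agent": {"lex": "child"}, "affected": {"lex": "apple"}},
}

# Lexical-process input: the verb subcategorizes generic lex-roles.
lexical_input = {
    "cat": "clause",
    "process": {"type": "lexical", "lex": "eat", "tense": "present"},
    "lex-roles": {"role1": {"lex": "child"}, "role2": {"lex": "apple"}},
}

# Syntactic input: constituents named directly by grammatical function.
syntactic_input = {
    "cat": "clause",
    "verb": {"lex": "eat"},
    "synt-roles": {
        "subject": {"cat": "np", "lex": "child"},
        "object": {"cat": "np", "lex": "apple"},
    },
}
```

All three would realize the same sentence; they differ only in the level of abstraction at which the constituents are named.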

6.3.3 Main Issues in Hebrew Generation

To date, HUGG is the only syntactic realizer that has been developed for Hebrew generation. One

of our objectives is to investigate constraints on the design of the input specification language to a

syntactic realization component through a contrastive analysis of the requirements of English and

Hebrew. By design, we are attempting to keep the input to HUGG as similar as possible to the

one we defined in the SURGE syntactic realization for English [Elhadad, 1992].

We referred to various problems specifically for Noun Phrase generation in previous papers

[Dahan-Netzer, 1997], [Dahan-Netzer and Elhadad, 1999], [Dahan-Netzer and Elhadad, 1998a]. We

have shown that since a variety of lexical, semantic, syntactic, and pragmatic constraints affect

the generation of construct-state (smixut), semantic information must be included in the input to

enable paraphrasing when such a construction is not possible. We have also refined the classification

of quantifiers and determiners into a new set of determiners, partitive, and quantifier words.

In this work, we have applied the same methodology at the level of the clause.


6.3.4 Hebrew Clause

Hebrew’s unmarked order of words in a clause is SVO (Subject-Verb-Object) but word order is

relatively free. In addition, subjects are not always explicitly present and several clause structures

do not have any verb.

Hebrew verbs are inflected by gender, number, and person, and show agreement with the subject

as follows:

Past: full agreement, except for third person plural: hen/hem Axlu (they(fem/masc) ate);

Present: agreement in gender and number;

Future: full agreement, except for second and third person plural.

Definite objects are marked with the case marker et.

6.3.5 Subjectless Clauses

There are several cases in which a subject is not explicitly pronounced in Hebrew:

Subject Pro-drop is the case where subjects are dropped when they are recoverable. Since

Hebrew verbs are inflected for gender, number, and person, and show agreement with the

subject, the latter can be omitted in first and second persons in past and future tense.

Imperative clauses - as in English, no explicit subject is pronounced.

General subject - with a plural third-person subject: ”bonym batym hadaSym ba-sxunah” ((someone,

dropped) is building new houses in the neighborhood).

Raising verbs with a sentential complement - yaZA ba-sof S-lO hiZlaHnu le-hagyaw (it came to

be at the end that we didn’t manage to arrive).

Intransitive Clauses

Intransitive clauses have no objects and tend to have an SV order.

Unergative verbs are verbs with an agentive subject and express volitional acts: ha-yeladym

ZaHaku (the children laughed). The SV order is not mandatory, and VS is possible as well: bA

ha-davar ve-ZilZel ba-pawamon (came the-postman and-rang the-bell).


Unaccusative verbs have a theme (non-volitional) subject and usually express change-of-state

actions. Hebrew allows both SV and VS orders in this case.

6.3.6 Existential, equative, possessive, and attributive clauses

There is a variety of uses of the Hebrew copula, which are integrated into all kinds of relations in

the transitivity system.

ha-melex hayah/hu’/yihiyeh semel - ascriptive relation, equative mode. (The-king was/is/will-

be a-symbol)

ha-melex hu’/[NULL] werom - ascriptive type, attributive mode. (The-king is/[dropped] naked).

hayah/yeS/Eyn melex - existential type (There-was/There-is/There-is-no king)

hayah/YeS/Eyn le-yarden melex - possessive type of relation. (there-was/there-is/there-is-no

to-Jordan a-king)

Existential clauses are characterized by the use of the word YeS (”there is”) in the present

tense. YeS is considered to be an adverb (for instance, in the Rav-Milim online dictionary), but is

mostly treated as a semi-verb (or verboid). However, since modern Hebrew allows sentences

such as YeS ly Et ha-sfarym (there-is to-me [case-marker] the-books) - i.e., YeS with the objective

case marker Et - it is considered to be a verb as well [Henkin, 1994].

In the past and future tenses, the inflected Hebrew copula to be is used, i.e.,

hayah/yihiyeh. YeS + negation in the present tense is realized by the word Eyn. In the past and

future tenses, negation is realized with the word lO and the tensed copula.

Possessives are also expressed with the word YeS, but the possessor is realized as a prepositional-phrase

object, with the preposition l. Agreement is determined by the possessed:

haytah ly Hatulah (there-was to-me a-kitten)

hayu lanu Hatulym (there-were to-us cats)

hayu ly Hatulym (there-were to-me cats)

Order within possessive clauses is flexible and is affected by various factors such as definiteness

of the possessed NP and the casualness of the speech act.

le-savtA yeS kapryzot (to-grand-ma there-are caprices)

YeS le-savtA savlanut biSvily (there-is to-grand-ma patience for-me)

Yes le-ImA carTysym la-sereT (there-are to-mother tickets for-the-movie)


le-ImA hayu Et ha-carTysym la-sereT (to-mother there-were [case marker] the-tickets for-the-

movie)

Eyn la-Hatul te’avon (there-is-no to-the-cat appetite)

The word yeS is also used for expressing modality, with an infinitive verb as a complement.

In ascriptive clauses, a copula is used as well to mark the relation; however, the present tense

is realized with a pronoun, which agrees with the subject (the carrier in the attributive mode or

the identified in the equative mode).

The word order in attributive clauses depends on the definiteness of the carrier – the subject of

the clause:

ha-sereT hayah mewanyen (the-movie was interesting)

hayah sereT mewanyen (was a-movie interesting – it was an interesting movie)

*hayah ha-sereT mewanyen (* was the-movie interesting)

*sereT hayah mewanyen (* a-movie was interesting)

In the present tense, if the carrier is definite then the unmarked structure is verbless:

ha-sereT mewanyen (the-movie interesting – the movie is interesting)

(since the noun and adjective agree, the sequence would otherwise be understood as a noun phrase). In the marked clause,

a copula is used (and in the past/future tenses as well):

ha-melex hu’ werom (the-king he-is naked)

In equative clauses, the unmarked structure is SVO with a copula as the verb; however, there

are some cases in the present tense where the copula can be omitted.

ha-morah Selanu hy’ rynah (the-teacher of-us she-is Rina – Our teacher is Rina)

Samawatem? rynah ha-morah Selanu (Heard-you? Rina the-teacher of-us) – (Did you hear?

Rina is our teacher)

In summary, Hebrew relations are realized in most cases with a copula, but differ in the realization

of the relation type in the present tense and in word order. The distinctions that are defined in the inputs for

SURGE (which correspond to the SFL analysis of simple relational processes) enable the correct

realizations.


Figure 6.1: A fragment of Hspell database for the word celev (dog)

6.3.7 Morphology

Hebrew morphology is quite complex. Several broad-coverage and robust systems exist that handle

morphology: RAV-MILIM (http://www.ravmilim.co.il), a commercial system developed by Yaacov Choueka, and AVGAD, which

was developed at IBM [Bentur et al., 1992]; recently, Hspell (http://www.ivrix.org.il/projects/spell-checker/) has appeared as a very useful source

for morphological analysis, and we use this resource in our generator. Hspell was developed neither

for analysis nor for generation of Hebrew morphology, but as a spell checker in the IVRIX project – a free

open-source project that was initiated to establish Hebrew support in the Linux environment.

The developers of Hspell hand-coded a list of approximately 22,000 lexemes to which they semi-

automatically applied inflectional rules, resulting in a list of 444,400 words. In addition, they

collected a set of rules for word segmentation in order to identify prepositions, definite markers,

and other prefixed particles.

Hspell linguistic processing includes general inflection rules and a set of exceptions. During the

compilation of Hspell, all possible inflections are generated and stored in files (see Figure 6.1). We

have indexed the inflected words and their attributes (root, gender, number, possessive, construct-state

for nouns, and additional tense information for verbs) and, in the linearization process, retrieve

them on request. In effect, this is a table-driven morphology generation mechanism.
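The mechanism can be sketched as follows; the index entries below are invented for illustration (the real table is compiled from the Hspell word lists, and the transliterations are not actual Hspell output).

```python
# Toy index over pre-generated inflected forms, in the spirit of the tables
# compiled from Hspell: key = (lexeme, attribute tuple), value = surface form.
# Lexemes, attribute names, and transliterations here are illustrative only.
INFLECTIONS = {
    ("celev", ("noun", "masc", "sing", "absolute")): "celev",    # dog
    ("celev", ("noun", "masc", "plur", "absolute")): "clavym",   # dogs
    ("celev", ("noun", "masc", "plur", "construct")): "calbey",  # dogs-of
}

def inflect(lexeme, *attributes):
    """Pure table lookup: no rules are applied at generation time; the
    linearization process simply retrieves a pre-generated form on request."""
    return INFLECTIONS[(lexeme, tuple(attributes))]
```

For example, inflect("celev", "noun", "masc", "plur", "absolute") retrieves the pre-generated plural form without applying any inflectional rule at run time.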

6.4 Summary

Natural Language Generation, in general, is the process of mapping communication goals to a sur-

face realization that satisfies the goals [Reiter and Dale, 2000]. Practically, it is used for generating

text in human/computer interfaces and for representing data in a readable manner.

An NLG system is traditionally composed of a content planner and a surface realizer. This

work is mostly concerned with the surface realizer - the lexical chooser and the syntactic realizer.

We reviewed the field in general and especially the FUF/SURGE method. Since our system

is planned for generating both Hebrew and English outputs, we expanded the Hebrew syntactic

realizer HUGG to deal with sentences and took special care with clauses that represent relational

processes. We found that the existing input specification formalism used in SURGE is appropriate

to cover the wide variation of surface structures observed in modern Hebrew relational clauses, and

obtained an abstraction level for syntactic realization which can be mapped to both Hebrew and

English, for Noun Phrases and for most clauses.

In the next chapter, we focus on the lexical knowledge bases that were built for the system –

these lexicons function in the basic process of the message generation (the ontology and the Bliss

lexicon) and in the lexical choice phase.


Chapter 7

Lexical resources

The process of generating text from a telegraphic message (textual or symbolized) relies heavily

on lexical information, whether the telegraphic message is being parsed and re-generated, as

in the case of the Compansion system [McCoy et al., 1998], or in the tactical approach we have taken

in the semantic authoring method.

The lexical knowledge encoded for this system is, in fact, its heart. We compiled

three lexicons for this work (the ontology is included here since its content is derived from lexical sources, WordNet and VerbNet):

1. A Blissymbols lexicon

2. An ontology and a lexical chooser

3. A large-scale, reusable verb lexicon for text generation (joint work with Hongyan Jing, Michael

Elhadad, and Kathleen McKeown) [Jing et al., 2000].

Each lexicon in this list is intended for a different layer in the system architecture (see Figure

5.2), but all three are interrelated in origin and representation, and complement each other

in the overall knowledge acquired, resulting in a system with rich lexical information.

The Bliss lexicon was designed for the symbols to be presented in our AAC display. It was

designed in a way that considers the unique characteristics of Bliss, and although it explicitly

contains only the concepts/words, categories they belong to, their part of speech, and their graphical

presentation, the connections on which they are based provide semantic information as well. The


Bliss lexicon can also be used as a stand-alone Web application and is used as a basis for an editor

of the type of ”Writing with Symbols”© (http://www.widgit.com/products/wws2000/). Section 7.2.1 presents the Blissymbol language and

7.2.2 describes the Bliss lexicon.

The ontology is the backbone of the semantic authoring process; it provides a hierarchical struc-

ture for the concepts, relations, and properties that are used in the process (in our case, the words

represented by the Bliss symbols: verbs, nouns, adjectives, and adverbs), and contains information

on the concepts/words, such as synonyms, parent in the hierarchy, and required relations if such

exist. The knowledge acquisition for the ontology originated from WordNet and VerbNet.

Section 7.3 describes this process.

The lexical chooser (Section 7.4) is partially hand coded (for nouns and adjectives) and partially

automatically built. It includes specific knowledge on the syntactic characteristics of the words such

as gender, countability, and subcategorization. Much effort was put into the verbs lexicon. A large-

scale and reusable lexicon of verbs, which draws on information from various lexical resources such

as WordNet, Levin’s verb classes, and ComLex, enables an input specification which contains a

verb and a list of arguments. The possible alternations for each verb are given together with shallow

information on selectional restrictions. Each alternation is mapped into a set of corresponding

sentence structures (called structs) and, accordingly, to SURGE inputs.

This chapter first gives general background on the use of lexicons in NLG, then presents the

lexicons that were compiled for our system (Blissymbols lexicons in Section 7.2.2, ontology in

Section 7.3 and verbs lexicon in Section 7.4). In each section we describe the lexical sources which

were used for their construction.

7.1 Lexicons in NLG

Contemporary grammatical theories are becoming more lexically driven, reflecting the view that lexical

knowledge and lexical semantics play a central role in the overall structure of an utterance, acting as the

interface between meaning and form (concepts and syntax) [Faber and Uson, 1999]. In general, the

link between the conceptual structure and the syntactic function is called the linking theory.

There are three main approaches to the lexical function [Faber and Uson, 1999]:

1. Role-centered approach (Government Binding Theory, for instance): in this approach, a set of

thematic roles are considered to capture the generalizations concerning the relation between

syntax and semantics (which thematic role can be used as which syntactic function).

2. Predicate-centered approach (Levin’s verb classification for example): predicates are com-

posed of a set of primitive elements. Thematic roles depend on the primitives and on the

event structure of the word, and words are arranged in classes accordingly. The clause struc-

ture is determined by the composition of the primitives and the eventuality.

3. Constructionist approach: states of affairs are classified into states, events, actions – and

these determine the thematic roles and overall structure of the clause.

Whichever approach is taken, in an NLG system, the function of the lexicon is to mediate

between meaning and form. The lexicon must be adjusted both to the tactics of the syntactic

realizer (i.e., the level of abstraction of its input specification) and to the meaning representation.

In practice, the vocabulary and its coding depend strongly on the domain of the system realized,

since most systems are domain-specific; broad-coverage lexicons which could be adjusted to both

sides (meaning and form) are not available. Lexicons are usually hand-coded with the specific

senses of words in the system’s context.

Since our system is not domain-dependent and has a relatively rich vocabulary (approximately

2,200 words, the vocabulary found in the Blissymbols lexicons we use), we need to identify

available lexical sources which can serve semantic authoring, which relies heavily on

the thematic structure (the predicate-centered approach), yet can be easily transformed to the

constructionist approach of the input specification of SURGE (the transitivity system). We have

achieved this objective by using existing lexical knowledge bases for constructing a robust, reusable

lexicon for generation and adjusting it to the input specification of SURGE/HUGG.

In the next sections, we survey the existing lexical knowledge sources which have provided

us with the required knowledge: Levin’s verb classes [Levin, 1993], WordNet, VerbNet, and

FrameNet.

7.1.1 Levin’s verb classes

Levin [Levin, 1993], in her influential work, has sorted English verbs into classes which share com-

mon syntactic structure. Levin showed that there is a very strong connection between the meaning


of the verb and the possible alternations it allows in a clause. For example, consider the Sub-

stance/Source Alternation:

1a. Heat radiates from the sun.

1b. The sun radiates heat.

This alternation is possible only for verbs of substance emission which take two arguments: a

source and a substance emitted by it. The subject of the intransitive form (1a above) has the same

semantic relation to the verb as the object of the transitive form (1b above). These two arguments

must be expressed in both transitive and intransitive uses. The source is expressed as the subject

in the clause with the transitive occurrence of the verb, and as the object of the preposition from

in its intransitive use.

Levin defined 80 such semantic classes, listing all the verbs of each class, their semantic

constraints, exceptions, and other idiosyncrasies.

From an NLG perspective, this knowledge is very useful as part of the transitivity system of

the syntactic realizer in mapping the various semantic classes to possible syntactic structures.
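As a toy illustration of how this class knowledge serves generation, a single substance-emission entry can map the same two arguments onto either variant of the alternation. The one-verb class list and the naive "+s" agreement below are invented for the sketch; they are not Levin's actual encoding.

```python
# Toy class entry in the spirit of Levin's substance-emission class;
# membership and morphology are heavily simplified for illustration.
SUBSTANCE_EMISSION = {"radiate"}

def realize(verb, source, substance, transitive=True):
    """Realize either variant of the Substance/Source alternation."""
    assert verb in SUBSTANCE_EMISSION, "alternation licensed only for this class"
    if transitive:
        # source as subject, substance as object: (1b) above
        return f"{source} {verb}s {substance}."
    # substance as subject, source as object of 'from': (1a) above
    return f"{substance} {verb}s from {source}."
```

Calling realize("radiate", "the sun", "heat") yields sentence (1b); with transitive=False it yields (1a).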

7.1.2 Online Resources

From a computational point of view, there are five main categories of knowledge that should be

included in a Lexical Knowledge Base (LKB) [Faber and Uson, 1999]:

1. Phonological information (e.g., sound system, intonation, stress).

2. Morphological information (part of speech, irregularities).

3. Syntactic information (subcategorization).

4. Semantic information (selectional restrictions, relationships with other words).

5. Pragmatic information (casualness, communicative intentions, register, and genre).

Choosing a representation of the lexical knowledge is a crucial step in the construction of an

NLG system in general and in our work in particular.

The choice must consider:


1. availability,

2. adjustability,

3. reusability,

4. multilinguality.

To date, there are several online reusable lexicons (mostly of verbs) that are used for NLP

research. Not all lexicons contain all of the information specified above, and many lexicons are

structured in an application-driven manner, i.e., containing only words and information necessary

for the particular application. The most widely used lexicons are WordNet, which includes nouns,

adverbs, adjectives, and verbs, and the verb lexicons FrameNet, VerbNet, and ComLex.

Verb lexicons are particularly important since they link the semantic content of concepts that have to be

realized with the syntactic structure that determines subcategorization and, therefore, the sentence

structure.

WordNet

WordNet ([Miller et al., 1990], [Miller, 1995]; http://wordnet.princeton.edu/) is an online lexical database which includes (in

version 2.1) 11,488 verbs, 117,097 nouns, 22,141 adjectives, and 4,601 adverbs in English. Each

entry in WordNet includes a list of synonyms (a synset), a gloss, and some examples of usage.

Word entries are determined according to orthography and, therefore, different senses (as in the case

of bank or table) are enumerated and may belong to different synsets. The strength of WordNet

lies in the fact that words and synsets are interconnected by additional lexical-semantic relations

such as hyponyms (the subclass relation) and hypernyms (the superclass relation) between synsets,

and antonyms (opposites) between words. For verbs, there are two main additional relations: the

troponym relation (from events to their subtypes) and entailment (from an event to the events it entails). In

this approach, WordNet forms a semantic net of synsets, and each synset actually represents a

semantic concept. The hyponymy relation (such as the relationship between pear and fruit) is

transitive and forms a hierarchy with a supertype (entity). This overall structure is very useful

for knowledge acquisition.
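This transitivity can be exploited very directly. The fragment below computes the ancestors of a word by walking hypernym links over a tiny hand-coded net; it illustrates the structure only (real WordNet synsets may have several hypernyms, and a real system would query the WordNet database itself).

```python
# A tiny single-parent fragment of the noun hierarchy (invented for the sketch).
HYPERNYM = {
    "pear": "fruit",
    "fruit": "food",
    "food": "entity",   # 'entity' is the supertype of the hierarchy
}

def hypernym_closure(word):
    """All ancestors of a word under the transitive hyponymy relation."""
    ancestors = []
    while word in HYPERNYM:
        word = HYPERNYM[word]
        ancestors.append(word)
    return ancestors
```

So hypernym_closure("pear") climbs the chain up to the supertype, which is exactly the kind of walk used when acquiring an ontology from WordNet.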


5 senses of girl

Sense 1: girl, miss, missy, young lady, young woman, fille -- (a young woman; "a young lady of 18")
  => woman, adult female -- (an adult female person (as opposed to a man); "the woman kept house while the man hunted")

Sense 2: female child, girl, little girl -- (a youthful female person; "the baby was a girl"; "the girls were just learning to ride a tricycle")
  => female, female person -- (a person who belongs to the sex that can have babies)

Sense 3: daughter, girl -- (a female human offspring; "her daughter cared for her in her old age")
  => female offspring -- (a child who is female)

Sense 4: girlfriend, girl, lady friend -- (a girl or young woman with whom a man is romantically involved; "his girlfriend kicked him out")
  => woman, adult female -- (an adult female person (as opposed to a man); "the woman kept house while the man hunted")
  => lover -- (a person who loves or is loved)

Sense 5: girl -- (a friendly informal reference to a grown woman; "Mrs. Smith was just one of the girls")
  => woman, adult female -- (an adult female person (as opposed to a man); "the woman kept house while the man hunted")

Figure 7.1: WordNet entry for the word girl

Similar lexicons are being developed for European languages (MultiWordNet [Pianta et al., 2002])

and for Hebrew [Ordan and Wintner, 2005].

VerbNet

VerbNet [Kipper et al., 2000] (http://www.cis.upenn.edu/group/verbnet/) is a verb lexicon compatible with WordNet, and enriched with

additional semantic and syntactic information, mainly derived from Levin’s verb classes [Levin,

1993]. This knowledge connects semantic and thematic information of the verbs with their syntactic

structure and selectional restrictions. The syntactic information is coded in the Lexicalized Tree-

Adjoining Grammar (LTAG) formalism [Schabes et al., 1988], and is further expanded

with knowledge about the eventuality structure of each verb.

Each sense of a verb in VerbNet refers to a particular class. Selectional restrictions are

explicitly assigned, as well as additional semantic characterization if this was not captured by the

verb’s class. The additional semantic information refers to the eventuality of the verbs (i.e., whether

the predicate is true in the preparatory, culmination, or consequent stage of an event).


build-26.1-1
  WordNet Senses:  make (6 11 12 17 28 32 33 39)
  Thematic Roles:  Agent[+animate OR +machine], Asset[+currency],
                   Beneficiary[+animate OR +organization],
                   Material[+concrete], Product[+concrete]
  Frames:  Basic Transitive; Benefactive Alternation (double object);
           Benefactive Alternation (for variant);
           Material/Product Alternation Transitive (Material Object);
           Material/Product Alternation Transitive (Product Object);
           Raw Material Subject Alternation;
           Sum of Money Subject Alternation (Agent Subject);
           Sum of Money Subject Alternation (Asset Subject);
           Unspecified Object Alternation
  Verbs in same (sub)class:  [build, carve, cut, make, sculpt, shape]

watch
  WordNet Senses:  watch (1 2 3 4 5 6)
  Thematic Roles:  Experiencer[+animate], Stimulus[]
  Frames:  Basic Transitive ("The crew spotted the island"):
           Experiencer V Stimulus
  Verbs in same (sub)class:  [descry, discover, espy, examine, eye, glimpse,
           inspect, investigate, note, observe, overhear, perceive, recognize,
           regard, savor, scan, scent, scrutinize, sight, spot, spy, study,
           survey, view, watch, witness, sniff]

Figure 7.2: VerbNet entries for make - build-26.1 and watch

FrameNet

FrameNet [Baker et al., 1998] (http://framenet.icsi.berkeley.edu/) is an online lexicon for English developed at the University of California, Berkeley,

based on frame semantics and supported by corpus evidence. As of October 2005, it contains about

8,900 lexical units, with about 6,100 of them fully annotated, with 625 semantic frames exemplified

in about 135,000 annotated sentences.

A FrameNet entry lists every set of arguments a word can take, including the possible sets of

thematic roles, syntactic phrases, and their grammatical functions.

A lexical unit is a pair <word, meaning>. Each sense of a word belongs to a different semantic

frame: a structure that describes the particular type of event, object, or situation and possible

participants if the word is predicating. For instance, the Apply heat frame describes a common

situation involving a Cook, Food, and a Heating Instrument, and is evoked by words such as bake,

blanch, boil, broil, brown, simmer, steam. The roles of semantic frames are called frame elements

and they usually describe syntactic dependents of a word [Ruppenhofer et al., 2005].

Relations between words are expressed by several relations defined on frames: Inheritance

(IsA), Using relation (for instance, the Speed frame uses the Motion frame), and Subframe (e.g.,

abstain.v
  Frame: Forgoing
  Definition (COD): restrain oneself from doing something
  Frame Elements and their Syntactic Realizations:
    Desirable (15):  PP[from].Dep (10), PP[on].Dep (2), PPing[from].Dep (3)
    Forgoer (15):    NP.Ext (15)
  Valence Patterns (15 TOTAL):
    (10)  Desirable = PP[from].Dep,    Forgoer = NP.Ext
    (2)   Desirable = PP[on].Dep,      Forgoer = NP.Ext
    (3)   Desirable = PPing[from].Dep, Forgoer = NP.Ext

the Criminal process frame has subframes of Arrest, Arraignment, Trial, and Sentencing).

ComLex

ComLex [Macleod and Grishman, 1995] (http://nlp.cs.nyu.edu/comlex/index.html) is an English computational lexicon developed at New

York University, which contains approximately 36,000 lexical items (21,000 nouns, 8,000 adjectives,

and 6,000 verbs). Each entry is organized as a nested typed feature-value list, with a predefined

set of possible features and complements. Each entry contains morphological data and subcategorization

for predicate words. Subcategorization is marked only with syntactic features, such as the

complement phrase type and control features (e.g., NP, NP-PP).

(verb :orth "abstain"
      :subc ((intrans)
             (pp :pval (("from")))
             (p-ing-sc :pval (("from")))))

Figure 7.4: ComLex entry of the verb abstain

7.1.3 Choice of Lexical Sources

In our implementation, we used Levin's verb classes and ComLex in the verbs lexicon for generation, while WordNet was used both for the ontology and for the verbs lexicon. We also mapped each of the Bliss symbols in the Bliss lexicon to WordNet senses. This is a somewhat problematic decision, since it can narrow the variety of meanings that a Bliss symbol may represent. VerbNet was used for the ontology since it refers to both lexical sources that were used in the verbs lexicon: Levin's alternations and WordNet senses. We have not used FrameNet since it refers to neither WordNet nor Levin, which was impractical for our application.

6 http://nlp.cs.nyu.edu/comlex/index.html


Figure 7.5: Hebrew and Bliss Medical Words

7.2 Bliss Lexicon

The Bliss lexicon provides the list of Bliss symbols accessible to the user, along with their graphic

representation, semantic information, and the mapping of symbols to English and Hebrew words.

Bliss is constructed to be a written-only language, with basically non-arbitrary symbols. The

form of Bliss symbols is rooted in their meaning in an iconic manner [Ducrot and Todorov, 1983].

Because words are structured from semantic components, the graphic representation by itself pro-

vides information on words’ connectivity. For example, the written form of the words in Figure

7.5 indicates nothing about their meaning (for non Hebrew-readers) or their semantic relatedness.

In contrast, the Bliss forms of the words suggest a possible meaning connection (in this example:

doctor, nurse, and hospital).

The next section provides a thorough description of the language.

Section 7.2.2 describes the implementation of the Bliss lexicon, which is the basis for the graphic representation and for the vocabulary of the communication board.

7.2.1 Overview of Blissymbolics

Blissymbols7 is an iconic language founded as a universal written language by Charles K. Bliss, and adopted in the 1970s for communication with non-speaking children.

Although located low on the transparency scale of symbols, we have chosen to implement our

system for Blissymbols for various reasons. While it is not used as much as PCS, for instance,

people who use it experience it not as a set of symbols, but as a language. An Israeli user claims: "I speak three languages: Hebrew, English, and Bliss" [Nir, 2005]. From the linguistic point of view, Bliss is a challenging language. Its semantic structure is appealing and provides a useful basis for the structuring of a computerized lexicon for the process of natural language generation. In addition, the lack of up-to-date software for Bliss in Hebrew undoubtedly affects the number of users of Bliss symbols in Israel.

7 I use the terms Blissymbols and, for short, Bliss interchangeably.

The History of Bliss

Blissymbolics (Bliss for short) is a graphic meaning-referenced language, created by Charles Bliss

to be used as a written universal language. It was first published in 1949 and elaborated later in

1965 in his book Semantography [Bliss, 1965]. Bliss, a survivor of the Holocaust, was influenced by

the Chinese orthography system and his life experience, and wished to establish an understandable

written language that could be used by people of different nations and languages – as he believed

that language misunderstanding is a main cause of wars in the world.

In 1971, the Bliss symbol system was first used for communication with severely language-impaired children, when the staff of the Ontario Crippled Children's Center (OCCC) realized that a set of symbols more abstract than pictures would enable non-speaking children to communicate more effectively. Shirley McNaughton of the OCCC found out about Blissymbols, the center adopted their use, and new symbols were specially developed, since many words that were in use by the handicapped children were missing from the language. Charles Bliss visited the OCCC in 1972 and helped to improve and revise the new symbols.

Ever since, it has been widely used for communication with children who cannot (yet) learn or

read sound-referenced words [McDonald, 1982].

Bliss usage is standardized, and new symbols are added by an international committee of the BCI – Blissymbolics Communication International. The authority of the BCI is based on its usage since 1971, through legal agreements with Charles K. Bliss [(BCI), 2004].

Bliss is used in more than 33 countries worldwide and has been translated into 17 languages [Beukelman and Mirenda, 1998].

The use of Bliss symbols is possible with three approaches [Hunnicutt, 1986]:

1. The telegraphic style – word-to-word, no morphological or syntactical analysis.

2. Bliss syntax – the original intention of Charles Bliss was to make the language as simple as


possible: (I) SVO order, (II) the negative marker is placed before the verb, (III) modifiers precede the modified word, (IV) the question marker is placed as the first symbol of a sentence, with the rest ordered as a declarative sentence, (V) exclamations are prefaced with an exclamation mark, and, finally, (VI) place and time are located at the beginning of a sentence (place first, then time).

3. Natural spoken language syntax - following the language’s syntactic rules.
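The Bliss-syntax ordering rules listed in option 2 can be sketched procedurally (a hypothetical illustration: the role names and the input layout are mine, not part of any Bliss standard):

```python
def bliss_order(parts):
    """Order sentence parts according to Charles Bliss's simplified syntax.

    `parts` maps hypothetical role names to lists of symbol glosses.
    """
    out = []
    if parts.get("question"):
        out.append("?")                 # (IV) question marker opens the sentence
    if parts.get("exclamation"):
        out.append("!")                 # (V) exclamation marker as a preface
    out += parts.get("place", [])       # (VI) place first, ...
    out += parts.get("time", [])        #      ... then time
    out += parts.get("subject_modifiers", []) + parts.get("subject", [])  # (III)
    if parts.get("negated"):
        out.append("not")               # (II) negative marker before the verb
    out += parts.get("verb", [])
    out += parts.get("object_modifiers", []) + parts.get("object", [])    # (I) SVO
    return out

print(bliss_order({"question": True, "place": ["school"],
                   "subject": ["you"], "verb": ["eat"], "object": ["apple"]}))
# → ['?', 'school', 'you', 'eat', 'apple']
```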

In most cases, the decision was to adopt the spoken language syntax with Bliss symbols, since

it assists with literacy skills and eases reading and writing acquisition later on.

In the adaptation of Bliss to Hebrew (and to Arabic as well), the decision was to write Bliss right-to-left, like the written form of the spoken language. This forced not only writing a sequence in the opposite direction, but also changing the direction of the symbols. However, not all symbols were mirrored, and the lack of uniformity caused problems in attempts to adapt Bliss software to Hebrew.

Most commonly, Bliss symbols were used on cardboard displays, but [Waller and Jack, 2002] point to two kinds of electronic devices recently used by Bliss symbol users: dedicated devices such as Dynavox, and Bliss communication board software such as WinBliss and Bliss For Windows with Clicker.

However, none of these electronic devices generates full sentences; they only pronounce the names of the symbols.

The Bliss Language

Bliss was designed as a “complete pictorial symbol language” [McDonald, 1982]. Bliss symbols are

meaning-referenced (as opposed to the sound-referenced symbols of the spoken language). Each

symbol represents a thing, an action, an evaluation, or an abstract meaning. Symbols are composed

from a relatively small number of atom symbols (“symbol elements”) of several types (see Figure

7.6).

Following BCI’s publication on the fundamental rules of Bliss, [(BCI), 2004] there are two

main types of Bliss symbols: Bliss-characters, which are the building blocks of the language and

are indivisible (such as book or medical), and Bliss-words, which can be Bliss-characters with a

particular meaning, or a sequence of Bliss-characters, separated from each other with a Blissymbolic quarter space (Bliss-words are separated from each other with a Blissymbolic full space, or a half-space away from punctuation).

Figure 7.6: Example of Bliss symbol types

It is important to note that there are no different fonts or character variations (such as italics or sans-serif), since the meaning of a symbol can change with such small variations.

The traditional types of Bliss symbol words [(BCI), 2004] [McDonald, 1982] are:

Arbitrary symbols – symbols with no pictorial relationship between form and meaning (such

as a-an, the, that, digits 1, 2, ..., and mathematical signs +, -, ×). Some of the arbitrary symbols (which Bliss invented) were rationalized; for example, "action" is reminiscent of a volcano shape.

Ideographs – symbols that create a graphic association between the symbol and the concept it

represents (such as before, after, in, on, down, up).

Pictographs – symbols whose drawing resembles what they intend to symbolize (such as house,

animal, flag) and usually refer to concrete objects.

Compound Symbols – groups of symbols arranged to represent objects or ideas (such as home,

happy, angry, sad, school, university, teacher).


Figure 7.7: Usages of Pointers for Meaning Selection

The meaning of a symbol depends on four main parameters in addition to its shape: size, position, and configuration, the latter composed of orientation and spacing. All four are relative to a square with a grid (which can be of any size). The base of the square is the earthline and its top is the skyline. Each symbol can appear in three sizes: full size, half size, and quarter size. Size changes meaning, as in the case of a circle: a full-size circle represents sun and a half-size one a mouth (see Figure 7.9.A). Position is also meaningful, as in the case of belongs to, and/also, and with. The configuration of symbols consists of direction (forward, backward, down, up, for instance) and spacing (far, near, high, low).

An important ideograph is the pointer, which is part of a symbol and is used to point to a

specific attribute of the whole meaning, i.e., a selector: for instance, body, chest, waist, crotch,

shoulder (see Figure 7.7).

Symbols may be grouped together in order to form meaning. The two main modes of grouping

are by superimposing symbols (wheelchair, rain) or by a sequential position, either separated or

touching (aunt, school).

Indicators are special symbols that Charles Bliss invented to mark certain qualities of the words represented, aiming to reduce possible ambiguity of the grammar. Although indicators can be identified as part-of-speech markers, they were not intended to be interpreted as such.

Indicators are symbols of quarter size and located above the skyline of the square.

Thing Indicator – refers to a chemical thing, as Charles Bliss defined it: an object that can be seen, touched, or weighed, i.e., the symbol corresponds to a concrete noun. In practice, the thing indicator is not required unless it is essential for distinguishing a symbol from competing abstract nouns (time vs. clock).

Action Indicator – a quarter-sized action symbol indicating actions taking place in the present (i.e., these symbols correspond to verbs of activity).

Figure 7.8: Example: mind, minds, brain, thoughtful, think, thought, will think.

Past Action Indicator – a quarter-sized past symbol.

Future Action Indicator – a quarter-sized future symbol.

Description (evaluation) Indicator – evaluations or judgments of qualities (that may change

in time).

Plural Indicator – a quarter-sized multiplication symbol to indicate a plural number of things.

Figure 7.8 exemplifies the change of meaning through the use of indicators.

Several symbols function as modifiers, and are prefixed or suffixed to the meaning-carrying

symbols. Such symbols are the multiplier which is used for augmentations, the opposite symbol,

and the intensifier (see Figure 7.9.B). Indicators are not located above modifiers.

There is not necessarily a one-to-one mapping between symbols and words, and symbols may

have more than one meaning, depending on context. For instance, the meaning of the symbol to

speak may also be to say, to tell, to narrate, to talk, to report.

7.2.2 The Design of the Bliss Lexicon

We have designed and implemented a lexicon of Bliss-Hebrew-English words that takes into account the special characteristics of the Bliss symbol language. The lexicon can be searched by keyword (doctor), or by semantic/graphic component: searching for all words in the lexicon that contain both person and medical returns the symbols aiding tool, artificial insemination, dentist, doctor, nurse, etc.
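Such component search amounts to set intersection over each entry's components; a minimal sketch (the toy entries below are illustrative, not the lexicon's actual contents):

```python
# Toy fragment of a Bliss lexicon: word -> set of component symbols.
LEXICON = {
    "doctor":  {"person", "medical"},
    "nurse":   {"person", "medical", "protection"},
    "dentist": {"person", "medical", "tooth"},
    "teacher": {"person", "give", "knowledge"},
}

def search_by_components(components):
    """Return all words whose symbol contains every requested component."""
    wanted = set(components)
    return sorted(w for w, parts in LEXICON.items() if wanted <= parts)

print(search_by_components(["person", "medical"]))
# → ['dentist', 'doctor', 'nurse']
```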

The design of the lexicon enables easy manipulation of the symbols (graphical editing, adding

new synonyms) and an easy way to insert new symbols (by combining existing symbols or by drawing a new one).

Figure 7.9: Semantic modifiers: much, intensifier, opposite.

It contains both Hebrew and English words and adjusts the representation

according to the language.

In addition to its structure and component symbols, each word in the lexicon is assigned one or more domain tags, which are ordered in a hierarchy. This addition was required for a more efficient structuring of the communication board itself: if a user chooses the school context, the system uses the subset of words that are labelled under the school tag in its dynamic presentation of symbols.
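Filtering by a hierarchical domain tag means selecting words labelled with the chosen tag or any tag below it. A sketch of this idea (the tag names and hierarchy are invented for illustration):

```python
# Hypothetical tag hierarchy: child -> parent (None marks a root tag).
PARENT = {"classroom": "school", "gym": "school", "school": None, "home": None}

# Hypothetical word -> domain tag assignments.
TAGS = {"blackboard": "classroom", "ball": "gym", "teacher": "school", "bed": "home"}

def under(tag, context):
    """True if `tag` equals `context` or lies below it in the hierarchy."""
    while tag is not None:
        if tag == context:
            return True
        tag = PARENT[tag]
    return False

def words_for_context(context):
    """All words whose tag falls under the chosen context."""
    return sorted(w for w, t in TAGS.items() if under(t, context))

print(words_for_context("school"))
# → ['ball', 'blackboard', 'teacher']
```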

A somewhat similar approach was taken in the development of BlissWord [Andreasen et al.,

1998], where symbols are represented in a picture format augmented with the symbol’s name,

synonyms, ISO number, basic shapes (e.g., wavy line), key components (e.g., water), indicators,

and categorization (e.g., Quick → being alive → things we do → Moving and Staying still → Moving). A symbol can be retrieved by specifying shapes or components contained in the desired symbol, by searching the hierarchy of symbol categories, or by a combination of the two, as well as by its English reference or from the most-frequent-symbols list.

When Bliss symbols were adapted to Hebrew, it was decided that they would be presented from right to left, a decision that is valid for most, but not all, symbols. This forced us to add a marker to each symbol to indicate whether it has to be drawn inverted. Figure 7.10 shows the diversity in representation: not only sequences may be inverted, but superimposed symbols and atoms as well.

Figure 7.10: Hebrew vs. English Representation of Symbols

The Bliss lexicon is available as a Web application – users can connect to the site and search

for words by drawing parts of the symbol or by English or Hebrew translation, and group words

according to topic. It is also possible to insert new words, including drawing the Bliss symbol8 (see Figure 7.12).

7.2.3 Bliss Lexicon Software Development

Since the Bliss lexicon includes graphic symbols and we decided to make it available online as a

Web application, its development required specific attention.

The Bliss lexicon library is written9 in Java 1.5; the front-end is implemented using JSP and applets. The visualization of symbols is done using SVG, an XML-based language for the

description of vector graphics. The mappings of symbols to words in natural language and back to

symbols, the basic relations between symbols, and the visual representation of basic symbols are

stored in XML files. The more complex relations between symbols and the visual representation of

complex symbols are inferred programmatically.

8 The lexicon is available at http://www.cs.bgu.ac.il/∼bliss
9 The Bliss lexicon was implemented by Yoav Goldberg, Department of Computer Science, Ben-Gurion University.
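The thesis does not reproduce the XML schema; a hypothetical entry in this spirit, with invented element and attribute names, might record a symbol's structure, its Hebrew and English glosses, its domain tag, and the Hebrew mirroring marker discussed above:

```xml
<!-- Hypothetical symbol entry; the element and attribute names are
     illustrative, not the actual schema used in the implementation. -->
<symbol id="0042" kind="sequence" mirror-in-hebrew="true">
  <component ref="person"/>
  <component ref="medical"/>
  <gloss lang="en">doctor</gloss>
  <gloss lang="he">rofe</gloss>
  <domain>hospital</domain>
</symbol>
```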


Figure 7.11: Hierarchy of Bliss Objects

The vocabulary of our lexicon contains approximately 2200 symbols, as found in the Hebrew

and English Bliss lexicons [Shalit et al., 1992] [Hehner, 1980].

For the implementation, all symbols in the Bliss lexicon were entered into a database according to their structure: symbols are either atoms (ideographs), pictographs, superimposed symbols, or sequences of symbols (separated or touching).

The database was then checked for coherency and revised accordingly. In the written lexicons (for both Hebrew and English), symbols are represented and interpreted by their components. However, not all components are present in the lexicon, and those had to be added to preserve coherency.

Each symbol was implemented as an object with a unique ID. All symbols (whether atoms or

composite) can be manipulated in the same manner. The object includes information about the

graphic representation (but not the graphics), information about the Hebrew and English words,

and relatedness to other symbols. Visualization is done by a separate module.

7.3 Using Lexical Resources for the System Lexical Chooser

For the acquisition of the concepts/relations database, we use two main sources: VerbNet [Kipper

et al., 2000] and WordNet [Miller et al., 1990].

Figure 7.12: A snapshot of the Bliss Lexicon Web Application

WordNet was chosen as the source of information for the concepts that are linguistically realized as nouns, adjectives, and adverbs, since it provides hierarchy information (using the hypernym

relations of synsets). WordNet’s information on verbs was enriched with VerbNet’s data.

VerbNet was chosen as the source knowledge for the realization of processes for the following

reasons:

• It refers both to WordNet senses of verbs and to Levin's alternations. These two sources of information are easily mapped into the form of lexical chooser we wanted to implement.

• Thematic roles – the coding of selectional restrictions in VerbNet relies on a feature hierarchy where, for instance, animate subsumes animal and human, and concrete subsumes both animate and inanimate. This description of selectional restrictions fits the concept of the ontology as we constructed it.
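Checking such a selectional restriction reduces to walking up the feature hierarchy. A minimal sketch using the animate/concrete fragment mentioned above (the code layout is mine, not VerbNet's encoding):

```python
# Fragment of the feature hierarchy: feature -> parent feature.
FEATURE_PARENT = {
    "human": "animate", "animal": "animate",
    "animate": "concrete", "inanimate": "concrete",
    "concrete": None,
}

def satisfies(feature, restriction):
    """True if `feature` is subsumed by `restriction` in the hierarchy."""
    while feature is not None:
        if feature == restriction:
            return True
        feature = FEATURE_PARENT[feature]
    return False

print(satisfies("human", "animate"))      # → True
print(satisfies("inanimate", "animate"))  # → False
```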

We use the information in WordNet and VerbNet for bootstrapping concept and relation

hierarchies. For all words of the Bliss Lexicon, we manually choose the word’s sense (synset)

according to WordNet. For all nouns, we induce the hypernym hierarchy from WordNet,


resulting in a tree of concepts – one for each synset appearing in the list of words (see Figure 5.3).
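The hypernym-chain induction can be sketched as follows (a toy hypernym table stands in for WordNet here, and the synset names are illustrative):

```python
# Toy hypernym relation (synset -> hypernym), standing in for WordNet.
HYPERNYM = {
    "doctor.n.01": "health_professional.n.01",
    "nurse.n.01": "health_professional.n.01",
    "health_professional.n.01": "person.n.01",
    "person.n.01": "entity.n.01",
    "entity.n.01": None,
}

def induce_tree(synsets):
    """Collect each chosen synset's hypernym chain and merge the chains
    into one child -> parent concept tree."""
    tree = {}
    for s in synsets:
        while s is not None:
            tree[s] = HYPERNYM[s]
            s = HYPERNYM[s]
    return tree

tree = induce_tree(["doctor.n.01", "nurse.n.01"])
print(sorted(k for k, v in tree.items() if v == "health_professional.n.01"))
# → ['doctor.n.01', 'nurse.n.01']
```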

In addition to the concept hierarchy, we derive relations among the concepts and predicates by

using the VerbNet lexical database. VerbNet supplies information on the conceptual level, in

the form of selectional restrictions for the thematic roles (see Figure 5.4). These relations allow

us to connect the concepts and relations in the derived ontology to nouns, verbs, and adjectives.

The ontology is used as the basis for the CG construction and supplies the selectional restriction

information that is needed in the authoring process.

The concept hierarchy contains both objects (nouns) and events (verbs).

Separate from, but strongly connected to, the ontology, a lexical chooser is structured to include specific lexical information on the concepts that have to be lexicalized: their lexemes and syntactic information (such as subcategorization, or gender for nouns). For verbs, we use the integrated lexicon

(see below).

Information on nouns is retrieved from WordNet; for Hebrew, it is hand-coded.

7.4 Integrating a Large-scale Reusable Lexicon for NLG

The lexicon of an NLG system is a significant component since it links the semantic content to its

final syntactic representation. Verbs determine the clause structure by constraining the arguments:

their number, order, and selectional restrictions. Nouns affect the selection of collocational adjec-

tives (e.g., strong tea and not powerful tea and not strong juice). In most NLG systems, knowledge

is hand-coded anew for the specific domains of the applications. We have integrated a large-scale,

reusable lexicon with the FUF/SURGE [Elhadad and Robin, 1996] syntactic realization system.

The integration of the lexicon with FUF/SURGE has various benefits, including the possibility

of accepting semantic input at the level of WordNet synsets, the production of lexical and syn-

tactic paraphrases, the prevention of non-grammatical output, reuse across applications, and wide

coverage.

Natural Language generation starts from semantic concepts and then finds words to realize such

semantic concepts. Most existing lexical resources, however, are indexed by words rather than by

semantic concepts. Such resources, therefore, cannot be used for generation directly. Moreover,

generation needs different types of knowledge, which typically are encoded in different resources.

However, the different representation formats used by these resources make it impossible to use


them simultaneously in a single system. To overcome these limitations, we built a large-scale,

reusable lexicon for generation by combining multiple existing resources. The resources that are

combined are WordNet, Levin’s English Verb Classes and Alternations (EVCA), and COMLEX.

In combining these resources, we focused on verbs. The combined lexicon includes rich lexical

and syntactic knowledge for 5,676 verbs. It is indexed by WordNet synsets as required by the

generation task. The knowledge in the lexicon includes:

• A complete list of subcategorizations for each sense of a verb.

• A large variety of alternations for each sense of a verb.

• Frequency of lexical items and verb subcategorizations in a version of the Brown corpus tagged with WordNet synsets.

• Rich lexical relations between words.

The sample entry for the verb “appear” is shown in Figure 7.13. It shows that the verb appear

has eight senses (the sense distinctions come from WordNet). For each sense, the lexicon lists all

the applicable subcategorizations for that particular sense of the verb. The subcategorizations are

represented using the same format as in COMLEX. For each sense, the lexicon also lists applicable

alternations, which we encoded based on the information in EVCA. In addition, for each subcate-

gorization and alternation, the lexicon lists the semantic category constraints on verb arguments.

In the figure, we omitted the frequency information derived from the Brown corpus and lexical

relations (the lexical relations are encoded in WordNet).

The construction of the lexicon is semi-automatic. First, COMLEX and EVCA were merged,

producing a list of syntactic subcategorizations and alternations for each verb. Distinctions in

these syntactic restrictions according to each sense of a verb are achieved in the second stage,

where WordNet is merged with the result of the first step. Finally, the corpus information is added,

complementing the static resources with actual usage counts for each syntactic pattern. For a

detailed description of the combination process, refer to [Jing and McKeown, 1998].
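The three stages can be sketched abstractly (toy records; the real merge described in [Jing and McKeown, 1998] involves considerably more disambiguation):

```python
# Stage 1 inputs: COMLEX subcategorizations and EVCA alternations, per verb.
COMLEX = {"appear": ["INTRANS", "TO-INF-RS", "PP-TO-INF-RS"]}
EVCA = {"appear": ["there-insertion"]}

# Stage 2 input: WordNet senses of the verb.
WORDNET_SENSES = {"appear": ["give an impression", "become visible"]}

# Stage 3 input: corpus counts per (verb, pattern) -- toy numbers.
BROWN_COUNTS = {("appear", "INTRANS"): 57}

def build_entry(verb):
    """Merge the static resources, index by sense, and attach counts."""
    syntax = {"subcat": COMLEX[verb], "alternations": EVCA[verb]}
    return {
        sense: {**syntax,
                "counts": {p: BROWN_COUNTS.get((verb, p), 0)
                           for p in syntax["subcat"]}}
        for sense in WORDNET_SENSES[verb]
    }

entry = build_entry("appear")
print(entry["become visible"]["counts"]["INTRANS"])
# → 57
```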


appear:
  sense 1  give an impression
    ((PP-TO-INF-RS :PVAL ("to") :SO ((sb, −)))
     (TO-INF-RS :SO ((sb, −)))
     (NP-PRED-RS :SO ((sb, −)))
     (ADJP-PRED-RS :SO ((sb, −) (sth, −))))
  sense 2  become visible
    ((PP-TO-INF-RS :PVAL ("to") :SO ((sb, −) (sth, −)))
     ...
     (INTRANS THERE-V-SUBJ :ALT there-insertion :SO ((sb, −) (sth, −))))
  ...
  sense 8  have an outward expression
    ((NP-PRED-RS :SO ((sth, −)))
     (ADJP-PRED-RS :SO ((sb, −) (sth, −))))

Figure 7.13: Lexicon entry for the verb appear

((SENSE 1)
 (RALT STRUCTS-VERB-WATCH-SENSE-1
  (((STRUCT NP-ING-OC)
    (ARGS
     ((ALT SELECTIONAL-WATCH-1-NP-ING-OC
       (((1 ((ANIMATE YES))) (2 ((ANIMATE NO))))
        ((1 ((ANIMATE YES))) (2 ((ANIMATE YES)))))))))
   ((STRUCT NP-NP-PRED)
    (ARGS
     ((ALT SELECTIONAL-WATCH-1-NP-NP-PRED
       (((1 ((ANIMATE YES))) (2 ((ANIMATE NO))))
        ((1 ((ANIMATE YES))) (2 ((ANIMATE YES)))))))))
   ((STRUCT NP)
    (ARGS
     ((ALT SELECTIONAL-WATCH-1-NP
       (((1 ((ANIMATE YES))) (2 ((ANIMATE NO))))
        ((1 ((ANIMATE YES))) (2 ((ANIMATE YES))))))))))))

Figure 7.14: VerbNet make - build-26.1


7.5 Summary

We have presented in this chapter the three lexicons that were constructed for the system (Bliss

Lexicon, Ontology derived from lexical resources, and an NLG resource for lexical choice of English

verbs). The common ground for all lexicons is the use of a WordNet sense: in the Bliss lexicon,

in the verbs lexicon and in the ontology.

Referring to the knowledge that LKBs encode (Section 7.1.2), our system includes morphological

information (part of speech, irregularities), syntactic information (subcategorization), and semantic

information (selectional restrictions, relationships with other words). We have not yet encoded

phonological knowledge or pragmatic knowledge. Pragmatics in the system are expressed in the

choice of context on the communication board and the choice of defaults that are later encoded in

the SR.

The Bliss lexicon contains the graphic information of the symbols and their POS (which is

needed for the task of finding the right sense for the words when creating the ontology). The

ontology connects the meaning to the possible syntactic structure, as it controls the possible symbols through selectional constraints and communicates with the lexical chooser.

The next chapter presents SAUT - the semantic authoring system and the communication board

that uses SAUT as its processing engine.


Chapter 8

Communication Boards

This chapter describes the overall generation process in the AAC system as a form of semantic

authoring. We present the SAUT system as a general prototype for semantic authoring, and its

adaptation to a dynamic communication board for Bliss.

Section 8.1 first describes SAUT as a general tool. Section 8.2 presents the NLG-AAC communication board, based on the authoring tool, and the overall layout of its display.

8.1 The SAUT Semantic Authoring Tool

SAUT [Biller, 2005] [Biller et al., 2005] is an authoring system for logical forms encoded as con-

ceptual graphs (CG). The system belongs to the family of WYSIWYM (What You See Is What

You Mean) [Scott et al., 1998] text generation systems: logical forms are entered interactively and

the corresponding linguistic realization of the expressions is generated in several languages. The

system maintains a model of the discourse context corresponding to the authored documents.

The user edits a specific document by entering utterances in sequence, while the system maintains a representation of the context. As the user enters data, the system performs the standard steps

of text generation on the basis of the authored logical forms: reference planning, aggregation,

lexical choice, and syntactic realization – in several languages (we have implemented English and

Hebrew and discuss Bliss below). The feedback in natural language is produced in real-time for

every modification performed by the author.

The architecture of the system is depicted in Figure 8.1.

The two key components of the system are the knowledge acquisition system and the editing component.

Figure 8.1: Architecture of the SAUT System

The knowledge acquisition system is used to derive an ontology from sample texts in

a specific domain (see section 7.3). In the editing component, users enter logical expressions on the

basis of the ontology.

8.1.1 Conceptual Graphs

Overview

Conceptual graphs are a logical knowledge representation developed by John Sowa [Sowa, 1984]. CGs are an understandable model that can express natural language utterances, unlike, for instance, first-order logic predicates. CGs enable authors to model linguistic phenomena such as quantification and determination in a formal way. Sowa based his work on the existential graphs of Charles S. Peirce

[Roberts, 1973] and the semantic networks of artificial intelligence [Sowa, 1987]. Conceptual graphs

are widely used in various research fields such as information retrieval, NLP, and expert systems.

A conceptual graph is a directed bipartite graph with two kinds of nodes:

• Concepts

• Relations


[Cat] -> (On) -> [Mat].

Figure 8.2: Linear representation of a Conceptual Graph

In a graphical representation, concepts are drawn as squares and relations are circles. In a

linear representation of a graph, square brackets are used for concepts and curved parentheses for

relations.1

Concepts represent objects, events, and abstract entities, while relations represent the relationships among concepts. Concepts and relations are typed; types are taken from an ontology and are

structured in a hierarchy. Concept types are ordered in a lattice, with Entity (the universal supertype) at the top and Absurdity (the absurd type) at the bottom.

In a conceptual graph, a concept node can represent either an entire class (a type) or a referent to a particular instance of the class. The # symbol represents a definite article, i.e., [Cat:#] means the cat. A node containing [Cat] only means the indefinite a cat, and represents a generic type. It can also contain a referent to a named entity, as in the case of [Cat:Mitzi]. A concept in a graph may contain additional features, and the value of a feature may be a conceptual graph as

well.

Each relation has a type, which determines an arity2 expressed as the number of concepts it is connected to. The arcs between the nodes are directed, and direction is influenced

by the meaning of the relation. A concept of a form

[Con1] -> [Rel] -> [Con2]

is to be read: the REL of Con1 is Con2 [Mann, 1996]. Monadic relations such as (NOT) are

attached to one concept only, but most relations have a larger arity. The type of relation also

determines the type of concepts to which it connects.

The basic operations on conceptual graphs that form new graphs from existing ones were defined by Sowa [Sowa, 1984]: restrict, join, simplify, and copy, along with higher operations such as projection and unification (maximal join).

1 Examples are taken from http://www.jfsowa.com/cg/cgexampw.htm
2 Arity is the number of arguments to a term.


8.1.2 Authoring Tools

The input data to an NLG system can be either derived from an existing application database or

it can be authored specifically to produce documents. Applications where the data are available

in a database include report generators (e.g., ANA [Kukich, 1983], PlanDoc [Shaw et al., 1994],

Multimeteo [Coch, 1998], FOG [Goldberg et al., 1994]). In other cases, researchers identified

application domains where some of the data are available, but not in sufficient detail to produce

full documents. The WYSIWYM approach was proposed ([Power and Scott, 1998], [Paris and

Vander Linden, 1996]) as a system design methodology where users author and manipulate an

underlying logical form through a user interface that provides feedback in natural language text.

The effort invested in authoring logical forms – either from scratch or from a partial application

ontology – is justified when the logical form can be reused. This is the case when documents must

be generated in several languages. When documents must be produced in several versions, adapted

to various contexts or users, the flexibility resulting from generation from logical forms is valuable.

WYSIWYM

In an influential series of papers [Power and Scott, 1998], WYSIWYM (What You See Is What

You Mean) was proposed as a method for the authoring of semantic information through direct

manipulation of structures rendered in natural language text. A WYSIWYM editor enables the

user to edit information at the semantic level. The semantic level is a directly controlled feature,

and all lower levels which are derived from it are considered presentation features. While editing

content, the user gets feedback text and a graphic representation of the semantic network. These

representations can be interactively edited, as the visible data is linked back to the underlying

knowledge representation.

Using this method, a domain expert produces data by editing the data itself in a formal way,

using a tool that requires only knowledge of the writer’s natural language. Knowledge editing

requires less training, and the natural language feedback strengthens the confidence of users in the

validity of the documents they prepare.

The semantic authoring system we have developed belongs to the WYSIWYM family. The key aspect of the WYSIWYM method we investigate is the editing of the semantic information: text is generated as feedback for every single editing operation. Specifically, we evaluate how


Figure 8.3: Snapshot of editing state in the SAUT system

ontological information helps speed up semantic data editing.

8.1.3 The SAUT Editor

To describe the SAUT editor, we detail the process of authoring a document using the tool. When the authoring tool is launched, the following windows are presented (see Figure 8.3):

• Input window

• Global context viewer

• Local context viewer

• CG feedback viewer

• Feedback text viewer

• Generated document viewer

The user operates in the input window. This window includes three panels:


• Defaults: rules that are enforced by default on the rest of the document. The defaults can

be changed while editing. Defaults specify attribute values which are automatically copied

to the authored CGs according to their type.

• Participants: a list of objects to which the document refers. Each participant is described

by an instance (or a generic) CG, and given an alias. The system provides an automatic

identifier for participants, but these can be changed by the user to a meaningful identifier.

• Utterances: editing information proposition by proposition.

The system provides context-dependent suggestions for completing expressions in the form of pop-up windows. In these suggestion windows, the user can scroll, choose with the mouse, or enter the first letters of the desired word; once the right word is highlighted by the system, the user can continue and the word is completed automatically. For

example, when creating a new participant, the editor presents a selection window with all concepts

in the ontology that can be instantiated. If the user chooses the concept type ”Dog” the system

creates a new object of type dog, with the given identifier. The user can further enrich this object

with different properties. This is performed using the ”.” notation to modify a concept with an

attribute. While the user enters the instance specification and its initial properties, feedback text

and a conceptual graph in linear form are generated simultaneously. When the user moves to the

next line, the new object is updated on the global context view. Each object is placed in a folder

corresponding to its concept type, and will include its instance name and its description in CG

linear form.

In the Utterances panel, the author enters propositions involving the objects he declared in the

participants section. To create an utterance, the user first specifies the object which is the topic of

the utterance. The user can choose one of the participants declared earlier from an identifier list, or choose a concept type from a list. Choosing a concept type will result in creating a new

instance of this concept type. Every instance created in the system will be viewed in the context

viewer. After choosing an initial object, the user can add expressions in order to add information

concerning this object. After entering the initial object in an utterance, the user can press the dot

key which indicates that he wants to enrich this object with information. The system will show the

user a list of expressions that can add information to this object. In CG terms, the system will fill

the list with items which fall in one of the following three categories:


• Relations that can be created by the system, whose selectional restrictions allow the modified object to be a source for the relation.

• Properties that can be added to the concept object, such as name and quantity.

• Concept types that expect relations, the first of which can connect to the modified object. For example, the concept type “Eat” expects a relation “Agent” and a relation “Patient”. The selectional restriction on the destination of “Agent” is, for example, “Animate”; therefore the concept “Eat” will appear on the list for an object of type “Dog”.
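How such a completion list could be derived from the ontology can be sketched as follows; the toy type hierarchy, relations, and event frames are invented for illustration and are not SAUT's actual data:

```python
# Hypothetical sketch of the dot-key completion list: given the type of the
# active object, collect (1) relations whose source restriction the type
# satisfies, (2) properties, and (3) concepts (events) whose first expected
# relation has a destination restriction the type satisfies.

SUPERTYPES = {"Dog": ["Animate", "Entity"], "Animate": ["Entity"]}

def is_a(ctype, target):
    # True if ctype equals target or target is among its supertypes.
    return ctype == target or target in SUPERTYPES.get(ctype, [])

RELATIONS = {"Attribute": "Entity", "Possession": "Animate"}      # source restriction
EVENTS = {"Eat": [("Agent", "Animate"), ("Patient", "Entity")]}   # expected relations
PROPERTIES = ["name", "quantity"]

def completions(ctype):
    items = [r for r, src in RELATIONS.items() if is_a(ctype, src)]
    items += PROPERTIES
    items += [e for e, rels in EVENTS.items() if is_a(ctype, rels[0][1])]
    return sorted(items)

print(completions("Dog"))
# ['Attribute', 'Eat', 'Possession', 'name', 'quantity']
```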

The author can modify and add information to the active object by pressing the dot key.

An object which itself modifies a previously entered object can be modified with new relations,

properties, and concepts in the same manner. The global context is updated whenever a new

instance is created in the utterances. When the author has finished composing the utterance, the

system will update the local context and will add this information to the generated natural language

document.

The comma operator (“,”) is used to define sets by extension. For example, in Figure 8.3, the

set ”salt and pepper” is created by entering the expression #sa,#pe. The set itself becomes an

object in the context and is assigned its own identifier.

The dot notation combined with named variables allows for easy and intuitive editing of the CG

data. In addition, the organization of the document as defaults, participants, and context (local

and global) provides an intuitive manner for organizing documents.

Propositions, after they are entered as utterances, can also be named, and therefore can become

arguments for further propositions. This provides a natural way to cluster large conceptual graphs

into smaller chunks.

The text generation component proceeds from this information, according to the following steps:

• Pronouns are generated when possible, using the local and global context information.

• Referring expressions are planned against the competing objects in the context, including or excluding features of the object in the generated text, so that the object's identity can be resolved by the reader without adding unnecessary information.


• Aggregation of utterances which share certain features using the aggregation algorithm de-

scribed in [Shaw, 1995].
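The referring-expression step can be illustrated with a simplified incremental algorithm in the spirit of the description above; the attribute names and context objects are invented:

```python
# Sketch of referring-expression planning (simplified, hypothetical data):
# add attributes in a preference order until the target is distinguished
# from every competing object in the context, and stop there, so no
# unnecessary information is included.

def refer(target, competitors, preferred=("type", "attribute", "quantity")):
    description = {}
    remaining = list(competitors)
    for attr in preferred:
        if attr not in target:
            continue
        ruled_out = [c for c in remaining if c.get(attr) != target[attr]]
        if ruled_out:
            description[attr] = target[attr]
            remaining = [c for c in remaining if c not in ruled_out]
        if not remaining:
            break                     # target is now unambiguous
    return description

eggs = {"type": "egg", "attribute": "big", "quantity": 6}
milk = {"type": "milk"}
small_eggs = {"type": "egg", "attribute": "small"}
print(refer(eggs, [milk, small_eggs]))  # {'type': 'egg', 'attribute': 'big'}
print(refer(eggs, [milk]))              # {'type': 'egg'} -- type alone suffices
```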

Consider the example cooking recipe in Figure 8.3. The author uses the participants section to

introduce the ingredients needed for this recipe. One of the ingredients is ”six large eggs”. The

author first chooses an identifier name for the eggs, for example, ”eg”. From the initial list of

concept types proposed by the system, we choose the concept type ”egg”. Pressing the dot key

will indicate we want to provide the system with further information about the newly created

object. We choose “quantity” from a given list by typing “qu”, and see that the word “quantity” is automatically marked in the list. Pressing the space key will automatically open brackets, which

indicates we have to provide the system with an argument. A tool tip text will pop up to explain

the function of the required argument to the user. After entering a number, we will hit the space

bar to indicate we have no more information to supply about the ”quantity”; the brackets will be

automatically closed. After the system has been told that no more modifications will be made on the quantity, the “egg” object becomes the active one again. The system marks the active object at any given time by underlining the corresponding word in the input text.

Pressing the dot will cause the list box to pop up with the possible modifications for the object.

We will now choose ”attribute”. Again the system will open brackets, and a list of possible concepts

will appear. The current active node in the graph is ”attribute”. Among the possible concepts we

will choose the ”big” concept, and continue by clicking the enter key (the lexical chooser will map

the “big” concept to the collocation ”large” appropriate for ”eggs”). A new folder in the global

context view will be added with the title of ”egg” and will contain the new instance with its identifier

and description as a CG in linear form.

Each time a dot or an identifier is entered, the system converts the current expression to a

CG, maps the CG to a FUF Functional Description which serves as input to the lexical chooser;

lexical choice and syntactic realization are performed, and feedback is provided in both English

and Hebrew.
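The first step of this pipeline, converting a dot expression into a nested structure, might be sketched as follows (a simplified, hypothetical parser handling only flat expressions, not SAUT's):

```python
import re

# Hedged sketch: turn a flat dot expression such as
# "eg:egg.quantity(6).attribute(big)" into a feature structure, the kind
# of object that could then be mapped to a lexical-chooser input.
# Nested expressions like "Boy.Meet(I)" are out of scope here.

def parse_expression(expr):
    head, *mods = expr.split(".")
    ident, _, ctype = head.partition(":")
    fs = {"id": ident, "type": ctype or None}
    for mod in mods:
        m = re.fullmatch(r"(\w+)\((\w+)\)", mod)
        if m:
            fs[m.group(1)] = m.group(2)   # modifier with an argument
        else:
            fs[mod] = True                # bare modifier, e.g. ".Plural"
    return fs

print(parse_expression("eg:egg.quantity(6).attribute(big)"))
# {'id': 'eg', 'type': 'egg', 'quantity': '6', 'attribute': 'big'}
```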

The same generated sentence is shown without context (in the left part of the screen), and in

context (after reference planning and aggregation).

When generating utterances, the author can refer to an object from the context by clicking on

the context view. This enters the corresponding identifier in the utterance graph.


8.2 Bliss Communication Board

The description of the Bliss communication board involves three aspects which are described sep-

arately in this work:

1. The input language - in this case Bliss symbols (presented above in section 7.2.1).

2. The overall layout of the display (section 8.3).

3. The processing method - adjusting the SAUT methods to the communication board (section

8.4).

8.3 Implementing a Communication Board

The main objective in the design of a communication board is efficiency: reducing the number of selections (especially when selection of symbols is not direct) while preserving a logical order of selection, keeping the user's attention, and allowing wide expressive capability.

There are several strategies to enable this desired design:

1. Displaying most frequent symbols first – symbols that are rarely used should be reachable but

not be placed in main or initial displays, to avoid overload and to reduce the list of choices.

2. Designing displays by categories – conversations are conducted in different contexts, and therefore distinct vocabularies may be used. Specifying the context in which the conversation is conducted again reduces the number of symbols to be displayed, and is therefore desirable.

3. Displaying symbols through paradigmatic relations – the paradigmatic axis – displaying sym-

bols which are possible in the current context (for instance using selectional restrictions).

4. Displaying symbols through syntagmatic relations – progress on the syntagmatic axis, i.e.,

displaying the symbols according to syntactic context.

5. Hierarchical view of ideas – taking into consideration the structure of a conversation; for

instance, focus first on the representation of an event, then provide more specific details

about it.


6. Hierarchical view of form – identifying a rhetorical structure and the manner in which it

affects possible syntactic structures.

7. User and context dynamic adjustment – learning the user’s preferences both of vocabulary

and grammatical structures.

Taking advantage of the special characteristics of Bliss symbols, our system addresses the first four strategies in this list.

Frequency lists of words are available online³ or can be computed by keeping track of usage in log files (as was done, for example, by [Copestake and Flickinger, 1998]).
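As a sketch of the frequency-first strategy, symbol frequencies could be computed from such a log and used to populate the main display; the log format and function name are assumptions for illustration:

```python
from collections import Counter

# Sketch: rank symbols by how often they appear in a usage log (assumed
# format: one selected symbol per line) and fill the main display with the
# top-k; rarer symbols stay reachable through category displays.

def main_display(log_lines, k=4):
    counts = Counter(line.strip() for line in log_lines if line.strip())
    return [symbol for symbol, _ in counts.most_common(k)]

log = ["want", "eat", "want", "more", "drink", "want", "eat", "stop"]
print(main_display(log))  # ['want', 'eat', 'more', 'drink']
```

Ties in `most_common` keep first-encountered order, so the display is stable across sessions with the same log.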

An initial (main) display is set in advance for each particular user.

There are three main methods of display arrangement. The first is the typical AAC

method of pre-defined boards (dynamic boards following [Burkhart, 2005]). The second is by

using Blissymbols unique characteristics, i.e., a virtual keyboard with atomic shapes displayed,

and when a symbol is selected, all connected symbols are retrieved (i.e., if the money symbol

is chosen then bank, business, cheap, clerk, coin, convenience store, expensive, fee,

poor, price, prostitution, rich, shekel, store, to buy, to earn, to finance, to pay,

to sell, wallet are displayed). Both methods are implementable with the tools built in this work.
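The second method can be sketched as an index from atomic shapes to the compound symbols built from them; the decompositions below are illustrative toy data, not the actual Bliss lexicon:

```python
# Hypothetical sketch of the virtual-keyboard method: index compound
# Blissymbols by the atomic shapes they contain, so selecting an atomic
# shape retrieves every symbol that uses it.

COMPOUNDS = {
    "bank": ["building", "money"],
    "to buy": ["money", "action", "to get"],
    "expensive": ["money", "much"],
    "school": ["building", "to give", "knowledge"],
}

def symbols_containing(shape):
    # All compound symbols whose decomposition includes the given shape.
    return sorted(word for word, parts in COMPOUNDS.items() if shape in parts)

print(symbols_containing("money"))     # ['bank', 'expensive', 'to buy']
print(symbols_containing("building"))  # ['bank', 'school']
```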

The third method dynamically changes the symbols on the display following the SAUT author-

ing method. We now provide details about this approach.

As in the regular SAUT system, the display is divided into four main areas:

1. A list of participants and of defaults.

2. Buttons (as will be elaborated below)

3. A text pane where the chosen symbols are displayed as the sequence is entered and the

(possibly partial) sentence generated.

4. Text in context.

When initializing a conversation, the display can be set to the participants of the conversation

(to allow quick reference) and defaults can be set – such as the tense or mood of the conversation.

These contexts can be saved to a file and reloaded as needed.

³ For instance, http://www.aacinstitute.org/Resources/ProductsandServices/PeRT/040615GeneralCoreVocabulary.txt


The display contains a set of buttons (or keys), which are further divided into three types:

1. function keys

2. hyperlinks to other displays

3. symbols

Function keys control editing functions such as delete, back (previous screen), and reset.

Pressing a hyperlink button leads the user to other displays as desired: context displays such as home, food, and school; property displays for modifying symbols or utterances (with adjectives or adverbs); and sentence starters (following the dynamic displays approach), which allow generation from pre-defined sentence structures (represented as CGs at the authoring level) that can be filled as templates.

The symbol buttons display Bliss symbols (with the Hebrew/English word written below). To allow control over grammatical factors, each display presents a constant set of language-function symbols (Bliss indicators, such as past/future/present indicators) as well as mood symbols (to indicate whether the sentence is a question or an imperative).

8.4 The Processing Method - Adopting the SAUT Technique

Our objective is to view the operation of selecting Bliss symbols in context as a form of semantic

authoring. We aim to adapt the general semantic authoring method implemented in SAUT to the

context of symbol selection in Bliss. However, SAUT is a textual system and follows conventions that are common in programming-language editors, such as IntelliSense in the Microsoft Visual Studio application. Since a Bliss communication board is not a textual system, these conventions must be adjusted to the symbol set.

In SAUT, the dot key is used when the user wants to add information about an entity he has

chosen, for instance Boy.Attribute(American) for generating the phrase “An American boy” or

Boy.Plural for “Boys”. SAUT will offer in a pop-up menu all possible relations that can have

“boy” as their argument and all concepts which can stand in relation with the given word.

In the case where the chosen symbol refers to a concept that requires one or more arguments,

the SAUT system opens brackets and offers in the pop-up menu all items that can be inserted as


the argument (following the selectional restrictions and following the arguments order as given in

the ontology).

In the case of a communication board, neither the dot key nor the space bar is used. When a symbol is chosen, it is looked up in the ontology, and the same two possibilities exist: if the symbol requires arguments, the next display shows symbols that are compatible with its selectional restrictions. Problems arise when an argument needs to be modified, as in the sentence The boy I met yesterday lives here. The SAUT input representation is Live(Boy.Meet(I).Time(yesterday) Here). If dot/space are not used and the display changes according to the arguments, then the display generated once the symbol “boy” has been chosen will contain location symbols (as the selectional restrictions of the verb live require). This problem can be solved in three ways:

1. By using a dot-like button to add properties to the last chosen symbol

2. By semantic parsing

3. By using editing options in the presentation of the symbols in the text pane

The second option was rejected since we did not want to use parsing in the text generation process. This could have been done by building a partial conceptual graph for each symbol

inserted, then using unification to find the best possible assembly and generating the sentence

following the existing process.

The third option of using editing options on the text pane requires additional keystrokes and is

inefficient for the purpose.

The compromise solution we have adopted is to add a properties hyperlink button to the

display, which can be chosen after the symbol that needs to be modified. The properties that are

displayed are determined in the same manner the dot key offers the possible complementizers in

the SAUT system. This method is also compatible with existing dynamic displays which use a

hyperlink button of properties that link to pages with qualities (such as adjectives, adverbs).

Once the main verb and its arguments are chosen, the system offers possible modifiers, implemented as circumstantials and adverbials in the SURGE/HUGG syntactic realizers [Elhadad and Robin, 1996] and represented as relations in the ontology.
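The resulting display-update rule can be sketched as follows; the frames, lexicon, and modifier list are toy data invented for illustration, not the system's ontology:

```python
# Hedged sketch of the board's display-update rule: after a verb is chosen,
# each display shows symbols satisfying the selectional restriction of the
# next unfilled argument; once all arguments are filled, modifier relations
# (circumstantials/adverbials) are offered instead.

FRAMES = {"live": [("Agent", "Animate"), ("Location", "Place")]}
MODIFIERS = ["time", "manner", "accompaniment"]
LEXICON = {"boy": "Animate", "dog": "Animate", "here": "Place", "home": "Place"}

def next_display(verb, filled_args):
    expected = FRAMES[verb]
    if len(filled_args) < len(expected):
        _, restriction = expected[len(filled_args)]
        return sorted(w for w, t in LEXICON.items() if t == restriction)
    return MODIFIERS                      # all arguments filled

print(next_display("live", []))               # ['boy', 'dog']   (Animate agents)
print(next_display("live", ["boy"]))          # ['here', 'home'] (Places)
print(next_display("live", ["boy", "here"]))  # modifier relations
```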


Symbols that were used in previous sentences of the current conversation are added to the participants list and can be referenced again with a direct selection. Using these symbols enables generation of referring expressions and aggregation where needed.

8.5 Summary

Computerized AAC devices are characterized by four aspects [Hill and Romich, 2002] [McCoy and

Hershberger, 1999]:

1. Selection method

2. Input language

3. Processing method

4. Output medium

In this chapter we have discussed our implementation of the processing method that is used in

the display.

We surveyed the process of NLG through semantic authoring and the methods used in the SAUT system for that purpose. The adoption of the SAUT technique as the processing method of the communication board, together with the overall properties of the Bliss dynamic display, distinguishes our work from previous NLG-AAC systems.

The next chapter compares such systems along these two aspects (input language and processing method) and relates their overall techniques to the current work.


Chapter 9

Comparison with Existing NLG-AAC

Systems

This chapter surveys existing NLG-AAC systems. We highlight the common architecture underlying the various systems, and identify the elements in which our system differs from existing ones.

9.1 Blisstalk

A first attempt to generate text from a sequence of Bliss symbols was made by Sheri Hunnicutt [Hunnicutt, 1986], in a system called Blisstalk. Blisstalk is a dedicated communication board with a grid of 504 squares. Most of them are dedicated to lexical items (Swedish) and a few are reserved for general system functions and tuning. The symbols are arranged according to their

part of speech. Names and words without a known symbol can be added to the display. Additional

symbols on the board refer to functions such as Bliss indicators (that can be used to modify the

part of speech of a chosen symbol), syntactic functions such as tense or number, and a special

set of symbols that can add information to the concepts on the board, such as combination or

similar-to. In addition, the display includes letters and digits.

The underlying lexicon includes pronunciation information for the words represented by the

symbols on the board, their part of speech, and additional morphological information.

The strategy taken in Blisstalk was to adopt the speaker's syntax as the input language. Blisstalk uses a phrase-structure grammar to parse the given sequence of symbols.


The parsing is done gradually by introducing phrase markers, grouping the input symbols into

verb or noun phrases. Each phrase type determines which words it can contain. Noun phrases

can be further processed, using ordering conventions, into double objects, subject-object pairs, or

both. Delimiting the phrases is intended to avoid ambiguity in the processing of the complete

sentence: if a symbol that represents a noun is located in a verb slot, it is inflected as a verb using the morphological rules.

Blisstalk relies on a syntactic parsing solution to complete and revise the input sequence and

make it more fluent. This approach suffers from two limitations: syntactic parsing on noisy input

can only have limited success; in particular, because semantic information is not used (but only

part of speech data for each symbol), there is not enough knowledge to recover from parsing errors.

The second limitation is that the revision approach works only after the input sequence has been

composed – and as a result can only improve fluency, but cannot improve the input and selection

rate.

In contrast, our approach relies on semantic authoring, and provides tools to both assist input

composition and produce more fluent output. Semantic authoring avoids the complexity of parsing

(syntactic or semantic) by controlling the input composition process.

9.2 Compansion

Compansion (Compression-Expansion) was developed to expand uninflected sequences of content words (in other words, telegraphic text) into syntactically and semantically well-formed sentences

[McCoy, 1997].

For example, John go store yesterday is transformed into the well-formed sentence John went

to the store yesterday [Pennington and McCoy, 1998].

The system was originally developed for enhancing the communication rate of people who use

telegraphic/iconic inputs, but can be also used for supporting literacy skills or for correcting writing

errors.

The main difficulties in accurately transforming an ill-formed sentence into a well-formed paraphrase are detecting multiple errors in one utterance and handling the possible ambiguity of interpretation. For example, John gone to the store can be interpreted as John went to the store or

John had gone to the store. A possible solution to this problem is to generate all possible sentences


for a given input. The selection of the best suggestion can rely on the history of the inputs produced

by the user.

The solution applied in Compansion is to perform semantic parsing of the input sequence.

The process begins with a “word order parser” which groups words into sentence-sized chunks

and indicates words’ part of speech. Modifiers are attached to words they most likely modify.

The semantic parser of the Compansion system is based on the use of case frames, i.e., con-

ceptual structures that represent the meaning of the content words and the relationships among

them.

More specifically, the parser builds the case frame structure of the verb in the utterance, filling

the slots with the rest of the content words given. The case frames are similar in spirit and definition

to the structures encoded in FrameNet (see section 7.1.2).

A list of semantic roles was chosen for the purpose:

AGEXP - Agent/Experiencer (no intentionality required) - John is happy

THEME - object acted upon

INSTR - object used to perform the action of the verb

GOAL - a receiver of the action

BENEF - the beneficiary of the action. e.g., John gave the book to Mary for Jane

LOC - event location (can further be decomposed to TO-LOC, FROM-LOC, and AT-LOC).

TIME - time of event and tense.

The semantic parser constructs the most likely interpretation of the given input, using a set

of scoring heuristics based on the semantic types of the input words (such as preferring animate

agents for actions), and a preference set of the most important slots that should or must not be

filled for a verb (for telling apart transitive/intransitive verbs for example). The parser generates

all possible structures, which are later scored. Some options are discarded if their score does not

reach a pre-defined cut-off measure. The rest of the candidates are ordered by scoring rules (e.g.,

“prefer animate agent for the verb eat”).
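The scoring scheme can be illustrated with a minimal sketch; the heuristic weights, cut-off value, and semantic-type assignments below are invented for illustration, not Compansion's actual rules:

```python
from itertools import permutations

# Sketch of Compansion-style case-frame scoring: enumerate assignments of
# the content words to the verb's slots, score each with heuristics such as
# "prefer an animate AGEXP", drop candidates below a cut-off, and order the
# rest best first.

TYPES = {"John": "animate", "apple": "food"}

def score(assignment):
    s = 0
    if TYPES.get(assignment.get("AGEXP")) == "animate":
        s += 10                      # prefer animate agents for actions
    if "THEME" in assignment:
        s += 5                       # important slot is filled
    return s

def best_frames(words, slots=("AGEXP", "THEME"), cutoff=5):
    candidates = [dict(zip(slots, p)) for p in permutations(words, len(slots))]
    scored = sorted(((score(c), c) for c in candidates), key=lambda x: -x[0])
    return [(s, c) for s, c in scored if s >= cutoff]

print(best_frames(["apple", "John"]))
# best candidate: AGEXP=John, THEME=apple (the animate agent is preferred)
```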

A further improvement to the system is handling the choice of the most likely syntactic structure

for a given input. The input Apple eat John can be generated as John ate the apple or John was


eating an apple, etc. For this purpose the system was combined with statistical information from

corpora, frequencies of subcategorization from [Ushioda et al., 1993], and lexical information from

WordNet [Miller, 1995].

In the absence of a verb in a telegraphic message, the slot is filled with either to be or to have, and in the absence of an agent, the pronouns I/You are inserted.

Once the semantic structure is determined, a translator/generator produces the corresponding sentence in English.

The semantic parsing approach investigated by Compansion is far more relevant to the telegraphic style of input characteristic of AAC situations than syntactic parsing is. As

in our approach, the system relies on a semantic representation to re-generate fluent text, relying

on lexical resources and NLG techniques. Our approach differs in that, with the model of semantic

authoring, we intervene during the process of composing the input sequence, and thus can provide

early feedback (in the form of communication board composition and partial text feedback). We

have not yet performed user evaluation to assess the difference in performance between seman-

tic parsing and semantic authoring. Eventually, we expect that even semantic authoring in the

context of an AAC application will require some semantic parsing (to avoid introducing even the

“simple” addition of syntax upon which we relied in SAUT for the non-AAC application – use of

disambiguating operators such as the dot, comma, or parentheses in the SAUT input language).

[McCoy et al., 1998] further investigated the integration of the Compansion technique into Minspeak© [Baker, 1984], and more specifically into Communic-Ease™, one of Minspeak's Application Programs (MAPs™), which contains vocabulary for children. It contains 580 words classified into 38 general categories, which are coded in the traditional Minspeak method (see section 2.2). In addition, it handles some morphology, for example, by adding plural marks. The

system runs on the PRC Liberator™ dedicated AAC device, with an Interface Display to present the textual output. The method takes advantage of the Icon Prediction of the Minspeak device, but adds an engine with a simplified version of Compansion as an Intelligent Parser/Generator (IPG).

(70 DECL(VERB (LEX EAT))(AGEXP (LEX JOHN))(THEME (LEX APPLE))(TENSE PRES))

Figure 9.1: The preferred semantic structure for the input Apple eat John

The IPG works incrementally and in parallel, as the icons are selected, and provides further constraints on the Icon Prediction process. Based on an analysis of logged transcripts of

Communic-Ease™ users, it was found that very simple sentence structures are mainly used. A

set of transformation rules was developed (which can be tailored later for individuals). If several

interpretations are found for a given input, all possible realizations are offered on the display.

The pros and cons of incremental and non-incremental processing are discussed in the paper. In

the case of incremental processing, if the system generates, for instance, a definite article instead of

an indefinite as was intended by the user, it can be fixed before the rest of the sentence is entered.

In contrast, if the process parses the complete message, the revision will be done only in retrospect.

On the other hand, constraints that are enforced on the Icon Prediction can become a burden to the

user, especially children (who form the target population of the system): the assumption is that it

is unlikely that the intended user will be able to keep the sentence in his mind word-by-word, select

icons, and evaluate the system's output; therefore, the proposed system uses the non-incremental method.

Our system does use incremental processing, but as a controlled process and without parsing. We have not tested it on any particular AAC population. However, we checked the usability of the SAUT system, and semantic authoring was found to be easy to learn. In addition, we have put much effort into developing a wide-coverage vocabulary, which is represented in both the ontology and the lexical choice modules. The results of the Compansion evaluation certainly need to be pursued in the context of our proposed semantic authoring approach.

9.3 Transforming Telegraphic Language to Greek

The system presented in [Karberis and Kouroupetroglou, 2002] generates full sentences from telegraphic input, possibly from a sequence of symbols from the Blissymbols or MAKATON symbol sets.

The system includes the following components:

• An input device for telegraphic input (either text or symbols)

• The TtFS (Telegraphic-to-Full-Sentence) module – the main component of this system, which transforms compressed, incomplete, and ill-formed Greek text into a full grammatical Greek sentence

• The output device – either a text-to-speech device or a written-output device (e-mail, printing device)

The two main components of the TtFS module are:

1. Pre-processor: assigns each word its part-of-speech and adds function words to the sequence.

The output of this component is a full but agrammatical sentence. This process itself has

three substeps:

(a) dividing the sentence into sub-clauses if it includes conjunction words

(b) adding missing articles to nouns located before the verb

(c) adding missing articles to nouns that follow the verb, according to their semantics

Verb transitivity features and noun semantics are encoded in the lexicon, as well as each

word’s morphological data.

2. Translator/Generator: applies a set of syntactic and semantic rules to the ungrammatical input and generates a well-formed Greek sentence.

This process has the following substeps:

(a) The lexicon is consulted to assign five features for each word: tense, case, gender, person,

and number.

(b) A set of syntactic patterns is checked to find the syntactic function of each word, or to add words if they are missing. For instance, in the absence of a subject, the pronoun

I is added. For subordinated clauses with no subjects, the subject of the main clause is

assumed.

(c) After the assessment of syntactic functions, the five features attached to each word

are processed to form the right structure of agreement (for each part of speech some

particular features are assigned, and the rest are NULL).

(d) Finally, words are inflected according to their features.

The system assumes a fixed word order (SVO) and the presence of all content words in the input. This approach combines Bliss symbols with syntactic and semantic parsing. It is quite


similar in scope and intention to the system we have presented. The method is quite different in

that it does not rely on an explicit NLG framework, and relies on specific pattern matching rules

for syntactic and semantic parsing. As in the previously reviewed systems, it does not attempt to intervene during the composition and selection stage; thus it improves only the output fluency, not the input rate.

9.4 pvi Intelligent Voice Prosthesis

The PVI ([Vaillant and Checler, 1995] [Vaillant, 1997]) system is a communication tool that aims

to expand a sequence of Bliss symbols into sentences in French. The underlying assumption of the system design is that a literal icon-to-word mapping is not sufficient to represent the meaning of the

desired message, nor can a Context Free Grammar distinguish the different structures that should

be generated from very similar input without semantic parsing, for example:

boat to eat (I eat in a boat)

steak to eat (I eat a steak)

Therefore, in order to achieve good automatic interpretation and re-generation in natural language from a sequence of symbols, semantic analysis must be performed to find the best words that convey the meaning of the icons, and a syntactic realizer is required to produce the full utterance in a natural language.

To prepare the system, a thorough corpus analysis was performed. The corpus was collected

from speech acts of children with Cerebral Palsy in the Kerpape Rehabilitation Centre in France,

in distinctive pragmatic situations (spontaneous speech, training situations, supervised communi-

cation sessions). The set of chosen icons became the basis of the lexicon.

The words were then analyzed through their paradigmatic dimension, and divided into taxemes

of semantic domains such as food, alimentation, movement, game, and more.

A syntagmatic analysis, i.e., of the occurrence of words in sequence, was also conducted. For example, this analysis identified that the to eat symbol subcategorizes for two case functions, the agent and the object.

Each icon was assigned a semantic content which includes its taxeme (classification item) and

the elementary features that are specific and unique for its taxeme. Each icon was also assigned


its own features that distinguish it from other icons, mostly binary features which constitute the

semantic primitives of the system. The set of features that belong to an icon and those of the word

that may represent it are not necessarily identical.

Once a sequence of icons is entered in the system, a semantic analysis process attempts to

recover its meaning by building a meaning representation of the utterance. First, the analyzer

scans the input sequence from left to right, searching for a predicative icon. Then, the rest of

the sequence is searched for an icon that can fill one of the free predicate slots, using a process of unification that matches the semantic features of its functors with those of the identified

arguments. There may be a recursive situation where an icon that was identified as a functor of

one predicative icon is predicative by itself, therefore, unification proceeds until all possible free

slots are instantiated. This process continues until the entire sequence is processed.

The compatibility of features during unification process is a binary relation of two types:

1. Inclusion: if the semantic constraint expressed by the case feature is mandatory, i.e., C(a,b)

= 1 if all features of a are present in the features of b and have the same value, 0 otherwise.

2. A scaled product of the two sets of features if more or less acceptable solutions are found. In

that case, C(a,b) = number of features of a with the same value as in b / the total number of

features in a
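Assuming each icon's semantic content is encoded as a flat attribute-value dictionary (the feature names below are hypothetical; PVI's actual data structures differ), the two compatibility measures can be sketched as:

```python
def inclusion(a: dict, b: dict) -> float:
    """Mandatory-constraint compatibility: C(a, b) = 1 if every feature
    of a is present in b with the same value, 0 otherwise."""
    return 1.0 if all(b.get(k) == v for k, v in a.items()) else 0.0

def scaled_product(a: dict, b: dict) -> float:
    """Graded compatibility: the fraction of a's features that b shares
    with the same value."""
    if not a:
        return 0.0
    matching = sum(1 for k, v in a.items() if b.get(k) == v)
    return matching / len(a)

# A hypothetical agent slot requiring an animate filler:
slot = {"animate": True}
icon = {"animate": True, "human": True}
print(inclusion(slot, icon))       # 1.0 (all of slot's features appear in icon)
print(scaled_product(icon, slot))  # 0.5 (only one of icon's two features matches)
```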

During the parsing process, all possible solutions are evaluated, each is scored using the above-mentioned measures, and the best solution is eventually chosen. The output of this process is a

linear form of a semantic network, based on the order given by the input utterance.

Since there is no one-to-one mapping between icons and words, or between the structure of the

semantic network and the syntactic structure, a lexical choice module is crucial before syntactic

realization. The output of the parsing process consists of semantic networks (semioms in the

author’s terminology), which are clusters containing semantic features. These clusters may not

match with any linguistic entity. PVI’s lexical choice strategy is either short-circuit, i.e., distinct

semantic entities are unified into a single word, or derivation, i.e., icons with rich content are

expressed with more than one word.

Once a lexical choice is accomplished, three mechanisms of syntactic realization are applied:

• word order determination


• inflection

• insertion of functional morphemes (determiners, prepositions, etc.)

The lexicon contains elementary syntactic trees representing possible phrase constructions. Each tree contains information about the lexeme corresponding to the scheme, and the morphosyntactic structure expressing its case structure. The semantic network is traversed following the semantic links. For every node (sememe), a corresponding elementary tree is selected; the trees are later assembled using:

• substitution (for mandatory functors)

• adjunction (for optional functors)

These two operations define a Tree Adjoining Grammar (TAG) [Joshi, 1987]. The system eventually generates the output in French and pronounces it with a voice synthesis device.

The PVI display (communication boards) includes the symbols presented with their correspond-

ing word and possibly a sound. In addition to the symbols, the display includes action buttons

that refer to the display parameters. PVI is designed for multiple access options: direct access through pointing devices, keyboard, or scanning with switches.

The system was evaluated [Vaillant, 1997] with a lexicon of 300 icons only, which is a very

limited set. The results show 80% success as measured by levels of correctness of both semantic

and syntactic analysis.

PVI is the system most similar to our system, differing mainly in the following aspects: we

rely on a standard meaning representation framework (the Conceptual Graph formalism) and the

operations CG provides, instead of the specific mechanism used in PVI; we use existing NLG

resources (lexical chooser, syntactic realization) and, in particular, a large-scale lexicon for both

Bliss symbols and for the English and Hebrew realization. Finally, as in the case of all the systems

discussed above, the approach of semantic authoring, as opposed to semantic parsing, allows us to

intervene in the input construction process.

9.5 cogeneration

Cogeneration [Copestake, 1996] [Copestake, 1997] was developed as a tool to enhance the communication of people who suffer from ALS (Amyotrophic Lateral Sclerosis) and who tend to prefer textual AAC means over symbolic communication.


The system combines template-based sentences, statistical information, and NLG techniques.

A set of pre-defined templates is stored and categorized by particular dialogue-situation labels. A user chooses a template and is offered a list of slots to fill. Some slots have default values (tuned by previous inputs); some slots are optional. Word or phrase prediction is available while instantiating a slot. The cogenerator combines the constraints on the given slots with syntactic constraints and statistical information, and eventually generates the desired output utterance.

For instance, in a template for request, the user may enter the sequence open kitchen window

so that the generated text will be Please, could you open the window, or, if an urgent label is

chosen, the output will be Open the kitchen window! The underlying structure can later instruct the voice synthesizer about the appropriate intonation of the utterance.
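A minimal sketch of the template-and-slots idea (the template format and pragmatic labels here are hypothetical illustrations, not Copestake's actual representation):

```python
# Hypothetical template store: a dialogue-situation label maps to
# pragmatic variants of a surface pattern with named slots.
TEMPLATES = {
    "request": {
        "polite": "Please, could you {action} the {object}?",
        "urgent": "{action_cap} the {object}!",
    },
}

def generate(label: str, action: str, obj: str, style: str = "polite") -> str:
    """Instantiate a template's slots and return the full utterance."""
    template = TEMPLATES[label][style]
    return template.format(action=action, object=obj,
                           action_cap=action.capitalize())

print(generate("request", "open", "kitchen window"))
# Please, could you open the kitchen window?
print(generate("request", "open", "kitchen window", style="urgent"))
# Open the kitchen window!
```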

The knowledge that is required for this process includes:

1. a set of application dependent and independent templates

2. statistical information about collocations and preferred items

3. a syntactic realizer, a lexicon, and syntactic structures

While input is being entered into the template, a word prediction program uses statistical information to find the most probable word and to offer completions. Compounds and collocations (such as kitchen window) are recognized so they are not split, and the right stress is applied in vocalization. However, in order to recognize collocations or compounds that were not seen earlier in a corpus, statistical information is backed off with lexical-semantic information.

The cogeneration system addresses the objective of speeding up entry rate using NLG and

machine-learning techniques. The techniques presented are mostly orthogonal to the approach of

semantic authoring we investigate. The integration of these techniques together with our approach

seems a promising avenue of research.

9.6 Summary

The four systems described above share the common ground of using NLG techniques for message

generation.

Our system, PVI, and the Greek system were implemented for Bliss symbols (they also claim

to be compatible with other symbolic languages). compansion was implemented for textual telegraphic input, and the Cogeneration system possesses characteristics of systems with prestored sentences, but since it combines NLG techniques with templates, it has evolved into an NLG system.

Although they differ in the above-mentioned respects, all systems share a common architecture with a typical information flow:

• Insertion of iconic/telegraphic input

• Identification of the internal semantic representation (through parsing or unification)

• Re-generation in natural language - lexical choice and syntactic realization

Our system differs from this approach in the processing method: the semantic representation

is built during the insertion of the symbols, and therefore no parsing is conducted. As discussed in previous chapters (see Chapter 3), parsing telegraphic text, especially with a free vocabulary, causes a variety of problems. Our system imposes a less natural manner of symbol choice and insertion, but the method of semantic authoring we investigate provides the potential to improve both input rate and output fluency, while avoiding most of the difficulty of semantic parsing inherent in post-processing approaches.


Chapter 10

Evaluation

The hardest task we encountered in evaluation was to determine what should be evaluated. For

both aspects of the system, NLG and AAC, the definition of measures for evaluation remains the

topic of ongoing discussion in both research communities.

An AAC system must address three main functions: allowing a user sufficient expressive power,

enhancing the rate of communication, and improving the ease of communication [Cornish and

Higginbotham, 2000a]. Expressive power depends on the set of symbols that are used and the vocabulary that is offered, and very much on the cognitive and physical abilities of the user. In addition, the scope of the vocabulary offered depends on the device and its limitations.

Rate enhancement is required since the average rate of communication of an AAC user is very low

relative to that of a speaking person.

Since our system is constructed from distinct modules, such as HUGG, the Bliss lexicon, the integrated verb lexicon, the ontology, and the SAUT authoring system, it would have been desirable to evaluate

each constituent by itself. Instead, we focused on an evaluation of the integration of these tools

into a single AAC application, which is the main focus of this work. As evaluation of the AAC

application proceeds, we intend to pursue more component evaluation as well.

The next two sections present the difficulties in evaluation of NLG and AAC systems. Section

10.3 discusses the aspects that are to be evaluated in our system. Section 10.4 presents an experi-

ment that was conducted to evaluate the SAUT system - as the closest simulation of the full AAC

scenario we intend to eventually support.


10.1 Evaluation of NLG systems

Evaluation of an NLG system is a difficult task, and there are still no definite criteria for doing so. The difficulties stem from several aspects of the nature of an NLG system, such as the various forms of input, which are affected by the purpose, domain, and knowledge sources to which the system refers.

It is also unclear what to evaluate in the output, as evaluation in terms of quality and coverage is not always appropriate.

[Dale and Mellish, 1998] discuss the main questions NLG evaluation raises, and divide the evaluation task into three main categories:

1. Evaluation of properties of the theory – measuring properties of the underlying linguistic theory, such as coverage and domain-independence

2. Evaluation of properties of the system – comparing characteristics of a system such as speed,

coverage and correctness with similar systems

3. Application potential – evaluating the system in an appropriate environment to determine

whether NLG provides a better solution than alternative systems.

[Bangalore et al., 1998] distinguish between:

• Intrinsic evaluation: judging quality criteria of the generated text and its adequacy relative

to the input. This is usually performed by asking human judges to evaluate these criteria

and assessing agreement among the judges. The key criteria tested are accuracy, fluency, and

coverage.

• Extrinsic (task-based) evaluation: judging the way the generated text helps people perform

specific tasks. For example, in our case, an extrinsic evaluation would consist of measuring

the time it takes an AAC user to order goods over a chat conversation with an on-line store.

• Comparative evaluation: comparing the performance of the system with similar systems, by

comparing the output (one system’s output is used as a benchmark or gold standard) and

the performance of the systems.


These three aspects were measured in [Miliaev et al., 2003] for a system producing technical in-

structions on how to operate electronic equipment. A similar large-scale evaluation was performed

in the AGILE system [Hartley et al., 2000].

When available, a corpus of data and parallel text representing it can serve as a basis for comparison, but such resources are most often unavailable. Even with a corpus-based technique, differences between the computer-generated texts and human-written ones can occur at the various levels of generation (such as misinterpretation of data, wrong lexicalization choices, or use of different syntactic structures). Evaluation in this case can be performed by human judges or measured in terms of post-editing

[Sripada et al., 2005]. Stochastic methods of generation, such as [Bangalore et al., 2000], enable evaluation methods that are more similar to the evaluation of natural language understanding systems. Two methods for quantitative evaluation were defined: string-based metrics and tree-based metrics. However, these metrics are only applicable when a corpus is available and dependency trees of the target sentences have been built.

[Callaway, 2003] [Callaway, 2005] provide thorough analysis of the coverage and performance

of the SURGE realization grammar. To this end, the author used the parsed corpus of the Penn

Treebank and automatically converted the syntactic parse tree to a SURGE input FD. SURGE

was then used to regenerate the sentences from these input FDs and the generated sentences were

compared with the original sentences. The analysis measures a coverage of 98.5% for SURGE; 69.3% of the sentences were generated with an exact match. In approximately 50% of the sentences without an exact match, the errors were caused by the transformation process of the inputs. The main errors caused by the surface realizer are syntactic (handling inversions, missing verb tense, mixed conjunctions, mixed types of NP modifiers, direct and indirect questions, mixed-level quotations, complex relative pronouns, and topicalization). The rest of the errors are due to mistaken ordering of the sentence constituents or wrong punctuation.

10.2 Evaluation of AAC systems

Evaluating an AAC system is also challenging. Like an NLG system, an AAC system is composed of several components, and it is not always clear which component is responsible for the results of an evaluation [McCoy and Hershberger, 1999]. Moreover, an AAC system is defined by

several aspects (such as the input language, selection method, processing technique, and its output,


see section 2.1.4), while a specific prototype AAC system may have to focus on one particular aspect.

As a consequence, it is often not possible to evaluate the full system performance.

In compansion, for example, a system that focused on processing the telegraphic input, the

evaluation was in terms of keystroke savings. Each root word was considered a keystroke and

inflection morphemes were considered additional keystrokes. The measurement of the system’s

performance was the ratio between the number of words in a full sentence and the number of

words that were inserted in the telegraphic message [McCoy and Hershberger, 1999]. However, it

was recognized that the evaluation must also refer to the quality of the text generated and the

adequacy of its meaning.
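Under this definition, the measure reduces to a simple ratio (an illustrative sketch, not the published implementation; the example words are hypothetical):

```python
def keystroke_savings_ratio(telegraphic: list[str], full_sentence: list[str]) -> float:
    """Ratio of words in the expanded full sentence to words the user
    actually entered in the telegraphic message (higher = more saved)."""
    return len(full_sentence) / len(telegraphic)

# e.g. a telegraphic "apple eat John" expanded to "John eats an apple"
ratio = keystroke_savings_ratio(
    ["apple", "eat", "John"],
    ["John", "eats", "an", "apple"],
)
print(round(ratio, 2))  # 1.33
```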

This kind of evaluation was performed in the pvi system. The evaluation in [Vaillant, 1997] refers to the number of utterances that were interpreted correctly, not to the enhancement of communication rate. [Vaillant, 1997] showed 80.5% acceptability of sentences generated from a

set of 300 Bliss icons. Sentences were considered acceptable if they managed to represent meaning

correctly (i.e., correct semantic analysis), but were not necessarily realized correctly (including

clumsy generation).

An additional point to consider is whether the evaluation is performed by AAC users. In [Higginbotham, 1995], it is argued that nondisabled subjects in AAC research evaluations, when appropriately used, can be easier and cheaper to recruit, and in some cases are viable and even preferred.

Higginbotham’s claims are that the methods of message production are not unique to AAC users

and analogous communication situations can be found for both disabled and nondisabled users.

Nondisabled subjects can contribute to the understanding of the cognitive processes underlying the

acquisition of symbol and device performance competencies.

On the other hand, when the pvi system was tried on AAC subjects (who suffered from cere-

bral palsy) some problems were encountered that could not have been detected with nondisabled

subjects:

1. Frustration from errors in generation

2. Lack of vocabulary

3. Interface adjustability

[McCoy and Hershberger, 1999] analyzed AAC user-therapist conversations to identify the variety of cases to be considered and processed in a message production system, as a basis for further

evaluation. However, the limitations of this method are also listed: previous knowledge of the

therapists affects sentence production; people may show changes in behavior when interacting with

a computer; telegraphic style may be intentional or it may be caused by lack of syntactic skills or

due to absence of syntactic markers on the display.

[McCoy and Hershberger, 1999] conclude that evaluation must be made by choosing a specific

population and the interface must be tailored to each individual in order to assure smooth usability

of the system. If a system claims to enhance communication rate, it must be realized that the rate

may be reduced (or enhanced) by factors that are not necessarily the novel processing method.

Moreover, a system may be found to slow communication but to increase literacy skills. As a possible solution, the authors suggest taking advantage of system components that were proven useful for a given population and changing only the tested component.

Rate measures have been expressed in terms of number of selections, switch activations, or linguistic units per time unit (minute or second), mostly for typing tasks [Cornish and Higginbotham,

2000b] and in non-interactive experimental environments.

[Cornish and Higginbotham, 2000b] offer a segmentation method to distinguish omission of small and large units in an utterance, in order to calculate rate measures with reference to the full sentence. Big units (BUs) are full phrases (such as a transitive verb with both subject and object, an adjunct prepositional phrase, or idiomatic expressions), and small units (SUs) are unique function words such as determiners or prepositions. The proposed analysis is to use BUs to determine communication efficiency by calculating BUs per time unit per user in an interaction or in a turn of interaction. SUs can be used to calculate message complexity as a BU/SU ratio. Omissions can be calculated per time unit as well. Calculating the efficiency gained by omission can consider the number of BUs and SUs in the message against its full-sentence interpretation [Cornish and Higginbotham, 2000b].
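The proposed measures can be sketched as follows (the data structure is a hypothetical illustration; the BU and SU counts would come from the segmentation described above):

```python
from dataclasses import dataclass

@dataclass
class Turn:
    big_units: int    # full phrases produced (BUs)
    small_units: int  # function words produced (SUs)
    seconds: float    # duration of the turn

def bu_rate(turn: Turn) -> float:
    """Communication efficiency: BUs produced per minute."""
    return turn.big_units / (turn.seconds / 60)

def complexity(turn: Turn) -> float:
    """Message complexity as a BU/SU ratio."""
    return turn.big_units / turn.small_units if turn.small_units else float("inf")

turn = Turn(big_units=4, small_units=2, seconds=120.0)
print(bu_rate(turn))     # 2.0 BUs per minute
print(complexity(turn))  # 2.0
```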

[Cornish and Higginbotham, 2000a] define a selection savings measurement and compare four

systems. In addition, four linguistic metrics to measure quality of message generation are defined.

Test utterances are taken from natural speech corpora. The metrics are:

1. Surface match – number of shared words between the corpus and the generated message.

Articles are excluded, and lexical match is calculated with another measure.


20:37:00 "I need "
20:37:05 "*[VOLUME UP]*"
20:37:06 "*[VOLUME UP]*"
20:37:07 "*[VOLUME UP]*"
20:37:14 "something "
20:37:16 "to drink "
20:37:19 "i"
20:37:20 "m"
20:37:28 "ediately"

Figure 10.1: Output of LAM [Hill et al., 2001]

2. Pragmatic function match – measures the ability of the generated message to fulfill the same communicative goal as the utterance from the corpus. Utterances were tagged with speech-act tags such as statement, reply, and answer; the percentage of matches was then calculated.

3. Lexical item match – measures relatedness between words in the source and produced utterances through relations such as synonymy, hypernymy, hyponymy, and coordinate terms (hyponyms of the same lexeme).

4. Perceived match – criteria given to nondisabled judges to rate (i) surface form match, (ii)

pragmatic function, and (iii) an overall match.

Measuring communication rate is done in many cases with the automated Language Activity Monitoring (LAM) performance tool, which is used to collect quantitative language data (see Figure

10.1). Fourteen measures such as average communication rate, peak communication rate, and

selection rate were defined [Hill et al., 2001] [Hill and Romich, 2001].

Measuring the usability of a system, à la [Cornish and Higginbotham, 2000a], can be done by (i) asking subjects to produce a set of utterances, (ii) giving subjects a general task which requires the generation of a message, (iii) giving subjects a device to be used as the communication tool in simulated situations, and (iv) observing use in real situations.

10.3 Evaluating our System

We evaluate our system as an AAC application for message generation from communication boards.

From an NLG evaluation perspective, this corresponds to an intrinsic evaluation.

Since the prototype of our system is not yet adjusted to interact with alternative pointing

devices, we could not test it on actual Bliss users, and could not perform a full extrinsic (task-

based) evaluation.


Moreover, the approach we offer for message generation is novel and requires a user to plan a sentence in advance. It cannot be directly compared with existing NLG-AAC systems. In any case, as we have shown above, there is no uniformity in the evaluation techniques of such systems.

[McCoy et al., 1998] discuss the possible usability of incremental generation of a message in a system designated for children, and assume that the need to plan a message in advance and the cognitive load of the possible Icon Prediction would be too much of a burden on the user.

We have evaluated the use of semantic authoring on nondisabled subjects and can give an

approximation of the possible learning curve and usability of the system in general.

Section 10.4 presents an evaluation of the SAUT system which provides a good indicator of the

usability potential of our AAC system.

Section 10.5 defines a detailed evaluation scenario for calculating efficiency (or enhancement rate), to be carried out in the future.

10.4 Evaluating SAUT

The objectives of the SAUT authoring system are to provide the user with a fast, intuitive, and

accurate way to compose semantic structures that represent the meaning s/he wants to convey, and then to present that meaning in various natural languages. Therefore, an evaluation of these aspects

(speed, intuitiveness, accuracy, and coverage) is required, and we have conducted an experiment

with human subjects to measure them. The experiment measures a snapshot of these parameters at

a given state of the implementation. In the error analysis, we have separated parameters which depend on specifics of the implementation from those which require essential revisions to the approach followed by SAUT.

10.4.1 User Experiment

We conducted a user experiment in which ten subjects (all computer science students) were given three to four recipes in English

(all taken from the Internet) from a total pool of ten. The subjects had to compose semantic

documents for these recipes using SAUT. The ontology and lexicon for the specific domain of

cooking recipes were prepared in advance, and we tested the tool by composing these recipes with

the system. The documents the authors prepared are later used as a 'gold standard' (we refer to


Document #    Average time to author
1             36 min
2             28 min
3             22 min
4             14 min

Table 10.1: Learning time measures of recipe writing in SAUT

them as reference documents). The experiment was managed as follows: first, a short presentation

of the tool (20 minutes) was given. Then, each subject received a written interactive tutorial

which took approximately half an hour to process. Finally, each subject composed a set of 3 to 4

documents. The overall time taken for each subject was 2.5 hours.

10.4.2 Evaluation

We have measured the following aspects of the system during the experiment.

Coverage – answers the questions "can I say everything I mean" and "how many of the possible

meanings that can be expressed in natural language can be expressed using the input language." In

order to check the coverage of the tool, we examined the reference documents. We compared the

text generated from the reference documents with the original recipes and checked which parts of

the information were included, excluded, or expressed in a partial way with respect to the original.

We counted each of these in number of words in the original text, and expressed these three counts

as a percentage of the words in the original recipe. We summed the result as a coverage index

which combined the three counts (correct, missing, partial) with a factor of 70% for the partial

count.
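As a sketch, the coverage index described above can be computed as follows; the exact combination formula, beyond the stated 70% credit for partial material, is an assumption of this illustration:

```python
def coverage_index(correct, partial, missing, total_words):
    """Combine the three word counts into a single coverage percentage.

    Partial material is credited at 70%, as described in the text; the
    exact combination formula is an assumption of this sketch.
    """
    assert correct + partial + missing == total_words
    covered = correct + 0.70 * partial
    return 100.0 * covered / total_words

# Hypothetical counts for a 100-word recipe:
print(round(coverage_index(85, 10, 5, 100), 1))  # → 92.0
```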

The results were checked by two experts independently and we report here the average of these

two verifications. On a total of 10 recipes, containing 1024 words overall, the coverage of the system

was 91%. Coverage was uniform across recipes and judges. We performed error analysis for the

remaining 9% of the uncovered material, as described below.

Intuitiveness – to assess the ease of use of the tool, we measured the learning curve for

users encountering the system for the first time, recording the time it took to author each successive

document (1st, 2nd, 3rd, 4th). For the 10 users first facing the tool, the time it took to author the

documents is shown in Table 10.1.

The time distribution among 10 users was extremely uniform. We did not find variation in


Semantic Authoring Time    Translation Time
14 minutes                 6 minutes

Table 10.2: Translation vs. semantic authoring time.

the quality of the authored documents across users and across document numbers. The tool is

mastered quickly by users with no prior training in knowledge representation or natural language

processing. Composing the reference documents (approximately 100-word recipes) by the authors

took an average of 12 minutes.

Speed – we measured the time required to compose a document as a semantic representation,

and compared it to the time taken to translate the same document into a different language. We

compared the average time for trained users to author a recipe (14 minutes) with the average time

taken by two trained translators to translate 4 recipes from English to Hebrew (see Table 10.2).

The comparison is encouraging – it indicates that a tool for semantic authoring could become

cost-effective if it is used to generate two or three languages.
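The break-even point can be sketched with simple arithmetic. The 14-minute authoring and 6-minute translation figures come from Table 10.2; the 7-minute direct-writing figure is an assumption derived from the observation that semantic data entry runs at about half the rate of natural-language entry:

```python
# Illustrative comparison of manual translation vs. one-time semantic authoring.
authoring = 14    # minutes to author one recipe semantically (Table 10.2)
translation = 6   # minutes per manual translation of a recipe (Table 10.2)
writing = 7       # assumed minutes to write the recipe directly in one language

for languages in (1, 2, 3):
    manual = writing + (languages - 1) * translation  # write once, translate the rest
    print(languages, manual, authoring)
# prints:
# 1 7 14
# 2 13 14
# 3 19 14  -- with three target languages, manual production exceeds authoring
```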

The rate of data entry with semantic structures is about half the rate of data entry in natural

language for non-disabled users. While this factor of two in slowdown may sound severe for an

AAC context, it is in fact very small compared to the other factors that slow down disabled users

when selecting symbols. Since the method of semantic authoring focuses on checking the validity

of input structures at data entry time, it may in fact speed up selection time – as is investigated

specifically in the section below.

Accuracy – we analyzed the errors in the documents prepared by the 10 users according to

the following break-down:

• Words in the source document not present in the semantic form

• Words in the source document presented inaccurately in the semantic form

• User errors in semantic form that are not included in the former two parameters

We calculated the accuracy for each document produced by the subjects during the experiment.

Then we compared each document with the corresponding reference document (used here as a gold

standard). Relative accuracy of this form estimates a form of confidence: "how sure can the user

be that s/he wrote what s/he meant?" This measurement depends on the preliminary assumption


Document #    Accuracy
1             93%
2             92%
3             95%
4             90%

Table 10.3: Accuracy percentage of four documents written in SAUT

Error type          Percentage
User error          44%
Ontology deficit    23%
Tool limitations    33%

Table 10.4: Error analysis in subjects' generated documents.

that for a given recipe, any two readers (in the experimental environment – including the authors)

will extract similar information. This assumption is warranted for cooking recipes. This measure

takes into account the limitations of the tool and reflects the success of users in expressing all that

the tool can express.
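As an illustration, relative accuracy of this form can be computed by tallying the three error classes against the word count of the reference document; the counts below are hypothetical, and treating accuracy as one minus the error fraction is an assumption of this sketch:

```python
def relative_accuracy(missing, inaccurate, other_errors, total_words):
    """Percentage of source words correctly captured in the semantic form.

    The three error classes follow the break-down in the text; treating
    accuracy as 1 - (errors / words) is an assumption of this sketch.
    """
    errors = missing + inaccurate + other_errors
    return 100.0 * (1 - errors / total_words)

# Hypothetical tally for a 100-word recipe: 3 missing words,
# 2 inaccurately rendered words, 2 other user errors.
print(round(relative_accuracy(3, 2, 2, 100), 1))  # → 93.0
```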

As Table 10.3 shows, accuracy is quite consistent during the experiment sessions, i.e., it does

not change as practice increases. The average accuracy of 92.5% is quite high.

We have categorized the errors found in subjects’ documents in the following manner (see Table

10.4):

• Content can be accurately expressed with SAUT (user error)

• Content could be accurately expressed with changes in SAUT's lexicon and ontology (ontology deficit)

• Content cannot be expressed in the current implementation, and requires further investigation

of the concept (implementation and conceptual limitations)

This breakdown indicates that the tool can be improved by investing more time in the GUI

and feedback quality and by extending the ontology. The difficult conceptual issues (those which

would require major design modifications, or call into question our choice of formalism for knowledge

encoding) represent 33% of the errors – overall accounting for 2.5% of the words in the word count

of the generated text.


10.5 Evaluating Efficiency

Since the system is not yet in a position to be tested with monitoring tools, it is possible to measure

only selection savings.

We can, however, estimate the keystroke savings of the system (a full evaluation will be done in

the future) by defining a detailed evaluation scenario.

For this estimation, we have collected a set of sentences written in Bliss, found at

http://www.blissymbolics.org/canada/readingroom/english/text/filip (available September 2005).

This site has a collection of sentences written in Bliss and English (and vocalized).

Table 10.5 shows a set of 19 sentences as they appear on the Internet site, together with the SAUT input

specification language representation as we have authored it. The second column shows the number

of words in the original sentence, and the fourth the number of steps needed for generating the

parallel representation in the SAUT language (each step is a choice point, i.e., a dot, comma, or

space operation of the SAUT system). As can be seen, the total number of choice steps is

133, while the total number of words in the sentences is 122.

However, counting the number of words does not account for morphology, which in Bliss symbols

requires additional choices. We therefore recounted the words in the sentences, treating morphology

markers of inflections as additional words, all summing to 138, as was suggested in [McCoy and

Hershberger, 1999].
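For instance, sentence 2 of Table 10.5 illustrates the counting scheme (the per-sentence figures below follow the table; treating each inflection marker as one extra Bliss selection follows [McCoy and Hershberger, 1999]):

```python
# Illustrative counts for sentence 2 of Table 10.5:
# "In the house there are many rooms" – 7 words, one plural marker in Bliss.
words = 7
morph_markers = 1                          # plural marker on "rooms"
bliss_selections = words + morph_markers   # direct Bliss data entry
saut_steps = 7                             # choice steps for the SAUT representation

ratio = saut_steps / bliss_selections
print(bliss_selections, saut_steps, round(ratio, 2))  # → 8 7 0.88
```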

This simple ratio shows no keystroke saving from our system. Savings, there-

fore, must be calculated in terms of narrowing the choice possibilities at each step of the process.

Assuming a display with 50 symbols (and additional keys for functions), a vocabulary of

approximately 2,500 symbols requires 50 different screens, whether symbols are organized by frequency (first screens

present the most frequently used words) or by semantic domain.

The overall number of selections is reduced using our communication board since the selectional

restrictions narrow the number of possible choices that can be made at each step. The extent to

which selection time can be reduced at each step depends on the application domain and the

ontology structure. We cannot evaluate it in general, but expect that a well-structured ontology

could support efficient selection mechanisms, by grouping semantically related symbols in dedicated

displays.
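The arithmetic behind the screen counts above, and the effect of selectional narrowing, can be sketched as follows (the 120-candidate figure is hypothetical):

```python
import math

# A static display shows 50 symbols per screen; the Hebrew Bliss lexicon
# contains approximately 2,500 symbols.
vocabulary = 2500
per_screen = 50
screens = math.ceil(vocabulary / per_screen)
print(screens)  # → 50

# If selectional restrictions narrow the candidates after a verb to, say,
# 120 instrument symbols (a hypothetical figure), far fewer screens need
# to be scanned for the next selection:
candidates = 120
print(math.ceil(candidates / per_screen))  # → 3
```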

This point raises two issues: it is unclear to what extent selection speed is affected by physical


disability and by cognitive factors. An ontologically motivated selection mechanism needs to be

adapted both to the cognitive processes of the user and to his/her physical disabilities.

Further progress on this issue will require empirical tests with disabled users in the context of

a task-based evaluation.

10.6 Summary

In this section, we have reviewed evaluation strategies for both NLG and AAC systems. Both

fields struggle with similar issues to define evaluation metrics that can be reproduced and can drive

system improvement in a predictable manner.

We have presented two aspects of the evaluation of the AAC system we developed: we first

performed a user evaluation of the coverage, efficiency, and usability of the semantic authoring ap-

proach as implemented in the SAUT system. This evaluation has been performed with non-disabled

users in the domain of cooking recipes, and shows that authoring of semantic expressions, which

can then be used for multilingual generation, requires about twice as much time as writing text in

a natural language; usability is high, even on the rough software prototype we have implemented;

coverage was good, given a domain-specific ontology.

We then established a detailed evaluation scenario of the potential rate of data entry of the

system by analyzing a small corpus of Bliss sentences. We compared a direct Bliss data entry process

with our semantic authoring approach and counted the selection steps required; the two require

approximately the same number of selection steps. The semantic authoring approach, however, can generate fluent

output in other languages (English and Hebrew), beyond the Bliss sequence, without requiring

noisy translation.

speed up each selection step – but this claim must be assessed empirically in a task-based extrinsic

evaluation, which remains to be done in the future.


#    #Words  #Morph  #Steps  Source sentence                               | SAUT representation
1    5       5       3       I live in a house                             | Live(#I, House)
2    7       8       7       In the house there are many rooms             | Exists(Location.In(#house), Room.Quantity(many))
3    8       8       9       In the kitchen we make food and eat           | Make(#we food), Eat(#we #food), Location.In(kitchen)
4    7       8       10      The kitchen is yellow with blue doors         | #kitchen.Attribute.Color(yellow).Have(door.plural.Attribute.Color(blue))
5    5       6       3       Pablo and I are playing                       | Play(#Pablo, #I)
6    4       5       4       We are watching television                    | Watch(#Pablo,#I Television)
7    5       7       7       We are eating chocolate buns                  | Eat(#Pablo,#I Bun.Plural.Attribute(chocolate))
8    6       7       3       The bed stands in the bedroom                 | Stand(Bed, Bedroom)
9    6       6       5       The bedroom has a green floor                 | Have(#bedroom, Floor.Color(green))
10   9       9       8       In the bedroom I sleep and play with Pablo    | Sleep(#I), Play(#I #Pablo).Location.In(Bedroom)
11   5       5       5       We have a special playroom                    | Have(#we, Playroom.Attribute(special))
12   9       10      6       The playroom has blue walls and a blue floor  | Have(#Playroom walls,floor.Color(blue))
13   6       7       3       The computer stands in the playroom           | Stand(computer, #Playroom)
14   7       8       8       On the veranda we have many flowers           | Have(#we, Flower.Quantity(many)).Location.On(Veranda)
15   6       6       5       In the autumn the flowers die                 | Die(Flower.Plural).Time(autumn)
16   5       6       4       I watched football on television              | Watch(#I football TV)
17   8       9       7       Today I am going to Heikleivvegen by taxi     | Go(#I, Location.Name(Heikleivvegen)).Manner(taxi)
18   9       10      10      On Tuesday I played with Pablo in our room    | Play(#I #Pablo).Location.In(room.Possessor(#we)).Time(Tuesday)
19   5       6       6       We played on the swing                        | Play(#I #Pablo).Location.On(swing)
Sum  122     137     133

Table 10.5: Sentences vs. SAUT representation, number of words


Chapter 11

Contributions and future work

The design of an NLG system for AAC purposes must consider the special characteristics of an

augmentative communication device: it is a domain-independent system whose vocabulary is

determined by the symbol set that is used. The graphic design must consider possible selection

methods (direct or via scanning). Since not all symbols/vocabulary can be accessed directly, there

should be efficient ways to make them accessible to the user when needed. Moreover, since a symbol can

refer to a concept, and therefore be realized with more than one specific word, the lexical chooser

should find the most appropriate word with the use of collocational data. The user should be able

to control syntactic structures with minimum effort, and the system should allow the possibility of

doing so.

The system was implemented for the set of approximately 2500 symbols found in the Hebrew

Bliss lexicon [Shalit et al., 1992], but the automated tools enable changes and expansions of the

vocabulary (and possibly a change of symbol set to PCS or Rebus).

The first steps of the process are language-independent and can be used in our case for both

Hebrew and English.

The core of this work is an integration of available resources into a new approach for generation

from symbolic input, with multilingual generation in mind.

The considerations in the overall compilation are multifold:

1. Implementing a Bliss dynamic display for AAC purposes, while enhancing communication

rate.


2. Reducing errors in a symbols-to-text process that originate from parsing telegraphic text.

3. Wide-coverage, domain-dependent lexicon for generation.

4. Hebrew generation of text, with reference to English generation (as a basis of a multilingual

generation system).

This work presents an NLG-AAC system that generates text from a sequence of symbols without

the need for parsing.

For the development of the communication board, we implemented several systems which are

interrelated for this purpose but can be reused in other novel systems.

11.1 Bliss symbols lexicon

We have designed and implemented a Bliss lexicon for both Hebrew and English, which can be used

either as a stand-alone lexicon for reference or as part of an application. In this work, it is used

for representing symbols in our communication board, but in the future it will also be combined in

an editor (in the "writing with symbols" style).

The design of the lexicon takes advantage of the unique properties of the

language. Technically, only a set of atomic shapes is physically drawn, while combined symbols

are generated automatically, following the symbol's entry in a database that was constructed from

the Hebrew and English Bliss dictionaries. The lexicon was implemented in a way that allows

searches by text (a word), by semantic components (e.g., "all symbols that contain a

wheel"), or by form (e.g., "all symbols that contain a circle").
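A minimal sketch of these three search modes over a toy lexicon (the entries and the component/shape names here are illustrative, not the actual database schema):

```python
# Toy Bliss lexicon: each word maps to the semantic components and
# atomic graphic shapes of its symbol (illustrative entries only).
LEXICON = {
    "wheelchair": {"components": ["chair", "wheel"], "shapes": ["circle", "line"]},
    "bicycle":    {"components": ["wheel", "wheel"], "shapes": ["circle", "circle"]},
    "house":      {"components": ["protection"],     "shapes": ["pointed roof"]},
}

def by_word(word):            # search by text
    return [word] if word in LEXICON else []

def by_component(component):  # search by semantic component
    return [w for w, e in LEXICON.items() if component in e["components"]]

def by_shape(shape):          # search by graphic form
    return [w for w, e in LEXICON.items() if shape in e["shapes"]]

print(by_component("wheel"))  # → ['wheelchair', 'bicycle']
print(by_shape("circle"))     # → ['wheelchair', 'bicycle']
```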

As a byproduct, this implementation allows a visual inspection of words' connectivity,

and in the future we will compare word relatedness as derived from the Bliss lexicon vs.

connectivity in other lexical knowledge bases such as WordNet.

11.2 HUGG

HUGG is the only syntactic realizer (SR) written for Hebrew generation. HUGG is implemented

with FUF, and its inputs are designed to be as similar as possible to the inputs of the English SR

SURGE.


The grammar, in its current state, is designed to generate simple clauses, with special

care given to the realization of relations (possessives, existentials, attributives, and locatives).

11.3 Integration of a large-scale, reusable lexicon with a natural

language generator

We have integrated a large-scale, reusable lexicon with FUF/SURGE as a tactical component, so

that the knowledge encoded in the lexicon can be reused, and to automate to some extent the

development of the lexical realization component in a generation application.

The integration of the lexicon with FUF/SURGE also brings other benefits to generation,

including the possibility of accepting a semantic input at the level of WordNet synsets, the

production of lexical and syntactic paraphrases, the prevention of non-grammatical output, reuse

across applications, and wide coverage.

We have presented the process of integrating the lexicon with FUF/SURGE, including how to

represent the lexicon in FUF format, how to unify input with the lexicon incrementally to generate

more sophisticated and informative representations, and how to design an appropriate semantic

input format so that the integration of the lexicon and FUF/SURGE can be done easily.

11.4 SAUT

SAUT [Biller, 2005] [Biller et al., 2005] is an authoring system for logical forms encoded as con-

ceptual graphs (CG). The system belongs to the family of WYSIWYM (What You See Is What

You Mean) text generation systems: logical forms are entered interactively and the corresponding

linguistic realization of the expressions is generated in several languages. The system maintains a

model of the discourse context corresponding to the authored documents.

The system helps users author documents formulated in the CG format. In a first stage, a

domain-specific ontology is acquired by learning from example texts in the domain. The ontology

acquisition module builds a typed hierarchy of concepts and relations derived from WordNet

and VerbNet.

The user can then edit a specific document by entering utterances in sequence and maintaining

a representation of the context. While the user enters data, the system performs the standard


steps of text generation on the basis of the authored logical forms: reference planning, aggregation,

lexical choice, and syntactic realization – in several languages (we have implemented English and

Hebrew, and are exploring an implementation using the Bliss graphical language). The feedback

in natural language is produced in real-time for every modification performed by the author.

11.5 Communication Board

The purpose of this work was to design an NLG symbols-to-text system for AAC purposes. In

the design of an AAC system, the main motivation is to provide the user with a communication

tool that enables as high a rate of communication as possible, alongside as wide an expressive power

as possible. Using NLG techniques is motivated when a telegraphic

text is taken as the input to the generation system, saving the user avoidable keystrokes for function words

(determiners, prepositions) and for morphology (such as inflections and plural markers).

The display we designed was inspired both by the semantic authoring technique as implemented

in SAUT and by dynamic displays as studied by [Burkhart, 2005].

The symbols displayed on the screen at each step of symbol insertion depend on the context

of the symbols previously entered. For example, if the previous symbol was a verb which requires an

instrumental theme, only symbols that can function as instruments are presented on the current

display. A general context for each utterance or conversation can be determined by the user, thereby

narrowing the diversity of symbols displayed.
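The context-dependent filtering described above can be sketched as follows; the ontology fragment and role names are hypothetical, not the actual SAUT ontology:

```python
# Hypothetical fragment of an ontology with selectional restrictions:
# each verb lists the semantic types that may fill its roles.
SYMBOLS = {
    "eat":   {"pos": "verb", "roles": {"theme": "food"}},
    "cut":   {"pos": "verb", "roles": {"instrument": "tool"}},
    "knife": {"pos": "noun", "type": "tool"},
    "bread": {"pos": "noun", "type": "food"},
    "happy": {"pos": "adj",  "type": "feeling"},
}

def next_display(previous_symbol):
    """Return only the symbols that can fill a role of the previous verb."""
    entry = SYMBOLS.get(previous_symbol, {})
    allowed = set(entry.get("roles", {}).values())
    if not allowed:                      # no restrictions: show everything
        return sorted(SYMBOLS)
    return sorted(s for s, e in SYMBOLS.items() if e.get("type") in allowed)

print(next_display("cut"))  # → ['knife']
print(next_display("eat"))  # → ['bread']
```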

11.6 Future Work

The system presented here is a prototype, and there are various issues that still need to be

investigated and developed.

Lexicons Since there are no fully implemented lexical resources such as WordNet, VerbNet,

or Comlex for Hebrew, the lexical data is hand-coded and cannot be as comprehensive as

its English counterparts. An ongoing project in Hebrew computational linguistics is currently being

conducted (the Knowledge Center for Processing Hebrew of the Ministry of Science in Israel

- http://mila.cs.technion.ac.il/), including a Hebrew lexicon of words. However, this

lexicon was designed for morphological analyzers, and the information does not always answer

the needs of text generation. We intend to develop a VerbNet database for Hebrew verbs.

Another lexical issue is standardizing the meaning of symbol sets such as Blissymbols,

PCS, and Rebus with reference to lexical knowledge bases such as WordNet. From a practical

point of view, to use this system with another set of symbols, such as the more common PCS,

the ontology, which is based on the synsets of the Bliss symbols, will have to be rebuilt and

adjusted to the PCS symbols. Moreover, since the mapping between Bliss symbols and the

WordNet senses was done by the author, it could be judged differently by other subjects.

Bliss Symbols and Communication Board In the Bliss symbols language, an indicator can

change the part of speech of the word that a symbol refers to. For instance, adding an

evaluation indicator to the symbol for electricity shifts the meaning of the symbol to

electric. In the current version of the lexicon, these two possible meanings of the symbol

must be hard-coded. However, adding to the system a morphological module that can handle

derivations (and not only inflections) will enable a more creative use of the symbols.

An additional application of the Bliss lexicon is an editor, in the Writing In Symbols style,

where Hebrew text is entered and the Bliss symbols are displayed above it. This kind of

application requires morphological analysis of Hebrew in order to identify suffixes and prefixes

and to find the root of a verb, the tense, and other possible inflections.

The display will be tuned and tested for access with existing selection devices. At present,

we have not implemented any access alternative except direct selection

with a mouse. In addition, we did not address voice output, which is a very important

component of a communication board. NLG text-to-speech systems use the deep information

of the sentence structure for determining intonation. Moreover, the complexity of

morphological analysis in Hebrew, when text is processed into synthesized speech, can be avoided if

the information about the words does not have to be inferred but is given explicitly.

Processing Techniques As works on prestored sentences have shown ([Waller et al., 2000b],

[Vanderheyden and Pennington, 1998]), using prestored messages is efficient in several contexts.

Integrating techniques of prestored sentences (and logging utterances online for future

use) can make the system more usable. Moreover, applying machine learning techniques to

the history of text generation of a single user can make the prediction more accurate (by

updating frequencies, for instance).

Evaluation The discussion on evaluation of NLG and AAC systems in Chapter 10 surveyed several

possible methods of evaluation. We have evaluated our system in AAC terms, while

evaluating it as an NLG system would require separate measures for each component, such as

the syntactic realizer or the lexicon coverage. In addition, a subject-based evaluation of use

and communication rate should be conducted with real AAC users.

Since several evaluation measures are common to the two research areas – such

as the pragmatic function match defined in [Cornish and Higginbotham, 2000a] and

the extrinsic evaluation offered by [Bangalore et al., 1998] – an evaluation that satisfies

both criteria is possible.


Bibliography

[Andreasen et al., 1998] Andreasen, P., Waller, A., and Gregor, P. (1998). Blissword – full

access to blissymbols for all users. In Proceedings of the 8th Biennial Conference of the Int.

Society for AAC, pages 167–168, Dublin, Ireland. ISAAC.

[ASHA, 1991] ASHA (1991). Report: Augmentative and alternative communication. American

Speech-Language-Hearing Association, 33(Suppl. 5):9–12.

[Baker, 1984] Baker, B. (1984). Semantic compaction for sub-sentence vocabulary units compared

to other encoding and prediction systems. In Proceedings of the 10th Conference on

Rehabilitation Technology, pages 118–120, San Jose, California. RESNA.

[Baker et al., 1998] Baker, C. F., Fillmore, C. J., and Lowe, J. B. (1998). The Berkeley

FrameNet project. In Proceedings of COLING-ACL, Montreal, Canada.

[Baldwin, 1995] Baldwin, F. B. (1995). CogNIAC: A Discourse Processing Engine. PhD

thesis, University of Pennsylvania, Department of Computer and Information Sciences.

[Bangalore et al., 2000] Bangalore, S., Rambow, O., and Whittaker, S. (2000). Evaluation

metrics for generation. In Proceedings of the First International Natural Language

Generation Conference (INLG2000), Mitzpe Ramon, Israel.

[Bangalore et al., 1998] Bangalore, S., Sarkar, A., Doran, C., and Hockey, B.-A. (1998).

Grammar and parser evaluation in the XTAG project. In Proceedings of Workshop on

Evaluation of Parsing Systems, Granada, Spain.

[Barzilay et al., 1999] Barzilay, R., McKeown, K., and Elhadad, M. (1999). Information fusion

in the context of multi-document summarization. In Proceedings of ACL'99, Maryland.

ACL.

137

Page 155: Semantic Authoring for Blissymbols Augmented Communication ...yaeln/papers/yael-netzer-thesis.pdf · by the partner of the interaction. If technology is present, artiflcial voice

[Bateman, 1997] Bateman, J. (1997). KPML Development Environment: multilingual linguistic

resource development and sentence generation. GMD, IPSI, Darmstadt, Germany.

www.darmstadt.gmd.de/publish/komet/kpml.html.

[BCI, 2004] Blissymbolics Communication International (BCI) (2004). The fundamental rules of Blissymbolics: creating new

Blissymbolics characters and vocabulary. Technical report, BCI.

[Bentur et al., 1992] Bentur, E., Angel, A., and Segev, D. (1992). Computerized analysis of

Hebrew words. Hebrew Linguistics, 36:33–38. in Hebrew.

[Beukelman and Mirenda, 1998] Beukelman, D. R. and Mirenda, P. (1998). Augmentative

and Alternative Communication - Management of Severe Communication Disorders in

Children and Adults. Paul H. Brookes Publishing Co., second edition.

[Biller, 2005] Biller, O. (2005). Semantic authoring for multilingual text generation. Master’s

thesis, Department of Computer Science, Ben Gurion University, Israel.

[Biller et al., 2005] Biller, O., Elhadad, M., and Netzer, Y. (2005). Interactive authoring of

logical forms for multilingual generation. In Proceedings of the 10th European workshop on

natural language generation, Aberdeen, Scotland.

[Bliss, 1965] Bliss, C. K. (1965). Semantography (Blissymbolics). Semantography Press, Sydney.

[Boissiere, 2003] Boissiere, P. (2003). An overview of existing writing assistance systems. In

French-Spanish Workshop on Assistive Technology.

[Burkhart, 2005] Burkhart, L. J. (2005). Designing dynamic displays for the beginning

communicator. http://www.lburkhart.com/.

[Callaway, 2003] Callaway, C. (2003). Evaluating coverage for large symbolic NLG grammars.

In Proceedings of the 18th International Joint Conference on Artificial Intelligence (IJCAI

2003), pages 811–817, Acapulco, Mexico.

[Callaway, 2005] Callaway, C. (2005). The types and distributions of errors in a wide coverage

surface realizer evaluation. In Proceedings of the 10th European Workshop on Natural

Language Generation, Aberdeen, Scotland.

138

Page 156: Semantic Authoring for Blissymbols Augmented Communication ...yaeln/papers/yael-netzer-thesis.pdf · by the partner of the interaction. If technology is present, artiflcial voice

[Callaway et al., 1999] Callaway, C. B., Daniel, B., and Lester, J. C. (1999). Multilingual

natural language generation for 3d learning environments. In Argentine Symposium on

Artificial Intelligence, Buenos Aires, Argentina. (to appear).

[CallCentre, 1998] CallCentre (1998). Augmentative Communication in Practice: Scotland -

An Introduction. http://callcentre.education.ed.ac.uk, second edition.

[Canning et al., 2000] Canning, Y., Tait, J., Archibald, J., and Crawley, R. (2000). Cohesive

regeneration of syntactically simplified newspaper text. In 1st Workshop on Robust Methods

in Analysis of Natural language Data.

[Carroll et al., 1998] Carroll, J., Minnen, G., Canning, Y., Devlin, S., and Tait, J. (1998).

Practical simplification of English newspaper text to assist aphasic readers. In AAAI-98

Workshop on Integrating Artificial Intelligence and Assistive Technology, Madison,

Wisconsin. Preliminary research report.

[Claypool et al., 1998] Claypool, T., Ricketts, I., Gregor, P., Booth, L., and Palazuelos, S.

(1998). Learning rates of a tri-gram based Gaelic word predictor. In Proceedings of the

8th Biennial Conference of the International Society for Augmentative and Alternative

Communication, pages 177–178, Dublin, Ireland. ISAAC.

[Coch, 1998] Coch, J. (1998). Interactive generation and knowledge administration in MultiMeteo.

In Proceedings of the 9th INLG Workshop, pages 300–303, Canada.

[Copestake, 1996] Copestake, A. (1996). Applying natural language processing techniques to

speech prostheses. In Working Notes of the 1996 AAAI Fall Symposium on Developing

Assistive Technology for People with Disabilities.

[Copestake, 1997] Copestake, A. (1997). Augmented and alternative NLP techniques for

augmentative and alternative communication. In Proceedings of the ACL workshop on

Natural Language Processing for Communication Aids, Madrid.

[Copestake and Flickinger, 1998] Copestake, A. and Flickinger, D. (1998). Evaluation of NLP

technology for AAC using logged data. In Loncke, F., Clibbens, J., and Lloyd, L., editors, ISAAC 98 research symposium proceedings, London. Whurr Publishers.


[Cornish and Higginbotham, 2000a] Cornish, J. and Higginbotham, D. J. (2000a). AAC de-

vice testing. http://www.cadl.buffalo.edu/download/DeviceTesting.pdf. CADL Working

papers (2000:2, rev 1.).

[Cornish and Higginbotham, 2000b] Cornish, J. L. and Higginbotham, D. J.

(2000b). Tool for evaluating communication rate in interactive contexts.

http://www.cadl.buffalo.edu/download/BigUnits1.pdf. CADL Working papers (2000:2,

rev 1.).

[Dahan-Netzer, 1997] Dahan-Netzer, Y. (1997). Design and evaluation of a functional

input specification language for the generation of bilingual nominal expressions (He-

brew/English). Master’s thesis, Department of Computer Science, Ben Gurion University,

Beer-Sheva, Israel. (in Hebrew).

[Dahan-Netzer and Elhadad, 1998a] Dahan-Netzer, Y. and Elhadad, M. (1998a). Generat-

ing determiners and quantifiers in Hebrew. In Proceedings of the Workshop on Computational

Approaches to Semitic Languages, Montreal, Canada. ACL.

[Dahan-Netzer and Elhadad, 1998b] Dahan-Netzer, Y. and Elhadad, M. (1998b). Genera-

tion of noun compounds in Hebrew: Can syntactic knowledge be fully encapsulated? In

Proceedings of INLG’98, pages 168–177, Niagara-on-the-Lake, Canada.

[Dahan-Netzer and Elhadad, 1999] Dahan-Netzer, Y. and Elhadad, M. (1999). Bilingual

Hebrew-English generation of possessives and partitives: Raising the input abstraction

level. In Proceedings of the 37th Annual Meeting of ACL.

[Dale and Mellish, 1998] Dale, R. and Mellish, C. (1998). Towards the evaluation of natural

language generation. In Proceedings of the First International Conference on Evaluation

of Natural Language Processing Systems, Granada, Spain.

[Delin et al., 1994] Delin, J., Hartley, A., Paris, C. L., Scott, D., and Linden, K. V. (1994).

Expressing Procedural Relationships in Multilingual Instructions. In Proceedings of the

7th. Int. Workshop on NLG, pages 61–70.

[Dorr, 1994] Dorr, B. (1994). Machine translation divergences: A formal description and

proposed solution. Journal of Computational Linguistics, 20(4):597–663.


[Dorr et al., 1998] Dorr, B. J., Habash, N., and Traum, D. (1998). A thematic hierarchy for

efficient generation from lexical-conceptual structure. Technical Report CS-TR-3934, Institute for

Advanced Computer Studies, Department of Computer Science, University of Maryland.

[Ducrot and Todorov, 1983] Ducrot, O. and Todorov, T. (1983). Encyclopedic Dictionary of

the Sciences of Language. The Johns Hopkins University Press, Maryland.

[Elhadad, 1991] Elhadad, M. (1991). FUF user manual - version 5.0. Technical Report

CUCS-038-91, Columbia University.

[Elhadad, 1992] Elhadad, M. (1992). Using argumentation to control lexical choice: a

unification-based implementation. PhD thesis, Computer Science Department, Columbia

University.

[Elhadad, 1993] Elhadad, M. (1993). FUF – the Universal Unifier. Department of Computer

Science, Ben Gurion University, 5.2 edition. http://www.cs.bgu.ac.il/~yaeln/fufman.

[Elhadad and Robin, 1996] Elhadad, M. and Robin, J. (1996). An overview of SURGE:

a re-usable comprehensive syntactic realization component. In Proceedings of INLG’96,

Brighton, UK. (demonstration session).

[Faber and Uson, 1999] Faber, P. B. and Uson, R. M. (1999). Constructing a Lexicon of

English Verbs. Number 23 in Functional Grammar Series. Mouton de Gruyter, Berlin,

New York.

[Fawcett, 1987] Fawcett, R. P. (1987). The semantics of clause and verb for relational pro-

cesses in English. In Halliday, M. A. and Fawcett, R. P., editors, New Developments in

Systemic Linguistics, volume 1. Frances Pinter, London.

[Garay-Vitoria and Abascal, 1997] Garay-Vitoria, N. and Abascal, J. G. (1997). Word pre-

diction for inflected languages. Application to the Basque language. In Proceedings of the ACL

workshop on Natural Language Processing for Communication Aids, Madrid.

[Garay-Vitoria and Abascal, 2004] Garay-Vitoria, N. and Abascal, J. G. (2004). A com-

parison of prediction techniques to enhance the communication rate. In Stary, C. and

Stephanidis, C., editors, Proceedings of the 8th ERCIM Workshop on User Interfaces for


All, Vienna, Austria. Published by Springer as User-Centered Interaction Paradigms for Universal Access in the Information Society, Lecture Notes in Computer Science 3196.

[Gatti and Matteucci, 2005] Gatti, N. and Matteucci, M. (2005). CABA2L a Bliss predictive

composition assistant for AAC communication software. In Seruca, I. and Filipe, J., editors,

Enterprise Information Systems VI. Kluwer Publisher, Amsterdam, The Netherlands.

[Goldberg et al., 1994] Goldberg, E., Driedger, N., and Kittredge, R. (1994). Using natural-

language processing to produce weather forecasts. IEEE Expert, 9(2):45–53.

[Grishman and Sterling, 1989] Grishman, R. and Sterling, J. (1989). Analyzing telegraphic

messages. In Proceedings of DARPA Speech and Natural Language Workshop, pages 204–

208, Philadelphia.

[Halliday, 1994] Halliday, M. A. K. (1994). An Introduction to Functional Grammar. Edward

Arnold, London, second edition.

[Hartley et al., 2000] Hartley, A., Scott, D., Kruijff-Korbayová, I., Sharoff, S., Teich, E., Sokolova, L., Staykova, K., Dochev, D., Čmejrek, M., and Hana, J. (2000). Evaluation of

the final prototype. Technical report, Brighton University.

[Hehner, 1980] Hehner, B. (1980). Blissymbols for use. Blissymbolics Communication Insti-

tute. Contributors: Jinny Storr, Peter Reich, Shirley McNaughton and Don Mills.

[Henkin, 1994] Henkin, R. (1994). There is this too. Hebrew Linguistics, (38):41–54. (in Hebrew).

[Higginbotham, 1995] Higginbotham, D. J. (1995). Use of nondisabled subjects in AAC re-

search: Confessions of a research infidel. AAC Augmentative and Alternative Communica-

tion, 11. AAC Research forum.

[Hill and Romich, 2001] Hill, K. and Romich, B. (2001). AAC clinical summary measures

for characterizing performance. In Proceedings of Technology and Persons with Disabilities

CSUN. CSUN. http://www.csun.edu/cod/conf2001/proceedings/0098hill.html.

[Hill and Romich, 2002] Hill, K. and Romich, B. (2002). A rate index for augmentative and

alternative communication. International Journal of Speech Technology, (5):57–64.


[Hill et al., 2001] Hill, K., Romich, B., and Holko, R. (2001). AAC performance: The ele-

ments of communication rate. In ASHA, New Orleans.

[Hourcade et al., 2004] Hourcade, J., Pilotte, T. E., West, E., and Parette, P. (2004). A

history of augmentative and alternative communication for individuals with severe and

profound disabilities. Focus on Autism and Other Developmental Disabilities, 19(14):235–

245.

[Hovy and Lin, 1998] Hovy, E. and Lin, C. (1998). Automated text summarization in SUM-

MARIST. In Maybury, M. and Mani, I., editors, Automatic Text Summarization. MIT

Press, Cambridge.

[Hunnicutt, 1986] Hunnicutt, S. (1986). Bliss symbol-to-speech conversion: Blisstalk. Journal

of the American Voice I/O Society, 3:19–38.

[Jing et al., 2000] Jing, H., Dahan-Netzer, Y., Elhadad, M., and McKeown, K. (2000). Inte-

grating a large-scale, reusable lexicon with a natural language generator. In Proceedings of

the 1st INLG conference, pages 209–216, Mitzpe Ramon, Israel.

[Jing and McKeown, 1998] Jing, H. and McKeown, K. (1998). Combining multiple, large-

scale resources in a reusable lexicon for natural language generation. In 36th Annual Meeting

of the Association for Computational Linguistics and the 17th International Conference on

Computational Linguistics (COLING-ACL’98), Montreal.

[Joshi, 1987] Joshi, A. K. (1987). An introduction to tree adjoining grammars. In Manaster-

Ramer, A., editor, Mathematics of Language. John Benjamins, Amsterdam.

[Karberis and Kouroupetroglou, 2002] Karberis, G. and Kouroupetroglou, G. (2002). Trans-

forming spontaneous telegraphic language to well-formed Greek sentences for alternative

and augmentative communication. In SETN ’02: Proceedings of the Second Hellenic Con-

ference on AI, pages 155–166, London, UK. Springer-Verlag.

[Kharitonov, 1999] Kharitonov, M. (1999). CFUF: A fast interpreter for the functional uni-

fication formalism. Master’s thesis, Ben Gurion University, Israel.


[Kipper et al., 2000] Kipper, K., Dang, H. T., and Palmer, M. (2000). Class-based construc-

tion of a verb lexicon. In Proceedings of AAAI-2000.

[Kukich, 1983] Kukich, K. (1983). Knowledge-based report generation: A technique for au-

tomatically generating natural language reports from databases. In Proceedings of the 6th

International ACM SIGIR Conference.

[Langer and Newell, 1997] Langer, S. and Newell, A. (1997). Alternative routes to commu-

nication. The Newsletter of the European Network in Language and Speech.

[Langkilde and Knight, 1998] Langkilde, I. and Knight, K. (1998). The practical value of

n-grams in generation. In Proceedings of INLG’98, pages 248–255, Niagara-on-the-Lake,

Canada.

[Lavoie and Rambow, 1997] Lavoie, B. and Rambow, O. (1997). A fast and

portable realizer for text generation systems. In ANLP’97, Washington, DC.

www.cogentex.com/systems/realpro.

[Lee et al., 1997] Lee, Y.-S., Weinstein, C., Seneff, S., and Tummala, D. (1997). Ambiguity

resolution for machine translation of telegraphic messages. In Proceedings of the eighth

conference on European chapter of the Association for Computational Linguistics, pages

120–127, Morristown, NJ, USA. Association for Computational Linguistics.

[Levin, 1993] Levin, B. (1993). English Verb Classes and Alternations: A Preliminary Investigation. University of Chicago Press, Chicago, Illinois.

[Liben-Nowell, 2000] Liben-Nowell, D. (2000). Text Simplification. MPhil thesis, Computer Speech and Language Processing, University of Cambridge, Churchill College.

[Macleod and Grishman, 1995] Macleod, C. and Grishman, R. (1995). COMLEX Syntax

Reference Manual. Proteus Project, NYU.

[Mann, 1996] Mann, G. (1996). Control of a Navigating, Rational Agent by Natural Language.

PhD thesis, School of Computer Science and Engineering, University of New South Wales.

http://www.it.murdoch.edu.au/˜mann/NL/BEELINE.html.


[Mann, 1983] Mann, W. C. (1983). An overview of the Penman text generation system. In Proceedings of AAAI-83, pages 261–265. Also appears as USC/Information Sciences Institute Tech Report RR-83-114.

[McCoy et al., 1994] McCoy, K., McKnitt, W., Peischl, D., Pennington, C., Vanderheyden, P., and Demasco, P. (1994). AAC-user therapist interactions: Preliminary linguistic observations and implications for Compansion. In Proceedings of RESNA ’94, 17th Annual Conference, Nashville, Tennessee.

[McCoy et al., 1998] McCoy, K., Pennington, C., and Badman, A. L. (1998). Compansion:

From research prototype to practical integration. Natural Language Engineering, (4):41–55.

Cambridge University Press.

[McCoy, 1997] McCoy, K. F. (1997). Simple NLP techniques for expanding telegraphic sen-

tences. In Proceedings of workshop on NLP for Communication Aids, Madrid. ACL/EACL.

[McCoy et al., 2001] McCoy, K. F., Bedrosian, J. L., and Hoag, L. A. (2001). Pragmatic

trade-offs in utterance-based systems: Uncovering technological implications. ASHA

(American Speech-Language-Hearing Association), Division 12 Newsletter. Guest Editor:

Jeff Higginbotham.

[McCoy and Hershberger, 1999] McCoy, K. F. and Hershberger, D. (1999). The role of eval-

uation in bringing NLP to AAC: A case to consider. In Loncke, F. T., Clibbens, J.,

Arvidson, H. H., and Lloyd, L. L., editors, Augmentative and Alternative Communication:

New Directions in Research and Practice, pages 105–122. Whurr Publishers, London.

[McDonald, 1982] McDonald, E. T. (1982). Teaching and Using Blissymbolics. Blissymbolics

Communication Institute.

[Mel’cuk and Pertsov, 1987] Mel’cuk, I. and Pertsov, N. (1987). Surface Syntax of English: A Formal Model within the Meaning-Text Framework. John Benjamins, Amsterdam/Philadelphia.

[Miliaev et al., 2003] Miliaev, N., Cawsey, A., and Michaelson, G. (2003). Applied NLG

system evaluation, FlexyCAT. In Proceedings of the 9th European Workshop on Natural

Language Generation (in conjunction with EACL2003), Budapest, Hungary.


[Miller et al., 1990] Miller, G., Beckwith, R., Fellbaum, C., Gross, D., and Miller, K. (1990).

Introduction to WordNet: an on-line lexical database. International Journal of Lexicogra-

phy (special issue), 3(4):235–312.

[Miller, 1995] Miller, G. A. (1995). WordNet: a lexical database for English. Commun.

ACM, 38(11):39–41.

[Moulton et al., 1999] Moulton, B. J., Lesher, G. W., and Higginbotham, D. J. (1999). A

system for automatic abbreviation expansion. In Proceedings of the RESNA ’99 Annual

Conference, pages 55–57, Arlington, VA. RESNA Press.

[Nir, 2005] Nir, M. (2005). Bliss - is it really a second language? ISAAC-Israel Annual. (in

Hebrew).

[Ordan and Wintner, 2005] Ordan, N. and Wintner, S. (2005). Representing natural gender

in multilingual lexical databases. International Journal of Lexicography, 18(3).

[Paris and Linden, 1996] Paris, C. and Linden, K. V. (1996). Building knowledge bases for

the generation of software documentation. In Proceedings of the 16th International Conference on Computational Linguistics (COLING-96).

[Paris and Vander Linden, 1996] Paris, C. and Vander Linden, K. (1996). DRAFTER: An

interactive support tool for writing multilingual instructions. IEEE Computer, 29(7):49–56.

[Pennington and McCoy, 1998] Pennington, C. A. and McCoy, K. F. (1998). Providing intel-

ligent language feedback for augmentative communication users. In Mittal, V. O., et al., editors,

Assistive Technology and AI, number 1458 in LNAI, pages 59–72. Springer-Verlag, Berlin

Heidelberg.

[Pennington et al., 1998] Pennington, C. A., McCoy, K. F., Bedrosian, J. L., and Hoag, L. A.

(1998). Important issues for effectively using prestored text in augmentative communica-

tion. In 1998 AAAI Workshop on Integrating Artificial Intelligence and Assistive Technol-

ogy, pages 48–54, Madison, Wisconsin.


[Pianta et al., 2002] Pianta, E., Bentivogli, L., and Girardi, C. (2002). MultiWordNet: devel-

oping an aligned multilingual database. In Proceedings of the First International Conference

on Global WordNet, Mysore, India.

[Pollard and Sag, 1987] Pollard, C. and Sag, I. (1987). Information-based Syntax and Seman-

tics - Volume 1, volume 13 of CSLI Lecture Notes. University of Chicago Press, Chicago,

Il.

[Porter, 2000] Porter, G. (2000). Low-tech dynamic displays: User friendly multi-level com-

munication books. In Proceedings of ISAAC Ninth Biennial Conference, Washington, DC.

[Power and Scott, 1998] Power, R. and Scott, D. (1998). Multilingual authoring using feed-

back texts. In Proceedings of COLING-ACL 98, Montreal, Canada.

[Quinlan, 1992] Quinlan, P. (1992). The Oxford Psycholinguistic Database. Oxford University

Press.

[Quirk et al., 1985] Quirk, R., Greenbaum, S., Leech, G., and Svartvik, J. (1985). A compre-

hensive grammar of the English language. Longman.

[Reiter and Dale, 1992] Reiter, E. and Dale, R. (1992). A fast algorithm for the generation of

referring expressions. In Proceedings of the 14th COLING, pages 232–238, Nantes, France.

[Reiter and Dale, 2000] Reiter, E. and Dale, R. (2000). Building Natural-Language Genera-

tion Systems. Cambridge University Press.

[Roberts, 1973] Roberts, D. D. (1973). The Existential Graphs of Charles S. Peirce. Mouton

and Co.

[Rosner and Stede, 1994] Rosner, D. and Stede, M. (1994). Generating multilingual documents from a knowledge base: The TECHDOC project. In Proceedings of the 15th COLING, pages 339–346, Kyoto, Japan.

[Ruppenhofer et al., 2005] Ruppenhofer, J., Ellsworth, M., Petruck, M. R. L., and

Johnson, C. R. (2005). FrameNet: Theory and practice. Online Book.

http://framenet.icsi.berkeley.edu/book/book.html.


[Schabes et al., 1988] Schabes, Y., Abeille, A., and Joshi, A. K. (1988). Parsing strategies

with Lexicalized Grammars: Application to tree adjoining grammars. In Proceedings of the

12th COLING, pages 578–583, Budapest, Hungary.

[Scott et al., 1998] Scott, D., Power, R., and Evans, R. (1998). Generation as a solution to

its own problem. In Proceedings of INLG’98, pages 256–265, Niagara-on-the-Lake, Canada.

[Shalit et al., 1992] Shalit, A., Wine, J., and Yaniv, K. (1992). Hebrew Blissymbols Lexicon.

ISAAC-Israel.

[Shaw, 1995] Shaw, J. (1995). Conciseness through aggregation in text generation. In Pro-

ceedings of the 33rd conference on ACL, pages 329–331, Morristown, NJ, USA.

[Shaw et al., 1994] Shaw, J., Kukich, K., and Mckeown, K. (1994). Practical issues in auto-

matic documentation generation. In Proceedings of the 4th ANLP, pages 7–14.

[Shieber and Baker, 2003] Shieber, S. M. and Baker, E. (2003). Abbreviated text input. In

IUI’03, Miami, Florida, USA.

[Sowa, 1984] Sowa, J. F. (1984). Conceptual Structures: Information Processing in Mind and

Machine. Addison-Wesley.

[Sowa, 1987] Sowa, J. F. (1987). Semantic networks. In Shapiro, S. C., editor, Encyclopedia

of Artificial Intelligence 2. John Wiley & Sons, New York.

[Sripada et al., 2005] Sripada, S. G., Reiter, E., and Hawizy, L. (2005). Evaluating an NLG

system using post-edit data: Lessons learned. In Proceedings of ENLG-2005, pages 133–139,

Aberdeen, Scotland.

[Stede, 1996] Stede, M. (1996). Lexical semantics and knowledge representation in multilin-

gual sentence generation. PhD thesis, Graduate Department of Computer Science, Univer-

sity of Toronto.

[Temizsoy and Cicekli, 1998] Temizsoy, M. and Cicekli, I. (1998). A language-independent

system for generating feature structures from interlingua representations. In Proceedings

of INLG’98, pages 188–197, Niagara-on-the-Lake, Canada.


[Ushioda et al., 1993] Ushioda, A., Evans, D. A., Gibson, T., and Waibel, A. (1993). Fre-

quency estimation of verb subcategorization frames based on syntactic and multidimen-

sional statistical analysis. In Proceedings of the 3rd International Workshop on Parsing

Technologies (IWPT3), Tilburg, The Netherlands.

[Vaillant, 1997] Vaillant, P. (1997). A semantic-based communication system for dysphasic

subjects. In Proceedings of the 6th conference on Artificial Intelligence in Medicine Europe

(AIME’97), Grenoble, France.

[Vaillant and Checler, 1995] Vaillant, P. and Checler, M. (1995). Intelligent voice prosthesis:

converting icons into natural language sentences. Computation and Language E-print

Archive. http://xxx.lanl.gov/abs/cmp-lg/9506018.

[Vanderheyden et al., 1996] Vanderheyden, P., Demasco, P., and McCoy, K. (1996). A pre-

liminary study into schema-based access and organization of reusable text in AAC. In

Langton, A., editor, Proceedings of the RESNA ’96 Annual Conference, Salt Lake City,

UT.

[Vanderheyden and Pennington, 1998] Vanderheyden, P. B. and Pennington, C. A. (1998).

An augmentative communication interface based on conversational schemata. In Assis-

tive Technology and Artificial Intelligence, Applications in Robotics, User Interfaces and

Natural Language Processing, pages 109–125, London, UK. Springer-Verlag.

[VanderLinden and Scott, 1995] VanderLinden, K. and Scott, D. (1995). Raising the interlin-

gual ceiling in multilingual text generation. In the Multilingual Natural Language Genera-

tion Workshop, International Joint Conference in Artificial Intelligence (IJCAI’95), pages

95–109, Montreal.

[Waller and Jack, 2002] Waller, A. and Jack, K. (2002). A predictive blissymbolic to English

translation system. In Proceedings of ASSETS 2002, pages 186–191, Edinburgh, Scotland.

[Waller et al., 2000a] Waller, A., O’Mara, D., Tait, L., Booth, L., Hood, H., and Brophy-

Arnott, B. (2000a). The development and evaluation of a narrative-based AAC approach. In

Proceedings of the Ninth Biennial Conference of the International Society for Augmentative

and Alternative Communication, pages 232–234, Washington D.C. ISAAC.


[Waller et al., 2000b] Waller, A., O’Mara, D., Tait, L., Hood, H., Booth, L., and Brophy-

Arnott, B. (2000b). Integrating a story-based aid within curriculum. AAC 2000 Practical

Approaches to Augmentative and Alternative Communication.

[Woltosz, 1997] Woltosz, W. (1997). Dynamic vs. static displays: What are

the issues? In CSUN 1997 Conference. CSUN Center On Disabilities.

http://www.dinf.ne.jp/doc/english/Us_Eu/conf/csun_97/csun97_072.htm.
