
Gerstner Laboratory for Intelligent Decision Making and Control

Czech Technical University in Prague

Series of Research Reports

Report No: GL 126/01

Ontologies Description and Applications

Marek [email protected]

http://cyber.felk.cvut.cz/gerstner/reports/GL126.pdf

Gerstner Laboratory, Department of Cybernetics
Faculty of Electrical Engineering, Czech Technical University

Technická 2, 166 27 Prague 6, Czech Republic
tel. (+420-2) 2435 7421, fax: (+420-2) 2492 3677

http://cyber.felk.cvut.cz/gerstner

Prague, 2001

ISSN 1213-3000

Ontologies Description and Applications

Marek [email protected]

February 22, 2001

Abstract

The word ontology has gained considerable popularity within the AI community. An ontology is usually viewed as a high-level description consisting of concepts that organize the upper parts of a knowledge base. However, the meaning of the term tends to be vague, as it is used in different ways. In this paper we attempt to clarify the meaning of ontology, including the philosophical views, and show why ontologies are useful and important.

We give an overview of ontology structures in several particular systems. Ontological engineering, a field proposed within ontological efforts, is also described.

Several particular uses of ontologies are discussed. These include systems and ideas that support knowledge base sharing and reuse, both for computers and for humans; ontology-based communication in multi-agent systems; applications of ontologies in natural language processing; applications in document search and enrichment of knowledge bases, both particularly in the World Wide Web environment; and the construction of educational systems, particularly intelligent tutoring systems.


Contents

1 Introduction
  1.1 Motivation
  1.2 Philosophical View

2 What is an Ontology?
  2.1 Common Definitions
    2.1.1 Ontology as a Philosophical Term
    2.1.2 Ontology as a Specification of Conceptualization
    2.1.3 Ontology as a Representational Vocabulary
    2.1.4 Ontology as a Body of Knowledge
  2.2 Other Ontology Definitions

3 Ontology Structure
  3.1 CYC
  3.2 Russell & Norvig's General Ontology
  3.3 Ontology Engineering
    3.3.1 Structure of Usage
    3.3.2 Ontology Engineering Subfields

4 Using Ontologies
  4.1 Top-Level Ontology
  4.2 Knowledge Sharing and Reuse
    4.2.1 KIF
    4.2.2 Ontolingua
    4.2.3 Collaboration
    4.2.4 Particular Ontologies Reuse
  4.3 Communication in Multi-Agent Systems
    4.3.1 FIPA Agent Management Model
    4.3.2 Ontology Service by FIPA
    4.3.3 Ontologies Relationships
    4.3.4 FIPA Knowledge Model and Meta-Ontology
  4.4 Natural Language Understanding
    4.4.1 CYC NLP
    4.4.2 WordNet Database
  4.5 Document Search and Ontologies
    4.5.1 OntoSeek
    4.5.2 WebKB
    4.5.3 Knowledge Representation Techniques
    4.5.4 Document Enrichment
  4.6 Educational Systems
    4.6.1 EON
    4.6.2 ABITS
    4.6.3 Other Proposals

5 Conclusion


1 Introduction

The word ontology has gained considerable popularity within the AI community. An ontology is usually viewed as a high-level description consisting of concepts that organize the upper parts of a knowledge base. However, the meaning of the term tends to be vague, as it is used in different ways. In this paper we attempt to clarify the meaning of ontology and show why ontologies are useful and important.

We discuss several particular uses of ontologies, such as knowledge base reuse, knowledge sharing, communication in multi-agent systems, WWW applications, natural language processing, and intelligent tutoring systems.

1.1 Motivation

In the history of AI research we can identify two types of research [31, 8]. One is form-oriented research (mechanism theories) and the other is content-oriented research (content theories). The former deals with logic and knowledge representation, while the latter deals with the content of knowledge. The former has dominated AI research to date. Recently, however, content-oriented research has begun to attract much attention, because many real-world problems, such as knowledge reuse, facilitation of agent communication, media integration through understanding, and large-scale knowledge bases, require not only advanced theories or reasoning methods but also sophisticated treatment of the content of knowledge.

Formal theories such as predicate logic provide us with a powerful tool to guarantee sound reasoning and thinking. They even enable us to discuss the limits of our reasoning in a principled way. However, they cannot answer questions such as what knowledge we should have for solving given problems, what knowledge is at all, what properties a specific piece of knowledge has, and so on.

Sometimes the AI community gets excited by mechanisms such as neural nets, fuzzy logic, genetic algorithms, constraint propagation, etc. These mechanisms are proposed as the secret of making intelligent machines. At other times it is realized that, however wonderful the mechanism, it cannot do much without a good content theory of the domain on which it is to work. Moreover, we often recognize that once a good content theory is available, many different mechanisms might be used equally well to implement effective systems, all using essentially the same content.

The importance of content-oriented research is increasingly recognized. Unfortunately, there seem to be no widely recognized, sophisticated methodologies for content-oriented research at present. Until recent years, the major results were only the development of knowledge bases. The reasons for this can be [31]:

- content-oriented research tends to be ad hoc
- there is no methodology that enables research results to be accumulated

It is necessary to overcome these difficulties in content-oriented research. Ontologies are proposed for that purpose. Ontology engineering, as proposed in e.g. [31], is a research methodology that gives us the design rationale of a knowledge base, a kernel conceptualization of the world of interest, and strict definitions of the basic meanings of basic concepts, together with sophisticated theories and technologies enabling the accumulation of knowledge that is indispensable for modeling the real world.


Interest in ontologies has also grown as researchers and system developers have become more interested in reusing or sharing knowledge across systems. Currently, one key impediment to sharing knowledge is that different systems use different concepts and terms for describing domains. These differences make it difficult to take knowledge out of one system and use it in another. If we could develop ontologies that could be used as the basis for multiple systems, they would share a common terminology that would facilitate sharing and reuse. Developing such reusable ontologies is an important goal of ontology research. Similarly, if we could develop tools that would support merging ontologies and translating between them, sharing would be possible even between systems based on different ontologies.

1.2 Philosophical View

The term ontology was taken from philosophy. According to Webster's Dictionary, an ontology is

- a branch of metaphysics relating to the nature and relations of being
- a particular theory about the nature of being or the kinds of existence

Ontology (the science of being) is a word, like metaphysics, that is used in many different senses. It is sometimes considered to be identical to metaphysics, but we prefer to use it in a more specific sense, as that part of metaphysics that specifies the most fundamental categories of existence, the elementary substances or structures out of which the world is made. Ontology will thus analyze the most general and abstract concepts or distinctions that underlie every more specific description of any phenomenon in the world, e.g. time, space, matter, process, cause and effect, system.

Recently, the term ontology has been taken up by researchers in Artificial Intelligence, who use it to designate the building blocks out of which models of the world are made. An agent (e.g. an autonomous robot) using a particular model will only be able to perceive that part of the world that its ontology is able to represent. In this sense, only the things in its ontology can exist for that agent. In that way, an ontology becomes the basic level of a knowledge representation scheme. An example is a set of link types for a semantic network representation that is based on a set of ontological distinctions: changing–invariant and general–specific.

2 What is an Ontology?

The term ontology is used in many different ways. In this section we discuss what an ontology is on the basis of several definitions that are currently in use.

2.1 Common Definitions

The most widespread definitions of ontology are given below.

1. Ontology is a term in philosophy and its meaning is theory of existence.

2. Ontology is an explicit specification of conceptualization [21].

3. Ontology is a theory of vocabulary or concepts used for building artificial systems [31].

4. Ontology is a body of knowledge describing some domain (e.g. a common sense knowledge domain in CYC [45]).

Definition 1 is radically different from all the others (including the additional ones discussed below). We will briefly discuss some implications of its meaning for the definition of ontology for AI purposes. The second definition is generally proposed as a definition of what an ontology is for the AI community. It may be classified as syntactic, but its precise meaning depends on the understanding of the terms "specification" and "conceptualization". The third definition is a proposal for a definition within the knowledge engineering community. The fourth definition differs from the previous two: it views the ontology as an inner body of knowledge, not as a way to describe knowledge.

Although these definitions are compact, they are not sufficient for an in-depth understanding of what an ontology is. We will try to give more comprehensive definitions and insights.

2.1.1 Ontology as a Philosophical Term

Following [24], we use the convention that an uppercase initial letter "O" distinguishes Ontology as a philosophical discipline from other uses of the term. Ontology is a branch of philosophy that deals with the nature and the organization of reality. It tries to answer questions such as what existence is and what properties can explain existence.

Aristotle defined Ontology as the science of being as such. Unlike the special sciences, each of which investigates a class of beings and their determinations, Ontology regards all the species qua being and the attributes that belong to it qua being (Aristotle, Metaphysics, IV, 1). In this sense Ontology tries to answer the question "what is being?" or, in a meaningful reformulation, "what are the features common to all beings?".

This is what is today called General Ontology, in contrast with the various Special or Regional Ontologies (e.g. Biological, Social). From this, Formal Ontology is defined as the area that has to determine the conditions of the possibility of the object in general and to identify the requirements that every object's constitution has to satisfy. According to [24], Formal Ontology can be defined as the systematic, formal, axiomatic development of the logic of all forms and modes of being. Formal Ontology is thus concerned not so much with the existence of certain objects as with the rigorous description of their forms of being, i.e. their structural features. In practice, Formal Ontology can be understood as the theory of the distinctions that can be applied independently of the state of the world, i.e. the distinctions:

- among the entities of the world (physical objects, events, regions, ...)
- among the meta-level categories used to model the world (concept, property, quality, state, role, part, ...)

In this sense, Formal Ontology, as a discipline, may be relevant to both Knowledge Representation and Knowledge Acquisition [24].

2.1.2 Ontology as a Specification of Conceptualization

The second definition of ontology mentioned above, explicit specification of conceptualization, is briefly described in [20]. The definition comes from [22], where ontology is used in the context of knowledge sharing. According to Thomas Gruber, explicit specification of conceptualization means that an ontology is a description (like a formal specification of a program) of the concepts and relationships that can exist for an agent or a community of agents. This definition is consistent with the usage of ontology as a set of concept definitions, but is more general.

In this sense, ontology is important for the purpose of enabling knowledge sharing and reuse. An ontology is in this context a specification used for making ontological commitments. Practically, an ontological commitment is an agreement to use a vocabulary (i.e. ask queries and make assertions) in a way that is consistent (but not complete) with respect to the theory specified by an ontology. Agents are then built that commit to ontologies, and ontologies are designed so that knowledge can be shared with and among these agents.

A body of knowledge is based on a conceptualization: the objects, concepts, and other entities that are assumed to exist in some area of interest and the relationships that hold among them. A conceptualization is an abstract, simplified view of the world that we wish to represent for some purpose. Every knowledge base, knowledge-based system, or knowledge-level agent is committed to some conceptualization, explicitly or implicitly. For these systems, what "exists" is that which can be represented. When the knowledge of a domain is represented in a declarative formalism, the set of objects that can be represented is called the universe of discourse. This set of objects, and the describable relationships among them, are reflected in the representational vocabulary with which a knowledge-based program represents knowledge. Thus, in the context of AI, we can describe the ontology of a program by defining a set of representational terms. In such an ontology, definitions associate the names of entities in the universe of discourse (e.g. classes, relations, functions, or other objects) with human-readable text describing what the names mean, and with formal axioms that constrain the interpretation and well-formed use of these terms. Formally, it can be said that an ontology is a statement of a logical theory [20].

Ontologies are often equated with taxonomic hierarchies of classes without class definitions and the subsumption relation. Ontologies need not be limited to these forms. Ontologies are also not limited to conservative definitions, that is, definitions in the traditional logic sense that only introduce terminology and do not add any knowledge about the world. To specify a conceptualization, one needs to state axioms that do constrain the possible interpretations of the defined terms.

Pragmatically, a common ontology defines the vocabulary with which queries and assertions are exchanged among agents. The agents sharing a vocabulary need not share a knowledge base. An agent that commits to an ontology is not required to answer all queries that can be formulated in the shared vocabulary. In short, a commitment to a common ontology is a guarantee of consistency, but not completeness, with respect to queries and assertions using the vocabulary defined in the ontology.
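The difference between consistency and completeness of a commitment can be illustrated with a small sketch. The code is not taken from [20]: the Agent class, the relation names and the convention of returning None for "cannot answer" are assumptions made only for illustration, and consistency is simplified here to "an agent uses only terms defined in the shared vocabulary".

```python
from typing import Optional, Set, Tuple

Fact = Tuple[str, str, str]  # (relation, subject, object)

# Hypothetical shared vocabulary defined by the common ontology.
SHARED_RELATIONS = {"type-of", "component-of"}


class Agent:
    """An agent that commits to the shared vocabulary but keeps a private knowledge base."""

    def __init__(self, facts: Set[Fact]):
        for relation, _, _ in facts:
            self._check(relation)
        self.facts = set(facts)

    @staticmethod
    def _check(relation: str) -> None:
        # Commitment (simplified): only terms defined in the ontology may be used.
        if relation not in SHARED_RELATIONS:
            raise ValueError(f"term not defined in the shared ontology: {relation}")

    def ask(self, relation: str, subject: str, obj: str) -> Optional[bool]:
        """Return True if the fact is known, or None if the agent cannot answer."""
        self._check(relation)
        return True if (relation, subject, obj) in self.facts else None


fact = ("type-of", "operational amplifier", "electronic device")
informed = Agent({fact})
uninformed = Agent(set())

print(informed.ask(*fact))    # True
print(uninformed.ask(*fact))  # None
```

Both agents share the vocabulary, but only one of them can answer the query; neither is allowed to use a term that the ontology does not define.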

2.1.3 Ontology as a Representational Vocabulary

The third definition of ontology proposed above says that it is in fact a representational vocabulary [8, 31]. The vocabulary can be specialized to some domain or subject matter. More precisely, it is not the vocabulary as such that qualifies as an ontology, but the conceptualization that the terms in the vocabulary are intended to capture. Thus, translating the terms in an ontology from one language to another, for example from Czech to English, does not change the ontology conceptually. In engineering design, one might discuss the ontology of an electronic devices domain, which might include vocabulary that describes conceptual elements (transistors, operational amplifiers, and voltages) and the relations between these elements (operational amplifiers are a type-of electronic device, and transistors are a component-of operational amplifiers). Identifying such a vocabulary and the underlying conceptualization generally requires careful analysis of the kinds of objects and relations that can exist in the domain.

The term ontology is sometimes used to refer to a body of knowledge describing some domain (see below), typically a common sense knowledge domain, using a representational vocabulary. For example, CYC [45] often refers to its knowledge representation of some area of knowledge as its ontology.

In other words, the representational vocabulary provides a set of terms with which one can describe the facts in some domain, while the body of knowledge using that vocabulary is a collection of facts about a domain. However, this distinction is not as clear as it might first appear. In the electronic-device example, that a transistor is a component-of an operational amplifier, or that the latter is a type-of electronic device, is just as much a fact about its domain as a CYC fact about some aspect of space, time or numbers. The distinction is that the former emphasizes the use of ontology as a set of terms for representing specific facts in an instance of the domain, while the latter emphasizes the view of ontology as a general set of facts to be shared.
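The electronic-device example can be written down as a minimal sketch that separates the vocabulary from the facts stated with it. The data layout, the helper names and the Czech labels are assumptions of this illustration, not part of [8, 31].

```python
# Representational vocabulary: the terms available for describing the domain.
CONCEPTS = {"transistor", "operational amplifier", "voltage", "electronic device"}
RELATIONS = {"type-of", "component-of"}

# Body of knowledge: facts stated with that vocabulary.
FACTS = {
    ("type-of", "operational amplifier", "electronic device"),
    ("component-of", "transistor", "operational amplifier"),
}

# Illustrative Czech labels for the same concepts (an assumption of this sketch).
CZECH = {
    "transistor": "tranzistor",
    "operational amplifier": "operační zesilovač",
    "voltage": "napětí",
    "electronic device": "elektronické zařízení",
}


def uses_only_vocabulary(facts, concepts, relations):
    """Check that every fact is built from terms of the given vocabulary."""
    return all(r in relations and s in concepts and o in concepts for r, s, o in facts)


def relabel(facts, labels):
    """Rename concepts; the relational structure is left untouched."""
    return {(r, labels[s], labels[o]) for r, s, o in facts}


czech_facts = relabel(FACTS, CZECH)
back = relabel(czech_facts, {cz: en for en, cz in CZECH.items()})

print(uses_only_vocabulary(FACTS, CONCEPTS, RELATIONS))  # True
print(back == FACTS)                                     # True
```

Relabeling the concepts and mapping them back reproduces the original facts, which mirrors the claim that translating the terms does not change the ontology conceptually.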

2.1.4 Ontology as a Body of Knowledge

Sometimes, ontology is defined as a body of knowledge describing some domain, typically a common sense knowledge domain, using a representational vocabulary as described above. In this case, an ontology is not only the vocabulary, but the whole upper knowledge base (including the vocabulary that is used to describe this knowledge base).

The typical example of this usage is the CYC project (http://www.cyc.com/, [45]), which defines its knowledge base as an ontology for any other knowledge-based system. CYC is the name of a very large, multi-contextual knowledge base and inference engine. The development of CYC started in the early 1980s, headed by Douglas Lenat. CYC is an attempt to do symbolic AI on a massive scale. It is based neither on numerical methods such as statistical probabilities, nor on neural networks or fuzzy logic. All of the knowledge in CYC is represented declaratively in the form of logical assertions. CYC contains over 400,000 significant assertions [45], which include simple statements of fact, rules about what conclusions to draw if certain statements of fact are satisfied (true), and rules about how to reason with certain types of facts and rules. New conclusions are derived by the inference engine using deductive reasoning.

The CYC team does not believe there is any shortcut toward being intelligent or creating an artificial intelligence based agent. Addressing the need for a large body of knowledge with content and context may only be done by manually organizing and collating information. This knowledge includes heuristic, rule-of-thumb problem-solving strategies, as well as facts that can only be known to a machine if it is told.

Much of the useful common sense knowledge needed for life is prescientific and has therefore not been analyzed in detail. Thus a large part of the work of the CYC project is to formalize common relationships and fill in the gaps between the highly systematized knowledge used by specialists.

It is necessary to divide such a large knowledge base into smaller pieces to enable reasoning in reasonable time. Because of this, the CYC knowledge base uses a special context space [29] that is divided by 12 dimensions into smaller pieces (contexts) that have something in common and can be used to reason about a specific problem in that context. It is possible to lift an assertion from one context to another when the problem requires it.

The CYC common sense knowledge can be used as the body of a knowledge base for any knowledge-intensive system. In this sense, this body of knowledge can be viewed as an ontology of the knowledge base of the system.

2.2 Other Ontology Definitions

As we can see from the above discussion, the exact definition of ontology is not obvious; however, the definitions have much in common. In addition to the above definitions there are many other proposals for ontology definitions. Some other definitions collected from [24] are:

1. informal conceptual system

2. formal semantic account

3. representation of a conceptual system via a logical theory

(a) characterized by specific formal properties

(b) characterized only by its specific purposes

4. vocabulary used by a logical theory

5. (meta-level) specification of a logical theory

Definitions 1 and 2 conceive an ontology as a conceptual "semantic" entity, either formal or informal, while according to interpretations 3, 4 and 5 it is a specific "syntactic" object.

According to interpretation 1, an ontology is the conceptual system that may be assumed to underlie a particular knowledge base. Under interpretation 2, instead, the ontology that underlies a knowledge base is expressed in terms of suitable formal structures at the semantic level. In both cases, we may say that "the ontology of knowledge base A is different from that of knowledge base B".

Under interpretation 3, an ontology is nothing else than a logical theory. The issue is whether such a theory needs to have particular formal properties in order to be an ontology or, rather, whether it is the intended purpose that lets us consider a logical theory as an ontology. The latter position can be supported by arguing that an ontology is an annotated and indexed set of assertions about something: leaving off the annotations and indexing, this is a collection of assertions, which in logic is called a theory (Pat Hayes's statement in [24]).

According to interpretation 4, an ontology is not viewed as a logical theory, but just as the vocabulary used by a logical theory. Such an interpretation collapses into 3.a if an ontology is thought of as a specification of a vocabulary consisting of a set of logical definitions. We may anticipate that Gruber's interpretation (specification of conceptualization) collapses into 3.a as well when a conceptualization is understood as a vocabulary.

Finally, under interpretation 5, an ontology is seen as a specification of a logical theory in the sense that it specifies the architectural components (or primitives) used within a particular domain theory.


3 Ontology Structure

From the overview above we can see that an ontology can basically be viewed in two ways. The first is an ontology as a representational vocabulary, where the conceptual structure of terms should remain unchanged during translation. The other, which is discussed in this section, is an ontology as the body of knowledge describing a domain, in particular a common sense domain.

An ontology can be divided in several ways. We describe some of the proposals here. Particularly interesting is the so-called upper ontology, which is intended to serve as the upper part of the ontology of practically all knowledge-based systems. Some of the divisions presented here are intended to be merged into an upper ontology standard by the IEEE Standard Upper Ontology Study Group [39]. The pages linked from [39] contain many other examples that could be used as some kind of upper ontology.

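[Figure 1: tree diagrams comparing the top levels of four ontologies: CYC, GUM, WordNet and Sowa's. Root nodes: Thing (in three of the ontologies) and Um-Thing. Second-level nodes appearing in the diagrams: Individual object, Intangible, Represented, Living, Nonliving, Configuration, Element, Sequence, Concrete, Process, Object, Abstract.]

Figure 1: How ontologies differ in their analyses of the most general concepts [8]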

It is interesting that many authors agree that the top class of the ontology (see footnote 1) is "thing"; however, even at the second level they do not agree on the division, as can be seen in Figure 1. The initiative [39] tries to unify these views.

3.1 CYC

The ontology of CYC is based on several terms that form the fundamental vocabulary of the CYC knowledge base. The universal set is #$Thing (see Figure 1; the notation is explained in footnote 2). It is the set of everything.

Every CYC constant in the knowledge base is a member of this collection. In the prefix notation of the language CycL [10], we express that fact as (#$isa CONST #$Thing). Likewise, every collection in the knowledge base is a subset of the collection #$Thing. In CycL, that fact is expressed as (#$genls COL #$Thing).

The set #$Thing has some subsets, such as PathGeneric, Intangible, Individual, SimpleSegmentOfPath, PathSimple, MathematicalOrComputationalThing, IntangibleIndividual, Product, TemporalThing, SpatialThing, Situation, EdgeOnObject, FlowPath, ComputationalObject, Microtheory, plus about 1500 more public subsets and about 13600 unpublished subsets.

Footnote 1: This holds when a hierarchy of classes is used for ontology description. For describing ontologies, however, we do not have to limit ourselves to class hierarchies as in the case of taxonomies.

Footnote 2: We use the notation of the CYC language. Its explanation can be found in [10].

#$Individual is the collection of all things that are not sets or collections. Thus, #$Individual includes (among other things) physical objects, temporal subabstractions of physical objects, numbers, relations, and groups (#$Group). An element of #$Individual may have parts or a structure (including parts that are discontinuous), but no instance of #$Individual can have elements or subsets.

#$Collection is the collection of all CYC collections. CYC collections are natural kinds or classes, as opposed to mathematical sets. Their elements have some common attribute(s). Each CYC collection is like a set in so far as it may have elements, subsets, and supersets, and may not have parts or spatial or temporal properties. Sets, however, differ from collections in that a mathematical set may be an arbitrary set of things that have nothing in common (#$Set-Mathematical). In contrast, the elements of a collection will all have in common some feature(s), some "intensional" qualities. In addition, two instances of #$Collection can be co-extensional (i.e. have all the same elements) without being identical, whereas if two arbitrary sets had the same elements, they would be considered equal.
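The difference between co-extensional collections and equal sets can be sketched in a few lines. The Collection class and the two collection names below are hypothetical and serve only as an illustration; they are not CYC constants or CYC code.

```python
class Collection:
    """A collection identified by its name (its intension), not by its elements."""

    def __init__(self, name, elements):
        self.name = name
        self.elements = frozenset(elements)

    def coextensional(self, other):
        """True if both collections currently have exactly the same elements."""
        return self.elements == other.elements


# Hypothetical collections, not actual CYC constants.
with_heart = Collection("CreatureWithHeart", {"Fido", "Tom"})
with_kidney = Collection("CreatureWithKidney", {"Fido", "Tom"})

print(with_heart.coextensional(with_kidney))         # True
print(with_heart == with_kidney)                     # False
print(with_heart.elements == with_kidney.elements)   # True
```

The two collections have the same elements but remain distinct objects, whereas the underlying mathematical sets compare as equal.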

#$Individual and #$Collection are disjoint collections. No CYC constant can be an instance of both.

#$Predicate is the set of all CYC predicates. Each element of #$Predicate is a truth-functional relationship in CYC which takes some number of arguments. Each of those arguments must be of some particular type. Informally, one can think of elements of #$Predicate as functions that always return either true or false. More formally, when an element of #$Predicate is applied to the legal number and type of arguments, an expression is formed which is a well-formed formula (wff) in CycL. Such expressions are called atomic formulas if they contain variables, or ground atomic formulas (gaf) if they contain no variables.

#$isa: expresses the ISA relationship. (#$isa EL COL) means that EL is an element of the collection COL. CYC knows that #$isa distributes over #$genls. That is, if one asserts (#$isa EL COL) and (#$genls COL SUPER), CYC will infer that (#$isa EL SUPER). Therefore, in practice one manually asserts only a small fraction of the #$isa assertions; the vast majority are inferred automatically by CYC.

#$genls: expresses a similar relationship for collections (generalization). (#$genls COL SUPER) means that SUPER is one of the supersets of COL. Both arguments must be elements of #$Collection. As with #$isa, CYC knows that #$genls is transitive; therefore, in practice one manually asserts only a small fraction of the #$genls assertions, since the rest are inferred automatically.
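The two inference rules described above (transitivity of #$genls, and #$isa distributing over #$genls) can be sketched as follows. This is a toy illustration, not the CYC inference engine: the TinyKB class, its method names and the constants Fido, Dog, Mammal and Cat are assumptions introduced for the example.

```python
from collections import defaultdict


class TinyKB:
    """Toy knowledge base that stores only directly asserted isa/genls facts."""

    def __init__(self):
        self._genls = defaultdict(set)  # collection -> directly asserted supersets
        self._isa = defaultdict(set)    # element -> directly asserted collections

    def assert_genls(self, col, super_col):
        self._genls[col].add(super_col)

    def assert_isa(self, el, col):
        self._isa[el].add(col)

    def all_genls(self, col):
        """All supersets of col, following genls transitively."""
        seen = set()
        stack = [col]
        while stack:
            current = stack.pop()
            for parent in self._genls.get(current, ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def isa(self, el, col):
        """True if (isa el col) is asserted or follows via genls."""
        for direct in self._isa.get(el, ()):
            if direct == col or col in self.all_genls(direct):
                return True
        return False


kb = TinyKB()
kb.assert_genls("Dog", "Mammal")    # illustrative constants, not taken from CYC
kb.assert_genls("Mammal", "Thing")
kb.assert_isa("Fido", "Dog")

print(kb.isa("Fido", "Thing"))        # True, although it was never asserted
print(sorted(kb.all_genls("Dog")))    # ['Mammal', 'Thing']
```

Only three assertions are stored; the remaining isa and genls facts are derived on demand, which is the reason given above for asserting only a small fraction of them manually.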

More details about the structure of the CYC ontology and about how the CYC knowledge base is constructed can be found at http://www.cyc.com.

3.2 Russell & Norvig's General Ontology

Yet another view of the structure of a general ontology is presented in Russell & Norvig's book [38]. Every category of their ontology (see Figure 2) is discussed in detail using example axioms.

An example of this ontology in KIF [18] can be found at http://ltsc.ieee.org/suo/ontologies/Russell-Norvig.txt.


[Figure 2: hierarchy diagram rooted at Anything. Nodes appearing in the diagram: AbstractObjects, Events, Sets, Numbers, RepresentationalObjects, Intervals, Places, PhysicalObjects, Processes, Categories, Sentences, Measurements, Moments, Things, Stuff, Times, Weights, Animals, Agents, Solid, Liquid, Gas, Humans.]

Figure 2: Russell & Norvig's general ontology structure [38]

3.3 Ontology Engineering

Ontology engineering is a field of artificial intelligence or computer science that is concerned with ontology creation and usage. Report [31], which proposes and comments on this field, declares that the ultimate purpose of ontology engineering should be to provide a basis for building models of all things in which computer science is interested.

3.3.1 Structure of Usage

According to [31], ontologies can be divided into the following subcategories from the point of view of knowledge reuse and ontology engineering. This is a structure of ontologies by their usage rather than a division of one general ontology. Some examples are included.

Workplace OntologyThis is an ontology for workplace which affects task characteristics by specifying severalboundary conditions which characterize and justify problem solving behaviour in theworkplace. Workplace and task ontologies collectively specify the context in whichdomain knowledge is intended and used during the problem solving.Examples from circuit troubleshooting: fidelity, efficiency, precision, high reliability.

Task OntologyTask ontology is a system of vocabulary for describing problem solving structure ofall the existing tasks domain independently. It does not cover the control structure. Itcovers components or primitives of unit inferences taking place during performing tasks.Task knowledge in turn specifies domain knowledge by giving roles to each objects andrelations between them.Examples from scheduling tasks: schedule recipient, schedule resource, goal, constraint,availability, load, select, assign, classify, remove, relax, add.


Domain ontology
A domain ontology can be either task dependent or task independent. A task-independent ontology usually relates to activities of objects.

  Task-dependent ontology
  A task structure requires not all the domain knowledge but some specific domain knowledge in a certain specific organization. This special type of domain knowledge can be called task-domain ontology because it depends on the task.
  Examples from job-shop scheduling: job, order, line, due date, machine availability, tardiness, load, cost.

  Task-independent ontology

    Activity-related ontology

      Object ontology. This ontology covers the structure, behaviour and function of the object.
      Examples from circuit boards: component, connection, line, chip, pin, gate, bus, state, role.

      Activity ontology.
      Examples from enterprise ontology: use, consume, produce, release, state, resource, commit, enable, complete, disable.

    Activity-independent ontology

      Field ontology. This ontology is related to the theories and principles which govern the domain. It contains primitive concepts appearing in the theories, and the relations, formulas, and units constituting the theories and principles.

      Units ontology.
      Examples: mole, kilogram, meter, ampere, radian.

      Engineering mathematics ontology.
      Examples: linear algebra, physical quantity, physical dimension, unit of measure, scalar quantity, physical components.

General or common ontology
Examples: things, events, time, space, causality or behaviour, function, etc.

3.3.2 Ontology Engineering Subfields

We can also divide ontologies from the point of view of ontology engineering as a field. The subjects that should be covered by ontology engineering are presented in [31]. They include basic issues in philosophy, knowledge representation, ontology design, standardization, EDI, reuse and sharing of knowledge, media integration, etc., which are the essential topics of future knowledge engineering. Of course, they should be constantly refined through further development of ontology engineering.

Basic Subfield

  Philosophy (Ontology, Meta-mathematics)
    Ontology, which philosophers have discussed since Aristotle, is discussed, as well as logic and meta-mathematics.


  Scientific philosophy
    Ontology is investigated from the point of view of physics, e.g., time, space, process, causality, etc.

  Knowledge representation
    Basic issues of knowledge representation, especially the representation of ontological matters, are discussed.

Subfield of Ontology Design

  General (common) ontology
    General ontologies such as time, space, process, causality, part/whole relation, etc. are designed. Both in-depth investigation of the meaning of every concept and relation and the formal representation of ontologies are discussed.

  Domain ontologies
    Various ontologies in, say, Plant, Electricity, Enterprise, etc. are designed.

Subfield of Common Sense Knowledge

  Parallel to general ontology design, common sense knowledge is investigated and collected, and knowledge bases of common sense are built.

Subfield of Standardization

  EDI (Electronic Data Interchange) and data element specification
    Standardization of primitive data elements which should be shared among people to enable fully automatic EDI.

  Basic semantic repository
    Standardization of primitive semantic elements which should be shared among people to enable knowledge sharing.

  Conceptual schema modeling facility (CSMF)

  Components for qualitative modeling
    Standardization of functional components such as pipe, valve, pump, boiler, register, battery, etc. for qualitative model building.

Subfield of Data or Knowledge Interchange

  Translation of ontology
    Methodologies for translating one ontology into another are developed.

  Database transformation
    Transformation of data in a database into another database with a different conceptual schema.

  Knowledge base transformation
    Transformation of a knowledge base into another one built on a different ontology.

Subfield of Knowledge Reuse

  Task ontology
    Design of an ontology for describing and modeling human ways of problem solving.


  T-domain ontology
    A task-dependent domain ontology is designed under some specific task context.

  Methodology for knowledge reuse
    Development of methodologies for knowledge reuse using the above two ontologies.

Subfield of Knowledge Sharing

  Communication protocol
    Development of communication protocols between agents which can behave cooperatively under a specified goal.

  Cooperative task ontology
    Task ontology design for cooperative communication.

Subfield of Media Integration

  Media ontology
    Ontologies of the structural aspects of documents, images, movies, etc. are designed.

  Common ontologies of content of the media
    Ontologies common to all media, such as those of human behavior, story, etc., are designed.

  Media integration
    A meaning representation language for media is developed, and media are integrated through understanding of media representation.

Subfield of Ontology Design Methodology

  Methodology

  Support environment

Subfield of Ontology Evaluation

  The designed ontologies are evaluated on real-world problems by forming a consortium.

4 Using Ontologies

From the above, we can see that an ontology can describe the upper part of a knowledge base. The distinction between an ontology and a knowledge base is that the ontology provides the basic structure or armature around which a knowledge base can be built. An ontology provides a set of concepts and terms for describing some domain, while a knowledge base uses those terms to represent what is true about some real or hypothetical world. Thus, a medical ontology might contain definitions for terms such as "leukemia" or "terminal illness", but it would not contain assertions that a particular patient has some disease, although a knowledge base might.

We can use the terms provided by the domain ontology to assert specific propositions about a domain or a situation in a domain. For example, in the electronic-device domain, we can represent a fact about a specific circuit, such as "circuit 35 has transistor 22 as a component", where circuit 35 is an instance of the concept circuit and transistor 22 is an instance of the concept transistor [8]. Another example, blocks on a table by Genesereth and Nilsson, is discussed in depth in [24] and [23], including a discussion of possible ontologies.

Once we have the basis for representing propositions, we can also represent knowledge involving propositional attitudes, such as hypothesize, believe, expect, hope, desire, fear, etc. Propositional attitude terms take propositions as arguments. For example, in the electronic-device domain, we can assert that "the diagnostician hypothesizes or believes that part 2 is broken", or that "the designer expects or desires that the power plant has an output of 20 megawatts" [8].

Thus, an ontology can represent beliefs, goals, hypotheses, and predictions about a domain, in addition to simple facts. The ontology also plays a role in describing things such as plans and activities, because these also require specification of world objects and relations. Propositional attitude terms are also part of a larger ontology of the world, useful especially in describing the activities and properties of the special class of objects in the world called intensional entities, for example agents such as humans who have mental states.
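The distinction between the ontology (the vocabulary) and the knowledge base (assertions that use the vocabulary) can be illustrated with a minimal Python sketch. All identifiers below (`CONCEPTS`, `Instance`, `check`, the relation names) are hypothetical and chosen only for illustration; they do not come from [8]. The sketch also shows a propositional attitude taking a proposition as its argument.

```python
from dataclasses import dataclass

# Ontology level: the vocabulary of concepts, relations and attitude terms.
# All names here are illustrative assumptions.
CONCEPTS = {"circuit", "transistor", "agent"}
RELATIONS = {"has_component", "is_broken"}
ATTITUDES = {"hypothesizes", "believes", "expects", "desires"}


@dataclass(frozen=True)
class Instance:
    name: str
    concept: str  # must be a concept defined by the ontology


@dataclass(frozen=True)
class Proposition:
    relation: str
    args: tuple


@dataclass(frozen=True)
class Attitude:
    kind: str           # a propositional attitude term
    holder: Instance    # the agent holding the attitude
    about: Proposition  # the proposition taken as an argument


def check(item) -> bool:
    """Return True if the item uses only vocabulary defined by the ontology."""
    if isinstance(item, Instance):
        return item.concept in CONCEPTS
    if isinstance(item, Proposition):
        return item.relation in RELATIONS and all(check(a) for a in item.args)
    if isinstance(item, Attitude):
        return item.kind in ATTITUDES and check(item.holder) and check(item.about)
    return False


# Knowledge-base level: assertions about one particular situation.
circuit_35 = Instance("circuit-35", "circuit")
transistor_22 = Instance("transistor-22", "transistor")
diagnostician = Instance("diagnostician", "agent")

fact = Proposition("has_component", (circuit_35, transistor_22))
belief = Attitude("believes", diagnostician,
                  Proposition("is_broken", (transistor_22,)))

knowledge_base = [fact, belief]
```

In this sketch the ontology never mentions circuit 35 or transistor 22; it only constrains which statements about them are well formed.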

4.1 Top-Level Ontology

Ontologies range in abstraction, from very general terms that form the foundation for knowledge representation in all domains, to terms that are restricted to specific knowledge domains. For example, space, time, parts, and subparts are terms that apply to practically all domains; malfunction applies to engineering or biological domains; and hepatitis applies only to medicine.

Even in cases where a task might seem to be quite domain-specific, knowledge representation might include an ontology that describes knowledge at higher levels of generality. For example, solving problems in the domain of turbines might require knowledge expressed using domain-general terms such as flows or causality. Such general-level descriptive terms are called the upper ontology or top-level ontology [8]. There are open research issues about the correct ways to analyze knowledge at the upper level. The upper parts of several different ontologies are shown in figure 1. For example, many ontologies have thing or entity as their root class. Figure 1 illustrates that the ontologies start to diverge at the level immediately below thing or entity. Some of the differences arise because not all of these ontologies are intended to be general-purpose tools, or even explicitly to be ontologies.

Although differences exist within ontologies, general agreement exists between ontologies on many issues [8]:

- There are objects in the world.
- Objects have properties or attributes that can take values.
- Objects can exist in various relations with other objects.
- Properties and relations can change over time.
- There are events that occur at different time instances.
- There are processes in which objects participate and that occur over time.
- The world and its objects can be in different states.
- Events can cause events or states as effects.
- Objects can have parts.

The representational repertoire of objects, relations, states, events, and processes does not say anything about which classes of these entities exist. The modeler of the domain makes these commitments.

4.2 Knowledge Sharing and Reuse

To support the sharing and reuse of formally represented knowledge among AI systems, it is useful to define the common vocabulary in which the shared knowledge or ontology is represented. There have been several attempts to create an engineering framework for constructing ontologies to support knowledge sharing and knowledge base reuse.

4.2.1 KIF

Michael R. Genesereth and Richard E. Fikes describe KIF (Knowledge Interchange Format), an enabling technology that facilitates expressing domain factual knowledge using a formalism based on augmented predicate calculus [18]. Knowledge Interchange Format is a computer-oriented language for the interchange of knowledge among disparate programs. Some important properties according to [18] are:

- it has declarative semantics, i.e. the meaning of expressions in the representation can be understood without appeal to an interpreter for manipulating those expressions
- it is logically comprehensive, i.e. it provides for the expression of arbitrary sentences in the first-order predicate calculus
- it provides for the representation of knowledge about the representation of knowledge
- it provides for the representation of nonmonotonic reasoning rules
- it provides for the definition of objects, functions, and relations

4.2.2 Ontolingua

Thomas R. Gruber has proposed a language called Ontolingua to help construct portable ontologies. In [22], an ontology, understood as definitions of classes, relations, functions, and other objects, is proposed as a specification of a representational vocabulary for a shared domain of discourse. The paper describes a mechanism for defining ontologies that are portable over representation systems. Definitions written in a standard format for predicate calculus are translated by a system called Ontolingua into specialized representations, including frame-based systems as well as relational languages (see figure 3). This allows researchers to share and reuse ontologies, while retaining the computational benefits of specialized implementations.

The architecture of Ontolingua is shown in figure 3. Ontolingua makes it possible to translate a common ontology into several others, so the same ontology can be used for different purposes in different systems. Instances of common representation idioms are recognized and transformed into a simpler, equivalent form using the second-order vocabulary from the Frame Ontology. These and other transformations result in a canonical form, which is given to back-end translators that generate system-specific output. A pure KIF output is also available to be given to other translators developed for KIF, such as KIF-to-Prolog.

Figure 3: Translation architecture for Ontolingua: translation from a common ontology into several others using the canonical form. Diagram labels: "off the shelf" ontology and Ontolingua definitions as input; parsing and syntax checking, recognition of idioms, and canonicalization (using the Frame Ontology) produce the canonical form; the pure KIF generator, LOOM translator, Epikit translator and other Ontolingua translators produce KIF sentences, LOOM T-Box descriptions, Epikit axioms and system-specific ontologies; other KIF translators produce data models, Prolog rules, etc.
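The translation flow of figure 3 (recognize an idiom, reduce it to a canonical form, hand the canonical form to several back ends) can be illustrated with a toy Python sketch. This is not Ontolingua's actual syntax or implementation: the input dictionary, the `SubclassOf` canonical form and both back ends are assumptions made up for the example, and only a single idiom (a subclass declaration) is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubclassOf:
    """Toy canonical form: every instance of `sub` is also an instance of `sup`."""
    sub: str
    sup: str


def canonicalize(definition: dict) -> list:
    """Recognize a frame-style idiom and rewrite it into canonical statements."""
    return [SubclassOf(definition["class"], parent)
            for parent in definition.get("subclass-of", [])]


def to_kif(stmt: SubclassOf) -> str:
    # Back end 1: a KIF-style implication sentence.
    return f"(=> ({stmt.sub} ?x) ({stmt.sup} ?x))"


def to_prolog(stmt: SubclassOf) -> str:
    # Back end 2: an equivalent Prolog rule.
    return f"{stmt.sup.lower()}(X) :- {stmt.sub.lower()}(X)."


BACK_ENDS = {"kif": to_kif, "prolog": to_prolog}


def translate(definition: dict, target: str) -> list:
    """Run the canonical statements through the chosen back-end translator."""
    return [BACK_ENDS[target](stmt) for stmt in canonicalize(definition)]


definition = {"class": "Transistor", "subclass-of": ["Component"]}
kif_out = translate(definition, "kif")
prolog_out = translate(definition, "prolog")
```

The point of the design is that adding a new target system only requires a new entry in `BACK_ENDS`; the shared definition and the canonicalization step stay untouched.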

4.2.3 Collaboration

Ontologies facilitate collaboration among computer systems, among people, and between computers and people. Given an ontology for a particular domain, it is possible to formalize exchanged or stored messages and so enable easier cooperation. Without such an ontology, the terms used may not have the same meaning for all parties, and the resulting confusion can lead to misunderstanding. Also, when messages have a formal form, computers can assist, e.g., in searching for needed information.

The ontology for these purposes can be developed by one expert. The ontology is then usually fixed for other users. The advantage is that such an ontology is more likely to be precise [35]. On the other hand, this approach requires some of the time of an expert and of a knowledge engineer. It is also possible to develop the ontology collaboratively. With this approach, everyone can contribute to the ontology. This is usually supported by web-based tools, such as Ontolingua Server [22]. The ontology then evolves according to actual needs, but this can lead to a sprawling and even inconsistent ontology.

One of the areas where common terms are needed for efficient communication is enterprise modeling. An enterprise ontology can support integration with existing and new tools and bring together different views. Such an ontology also facilitates communication and information reuse, which can be supported by computer tools. An example of a carefully designed enterprise ontology is described in [43]. Applications show that the enterprise ontology encourages the use of terms in a unified form, so that the results of one group can be more easily reused by another group.

A shared explicit conceptualization saves much effort whenever collaborators from different areas have to work together. The effect of the conceptualization is much bigger when the collaborators work together over large distances. The discussions usually consist of proposals that are then revised and accepted or refuted. The discussion is usually supported by sketches, diagrams and other documents, including formal structures. In the Tadzebao World Wide Design Lab [46], such a discussion is supported by a set of notepads that can contain documents, including ontologies. This tool even facilitates reusing previous solutions by retrieving them from other clients: a similar solution is searched for by a central case-based reasoning engine.

4.2.4 Particular Ontologies Reuse

There have been some attempts to develop ontologies for particular cases from which knowledge bases could then be constructed. One of the areas being explored is diagnostics: for example, a careful ontological analysis of the fault process and of categories of faults is described in [27]. Such an ontology can be the core of a knowledge base for solving any related problem.

Using such a common knowledge base, we can develop a deeper model of the system, not only a few shallow heuristic facts. This might make it possible to infer deeper causes of a fault. Another advantage is that a system with a model can easily explain its inferences.

The ontology can be reused comparatively easily for diagnostics of other similar systems. The knowledge engineer doesn't have to start from scratch, but can use concepts already predefined in the ontology. Also, because the same terms of the same ontology are used, knowledge bases from different sources can be more easily compared and possibly merged.

4.3 Communication in Multi-Agent Systems

Knowledge sharing and exchange is particularly important in multi-agent systems. An agent is usually described as a persistent entity with some degree of independence or autonomy that carries out some set of operations depending on what it perceives. An agent usually contains some level of intelligence, so it has to have some knowledge about its goals or desires. In multi-agent systems, an agent usually cooperates with other agents, so it should have some social and communicative abilities.

Each agent has to know something about the domain it is working in and also has to communicate with other agents. An agent is able to communicate only about things that can be expressed by some ontology. This ontology must be agreed upon and understood among the agent community (or at least among a part of it) in order to enable each agent to understand messages from other agents.


Figure 4: FIPA Agent Management Reference Model [15]. Diagram labels: Agent Platform containing an Agent, the Agent Management System, the Directory Facilitator, the Agent Communication Channel and the Internal Platform Message Transport; a second Agent Communication Channel outside the platform.

The ontology in a multi-agent system can be explicit or implicit. It is explicit when it is specified in declarative form, e.g. as a set of axioms and definitions. It is implicit when the assumptions on the meaning of its vocabulary are implicitly embedded in agents, i.e. in the software programs representing agents. The explicit form both enables and requires communication about the ontology, which can be modified when agents agree on that. The implicit form is fixed, so no communication about the ontology is required, but a change is impossible without reprogramming the agents.

In open systems, where agents designed by different programmers or organizations may enter into communication, the ontology must be explicit. In these environments, it is also necessary to have some standard mechanism to access and refer to explicitly defined ontologies. We will describe a recommendation for this published by FIPA. This recommendation is being widely accepted and is also implemented in some systems for constructing agents (see for example FIPA Open Source at http://www.nortelnetworks.com/products/announcements/fipa/index.html [37] or JADE at http://sharon.cselt.it/projects/jade/ [4]).

4.3.1 FIPA Agent Management Model

The Foundation for Intelligent Physical Agents (FIPA, http://www.fipa.org/) is a non-profit association registered in Geneva. It is formed by companies and organizations sharing the effort to produce specifications of generic agent technologies. FIPA's purpose is to make and publish internationally agreed specifications or recommendations for agent systems and also to promote agent-based applications, services and equipment.

The basic structure of a multi-agent system compliant with FIPA [15] is shown in figure 4. An agent is a fundamental actor on an agent platform. An agent platform (AP) consists of the machine(s), operating system(s), agent support software, FIPA-compliant agent management components (DF, AMS, ACC; see below), a proprietary internal platform message transport, and agents. The communication between agents on one proprietary platform can be carried out in different ways; however, according to FIPA, at least the agent management components must be able to communicate using FIPA-compliant communication protocols and languages.

Agent management components (see figure 4) consist of these agents [15]:

- Agent management system (AMS) is an agent that exerts supervisory control over access to and use of the agent platform. Exactly one AMS must exist in a single AP. The AMS maintains a directory of logical agent names and their associated transport addresses for an agent platform. The AMS offers "white pages" services to other agents.

- Agent communication channel (ACC) is an agent that provides the default communication methods between agents on different APs. One agent platform contains exactly one ACC agent.

- Directory facilitator (DF) is an agent that provides a "yellow pages" service to other agents. Agents may register their services with the DF and query the DF to find out what services are offered by other agents. One platform contains one main DF and can contain several other DFs that help the main DF.
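The difference between the "white pages" role of the AMS and the "yellow pages" role of the DF can be illustrated by a small Python sketch. The class and method names and the address format are assumptions made for this example; they are not the interfaces defined by the FIPA specification [15].

```python
class AgentManagementSystem:
    """White pages: maps logical agent names to transport addresses."""

    def __init__(self):
        self._addresses = {}

    def register(self, name: str, address: str) -> None:
        if name in self._addresses:
            raise ValueError(f"agent name already registered: {name}")
        self._addresses[name] = address

    def lookup(self, name: str) -> str:
        return self._addresses[name]


class DirectoryFacilitator:
    """Yellow pages: maps service descriptions to the agents offering them."""

    def __init__(self):
        self._services = {}

    def register(self, agent: str, service: str) -> None:
        self._services.setdefault(service, set()).add(agent)

    def search(self, service: str) -> set:
        # Return a copy so callers cannot modify the directory by accident.
        return set(self._services.get(service, set()))


ams = AgentManagementSystem()
df = DirectoryFacilitator()
ams.register("agent1", "tcp://host-a:5000")  # the address format is made up
ams.register("oa1", "tcp://host-b:5001")
df.register("oa1", "ontology-translation")

# First find who offers a service (DF), then where to reach them (AMS).
providers = df.search("ontology-translation")
addresses = [ams.lookup(name) for name in sorted(providers)]
```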

These agents must be able to use the FIPA agent communication language (ACL) [14], which is similar to KQML (Knowledge Query and Manipulation Language [2]). An example message in this language can be:

    (inform
      :sender agent1
      :receiver hpl-auction-server
      :content (price (bid good02) 150)
      :in-reply-to round-4
      :reply-with bid04
      :language sl
      :ontology hpl-auction
    )

As we can see, this language is content independent (see footnote 3), so it can be used in any system, but when communicating, an agent must specify how the content is to be interpreted. This specification is done via the ontology. As was already said, there can be several ontologies, and agents must be able to access and refer to them.
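To show how a receiver might separate the message envelope from its content, here is a minimal Python sketch that parses the example message as a generic s-expression and extracts the `:language` and `:ontology` parameters. It is only an illustration under simplifying assumptions (no quoted strings, no error handling); it is not a FIPA-conformant ACL parser, and the function names are made up.

```python
import re

MESSAGE = """
(inform
  :sender agent1
  :receiver hpl-auction-server
  :content (price (bid good02) 150)
  :in-reply-to round-4
  :reply-with bid04
  :language sl
  :ontology hpl-auction)
"""


def tokenize(text: str) -> list:
    # Parentheses are tokens of their own; everything else splits on whitespace.
    return re.findall(r"[()]|[^\s()]+", text)


def parse(tokens: list):
    """Parse one s-expression from the front of the token list (consumes it)."""
    token = tokens.pop(0)
    if token != "(":
        return token
    items = []
    while tokens[0] != ")":
        items.append(parse(tokens))
    tokens.pop(0)  # drop the closing parenthesis
    return items


def parse_acl(text: str) -> dict:
    """Split a message into its performative and its :keyword parameters."""
    performative, *rest = parse(tokenize(text))
    params = {}
    for key, value in zip(rest[0::2], rest[1::2]):
        params[key.lstrip(":")] = value
    return {"performative": performative, "params": params}


message = parse_acl(MESSAGE)
```

The content stays an uninterpreted nested list here; a receiver would hand it to whatever component understands the named content language and ontology.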

4.3.2 Ontology Service by FIPA

For ontology services, a dedicated ontology agent (OA) is proposed in the FIPA agent platform. The role of such an agent is to provide some or all of the following services [16]:

- discover public ontologies in order to access them
- help in selecting a shared ontology for communication
- maintain (e.g. register with the DF, upload, download, or modify) a set of public ontologies
- translate expressions between different ontologies and/or different content languages
- respond to queries about relationships between terms or between ontologies
- facilitate the identification of a shared ontology for communication between two agents

It is not mandatory for an ontology agent to provide all of these services, but every OA must be able to participate in a communication about these tasks. Also, it is not mandatory for every agent platform to contain an ontology agent, but when an ontology agent is present, it must comply with the FIPA specifications.

Footnote 3: Some content languages such as SL, CCL, KIF and RDF are described in the FIPA specification. An overview can be found in [17].

Example scenarios [16] of using particular OA services are:

Querying the OA for definitions of terms
A user interface agent A wants to receive pictures from picture-archiver agent B to show them to a user. It asks agent B for "citrus". However, agent B discovers that it doesn't have any picture with that description. So it asks the appropriate OA to obtain the sub-species of "citrus" within the given ontology. The OA answers B that "orange" and "lemon" are sub-species of "citrus", so agent B can send pictures with these descriptions to agent A and so satisfy its requirements.

Finding an equivalent ontology
An ontology designer declares the ontology "car-product" to the ontology agent OA2 in the U.S. in English terms and translates the same ontology into French for the ontology agent OA1 in France. Agent A2 uses the ontology from OA2 and wants to communicate with agent A1 about cars in the ontology maintained by OA2. Because agent A1 doesn't know the ontology of agent A2, it queries OA1 for an ontology equivalent to the one used by A2 (and maintained by OA2; see footnote 4). OA1 returns its French ontology about cars, so A1 can inform A2 that these two ontologies are equivalent and that OA1 can be used as a translator. After that, a dialogue between A1 and A2 can start.

Translation of terms
An agent A1 wants to translate a given term from an ontology #1 into the corresponding term in an ontology #2 (for example, the concept "the name of a part" can be called "name" in ontology #1 and "nomenclature" in ontology #2). A1 queries the DF for an OA which supports translation between these ontologies. The DF returns the name of an OA that knows the format of these ontologies (e.g. XML) and is capable of translating between them. A1 can then query this OA and request the translation of a term from ontology #1 to ontology #2.
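The first and third scenarios can be illustrated with a toy ontology agent in Python. The class `OntologyAgent`, its methods and the ontology name "fruit" are hypothetical; the FIPA specification [16] defines these services through agent messages, not through a programming interface like the one sketched here.

```python
class OntologyAgent:
    """Toy ontology agent (OA) holding sub-term links and term mappings."""

    def __init__(self):
        self._children = {}  # (ontology, term) -> set of direct sub-terms
        self._mappings = {}  # (source ontology, target ontology) -> {term: term}

    def add_subterm(self, ontology: str, parent: str, child: str) -> None:
        self._children.setdefault((ontology, parent), set()).add(child)

    def add_mapping(self, source: str, target: str, terms: dict) -> None:
        self._mappings[(source, target)] = dict(terms)

    def subterms(self, ontology: str, term: str) -> set:
        """Return all direct and indirect sub-terms of a term."""
        found, stack = set(), [term]
        while stack:
            for child in self._children.get((ontology, stack.pop()), ()):
                if child not in found:
                    found.add(child)
                    stack.append(child)
        return found

    def translate(self, source: str, target: str, term: str):
        """Return the corresponding term, or None if no mapping is known."""
        return self._mappings.get((source, target), {}).get(term)


oa = OntologyAgent()
oa.add_subterm("fruit", "citrus", "orange")
oa.add_subterm("fruit", "citrus", "lemon")
oa.add_mapping("ontology-1", "ontology-2", {"name": "nomenclature"})

# Agent B has no picture labelled "citrus", so it widens the search.
labels = oa.subterms("fruit", "citrus")
translated = oa.translate("ontology-1", "ontology-2", "name")
```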

4.3.3 Ontologies Relationships

In an open environment, agents may benefit from knowing that some relationships exist between ontologies, for instance to decide if and how to communicate with other agents. In the agent community, the ontology agent is in the best position to know this. It can then be queried for information about such relationships, and it can use this information for translation or for facilitating the selection of a shared ontology for agent communication. In the FIPA specification [16] the following relations are proposed:

- Extension: ontology O1 extends ontology O2.
  The ontology O1 extends or includes the ontology O2. Informally this means that all the symbols that are defined within O2 are found in O1, together with the restrictions, meanings and other axiomatic relations of these symbols from O2.

- Identical: ontologies O1 and O2 are identical.
  Vocabulary, axiomatization and the language are physically identical, but the name can be different.

- Equivalent: ontologies O1 and O2 are equivalent.
  Logical vocabulary and logical axiomatization are the same, but the language is different (e.g. XML and Ontolingua). When O1 and O2 are equivalent, they are strongly translatable in both directions.

- Strongly-translatable: source ontology O1 is strongly translatable to the target ontology O2.
  The vocabulary of O1 can be totally translated to the vocabulary of O2, the axiomatization from O1 holds in O2, there is no loss of information from O1 to O2, and no inconsistency is introduced. Note that the representation languages can still be different.

- Weakly-translatable: source ontology O1 is weakly translatable to the target ontology O2.
  The translation permits some loss of information (e.g. the terms are simplified in O2), but doesn't permit the introduction of inconsistency.

- Approx-translatable: source ontology O1 is approximately translatable to the target ontology O2.
  The translation permits even the introduction of inconsistencies, i.e. some of the relations are no longer valid and some constraints no longer apply.

Footnote 4: There is an ontology naming scheme described in [16] that makes it possible to identify these ontologies; for example, the English car ontology of agent OA2 can be named OA2@http://makers.ford.com/car-product

The problem of deciding whether two logical theories (as ontologies usually are) are related to each other is in general computationally very difficult. Therefore, knowing about these relationships often requires manual intervention, so a FIPA ontology agent should at least be able to maintain a database of these relationships.
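Such a manually maintained database of relationships could look roughly like the following Python sketch. The names `Relation`, `RelationshipStore`, `declare` and `known`, as well as the two ontology names, are assumptions made for the example. The only inference built in is the one stated above, that equivalent ontologies are strongly translatable in both directions; anything else must be declared explicitly, so a negative answer from `known` means "not recorded", not "does not hold".

```python
from enum import Enum


class Relation(Enum):
    EXTENSION = "extension"
    IDENTICAL = "identical"
    EQUIVALENT = "equivalent"
    STRONGLY_TRANSLATABLE = "strongly-translatable"
    WEAKLY_TRANSLATABLE = "weakly-translatable"
    APPROX_TRANSLATABLE = "approx-translatable"


class RelationshipStore:
    """Manually maintained table of relationships between named ontologies."""

    def __init__(self):
        self._facts = set()  # (relation, source ontology, target ontology)

    def declare(self, relation: Relation, o1: str, o2: str) -> None:
        self._facts.add((relation, o1, o2))
        if relation in (Relation.IDENTICAL, Relation.EQUIVALENT):
            # These two relations are stated for a pair, so store both directions.
            self._facts.add((relation, o2, o1))

    def known(self, relation: Relation, o1: str, o2: str) -> bool:
        """Return True if the relationship is recorded or directly derivable."""
        if (relation, o1, o2) in self._facts:
            return True
        if relation is Relation.STRONGLY_TRANSLATABLE:
            # Equivalent ontologies are strongly translatable in both directions.
            return (Relation.EQUIVALENT, o1, o2) in self._facts
        return False


store = RelationshipStore()
store.declare(Relation.EQUIVALENT, "car-product-en", "car-product-fr")
can_translate_back = store.known(
    Relation.STRONGLY_TRANSLATABLE, "car-product-fr", "car-product-en")
```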

4.3.4 FIPA Knowledge Model and Meta-Ontology

To allow agents to talk about knowledge and about ontologies, for instance to query for the definition of a concept or to define a new concept, a standard meta-ontology and knowledge model are necessary. This meta-ontology and knowledge model must be able to describe primitives such as concepts, attributes or relations.

For these purposes FIPA adopts the OKBC Knowledge Model [16]. OKBC, the Open Knowledge Base Connectivity [9], provides operations for manipulating knowledge expressed in an implicit representation formalism called the OKBC Knowledge Model. This knowledge model supports an object-oriented representation of knowledge and provides a set of representational constructs, and thus can serve as an interlingua for knowledge sharing and translation. The OKBC Knowledge Model includes constants, frames, slots, facets, classes, individuals, and knowledge bases. KIF [18] is used for the precise description of the model.

The OKBC knowledge model assumes a universe of discourse consisting of all entities about which knowledge is to be expressed. In every domain of discourse it is assumed that all constants of the following basic types are always defined: integers, floating point numbers, strings, symbols, lists, classes. It is also assumed that the logical constants true and false are included in every domain of discourse. Classes are sets of entities (see footnote 5), and all sets of entities are considered to be classes.

A frame is a primitive object that represents an entity in the domain of discourse. A frame is called a class frame when it represents a class, and an individual frame when it represents an individual. A frame has an associated set of slots, and each slot has an associated set of slot values. A slot in turn has an associated set of facets that place restrictions on its slot values. Slots and slot values can themselves be any entities in the domain of discourse, including frames. A class is a set of entities that are instances of that class (one entity can be an instance of multiple classes). A class is a type for those entities. Entities that are not classes are referred to as individuals. Class frames may have associated template slots and template facets, which are considered to be used in instances of subclasses of that class. Default values can also be defined. Each slot or facet may contain multiple values. There are three collection types: set, bag (unordered, multiple occurrences permitted), and list (ordered bag). A knowledge base (KB) is a collection of classes, individuals, frames, slots, slot values, facets, facet values, frame-slot associations, and frame-slot-facet associations. KBs are considered to be entities of the universe of discourse and are represented by frames. Standard classes, facets, and slots with specified names and semantics are defined to express frequently used entities. These are described in detail in [9] or [16].
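The relationship between frames, slots, facets and template slots can be made concrete with a simplified Python sketch. It is not the OKBC API: the class names, the `effective_slot` helper and the facet name `value-type` are illustrative assumptions, and the lookup considers only the classes a frame is directly an instance of (inheritance along a subclass hierarchy and default values are omitted).

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    values: list = field(default_factory=list)  # a slot may hold several values
    facets: dict = field(default_factory=dict)  # facet name -> restriction


@dataclass
class Frame:
    name: str
    is_class: bool = False
    instance_of: list = field(default_factory=list)     # names of class frames
    own_slots: dict = field(default_factory=dict)       # slot name -> Slot
    template_slots: dict = field(default_factory=dict)  # used on class frames


def effective_slot(kb: dict, frame: Frame, slot_name: str):
    """Return the frame's own slot, else a template slot from one of its classes."""
    if slot_name in frame.own_slots:
        return frame.own_slots[slot_name]
    for class_name in frame.instance_of:
        template = kb[class_name].template_slots.get(slot_name)
        if template is not None:
            return template
    return None


# A knowledge base is modelled here simply as a dictionary of frames by name.
kb = {
    "transistor": Frame(
        "transistor",
        is_class=True,
        template_slots={"pins": Slot(values=[3], facets={"value-type": "integer"})},
    ),
    "transistor-22": Frame(
        "transistor-22",
        instance_of=["transistor"],
        own_slots={"part-of": Slot(values=["circuit-35"])},
    ),
}

pins = effective_slot(kb, kb["transistor-22"], "pins")
```

The individual frame `transistor-22` stores only what is specific to it, while the value of `pins` comes from the template slot of its class frame.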

Knowledge bases or ontologies conforming to OKBC are often expressed in KIF [18] or RDF [28]. An example of an OKBC-compliant editor of knowledge bases or ontologies that supports both of these formats is Protege-2000 [19].

The FIPA specification [16] defines the ontology FIPA-meta-ontology, based on the OKBC Knowledge Model, to describe ontologies. This ontology must be used by an agent when it talks about ontologies. The ontology FIPA-Ontol-service-Ontology must be used when requesting services of an ontology agent. This ontology extends the basic FIPA-meta-ontology by symbols that enable the manipulation of ontologies. These ontologies are described in detail in [16].

4.4 Natural Language Understanding

One of the AI fields that depend on a rich body of knowledge is natural language understanding (NLU) or natural language processing (NLP). Ontologies are useful in NLU in two ways. First, domain knowledge often plays a crucial role in disambiguation. A well-designed domain ontology provides the basis for domain knowledge representation. In addition, the ontology of a domain helps identify the semantic categories that are involved in understanding discourse in the domain. For this use, the ontology plays the role of a concept dictionary. In general, for NLU we need both a general-purpose upper ontology and a domain-specific ontology that focuses on the domain of discourse (such as military communications or business stories). CYC [45], WordNet [3] and Sensus are examples of sharable ontologies that have been used for language understanding. NLU is one area that strongly motivates the work on ontologies: even CYC, which was originally motivated by the need for knowledge systems to have common world knowledge, has been tested more in natural language areas than in knowledge systems applications [8].

Footnote 5: The term class is used synonymously with the term concept as used in description logic.


4.4.1 CYC NLP

The natural language system developed with CYC at Cycorp is unique in having access to a very large, declaratively represented common sense knowledge base. CYC helps the natural language system handle word or phrase disambiguation, and also provides a target internal representation language (CycL [10]) that can be used for further processing, such as inference. A substantial portion of the CYC natural language processing system (the lexicon and many semantic rules) is actually represented in the CYC knowledge base. Syntactic parsing is carried out by applying phrase-structure rules to an input string. Semantic rules are applied to the output of the syntax module. It is in the application of the semantic rules that the knowledge in the knowledge base proves especially advantageous.

Most of the CYC pilot applications developed in the recent past have some NL component in their interfaces. The captioned image retrieval application [45], for example, accepts queries in English and allows captioners to describe new images to the system using English sentences. The CYC NL team is currently expanding the lexicon, extending the parser, and adding new semantic capabilities to the system.

4.4.2 WordNet Database

Probably the most widely used linguistic database (and linguistic ontology) for natural language processing is WordNet [3], developed at Princeton University. WordNet is a lexical reference system whose design is inspired by current psycholinguistic theories of human lexical memory. English nouns, verbs, adjectives and adverbs are organized into synonym sets, each representing one underlying lexical concept. WordNet includes several environments for manipulating its database.

Each synonym set (called a synset) is usually expressed as a unique combination of synonymous words. In general, each word can be associated with more than one synset and more than one lexical category. Different relations link the synonym sets; examples are hypernymy, hyponymy, and antonymy. The first can be roughly likened to the usual subsumption relation, while the last links together opposite or inverse terms, such as tall/short.

WordNet offers two distinct services: a vocabulary, which describes the various word senses, and an ontology, which describes the semantic relationships among senses. Both services can be used in natural language systems, such as WWW-based systems (see below).
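The two services can be illustrated with a minimal Python sketch. The `Synset` class, the helper functions and the miniature lexicon below are hypothetical and serve only as an illustration; they are not real WordNet data, nor the interface of any WordNet tool.

```python
from dataclasses import dataclass, field


@dataclass
class Synset:
    """A toy synset: synonymous words plus links to more general synsets."""
    name: str
    words: frozenset
    hypernyms: list = field(default_factory=list)


def synsets_for(word, lexicon):
    """Vocabulary service: list every synset (word sense) a word belongs to."""
    return [s for s in lexicon if word in s.words]


def hypernym_closure(synset):
    """Ontology service: names of all synsets subsuming `synset`, nearest first."""
    found, queue = [], list(synset.hypernyms)
    while queue:
        current = queue.pop(0)
        if current.name not in found:
            found.append(current.name)
            queue.extend(current.hypernyms)
    return found


# Hypothetical miniature lexicon (assumption, not real WordNet content).
entity = Synset("entity", frozenset({"entity"}))
animal = Synset("animal", frozenset({"animal", "beast"}), [entity])
artifact = Synset("artifact", frozenset({"artifact"}), [entity])
bat_animal = Synset("bat (mammal)", frozenset({"bat", "chiropteran"}), [animal])
bat_club = Synset("bat (club)", frozenset({"bat", "club"}), [artifact])
LEXICON = [entity, animal, artifact, bat_animal, bat_club]

print([s.name for s in synsets_for("bat", LEXICON)])
print(hypernym_closure(bat_animal))
```

In this sketch the word "bat" belongs to two synsets, and following the hypernym links from one of them yields the concepts that subsume it.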

Currently, an effort is being made to translate WordNet into several European languages in the EuroWordNet project (see http://www.hum.uva.nl/~ewn/). For some languages, such as Dutch, Italian, Spanish, German, French, Czech and Estonian, the WordNet database is already available.

4.5 Document Search and Ontologies

Finding a document in a large database is often not easy. For example, the World Wide Web is such a big knowledge base that finding a needed piece of information is sometimes very time consuming. The search is usually entered in the form of a few keywords, which are then looked up. This is, however, not sufficient for today's web; some kind of more content-based search is needed. Structured content representations coupled with linguistic ontologies can increase both recall and precision of the search.

Ontologies can be used in many ways for these purposes. They can support searching or knowledge mining from textual web sources, and they can be used to specify the content of the pages or to standardize content and query vocabulary. In this section we will also briefly describe applications for standardization of materials that need not necessarily be shared over the WWW, such as structured documents for business-to-business (B2B) information exchange.

4.5.1 OntoSeek

OntoSeek [25] is a tool intended for searching documents in product catalogs, such as yellow pages. Structured content representations coupled with linguistic ontologies are used to increase both recall and precision of the retrieval. OntoSeek adopts a language of limited expressiveness for content representation. It uses a large ontology based on WordNet (see above) for content matching.

According to [25], we can distinguish three areas of information retrieval:

Text retrieval: the goal is to find relevant documents in a large collection in response to a user's query expressed as a sequence of words. The user does not know much about the collection content, therefore a precise semantic match between the query and the relevant documents is not very important. Also, it is assumed that the textual quality of the documents is good.

Data retrieval: both the queries and the data to be retrieved are encoded by a structured list of words, acting as values of a set of attributes established by the system's designer. These words usually belong to a fixed taxonomy. However, large taxonomies can be hard to design and difficult to maintain.

Knowledge retrieval: here the query and data-encoding language is much more expressive. This results in increased precision, because the user can accurately represent the data's content structure and formulate sophisticated queries. An arbitrary description can then be formed and matched on the basis of an ontology of primitive concepts and relations. However, this forces users to adopt a language that could be too expressive for their purposes.

The last area is particularly interesting. In knowledge retrieval systems, an ontology provides the primitives needed to formulate queries and resource descriptions. Simple ontologies, such as keyword hierarchies, might also be useful for text and data retrieval techniques.

OntoSeek uses the WordNet and Sensus ontologies for matching queries and content. For content description (derived from natural language and refined by the user), OntoSeek uses lexical conceptual graphs (LCGs), which are simplified variants of Sowa's conceptual graphs in that they permit only lexical natural language relations between concepts (no ad hoc relations such as part-of are allowed).

OntoSeek works as follows. In the encoding phase, a resource description is converted into an LCG with the help of the user interface. The resulting LCG is stored in a database. When querying, the query description is converted into an LCG with variables at the places where the user expects a URL as an answer. Then the database is searched for matching graphs and the results are returned. Note that if the query language changes but the underlying ontology for constructing LCGs remains the same, the system will continue to work correctly; for example, if we replace the WordNet database with the EuroWordNet database for some language, we have localized the system.
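The encode, store and match cycle can be sketched in a few lines of Python. This is only an illustration under strong assumptions: graphs are reduced to sets of (concept, relation, concept) arcs, the taxonomy `IS_A` stands in for the WordNet-based ontology, and the URLs and matching rule are hypothetical rather than OntoSeek's actual algorithm.

```python
# Toy taxonomy standing in for the linguistic ontology: child -> parent.
IS_A = {"sedan": "car", "car": "vehicle", "radio": "device"}


def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    while specific is not None:
        if specific == general:
            return True
        specific = IS_A.get(specific)
    return False


# Encoding phase: each resource is stored as (URL, set of arcs).
DATABASE = [
    ("http://example.org/a", {("sedan", "with", "radio")}),
    ("http://example.org/b", {("bicycle", "with", "bell")}),
]


def matches(query_arcs, resource_arcs):
    """Every query arc must be covered by some arc of the resource."""
    return all(
        any(
            rel == q_rel and subsumes(q_src, src) and subsumes(q_dst, dst)
            for (src, rel, dst) in resource_arcs
        )
        for (q_src, q_rel, q_dst) in query_arcs
    )


def search(query_arcs):
    """Query phase: the URL acts as the variable that matching graphs bind."""
    return [url for url, arcs in DATABASE if matches(query_arcs, arcs)]


print(search({("car", "with", "device")}))
```

Because matching relies only on the taxonomy, replacing `IS_A` with a taxonomy for another language would leave `matches` and `search` unchanged, which mirrors the localization remark above.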

4.5.2 WebKB

Similar techniques for content matching are used in the project WebKB [30] (available at http://meganesia.int.gu.edu.au/~phmartin/WebKB/). Here, however, the content description is embedded directly in a web page. The description of a (part of) text is enclosed between a pair of dedicated HTML markers. Because it is possible to use several formal languages to describe the content, the representation language must also be specified (e.g. for conceptual graphs). It is also possible to use alternative ways, such as embedding the description in the ALT property of images to describe the content of images.

The formal description languages in WebKB include KIF [18], conceptual graphs, and the Resource Description Framework (RDF) [28]. All of these assume that some ontology is imported to distinguish between different concepts; however, it is possible to suppress the check whether the used concepts are already defined. Simpler notations that are not so expressive are also supported. These include structured text (with delimiters like :,=>,[Person]->(Chrc)->[Age:*a];

}];

The WebKB tool provides a way to interpret these descriptions. It also provides an interface to Unix-like text processing commands to exploit web-accessible documents or databases and process them, for example to query them.

4.5.3 Knowledge Representation Techniques

The current HTML standard in which WWW pages are written does not provide a way to embed a machine-understandable description of the content in the page. HTML was intended to provide a way to format documents so that they are readable for humans. There are, however, attempts to overcome this, either by extending the standard or by using existing standard markers for special purposes.

One of the oldest ways of using HTML-based semantic markup to briefly describe the content of a page is to use HTML tags. In this way, we can embed a description of the whole page, which can then be processed by indexing or search agents. However, this does not provide any easily machine-usable description. It is also usually used only for description of the whole page. There is a way to describe parts of the text using these tags and SPAN tags, as shown in [44], but this is not a standardized way. Another proposed, unstandardized way is to use Cascading Style Sheets (CSS), which could be used not only for formatting but also for semantic markup.

A standard that makes it possible to embed any descriptive tree structure into a document is the Extensible Markup Language (XML) [5]. XML is in fact a meta-markup language, since it allows custom tags with a desired syntactic structure to be defined via a DTD (Document Type Definition). The DTD can be viewed as an ontology specification. There are many proposals of DTDs for specific areas; examples are CML (Chemical Markup Language) and OML (Ontology Markup Language). The lexical structure of an XML document is a tree, but using shared identifiers it is possible to encode an arbitrary graph.
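The role of shared identifiers can be shown with a short Python sketch using the standard `xml.etree.ElementTree` module. The element and attribute names (`molecule`, `atom`, `bond`, `id`, `from`, `to`) are invented for this illustration and are not taken from CML or any other real DTD.

```python
import xml.etree.ElementTree as ET

# A hypothetical document: lexically a tree, but the bond elements refer
# to atoms by identifier, so the described structure is a graph.
DOC = """
<molecule>
  <atom id="a1" element="O"/>
  <atom id="a2" element="H"/>
  <atom id="a3" element="H"/>
  <bond from="a1" to="a2"/>
  <bond from="a1" to="a3"/>
</molecule>
"""

root = ET.fromstring(DOC)

# Index the atoms by their identifier.
atoms = {a.get("id"): a.get("element") for a in root.iter("atom")}

# Resolve the identifier references carried by each bond.
bonds = [(atoms[b.get("from")], atoms[b.get("to")]) for b in root.iter("bond")]

print(bonds)
```

The parser only sees nested elements; it is the application that interprets the `from` and `to` attributes as references, which is the kind of agreement a DTD (or an ontology) makes explicit.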

The W3C-supported recommendation for semantic markup is the Resource Description Framework (RDF) [28]. The data model of RDF provides three entity types:

Subject: an entity that can be referred to by an address on the WWW. Subjects are the elements that are described by RDF objects.

Predicate: defines a binary relation between subjects and/or atomic values provided by primitive data type definitions in XML.

Object: specifies a value of a subject property. That is, objects provide the actual characterization of the WWW documents.
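The three entity types can be pictured as simple triples. The following Python sketch is only an illustration: the `Statement` tuple, the property names and the example addresses are assumptions, and no real RDF library or RDF syntax is used.

```python
from collections import namedtuple

# One statement = (subject, predicate, object).
Statement = namedtuple("Statement", ["subject", "predicate", "object"])

# Hypothetical statements about two web documents.
statements = [
    Statement("http://example.org/report.html", "creator", "M. Author"),
    Statement("http://example.org/report.html", "topic", "ontologies"),
    Statement("http://example.org/other.html", "topic", "agents"),
]


def describe(subject, store):
    """Collect all property/value pairs asserted about one subject."""
    return {s.predicate: s.object for s in store if s.subject == subject}


def subjects_with(predicate, value, store):
    """Find the subjects for which a given property has a given value."""
    return [s.subject for s in store
            if s.predicate == predicate and s.object == value]


print(describe("http://example.org/report.html", statements))
print(subjects_with("topic", "agents", statements))
```

A search agent working over such statements can answer property-based queries without parsing the human-readable text of the pages.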

There are alternative approaches using XML, such as SHOE or Ontobroker. SHOE (http://www.cs.umd.edu/projects/plus/SHOE/), a derivative of OML, provides tags for constructing ontologies and tags for annotating web documents. It does not provide any standard top-level ontology, but it is possible to choose from several offered ontologies. Ontobroker (http://ontobroker.aifb.uni-karlsruhe.de/) uses an ontology based on frame logic to describe knowledge and to express queries. It includes an inference engine that can derive additional knowledge through inference rules. These two approaches have in common that they make it possible to express inferential knowledge, i.e. to express relationships between entities in ontologies. These relationships can be used to form sophisticated queries or to infer new knowledge from existing web pages.

The recently started DARPA program called DAML (DARPA Agent Mark-Up Language) [1] aims to create a mark-up language built upon XML that would allow users to provide machine-readable semantic annotations for specific communities of interest, together with agent-based tools using this language. The program builds on the results and proposals mentioned above. So far the DAML results consist of the first proposal of the DAML language. The DAML language is a frame-based language based on RDF [28] and includes a basic top ontology for describing basic entities and ontologies.

4.5.4 Document Enrichment

So far we have been concerned mostly with techniques for representing document contents in a machine-readable form so that they can be used for search. Ontologies can, however, be used for expressing not only the contents of documents, but also relations between documents. These relations can explicitly express what is usually expressed only implicitly in the contents of the document. For example, in scholarly publications [40] a publication may support or refute ideas expressed in another publication. Such a relationship with other documents adds new information to the document, and so this process is called document enrichment [41, 35].

This additional information can be used for discovering relevant documents or, for example, for finding potentially interesting gaps in the available documents, such as finding possibly interesting themes for articles in a news system [12].

The particular relations between documents have to conform to an explicit structure, so that new documents can be added to the structure and the structure can be used. This explicit structure is expressed via an ontology for a particular document domain. The whole process of working with the documents is then ontology-driven [35]. The steps of the methodology described in [35] are:

1. Identify use scenario

2. Characterize viewpoint for ontology

3. Develop the ontology

4. Perform ontology-driven model construction

5. Customize query interface for semantic knowledge retrieval

6. Develop additional reasoning services on top of knowledge model

The first three steps are performed by system authoring experts, because the quality of the document relationship model depends crucially on the quality of the ontology, which determines what can be expressed. According to [35], the authors should focus on usability rather than on reusability. A reusable ontology often includes all potential aspects that could ever arise; however, having to consider all these aspects when submitting a new document to a document structure could easily discourage system users. A usable ontology, on the other hand, includes only highly relevant aspects that are easy to understand. The ontology is expressed in the OCML language (Operational Conceptual Modelling Language) [34] and used for the rest of the steps above. For constructing ontologies, a special web-based tool, WebOnto, can be used. The fourth step, construction of the model based on the ontology (i.e. the document structure), is performed by casual users in a distributed way. When submitting a new document, an author uses an environment for defining the relations to other documents or to other aspects of the modeled world. This form-based environment, called Knote, is dynamically generated from the actual ontology. The fifth step, querying the database, is performed using Lois, a form-based interface for knowledge retrieval that is also created automatically once the key classes for a knowledge model have been specified.

This methodology, summarized in [35], has been used in several domains, such as enriching news stories [12], supporting scholarly debate [40] and knowledge management of medical guidelines.

The news server Planet-Onto [12] uses an ontology of events and classes such as people, organizations, stories, projects and technologies. In addition to enriching and searching stories, two intelligent agents have been designed (but not implemented). NewsHound would gather data about popular news items and could thus solicit potentially popular stories by identifying gaps in the knowledge base. NewsBoy would provide a personalized service for finding new stories according to a user profile.

The digital library server ScholOnto [40] makes it possible to contextualize ideas in relation to the literature. The ontology consists of contribution elements and relationships, which are further divided into argumentation and non-argumentation links. The ontology is designed to support scholars in making claims by asserting relationships between concepts. Other scholars may support, raise-issue-with or refute these claims.
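Typed relations between documents can be sketched as labelled links. In the following Python illustration the three relation names come from the text above, while the document names, the `LINKS` data and both helper functions are hypothetical and do not reflect the actual implementation of any of the systems mentioned.

```python
# Each link: (source document, relation, target document).
LINKS = [
    ("paper_B", "supports", "paper_A"),
    ("paper_C", "refutes", "paper_A"),
    ("paper_D", "raises-issue-with", "paper_B"),
]


def responses_to(document, links, relation=None):
    """List (source, relation) pairs that point at `document`."""
    return [
        (src, rel)
        for (src, rel, dst) in links
        if dst == document and (relation is None or rel == relation)
    ]


def uncontested(documents, links):
    """Documents that no other document refutes or raises an issue with."""
    challenged = {
        dst for (_, rel, dst) in links
        if rel in ("refutes", "raises-issue-with")
    }
    return [d for d in documents if d not in challenged]


print(responses_to("paper_A", LINKS))
print(uncontested(["paper_A", "paper_B", "paper_C", "paper_D"], LINKS))
```

Queries of this kind use information that is present only in the links, not in the text of the individual documents, which is the point of document enrichment.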

Document enrichment can also be used to support organizational learning. The project Enrich [41] tries to support discussions structured along documents and concepts for learning. It is based on the fact that learning is more efficient when the acquired knowledge is immediately applied. The documents and domain concepts can be incrementally enriched by users, which supports further group learning.

4.6 Educational Systems

Computer-based educational or tutoring systems allow users to take a lesson without any constraints of time, place, or teacher availability. Traditional computer-aided instruction tools provide just a set of static pages with text or, in better cases, some simulators usable for one specific purpose. The courses delivered by these systems can usually be adapted and personalized only by a teacher, the editor of the course. Also, these tools have the teaching strategies (if any) encoded directly in the taught content (such as: when the user presses button A, go to screen 23). Obviously, it is practically impossible to reuse knowledge encoded in such a system. It is necessary to have the content knowledge separated from the system as well as from the teaching strategies and so on.

Intelligent tutoring systems (ITS, also called knowledge-based tutors) are computer-based instructional systems that have separate databases, or knowledge bases, for instructional content (specifying what to teach) and for teaching strategies (specifying how to teach), and that attempt to use inferences about a student's mastery of topics to dynamically adapt instruction. The adaptation of teaching materials for a particular student is usually done by artificial intelligence techniques. An ITS usually needs to model the student and, from the perceived model, plan the next actions that would be best for that particular student.
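The separation of content, teaching strategy and student model can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the topics, the mastery threshold, the update rule and the function names do not come from any particular tutoring system.

```python
# Content knowledge (what to teach): topic -> prerequisite topics.
CONTENT = {
    "fractions": [],
    "decimals": ["fractions"],
    "percentages": ["decimals"],
}


def next_topic(mastery, threshold=0.7):
    """Teaching strategy (how to teach): choose the first unmastered
    topic whose prerequisites are all mastered."""
    for topic, prerequisites in CONTENT.items():
        if mastery.get(topic, 0.0) >= threshold:
            continue
        if all(mastery.get(p, 0.0) >= threshold for p in prerequisites):
            return topic
    return None


def update_mastery(mastery, topic, correct, rate=0.3):
    """Student model: move the mastery estimate toward 1 after a correct
    answer and toward 0 after an incorrect one."""
    old = mastery.get(topic, 0.0)
    target = 1.0 if correct else 0.0
    return {**mastery, topic: old + rate * (target - old)}


student = {"fractions": 0.9}
print(next_topic(student))
```

Because `CONTENT` is kept apart from `next_topic`, the same strategy could in principle be reused with different course content, and the same content with a different strategy.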

Until now, these systems have usually been built from scratch, without reusing any knowledge or parts of other tutoring systems. Today, however, there are attempts to overcome this by establishing a common methodology or frameworks that enable knowledge reuse and so speed up and ease the development of such systems. Establishing common ontologies seems to be the best way to achieve this goal. Some other advantages are discussed below.

4.6.1 EON

One of the pioneering works is EON [36], a collection of tools for authoring content, instructional strategy, student model and interface design. These authoring tools make it possible to store all the tutoring materials in a separate, and thus more reusable, form.

When editing a network of topics for storing course content, it is necessary to define an ontology of the network for a particular course area. That ontology can then be reused for another course in the same area. Also, the content, with all of its properties, can easily be transferred to another course with the same ontology. Other parts of the tutoring system are treated in the same way: it is necessary to define some concepts and constraints under which one will work, and then it is possible to use them to specify the particular behaviour.

4.6.2 ABITS

Another attempt to create a reusable framework is ABITS [6], an agent-based intelligent tutoring system. The knowledge taught is organized and indexed according to the so-called Learning Object Metadata [26], which specifies properties and constraints of objects that can be regarded as teachable entities. For further description, the Resource Description Framework [28] is used. Using a common format with the same ontology would facilitate sharing or transferring the knowledge to other tutoring systems. The topic structure is modeled via conceptual graphs.

User modeling covers the cognitive state and learning preferences. Both are expressed as fuzzy numbers. The ABITS architecture consists of three types of agents: evaluation agents, which take care of the cognitive state of the student; affective agents, which evaluate learning preferences; and pedagogical agents, which update the curriculum (the current plan for teaching). All these agents communicate with the database connected to their type and with agents of other types.

4.6.3 Other Proposals

There are currently proposals and standards for some aspects of intelligent tutoring systems. The IEEE initiative to specify Learning Object Metadata [26] was already mentioned.

A partial task ontology for intelligent educational systems is proposed in [33]. Such an ontology provides a vocabulary in terms of which existing systems can be compared and makes it possible to accumulate research results. Other advantages are that educational tasks can be formalized and that it is possible to create reusable parts of tutoring systems. Also, it is possible to standardize the communication protocol among the component agents of an ITS. Various entities, methods and concepts are analyzed to create an ontology that could be used for any ITS. A collection of other proposals of ontologies and their usage for several aspects of intelligent educational systems, such as the student model and preferences, curriculum ontology, task ontology, and others, can be found in [32].

Other initiatives propose ontologies for particular areas that could be used for teaching special knowledge, such as simulation of physical systems. These ontologies have much in common with the previously mentioned special ontologies and even top-level ontologies; they are often derived from them.

5 Conclusion

From the presented survey it follows that ontologies will fundamentally change the way in which systems are constructed. Today, knowledge bases are still built with little sharing or reuse; almost each one starts from a blank slate. In the future, intelligent systems developers will have libraries of ontologies at their disposal. Rather than building from scratch, they will assemble knowledge bases from components drawn from the libraries. This should greatly decrease development time while improving the robustness and reliability of the resulting knowledge bases.

A new field within knowledge engineering has been proposed, called ontology engineering [31]. This field is concerned with all theoretical and practical aspects of ontologies, such as ontology development, maintenance, reuse etc. We have discussed some of the areas of interest within this discipline.

An ontology can be viewed (among other possibilities) as the topmost part of a knowledge base. This upper part usually encodes general or common sense knowledge, such as that objects can have properties. For larger applications that are not strictly focused on one simple thing, such knowledge can be very extensive and difficult to maintain. One of the examples shown is the CYC project [45], which claims to encode a substantial part of people's common sense knowledge.

We have shown applications of ontologies in several areas. The first applications, which started the broader interest in ontologies, were those in knowledge sharing and reuse [22]. Ontologies here help to express a commonly accepted conceptualization of domains. The worlds in these domains can be described in various ways or languages, but if all of them conform to one ontology, it is much easier to provide translation between different descriptions. The ontology serves here as an interlingua, that is, as a common language for translations (we do not have to construct translators for every pair of languages, since translators between each language and the common interlingua are enough to provide any translation). This facilitates the sharing of knowledge and also its reuse in various systems that do not have to use exactly the same internal representation of the world.
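The saving brought by an interlingua can be made concrete with a small calculation. The Python sketch below assumes that translators are directed (one per source and target language), which is an assumption of this illustration rather than something stated above.

```python
def pairwise_translators(n_languages):
    """One directed translator for every ordered pair of distinct languages."""
    return n_languages * (n_languages - 1)


def interlingua_translators(n_languages):
    """One translator into and one out of the interlingua per language."""
    return 2 * n_languages


for n in (3, 5, 10):
    print(n, pairwise_translators(n), interlingua_translators(n))
```

Under this assumption, ten representation languages need 90 pairwise translators but only 20 translators when a shared ontology serves as the interlingua; the pairwise count grows quadratically, the interlingua count linearly.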

An area that is tightly related to knowledge sharing and reuse is communication in multi-agent systems. If agents are to communicate, they have to use a commonly agreed and understood syntax and semantics of the messages. If this is expressed explicitly as an ontology, it is possible to manipulate various forms of communication. Fo
