Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Page 1: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Barry Smith

http://ontology.buffalo.edu/smith

with thanks to

Werner Ceusters, Waclaw Kusnierczyk, Daniel Schober

Page 2: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Problem of ensuring sensible cooperation in a massively interdisciplinary community

concept

type

instance

model

representation

data

Page 3: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

What do these mean?

‘conceptual data model’

‘semantic knowledge model’

‘reference information model’

‘an ontology is a specification of a conceptualization’

Page 4: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Page 5: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

natural language labels

to make the data cognitively accessible to human beings

and algorithmically tractable

Page 6: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

compare: legends for maps

Page 7: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

ontologies are legends for data

Page 8: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

compare: legends for cartoons

Page 9: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

legends

help human beings use and understand complex representations of reality

help human beings create useful complex representations of reality

help computers process complex representations of reality

Page 10: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

computationally tractable legends

help human beings find things in very large complex representations of reality

Page 11: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

x_i = vector of measurements of gene i

k = the state of the gene (as “on” or “off”)

θ_i = set of parameters of the Gaussian model

...

legends for mathematical equations

Page 12: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Glue-ability / integration

rests on the existence of a common benchmark called ‘reality’

the ontologies we want to glue together are representations of what exists in the world

not of what exists in the heads of different groups of people

Page 13: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

truth is correspondence to reality

Page 14: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

simple representations can be true

Page 15: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

a network diagram can be a veridical representation of reality

Page 16: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Page 17: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

maps may be correct by reflecting topology, rather than geometry

Page 18: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

a labeled image can be a more useful veridical representation of reality

an image can be a veridical representation of reality

Page 19: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

an image labeled with computationally tractable labels can be an even more useful veridical representation of reality

Page 20: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

annotations using common ontologies can yield integration of image data

Page 21: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

if you’re going to semantically annotate piles of data, better work out how to do it right from the start

Page 22: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

two kinds of annotations

Page 23: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

names of types

Page 24: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

names of instances

Page 25: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

First basic distinction

type vs. instance

(science text vs. diary)

(human being vs. Tom Cruise)

Page 26: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

For ontologies

it is generalizations that are important = ontologies are about types, kinds

Page 27: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Ontology types Instances

Page 28: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Ontology = A Representation of types

Page 29: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

An ontology is a representation of types

We learn about types in reality from looking at the results of scientific experiments in the form of scientific theories

experiments relate to what is particular

science describes what is general

Page 30: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

There are created types

bicycle

steering wheel

aspirin

Ford Pinto

we learn about these by looking at manufacturers’ catalogues

Page 31: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

measurement units are created types

Page 32: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Inventory vs. Catalog

Two kinds of representational artifact

Roughly:

Databases represent instances

Ontologies represent types

Page 33: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

A 515287 DC3300 Dust Collector Fan

B 521683 Gilmer Belt

C 521682 Motor Drive Belt

Catalog vs. inventory
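
The catalog/inventory contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three catalog entries are the ones listed on the slide, while the class names, the `serial` field, and the serial numbers are hypothetical and do not come from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """A type of part: one entry per kind of thing, as in a manufacturer's catalog."""
    part_number: str
    name: str


@dataclass(frozen=True)
class InventoryItem:
    """A particular physical item: an instance of exactly one catalog type."""
    serial: str            # hypothetical identifier for the individual item
    of_type: CatalogEntry


# Catalog (types), taken from the slide.
fan = CatalogEntry("515287", "DC3300 Dust Collector Fan")
gilmer_belt = CatalogEntry("521683", "Gilmer Belt")
drive_belt = CatalogEntry("521682", "Motor Drive Belt")
catalog = [fan, gilmer_belt, drive_belt]

# Inventory (instances); the serial numbers are invented for illustration.
inventory = [
    InventoryItem("SN-0001", fan),
    InventoryItem("SN-0002", drive_belt),
    InventoryItem("SN-0003", drive_belt),
]


def count_in_stock(entry: CatalogEntry) -> int:
    """Count the instances of a given type that appear in the inventory."""
    return sum(1 for item in inventory if item.of_type == entry)
```

In this sketch a type can sit in the catalog with no instances in stock (`gilmer_belt`), while every inventory item points back to exactly one catalog type.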

Page 34: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Catalog vs. inventory

Page 35: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Catalog of types/Types

Page 36: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

siamese

mammal

cat

organism

object

types

animal

frog

instances

Page 37: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Ontologies are here

Page 38: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

or here

Page 39: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

ontologies represent general structures in reality (leg)

Page 40: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Ontologies do not represent concepts in people’s heads

Page 41: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

They represent types in reality

Page 42: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

which provide the benchmark for integration

Page 43: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

if you’re going to semantically annotate piles of data, better work out how to do it right from the start

Page 44: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Entity =def

anything which exists, including things and processes, functions and qualities, beliefs and actions, documents and software (Levels 1, 2 and 3)

Page 45: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

what are the kinds of entity?

Page 46: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

First basic distinction

universal vs. instance

(science text vs. diary)

(human being vs. Tom Cruise)

Page 47: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Ontology Universals Instances

Page 48: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Ontology = A Representation of Universals

Page 49: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Ontology = A representation of universals

Each node of an ontology consists of:

• preferred term (aka term)

• term identifier (TUI, aka CUI)

• synonyms

• definition, glosses, comments
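
The four components listed above could be held in a record like the following. This is a minimal sketch, not the layout of any particular ontology format: the class name, field names, identifier, synonym, and placeholder definition are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class OntologyNode:
    """One node of an ontology, with the components named on the slide."""
    preferred_term: str                                  # the preferred term
    term_id: str                                         # the term identifier
    synonyms: List[str] = field(default_factory=list)
    definition: str = ""
    comments: List[str] = field(default_factory=list)    # glosses, comments

    def labels(self) -> Set[str]:
        """All natural language labels under which this node can be found."""
        return {self.preferred_term, *self.synonyms}


node = OntologyNode(
    preferred_term="lung",
    term_id="EX:0000001",                  # hypothetical identifier
    synonyms=["pulmo"],                    # illustrative synonym
    definition="(definition text goes here)",
)
```

Keeping the identifier separate from the preferred term means the label can be revised without breaking annotations that refer to the node by its ID.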

Page 50: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

An ontology is a representation of universals

We learn about universals in reality from looking at the results of scientific experiments in the form of scientific theories

experiments relate to what is particular

science describes what is general

Page 51: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

siamese

mammal

cat

organism

substance

universals

animal

frog

instances

Page 52: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Domain =def

a portion of reality that forms the subject-matter of a single science or technology or mode of study or administrative practice ...;

proteomics

HIV

epidemiology

Page 53: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Representation =def

an image, idea, map, picture, name or description ... of some entity or entities.

Page 54: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Ontologies are representational artifacts

comparable to science texts and subject to the same sorts of constraints (including need for update)

Page 55: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Representational units =def

terms, icons, alphanumeric identifiers ... which refer, or are intended to refer, to entities

and which are minimal (atoms)

Page 56: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Composite representation =def

representation

(1) built out of representational units

which

(2) form a structure that mirrors, or is intended to mirror, the entities in some domain

Page 57: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Analogue representations

no representational units, no ‘atoms’

Page 58: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

The Periodic Table

Page 59: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Language has the power to create general terms

which go beyond the domain of universals studied by science and documented in catalogs

Page 60: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Problem: fiat demarcations

male over 30 years of age with family history of diabetes

abnormal curvature of spine

participant in trial #2030

Page 61: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Problem: roles

fist

patient

FDA-approved drug

Page 62: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Administrative ontologies often need to go beyond universals

Fall on stairs or ladders in water transport injuring occupant of small boat, unpowered

Railway accident involving collision with rolling stock and injuring pedal cyclist

Nontraffic accident involving motor-driven snow vehicle injuring pedestrian

Page 63: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Class =def

a maximal collection of particulars determined by a general term (‘cell’, ‘electron’, but also: ‘restaurant in Palo Alto’, ‘Italian’)

the class A = the collection of all particulars x for which ‘x is A’ is true

Page 64: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

universals vs. their extensions

universals

{a,b,c,...} collections of particulars

Page 65: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Extension =def

The extension of a universal A is the class: instance of the universal A

(it is the class of A’s instances)

(the class of all entities to which the term ‘A’ applies)
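
The definition of an extension can be read as a simple set construction: collect every particular for which ‘x is A’ is true. The sketch below assumes a small, closed list of particulars; ‘Tom Cruise’ and ‘human being’ come from the earlier slide, while the two cats, the `kinds` table, and the function names are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable

# Hypothetical record of which universal each particular instantiates.
kinds = {
    "Tom Cruise": "human being",
    "Felix": "cat",
    "Tibbles": "cat",
}


def extension(is_a: Callable[[str], bool], particulars: Iterable[str]) -> FrozenSet[str]:
    """Return the class of all particulars x for which 'x is A' is true.

    The result is maximal relative to `particulars`: nothing that satisfies
    the predicate is left out.
    """
    return frozenset(x for x in particulars if is_a(x))


def is_cat(x: str) -> bool:
    """The predicate 'x is a cat', looked up in the hypothetical table."""
    return kinds[x] == "cat"


cats = extension(is_cat, kinds)
```

The universal `cat` and the class `cats` stay distinct in this sketch: the first is what the predicate tests for, the second is only the collection of particulars that pass the test.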

Page 66: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Problem

The same general term can be used to refer both to universals and to collections of particulars. Consider:

HIV is an infectious retrovirus

HIV is spreading very rapidly through Asia

Page 67: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

universals vs. classes

universals

{c,d,e,...} classes

Page 68: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

universals vs. classes

universals

~ defined classes

Page 69: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

universals vs. classes

universals

e.g. populations, ...

Page 70: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Defined class =def

a class defined by a general term which does not designate a universal

the class of all diabetic patients in Leipzig on 4 June 1952

Page 71: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

OWL is a good representation of defined classes

• sibling of Finnish spy

• member of Abba aged > 50 years

• pizza with > 4 different toppings
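
A defined class of this kind is fixed by a compound condition rather than by a universal. The following sketch expresses the third example as a membership test. It is plain Python, not OWL, and the `Pizza` record and the two sample pizzas are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Pizza:
    name: str
    toppings: FrozenSet[str]


def has_more_than_four_toppings(pizza: Pizza) -> bool:
    """Membership condition for the defined class 'pizza with > 4 different toppings'."""
    # A frozenset holds each topping once, so its size counts *different* toppings.
    return len(pizza.toppings) > 4


# Hypothetical particulars.
pizzas: List[Pizza] = [
    Pizza("margherita", frozenset({"tomato", "mozzarella", "basil"})),
    Pizza("house special", frozenset(
        {"tomato", "mozzarella", "ham", "mushroom", "olive", "artichoke"})),
]

# The defined class: every particular that meets the condition.
defined_class = [p for p in pizzas if has_more_than_four_toppings(p)]
```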

Page 72: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Terminology =def.

a representational artifact whose representational units are natural language terms (with IDs, synonyms, comments, etc.) which are intended to designate universals together with defined classes, with no particular attention to composite representations

Page 73: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

universals, classes, concepts

universals

defined classes

‘concepts’ ?

Page 74: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

universals < defined classes < ‘concepts’

‘concepts’ which do not correspond to defined classes:

‘Surgical or other procedure not carried out because of patient's decision’

‘Congenital absent nipple’

because they do not correspond to anything

Page 75: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

(Scientific) Ontology =def.

a representational artifact whose representational units (which may be drawn from a natural or from some formalized language) are intended to represent

1. universals in reality

2. those relations between these universals which obtain universally (= for all instances)

lung is_a anatomical structure

lobe of lung part_of lung
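
One way to read "obtain universally (= for all instances)" is that a type-level assertion such as ‘lobe of lung part_of lung’ claims that every instance of the first type is part of some instance of the second. The sketch below checks that reading against instance data. The all-some reading is an assumption about what the slide intends, and the instance identifiers and function name are hypothetical.

```python
from typing import Dict, Set, Tuple

# Type-level assertions from the slide.
type_assertions = [
    ("lung", "is_a", "anatomical structure"),
    ("lobe of lung", "part_of", "lung"),
]

# Hypothetical instance data: the type each particular instantiates ...
instance_of: Dict[str, str] = {
    "lobe#1": "lobe of lung",
    "lobe#2": "lobe of lung",
    "lung#1": "lung",
}
# ... and which particular is part of which other particular.
instance_part_of: Set[Tuple[str, str]] = {
    ("lobe#1", "lung#1"),
    ("lobe#2", "lung#1"),
}


def part_of_holds_universally(a: str, b: str) -> bool:
    """True if every instance of type `a` is part of some instance of type `b`.

    Vacuously true when `a` has no instances in the data.
    """
    instances_of_a = [x for x, t in instance_of.items() if t == a]
    instances_of_b = [y for y, t in instance_of.items() if t == b]
    return all(
        any((x, y) in instance_part_of for y in instances_of_b)
        for x in instances_of_a
    )


# Check each type-level part_of assertion against the instance data.
checked = {
    (s, o): part_of_holds_universally(s, o)
    for s, rel, o in type_assertions
    if rel == "part_of"
}

# A single instance with no containing lung counts against the assertion.
instance_of["lobe#3"] = "lobe of lung"
still_holds = part_of_holds_universally("lobe of lung", "lung")
```

A check of this shape suggests how instance-level data can count for or against a type-level assertion in the ontology.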

Page 76: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Rules for Scientific Ontology

How ontology development can be evidence-based

Page 77: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Basis in textbook science

OBO Foundry ontologies are created by biologist-curators with a thorough knowledge of the underlying science

Ontology quality is measured in terms of biological accuracy and usefulness to working biologists (measured in turn by numbers of independent users, of associated software applications, papers published, ... ).

Page 78: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Measure of success for OBO Foundry initiative

= degree to which it serves the integration of ever more heterogeneous types of data / is exploited in the creation of new types of software or of new types of informatics-based experimentation

Page 79: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Ontology building closely tied to needs of users with data to annotate

In the GO/Uniprot collaboration, the Foundry methodology is applied by domain experts who enjoy joint control of ontology, data and annotations.

All three get to be curated in tandem.

As results of experiments are described in annotations, this leads to extensions or corrections of the ontology, which in turn lead to better annotations, the whole process being governed by the querying needs of users in a way which fosters widespread adoption.

Blake J, et al. Gene Ontology annotations: Proceedings of Bio-Ontologies Workshop, ISMB/ECCB, Vienna, July 20, 2007

Page 80: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts

Science-based vs. arms-length ontology

This yields superior outcomes when measured by the results achieved by third parties who apply the ontologies to tasks external to those for which they were created

superior = to those generated on the basis of arms-length methodologies such as automatic mining from published literature.

PLoS Biol. 2005 Feb;3(2):e65.

Page 81: Towards a Reference Terminology for Talking about Ontologies and Related Artifacts
