Industrial Ontologies Group, University of Jyväskylä
Towards a Reference Terminology for Talking about Ontologies and Related Artifacts
1
Towards a Reference Terminology for Talking about Ontologies and
Related Artifacts
Barry Smith
http://ontology.buffalo.edu/smith
with thanks to
Werner Ceusters, Waclaw Kusnierczyk, Daniel Schober
2
Problem of ensuring sensible cooperation in a massively interdisciplinary community
concept, type, instance, model, representation, data
3
What do these mean?
‘conceptual data model’
‘semantic knowledge model’
‘reference information model’
‘an ontology is a specification of a conceptualization’
4
5
natural language labels
to make the data cognitively accessible to human beings
and algorithmically tractable
6
compare: legends for maps
7
ontologies are legends for data
8
compare: legends for cartoons
9
legends
help human beings use and understand complex representations of reality
help human beings create useful complex representations of reality
help computers process complex representations of reality
10
computationally tractable legends
help human beings find things in very large complex representations of reality
11
x_i = vector of measurements of gene i
k = the state of the gene (“on” or “off”)
θ_i = set of parameters of the Gaussian model
...
legends for mathematical equations
12
Glue-ability / integration
rests on the existence of a common benchmark
called ‘reality’
the ontologies we want to glue together are representations of what exists in the world
not of what exists in the heads of different groups of people
13
truth is correspondence to reality
14
simple representations can be true
15
a network diagram can be a veridical representation of reality
16
17
maps may be correct by reflecting topology, rather than geometry
18
a labeled image can be a more useful veridical representation of reality
an image can be a veridical representation of reality
19
an image labeled with computationally tractable labels can be an even more useful veridical representation of reality
20
annotations using common ontologies can yield integration of image data
21
if you’re going to semantically annotate piles of data, better work out how to do it right from the start
22
two kinds of annotations
23
names of types
24
names of instances
25
First basic distinction
type vs. instance
(science text vs. diary)
(human being vs. Tom Cruise)
26
For ontologies
it is generalizations that are important = ontologies are
about types, kinds
27
Ontology types Instances
28
Ontology = A Representation of types
29
An ontology is a representation of types
We learn about types in reality from looking at the results of scientific experiments in the form of scientific theories
experiments relate to what is particular
science describes what is general
30
There are created types
bicycle, steering wheel, aspirin, Ford Pinto
we learn about these by looking at manufacturers’ catalogues
31
measurement units are created types
32
Inventory vs. Catalog
Two kinds of representational artifact
Roughly:
Databases represent instances
Ontologies represent types
33
A 515287 DC3300 Dust Collector Fan
B 521683 Gilmer Belt
C 521682 Motor Drive Belt
Catalog vs. inventory
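The catalog/inventory distinction can be sketched in Python as two kinds of record: a catalog entry stands for a type, an inventory item stands for one instance of that type. This is only an illustration under stated assumptions. The class names `CatalogEntry` and `InventoryItem`, the helper `instances_of`, and the serial numbers are hypothetical; only the part numbers and part names come from the slide.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class CatalogEntry:
    """A type: one line in a manufacturer's catalog."""
    part_number: str
    name: str


@dataclass(frozen=True)
class InventoryItem:
    """An instance: one physical item, linked to the type it instantiates."""
    serial: str          # hypothetical serial number, not from the slides
    entry: CatalogEntry


# The catalog represents types (part numbers and names as on the slide).
catalog: Dict[str, CatalogEntry] = {
    "515287": CatalogEntry("515287", "DC3300 Dust Collector Fan"),
    "521683": CatalogEntry("521683", "Gilmer Belt"),
    "521682": CatalogEntry("521682", "Motor Drive Belt"),
}

# The inventory represents instances (made-up stock for illustration).
inventory: List[InventoryItem] = [
    InventoryItem("SN-0001", catalog["521683"]),
    InventoryItem("SN-0002", catalog["521683"]),
    InventoryItem("SN-0003", catalog["521682"]),
]


def instances_of(entry: CatalogEntry, items: List[InventoryItem]) -> List[InventoryItem]:
    """Return every inventory item that instantiates the given catalog entry."""
    return [item for item in items if item.entry == entry]


belts = instances_of(catalog["521683"], inventory)
print([item.serial for item in belts])
```

A type can appear in the catalog without any instance in stock, as the fan does here. This is one reason the two artifacts are kept apart.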
34
Catalog vs. inventory
35
Catalog of types/Types
36
[Figure: hierarchy of types (object, organism, animal, mammal, cat, siamese, frog) above their instances]
37
Ontologies are here
38
or here
39
ontologies represent general structures in reality (leg)
40
Ontologies do not represent concepts in people’s heads
41
They represent types in reality
42
which provide the benchmark for integration
43
if you’re going to semantically annotate piles of data, better work out how to do it right from the start
44
Entity =def
anything which exists, including things and processes, functions and qualities, beliefs and actions, documents and software (Levels 1, 2 and 3)
45
what are the kinds of entity?
46
First basic distinction
universal vs. instance
(science text vs. diary)
(human being vs. Tom Cruise)
47
Ontology Universals Instances
48
Ontology = A Representation of Universals
49
Ontology = A representation of universals
Each node of an ontology consists of:
• preferred term (aka term)
• term identifier (TUI, aka CUI)
• synonyms
• definition, glosses, comments
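The node structure listed above could be held in a small record type. The following Python sketch is one possible layout, not the format of any particular ontology tool. The class name `OntologyNode`, the `matches` helper, the identifier `EX:0000001`, the synonym, and the definition text are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OntologyNode:
    """One node of an ontology, with the four parts named on the slide."""
    preferred_term: str
    identifier: str
    synonyms: List[str] = field(default_factory=list)
    definition: str = ""
    comments: List[str] = field(default_factory=list)

    def matches(self, label: str) -> bool:
        """True if label equals the preferred term or a synonym, ignoring case."""
        wanted = label.casefold()
        names = [self.preferred_term] + self.synonyms
        return any(wanted == name.casefold() for name in names)


# Hypothetical node; identifier, synonym and definition are placeholders.
node = OntologyNode(
    preferred_term="lung",
    identifier="EX:0000001",
    synonyms=["pulmo"],
    definition="placeholder definition, for illustration only",
)

print(node.matches("Lung"), node.matches("heart"))
```

Keeping the identifier separate from the preferred term means the label can be corrected later without breaking annotations that refer to the node by its identifier.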
50
An ontology is a representation of universals
We learn about universals in reality from looking at the results of scientific experiments in the form of scientific theories
experiments relate to what is particular
science describes what is general
51
[Figure: hierarchy of universals (substance, organism, animal, mammal, cat, siamese, frog) above their instances]
52
Domain =def
a portion of reality that forms the subject-matter of a single science or technology or mode of study or administrative practice ...;
proteomics
HIV
epidemiology
53
Representation =def
an image, idea, map, picture, name or description ... of some entity or entities.
54
Ontologies are representational artifacts
comparable to science texts
and subject to the same sorts of constraints (including need for update)
55
Representational units =def
terms, icons, alphanumeric identifiers ... which refer, or are intended to refer, to entities
and which are minimal (atoms)
56
Composite representation =def
representation
(1) built out of representational units
which
(2) form a structure that mirrors, or is intended to mirror, the entities in some domain
57
Analogue representations
no representational units, no ‘atoms’
58
Periodic Table
The Periodic Table
59
Language has the power to create general terms
which go beyond the domain of universals studied by science and documented in catalogs
60
Problem: fiat demarcations
male over 30 years of age with family history of diabetes
abnormal curvature of spine
participant in trial #2030
61
Problem: roles
fist
patient
FDA-approved drug
62
Administrative ontologies often need to go beyond universals
Fall on stairs or ladders in water transport injuring occupant of small boat, unpowered
Railway accident involving collision with rolling stock and injuring pedal cyclist
Nontraffic accident involving motor-driven snow vehicle injuring pedestrian
63
Class =def
a maximal collection of particulars determined by a general term (‘cell’, ‘electron’, but also: ‘restaurant in Palo Alto’, ‘Italian’)
the class A = the collection of all particulars x for which ‘x is A’ is true
64
universals vs. their extensions
universals
{a,b,c,...} collections of particulars
65
Extension =def
The extension of a universal A is the class: instance of the universal A
(it is the class of A’s instances)
(the class of all entities to which the term ‘A’ applies)
66
Problem
The same general term can be used to refer both to universals and to collections of particulars. Consider:
HIV is an infectious retrovirus
HIV is spreading very rapidly through Asia
67
universals vs. classes
universals
{c,d,e,...} classes
68
universals vs. classes
universals
~ defined classes
69
universals vs. classes
universals
e.g. populations, ...
70
Defined class =def
a class defined by a general term which does not designate a universal
the class of all diabetic patients in Leipzig on 4 June 1952
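A defined class can be read as the set of particulars that satisfy a stated condition, so it can be computed from instance data rather than looked up in a catalog of universals. The Python sketch below applies that reading to the example above. The `Observation` record, the function names, and the sample data are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Iterable, Set


@dataclass(frozen=True)
class Observation:
    """A recorded fact about one particular person on one day (illustrative)."""
    person: str
    diagnosis: str
    city: str
    day: date


def defined_class(
    observations: Iterable[Observation],
    condition: Callable[[Observation], bool],
) -> Set[str]:
    """Collect all particulars for which the defining condition is true."""
    return {obs.person for obs in observations if condition(obs)}


def diabetic_in_leipzig_on_4_june_1952(obs: Observation) -> bool:
    """The defining condition of the example class on the slide."""
    return (
        obs.diagnosis == "diabetes"
        and obs.city == "Leipzig"
        and obs.day == date(1952, 6, 4)
    )


# Made-up data for illustration only.
observations = [
    Observation("patient-1", "diabetes", "Leipzig", date(1952, 6, 4)),
    Observation("patient-2", "diabetes", "Dresden", date(1952, 6, 4)),
    Observation("patient-3", "asthma", "Leipzig", date(1952, 6, 4)),
    Observation("patient-4", "diabetes", "Leipzig", date(1952, 6, 5)),
]

print(defined_class(observations, diabetic_in_leipzig_on_4_june_1952))
```

The condition combines a disease, a place, and a date. Changing any one of them yields a different class, which illustrates why such classes are defined by a general term rather than discovered as universals.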
71
OWL is a good representation of defined classes
• sibling of Finnish spy
• member of Abba aged > 50 years
• pizza with > 4 different toppings
72
Terminology =def.
a representational artifact whose representational units are natural language terms (with IDs, synonyms, comments, etc.) which are intended to designate universals together with defined classes, with no particular attention to composite representations
73
universals, classes, concepts
universals
defined classes
‘concepts’ ?
74
universals < defined classes < ‘concepts’
‘concepts’ which do not correspond to defined classes:
‘Surgical or other procedure not carried out because of patient's decision’
‘Congenital absent nipple’
because they do not correspond to anything
75
(Scientific) Ontology =def.
a representational artifact whose representational units (which may be drawn from a natural or from some formalized language) are intended to represent
1. universals in reality
2. those relations between these universals which obtain universally (= for all instances)
lung is_a anatomical structure
lobe of lung part_of lung
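The phrase "obtain universally (= for all instances)" can be made concrete with a small check. Under one common reading, `A is_a B` means every instance of A is an instance of B, and `A part_of B` means every instance of A is part of some instance of B. The Python sketch below tests instance data against the two assertions above. That reading, all function names, and the instance identifiers are assumptions for illustration.

```python
from typing import Dict, List, Set, Tuple

Relation = Set[Tuple[str, str]]

# Type-level assertions, as on the slide.
IS_A: Relation = {("lung", "anatomical structure")}
PART_OF: Relation = {("lobe of lung", "lung")}

# Instance-level data (hypothetical identifiers).
instance_of: Dict[str, str] = {
    "lung-1": "lung",
    "lobe-1": "lobe of lung",
    "lobe-2": "lobe of lung",
}
instance_part_of: Dict[str, str] = {"lobe-1": "lung-1"}  # lobe-2 has no recorded whole


def ancestors(universal: str, is_a: Relation) -> Set[str]:
    """All universals reachable from `universal` by following is_a upward."""
    found: Set[str] = set()
    frontier = [universal]
    while frontier:
        current = frontier.pop()
        for child, parent in is_a:
            if child == current and parent not in found:
                found.add(parent)
                frontier.append(parent)
    return found


def instantiates(particular: str, universal: str) -> bool:
    """True if the particular's direct type is, or is subsumed by, the universal."""
    direct = instance_of[particular]
    return direct == universal or universal in ancestors(direct, IS_A)


def part_of_violations() -> List[str]:
    """Instances of a part type that lack a recorded whole of the required type."""
    violations = []
    for part_type, whole_type in sorted(PART_OF):
        for particular in sorted(instance_of):
            if not instantiates(particular, part_type):
                continue
            whole = instance_part_of.get(particular)
            if whole is None or not instantiates(whole, whole_type):
                violations.append(particular)
    return violations


print(instantiates("lung-1", "anatomical structure"))
print(part_of_violations())
```

A reported violation may mean that the instance data are incomplete, or that the type-level assertion does not in fact hold for all instances.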
Rules for Scientific Ontology
How ontology development can be evidence-based
76
Basis in textbook science
OBO Foundry ontologies are created by biologist-curators with a thorough knowledge of the underlying science
Ontology quality is measured in terms of biological accuracy and usefulness to working biologists (measured in turn by numbers of independent users, of associated software applications, papers published, ...).
77
Measure of success for OBO Foundry initiative
= degree to which it serves the integration of ever more heterogeneous types of data / is exploited in the creation of new types of software or of new types of informatics-based experimentation
78
Ontology building closely tied to needs of users with data to annotate
In the GO/Uniprot collaboration, the Foundry methodology is applied by domain experts who enjoy joint control of ontology, data and annotations.
All three get to be curated in tandem.
As results of experiments are described in annotations, this leads to extensions or corrections of the ontology, which in turn lead to better annotations, the whole process being governed by the querying needs of users in a way which fosters widespread adoption.
Blake J, et al. Gene Ontology annotations: Proceedings of Bio-Ontologies Workshop, ISMB/ECCB, Vienna, July 20, 2007
79
Science-based vs. arm's-length ontology
The science-based approach yields superior outcomes when measured by the results achieved by third parties who apply the ontologies to tasks external to those for which they were created
superior = superior to outcomes generated on the basis of arm's-length methodologies such as automatic mining from published literature.
PLoS Biol. 2005 Feb;3(2):e65.
80
81