Medical Natural Sciences Year 2: Introduction to Bioinformatics
New York State Center of Excellence in Bioinformatics & Life Sciences
R T U New York State Center of Excellence in Bioinformatics & Life Sciences
R T U
VUB Leerstoel 2009-2010
Theme: Ontology for Ontologies, theory and applications
Inaugural Oration: The quest for semantic interoperability
May 17, 2010; 16h30-19h00
Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels
Room D2.01
Prof. Werner CEUSTERS, MD
Ontology Research Group, Center of Excellence in Bioinformatics and Life Sciences and
Department of Psychiatry, University at Buffalo, NY, USA
Buffalo, NYC, Chicago
Center of Excellence in Bioinformatics & Life Sciences, Buffalo, NY
Short personal history
1959 - 2010
Years on the time line: 1977, 1989, 1992, 1993, 1995, 1998, 2002, 2004, 2006
Short personal history
1959 - 2030?
Years on the time line: 1977, 1989, 1992, 1993, 1995, 1998, 2002, 2004, 2006
A trajectory of mixes and mingles …
Biology
Translational Research
Defense & Intelligence
Pharmacology
Pharmacogenomics
Performing Arts
Linguistics
Computational Linguistics
Medical Natural Language Understanding
Informatics
Medicine
Knowledge Representation
Electronic Health Records
Referent Tracking
Philosophy
Ontology
Realism-Based Ontology
… provides the context for this lecture series (1)
• May 17: the quest for semantic interoperability
– what is it ?
– what are the building blocks ?
– why do only few systems exhibit it ?
– Take home message:
• good ontologies are badly needed
Highlighted fields: Informatics, Knowledge Representation, Electronic Health Records, Philosophy, Ontology
… provides the context for this lecture series (2)
• May 18: the need for realism-based ontology development
– What ontology should be
• philosophical realism, applied to …
• … ‘knowledge representation’
– Generic/specific distinction
• relation with Referent Tracking
– Target audience:
• ontology developers and evaluators
• philosophers who want a real job
• technology scouts
– Take home message:
• good ontology = realism-based ontology
Highlighted fields: Referent Tracking, Philosophy, Ontology, Realism-Based Ontology
… provides the context for this lecture series (3)
• May 19: ontologies in healthcare and the vision of personalized medicine
– An ontologist’s view on data and information models
– Open Biomedical Ontologies Foundry
– Example ontologies for eHealth
Highlighted fields: Biology, Translational Research, Pharmacology, Pharmacogenomics, Medicine, Electronic Health Records, Referent Tracking, Realism-Based Ontology
… provides the context for this lecture series (4)
• May 20: ontologies and Natural Language Understanding
• Target audience:
– computational linguists
– semantic engineers
Highlighted fields: Linguistics, Computational Linguistics, Medical Natural Language Understanding, Informatics, Medicine, Electronic Health Records, Realism-Based Ontology
… provides the context for this lecture series (5)
• May 21: Referent Tracking: why Big Brother was just a little baby.
– theory of Referent Tracking:
• give a unique identifier to everything
– implementation of RT systems
– application in situational awareness (in the broadest sense)
– Target audience:
• everybody who wants to survive after 2012
Highlighted fields: Defense & Intelligence, Performing Arts, Referent Tracking, Realism-Based Ontology
Semantic Interoperability
Interoperability of Information Systems
The capacity of distinct information systems to exchange ‘stuff’
(Image from ‘Wargames’)
Gradations in interoperability
• Level 0: no interoperability at all
• Level 1: technical and syntactical interoperability (no semantic interoperability)
• Level 2: two orthogonal levels of partial semantic interoperability
– Level 2a: unidirectional semantic interoperability
– Level 2b: bidirectional semantic interoperability of meaningful fragments
• Level 3: full semantic interoperability, sharable context, seamless co-operability
Semantic Interoperability for Better Health and Safer Healthcare. SemanticHEALTH Report. January 2009
One often used definition
Semantic Interoperability (SI)
=
the ability of two or more computer systems to exchange information in such a way that the
meaning of that information can be automatically interpreted by the receiving system accurately
enough to produce useful results to the end users of both systems.
‘Full interoperability’
• ‘Neither language nor technological differences prevent the system to seamlessly integrate the received information into the local record and provide a complete picture of someone’s health as if it would have been collected locally.’
• ‘Further, the anonymized data feeds directly into the tools of public health authorities and researchers.’
Stroetmann et al. Semantic Interoperability for Better Health and Safer Healthcare. SemanticHEALTH Report. Jan 2009
What this practically means …
Healthcare
Finance
Intelligence and Command & Control
Digital collections and IP rights
Enterprise & supply chain management
…
Biggest SI endeavor: the Semantic Web
The standard web: end users are humans
The Semantic Web: end-users are maximally assisted by agents
Where there is a web … there is usually a spider
The core issue:
Semantic Interoperability (SI)
=
the ability of two or more computer systems to exchange information in such a way that the
meaning of that information can be automatically interpreted by the receiving system accurately
enough to produce useful results to the end users of both systems.
meaning
And meaning is, of course, the problem
• ‘I know that you believe that you understood what you think I said, but I am not sure you realize that what you heard is not what I meant.’
– Robert McCloskey, State Department spokesman (attributed).
• http://www.quotationspage.com/quotes/Robert_McCloskey/
Goal of the Semantic Web
• to make it possible for software to find the data it needs on the Web, understand it, cross-reference it and apply it to a particular task.
• “I should be able to tell my Web-enabled handheld device to schedule an appointment with a dentist within 20 miles of home and let the computer do the rest.”
“I should be able to tell my Web-enabled handheld device to schedule an appointment with a dentist within 20 miles of home and let the computer do the rest.”
• So the SW must understand natural language ?
• So the SW must know when the requester is free ?
• So the SW must understand that it is to take care of the requester’s teeth, and not to have a nice dinner date ?
• So the SW can then deduce what the actual length of “20 miles” is for this particular person ?
• So the SW must understand where the requester lives ?
If it were just that simple
Pray your computer isn’t Irish
X: “Hallo stranger, you appear to be travelling?”
Y: “Yes, I always travel when on a journey.”
X: “And pray, what might your name be?”
Y: “It might be Sam Patch, but it isn't.”
X: “Have you been long in these parts?”
Y: “Never longer than at present - 5 feet 9.”
X: “Do you get anything new?”
Y: “Yes, I bought a new whetstone this morning.”
Copyright © 1996 Electronic Historical Publications
The linguistic perspective (1)
characters
lexemes words syntax
semantics
word categories
pragmatics
discourse
morphology
phrases
sentences
prose
The linguistic perspective (2)
• Words:
– ‘in’, ‘hepatitis’, ‘the’, ‘virus’, ‘sit’, ‘bank’, ‘river’, ‘money’
• We combine them in phrases and sentences:
– ‘hepatitis virus’, ‘virus hepatitis’,
– ‘money in the bank’, ‘bank in the river’
• We combine sentences:
– ‘First I removed the skin from the fish. Then I fried it. It was delicious.’
• We know what (not) to use under which circumstances:
– ‘girl’ – ‘chick’, ‘man’ – ‘guy’ – ‘dude’, …
Building the Semantic Web requires this too
But it seems that only dummies are involved
• there must be a lot of dummies, or
• do they still not get it?
• a lot does seem to mean nothing at all
The clever (?) business man and his XML card
<business-card>
<name> John Nitwit </name>
<address>
<street> 524 Moon base avenue </street>
<city> Utopia </city>
</address>
<phone> … </phone>
…
</business-card>
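The point of the card can be sketched in a few lines of Python. This is an illustration added to the transcript, not part of the lecture. It assumes a reduced version of the card (without the elided phone element) and a hypothetical copy in which every tag is replaced by an opaque token such as t1 or t2; the helper name shape is likewise invented. To a parser the two documents are structurally identical, which suggests that the tag names by themselves give the machine no meaning.

```python
import xml.etree.ElementTree as ET

# Reduced version of the business card (assumption: the elided parts are left out).
CARD = """<business-card>
  <name>John Nitwit</name>
  <address>
    <street>524 Moon base avenue</street>
    <city>Utopia</city>
  </address>
</business-card>"""

# Hypothetical copy: same content, but every tag renamed to an opaque token.
OPAQUE = """<t1>
  <t2>John Nitwit</t2>
  <t3>
    <t4>524 Moon base avenue</t4>
    <t5>Utopia</t5>
  </t3>
</t1>"""


def shape(element):
    """Return the nesting structure and text of an element, ignoring tag names."""
    text = (element.text or "").strip()
    return (text, tuple(shape(child) for child in element))


card_shape = shape(ET.fromstring(CARD))
opaque_shape = shape(ET.fromstring(OPAQUE))

# A parser sees the same tree in both cases; only a human reads 'city' as a city.
same_structure = card_shape == opaque_shape
print(same_structure)
```

Whatever a program does with the street or city value has to be coded separately; the markup alone does not supply it.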
Is anything gained this way?
Eric Miller. Weaving Meaning: The Semantic Web. 2002. www.w3.org/Talks/2002/10/16-sw/
Are mails like this one surprising?
At 10:13 PM 3/22/2010, you wrote:
Dear Prof Smith,
just a quick email to express my sincerest gratitude - the learning materials you made available are being of enormous value to me. After a PhD in the Semantic Web area at [a well known knowledge management institute], I came out so disgusted with the general lack of scientific & philosophical grounding in the community around me, that I felt I totally lost sight of my research path.
But your systematic and thorough presentation of the field is helping me see where I stand, without all the usual technical buzzwords and marketing pitches. At the same time, this gives me hope of finding more solid grounds for my future research.
The heart of the evil …
What it was …
… and became
What is (an) Ontology ?
Without buzzwords and marketing pitches
but
with adequate philosophical thinking
“What is … ?”-questions are problematic
• How would you answer the following questions:
– what is a human being ?
– what is JFK ?
– what is yellow ?
– what is a unicorn ?
– what is a drug ?
What do the following juxtapositions reveal?
Ontology:
what is a human being?
what is JFK?
what is yellow?
what is a unicorn?
what is a drug?
Terminology:
what does ‘human being’ mean?
what does ‘JFK’ mean?
what does ‘yellow’ mean?
what does ‘unicorn’ mean?
what does ‘drug’ mean?
The Ontology-Terminology divide
• Ontology is about what things are.
• Terminology is about how to name things, without caring about whether what is named exists.
• Sadly, many people who call themselves ‘ontologists’ or build ‘ontologies’ either do not understand this distinction at all, or apply it in the wrong way.
Terminological versus Ontological approach
• The terminologist defines:
– ‘a clinical drug is a pharmaceutical product given to (or taken by) a patient with a therapeutic or diagnostic intent’. (RxNorm)
• The (good, real) ontologist thinks:
– Does ‘given’ include ‘prescribed’?
– Is ‘manufactured with the intent to …’ not sufficient?
• Are newly marketed products – available in the pharmacy, but not yet prescribed – not clinical drugs?
• Are products stolen from a pharmacy not clinical drugs?
• What about such products taken by persons that are not patients?
– e.g. children mistaking tablets for candies.
This dichotomy is also present in simple words
Carl Austin Weiss, MD (Dec 6, 1906 – Sept 8, 1935)
Huey Pierce Long, Jr. (Aug 30, 1893 – Sept 10, 1935)
Solving crimes through Referent Tracking
A double mystery
• (It is argued that) On September 9th, 1935, Carl Austin Weiss shot Senator Huey Long in the Louisiana State Capitol with a .35 calibre pistol. Long died from this wound thirty hours later on September 10th. Weiss, on the other hand, received between thirty-two and sixty .44 and .45 calibre hollow point bullets from Long's agitated bodyguards and died immediately.
Sorensen, R., 1985, "Self-Deception and Scattered Events", Mind, 94: 64-69.
• Questions:
– Did Weiss kill Senator Long ?
– If so, when did he kill him ?
The events on a time line
Axis: time
Senator Long’s living
Weiss’ shooting of Long
Carl Weiss’ living
Bodyguards’ shooting of Weiss
Weiss’ pathological body reactions
Long’s pathological body reactions
When did the killing happen ?
Axis: time
Senator Long’s living
Weiss’ shooting of Long
Carl Weiss’ living
Bodyguards’ shooting of Weiss
Weiss’ pathological body reactions
Long’s pathological body reactions
t1?
t2?
If at t1: Long was not dead after he was killed
If at t2: Long was killed by a dead person
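The dilemma can be worked through as a small Python sketch, added here for illustration. It assumes that t1 is the moment of the shooting and t2 the moment of Long's death, measures time in hours since the shot, and models "died immediately" as a few minutes later. The function name and the numeric values are assumptions, not part of the slide.

```python
# Hours since Weiss shot Long. All numbers are modeling assumptions.
SHOT_AT = 0.0
WEISS_DIES_AT = 0.1   # "died immediately": assumed to be a few minutes after the shot
LONG_DIES_AT = 30.0   # "thirty hours later"


def problems_with(t_killing):
    """List what goes wrong if the killing is located at the instant t_killing."""
    problems = []
    if t_killing < LONG_DIES_AT:
        # The victim is still alive right after the supposed killing.
        problems.append("Long was not dead after he was killed")
    if t_killing >= WEISS_DIES_AT:
        # The supposed killer is already dead at the time of the killing.
        problems.append("Long was killed by a dead person")
    return problems


t1 = SHOT_AT        # candidate 1: the killing happened when the shot was fired
t2 = LONG_DIES_AT   # candidate 2: the killing happened when Long died

for label, t in (("t1", t1), ("t2", t2)):
    print(label, problems_with(t))
```

Under these assumptions no single instant avoids both problems, which is the mismatch between the word ‘killing’ and the events on the time line.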
What this demonstrates
• What things are and how things are named are two different issues.
• (Natural) language does not fit nicely with reality:
– it was formed at a time when insight into reality was crippled,
– it did not evolve with our insight.
• Human brains have the capacity not to be bothered too much by the unfaithfulness of natural language.
Ambiguous phrasings
warning on plastic bag
Hotel semantics
in Miami hotel lobby
in A’dam hotel elevator
Good philosophers lack this capacity
Computers lack this ability too, but that is a problem
• Knowledge representation and semantic interoperability are for machines, not humans;
• Computer languages and knowledge representations must at least be unambiguous, and preferably also faithful to (our best understanding of) reality.
• Unfortunately, the majority of them are not, precisely because of the confusion between terminology and ontology.
The terminology of ‘ontology’:
Google ‘define: ontology’:
• the study of the broadest range of categories of existence, which also asks questions about the existence of particular kinds of objects;
• an explicit representation of the meaning of terms in a vocabulary, and their relationships;
• a common vocabulary for describing the concepts that exist in an area of knowledge and the relationships that exist between them;
• specification of a conceptualisation of a knowledge domain;
• a structured information model of a domain capable of supporting reasoning by human users and software agents;
• a data model that represents a set of concepts within a domain and the relationships between those concepts;
• …
One term, many definitions
This raises some (philosophical?) questions:
1. Is it possible for a term to have so many meanings?
2. Can the authors of these definitions all be right at the same time?
3. Is it possible for something to which one of these definitions applies to be such that also one or more of the other definitions apply ?
Q1: Is it possible for a term to have so many meanings?
• Clearly: yes !
• This phenomenon is called: Homonymy
• and is usually explained in terms of the semantic or semiotic or meaning triangle.
Standard Semiotic/Semantic Triangle
Useful, but nevertheless wrong !
Useful to build multi-lingual dictionaries
Concept ‘cat’
cat, chat, kat, Katze, …
Problem: several interpretations of the Semiotic/Semantic triangle
Sign: Language / Term / Symbol
Referent: Reality / Object
Reference: Concept / Sense / Model / View / Partition
Aristotle’s triadic meaning model
semeia: gramma / phoné
pathema
pragma
Words spoken are signs or symbols (symbola) of affections or impressions (pathemata) of the soul (psyche); written words (graphomena) are the signs of words spoken (phoné). As writing (grammata), so also is speech not the same for all races of men. But the mental affections themselves, of which these words are primarily signs (semeia), are the same for the whole of mankind, as are also the objects (pragmata) of which those affections are representations or likenesses, images, copies (homoiomata).
Aristotle, 'On Interpretation', 1.16.a.4-9, Translated by Cooke & Tredennick,
Loeb Classical Library, William Heinemann, London, UK, 1938.
Richards’ semantic triangle
• Reference (“concept”): “indicates the realm of memory where recollections of past experiences and contexts occur”.
• Hence: as with Aristotle, the reference is “mind-related”: thought.
• But: not “the same for all”, rather individual mind-related
symbol, referent, reference
my understanding, your understanding
Don’t confuse with homonymy !
“mole”
R1: mole (animal)
R2: mole (unit)
R3: mole (skin lesion)
Different thoughts
symbol, referent
understanding of x, understanding of y
Homonymy
“mole”
R1: mole “animal”
R2: mole “unit”
R3: mole “skin lesion”
One concept
And by the way, synonymy...
the Aristotelian view: “perspiration”, “sweat”
Richards’ view: “sweat”, “perspiration”
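Homonymy and synonymy, as used on the preceding slides, can be sketched as two readings of one term-to-referent relation. The Python below is an illustration added to the transcript; the table DENOTES, its referent labels and the variable names are assumptions made for the example.

```python
from collections import defaultdict

# (term, referent) pairs; the referent labels are illustrative assumptions.
DENOTES = [
    ("mole", "animal"),
    ("mole", "unit"),
    ("mole", "skin lesion"),
    ("sweat", "body fluid secreted by sweat glands"),
    ("perspiration", "body fluid secreted by sweat glands"),
]

referents_of = defaultdict(set)  # term -> set of referents
terms_for = defaultdict(set)     # referent -> set of terms
for term, referent in DENOTES:
    referents_of[term].add(referent)
    terms_for[referent].add(term)

# Homonymy: one term, several referents.
homonyms = {term for term, refs in referents_of.items() if len(refs) > 1}

# Synonymy: several terms, one referent.
synonym_sets = [terms for terms in terms_for.values() if len(terms) > 1]

print(homonyms)
print(synonym_sets)
```

The sketch deliberately leaves out the third corner of the triangle (concept, thought or sense), which is where the views of Aristotle and Richards differ.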
Frege’s view
• “sense” is an objective feature of how words are used and not a thought or concept in somebody’s head
• 2 names with the same referent can have different senses
– morning star
– evening star
• 2 names with the same sense have the same referent (synonyms)
• a name with a sense does not need to have a referent (“Beethoven’s 10th symphony”)
referent
sense
name
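Frege's three-part picture (name, sense, referent) can be sketched as a small data structure. This Python fragment is an illustration added to the transcript: the class Sense, the table SENSES and the wording of the sense descriptions are assumptions; the examples themselves come from the slide, with Venus as the shared referent of the two star names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sense:
    description: str         # how the name presents its referent
    referent: Optional[str]  # None when nothing is referred to


# name -> sense; the descriptions are illustrative wording.
SENSES = {
    "morning star": Sense("bright body seen in the east before sunrise", "Venus"),
    "evening star": Sense("bright body seen in the west after sunset", "Venus"),
    "Beethoven's 10th symphony": Sense("tenth symphony composed by Beethoven", None),
}


def referent_of(name):
    """Go from a name, via its sense, to its referent (or None)."""
    return SENSES[name].referent


def same_referent(a, b):
    """True only if both names have a referent and it is the same one."""
    ref = referent_of(a)
    return ref is not None and ref == referent_of(b)


print(same_referent("morning star", "evening star"))
print(SENSES["morning star"] == SENSES["evening star"])
print(referent_of("Beethoven's 10th symphony"))
```

In this sketch the referent is stored with the sense, so two names that share a sense necessarily share a referent, while the reverse does not hold.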
Homonymous use of the term ‘ontology’
• the study of the broadest range of categories of existence, which also asks questions about the existence of particular kinds of objects;
• an explicit representation of the meaning of terms in a vocabulary, and their relationships;
• a common vocabulary for describing the concepts that exist in an area of knowledge and the relationships that exist between them;
• specification of a conceptualisation of a knowledge domain;
• a structured information model of a domain capable of supporting reasoning by human users and software agents;
• a data model that represents a set of concepts within a domain and the relationships between those concepts;
• …
Q2: Can the authors of these definitions all be right at the same time?
• Yes, if we are dealing with a case of homonymy.
• But in that case, they are all talking about distinct things.
Q3: Is it possible for something to which one of these definitions applies to be such that also one or more of the other definitions apply ?
‘that’ thing
is a study
is a representation
is a vocabulary
is a specification
is an information model
is a data model ?
(hint on next slide)
Remember the “what is yellow?”-question
• Answers could have been:
– a color
– a banana
• Thus:
– can something which is a color be a banana ?
– can something which is a banana be a color ?
Q3: Is it possible for something to which one of these definitions applies to be such that also one or more of the other definitions apply ?
‘that’ thing
is a study
is a representation
is a vocabulary
is a specification
is an information model
is a data model ?
Not for all !
Only for some
Homonymous use of the term ‘ontology’: at least one clear cut distinction
• the study of the broadest range of categories of existence, which also asks questions about the existence of particular kinds of objects;
• an explicit representation of the meaning of terms in a vocabulary, and their relationships;
• a common vocabulary for describing the concepts that exist in an area of knowledge and the relationships that exist between them;
• specification of a conceptualisation of a knowledge domain;
• a structured information model of a domain capable of supporting reasoning by human users and software agents;
• a data model that represents a set of concepts within a domain and the relationships between those concepts;
• …
‘Ontology’ as the study of what exists
• Key questions:
– What exists ?
– How do things that exist relate to each other ?
• Some hypotheses:
– An external reality, time, space
– Ideas, concepts
– Particulars, universals, objects, processes
– God
• Ontologists from distinct ‘schools’ differ in opinion about the existence of some of the above:
– Realism, nominalism, conceptualism, monism, …
An ontology as a representation
Key question: of what ?
• Terms: WordNet, MedDRA, RxNorm
• Concepts: the majority of ‘ontologies’
But … overwhelming lack of clarity about what ‘concepts’ are:
– meaning shared in common by synonymous terms ?
– idea shared in common in the minds of those who use these terms ?
– unit of knowledge describing meanings ?
– feature or property or characteristic shared in common by entities in the world ?
• Universals: Realism-based ontology
Most ontologies are ‘concept’-based
• But what the word ‘concept’ denotes is never clarified, and users of it often refer to different entities in a haphazard way:
– meaning shared in common by synonymous terms
– idea shared in common in the minds of those who use these terms
– unit of knowledge describing meanings
– universal: that which is shared by all and only all entities in reality of a similar sort
Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, Biomedical Ontology in Action, November 8, 2006, Baltimore MD, USA
Concepts in ISO ?
• A unit of thought constituted through abstraction on the basis of properties common to a set of objects. (ISO 1087:1990)
– Object: anything perceivable or conceivable. Objects may also be material (e.g. an engine, a sheet of paper, a diamond), immaterial (e.g. a conversion ratio, a project plan) or imagined (e.g. a unicorn). [Adapted from ISO 1087-1:2000, 3.1.1]
• A unit of knowledge created by a unique combination of characteristics. [ISO 1087-1:2000, 3.2.1]
– characteristic: Abstraction of a property of an object or of a set of objects. Characteristics are used for describing concepts. [ISO 1087-1:2000, 3.2.4]
What knowledge is there to have about unicorns ?
Most terminologies are ‘concept’-based
• But what the word ‘concept’ denotes is never clarified, and users of it often refer to different entities in a haphazard way:
  • meaning shared in common by synonymous terms
  • idea shared in common in the minds of those who use these terms
  • unit of knowledge describing meanings
  These views require the involvement of a cognitive entity.
  • universal: that which is shared by all and only all entities in reality of a similar sort
  This view does not presuppose cognition at all.
Therefore: a multi-disciplinary approach to ontology
• In philosophy:
  – Ontology (no plural) is the study of what entities exist and how they relate to each other;
• In mainstream computer science and biomedical informatics:
  – An ontology (plural: ontologies) is a shared and agreed upon conceptualization of a domain;
• Our ‘realist’ view within the Ontology Research Group combines the two:
  – We use realism, a specific theory of ontology, as the basis for building high quality ontologies, using reality as benchmark.
Realism-based Ontology
• Accepts the existence of:
  – a real world outside mind and language,
  – a structure in that world prior to mind and language (universals / particulars).
• Rejects ontology as a matter of agreement on ‘conceptualizations’.
• Uses reality as a benchmark for testing the quality of ontologies as artifacts by building appropriate logics with referential semantics (rather than model-theoretic).
A realism-based ontology is …
• a representation of some pre-existing domain of reality which:
  – (1) reflects the properties of the entities within its domain in such a way that there obtains a systematic correlation between reality and the representation itself,
  – (2) is intelligible to a domain expert,
  – (3) is formalized in a way that allows it to support automatic information processing.
Compare with Alberti’s grid
[Figure: Alberti’s grid. Labels: reality, representation, ontological theory]
BFO Top-Level Ontology (partial)
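[Diagram: partial BFO hierarchy. Node labels: Continuant, Occurrent, Independent Continuant, Dependent Continuant (always dependent on one or more independent continuants), SDC (specifically dependent continuant), GDC (generically dependent continuant), Quality, Realizable, Role, Function, Disposition, Information Content Entity, Spatial Region, Temporal Region, Process, Functioning]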
Rise and fall of the concept-based approach
No serious scholar should work with ‘concepts’
Slow penetration of the idea …
More serious scholars become convinced …
What is a concept description a description of?
Eugen Wüster
• 1935
• Professor of Woodworking Machinery in the Vienna Agricultural College
• Terminology-hobbyist
• founder of ISO-TC 37: terminology standards
Eugen Wüster’s psychological view of concepts
• concepts are inside people’s brains
  – a concept is a mental surrogate of a plurality of objects grouped together on the basis of perceived similarities
  – what makes those objects similar is itself a concept
• object = def. anything to which human thought is or can be directed, whether material or immaterial, real or purely imagined
• ISO: ‘In the course of producing a terminology, philosophical discussions on whether an object actually exists in reality … are to be avoided’.
Concept-based approaches are top-down
• FIRST concepts (meanings, words, terms)
• THEN (if you’re lucky) real-world phenomena
• Reasons:
– Wüsterianism and the ISO terminology standards
– needs of programmers (and of third-party payers)
– hold-overs from the era of electronic dictionaries
Smith B, Ceusters W, Temmerman R. Wüsteria. In: Engelbrecht R. et al. (eds.) Medical Informatics Europe, IOS Press, Amsterdam, 2005: 647-652
Typical reasoning patterns for Wüsterians
• If domain experts use some term
  – then, there must be a concept,
    • whether or not there is some referent.
• If observations reveal the existence of ‘objects’ which are of a similar kind,
  – then, even if we don’t know yet what that kind is,
  – there must be an associated concept.
Observations and similarities
Are these pictures of concepts or of horses ?
Is this a sensible question: ‘What concepts have tails and do …?’
Observations and similarities
Are these pictures of concepts?
Are these pictures of anything at all?
If concepts are in brains, those must be awfully big brains!
Concepts = confusions !
• Use/mention confusions:
  – Confused: Brussels is a nice city and has eight letters.
  – Correct: Brussels is a nice city and Brussels’ name is ‘Brussels’ and ‘Brussels’ has eight letters.
• Kantian confusions:
  – what exists is what we believe exists
  – horses exist because we have the concept of horse and we see in reality things that fit that concept.
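The use/mention distinction can be made concrete in code. The following is a minimal Python sketch, not part of the original slides; the `City` class and its `is_nice` field are hypothetical names chosen only for illustration. A property of the city and a property of its name attach to different things, and asking how many letters the city itself has is a type error.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class City:
    """Hypothetical stand-in for the city itself (the thing being used)."""
    name: str      # the string by which the city is mentioned
    is_nice: bool  # a property of the city, not of its name


brussels = City(name="Brussels", is_nice=True)

# Use: a statement about the city.
city_is_nice = brussels.is_nice

# Mention: a statement about the name of the city.
letters_in_name = len(brussels.name)

# Conflating the two fails: the city itself has no letters to count.
try:
    len(brussels)
    conflation_allowed = True
except TypeError:
    conflation_allowed = False
```

Keeping the entity and its name as separate values is the same move the slide makes when it unpacks the sentence into three clauses.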
And confusion is thus everywhere in terminologies, classifications and ‘ontologies’
• SNOMED:
  – ‘Disorders are concepts in which there is an explicit or implicit pathological process causing a state of disease which tends to exist for a significant length of time under ordinary circumstances.’
  – And also: “Concepts are unique units of thought”.
  – Thus: Disorders are unique units of thought in which there is a pathological process …???
  – And thus: to eradicate all diseases in the world at once we simply should stop thinking ?
SNOMED International (1995, V3.1)
• T Topography: 12,385
• M Morphology: 4,991
• F Function: 16,352
• L Living Organisms: 24,265
• C Drugs & Biological Products: 14,075
• A Physical Agents, Forces and Activities: 1,355
• D Disease/Diagnosis: 28,623
• P Procedures: 27,033
• S Social Context: 433
• J Occupations: 1,886
• G General Modifiers: 1,176
• TOTAL RECORDS: 132,641 ?
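A quick arithmetic check of the figures above, using the counts exactly as they appear in this transcript. This is only a sketch: whether the slide’s ‘?’ refers to the total, and whether the transcribed numbers match the original SNOMED release notes, are both assumptions.

```python
# Record counts per SNOMED International axis, copied from the slide as transcribed.
axis_counts = {
    "T Topography": 12_385,
    "M Morphology": 4_991,
    "F Function": 16_352,
    "L Living Organisms": 24_265,
    "C Drugs & Biological Products": 14_075,
    "A Physical Agents, Forces and Activities": 1_355,
    "D Disease/Diagnosis": 28_623,
    "P Procedures": 27_033,
    "S Social Context": 433,
    "J Occupations": 1_886,
    "G General Modifiers": 1_176,
}

stated_total = 132_641  # "TOTAL RECORDS" as given on the slide

computed_total = sum(axis_counts.values())
difference = stated_total - computed_total

print(f"sum of axes: {computed_total}, stated: {stated_total}, difference: {difference}")
```

With these figures the eleven axes sum to 132,574, which is 67 fewer than the stated total.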
Diagnosis versus disease
[Figure labels:]
The disease is here
The diagnosis is here
Border’s classification (of medicine?)
• Medicine
  – Mental health
  – Internal medicine
    • Endocrinology
      – Oversized endocrinology
    • Gastro-enterology
    • ...
  – Pediatrics
  – ...
  – Oversized medicine
The ‘Oversized’ categories refer to the size of the books that do not fit on a normal Border’s Bookshop shelf.
MeSH: Geographic Locations
• Africa [Z01.058] +
• Americas [Z01.107] +
• Antarctic Regions [Z01.158]
• Arctic Regions [Z01.208]
• Asia [Z01.252] +
• Atlantic Islands [Z01.295] +
• Australia [Z01.338] +
• Cities [Z01.433] +
• Europe [Z01.542] +
• Historical Geographic Locations [Z01.586] +
• Indian Ocean Islands [Z01.600] +
• Oceania [Z01.678] +
• Oceans and Seas [Z01.756] +
• Pacific Islands [Z01.782] +
Problems:
• mereological mess
• mixture of geographic entities with socio-political entities
• mixture of space and time
MeSH: Geographic Locations [Z01], with Historical Geographic Locations expanded
• Historical Geographic Locations [Z01.586] +
  • Ancient Lands [Z01.586.035] +
  • Austria-Hungary [Z01.586.117]
  • Commonwealth of Independent States [Z01.586.200] +
  • Czechoslovakia [Z01.586.250] +
  • European Union [Z01.586.300]
  • Germany [Z01.586.315] +
  • Korea [Z01.586.407]
  • Middle East [Z01.586.500] +
  • New Guinea [Z01.586.650]
  • Ottoman Empire [Z01.586.687]
  • Prussia [Z01.586.725]
  • Russia (Pre-1917) [Z01.586.800]
  • USSR [Z01.586.950] +
  • Yugoslavia [Z01.586.980] +
Diabetes Mellitus in MeSH 2008
?
Different set of more specific terms when different path from the top is taken.
MeSH: some paths from top to Wolfram Syndrome
[Diagram node labels, in slide order:]
Wolfram Syndrome
All MeSH Categories
Diseases Category
Nervous System Diseases
Cranial Nerve Diseases
Optic Nerve Diseases
Optic Atrophy
Optic Atrophies, Hereditary
Neurodegenerative Diseases
Heredodegenerative Disorders, Nervous System
Eye Diseases
Eye Diseases, Hereditary
Optic Nerve Diseases
Male Urogenital Diseases
Urologic Diseases
Kidney Diseases
Diabetes Insipidus
Female Urogenital Diseases and Pregnancy Complications
Female Urogenital Diseases
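How a single heading ends up under several branches can be sketched as a walk over a polyhierarchy. In the Python sketch below the node names come from the slide, but the parent-to-child links are an assumed reading of the lost diagram and have not been checked against MeSH 2008; `CHILDREN` and `paths_to` are hypothetical names.

```python
from typing import Dict, List

ROOT = "All MeSH Categories"
TARGET = "Wolfram Syndrome"

# ASSUMPTION: these parent -> children links are guessed from the slide's node
# labels; the arrows of the original diagram did not survive in the transcript.
CHILDREN: Dict[str, List[str]] = {
    ROOT: ["Diseases Category"],
    "Diseases Category": [
        "Nervous System Diseases",
        "Eye Diseases",
        "Male Urogenital Diseases",
        "Female Urogenital Diseases and Pregnancy Complications",
    ],
    "Nervous System Diseases": ["Cranial Nerve Diseases", "Neurodegenerative Diseases"],
    "Cranial Nerve Diseases": ["Optic Nerve Diseases"],
    "Optic Nerve Diseases": ["Optic Atrophy"],
    "Optic Atrophy": ["Optic Atrophies, Hereditary"],
    "Optic Atrophies, Hereditary": [TARGET],
    "Neurodegenerative Diseases": ["Heredodegenerative Disorders, Nervous System"],
    "Heredodegenerative Disorders, Nervous System": ["Optic Atrophies, Hereditary"],
    "Eye Diseases": ["Eye Diseases, Hereditary", "Optic Nerve Diseases"],
    "Eye Diseases, Hereditary": ["Optic Atrophies, Hereditary"],
    "Male Urogenital Diseases": ["Urologic Diseases"],
    "Female Urogenital Diseases and Pregnancy Complications": ["Female Urogenital Diseases"],
    "Female Urogenital Diseases": ["Urologic Diseases"],
    "Urologic Diseases": ["Kidney Diseases"],
    "Kidney Diseases": ["Diabetes Insipidus"],
    "Diabetes Insipidus": [TARGET],
}


def paths_to(node: str, target: str, children: Dict[str, List[str]]) -> List[List[str]]:
    """Return every downward path from `node` to `target` (graph assumed acyclic)."""
    if node == target:
        return [[node]]
    found: List[List[str]] = []
    for child in children.get(node, []):
        for tail in paths_to(child, target, children):
            found.append([node] + tail)
    return found


paths = paths_to(ROOT, TARGET, CHILDREN)
# The third element of each path is the disease branch the path runs through.
top_level_branches = {path[2] for path in paths}

for path in paths:
    print(" > ".join(path))
```

Under these assumed links the same heading is reachable through nervous system, eye, and both urogenital branches, which is the situation the slide title describes.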
What would it mean if used in the context of a patient ?
[Diagram: the same MeSH paths from All MeSH Categories to Wolfram Syndrome as on the previous slide, now annotated with two ‘has’ links, ‘???’ and ‘…’]
Description logic is no guarantee of getting parthood right
SNOMED-RT (2000)
SNOMED-CT (2003)
Mistakes due to inappropriate lexical mapping ?
Find the problem
concept
terms
SNOMED-CT (July 2007): “fractured nasal bones”
SNOMED-CT: abundance of false synonymy
nose
bones
fracture
Coding / Classification confusion
A patient with a fractured nasal bone
A patient with a broken nose
A patient with a fracture of the nose
=
=
UMLS: Metathesaurus: merging terminologies
Cycles in hierarchical relationships
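How merging can introduce a cycle is easy to show on a toy example. The sketch below is an illustration under stated assumptions, not the UMLS procedure: the two source hierarchies, the term names `A`, `B`, `C` and the function `has_cycle` are all hypothetical. Each source is acyclic on its own, but their union is not.

```python
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]  # (narrower term, broader term)


def has_cycle(edges: Iterable[Edge]) -> bool:
    """Detect a cycle in the 'narrower -> broader' graph with a depth-first search."""
    graph: Dict[str, List[str]] = {}
    for narrower, broader in edges:
        graph.setdefault(narrower, []).append(broader)
        graph.setdefault(broader, [])

    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNSEEN for node in graph}

    def visit(node: str) -> bool:
        state[node] = IN_PROGRESS
        for broader in graph[node]:
            if state[broader] == IN_PROGRESS:
                return True  # reached a node still on the current path: cycle
            if state[broader] == UNSEEN and visit(broader):
                return True
        state[node] = DONE
        return False

    return any(state[node] == UNSEEN and visit(node) for node in graph)


# Hypothetical source terminologies; each one is a valid hierarchy by itself.
source_1: List[Edge] = [("B", "A"), ("C", "B")]  # C is narrower than B, B than A
source_2: List[Edge] = [("A", "C")]              # A is narrower than C

merged = source_1 + source_2

print(has_cycle(source_1), has_cycle(source_2), has_cycle(merged))
```

A check of this kind only finds that a cycle exists; deciding which of the conflicting hierarchical claims is wrong still requires looking at what the terms refer to.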
Conclusion
• Concept-based terminology (and standardisation thereof) is there as a mechanism to improve understanding of messages by humans.
• It is NOT the right device
  – to explain why reality is what it is, how it is organised, etc. (although it is needed to allow communication),
  – to reason about reality,
  – to make machines understand what is real,
  – to integrate across different views, languages, conceptualisations, ...
Why not ?
Because there is no valid benchmark !
Why not ?
• Conceptualism does not take care of the structure of reality.
• Concepts do not necessarily correspond to something that (will) exist(ed)
  – Sorcerer, unicorn, leprechaun, ...
• Definitions set the conditions under which terms may be used, and may not be abused as conditions an entity must satisfy to be what it is (Kantian constructivism).
• Language can make strings of words look as if they were terms
  – “Middle lobe of left lung”
Today’s biggest problem: a confusion between “terminology” and “ontology”
• The conditions agreed upon for when to use a certain term to denote an entity are often different from the conditions which make an entity what it is.
  – Trees would still be different from rabbits if there were no humans to agree on what these things should be called.
• “ontos” means “being”. The link with reality tends to be forgotten: one concentrates on the models instead of on the reality.
What to do about it ? (1)
• Research:
  – Revision of the appropriateness of concept-based terminology for specific purposes;
  – Relationship between models and that part of reality that the models want to represent;
  – Adequacy of current tools and languages for representation;
  – Boundaries between terminology and ontology and the place of each in semantic interoperability.
What to do about it ? (2)
• Training and awareness
  – Make people more critical with respect to terminology and ontology promises
    • What is needed must be based on needs, not on the popularity of a new paradigm
    • But in a system, it’s not just your own needs, it is each component’s needs !
  – Towards “an ontology of ontologies”
    • First description
    • Then quality criteria
Ontology based on Unqualified Realism
• Accepts the existence of
  – a real world outside mind and language
  – a structure in that world prior to mind and language (universals / particulars)
• Rejects nominalism, conceptualism, ontology as a matter of agreement on ‘conceptualizations’
• Uses reality as a benchmark for testing the quality of ontologies as artifacts by building appropriate logics with referential semantics (rather than model-theoretic)
How does that work ?
Come and see tomorrow