ifomis.org1
The Ontology of Biomedical Data (Die Ontologie biomedizinischer Daten)
Barry Smith
Institute for Formal Ontology and Medical Information Science
ifomis.org2
IFOMIS
Institute for Formal Ontology and Medical Information Science
Mission: to develop formal ontologies to support empirical research in biomedical informatics and in the life sciences in general
ifomis.org3
Biomedical ontologies and terminology systems
currently manifest a very low degree of formal rigour
ifomis.org4
SNOMED-CT
900,000 ‘concepts’ and relations between them, such as
is_a (for class subsumption)
part_of
causes
treats
ifomis.org5
SNOMED’s confused treatment of is_a
both_testes is_a testis
both_uteri is_a uterus
ifomis.org6
partial_breech_extraction is_a breech_extraction
breech_extraction is_a complete_breech_delivery
____________________________
partial_breech_extraction is_a complete_breech_delivery
ifomis.org7
Confused treatment of objects and processes
diagnostic_endoscopic_examination_of_mediastinum_NOS is_a mediastinoscope
ifomis.org8
Confusion of object with knowledge about object
contraception is_a functional_finding
ifomis.org9
National Cancer Institute Thesaurus
a biomedical thesaurus created specifically to meet the needs of the NCI
semantically modeled cancer-related terminology built using Description Logic
ifomis.org10
NCI Thesaurus root concepts
Anatomic Structure, Anatomic System, or Anatomic Substance
‘Or’? Does the NCI not know to which category any item classified there belongs?
‘Anatomic Substance’? If yes, why is gene product not subsumed by it? If no, why are drugs and chemicals not subsumed by it?
ifomis.org11
Conceptual entity
Definition: none
Semantic type:
– Conceptual entity
– Classification
Subconcepts:
– Action:
• Definition: action; a thing done
– And:
• Definition: an article which expresses the relation of connection or addition, used to conjoin a word with a word, ...
ifomis.org12
Action is_a Conceptual Entity
And is_a Conceptual Entity
Swimming is healthy and contains 8 letters
Conceptual entity
ifomis.org13
Definition of “cancer gene”
ifomis.org14
NCI Thesaurus architecture
Disease
Breast
Breast neoplasm
Disease-has-associated-anatomy
ISA
Findings-And-Disorders-Kind Anatomy-Kind
“Formal subsumption” or
“inheritance”
“Associative” relationships providing
“differentiae”
“Kinds” restrict the domain and range of
associative relationships
What diseases have a diameter of over 3 cm ?
ifomis.org15
Confusion of objects and the states in which they participate
Disease
Breast
Breast neoplasm
Disease-has-associated-anatomy
ISA
Findings-And-Disorders-Kind Anatomy-Kind
ifomis.org16
No one knows what ‘concept’ (or ‘conceptualization’) means
1. The linguistic reading
2. The psychological reading
3. The epistemological reading
4. The ontological reading
ifomis.org17
1) The linguistic reading
A concept is a meaning that is shared in common by a collection of synonymous terms
ifomis.org18
Unified Medical Language System
is_a =def.
If one item ‘is_a’ another item then the first item is more specific in meaning than the second item.
ifomis.org19
Fruit
Orange
Vegetable
similarTo
Apfelsine
synonymWith
NarrowerTerm
Goble & Shadbolt
Semantic Networks
ifomis.org20
The linguistic reading is bad
for work on ontologies in support of research in the natural sciences / evidence-based medicine
ifomis.org21
Problem of evaluation
a good ontology/terminology/vocabulary = one which corresponds to reality as it exists beyond our concepts
if an ontology is a mere network of meanings, then the distinction between good and bad ontologies loses its foothold
ifomis.org22
angel or devil are perfectly good concepts
so are
cancelled performance
avoided meeting
prevented pregnancy
imagined mammal
alien implant removal
Chios energy healing
ifomis.org23
The linguistic reading
yields a more or less coherent reading of relations like:
‘is_a’
‘synonymous_with’
‘associated_to’
but it fails miserably when it comes to relations of other types
ifomis.org24
part_of
heart part_of human
human heart part_of human
testis part_of human
human testis part_of human
but not: human has_part human testis
ifomis.org25
how can concepts, on the linguistic reading, figure as relata of relations
like:
part_of = def. composes, with one or more other physical units, some larger whole
contains =def. is the receptacle for fluids or other substances
ifomis.org26
How can a set of synonymous terms serve as
a receptacle for fluids or other substances?
ifomis.org27
The psychological reading of ‘concept’
Concepts are ideas in the minds of human subjects
ifomis.org28
Eugen Wüster, 1935
Professor of Woodworking Machinery in the Vienna Agricultural College
ifomis.org29
Eugen Wüster
Terminology hobbyist and founder of International Standards Organization Technical Committee 37
ifomis.org30
International Standard Bad Philosophy
Wüster: concepts are inside people’s brains
ISO terminology standards
ifomis.org31
Wüster
a concept is a mental surrogate of a plurality of objects grouped together on the basis of perceived similarities
and what makes those objects similar is another concept
(Turtles all the way down)
ifomis.org32
ISO: Terminologists should still postulate ‘concepts’ even when they have no idea of what the terms in question mean
In the domain of woodworking equipment we can see the similarities between groups of objects to which general terms are assigned.
Not so in medicine (consider: a carcinoma, or an embryo, in the successive phases of its development)
ifomis.org33
Wüster / ISO on ‘objects’
object = def. anything to which human thought is or can be directed
... whether material or immaterial, real or purely imagined
ISO: In the course of producing a terminology, philosophical discussions on whether an object actually exists in reality … are to be avoided.
ifomis.org34
3) The epistemological reading
Concepts are ‘units of knowledge’
as in ‘knowledge modeling’, ‘knowledge representation’, ‘knowledge-intense disciplines’
Even errors are ‘knowledge’ on this reading
– so here, too, the concept orientation draws us too far away from empirical science and too close to delusion and myth
ifomis.org35
Against ‘knowledge representation’
Not
‘KNOWLEDGE-BASED SYSTEMS’
but
‘true-or-false-belief-based systems’
ifomis.org36
Concepts are Triply Ethereal
because they are simultaneously supposed to be
1. software proxies for entities in reality (some ghostly diabetes counterpart is needed – because “you can’t get the diabetes itself inside the computer”)
2. the ‘knowledge’ (ideas and beliefs) in the minds of human experts
3. the meanings of the terms such experts use
ifomis.org37
ifomis.org38
4) The ontological reading
concepts are not creatures of cognition or of computation
they are invariants out there in reality
Better: they are what philosophers call types, kinds, universals
ifomis.org39
is_a
human is_a mammal
all instances of the universal human are instances of the universal mammal
is_a defined in terms of the primitive relation of instantiation between a particular and a universal
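The definition of is_a in terms of instantiation can be sketched in a few lines. This is only a toy illustration: the particulars (`mary`, `john`, `rex`) and the `instance_of` table are invented for the example, and a real system would not enumerate instances this way.

```python
# Toy model: is_a between universals, derived from the primitive
# instantiation relation between particulars and universals.
# All particulars and the table below are invented for illustration.

# instance_of[particular] = set of universals that particular instantiates
instance_of = {
    "mary": {"human", "mammal"},
    "john": {"human", "mammal"},
    "rex": {"dog", "mammal"},
}

def instances(universal):
    """Return the particulars that instantiate the given universal."""
    return {p for p, universals in instance_of.items() if universal in universals}

def is_a(a, b):
    """A is_a B: every instance of A is also an instance of B."""
    return instances(a) <= instances(b)

print(is_a("human", "mammal"))  # every human here is a mammal
print(is_a("mammal", "human"))  # rex is a mammal but not a human
```

Note that a universal with no recorded instances passes this check trivially, so the sketch only captures the "all instances" clause of the definition.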
ifomis.org40
part_of
defined in terms of the primitive relation of mereological parthood defined between one instance and another (for example between Mary and her heart)
A part_of B =def. given any instance a of A there is some instance b of B such that a part_of b
ifomis.org41
inverse relations
nucleus part_of cell
cell has_part nucleus
ifomis.org42
All-some definitions of relations between universals
A adjacent_to B =def
all instances of A are adjacent to (in the instance-level sense) some instance of B
ifomis.org43
Adjacency as a relation between universals is not symmetrical
nucleus adjacent_to cytoplasm
Not: cytoplasm adjacent_to nucleus
seminal vesicle adjacent_to urinary bladder
Not: urinary bladder adjacent_to seminal vesicle
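Why an all-some relation between universals can fail to be symmetrical, even when the underlying instance-level relation is symmetrical, can be shown with a small sketch. The instances below are invented: `c2` stands for the cytoplasm of a hypothetical cell that has no nucleus.

```python
# Toy model of an all-some definition. Instances and pairs are invented.
instance_of = {
    "n1": "nucleus",
    "c1": "cytoplasm",  # cytoplasm of a nucleated cell
    "c2": "cytoplasm",  # cytoplasm of a cell without a nucleus (assumed)
}

# Instance-level adjacency, stored symmetrically.
adjacent_pairs = {("n1", "c1"), ("c1", "n1")}

def all_some(a, b, pairs):
    """A rel B at the level of universals: every instance of A stands in
    the instance-level relation to at least one instance of B."""
    a_instances = [x for x, u in instance_of.items() if u == a]
    b_instances = {x for x, u in instance_of.items() if u == b}
    return all(any((x, y) in pairs for y in b_instances) for x in a_instances)

print(all_some("nucleus", "cytoplasm", adjacent_pairs))
print(all_some("cytoplasm", "nucleus", adjacent_pairs))
```

The first call succeeds because the only nucleus is adjacent to some cytoplasm; the second fails because `c2` is adjacent to no nucleus.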
ifomis.org44
Evaluation
Bad ontologies are (inter alia) those whose general terms lack the relation to corresponding universals in reality, and thereby also to corresponding instances.
ifomis.org45
Good ontologies
= representations of universals and particulars in reality
ifomis.org47
?
The concept diabetes mellitus becomes ‘associated with a diabetic patient’
concept patient concept diabetes
what it is on the side of the patient?
what is the relation here?
ifomis.org48
what it is on the
side of the patient
Make this our starting point
+substance accident
ifomis.org49
A bottom-up approach
begin with what confronts the physician at the point of care (or in the lab):
instances in reality (patients, disorders, pains, fractures, ...)
= the what it is on the side of the patient
and build up to terminologies from there
ifomis.org50
What happens when a new disorder first begins to make itself manifest?
physicians delineate a certain family of cases manifesting a new pattern of symptoms
... hypothesis: they are instances of a single universal or kind
(this universal still hardly understood)
but already: need for a new term (e.g. ‘AIDS’)
ifomis.org51
‘SARS’
not: severe acute respiratory syndrome
but: this particular severe acute respiratory syndrome, instances of which were first identified in Guangdong in 2002 and caused by instances of this particular coronavirus whose genome was first sequenced in Canada in 2003
ifomis.org52
Users can point to instances in the lab or clinic – but not yet to universals
The terminologist plugs the gap by postulating concepts instead
ifomis.org54
It’s sometimes hard to grasp the universals in reality to which our general terms refer.
So, let’s guarantee that every general term ‘w’ has a precisely tailored referent: ‘the concept w’
We can then forget the messy job of coming to grips with reality, and substitute instead the more pleasant job of grasping the conceptual entities we ourselves have created
ifomis.org55
Better: terminology building should start from the instances that we apprehend in the lab or clinic
Assertions in scientific texts pertain to universals in reality
Assertions in the EHR pertain to instances of these universals
ifomis.org56
Universals are those invariants in reality which make possible the use of general terms in scientific inquiry and the use of standardized tests and standardized therapies in clinical care
ifomis.org57
Universals have instances
SNOMED CT comprehends universals in the realms of disorders, symptoms, anatomical structures, ...
In each case we have corresponding instances
= the what it is on the side of the patient
but such instances are poorly recorded in EHRs so far
ifomis.org58
The Great Task of Terminology Building in an Age of Evidence-Based Medicine
Terminology work should start with instances in reality, and seek to build up from there to align our terms with the corresponding universals
We can then abandon the detour through concepts altogether
ifomis.org59
Terminologies should be aligned not with concepts but with universals in
reality
including the universals instituted by therapies, acts of measurement, portions of bodily substance, etc.
ifomis.org60
Define a node of a terminology:
<p, Sp, d>
with p a preferred term (string)
Sp a set of synonyms
d an (optional) definition
Define a terminology:
T = <N, L, v>
N a set of nodes
L a set of links (graph-theoretical edges)
v a version number
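The two tuples <p, Sp, d> and <N, L, v> map naturally onto record types. The sketch below is one possible encoding, not a prescribed one: the field names, the labelled-edge form of a link, and the sample synonym are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple

@dataclass(frozen=True)
class Node:
    """A terminology node <p, Sp, d>."""
    preferred_term: str                     # p
    synonyms: FrozenSet[str] = frozenset()  # Sp
    definition: Optional[str] = None        # d (optional)

@dataclass(frozen=True)
class Terminology:
    """A terminology T = <N, L, v>."""
    nodes: FrozenSet[Node]                  # N
    # L: edges as (source term, label, target term); the label is an assumption
    links: FrozenSet[Tuple[str, str, str]]
    version: str                            # v

fracture = Node("fracture")
femur_fracture = Node(
    "closed fracture of shaft of femur",
    frozenset({"closed femoral shaft fracture"}),  # invented synonym
)
T = Terminology(
    nodes=frozenset({fracture, femur_fracture}),
    links=frozenset({(femur_fracture.preferred_term, "is_a", fracture.preferred_term)}),
    version="v1",
)
print(len(T.nodes), T.version)
```

Making both records immutable means that a new version of a terminology is a new value, which fits the idea of a developing series of versions.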
ifomis.org61
The ideal: one-to-one correspondence between nodes and universals in reality
Problem: bad terms (‘phlogiston’, ‘diabetes’)
At any given stage we will have:
N = N1 ∪ N> ∪ N<, where
N1 = terms which correspond to exactly one universal
N> = terms which correspond to more than one universal
N< = terms which correspond to less than one universal
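The three-way division of N can be sketched as a function of a hypothesised mapping from terms to universals. The mapping itself is the unknown: the table below is an invented hypothesis, with placeholder universal names, and only serves to show how one candidate division would be computed.

```python
def partition(terms, universals_for):
    """Split terms into (N1, N>, N<) under a hypothesised term-to-universals map."""
    n1, n_more, n_less = set(), set(), set()
    for term in terms:
        count = len(universals_for.get(term, set()))
        if count == 1:
            n1.add(term)      # exactly one universal
        elif count > 1:
            n_more.add(term)  # more than one universal
        else:
            n_less.add(term)  # no universal at all
    return n1, n_more, n_less

# Hypothesis only: which universals (if any) each term corresponds to.
hypothesis = {
    "heart": {"heart"},
    "diabetes": {"universal_a", "universal_b"},  # placeholder names
    "phlogiston": set(),
}
n1, n_more, n_less = partition(hypothesis.keys(), hypothesis)
print(n1, n_more, n_less)
```

Swapping in a different hypothesis table yields a different division, which is the sense in which alternative scenarios can be tried out.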
ifomis.org62
The belief in scientific progress
with the passage of time, N> and N< will become ever smaller, so that N1 will approximate ever more closely to N *
Assumption: the vast bulk of the beliefs expressed / presupposed in biomedical texts are true. Hence N1 already constitutes a very large portion of N (the collection of terms already in general use).
*modulo the fact that the totality of universals will itself change with the passage of time
ifomis.org63
There are hearts
ifomis.org64
But science is an asymptotic process
At all stages prior to the ideal end to our labors, we will not know where the boundaries between N1, N<, and N> are to be drawn
ifomis.org65
We do not know how the terms are presently distributed between N1, N< and N>,
So: is the distinction of purely theoretical interest – a matter of abstract (philosophical) housekeeping ?
ifomis.org66
We typically have at our disposal a whole developing series of versions of a terminology
New idea: we can create locally our own alternative developing series in order to test out alternative hypotheses regarding how to classify given particulars as instances of given types of disorders or symptoms
ifomis.org67
How to make instances visible to reasoning systems?
Create an EHR regime in which explicit alphanumerical IUIs (instance unique identifiers) are automatically assigned to each instance when it first becomes relevant to the treatment of a given patient
ifomis.org68
We could then perform experiments with terminologies
Our referent-tracking machinery will give us the facility to experiment with different scenarios as concerns the division between N1, N<, and N>
better terminologies
better decision-support for diagnosis
ifomis.org69
How medical terms are introduced
we have a pool of cases (instances) manifesting a certain hitherto undocumented pattern of irregularities (deviations from the norm)
the universal kind which they instantiate is unknown – and the challenge is to solve for this unknown
(cf. the discovery of Pluto)
ifomis.org70
Instance vector
an ordered triple
<i, p, t>
i is an IUI, p a preferred term, and t a time
instance #5001 is associated with
SNOMED-CT code glomus tumor
at 4/28/2005 11:57:41 AM
ifomis.org71
Instantiation of a terminology
Let D be a set of IUIs (collected by a given hospital)
For a terminology T= <N,L,v> define an instantiation
It(T, D)
as the set of all instance vectors <i, p, t> for i in D and p in N
ifomis.org72
Instantiation of a term
For each term p, define its t-extension
It(T, D)(p)
as the set of all IUIs i for which <i, p, t> is included in It(T, D)
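Instance vectors, the instantiation It(T, D) and the t-extension of a term can be sketched together. Two points are assumptions of this sketch rather than part of the definitions: the instantiation is computed from a set of recorded vectors, and a vector counts for time t if it was recorded at or before t. The second vector (`#5002`) is invented.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass(frozen=True)
class InstanceVector:
    iui: str        # i: instance unique identifier
    term: str       # p: preferred term of a terminology node
    time: datetime  # t: time of the association

def instantiation(recorded, node_terms, iuis, t):
    """It(T, D): recorded vectors with i in D and p in N, taken up to time t
    (the 'at or before t' reading is an assumption)."""
    return {
        v for v in recorded
        if v.iui in iuis and v.term in node_terms and v.time <= t
    }

def t_extension(inst, term):
    """It(T, D)(p): all IUIs i that occur with term p in the instantiation."""
    return {v.iui for v in inst if v.term == term}

recorded = {
    InstanceVector("#5001", "glomus tumor", datetime(2005, 4, 28, 11, 57, 41)),
    InstanceVector("#5002", "glomus tumor", datetime(2005, 6, 1, 9, 0, 0)),  # invented
}
node_terms = {"glomus tumor"}
D = {"#5001", "#5002"}
t = datetime(2005, 5, 1)

ext = t_extension(instantiation(recorded, node_terms, D, t), "glomus tumor")
print(ext)
```

With t set to 1 May 2005 only `#5001` falls in the extension; moving t past June 2005 would add `#5002`.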
ifomis.org73
Tracking invariants
For each p we subject its t-extensions for varying t and D to statistically based factor-analysis in order to determine whether
1. p is in N1 (it designates a single universal): the instances in It(T, D)(p) manifest a common invariant pattern
2. p is in N>
3. p is in N<
ifomis.org74
We can track patterns for It(T, D)(p)
e.g. in relation to the IUIs for patients in given geographical areas, or at given stages of development and growth
In relation to a given patient, we can track patterns e.g. for different diagnoses, e.g.
It(T, D)(p) vs. It(T, D)(q + r)
to see which gives a better match
ifomis.org75
Diagnostic decision-support
Consider the characteristic patterns of correction which arise in the early phases of diagnosis of degenerative diseases such as multiple sclerosis.
ifomis.org76
The (true) story of Jane Smith
(with thanks to Werner Ceusters)
ifomis.org77
Jane’s favourite supermarket
July 4th, 1990: Jane goes shopping:
The freezer section of Jane’s favourite supermarket
The only available warning sign used outside
A very suspiciously shaped upper leg
ifomis.org78
A visit to the hospital
City Health Centre (City HC)
Dr. Peters
Dr. Longley
ifomis.org79
Diagnosis: a severe spiral fracture of the femur
ifomis.org80
The City HC’s medical record
captures in a structured form all of the ‘clinically significant’ information in the narrative notes
ifomis.org81
Structured Medical Record
www.medappz.com/
04/07/1990 – 17:10
Dr. Peters
Jane Smith
Orthopedics
Emergency visit: 04/07/1990 – 17.00
Severe
Left upper leg
Since fall on floor
Constant
26442006 closed fracture of shaft of femur
81134009 fracture, closed, spiral
ifomis.org82
Problems
PtID   Date        ObsCode    Narrative
5572   04/07/1990  26442006   closed fracture of shaft of femur
5572   04/07/1990  81134009   Fracture, closed, spiral
5572   12/07/1990  26442006   closed fracture of shaft of femur
5572   12/07/1990  9001224    Accident in public building (supermarket)
5572   04/07/1990  79001      Essential hypertension
0939   24/12/1991  255174002  benign polyp of biliary tract
2309   21/03/1992  26442006   closed fracture of shaft of femur
2309   21/03/1992  9001224    Accident in public building (supermarket)
47804  03/04/1993  58298795   Other lesion on other specified region
5572   17/05/1993  79001      Essential hypertension
298    22/08/1993  2909872    Closed fracture of radial head
298    22/08/1993  9001224    Accident in public building (supermarket)
5572   01/04/1997  26442006   closed fracture of shaft of femur
5572   01/04/1997  79001      Essential hypertension
0939   20/12/1998  255087006  malignant polyp of biliary tract
Same patient, same hypertension code: same (numerically identical) hypertension?
Different patients, same fracture codes: same (numerically identical) fracture?
Same patient, different dates, same fracture codes: same (numerically identical) fracture?
Same patient, same date, 2 different fracture codes: same (numerically identical) fracture?
Different patients. Same supermarket? Maybe the same (irrelevant?) freezer section? Or different supermarkets, but always in the freezer sections?
Same patient, different dates, different codes: same (numerically identical) polyp?
ifomis.org83
Main problems of EHRs
Statements refer only implicitly to the concrete entities about which they give information.
Codes are general: they tell us only that some instance of the class the codes refer to is referred to in the statement, but not which instance precisely.
ifomis.org84
Proposed solution:
Referent Tracking
Purpose:
– explicit reference to the concrete individual entities relevant to the accurate description of each patient’s condition, therapies, outcomes, ...
Method:
– Introduce an Instance Unique Identifier (IUI) for each relevant particular / instance
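How an IUI per particular removes the ambiguity of general codes can be sketched with a few rows of the record table. The patient ids, dates and codes are taken from the table; the IUI values (`IUI-A`, `IUI-B`, `IUI-C`) are hypothetical labels added for this example and do not reproduce any actual assignment.

```python
from collections import defaultdict

# (patient id, date, code, narrative, IUI); the IUI column is hypothetical
records = [
    ("5572", "04/07/1990", "26442006", "closed fracture of shaft of femur", "IUI-A"),
    ("5572", "04/07/1990", "81134009", "Fracture, closed, spiral", "IUI-A"),
    ("5572", "12/07/1990", "26442006", "closed fracture of shaft of femur", "IUI-A"),
    ("5572", "01/04/1997", "26442006", "closed fracture of shaft of femur", "IUI-B"),
    ("2309", "21/03/1992", "26442006", "closed fracture of shaft of femur", "IUI-C"),
]

def same_instance(row_a, row_b):
    """Two statements are about the numerically identical entity
    exactly when they carry the same IUI."""
    return row_a[4] == row_b[4]

def codes_by_instance(rows):
    """Collect every code that has been used to describe each particular."""
    grouped = defaultdict(set)
    for _patient, _date, code, _narrative, iui in rows:
        grouped[iui].add(code)
    return dict(grouped)

print(same_instance(records[0], records[1]))  # two codes, one fracture
print(same_instance(records[0], records[3]))  # same code, different fracture
print(codes_by_instance(records))
```

Under these hypothetical labels, the two codes recorded on 04/07/1990 describe one fracture, while the 1997 entry with the same code describes another; the code column alone cannot express either fact.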
ifomis.org85
CUI (coo-ey): Concept Unique Identifier (e.g. a SNOMED code)
UUI (oo-ey): Universal Unique Identifier
IUI (you-ey): Instance Unique Identifier (e.g. a Social Security Number)
ifomis.org86
An ontological analysis
Continuants:
City HC
The freezer section of Jane’s favourite supermarket
Jane’s left femur
Jane’s left femur fracture
Jane Smith
Dr. Peters
Jane’s fracture’s image
Dr. Longley
City HC’s EHR system
Universals:
EHR system
HC
Freezer section
Person
Femur
Fracture
Image
Occurrents (along the time axis t):
Jane’s falling
Jane’s femur breaking
Dr. Peters’ examination of Jane’s fracture
Dr. Peters’ ordering of an X-ray
Shooting the pictures of Jane’s leg
Jane’s fracture’s healing
Dr. Peters’ diagnosis making
Jane dies
Freezer section dismantled
Dr. Longley’s examination of Jane’s fracture
ifomis.org87
IUI assignments
The act of IUI assignment can be represented as:
<IUIa, Ai, td>
IUIa = IUI of the registering agent
Ai = <IUIa, IUIp, tap, c>
IUIa = IUI of the author of the assertion
IUIp = IUI of the particular
tap = time of assignment
c = optional description
td = time of registering Ai in the IUI-repository
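The nested tuple <IUIa, Ai, td> can be written down as two record types. This is a sketch under assumptions: the field names are invented, the two uses of IUIa (registering agent and author) are given distinct names to keep them apart, and all sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass(frozen=True)
class Assignment:
    """Ai = <IUIa, IUIp, tap, c>."""
    author_iui: str                    # IUIa: author of the assertion
    particular_iui: str                # IUIp: the particular being identified
    assigned_at: datetime              # tap: time of assignment
    description: Optional[str] = None  # c: optional description

@dataclass(frozen=True)
class RegisteredAssignment:
    """<IUIa, Ai, td>."""
    registrar_iui: str                 # IUIa: the registering agent
    assignment: Assignment             # Ai
    registered_at: datetime            # td: time of registering Ai in the repository

# Hypothetical values, for illustration only.
a = Assignment("IUI-author", "IUI-particular", datetime(1990, 7, 4, 17, 10), "left femur fracture")
r = RegisteredAssignment("IUI-registrar", a, datetime(1990, 7, 4, 17, 15))
print(r.assignment.particular_iui, r.registered_at >= a.assigned_at)
```

Keeping tap and td as separate fields preserves the difference between when an identifier was assigned and when that assignment entered the repository.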
ifomis.org88
A SNOMED-CT example
<IUI-0945, 18/04/2005, SNOMED-CT v0301, IUI-1921, 367720001, forever>
• #IUI-0945: author of the statement
• #IUI-1921: the left testicle of patient #IUI-78127
• 367720001: the SNOMED concept-code to which “left testis” is (in SNOMED) attached as term
So we can denote #IUI-1921 by means of
• that left testis
• that entire left testis
• that testicle, that male gonad, that testis
• that genital structure
• that physical anatomical entity
• BUT NOT: that SNOMED-CT concept
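The example statement and the list of admissible descriptions can be sketched as follows. The field names are an assumed reading of the six positions in the tuple (only author, particular and code are explained above), and the `parent` chain is a hypothetical simplification that uses some of the terms from the list; it is not SNOMED CT's actual hierarchy.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CodeStatement:
    author_iui: str      # who makes the statement
    asserted_on: str     # date, kept as written (assumed meaning)
    terminology: str     # terminology and version
    particular_iui: str  # the particular being described
    code: str            # code applied to the particular
    validity: str        # period for which the statement holds (assumed meaning)

stmt = CodeStatement("IUI-0945", "18/04/2005", "SNOMED-CT v0301",
                     "IUI-1921", "367720001", "forever")

# Hypothetical lookup tables, for illustration only.
term_for_code = {"367720001": "left testis"}
parent = {
    "left testis": "testis",
    "testis": "genital structure",
    "genital structure": "physical anatomical entity",
}

def denotations(statement):
    """Descriptions that can denote the particular: the coded term
    and each of its (hypothetical) is_a ancestors."""
    term = term_for_code[statement.code]
    result = []
    while term is not None:
        result.append("that " + term)
        term = parent.get(term)
    return result

print(denotations(stmt))
```

Every description produced refers to the particular #IUI-1921 itself; none refers to the code or to a concept.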
ifomis.org89
Pragmatics of IUIs in EHRs
IUI assignment requires (just a bit) more effort compared to current use of general codes from concept-based systems
ifomis.org90
IUIs in structured EHRs
www.medappz.com/
04/07/1990 – 17:10
Dr. Peters
Jane Smith
Orthopedics
Emergency visit: 04/07/1990 – 17.00
Severe
Left upper leg
Since fall on floor
Constant
26442006 closed fracture of shaft of femur
81134009 fracture, closed, spiral
“Left upper leg”: replaced by the IUI for the patient’s left upper leg. That IUI might be found by using “left upper leg” as a search term to query the RTDB.
The two fracture codes: both replaced by the IUI for that fracture. The IUI is related to the SNOMED codes by means of PtCO statements.
ifomis.org91
Advantage: better reality representation
PtID   Date        ObsCode    Narrative
5572   04/07/1990  26442006   closed fracture of shaft of femur
5572   04/07/1990  81134009   Fracture, closed, spiral
5572   12/07/1990  26442006   closed fracture of shaft of femur
5572   12/07/1990  9001224    Accident in public building (supermarket)
5572   04/07/1990  79001      Essential hypertension
0939   24/12/1991  255174002  benign polyp of biliary tract
2309   21/03/1992  26442006   closed fracture of shaft of femur
2309   21/03/1992  9001224    Accident in public building (supermarket)
47804  03/04/1993  58298795   Other lesion on other specified region
5572   17/05/1993  79001      Essential hypertension
298    22/08/1993  2909872    Closed fracture of radial head
298    22/08/1993  9001224    Accident in public building (supermarket)
5572   01/04/1997  26442006   closed fracture of shaft of femur
5572   01/04/1997  79001      Essential hypertension
0939   20/12/1998  255087006  malignant polyp of biliary tract
IUI labels shown beside the rows on the slide (row alignment not recoverable from this transcript): IUI-001, IUI-001, IUI-001, IUI-003, IUI-004, IUI-004, IUI-005, IUI-005, IUI-005, IUI-007, IUI-007, IUI-007, IUI-002, IUI-012
ifomis.org92
Other Advantages
Mappings between ontologies and coding systems created as by-product of tracking
– Descriptions about the same particular using different systems, e.g. in different hospitals
Quality control of ontologies and concept-based systems
– Systematically inconsistent descriptions within or across terminologies may indicate poor definition of the respective terms
ifomis.org93
Other Advantages
Referent tracking can be used in decision support when making diagnoses
We can consider the results of assignment of different clinical codes to one and the same collection of IUIs assembled over a period (and thereby uncover new patterns of symptoms, e.g. in a case of multiple sclerosis)
ifomis.org94
Conclusion
Referent tracking can solve a number of problems in an elegant way.
Existing (or emerging) technologies can be used for the implementation.
Old technologies can play an interesting role.
A Big Brother feeling is to be expected, but is easy to counter with adequate measures.
A pilot is being established.
ifomis.org95
The End
http://ifomis.org