Date posted: 19-Dec-2015
KNOWLEDGE-BASED METHOD FOR DETERMINING THE MEANING OF AMBIGUOUS BIOMEDICAL TERMS USING INFORMATION CONTENT MEASURES OF SIMILARITY
Bridget McInnes
Ted Pedersen
Ying Liu
Genevieve B. Melton
Serguei Pakhomov
OBJECTIVE OF THIS WORK
Develop and evaluate a method that can disambiguate terms in biomedical text by exploiting similarity information extrapolated from the Unified Medical Language System (UMLS)
Evaluate the efficacy of information content-based similarity measures over path-based similarity measures for word sense disambiguation (WSD)
WORD SENSE DISAMBIGUATION
Word sense disambiguation is the task of determining the appropriate sense of a term given the context in which it is used.
TERM: tolerance
Possible senses: Drug Tolerance, Immune Tolerance
Example: Buspirone attenuates tolerance to morphine in mice with skin cancer
SENSE INVENTORY: UNIFIED MEDICAL LANGUAGE SYSTEM
Unified Medical Language System (UMLS) knowledge sources:
Semantic Network
Metathesaurus
  ~1.7 million biomedical and clinical concepts, integrated semi-automatically
  Concepts are identified by CUIs (Concept Unique Identifiers) and linked by relations:
    Hierarchical: PAR/CHD and RB/RN
    Non-hierarchical: SIB, RO
  Sources can be viewed together or independently, e.g. Medical Subject Headings (MSH)
SPECIALIST Lexicon
  Biomedical and clinical terms, including variants
WORD SENSE DISAMBIGUATION
Buspirone attenuates tolerance to morphine in mice with skin cancer
Drug Tolerance: C0013220
Immune Tolerance: C0020963
Each sense is identified by a Concept Unique Identifier (CUI)
SENSERELATE ALGORITHM
Each possible sense of a target word is assigned a score: the sum of the similarities between that sense and the terms surrounding the target word
The target word is assigned the sense with the highest score
Proposed by Patwardhan and Pedersen (2003) using WordNet
UMLS::SenseRelate is a modification of this algorithm that uses information from the UMLS
NEXT UP: an example
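SENSERELATE EXAMPLE
Buspirone attenuates tolerance to morphine in mice with skin cancer
Possible senses of the target word:
Drug Tolerance: C0013220
Immune Tolerance: C0020963
Concepts of the surrounding terms:
Buspirone: C0006462
Morphine: C0026549
Mice: C0026809
Skin cancer: C0007114
[Figure: each sense is linked to each surrounding concept, and each link is labeled with a similarity value]
Drug Tolerance score = 0.09 + 0.09 + 0.16 + 0.11 = 0.45
Immune Tolerance score = 0.09 + 0.09 + 0.05 + 0.04 = 0.27
Drug Tolerance has the higher score, so it is the sense assigned to the target word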
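The scoring step in this example can be sketched in a few lines of Python. This is a minimal illustration, not the UMLS::SenseRelate implementation: the function name `score_senses` is hypothetical, and the similarity values are the ones shown in the figure, listed per sense without attributing them to particular surrounding terms.

```python
def score_senses(similarities_by_sense):
    """Sum the similarity values for each candidate sense and pick the best.

    similarities_by_sense maps a sense (here a CUI) to the list of similarity
    values between that sense and each surrounding concept.
    """
    scores = {sense: sum(values) for sense, values in similarities_by_sense.items()}
    best = max(scores, key=scores.get)  # sense with the highest summed similarity
    return best, scores


# Similarity values as shown in the slide figure (order within each list is arbitrary).
similarities = {
    "C0013220": [0.09, 0.09, 0.16, 0.11],  # Drug Tolerance
    "C0020963": [0.09, 0.09, 0.05, 0.04],  # Immune Tolerance
}

best_sense, sense_scores = score_senses(similarities)
print(best_sense, {s: round(v, 2) for s, v in sense_scores.items()})
```

Keeping the similarity function outside the scorer means the same loop works for any of the path-based or IC-based measures.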
SENSERELATE ASSUMPTION
An ambiguous word is often used in the sense that is most similar to the sense of the terms that surround it
SENSERELATE COMPONENTS
Identifying the concepts of surrounding terms
Calculating semantic similarity
IDENTIFYING THE CONCEPTS OF THE SURROUNDING TERMS
Use the SPECIALIST Lexicon to identify the terms, then map each term to a concept by string match against the MRCONSO table in the UMLS
Buspirone attenuates tolerance to morphine in mice with skin cancer
SPECIALIST Lexicon entries:
...
skin cancer
skin grafting
skin disease
...
MRCONSO entries:
...
skin cancer C0007114
skin grafting C0037297
skin disease C0037274
...
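The two-step lookup can be illustrated with a small Python sketch. Everything here is an assumption made for illustration: `LEXICON` and `MRCONSO` are tiny hypothetical stand-ins for the real resources, the function names are invented, and greedy longest-match is only one plausible way to identify multi-word terms. The slide does not say how UMLS::SenseRelate does it.

```python
# Toy stand-ins for the two resources (hypothetical, tiny subsets for illustration).
LEXICON = {"buspirone", "tolerance", "morphine", "mice",
           "skin cancer", "skin grafting", "skin disease"}
MRCONSO = {
    "buspirone": ["C0006462"],
    "tolerance": ["C0013220", "C0020963"],  # an ambiguous string maps to several CUIs
    "morphine": ["C0026549"],
    "mice": ["C0026809"],
    "skin cancer": ["C0007114"],
    "skin grafting": ["C0037297"],
    "skin disease": ["C0037274"],
}


def identify_terms(text, lexicon, max_words=3):
    """Greedy longest-match: prefer the longest lexicon entry starting at each token."""
    tokens = text.lower().split()
    terms, i = [], 0
    while i < len(tokens):
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in lexicon:
                terms.append(candidate)
                i += n
                break
        else:
            i += 1  # no lexicon entry starts here; skip this token
    return terms


def map_to_cuis(terms, mrconso):
    """Exact string match of each term against the term-to-CUI dictionary."""
    return {term: mrconso.get(term, []) for term in terms}


sentence = "Buspirone attenuates tolerance to morphine in mice with skin cancer"
terms = identify_terms(sentence, LEXICON)
cuis = map_to_cuis(terms, MRCONSO)
print(cuis)
```

Words with no lexicon entry, such as "attenuates", are skipped and never reach the dictionary lookup.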
SEMANTIC SIMILARITY MEASURES
Path-based measures: Path; Wu and Palmer; Leacock and Chodorow; Nguyen and Al-Mubaid
Information content (IC)-based measures: Resnik; Lin; Jiang and Conrath
PATH-BASED SIMILARITY MEASURES
Use only the path information obtained from a taxonomy
Path measure: sim(c1,c2) = 1 / minpath(c1,c2)
where minpath is the length of the shortest path between the two concepts
Leacock and Chodorow, 1998: sim(c1,c2) = -log( minpath(c1,c2) / (2D) )
where D is the total depth of the taxonomy
Wu and Palmer, 1994: sim(c1,c2) = (2 * depth(LCS(c1,c2))) / (depth(c1) + depth(c2))
where LCS is the least common subsumer of the two concepts
Nguyen and Al-Mubaid, 2006: sim(c1,c2) = log( 2 + (minpath(c1,c2) - 1) * (D - depth(LCS(c1,c2))) )
PATH-BASED SIMILARITY MEASURES
Use only the path information obtained from a taxonomy
[Figure: taxonomy containing the following concepts]
Disease: C0012634
Drug Related Disorder: C0277579
Drug Tolerance: C0013220
Neoplasm: C1302761
Neoplastic Disease: C1882062
Malignant Neoplasm: C0006826
Skin cancer: C0007114
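To make the path-based formulas concrete, here is a minimal Python sketch over a toy taxonomy. The concept names come from the figure, but the parent/child links in `PARENT` are assumed for illustration, as are the conventions that path length is counted in nodes (so a concept is at distance 1 from itself) and that the root has depth 1. Only the path, Leacock and Chodorow, and Wu and Palmer measures are shown.

```python
import math

# Hypothetical child -> parent links; the hierarchy is an assumption.
PARENT = {
    "Disease": None,
    "Drug Related Disorder": "Disease",
    "Drug Tolerance": "Drug Related Disorder",
    "Neoplasm": "Disease",
    "Neoplastic Disease": "Neoplasm",
    "Malignant Neoplasm": "Neoplastic Disease",
    "Skin cancer": "Malignant Neoplasm",
}


def ancestors(concept):
    """Return [concept, parent, ..., root]."""
    chain = []
    while concept is not None:
        chain.append(concept)
        concept = PARENT[concept]
    return chain


def depth(concept):
    return len(ancestors(concept))  # the root has depth 1


def lcs(c1, c2):
    """Least common subsumer: the deepest concept that subsumes both."""
    seen = set(ancestors(c1))
    for candidate in ancestors(c2):  # walks upward, so the first hit is the deepest
        if candidate in seen:
            return candidate
    raise ValueError("concepts share no ancestor")


def minpath(c1, c2):
    """Number of nodes on the path between c1 and c2 (tree taxonomy assumed)."""
    d = depth(lcs(c1, c2))
    return (depth(c1) - d) + (depth(c2) - d) + 1


MAX_DEPTH = max(depth(c) for c in PARENT)  # D in the formulas


def path_sim(c1, c2):
    return 1 / minpath(c1, c2)


def lch_sim(c1, c2):
    return -math.log(minpath(c1, c2) / (2 * MAX_DEPTH))


def wup_sim(c1, c2):
    return 2 * depth(lcs(c1, c2)) / (depth(c1) + depth(c2))


a, b = "Drug Tolerance", "Skin cancer"
print(minpath(a, b), path_sim(a, b), lch_sim(a, b), wup_sim(a, b))
```

In a real taxonomy a concept can have several parents, so the shortest path and the LCS would need a graph search rather than a single walk to the root.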
INFORMATION CONTENT-BASED MEASURES
Incorporate the probability of the concepts
IC(concept) = -log(P(concept))
P(concept) is calculated by summing the probability of the concept and the probabilities of its descendants
Probabilities are obtained from an external corpus
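The propagation of probability from descendants can be sketched as follows. This is an illustration under stated assumptions: the taxonomy links and the corpus counts in `COUNTS` are hypothetical numbers chosen to sum to 100, the function names are invented, and the taxonomy is treated as a tree so that no descendant is counted twice.

```python
import math

# Hypothetical parent -> children links and made-up corpus frequencies.
CHILDREN = {
    "Disease": ["Drug Related Disorder", "Neoplasm"],
    "Drug Related Disorder": ["Drug Tolerance"],
    "Drug Tolerance": [],
    "Neoplasm": ["Neoplastic Disease"],
    "Neoplastic Disease": ["Malignant Neoplasm"],
    "Malignant Neoplasm": ["Skin cancer"],
    "Skin cancer": [],
}
COUNTS = {
    "Disease": 10,
    "Drug Related Disorder": 5,
    "Drug Tolerance": 15,
    "Neoplasm": 20,
    "Neoplastic Disease": 10,
    "Malignant Neoplasm": 20,
    "Skin cancer": 20,
}
TOTAL = sum(COUNTS.values())


def subtree_count(concept):
    """Frequency of the concept plus the frequencies of all its descendants."""
    return COUNTS[concept] + sum(subtree_count(child) for child in CHILDREN[concept])


def probability(concept):
    return subtree_count(concept) / TOTAL


def information_content(concept):
    return -math.log(probability(concept))


for concept in CHILDREN:
    print(concept, round(probability(concept), 2), round(information_content(concept), 3))
```

Because counts accumulate upward, the root has probability 1 and information content 0, and more specific concepts have higher information content.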
INFORMATION CONTENT-BASED MEASURES
Incorporate the probability of the concepts
IC(concept) = -log(P(concept))
Resnik, 1995: sim(c1,c2) = IC(LCS(c1,c2))
Jiang and Conrath, 1997: sim(c1,c2) = 1 / (IC(c1) + IC(c2) - 2 * IC(LCS(c1,c2)))
Lin, 1998: sim(c1,c2) = (2 * IC(LCS(c1,c2))) / (IC(c1) + IC(c2))
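Given the information content of the two concepts and of their LCS, the three measures reduce to simple arithmetic. The sketch below takes those IC values as plain numbers. The function names, the example values, and the handling of zero denominators are assumptions made for illustration, not behavior taken from UMLS::Similarity.

```python
def resnik(ic_lcs):
    """Resnik: the information content of the least common subsumer."""
    return ic_lcs


def jiang_conrath(ic_c1, ic_c2, ic_lcs):
    """Jiang and Conrath: inverse of the IC distance between the concepts."""
    distance = ic_c1 + ic_c2 - 2 * ic_lcs
    # Assumption: treat a zero distance (identical concepts) as infinite similarity.
    return 1 / distance if distance > 0 else float("inf")


def lin(ic_c1, ic_c2, ic_lcs):
    """Lin: shared information relative to the total information of both concepts."""
    total = ic_c1 + ic_c2
    # Assumption: return 0 when both concepts carry no information.
    return 2 * ic_lcs / total if total > 0 else 0.0


# Hypothetical IC values for two concepts and their LCS.
ic_c1, ic_c2, ic_lcs = 3.0, 5.0, 2.0
print(resnik(ic_lcs), jiang_conrath(ic_c1, ic_c2, ic_lcs), lin(ic_c1, ic_c2, ic_lcs))
```

Resnik depends only on the LCS, so every pair of concepts under the same LCS gets the same score. Jiang and Conrath and Lin also use the IC of the concepts themselves.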
IC-BASED SIMILARITY MEASURES
[Figure: taxonomy containing Disease: C0012634; Drug Related Disorder: C0277579; Drug Tolerance: C0013220; Neoplasm: C1302761; Neoplastic Disease: C1882062; Malignant Neoplasm: C0006826; Skin cancer: C0007114]
Path information + probability of concepts (from an external corpus)
EXPERIMENTAL FRAMEWORK
Use the open-source UMLS::Similarity package to obtain the similarity between the surrounding terms and the possible senses in the SenseRelate algorithm
Path information: parent/child relations in the MSH source
Information content: calculated using the UMLSonMedline dataset created by NLM
  Consists of concepts from the 2009AB UMLS and the frequency with which they occurred in Medline, obtained using the Essie search engine (Ide et al. 2007)
  Medline: database of citations of biomedical/clinical articles
EVALUATION DATA: MSH-WSD
MSH-WSD dataset (Jimeno-Yepes et al. 2011)
203 target words (ambiguous words) from Medline:
  106 terms, e.g. tolerance
  88 acronyms, e.g. CA (calcium, California)
  9 mixtures, e.g. bat (brown adipose tissue)
Each target word has ~187 instances (Medline abstracts); each abstract is ~500 words
Each instance of a target word was assigned a concept from MSH by exploiting the MSH concepts manually assigned to the abstract
Average of 2.08 possible senses per target word
Majority-sense accuracy over all the target words is 54.5%
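The majority-sense figure serves as a baseline: always predict the most frequent sense of a target word. A minimal sketch of that computation for one target word is shown below. The gold labels are hypothetical, the function name is invented, and the slide does not say whether the reported 54.5% is averaged per target word or over all instances.

```python
from collections import Counter


def majority_sense_accuracy(gold_senses):
    """Accuracy obtained by always predicting the most frequent gold sense."""
    counts = Counter(gold_senses)
    _, majority_count = counts.most_common(1)[0]
    return majority_count / len(gold_senses)


# Hypothetical gold annotations for ten instances of one target word.
gold = ["C0013220"] * 6 + ["C0020963"] * 4
print(majority_sense_accuracy(gold))
```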
RESULTS
[Figure: bar chart of disambiguation accuracy by similarity measure; values listed in chart order]
Baseline: 0.55
Path-based: path 0.72, lch 0.69, wup 0.70, nam 0.72
IC-based: res 0.73, jcn 0.74, lin 0.74
COMPARISON ACROSS SUBSETS OF MSH-WSD
[Figure: bar chart of accuracy by subset; values listed in chart order]
Method       Terms  Acronyms  Mixture  Overall
Baseline     0.55   0.54      0.53     0.55
SenseRelate  0.67   0.80      0.73     0.74
MRD          0.71   0.87      0.88     0.80
2-MRD        0.67   0.85      0.93     0.78
WINDOW SIZES
Use the terms surrounding the target word within a specified window: 1, 2, 5, 10, 25, 50, 60, 70
Buspirone attenuates tolerance to morphine in mice with skin_cancer
WINDOW SIZE = 2
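Selecting the context for a given window size can be sketched as below. The sketch assumes that a window of size N means up to N tokens on each side of the target word, which the slide does not state explicitly. The function name `context_window` is hypothetical.

```python
def context_window(tokens, target_index, window):
    """Return up to `window` tokens on each side of the target, excluding the target."""
    left = tokens[max(0, target_index - window):target_index]
    right = tokens[target_index + 1:target_index + 1 + window]
    return left + right


tokens = "Buspirone attenuates tolerance to morphine in mice with skin_cancer".split()
target = tokens.index("tolerance")
print(context_window(tokens, target, 2))
```

Under this assumption a window of 2 yields "Buspirone", "attenuates", "to" and "morphine". Only the tokens that map to a UMLS concept would then contribute to the score.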
COMPARISON OF WINDOW SIZES FOR LIN
[Figure: accuracy of the lin measure by window size; values listed in chart order]
Window size:     0     1     2     5     10    25    50    60    70
Accuracy (lin):  0.50  0.53  0.65  0.69  0.71  0.74  0.74  0.74  0.74
SURROUNDING TERMS
Not all terms have a concept in the UMLS; therefore, not all surrounding terms in the window are mapped to CUIs
WINDOW SIZES VERSUS MAPPED TERMS
[Figure: number of mappings by window size for the lin measure; values listed in chart order]
Window size:         0     1     2     5     10    25    50     60     70
Number of mappings:  0     0.27  0.79  1.85  3.49  7.60  12.96  14.28  15.64
FUTURE WORK: MAPPING TERMS
Currently looking at mapping the terms to CUIs using information from the concept mapping system MetaMap
Obtain the terms from MetaMap and do a dictionary look-up in MRCONSO
  Hypothesis: the terms obtained by MetaMap are more accurate than those obtained using the SPECIALIST Lexicon
Obtain the CUIs from MetaMap
  Hypothesis: the CUIs obtained by MetaMap will be more accurate than those from the dictionary look-up
OBJECTIVE #1
Develop and evaluate a method that can disambiguate terms in biomedical text by exploiting similarity information extrapolated from the UMLS
UMLS::SenseRelate obtains a statistically significantly higher disambiguation accuracy than the baseline
It is on par with previous unsupervised methods for terms
OBJECTIVE #2
Evaluate the efficacy of IC-based similarity measures over path-based measures on a secondary task
There is no statistically significant difference between the accuracies obtained by the IC-based measures
There is a statistically significant difference between the IC-based measures and the path-based measures
TAKE HOME MESSAGE:
An ambiguous word is often used in the sense that is most similar to the sense of the concepts of the terms that surround it
RESOURCES
Software:
UMLS::SenseRelate: http://search.cpan.org/dist/UMLS-SenseRelate/
UMLS::Similarity: http://search.cpan.org/dist/UMLS-Similarity/
Data:
MSH-WSD: http://wsd.nlm.nih.gov/collaboration.shtml
THANK YOU
QUESTIONS?