Post on 12-Feb-2022
Evaluation of the Restrict to MeSH algorithm
Student: Michael Bales
Mentor: Olivier Bodenreider
NLM Summer Research Rotation
Background
  MEDLINE
  MeSH
  NLM Indexing Initiative
  UMLS Metathesaurus
    Collection of terminological systems
    1.2 million concepts, each assigned a CUI
[Diagram labels: Noun phrase, UMLS, MeSH descriptor]
UMLS CUI
  C0202022: Primary malignant neoplasm of left lower lobe of lung
MeSH main headings
  D001984: Bronchial neoplasms
  D008175: Lung neoplasms
Restrict to MeSH: Mapping CUIs to MeSH entities
Medical text
Restrict to MeSH
  Based on the principle of semantic locality
  Four techniques
    Use synonymy
    Use associated expressions (ATXs)
    Explore the ancestors
    Explore the other related concepts
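The four techniques can be pictured as a chain of fallbacks. The sketch below assumes they are tried in the order listed and that the first technique to return any MeSH heading wins; the slides do not state this, so treat the ordering as an assumption. The function names, the labels, and the placeholder key `GIANT_CELL_SARCOMA` are hypothetical, and the lookup tables are toy data built from the examples on the slides.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A technique takes a concept identifier and returns MeSH headings (possibly none).
Technique = Callable[[str], List[str]]


def restrict_to_mesh(
    cui: str, techniques: List[Tuple[str, Technique]]
) -> Tuple[Optional[str], List[str]]:
    """Try each technique in order and return the first non-empty result.

    Assumption: the techniques act as successive fallbacks.
    """
    for label, technique in techniques:
        headings = technique(cui)
        if headings:
            return label, headings
    return None, []  # no mapping to MeSH was found


def table_lookup(table: Dict[str, List[str]]) -> Technique:
    """Wrap a toy lookup table as a technique (hypothetical helper)."""
    return lambda cui: table.get(cui, [])


# Toy tables; only the aldosterone identifiers come from the slides.
TECHNIQUES = [
    ("synonymy", table_lookup({"C0002006": ["D000450"]})),
    ("associated expression", table_lookup({})),
    ("ancestors", table_lookup({"GIANT_CELL_SARCOMA": ["Sarcoma"]})),
    ("other related concepts", table_lookup({})),
]

print(restrict_to_mesh("C0002006", TECHNIQUES))
print(restrict_to_mesh("GIANT_CELL_SARCOMA", TECHNIQUES))
print(restrict_to_mesh("UNKNOWN", TECHNIQUES))
```

Returning the label together with the headings keeps track of which technique produced each mapping, which is what an evaluation by mapping method needs.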
Use synonymy
  Term mapped to source concept
  For this concept, is there a synonym term that comes from MeSH?
  Example:
    UMLS CUI C0002006: Aldosterone
    MeSH main heading D000450: Aldosterone
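A minimal sketch of the synonymy check: collect the terms attached to a concept and keep those contributed by MeSH. The source abbreviation `MSH` is the one UMLS release files use for MeSH; the `Term` record, the function name, and the second row (`OTHER_VOCAB`, `X-0001`) are made up for illustration.

```python
from typing import List, NamedTuple


class Term(NamedTuple):
    cui: str     # UMLS concept identifier
    source: str  # abbreviation of the contributing vocabulary
    code: str    # identifier of the term within that vocabulary
    name: str


# Toy term table; only the first row reflects the slide's example.
TERMS = [
    Term("C0002006", "MSH", "D000450", "Aldosterone"),
    Term("C0002006", "OTHER_VOCAB", "X-0001", "Aldosterone"),  # hypothetical row
]


def mesh_synonyms(cui: str, terms: List[Term], mesh_source: str = "MSH") -> List[str]:
    """Return the MeSH codes of all terms of this concept that come from MeSH."""
    codes: List[str] = []
    for term in terms:
        if term.cui == cui and term.source == mesh_source and term.code not in codes:
            codes.append(term.code)
    return codes


print(mesh_synonyms("C0002006", TERMS))  # concept with a MeSH synonym
print(mesh_synonyms("C0202022", TERMS))  # no MeSH term in this toy table
```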
Use associated expressions
  Is there an associated expression (ATX) that describes this concept using a combination of MeSH main headings?
  ATXs correspond to ICD terms
  Example:
    Term: Endoscopic removal of intraluminal foreign body from oesophagus without incision
    ATX: Foreign Bodies AND Esophagus/surgery (main heading/subheading, MH/SH)
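To make the structure of an associated expression concrete, the sketch below splits one into its MeSH main headings and optional subheadings. The plain-string format (`AND` between operands, `/` between main heading and subheading) is a simplification chosen for this example and is not claimed to match the format of the UMLS release files; `parse_atx` is a hypothetical helper.

```python
from typing import List, Optional, Tuple

# (main heading, subheading or None)
Operand = Tuple[str, Optional[str]]


def parse_atx(expression: str) -> List[Operand]:
    """Split a simplified ATX string into (main heading, subheading) pairs."""
    operands: List[Operand] = []
    for part in expression.split(" AND "):
        heading, _, subheading = part.strip().partition("/")
        operands.append((heading.strip(), subheading.strip() or None))
    return operands


atx = "Foreign Bodies AND Esophagus/surgery"
for main_heading, subheading in parse_atx(atx):
    print(main_heading, subheading)
```

Each operand then becomes one suggested MeSH entity for the concept that the ATX describes.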
Use ancestors
  Build the graph of the ancestors of the concept
    using parents and broader concepts
    all the way to the top
    exclude ancestors with incompatible semantic type
  From the graph, select the concepts that come from MeSH
    Remove those that are ancestors of another concept coming from MeSH
  Also try children or siblings as seed
  Example:
    UMLS CUI: Giant cell sarcoma
    MeSH main heading: Sarcoma
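The ancestor technique can be sketched as a graph walk followed by a pruning step. This is an illustration under stated assumptions, not the RtM implementation: the toy hierarchy below is invented (only the pair Giant cell sarcoma and Sarcoma comes from the slide), concept names stand in for CUIs, and the semantic-type exclusion and the child or sibling seeding are left out for brevity.

```python
from typing import Dict, List, Set

# Toy "parent or broader concept" links (hypothetical hierarchy).
PARENTS: Dict[str, List[str]] = {
    "Giant cell sarcoma": ["Sarcoma"],
    "Sarcoma": ["Neoplasms"],
    "Neoplasms": ["Diseases"],
}
# Toy set of concepts that have a term from MeSH.
IN_MESH: Set[str] = {"Sarcoma", "Neoplasms", "Diseases"}


def ancestors(concept: str, parents: Dict[str, List[str]]) -> Set[str]:
    """Collect every ancestor reachable through parent links, up to the top."""
    seen: Set[str] = set()
    stack = list(parents.get(concept, []))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, []))
    return seen


def closest_mesh_ancestors(
    concept: str, parents: Dict[str, List[str]], in_mesh: Set[str]
) -> List[str]:
    """Keep MeSH ancestors that are not themselves ancestors of another MeSH ancestor."""
    candidates = ancestors(concept, parents) & in_mesh
    redundant: Set[str] = set()
    for candidate in candidates:
        redundant |= ancestors(candidate, parents) & candidates
    return sorted(candidates - redundant)


print(closest_mesh_ancestors("Giant cell sarcoma", PARENTS, IN_MESH))
```

The pruning step is what keeps the mapping as specific as the hierarchy allows: Neoplasms and Diseases are dropped because they are ancestors of Sarcoma.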
Use other related concepts
  Explore the other related concepts
  Exclude incompatible semantic types
  From those, select the concepts that come from MeSH
  Example:
    UMLS CUI: Nicotinic Acid 0.15 MG / Riboflavin 0.02 MG / Thiamine 0.06 MG Oral Tablet
    MeSH main headings: Niacin, Riboflavin, Thiamine, Tablets
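A sketch of the last technique using the slide's example. The slides do not say how semantic compatibility is decided; here it is assumed, for illustration only, that two concepts are compatible when they fall in the same semantic group. The relation table, the group assignments, and the two "Hypothetical" concepts are toy data.

```python
from typing import Dict, List, Set

DRUG = "Nicotinic Acid 0.15 MG / Riboflavin 0.02 MG / Thiamine 0.06 MG Oral Tablet"

# Toy "other related concept" links (names stand in for CUIs).
RELATED: Dict[str, List[str]] = {
    DRUG: [
        "Niacin",
        "Riboflavin",
        "Thiamine",
        "Tablets",
        "Hypothetical dosing procedure",  # incompatible group in this toy data
        "Hypothetical ingredient",        # compatible, but not in MeSH
    ],
}
# Toy semantic group per concept (assumed, not taken from the slides).
SEMANTIC_GROUP: Dict[str, str] = {
    DRUG: "Chemicals & Drugs",
    "Niacin": "Chemicals & Drugs",
    "Riboflavin": "Chemicals & Drugs",
    "Thiamine": "Chemicals & Drugs",
    "Tablets": "Chemicals & Drugs",
    "Hypothetical dosing procedure": "Procedures",
    "Hypothetical ingredient": "Chemicals & Drugs",
}
IN_MESH: Set[str] = {"Niacin", "Riboflavin", "Thiamine", "Tablets",
                     "Hypothetical dosing procedure"}


def map_via_related(concept: str) -> List[str]:
    """Keep related concepts that share the semantic group and come from MeSH."""
    group = SEMANTIC_GROUP[concept]
    return [
        other
        for other in RELATED.get(concept, [])
        if SEMANTIC_GROUP.get(other) == group and other in IN_MESH
    ]


print(map_via_related(DRUG))
```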
Methods
  Quantitative evaluation (all CUIs)
    From three perspectives:
      CUIs
      MeSH main headings
      Mapping method
    Data used:
      UMLS Metathesaurus 2006AA
      RtM-suggested mappings
  Qualitative evaluation
    Assess performance on individual mappings
    Random sample of 50 CUIs
    Answer a set of questions for each CUI and mapping
    Detailed output of RtM
Quantitative evaluation results
From the perspective of CUIs
  84.5% of UMLS CUIs were mapped to at least one MeSH entity
  [Figure: Percent of CUIs assigned at least one mapping to MeSH (Mapped vs. Not mapped)]
[Figure: Mappings by Restrict to MeSH method, by semantic group. Y-axis: 0% to 100%. X-axis (semantic groups): All, Chemicals & Drugs, Anatomy, Living Beings, Procedures, Occupations, Physiology, Geographic Areas, Phenomena, Activities & Behaviors, Devices, Disorders, Objects, Genes & Molecular Sequences, Concepts & Ideas, Organizations]
[Figure: Method used for Restrict to MeSH method, by semantic group. Same axes as the previous figure. Legend: No MeSH term; O (Other related term); A (associated expression); I (synonymy); G/S (graph of ancestors, seeded by siblings); G/P (graph of ancestors, seeded by parents); G/C (graph of ancestors, seeded by children). Additional series: Relative count of CUIs in semantic group; Relative count of MeSH main headings in semantic group]
From the perspective of mapping method
[Figure: Maximum proportional tree depth category, by main heading. X-axis: Tree depth category, from 1 (near root) to 5 (near leaves). Y-axis: Percent main headings in category (0 to 50). Series: All of MeSH; RtM]
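The slides do not define "maximum proportional tree depth category". One plausible reading, offered only as an assumption, is sketched here: for each tree number of a main heading, divide its depth by the depth of the deepest tree number beneath it, bin that ratio into five equal categories, and keep the highest category over all of the heading's tree numbers. The tree numbers below (`X01...`) are invented, and both function names are hypothetical.

```python
from typing import List

# Invented tree numbers for a toy hierarchy.
ALL_TREE_NUMBERS = ["X01", "X01.100", "X01.100.200", "X01.100.200.300", "X01.500"]


def depth(tree_number: str) -> int:
    """Depth is the number of dot-separated components (a top node has depth 1)."""
    return tree_number.count(".") + 1


def depth_category(tree_number: str, all_numbers: List[str], bins: int = 5) -> int:
    """Bin depth / (deepest depth at or below this node) into categories 1..bins."""
    deepest = max(
        depth(other)
        for other in all_numbers
        if other == tree_number or other.startswith(tree_number + ".")
    )
    # Integer ceiling of bins * depth / deepest, which avoids float rounding.
    return (bins * depth(tree_number) + deepest - 1) // deepest


def max_depth_category(tree_numbers: List[str], all_numbers: List[str]) -> int:
    """A heading may sit in several places; keep its highest category."""
    return max(depth_category(t, all_numbers) for t in tree_numbers)


print(depth_category("X01", ALL_TREE_NUMBERS))
print(depth_category("X01.100", ALL_TREE_NUMBERS))
print(max_depth_category(["X01", "X01.500"], ALL_TREE_NUMBERS))
```

Under this reading a leaf always lands in category 5 and a top-level node with a deep subtree lands near category 1, which matches the "near root" and "near leaves" labels on the axis.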
From the perspective of MeSH main headings
[Figure: MeSH main headings suggested by Restrict to MeSH method, by tree (UMLS Metathesaurus 2006AA). Y-axis: Total MeSH main headings in tree (0 to 9000). X-axis (MeSH tree): A (Anatomy); B (Organisms); C (Diseases); D (Chemicals and Drugs); E (Analytical, Diagnostic, and Therapeutic Techniques and Equipment); F (Psychiatry and Psychology); G (Biological Sciences); H (Physical Sciences); I (Anthropology, Education, Sociology and Social Phenomena); J (Technology and Food and Beverages); K (Humanities); L (Information Science); M (Persons); N (Health Care); V (Publication Characteristics); Z (Geographic Locations). Series: Main headings suggested by RtM at least twice; Main headings suggested by RtM only once (for MeSH-supplied CUI)]
Qualitative evaluation
Assess the quality of mappings on an individual level
  What mapping method is used?
  If there is no mapping to MeSH, why not?
  Is the mapping in the same semantic neighborhood as the CUI?
    If not, how did this lateral semantic drift occur?
  Is the mapping at the same level of specification as the CUI?
    If not, how did this semantic drift occur?
  If the graph of ancestors was used, how many tree levels were climbed before selecting the suggested mapping?
  For CUIs that mapped to several MeSH entities, what proportion were appropriate mappings (none, some, or all)?
Mapping methods in sample were similar to those used for all CUIs
  [Figure: Methods used in random sample. Legend: A, Not mapped, O, G/P, G/C, G/S]
  [Figure: Mapping methods used for all of MeSH. Legend: A, Not mapped, O, G/P, G/C, G/S]
Qualitative evaluation results
  13 of 50 CUIs were not mapped to MeSH
    Orphans (n=6)
    Mapping via ancestors not possible (n=6)
    Crossing semantic boundary (n=1)
  37 CUIs were mapped
    33 of 37 (89%) were in same semantic neighborhood as CUI
    All 37 were more general than the CUI
      Amiodarone overdose -> Overdose
      Giant cell sarcoma -> Sarcoma
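The counts on this slide can be cross-checked with a few lines of arithmetic. The numbers are taken from the slide; the variable names are only for illustration.

```python
# Counts reported on the slide for the random sample of 50 CUIs.
sample_size = 50
not_mapped_reasons = {
    "Orphans": 6,
    "Mapping via ancestors not possible": 6,
    "Crossing semantic boundary": 1,
}
same_neighborhood = 33

not_mapped = sum(not_mapped_reasons.values())
mapped = sample_size - not_mapped
percent_same_neighborhood = round(100 * same_neighborhood / mapped)

print(not_mapped, mapped, percent_same_neighborhood)
```

The three reasons add up to the 13 unmapped CUIs, leaving 37 mapped, and 33 of 37 rounds to the 89% shown.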
Conclusion
  RtM already achieves good performance
  Minor enhancements will improve method
  Rapid growth in biomedical literature
  Effective manual and automated indexing methods increasingly needed