Exploiting Semantic Structure for Mapping Clinician-specified Form Terms to SNOMED CT Concepts
Ritu Khare1,3, Yuan An3, Jiexun Li3, Il‐Yeol Song3, Xiaohua Hu3, Michele Follen1,2
College of Medicine, Center for Women’s Health Research1, and Obstetrics and Gynecology2; College of Information Science and Technology3
Motivation, Problem, and Challenges
The elements of clinical databases are usually named after the clinical terms used in various design artifacts. These terms are instinctively supplied by the users, and hence, different users often use different terms to describe the same clinical concept. This term diversity makes future database integration and analysis a huge challenge.
Structure‐based SNOMED‐CT Mapping Framework
[Fig. 3 diagram labels: Form Term, SNOMED CT Concept, Semantic Structure Analyzer, Structure‐based Classification Model, Training Data, Semantic Category Picker (configurable), SNOMED CT Category Specific Mapping (API), Semantic Form Tree, Semantic Information Extraction, Form, Terms (in Clinical Forms), SNOMED CT Concepts, Mapping/Standardization]
Fig. 3. Overall Mapping Framework: (1) The form tree structure is analyzed to derive the form context, (2) The classification model (Naïve Bayes) ranks the SNOMED CT semantic categories suitable for the form context, (3) A category is picked, (4) The most linguistically matching concept in this category is selected as the winner concept.
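The four steps in Fig. 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the poster's implementation: the context features (category tags of a term's parent and sibling), the training examples, the concept lists, and all function names are hypothetical, and word-overlap scoring stands in for whatever linguistic matching the authors used. Only the use of Naïve Bayes to rank semantic categories comes from the caption.

```python
import math
from collections import Counter, defaultdict

# Hypothetical training data: (context features, semantic category).
# The features stand in for the output of a structure analyzer (step 1).
TRAINING = [
    (["parent=Procedure", "sibling=Observable Entity"], "Observable Entity"),
    (["parent=Procedure", "sibling=Finding"], "Observable Entity"),
    (["parent=Observable Entity", "sibling=Finding"], "Finding"),
    (["parent=Observable Entity", "sibling=Qualifier Value"], "Finding"),
    (["parent=Finding", "sibling=Finding"], "Body Structure"),
]

# Illustrative concept names per category; not verified SNOMED CT entries.
# A real system would query SNOMED CT restricted to the chosen category.
CONCEPTS = {
    "Observable Entity": ["Respiratory rate", "Body temperature"],
    "Finding": ["Respiratory distress", "Sunsetting eyes"],
    "Body Structure": ["Respiratory system structure", "Both eyes, entire"],
}

def train_naive_bayes(examples):
    """Count class and feature frequencies for a multinomial Naive Bayes model."""
    class_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for features, category in examples:
        class_counts[category] += 1
        for feature in features:
            feature_counts[category][feature] += 1
            vocabulary.add(feature)
    return class_counts, feature_counts, vocabulary

def rank_categories(model, features):
    """Step 2: rank categories by log-posterior with add-one smoothing."""
    class_counts, feature_counts, vocabulary = model
    total = sum(class_counts.values())
    scores = {}
    for category, count in class_counts.items():
        score = math.log(count / total)
        denom = sum(feature_counts[category].values()) + len(vocabulary)
        for feature in features:
            score += math.log((feature_counts[category][feature] + 1) / denom)
        scores[category] = score
    return sorted(scores, key=scores.get, reverse=True)

def token_overlap(a, b):
    """Jaccard overlap of lower-cased word sets (stand-in for linguistic matching)."""
    ta = set(a.lower().replace(",", " ").split())
    tb = set(b.lower().replace(",", " ").split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def map_term(term, context_features, model, concepts):
    """Steps 2 to 4: rank categories, pick the top one, match within it."""
    ranked = rank_categories(model, context_features)
    winner = ranked[0]  # step 3: the simplest possible picker
    best = max(concepts[winner], key=lambda c: token_overlap(term, c))  # step 4
    return winner, best

model = train_naive_bayes(TRAINING)
# The same term lands in different categories depending on its context.
print(map_term("Respiratory", ["parent=Procedure", "sibling=Observable Entity"], model, CONCEPTS))
# ('Observable Entity', 'Respiratory rate')
print(map_term("Respiratory", ["parent=Observable Entity", "sibling=Finding"], model, CONCEPTS))
# ('Finding', 'Respiratory distress')
```

The two calls show the context challenge in miniature: the term is identical, and only the structural features change the winning category and therefore the mapped concept.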
Fig 1. A Sample Clinician Designed Form
[Form shown: Patient History Form. PATIENT: Name, Gender (M/F), DOB, MRN. HISTORY: Chief Complaints. Review of Systems: Complaints, Eyes, ENMT, Respiratory.]
Diversity Challenge (Well Addressed)
Different clinicians specify different form terms to specify the same clinical concept, e.g., MRN or Med.Rec.#; Vital Signs, Constitutional, or Physical status.
Context Challenge (Less Explored)
The same form term, when used in different contexts, may map to different SNOMED CT concepts, e.g., the term Respiratory in Fig. 1 and 2.
How can we leverage the semantic structure of clinical forms to map the form terms into standard SNOMED CT concepts?
Key Ideas
Exploit the local semantic structure of the form tree to determine the term context and candidate SNOMED CT semantic categories.
Select a winner semantic category, and map the term to the linguistically matching concept within the determined semantic category.
Preliminaries: SNOMED CT and Semantic Form Trees
The Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) is a widely used medical terminology. It comprises 360,000 clinical CONCEPTS belonging to various SEMANTIC CATEGORIES. Each concept is represented using a CONCEPT ID and a FULLY SPECIFIED NAME. A simple search for the term Eyes across the UMLS SNOMED CT browser leads to the following top results:
Concept Id    Fully‐specified Name    Semantic Category
63342001      Sunsetting eyes         Finding
371110006     Immature eyes           Disorder
362508001     Both eyes, entire       Body Structure
Fig. 2. A clinical form and its equivalent Semantic Form Tree. Each node in the tree is tagged with SNOMED CT semantic categories.
[Form shown: Patient Examination Form. PATIENT: Name, Gender (M/F). EXAMINATION: T, Respiratory (symm. expan., nl perc.). Tree nodes: root, Patient, Examination, Name, Gender, M, F, T, Respiratory, Symmetric chest expansion, Normal Percussion. Category tags shown: Person, Procedure, Observable Entity, Finding, Qualifier Value.]
Empirical Study with Clinician‐designed Forms
About the Data
The data includes 26 forms collected from 5 healthcare institutions. The forms contain over 1500 terms, out of which 954 (63%) are mappable to SNOMED CT concepts.
About the Methods
BASELINE: Linguistic comparison
HYBRID: Linguistic as well as Structural (Contextual) comparison (See Fig. 3)
HYBRID++: Linguistic as well as advanced structural comparison
Results and Contributions
Fig 4. Mapping Results for 3 Methods
[Panels: Mapping Precision, Mapping Recall. Series: Baseline, Hybrid, Hybrid++. X-axis: Set1 to Set5. Bar labels range from 0.51 to 0.92 (precision) and from 0.31 to 0.74 (recall).]
Fig 5. Change in Results with the term processing, advanced linguistic technique
[Series: Precision, Recall, Precision with Term Processing, Recall with Term Processing. X-axis: Baseline, Hybrid, Hybrid++.]
Findings and Implications
Finding: Improvement due to structure (Fig 4; R = recall, P = precision). Hybrid over Baseline: 18% (P); 2% (R). Hybrid++ over Hybrid: 16% (P); 23% (R).
Implication: Structural knowledge has the ability to address the context challenge, and improve the overall mapping performance.
Finding: Improvement due to linguistics (Fig 5): 2‐3% (P), >30% (R).
Implication: Linguistic techniques can improve the recall and address the diversity challenge to a large extent.
Conclusion
It is desirable to develop hybrid approaches that can address both the challenges and lead to a superior performance.
Future Work
Leverage other relationships of SNOMED CT and test with other vocabularies from the UMLS.
Test within larger frameworks of health information systems.
Apply other classification techniques and employ sophisticated linguistic techniques.
Acknowledgements
National Cancer Institute (National Biomedical Imaging Branch): Grant #P01‐CA‐82710‐09
National Science Foundation Grants: NSF CCF 0905291, NSF CCF 1049864, and NSFC 90920005
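Fig 4 and Fig 5 report mapping precision and recall without spelling out the formulas. A minimal sketch, assuming the usual definitions (precision = correct mappings / terms the method mapped; recall = correct mappings / terms that are mappable), is given below. The function name, the toy term strings, and the gold assignments are hypothetical; only the three concept IDs are taken from the SNOMED CT search example in the Preliminaries.

```python
def precision_recall(predicted, gold):
    """Compute mapping precision and recall.

    predicted: term -> concept id, or None when the method produced no mapping.
    gold: term -> correct concept id, for mappable terms only.
    """
    attempted = {term: cid for term, cid in predicted.items() if cid is not None}
    correct = sum(1 for term, cid in attempted.items() if gold.get(term) == cid)
    precision = correct / len(attempted) if attempted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

# Hypothetical gold standard for four mappable form terms.
gold = {
    "sunsetting eyes": "63342001",
    "immature eyes": "371110006",
    "both eyes": "362508001",
    "eyes, entire": "362508001",
}
# Hypothetical method output: one wrong mapping, one term left unmapped.
predicted = {
    "sunsetting eyes": "63342001",
    "immature eyes": "63342001",
    "both eyes": "362508001",
    "eyes, entire": None,
}

p, r = precision_recall(predicted, gold)
print(round(p, 2), round(r, 2))  # 0.67 0.5
```

Under these definitions, a method that declines to map uncertain terms can keep precision high while recall stays low, which is the pattern the linguistic term processing in Fig 5 is reported to improve.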