Parser Models --- Baker 2
What Will We Be Learning Today?
• The Task
  • i2b2 Bake-Off
  • Concepts
• Parsing
  • Motivation for New Models
• Creating a Model
  • Text Normalization
  • POS Constraints
  • Phrase Constraints
• Bake-Off Results
  • SNoW
  • Reranker
  • Other
i2b2/VA Challenges in Natural Language Processing for Clinical Data
• Three-Part Shared Task
  • Concepts
  • Assertion
  • Relation
• Concept Extraction
  • Problem
  • Test
  • Treatment
Concept Examples
            Concept                          Not a Concept
Problem:    The man was obese.               The obese man was admitted.
Test:       Blood Pressure 130/80            The patient has high blood pressure.
Treatment:  The patient underwent surgery.   The patient arrived in the surgery suite.
What does a parse look like?
The man was obese .
[Figure: parse tree for the sentence above, with nodes S1, S, NP, VP, DET, NN, VBD, ADJP, JJ, and "."]
(S1 (S (NP (DET The) (NN man))(VP (VBD was) (ADJP (JJ obese))) (. .)))
Concept Examples
The man was obese .
[Figure: parse tree with nodes S1, S, NP, VP, DET, NN, VBD, ADJP, JJ, and "."; repeated on the following slide]
Concept Examples
The obese man was admitted .
[Figure: parse tree with nodes S1, S, NP, VP, DET, JJ, NN, AUX, VBD, and "."]
Concept Examples
Problem:
(S1 (S (NP (DET The) (NN man)) (VP (VBD was) (ADJP (JJ obese))) (. .)))
(S1 (S (NP (DET The) (JJ obese) (NN man)) (VP (AUX was) (VBD admitted)) (. .)))
Concept Examples
Test:
(S1 (FRAG (NP (NN Blood) (NN Pressure)) (QP (CD 130/80))))
(S1 (S (NP (DET The) (NN patient)) (VP (VB has) (NP (JJ high) (NN blood) (NN pressure))) (. .)))
Treatment:
(S1 (S (NP (DET The) (NN patient)) (VP (VBD underwent) (NP (NN surgery))) (. .)))
(S1 (S (NP (DET The) (NN patient)) (VP (VBD arrived) (PP (IN in) (NP (DET the) (NN surgery) (NN suite)))) (. .)))
• Sodium 139 , potassium 3.8 , chloride 101 , bicarb 26 , BUN 9 , creatinine 0.7 , glucose 141 , albumin 4.1 , calcium 8.9 , LDH 665 , AST 44 , ALT of 57 , amylase 41 , CK 32 .
• 1. Post endoscopic retrograde cholangiopancreatography pancreatitis .
• FLANK PAIN URI ?
• A/P : 48yo man with h/o HCV , bipolar DO , h/o suicide attempts , a/w overdose of Inderal , Klonopin , Geodon , s/T Jackson stay with intubation for airway protection , with question of L retrocardiac infiltrate , now doing well .
• Please note the patient is only Caucasian speaking and information is second hand .
• 16) Robituss in AC five to ten milliliters p.o. q.h.s. p.r.n. cough .
• Pt has h/o colon can to liver , s/p resxn with serosal implants in 9/03 .
• She received ASA , nitro SL then gtt , morphine , metoprolol , and heparin gtt .
• 5. Dulcolax 10 to 20 mg PR b.i.d. p.r.n. constipation .
• The pt is a 55yo F s / p Roux en Y GBP in 12/20 presenting to the ED this AM c / o mod severe midepigastric pain .
• Her electrocardiogram revealed normal sinus rhythm , left atrial enlargement, left axis deviation , poor R-wave progression in V1 through V4 , consistent with marked clockwise rotation , cannot rule out an old anteroseptal wall myocardial infarction .
The Problem
CT scan normal
(S1 (S (NP (NNP CT)) (VP (VB scan) (S (ADJP (JJ normal))))))
By-Hand Parses
• 57 sentences parsed by hand
• Necessary to understand structure of sentences
The Problem
• No VP: CT scan normal
• Lists: 1. Bactrim double strength
• Fragment construction: (S1 (FRAG (NP (NN Blood) (NN Pressure)) (QP (CD 130/80))))
…among others
How does the Charniak Parser work?
• Uses a trained model
• Models can be trained on different corpora
  • WSJ Penn Treebank corpus
• Defines probabilistic productions
  • Example: S 99%, fragment 1%
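To make "probabilistic productions" concrete, here is a minimal sketch of how production probabilities could be estimated by relative frequency from a set of bracketed trees. It is not the Charniak parser's actual training procedure, which is considerably more elaborate; the helper names (`parse_tree`, `productions`, `production_probabilities`) and the four-tree toy corpus are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def parse_tree(text):
    """Parse a bracketed tree string into nested (label, children) tuples."""
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        pos += 1                      # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(read())
            else:
                children.append(tokens[pos])   # a word (leaf)
                pos += 1
        pos += 1                      # consume ")"
        return (label, children)

    return read()


def productions(tree):
    """Yield (parent, child-label tuple) for every non-lexical node."""
    label, children = tree
    kids = [c for c in children if isinstance(c, tuple)]
    if kids:
        yield label, tuple(k[0] for k in kids)
        for k in kids:
            yield from productions(k)


def production_probabilities(trees):
    """Relative-frequency estimate of P(children | parent)."""
    counts = defaultdict(Counter)
    for t in trees:
        for lhs, rhs in productions(parse_tree(t)):
            counts[lhs][rhs] += 1
    return {lhs: {rhs: n / sum(c.values()) for rhs, n in c.items()}
            for lhs, c in counts.items()}


# Toy corpus (illustrative only): two full sentences and two fragments.
toy_trees = [
    "(S1 (S (NP (DET The) (NN man)) (VP (VBD was) (ADJP (JJ obese))) (. .)))",
    "(S1 (S (NP (DET The) (NN patient)) (VP (VBD underwent) (NP (NN surgery))) (. .)))",
    "(S1 (FRAG (NP (NN Blood) (NN Pressure)) (QP (CD 130/80))))",
    "(S1 (FRAG (NP (NN CT) (NN scan)) (ADJP (JJ normal))))",
]
probs = production_probabilities(toy_trees)
print(probs["S1"])
```

On this toy corpus the root expands to S and to FRAG equally often, whereas a model trained on newswire would put almost all of that mass on S.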
The Problem
Problem                  % in WSJ*   % in hand-parsed i2b2
No VP                    2.8%        29.8%
Lists                    0.1%        8.8%
Fragment Construction    1.2%        33.3%
*WSJ corpus has 39,832 by-hand parses
The Problem
CT scan normal
Parser Output:
(S1 (S (NP (NNP CT)) (VP (VB scan) (S (ADJP (JJ normal))))))
Desired Parse:
(S1 (FRAG (NP (NN CT) (NN scan)) (ADJP (JJ normal))))
The Problem
[Figure: two parse trees for "CT scan normal". Parser output tree with nodes S1, S, NP, NNP, VP, VB, S, ADJP, JJ; desired tree with nodes S1, FRAG, NP, NN, NN, ADJP, JJ]
How are Desirable Parses Obtained?
• Text Normalization
• Part of Speech Constraints
• Phrase Constraints
Text Normalization
• Pt 's labs were checked
• Only minimal exertion such as " walking across the room "
• The patient is a **AGE[in 50s]- year - old female
• well until **DATE[Jan 2007]
• The MRI was performed here at **INSTITUTION
• she does have a Foley catheter in for I& ; O measurement
Text Normalization
If you experience fever &gt; 100.4 , return to the hospital .
&gt; = >
If you experience fever > 100.4 , return to the hospital .
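One plausible normalization step, assuming the raw records contain HTML-escaped characters and de-identification placeholders like the ones shown on the earlier Text Normalization slide, can be sketched as follows. The `normalize` function, the `PLACEHOLDER` pattern, and the choice to replace each placeholder with its bare tag name are all assumptions for illustration; the slides do not say how placeholders were actually handled.

```python
import html
import re

# Hypothetical pattern for de-identification tags such as
# **AGE[in 50s], **DATE[Jan 2007], or **INSTITUTION.
PLACEHOLDER = re.compile(r"\*\*([A-Z]+)(\[[^\]]*\])?")


def normalize(line):
    """Unescape HTML entities, simplify placeholders, collapse whitespace."""
    line = html.unescape(line)                          # "&gt;" becomes ">"
    line = PLACEHOLDER.sub(lambda m: m.group(1), line)  # assumed policy: keep tag name
    return re.sub(r"\s+", " ", line).strip()


print(normalize("If you experience fever &gt; 100.4 , return to the hospital ."))
print(normalize("well until **DATE[Jan 2007]"))
```

Keeping a single token in place of each placeholder gives the parser one word to tag rather than a run of brackets and asterisks; a different replacement token could work equally well.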
Text Normalization
            Raw Text   Normalized Text
F-Score*    46.2       46.7
*Note: F-Score is taken from the parser output compared against the by-hand parses of the i2b2 data
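The slides do not name the scoring tool, so the following is only a sketch of one common way to compare a parser output against a by-hand parse: labeled bracket precision, recall, and F-score over phrase spans. The function names are hypothetical, and this simplified version counts the S1 root and ignores part-of-speech tags, which may differ from the scorer actually used.

```python
from collections import Counter


def labeled_spans(bracketed):
    """Return (label, start, end) for every phrase node; POS tags are skipped."""
    tokens = bracketed.replace("(", " ( ").replace(")", " ) ").split()
    spans, stack, word_index, i = [], [], 0, 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            # [label, start word index, node directly dominates a word?]
            stack.append([tokens[i + 1], word_index, False])
            i += 2
        elif tok == ")":
            label, start, is_preterminal = stack.pop()
            if not is_preterminal:
                spans.append((label, start, word_index))
            i += 1
        else:
            stack[-1][2] = True       # a word: its parent is a POS tag
            word_index += 1
            i += 1
    return spans


def bracket_f_score(predicted, gold):
    """Labeled bracket precision, recall, and F-score for one sentence."""
    pred, ref = Counter(labeled_spans(predicted)), Counter(labeled_spans(gold))
    matched = sum((pred & ref).values())
    if matched == 0:
        return 0.0, 0.0, 0.0
    precision = matched / sum(pred.values())
    recall = matched / sum(ref.values())
    return precision, recall, 2 * precision * recall / (precision + recall)


parser_output = "(S1 (S (NP (NNP CT)) (VP (VB scan) (S (ADJP (JJ normal))))))"
desired_parse = "(S1 (FRAG (NP (NN CT) (NN scan)) (ADJP (JJ normal))))"
precision, recall, f = bracket_f_score(parser_output, desired_parse)
print(precision, recall, f)
```

For "CT scan normal", only the S1 and ADJP brackets agree, so two of the six predicted brackets and two of the four desired brackets match.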
Medical Acronyms/Abbreviations
Abbreviation   Meaning          Part of Speech
b.i.d.         Twice a Day      Adverb
d/c’d          Discharged       Verb
fh             Family History   Noun
po             Orally           Adverb
q2h            Every 2 Hours    Adverb
rt             Right            Adjective
VI             Six              Cardinal Number
y/o            year-old         Adjective
h/o            History Of       Preposition
Constraining with Parts of Speech
He was placed on Unasyn 3 grams qn.
qn = nightly = adverb
(S1 (XX He) (XX was) (XX placed) (XX on) (XX Unasyn) (XX 3) (XX grams) (RB qn) (XX .))
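A constraint string of this shape could be generated from a small abbreviation lexicon, roughly as sketched below. The sketch assumes that XX marks a token whose tag is left for the parser to decide; only the `qn` to `RB` mapping comes from the slide. The other Penn Treebank tags in `ABBREV_TAGS` and the `pos_constrained` helper are assumptions for illustration.

```python
# Abbreviation -> Penn Treebank tag. Only "qn" -> RB is shown on the slide;
# the remaining entries are assumed mappings from the abbreviation table.
ABBREV_TAGS = {
    "qn": "RB",      # nightly (adverb)
    "b.i.d.": "RB",  # twice a day (adverb), assumed tag
    "po": "RB",      # orally (adverb), assumed tag
    "rt": "JJ",      # right (adjective), assumed tag
    "h/o": "IN",     # history of (preposition), assumed tag
}


def pos_constrained(sentence):
    """Tag known abbreviations and leave every other token as XX."""
    parts = [f"({ABBREV_TAGS.get(tok.lower(), 'XX')} {tok})"
             for tok in sentence.split()]
    return "(S1 " + " ".join(parts) + ")"


constrained = pos_constrained("He was placed on Unasyn 3 grams qn .")
print(constrained)
```

The input is assumed to be tokenized already, with the final period separated from `qn`.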
Constraining with Parts of Speech
           Raw Text   Normalized Text   Normalized Text + POS Constraints*
F-Score    46.2       46.7              46.4
*Note: There were 5 failed parses for the POS Constraints whereas the Normalized Text had zero.
Constraining with Phrases
Patient has swollen painful L side face .
Concept = swollen painful L side face
(S1 (XX Patient) (XX has) (NP-problem (XX swollen) (XX painful) (XX L) (XX side) (XX face)) (XX .))
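Given a tokenized sentence and the token span of an annotated concept, a phrase constraint of this form can be assembled as in the following sketch. The `phrase_constrained` helper and its half-open `start`/`end` indexing are hypothetical; the slides do not show how the constraint strings were produced.

```python
def phrase_constrained(tokens, start, end, label):
    """Wrap tokens[start:end] in one labeled bracket; all tags stay open (XX)."""
    tagged = [f"(XX {tok})" for tok in tokens]
    phrase = f"({label} " + " ".join(tagged[start:end]) + ")"
    return "(S1 " + " ".join(tagged[:start] + [phrase] + tagged[end:]) + ")"


tokens = "Patient has swollen painful L side face .".split()
# The concept "swollen painful L side face" covers tokens 2 through 6.
constraint = phrase_constrained(tokens, 2, 7, "NP-problem")
print(constraint)
```

Only the boundary and label of the concept are fixed here; the internal structure of the phrase is still left to the parser.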
What Next?
Train Model!
• No true concepts on test day
• Treat the phrase-constrained parser output as truth
• Train model on that data
Concept Extraction: SNoW
(S1 (S (NP (DET The) (NN patient)) (VP (VBD underwent) (NP (NN surgery))) (. .)))
SNoW scores:
• The patient: .99 None, .01 Problem, .00 Test, .00 Treatment
• surgery: .01 None, .09 Problem, .51 Test, .49 Treatment
surgery = Test
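The labeling step on this slide appears to pick the highest-scoring class for each phrase. A minimal sketch of that decision rule, using the scores from the slide, is below; `best_label` is a hypothetical helper and is not part of SNoW itself.

```python
# Scores copied from the slide; the data layout is an assumption.
scores = {
    "The patient": {"None": 0.99, "Problem": 0.01, "Test": 0.00, "Treatment": 0.00},
    "surgery": {"None": 0.01, "Problem": 0.09, "Test": 0.51, "Treatment": 0.49},
}


def best_label(label_scores):
    """Return the class with the highest score."""
    return max(label_scores, key=label_scores.get)


predictions = {phrase: best_label(s) for phrase, s in scores.items()}
print(predictions)
```

For "surgery", Test wins over Treatment by only .02, so a small change in the scores would flip the prediction.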
Concept Extraction: SNoW
                      Recall   Precision   F-Score*
Concept Exact Span    3.1      16.8        5.2
Class Exact Span      1.1      5.8         1.8
*Note: These F-Scores are from our predicted concepts compared to the “gold” concepts.
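Assuming the F-Score column is the usual balanced F-score (the harmonic mean of recall and precision), the reported values can be checked with a few lines. The `f_score` helper is illustrative and is not taken from the challenge's scoring software.

```python
def f_score(recall, precision):
    """Balanced F-score: harmonic mean of recall and precision."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


concept_f = round(f_score(3.1, 16.8), 1)
class_f = round(f_score(1.1, 5.8), 1)
print(concept_f, class_f)
```

Because the harmonic mean is dominated by the smaller value, the low recall keeps both F-scores close to the recall figure despite the higher precision.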
Concept Extraction: Reranker
(S1 (S (NP (DET The) (NN patient)) (VP (VBD underwent) (NP (NN surgery))) (. .)))
Reranker ranking for "surgery":
1. Treatment
2. Test
3. Problem
surgery = Treatment
Concept Extraction: Reranker
                               Recall   Precision   F-Score
SNoW Concept Exact Span        3.1      16.8        5.2
Reranker Concept Exact Span    3.8      39.7        7.0
SNoW Class Exact Span          1.1      5.8         1.8
Reranker Class Exact Span      2.4      24.4        4.3
Other Results from i2b2
• Concept
  • Dependency Parse + External Medical Dictionary: F-Score = 53.8
• Relation
  • Used Dependency Parses
                    Recall   Precision   F-Score
Matching Concept    71.7     71.8        71.7
Concept + Dep       70.9     74.4        72.6
Correct Relation    64.0     64.1        64.1
Relation + Dep      64.0     67.2        65.6
Recap
• Domain mismatch is bad
• Constraining parser decreases domain mismatch
• Training new models decreases domain mismatch
Acknowledgments
• Kristy Hollingshead
• Brian Roark
• Richard Sproat
• Margit Bowler
• Aaron Cohen
• Jianji Yang
• Kyle Ambert
Thank You…
• Kristy Hollingshead
• Christian Monson
• Kevin Burger
• Isaac Wallis
• The Interns
• All OGI Faculty, Staff, and Students
Hierarchical Phrases
(S (NP (EX There)) (VP (VB is) (NP (NP-problem (NN akinesis)) (CC /) (NP-problem (NN dyskinesis))) (CC and) (NP-problem (NN thinning) (PP (IN of) (NP (DT the) (ADJP (JJ mid) (IN to) (JJ distal)) (JJ inferior) (NN septum))) (CC and) (NP (DT the) (NN apex)))))
There is akinesis / dyskinesis and thinning of the mid to distal inferior septum and the apex.