Workshop negations
Using SVMs with the Command Relation Feature to Identify Negated Events in Biomedical Literature
Farzaneh Sarafraz
Goran Nenadic
School of Computer Science
University of Manchester
2 / 27
Outline
• Motivation & aim
• Molecular events
• Data & experiments
• Methods
• Discussion
• Summary
3 / 27
Motivation & aim
• Biomedical literature
• 2000 papers published every day
• Biomedical information extraction needed
• Improve IE by negation information
• Negative results are interesting and reported
• “The IKK complex, but not p90 (rsk), is responsible for the in vivo phosphorylation of I-kappa-B-alpha.”
• Resources
• Shared tasks, data
• Linguistic tools (syntactic parsers)
4 / 27
Problem statement
• Given
• PubMed abstracts
• Protein/gene mentions annotated
• Molecular events annotated
• Wanted for every event
• Negated or not
• Classification problem
5 / 27
Molecular events
[Diagram: structure of an annotated event]
• An event has a trigger and participants
• Event type: {binding, transcription, regulation, expression}
• Participation type: {theme, cause}
• Participant type: {gene/protein, event}
• “We further show that Nmi interacts with all STATs except Stat2.”
6 / 27
Molecular events – class I
• One theme (gene/protein)
• “The effect of this synergism was perceptible at the level of induction of the IL-2 gene.”
• Trigger: induction
• Type: gene expression
• Theme: IL-2
• Types: transcription, gene expression, phosphorylation, protein catabolism, localization
7 / 27
Molecular events – class II
• One or more themes (gene/protein)
• “We further show that Nmi interacts with all STATs except Stat2.”
• Trigger: interacts
• Type: binding
• Themes: Nmi, Stat2
• Negated
• Type: Binding
8 / 27
Molecular events – class III
• Types: regulation types
• 1 theme, 0 or 1 cause
• may be gene/protein or other events
• “Overexpression of full-length ALG-4 induced transcription of FasL and, consequently, apoptosis.”
Event | Trigger | Type | Theme | Cause
Event 1 | “transcription” | Transcription | FasL | -
Event 2 | “Overexpression” | Gene expression | ALG-4 | -
Event 3 | “Overexpression” | Regulation | Event 2 | -
Event 4 | “induced” | Regulation | Event 1 | Event 3
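The nested events in this example can be held in a small recursive data structure. The following is a minimal sketch only: the class and field names (`Protein`, `Event`, `nesting_depth`) are assumptions made for illustration and are not the BioNLP’09 annotation format.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class Protein:
    """A gene/protein mention (hypothetical helper type)."""
    name: str


@dataclass
class Event:
    """An event with one theme and an optional cause.

    Theme and cause may be a gene/protein or another event,
    which is what makes class III (regulation) events nested.
    """
    event_id: str
    trigger: str
    event_type: str
    theme: Union[Protein, "Event"]
    cause: Optional[Union[Protein, "Event"]] = None
    negated: bool = False


def nesting_depth(event: Event) -> int:
    """Count the levels of events nested under (and including) this event."""
    children = [p for p in (event.theme, event.cause) if isinstance(p, Event)]
    return 1 + max((nesting_depth(c) for c in children), default=0)


# The four events from "Overexpression of full-length ALG-4 induced
# transcription of FasL and, consequently, apoptosis."
e1 = Event("Event 1", "transcription", "Transcription", Protein("FasL"))
e2 = Event("Event 2", "Overexpression", "Gene expression", Protein("ALG-4"))
e3 = Event("Event 3", "Overexpression", "Regulation", e2)
e4 = Event("Event 4", "induced", "Regulation", theme=e1, cause=e3)
```

Here `nesting_depth(e4)` walks from Event 4 through its cause (Event 3) down to Event 2, which shows why a negation decision for a regulation event may depend on participants that are themselves events.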
9 / 27
Data: BioNLP’09
• Training: 800 abstracts
• Test: 260 abstracts
• Gold annotations
• Event trigger, type, participants, negation
• Negation cue not annotated
Event class | Training data: total | Training data: negated | Development data: total | Development data: negated
Class I | 2,858 | 131 | 559 | 26
Class II | 887 | 44 | 249 | 15
Class III | 4,870 | 440 | 987 | 66
Total | 9,685 | 615 | 1,795 | 107
Test data
10 / 27
Methodologies
• Rule-based
• The command relation
• Classification
• SVM on event representation
• Lexical features: negation cue, POS
• Syntactic features: command
• Semantic features: event types
• Baseline
• NegEx: event triggers as “terms”
11 / 27
Evaluation measures
$$\text{Precision} = \frac{TP}{TP + FP}$$
$$\text{Recall} = \text{Sensitivity} = \frac{TP}{TP + FN}$$
$$F_1 = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
$$\text{Specificity} = \frac{TN}{TN + FP}$$
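As a quick illustration, the four measures on this slide can be computed from confusion-matrix counts as follows. This is a sketch; the function names and the example counts are made up for illustration and are not taken from the experiments.

```python
import math


def precision(tp: int, fp: int) -> float:
    """TP / (TP + FP); undefined (NaN) when nothing is predicted as negated."""
    return tp / (tp + fp) if tp + fp else math.nan


def recall(tp: int, fn: int) -> float:
    """TP / (TP + FN), also called sensitivity."""
    return tp / (tp + fn) if tp + fn else math.nan


def specificity(tn: int, fp: int) -> float:
    """TN / (TN + FP)."""
    return tn / (tn + fp) if tn + fp else math.nan


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else math.nan


# Hypothetical counts, for illustration only.
tp, fp, fn, tn = 8, 2, 12, 78
p, r = precision(tp, fp), recall(tp, fn)
print(p, r, f1(p, r), specificity(tn, fp))
```

A classifier that never predicts "negated" has TP = FP = 0, so its precision is undefined while its recall is 0.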
12 / 27
Baseline results
Approach | P | R | F1 | Spec.
any negation cue present | 20% | 78% | 32% | 81%
NegEx | 36% | 37% | 36% | 93%
No negation detection | - | 0% | - | 94%
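The "any negation cue present" baseline can be sketched in a few lines: every event is labelled negated whenever its sentence contains a cue word. The cue list below is an assumption for illustration; the slides do not give the lexicon that was actually used.

```python
import re

# Assumed, illustrative cue list (not the lexicon used in the experiments).
NEGATION_CUES = {"no", "not", "without", "fail", "fails", "failed", "except", "absence"}


def has_negation_cue(sentence: str) -> bool:
    """True if any token of the sentence is in the cue list."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return any(tok in NEGATION_CUES for tok in tokens)


def baseline_predict(event_sentences):
    """Label an event as negated iff its sentence contains a cue."""
    return [has_negation_cue(s) for s in event_sentences]


sentences = [
    "We further show that Nmi interacts with all STATs except Stat2.",
    "The effect of this synergism was perceptible at the level of induction of the IL-2 gene.",
]
print(baseline_predict(sentences))
```

Because the rule fires for every event in a sentence that contains a cue, it tends to trade precision for recall, which fits the pattern in the table.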
13 / 27
The command relation
• If a and b are nodes in the constituency parse tree of a sentence, then a X-commands b iff the lowest ancestor of a with label X is also an ancestor of b.
Ronald Langacker, On Pronominalization and the Chain of Command, in D. Reibel and S. Schane (eds.) Modern Studies in English, Prentice-Hall, Englewood Cliffs, NJ. 160-186. 1969.
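The definition translates directly into a walk up the parse tree. The sketch below assumes a minimal hand-rolled `Node` class with parent pointers; the class, the helper names and the toy tree are illustrative assumptions, not output of the parser used in this work.

```python
class Node:
    """A constituency-tree node; leaves simply use the word as label."""

    def __init__(self, label, *children):
        self.label = label
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self


def ancestors(node):
    """Yield the proper ancestors of a node, nearest first."""
    node = node.parent
    while node is not None:
        yield node
        node = node.parent


def x_commands(a, b, x):
    """a X-commands b iff the lowest ancestor of a labelled x is an ancestor of b."""
    lowest = next((n for n in ancestors(a) if n.label == x), None)
    if lowest is None:
        return False
    return any(n is lowest for n in ancestors(b))


# Simplified, hand-built tree (an assumption) for:
# "We now show that a mutant motif ... fails to bind the p50 homodimer."
we, show, motif = Node("We"), Node("show"), Node("motif")
fails, bind, p50 = Node("fails"), Node("bind"), Node("p50")
inner_s = Node("S", Node("NP", motif), Node("VP", fails, Node("VP", bind, p50)))
root = Node("S", we, Node("VP", show, inner_s))

print(x_commands(fails, bind, "S"))   # cue and trigger share the inner S
print(x_commands(fails, show, "S"))   # "show" lies outside the inner S
```

In this toy tree, "fails" S-commands "bind" and "p50" but not "show", because the lowest S above "fails" is the embedded clause.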
15 / 27
X-command in action
[Parse tree figure: constituency tree with S and VP nodes for the sentence below]
“We now show that a mutant motif that exchanges the terminal 3' C for a G fails to bind the p50 homodimer.”
16 / 27
Rule-based method
• An event is negated if
• Negation cue exists; and one of the following rule variants holds
• Negation cue S-commands any participant
• Negation cue S-commands trigger
• Negation cue S-commands both
• Negation cue VP-commands both
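The four rule variants can be sketched over a parse tree stored as a child-to-parent map. Everything named here is an assumption for illustration: the toy tree is hand-built rather than parser output, the variant names are invented, and "both" is read as "the trigger and at least one participant".

```python
# Hand-built toy constituency tree (an assumption, not parser output) for
# "We now show that a mutant motif ... fails to bind the p50 homodimer."
PARENT = {
    "S0": None, "We": "S0", "VP0": "S0", "show": "VP0", "S1": "VP0",
    "NP1": "S1", "motif": "NP1", "VP1": "S1", "fails": "VP1",
    "VP2": "VP1", "bind": "VP2", "p50": "VP2",
}
LABEL = {"S0": "S", "S1": "S", "VP0": "VP", "VP1": "VP", "VP2": "VP", "NP1": "NP"}


def ancestors(node):
    """Return the proper ancestors of a node, nearest first."""
    chain = []
    node = PARENT.get(node)
    while node is not None:
        chain.append(node)
        node = PARENT.get(node)
    return chain


def commands(a, b, x):
    """True if the lowest ancestor of a labelled x is also an ancestor of b."""
    lowest = next((n for n in ancestors(a) if LABEL.get(n) == x), None)
    return lowest is not None and lowest in ancestors(b)


def is_negated(cue, trigger, participants, variant):
    """Apply one rule variant; cue is None when the sentence has no cue."""
    if cue is None:
        return False
    if variant == "S-any-participant":
        return any(commands(cue, p, "S") for p in participants)
    if variant == "S-trigger":
        return commands(cue, trigger, "S")
    if variant in ("S-both", "VP-both"):
        x = variant.split("-")[0]  # "S" or "VP"
        return commands(cue, trigger, x) and any(
            commands(cue, p, x) for p in participants
        )
    raise ValueError(f"unknown variant: {variant}")


# Binding event with trigger "bind", participant "p50" and cue "fails".
print(is_negated("fails", "bind", ["p50"], "VP-both"))
```

In this tree, an event triggered by "show" would not be marked negated, since "fails" does not S-command anything outside the embedded clause.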
17 / 27
Results of rule-based method
Approach | P | R | F1 | Spec.
negation cue S-commands any participant | 23% | 76% | 35% | 84%
negation cue S-commands trigger | 23% | 68% | 34% | 85%
negation cue S-commands both | 23% | 68% | 35% | 86%
negation cue VP-commands both: 42% (the only value given for this variant)
18 / 27
SVM features
• Semantic features
• Event type
• Lexical features
• Sentence contains negation cue
• Negation cue
• Syntactic features
• POS of neg cue
• POS of event trigger
• POS of the participants
• Parse tree distance between trigger & cue
• Type of smallest phrase containing trigger & cue
• Cue S-commands any participant
• Cue S-commands trigger
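One way to turn these features into an SVM input vector is to one-hot encode the categorical values and pass numeric or boolean values through. The sketch below is an assumption about the encoding, not the representation used in the experiments; the feature names and example values are hypothetical.

```python
def build_vocabulary(rows):
    """Collect one column per numeric feature and per (feature, value) pair."""
    columns = []
    for row in rows:
        for name, value in row.items():
            numeric = isinstance(value, (int, float, bool))
            col = name if numeric else f"{name}={value}"
            if col not in columns:
                columns.append(col)
    return columns


def vectorize(row, columns):
    """Encode one event as a list of floats aligned with `columns`."""
    index = {c: i for i, c in enumerate(columns)}
    vec = [0.0] * len(columns)
    for name, value in row.items():
        if isinstance(value, (int, float, bool)):
            col, x = name, float(value)
        else:
            col, x = f"{name}={value}", 1.0
        if col in index:  # unseen categorical values are ignored
            vec[index[col]] = x
    return vec


# Hypothetical feature values for the binding event in
# "We further show that Nmi interacts with all STATs except Stat2."
event = {
    "event_type": "binding",
    "has_cue": True,
    "cue": "except",
    "cue_pos": "IN",
    "trigger_pos": "VBZ",
    "participant_pos": "NNP",
    "smallest_phrase": "VP",
    "cue_s_commands_participant": True,
    "cue_s_commands_trigger": True,
    "tree_distance": 4,
}
columns = build_vocabulary([event])
vector = vectorize(event, columns)
```

Vectors of this kind could then be fed to any SVM implementation, with one row per event and a binary negated/not-negated label.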
19 / 27
Results of single SVM, incremental feature sets
Feature set | P | R | F1 | Spec.
Features 1-7 | 43% | 8% | 14% | 99.2%
Features 1-8 | 73% | 19% | 30% | 99.3%
Features 1-9 | 71% | 38% | 49% | 99.2%
Features 1-10 | 76% | 38% | 51% | 99.2%
1. Event type
2. Sentence contains neg cue
3. Neg cue
4. POS of neg cue
5. POS of event trigger
6. POS of the participants
7. Type of smallest phrase containing trigger & cue
8. Cue S-commands any participant
9. Cue S-commands trigger
10. Parse tree distance between trigger & cue
24 / 27
Results of separate SVMs for each class
Event class | P | R | F1 | Spec.
Class I (559 events) | 94% | 65% | 77% | 99.8%
Class II (249 events) | 100% | 33% | 50% | 100%
Class III (987 events) | 81% | 44% | 57% | 99.2%
Micro-average (1,795 events) | 88% | 49% | 63% | 99.4%
Macro-average (3 classes) | 92% | 47% | 62% | 99.7%
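The difference between the two averages can be made concrete. In the sketch below, the macro-average is recomputed from the per-class percentages on this slide; the macro F1 appears to be derived from the macro-averaged P and R rather than from the mean of the per-class F1 values (which would give about 61%). The micro-average needs pooled counts, which the slides do not report, so `micro_scores` is shown only as a function and any counts passed to it are the caller's own.

```python
# Per-class scores as reported on the slide.
per_class = {
    "Class I": {"P": 0.94, "R": 0.65, "Spec": 0.998},
    "Class II": {"P": 1.00, "R": 0.33, "Spec": 1.000},
    "Class III": {"P": 0.81, "R": 0.44, "Spec": 0.992},
}


def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0


def macro_average(scores, key):
    """Unweighted mean of one measure over the classes."""
    return sum(s[key] for s in scores.values()) / len(scores)


def micro_scores(counts):
    """Pool TP/FP/FN/TN over all classes, then compute each measure once."""
    tp = sum(c["tp"] for c in counts)
    fp = sum(c["fp"] for c in counts)
    fn = sum(c["fn"] for c in counts)
    tn = sum(c["tn"] for c in counts)
    p, r = tp / (tp + fp), tp / (tp + fn)
    return {"P": p, "R": r, "F1": f1(p, r), "Spec": tn / (tn + fp)}


macro_p = macro_average(per_class, "P")        # about 0.92
macro_r = macro_average(per_class, "R")        # about 0.47
macro_spec = macro_average(per_class, "Spec")  # about 0.997
macro_f1 = f1(macro_p, macro_r)                # about 0.62
```

Macro-averaging weights the small Class II as heavily as the much larger Class III, whereas micro-averaging weights every event equally.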
25 / 27
Future work
• Use class-specific features
• Study other variants of command
• Combine negation detection with automatic event detection instead of using ‘gold’ events
• Use negation detection on a larger scale dataset (MEDLINE) to find contradictions & contrasts in the biomedical literature
26 / 27
Conclusions
• SVM for extracting negated events
• >99% specificity
• 63% F-measure (micro average)
• Different classes of events behave differently
• To detect negated molecular events
• Event trigger & surface distances not enough
• Semantic & command features useful
• Event participants as important as triggers
• Apply on large scale data – MEDLINE
27 / 27
Acknowledgements
• Organisers of BioNLP’09
• GN TEAM
• Casey Bergman’s lab – Faculty of Life Sciences, University of Manchester
• James Eales – University of Manchester
• Jonathan Caruana – University College London
• Web service soon available at http://gnode1.mib.man.ac.uk/negmole