SWAN/SIOC: Aligning Scientific Discourse Representation and Social Semantics
Overview of scientific discourse annotation
A brief introduction to current efforts in scientific discourse annotation
Anita de Waard
Disruptive Technology Director, Elsevier Labs - also on behalf of HCLS/UU/D2S...
http://elsatglabs.com/labs/anita
Thursday, October 20, 2011
One way of subdividing the field of Scientific Discourse Annotation:
Five levels of markup:
1. Section: “Discussion”, ~ several paragraphs (+HCLS)
2. Module: “Research Question”, ~ paragraph
3. Statement: “Hypothesis”, ~ sentence/clause (+UU)
4. Relation: “Supports”, ~ hyperlink (Entity: “Gene Name”, ~ NP)
5. Special case of 3 & 4: Claim/Evidence network (+D2S)
For each level:
a. Why? Use cases
b. What, by whom? Concepts, ontology, authors?
c. How? Manual, automated?
1. Section-level markup
a. Why mark up sections?
- Search: e.g. search for entities in Methods
- Visualisation: e.g. structured browse at section level
b. What, by whom?
- Background/Contribution/Discussion model for CS
- HCLS: Ontology of Rhetorical Blocks (ORB) = IMRaD in OWL
c. Automate?
- Yes [Hovy/Ramakrishnan]
HCLS SciDis: develop standards for this markup
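The search use case for section-level markup ("search for entities in Methods") can be illustrated with a minimal sketch. It assumes a paper has already been split into labelled sections; the `Section` type, the `find_in_section` helper and the sample sentences are hypothetical and are not taken from ORB or any HCLS specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Section:
    label: str  # e.g. "Introduction", "Methods", "Results", "Discussion"
    text: str


def find_in_section(sections: List[Section], label: str, entity: str) -> List[str]:
    """Return the sentences mentioning `entity` inside sections with the given label."""
    hits = []
    for section in sections:
        if section.label.lower() != label.lower():
            continue
        # Naive sentence split, good enough for an illustration only.
        for sentence in section.text.split(". "):
            if entity.lower() in sentence.lower():
                hits.append(sentence.strip().rstrip("."))
    return hits


# Hypothetical sample content, invented for this sketch.
paper = [
    Section("Introduction", "Atx-1 is linked to SCA1. Its function is unclear."),
    Section("Methods", "We overexpressed dAtx-1 in flies. Eyes were imaged at 28 days."),
]

print(find_in_section(paper, "Methods", "dAtx-1"))
```

Restricting the query to one section label is what distinguishes this from plain full-text search: the same entity mentioned in the Introduction is not returned.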
2.a. Module-level markup: why?
- Better search: e.g. query inside ‘Research question’
- Initial idea: content reuse, e.g.
  - Write Methods sections once, import/link many
- Different way of creating a collection of scholarly content: not standalone narrative, but connected set of modules
2.b. Module-level markup: What, by whom?
- Kircz, ’98: “a much more radical approach would be to [break] apart the linear text into independent modules, each with its own unique cognitive character.”
- Harmsze, ’00: modular model for physics papers
- LiquidPub, 2010: Structured Knowledge Objects
- HCLS: Medium-grained structure: core narrative components
Story grammar model for science
Story: The Story of Goldilocks and the Three Bears
Paper: The AXH Domain of Ataxin-1 Mediates Neurodegeneration through Its Interaction with Gfi-1/Senseless Proteins
Story text | Story grammar | Paper element | Paper text
Once upon a time | Time (Setting) | Background | The mechanisms mediating SCA1 pathogenesis are still not fully understood, but some general principles have emerged.
a little girl named Goldilocks | Characters (Setting) | Objects of study | the Drosophila Atx-1 homolog (dAtx-1) which lacks a polyQ tract,
She went for a walk in the forest. Pretty soon, she came upon a house. | Location (Setting) | Experimental setup | studied and compared in vivo effects and interactions to those of the human protein
She knocked and, when no one answered, | Goal (Theme) | Research goal | Gain insight into how Atx-1's function contributes to SCA1 pathogenesis. How these interactions might contribute to the disease process and how they might cause toxicity in only a subset of neurons in SCA1 is not fully understood.
she walked right in. | Attempt (Theme) | Hypothesis | Atx-1 may play a role in the regulation of gene expression
At the table in the kitchen, there were three bowls of porridge. | Name (Episode 1) | Name | dAtx-1 and hAtx-1 Induce Similar Phenotypes When Overexpressed in Flies
Goldilocks was hungry. | Subgoal (Episode 1) | Subgoal | test the function of the AXH domain
She tasted the porridge from the first bowl. | Attempt (Episode 1) | Method | overexpressed dAtx-1 in flies using the GAL4/UAS system (Brand and Perrimon, 1993) and compared its effects to those of hAtx-1.
This porridge is too hot! she exclaimed. | Outcome (Episode 1) | Results | Overexpression of dAtx-1 by Rhodopsin1(Rh1)-GAL4, which drives expression in the differentiated R1-R6 photoreceptor cells (Mollereau et al., 2000 and O'Tousa et al., 1985), results in neurodegeneration in the eye, as does overexpression of hAtx-1[82Q]. Although at 2 days after eclosion, overexpression of either Atx-1 does not show obvious morphological changes in the photoreceptor cells
So, she tasted the porridge from the second bowl. | Activity (Episode 1) | Data | (data not shown),
This porridge is too cold, she said | Outcome (Episode 1) | Results | both genotypes show many large holes and loss of cell integrity at 28 days
So, she tasted the last bowl of porridge. | Activity (Episode 1) | Data | (Figures 1B-1D).
Ahhh, this porridge is just right, she said happily and | Outcome (Episode 1) | Results | Overexpression of dAtx-1 using the GMR-GAL4 driver also induces eye abnormalities. The external structures of the eyes that overexpress dAtx-1 show disorganized ommatidia and loss of interommatidial bristles
she ate it all up. | (Episode 1) | Data | (Figure 1F),
2.c. Module-level markup: how?
- Automated recognition: very difficult:
  - How do you know where the boundaries are?
  - Even difficult for author to identify!
- Author creates: templates.
  - XPharm, 2001: modular text book in pharmacology:
  - Only works if you pay authors!
- Motif detection work might offer help?
3. Statement-level markup
a. Why?
- Automated summarisation?
- Towards claim-evidence networks
b. How, by whom?
- Comparison of three groups:
  1. Liakata et al.: CoreSC
  2. Ananiadou et al.: Metaknowledge annotation
  3. De Waard/Pander Maat: Discourse Segment Types
- Annotated three texts: compare schemes, levels, annotation overlap
3.1 Liakata et al.: Core-Scientific Concepts (CoreSC) Annotation Scheme
Three-layer, ontology-motivated annotation scheme for sentence annotation, which views a paper as the humanly readable representation of a scientific investigation [45-page guideline: Liakata & Soldatova 2008]:
- 1st layer: Core Scientific Concepts (CoreSCs): Hypothesis, Motivation, Goal, Object, Background, Method, Experiment, Model, Observation, Result, Conclusion
- 2nd layer: Properties of CoreSCs: Novelty (New/Old) and Advantage (advantage/disadvantage)
- 3rd layer: Concept Identifiers: linking sentences together which refer to the same instance of a CoreSC
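The three CoreSC layers can be pictured as fields on a per-sentence record. The sketch below is only an illustration of that idea: the class and field names are assumptions, not the CoreSC tool's actual data model, and the second sample sentence is invented to show how a shared concept identifier links sentences.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class CoreSC(Enum):
    # 1st layer: the eleven categories listed on the slide.
    HYPOTHESIS = "Hypothesis"
    MOTIVATION = "Motivation"
    GOAL = "Goal"
    OBJECT = "Object"
    BACKGROUND = "Background"
    METHOD = "Method"
    EXPERIMENT = "Experiment"
    MODEL = "Model"
    OBSERVATION = "Observation"
    RESULT = "Result"
    CONCLUSION = "Conclusion"


@dataclass
class CoreSCAnnotation:
    sentence: str
    concept: CoreSC                    # 1st layer
    novelty: Optional[str] = None      # 2nd layer: "New" or "Old"
    advantage: Optional[str] = None    # 2nd layer: "advantage" or "disadvantage"
    concept_id: Optional[str] = None   # 3rd layer: shared by sentences about the same instance


def same_instance(annotations: List[CoreSCAnnotation], concept_id: str) -> List[str]:
    """Collect the sentences that refer to the same CoreSC instance."""
    return [a.sentence for a in annotations if a.concept_id == concept_id]


annotations = [
    CoreSCAnnotation("Here we show that BOB.1/OBF.1 regulates Btk gene expression.",
                     CoreSC.RESULT, concept_id="Res24"),
    # Hypothetical follow-up sentence, invented for this sketch.
    CoreSCAnnotation("This regulation was seen in all samples.",
                     CoreSC.RESULT, concept_id="Res24"),
    CoreSCAnnotation("Btk is required for B cell development.",
                     CoreSC.BACKGROUND, concept_id="Bac1"),
]

print(same_instance(annotations, "Res24"))
```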
3.1 CoreSC Annotation tool:
• Automated annotation with CoreSC system well underway!
3.2 Ananiadou et al.: Metaknowledge annotation:
Bio-Event (centred on an Event Trigger):
• Class / Type (grounded to an event ontology)
• Participants: Theme(s), Actor(s)
• Knowledge Type: Investigation, Observation, Analysis, General
• Manner: High, Low, Neutral
• Certainty Level: L3, L2, L1
• Polarity: Negative, Positive
• Source: Other, Current
3.2 Example of Metaknowledge annotation:
S3 = These results suggest that Y has no effect on expression of X
Event | Knowledge Type | Certainty Level | Lexical Polarity | Manner | Source
E1 | General | L3 | Positive | Neutral | Current
E2 | Analysis | L2 | Negative | Neutral | Current
• Manual annotation underway of the GENIA event corpus (1000 MEDLINE abstracts)
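The two annotated events for S3 can be written down as records and checked against the value sets listed for each metaknowledge dimension. This is a minimal sketch under assumptions: the `MetaKnowledge` class, the `ALLOWED` table and the `invalid_fields` helper are hypothetical names, not part of the GENIA corpus tooling.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MetaKnowledge:
    event: str
    knowledge_type: str
    certainty_level: str
    polarity: str
    manner: str
    source: str


# Value sets copied from the scheme overview on the 3.2 slide.
ALLOWED = {
    "knowledge_type": {"Investigation", "Observation", "Analysis", "General"},
    "certainty_level": {"L1", "L2", "L3"},
    "polarity": {"Positive", "Negative"},
    "manner": {"High", "Low", "Neutral"},
    "source": {"Current", "Other"},
}


def invalid_fields(annotation: MetaKnowledge) -> List[str]:
    """Return the names of dimensions whose value is outside the scheme."""
    return [name for name, allowed in ALLOWED.items()
            if getattr(annotation, name) not in allowed]


# The two events annotated for S3 in the example table.
s3_events = [
    MetaKnowledge("E1", "General", "L3", "Positive", "Neutral", "Current"),
    MetaKnowledge("E2", "Analysis", "L2", "Negative", "Neutral", "Current"),
]

negated = [e.event for e in s3_events if e.polarity == "Negative"]
print(negated)
```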
3.3 de Waard/Pander Maat: Discourse Segment Types
Both seminomas and the EC component of nonseminomas share features with ES cells. To exclude that the detection of miR-371-3 merely reflects its expression pattern in ES cells, we tested by RPA miR-302a-d, another ES cells-specific miRNA cluster (Suh et al, 2004). In many of the miR-371-3 expressing seminomas and nonseminomas, miR-302a-d was undetectable (Figs S7 and S8), suggesting that miR-371-3 expression is a selective event during tumorigenesis.
The same passage split into segments:
- Both seminomas and the EC component of nonseminomas share features with ES cells.
- To exclude that
- the detection of miR-371-3 merely reflects its expression pattern in ES cells,
- we tested by RPA miR-302a-d, another ES cells-specific miRNA cluster (Suh et al, 2004).
- In many of the miR-371-3 expressing seminomas and nonseminomas, miR-302a-d was undetectable (Figs S7 and S8),
- suggesting that
- miR-371-3 expression is a selective event during tumorigenesis.
Segment type labels shown: Fact, Hypothesis, Method, Result, Implication, Goal, Reg-Implication
Further labels shown: Conceptual knowledge, Experimental Evidence
3.3 Identification of DSTs:
- Verb form: tense, e.g.
  - Concepts in ‘state’ (gnomic) present: ‘Dopaminergic innervation plays a major role in the control of mood and its perturbation’
  - Experiments in ‘event’ past: ‘Four out of seven cell lines expressed this cluster’
- Semantic verb class:
  - Research verbs (Investigation, Prediction, Procedure, Observation, Interpretation)
  - Discourse verbs
  - Properties and relationships - between things and concepts
- Modality
  - Types: Source {Author, others, unknown}, Basis {Data, Reasoning, 0}, Value {Certain, probable, possible, unknown}
  - Markers: Modal aux, verb class Interpretation, epistemic adverbs
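The three kinds of modality marker named above (modal auxiliaries, Interpretation verbs, epistemic adverbs) lend themselves to a simple lookup. The sketch below is a toy illustration only: the word lists are assumptions chosen for this example, not a published lexicon, and the talk itself describes DST identification as manual.

```python
import re
from typing import Dict, List

# Hypothetical marker lists, picked for illustration only.
MODAL_AUX = {"may", "might", "could", "would", "should"}
INTERPRETATION_VERBS = {"suggest", "suggests", "suggesting",
                        "indicate", "indicates", "indicating"}
EPISTEMIC_ADVERBS = {"possibly", "probably", "presumably", "perhaps"}


def modality_markers(clause: str) -> Dict[str, List[str]]:
    """List the modality markers found in a clause, grouped by marker kind."""
    tokens = re.findall(r"[a-z]+", clause.lower())
    return {
        "modal_aux": [t for t in tokens if t in MODAL_AUX],
        "interpretation_verb": [t for t in tokens if t in INTERPRETATION_VERBS],
        "epistemic_adverb": [t for t in tokens if t in EPISTEMIC_ADVERBS],
    }


def is_hedged(clause: str) -> bool:
    """True if the clause carries at least one marker from the lists above."""
    return any(modality_markers(clause).values())


print(modality_markers("suggesting that miR-371-3 expression is a selective event"))
print(is_hedged("Four out of seven cell lines expressed this cluster"))
```

A word list cannot capture tense or verb class in context, so this shows only where a marker-based approach would start, not a working DST classifier.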
3. Same Statement annotated three ways:
CoreSC:
<annotationART atype="GSC" type="Res" conceptID="Res24" novelty="None" advantage="None">Here we show that BOB.1/OBF.1 regulates Btk gene expression.</annotationART>
BioEvent/MetaKnowledge:
<sentence id="S6">Here we show that <term id="T13" sem="Protein_family_or_group">
<gene-or-gene-product id="G9">BOB.1</gene-or-gene-product>/<gene-or-gene-product id="G10">OBF.1</gene-or-gene-product>
</term> regulates <term id="T14" sem="Biological_process">
<term id="T15" sem="DNA_domain_or_region"><gene-or-gene-product id="G11">Btk
</gene-or-gene-product> gene</term> expression
</term>. </sentence>
Discourse Segments:
<segment segID="286" section="D" segtype="RegImplication">Here we show that</segment>
<segment segID="287" section="D" segtype="Implication">BOB.1/OBF.1 regulates Btk gene expression.</segment>
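To compare what each scheme records for this sentence, the three fragments can be read with Python's standard `xml.etree.ElementTree`. This is a sketch under assumptions: whitespace inside the fragments is normalised, and the `<segments>` wrapper element is added here only so the two sibling segments parse as one document; it is not part of the original annotation format.

```python
import xml.etree.ElementTree as ET

CORESC = (
    '<annotationART atype="GSC" type="Res" conceptID="Res24" novelty="None" advantage="None">'
    'Here we show that BOB.1/OBF.1 regulates Btk gene expression.</annotationART>'
)

BIOEVENT = (
    '<sentence id="S6">Here we show that <term id="T13" sem="Protein_family_or_group">'
    '<gene-or-gene-product id="G9">BOB.1</gene-or-gene-product>/'
    '<gene-or-gene-product id="G10">OBF.1</gene-or-gene-product></term> regulates '
    '<term id="T14" sem="Biological_process"><term id="T15" sem="DNA_domain_or_region">'
    '<gene-or-gene-product id="G11">Btk</gene-or-gene-product> gene</term> expression</term>.'
    '</sentence>'
)

# <segments> is a hypothetical wrapper so both siblings form one well-formed document.
SEGMENTS = (
    '<segments>'
    '<segment segID="286" section="D" segtype="RegImplication">Here we show that</segment>'
    '<segment segID="287" section="D" segtype="Implication">'
    'BOB.1/OBF.1 regulates Btk gene expression.</segment>'
    '</segments>'
)

coresc = ET.fromstring(CORESC)
sentence = ET.fromstring(BIOEVENT)
segments = ET.fromstring(SEGMENTS)

# CoreSC: one label for the whole sentence.
coresc_label = coresc.get("type")

# BioEvent/MetaKnowledge: typed terms and gene mentions inside the sentence.
term_types = [(t.get("id"), t.get("sem")) for t in sentence.iter("term")]
genes = [g.text for g in sentence.iter("gene-or-gene-product")]
plain_text = "".join(sentence.itertext())

# Discourse segments: one label per clause.
segment_types = [(s.get("segtype"), s.text) for s in segments.iter("segment")]

print(coresc_label, genes, segment_types)
```

The granularity difference is visible in the results: one label per sentence for CoreSC, nested entity and process spans for BioEvent, and two clause-level labels for the discourse segments.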
3. Comparing statement-level annotation models
Who | Why | What | How
Liakata: CoreSC | Identify main components of scientific investigation for machine learning | Sentence | Manual corpus, automated annotation tools - working on automated detection
Ananiadou: MetaKnowledge/BioEvents | Enhance information extraction for biomedical texts to enable metadiscourse annotation | Events (intra-sentential): can be several per sentence, or one in more sentences | Manual corpus, working on automated detection
de Waard: Discourse Segment Types | Identify mechanisms of conveying (epistemic) knowledge in scientific discourse | Clause | Manual, ideas (but no real plans!) for automated identification
4. Relations
a. Why?
- Argumentation visualisation
b. What, by whom?
- Harmsze (1999): Ontology of content relationships
- IBIS, ClaiMaker (2001)
- Diligent argumentation ontology (2005)
- SALT: RST (2007)
- SWAN (2010)
c. How?
- So far: Manually
5. Claim-Evidence Networks (statement + relations)
a. Why?
- Show argumentation across body of work
b. What, by whom?
- Buckingham Shum, 1999
- SWAN: Clark, Ciccarese et al., 2005
- Nanopublications: Mons, 2010
- Nanopublications + SWAN, 2011
c. How?
- So far: manually
D2S Use case: Claim-Evidence Network in Medicine
A. Philips' Electronic Patient Records
B. Elsevier-published Clinical Guideline
C. Elsevier (or other publisher's) Research Report or Data
Step 1: Patient data + diagnosis link to Guideline recommendation
Step 2: Guideline recommendation links to evidence in report or data
Related HCLS Use Case: Accelerate uptake of medical research on drug-drug interaction in product inserts
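The two steps of the D2S use case above can be sketched as a small linked structure: a patient record points to a guideline recommendation, which in turn points to its supporting evidence. This is only an illustration under stated assumptions; the Node class, the evidence_trail function, the identifiers, and the example texts are all hypothetical and do not describe the actual D2S implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    kind: str  # "patient_record", "guideline", or "evidence"
    text: str
    links: list = field(default_factory=list)  # ids of the nodes this one links to


def evidence_trail(nodes, start_id):
    """Follow links depth-first from start_id and return the visited ids in order."""
    trail, stack, seen = [], [start_id], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        trail.append(current)
        # reversed() keeps the declared link order when popping from the stack
        stack.extend(reversed(nodes[current].links))
    return trail


# Hypothetical placeholder content; none of it comes from a real record or guideline.
nodes = {
    "A1": Node("A1", "patient_record", "Patient data + diagnosis", ["B1"]),
    "B1": Node("B1", "guideline", "Guideline recommendation", ["C1", "C2"]),
    "C1": Node("C1", "evidence", "Research report"),
    "C2": Node("C2", "evidence", "Research data"),
}

trail = evidence_trail(nodes, "A1")
print(" -> ".join(trail))
```

Starting from the patient record, the trail passes through the guideline (Step 1) before reaching the report and the data (Step 2), which is the path a clinician would follow to see why a recommendation applies.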
Five levels of scientific discourse annotation
Level | Why | What/Who | How
Section | Search, UI | ABCDE, ORB/HCLS | Automated! Publisher helps?
Module | Content reuse | Harmsze/Kircz, LiquidPub, HCLS | Manual: templates
Statement | Summaries; towards networks | Teufel, Ananiadou, Liakata, UU | Working towards automated detection
Relations | Networks | SALT, ScholOnto, Diligent, SWAN | Manual: some tools, never took off...
Claim/Evidence | Argumentation networks | ScholOnto, SWAN, Nanopublications, D2S | Manual - towards automation?
Motifs?
Questions:
Scientific discourse -> fairy tales: how can we transfer knowledge here to scientific discourse community?
Issue: Scientists do not like being told that they write fairy tales!
Fairy tales -> scientific discourse: is anyone here/in this community interested in working on inter-domain transfer of tools, technologies and theories?
Anita de Waard, [email protected]
- HCLS: http://www.w3.org/wiki/HCLSIG/SWANSIOC
- UU: http://elsatglabs.com/labs/anita
- D2S: http://www.data2semantics.org