Semantic Relations for Interpreting DNA Microarray Data and for Novel Hypotheses Generation
Dimitar Hristovski,1 PhD, Andrej Kastrin,2 Borut Peterlin,2 MD PhD, Thomas C Rindflesch,3 PhD
1Institute of Biomedical Informatics, Medical Faculty, University of Ljubljana, Slovenia
2Institute of Medical Genetics, University Medical Centre, Ljubljana, Slovenia
3National Library of Medicine, National Institutes of Health, Bethesda, MD, U.S.A.
e-mail: [email protected]
Introduction
Microarray experiments:
• great potential to support progress in biomedical research,
• results NOT EASY to interpret,
• information about functions and relations of relevant genes needs to be extracted from the vast biomedical literature
Related Work
• Text mining and microarray analysis
• Literature-based Discovery
Proposed Solution
• Computerized text analysis system
• Extract semantic relations from literature
– SemRep
• Integrate with microarray experiments
• Develop tools for:
– Interpretation
– Novel hypotheses generation
Overall Design
[Diagram: Medline → SemRep (semantic relations extraction) → semantic relations; GEO → R Bioconductor scripts → microarrays. Both feed the Integrated Database (semantic relations + microarrays), which underlies the Interpretation & Discovery Tools.]
SemRep
• Extracts semantic relations from biomedical text (implemented in Prolog)
• Based on UMLS Metathesaurus and Semantic Network
– <MetaConc> SEMNET RELATION <MetaConc>
• Database of relations extracted from MEDLINE
– 6.7M citations (01/01/1999 through 03/31/2009)
– 43M sentences
– 21M relation instances
– 7M relation types
Semantic Relations Extracted
• Wide range of relations in:
– Clinical medicine
– Molecular genetics
– Pharmacogenomics
• Genetic Etiology: associated_with, predisposes, causes
• Substance Relations: interacts_with, inhibits, stimulates
• Pharmacological Effects: affects, disrupts, augments
• Clinical Actions: administered_to, manifestation_of, treats
• Organism Characteristics: location_of, part_of, process_of
• Co-existence: co-exists_with
Examples
• “… the loss of Mbd1 could lead to autism-like behavioral phenotypes …”
• Relation: MBD1 causes Autistic Disorder
• “… Mbd1 can directly regulate the expression of Htr2c, one of the serotonin receptors, …”
• Relation: MBD1 interacts_with HTR2C
Interpretation of Microarrays
Find known facts from the literature:
• Disease related:
– Associated genes
– Current treatments
– …
• Microarray genes:
– Relations between genes (INHIBITS, STIMULATES, …)
– Relations between the genes and anything else
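These lookups can be pictured as filters over subject, predicate, object triples. The following is a minimal Python sketch under that assumption; the `Relation` type and the `find_relations` helper are hypothetical names for illustration, not the schema or API of the integrated database. The sample rows reuse relations quoted in these slides.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Sample relations taken from the examples in these slides.
RELATIONS = [
    Relation("MBD1", "CAUSES", "Autistic Disorder"),
    Relation("MBD1", "INTERACTS_WITH", "HTR2C"),
    Relation("NR4A2", "ASSOCIATED_WITH", "Parkinson Disease"),
    Relation("pramipexol", "STIMULATES", "NR4A2"),
    Relation("Paclitaxel", "INHIBITS", "HSPB1"),
]


def find_relations(relations, predicates=None, argument=None):
    """Return relations whose predicate is in `predicates` (if given) and
    whose subject or object contains `argument` (case-insensitive, if given)."""
    hits = []
    for rel in relations:
        if predicates is not None and rel.predicate not in predicates:
            continue
        if argument is not None:
            needle = argument.lower()
            if needle not in rel.subject.lower() and needle not in rel.obj.lower():
                continue
        hits.append(rel)
    return hits


# "What (causes, associated_with) Parkinson?"
parkinson_hits = find_relations(RELATIONS, {"CAUSES", "ASSOCIATED_WITH"}, "Parkinson")

# "Genes from the microarray related to anything?" for one gene symbol.
mbd1_hits = find_relations(RELATIONS, argument="MBD1")
```

Substring matching on the argument is a simplification; a real system would more likely match on concept identifiers.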
Relations with “Parkinson” as Argument?
What Treats Parkinson?
What (causes, associated_with) Parkinson?
Sentences from which Relations are Extracted
Genes from the Microarray Related to Anything?
Novel Hypotheses Generation
• Based on discovery patterns
• Discovery patterns:
– search templates that have a higher likelihood of returning a new discovery
• Specific discovery patterns for specific discovery tasks
Discovery Patterns
• Inhibit the upregulated:
– Search for substances, genes, ... which, according to the literature, inhibit the top N (e.g. 300) genes that are upregulated on a given microarray
– Such substances, genes, … might be used to regulate the upregulated genes
• Stimulate the downregulated:
– Search for substances, genes, ... which, according to the literature, stimulate the top N (e.g. 300) genes that are downregulated on a given microarray
– Such substances, genes, … might be used to regulate the downregulated genes
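Both patterns share one shape: pick the top N genes changing in one direction, then look for literature relations with a given predicate that target those genes. The Python sketch below illustrates this under stated assumptions: expression is a dict of gene symbol to signed fold change, relations are plain triples, and `discovery_pattern` is a hypothetical helper, not the tool's actual implementation. The gene directions and triples come from these slides; the numeric fold-change values are illustrative only.

```python
def discovery_pattern(expression, relations, predicate, upregulated, top_n=300):
    """Return relations with `predicate` whose object is among the top_n
    genes changing in the requested direction.

    expression: dict mapping gene symbol -> signed fold change
                (positive = upregulated, negative = downregulated).
    relations:  iterable of (subject, predicate, object) triples.
    """
    sign = 1 if upregulated else -1
    # Keep genes changing in the requested direction, strongest change first.
    candidates = [g for g, change in expression.items() if sign * change > 0]
    candidates.sort(key=lambda g: abs(expression[g]), reverse=True)
    targets = set(candidates[:top_n])
    return [r for r in relations if r[1] == predicate and r[2] in targets]


# Illustrative values only: the slides state the direction, not the magnitude.
expression = {"HSPB1": 1.0, "NR4A2": -1.0}

relations = [
    ("Paclitaxel", "INHIBITS", "HSPB1"),
    ("Quercetin", "INHIBITS", "HSPB1"),
    ("pramipexol", "STIMULATES", "NR4A2"),
]

inhibit_up = discovery_pattern(expression, relations, "INHIBITS", upregulated=True)
stimulate_down = discovery_pattern(expression, relations, "STIMULATES", upregulated=False)
```

Each returned triple is only a candidate hypothesis; the supporting sentences still need to be read before drawing any conclusion.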
Discovery Patterns – Graphical View
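[Diagram: on the microarray side, Disease X is linked to upregulated Genes Y1 and downregulated Genes Y2. On the literature side, Drug (or substance) Z1 Inhibits and Drug (or substance) Z2 Stimulates these genes. The hypothesized links Maybe_Treats1? and Maybe_Treats2? connect the drugs to Disease X.]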
Results – Inhibit the Upregulated
Paclitaxel INHIBITS HSPB1|HSPB1 protein
Paclitaxel completely inhibited the expression of HSP27 (PMID: 15304155)
Quercetin INHIBITS HSPB1|HSPB1 gene
Quercetin …, inhibited the expression of both HSP70 and HSP27 (PMID: 12926076)
• Parkinson microarray GSE8397
• HSP27 (HSPB1) gene is upregulated on the microarray
• We identified paclitaxel and quercetin as substances that inhibit the expression of this gene
Inhibit the Upregulated
Results – Stimulate the Downregulated
• NR4A2 downregulated on the microarray
• We found out that:
– Pramipexol stimulates expression of NR4A2
– NR4A2 is associated with Parkinson disease
pramipexol STIMULATES NR4A2
… the increase of Nurr1 gene expression induced by PRX, ... (PMID: 15740846)
… the induction of Nurr1 gene expression by PRX ... (PMID: 15740846)
NR4A2 ASSOCIATED_WITH Parkinson Disease
… lower levels of NURR1 gene expression were associated with significantly increased risk for PD (PMID: 18684475)
Explaining a Relation - Closed Discovery
Closed Discovery – Aligned Relations
Evaluation
• Estimate – based on [Masseroli, BMC Bioinformatics 2006]:
• Extract known facts – baseline precision on 2,042 extracted relations:
– Gene – Disease (causes, assoc_with, …) P=74.2%
– Gene – Gene (inhibits, stimulates, …) P=41.95%
• Propose Argument-Predicate distance for filtering (Gene-Gene):
– At distance no more than 1: P=70.75%; R=43.6%
– At distance no more than 2: P=55.88%; R=66.28%
• We use Argument-Predicate distance to rank semantic relations, so that relations more likely to be correct are shown first.
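The trade-off between the two thresholds can be illustrated with a small Python sketch. It assumes each extracted relation carries an integer `distance` and a manually judged `correct` flag, and that recall is measured against all correct relations in the unfiltered set; the slides do not define either, so both are assumptions. The five records are toy data and do not reproduce the reported figures.

```python
def precision_recall_at(relations, max_distance):
    """Precision and recall after keeping relations with distance <= max_distance.

    Recall is computed relative to the correct relations in the unfiltered
    input, which is an assumption about how R was measured.
    """
    kept = [r for r in relations if r["distance"] <= max_distance]
    total_correct = sum(r["correct"] for r in relations)
    kept_correct = sum(r["correct"] for r in kept)
    precision = kept_correct / len(kept) if kept else 0.0
    recall = kept_correct / total_correct if total_correct else 0.0
    return precision, recall


def rank_by_distance(relations):
    """Order relations so that those with the smallest distance come first."""
    return sorted(relations, key=lambda r: r["distance"])


# Toy data for illustration only, not results from the evaluation.
toy = [
    {"id": "r4", "distance": 3, "correct": True},
    {"id": "r1", "distance": 1, "correct": True},
    {"id": "r2", "distance": 1, "correct": False},
    {"id": "r3", "distance": 2, "correct": True},
    {"id": "r5", "distance": 3, "correct": False},
]

p1, r1 = precision_recall_at(toy, 1)
p2, r2 = precision_recall_at(toy, 2)
ranked_ids = [r["id"] for r in rank_by_distance(toy)]
```

Ranking rather than hard filtering keeps every relation available to the user while still surfacing the likelier ones first.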
Conclusion
• A new bioinformatics tool for interpretation and novel hypotheses generation
• Based on integration of semantic relations extracted from literature with microarrays
• Available at:
• http://sembt.mf.uni-lj.si
Syntactic Processing
Mbd1 can directly regulate the expression of Htr2c
• MedPost tagger and shallow parser
[ NP[head([… inputmatch(mbd1),tag(noun)])], ...
[verb([inputmatch(regulate),lexmatch(regulate),tag(verb)])],...
NP[… head([… inputmatch(htr2c),tag(noun)])] ]
Semantic Processing
• Identify concepts: MetaMap and ABGene
[ NP[head([… semtype(gngm),entrez(MBD1,4152)])], ...
[verb([inputmatch(regulate),lexmatch(regulate),tag(verb)])],...
NP[… head([… semtype(gngm),entrez(HTR2C,3358)])] ]
• Match semantic type patterns to ontology:
<gngm> INTERACTS_WITH <gngm>
• Apply indicator rule: Verb(regulate) INTERACTS_WITH
• Substitute concepts for semantic types:
MBD1 INTERACTS_WITH HTR2C
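The semantic processing steps can be sketched in a few lines of Python. This is a toy illustration and not SemRep itself (which the slides describe as implemented in Prolog): the `INDICATOR_RULES` and `ONTOLOGY_PATTERNS` tables and the `interpret` function are hypothetical, and each holds only the single entry used in this example. For brevity the sketch looks up the indicator rule before checking the semantic type pattern.

```python
# Hypothetical miniature versions of the resources named in the slides.
INDICATOR_RULES = {"regulate": "INTERACTS_WITH"}          # verb lemma -> predicate
ONTOLOGY_PATTERNS = {("gngm", "INTERACTS_WITH", "gngm")}  # allowed type patterns


def interpret(subject, verb, obj):
    """Turn a parsed (subject, verb, object) into a semantic relation.

    subject, obj: (concept, semantic_type) pairs from concept identification.
    verb:         lemma of the verb linking the two noun phrases.
    Returns a (concept, predicate, concept) triple, or None if no rule applies.
    """
    # Apply indicator rule: map the verb to a candidate predicate.
    predicate = INDICATOR_RULES.get(verb)
    if predicate is None:
        return None
    # Match the semantic type pattern against the ontology.
    if (subject[1], predicate, obj[1]) not in ONTOLOGY_PATTERNS:
        return None
    # Substitute concepts for semantic types.
    return (subject[0], predicate, obj[0])


relation = interpret(("MBD1", "gngm"), "regulate", ("HTR2C", "gngm"))
print(" ".join(relation))
```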