Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics...
-
Upload
ross-linton -
Category
Documents
-
view
217 -
download
0
Transcript of Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics...
![Page 1: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/1.jpg)
Gene SetEnrichment Analysis
Genome 559: Introduction to Statistical and Computational Genomics
Elhanan Borenstein
![Page 2: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/2.jpg)
Gene expression profiling Which molecular processes/functions
are involved in a certain phenotype (e.g., disease, stress response, etc.)
The Gene Ontology (GO) Project Provides shared vocabulary/annotation Terms are linked in a complex structure
Enrichment analysis: Find the “most” differentially expressed
genes Identify over-represented annotations Modified Fisher's exact test
A quick review
![Page 3: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/3.jpg)
Enrichment AnalysisClassA ClassB
Gen
es ra
nked
by
expr
essi
on c
orre
latio
n to
Cla
ss A
Cutoff
Biologicalfunction?
![Page 4: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/4.jpg)
Enrichment AnalysisClassA ClassB
Gen
es ra
nked
by
expr
essi
on c
orre
latio
n to
Cla
ss A
Cutoff
Biologicalfunction?
2 /
10
Func
tion
1(e
.g.,
met
abol
ism
)
5 /
11
Func
tion
2(e
.g.,
sign
alin
g)
3 /
10
Func
tion
3(e
.g.,
regu
latio
n)
![Page 5: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/5.jpg)
After correcting for multiple hypotheses testing, no individual gene may meet the threshold due to noise.
Alternatively, one may be left with a long list of significant genes without any unifying biological theme.
The cutoff value is often arbitrary!
We are really examining only a handful of genes, totally ignoringmuch of the data
Problems with cutoff-based analysis
![Page 6: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/6.jpg)
MIT, Broad Institute V 2.0 available since Jan 2007
Gene Set Enrichment Analysis
(Subramanian et al. PNAS. 2005.)
![Page 7: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/7.jpg)
Calculates a score for the enrichment of a entire set of genes rather than single genes!
Does not require setting a cutoff!
Identifies the set of relevant genes as part of the analysis!
Provides a more robust statistical framework!
GSEA key features
![Page 8: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/8.jpg)
Gene Set Enrichment AnalysisClassA ClassB
Gen
es ra
nked
by
expr
essi
on c
orre
latio
n to
Cla
ss A
Cutoff
Biologicalfunction?
2 /
10
5 /
11
3 /
10
Func
tion
1(e
.g.,
met
abol
ism
)
Func
tion
2(e
.g.,
sign
alin
g)
Func
tion
3(e
.g.,
regu
latio
n)
![Page 9: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/9.jpg)
Gene Set Enrichment AnalysisClassA ClassB
Gen
es ra
nked
by
expr
essi
on c
orre
latio
n to
Cla
ss A
Running sum:Increase when gene is in set
Decrease otherwise
Func
tion
1(e
.g.,
met
abol
ism
)
Func
tion
2(e
.g.,
sign
alin
g)
Func
tion
3(e
.g.,
regu
latio
n)
![Page 10: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/10.jpg)
Gene Set Enrichment Analysis
What would you expect if the hits were randomly distributed?
What would you expect if most of the hits cluster at the top of the list?
![Page 11: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/11.jpg)
Gene Set Enrichment Analysis
Genes within functional set
(hits)
Running sum
Enrichment score (ES) = max deviation from 0
Leading Edge genes
![Page 12: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/12.jpg)
Gene Set Enrichment AnalysisLow ES (evenly distributed)ES = 0.43 ES = -0.45
![Page 13: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/13.jpg)
Ducray et al. Molecular Cancer 2008 7:41
Gene Set Enrichment Analysis
![Page 14: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/14.jpg)
1. Calculation of an enrichment score(ES) for each functional category
2. Estimation of significance level of the ES An empirical permutation test
Phenotype labels are shuffled and the ES for this functional set is recomputed. Repeat 1000 times.
Generating a null distribution
3. Adjustment for multiple hypotheses testing Necessary if comparing multiple gene sets (i.e.,functions)
Computes FDR (false discovery rate)
GSEA Steps
![Page 15: Gene Set Enrichment Analysis Genome 559: Introduction to Statistical and Computational Genomics Elhanan Borenstein.](https://reader038.fdocuments.in/reader038/viewer/2022103121/56649c5b5503460f94905582/html5/thumbnails/15.jpg)