Motif Enrichment Analysis in Co-Expressed Gene Sets and High-Throughput Sequence Sets
www.cmmt.ubc.ca
MOTIF ENRICHMENT ANALYSIS IN CO-EXPRESSED GENE SETS AND HIGH-THROUGHPUT SEQUENCE SETS
Wyeth Wasserman
Jan. 18, 2012
opossum.cisreg.ca/oPOSSUM3
Welcome
• If you encounter any technical difficulties during the webinar
– Type a report using the chat option
• Slide presentation ~20 min
• Questions will be compiled as they are submitted and answered during the final Q&A/discussion period
• During the discussion session, audience members will be able to speak
Webinar Format
• Introduction
• Walk-Through
• Summary
• Q&A
INTRODUCTION
Overview
• Given co-expressed gene sets, what are the key mediators of co-expression?
– Focus on TFs
• Web-based software system for motif enrichment analysis
– Co-expressed genes or sequences
– Multiple sets of analysis methods
– Available for human, mouse, fly, worm, yeast
Motif Enrichment Analysis
[Figure: bar chart of the proportion of genes containing each TFBS (y-axis, 0 to 1) in the Background and Target sets, for TFBS1 (p=0.04), TFBS2 (p=0.55) and TFBS3 (p=0.66).]
Finds over-represented TFBS in co-expressed gene sets
What do we need?
• Region selection
– Where to look for enriched binding sites
– Use conservation filter to restrict search space
• TFBS profiles to search for
– Need a pool of validated profiles
• Scoring metrics for enrichment
– How to measure motif over-representation
Conserved Region Selection
[Figure: phastCons score plotted against genomic position around a gene; stretches scoring above the threshold form conserved regions CR1 to CR4.]
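The thresholding idea on this slide can be sketched in a few lines of Python. This is a minimal illustration, not oPOSSUM's implementation: the function name `conserved_regions`, the `min_length` parameter and the toy score values are assumptions made for the example. Only the idea of keeping stretches whose phastCons score clears a threshold comes from the slide.

```python
def conserved_regions(scores, threshold, min_length=1):
    """Return half-open (start, end) index ranges where every score >= threshold.

    `scores` is a per-base list of conservation scores (hypothetical input).
    `min_length` (an assumed parameter) drops very short stretches.
    """
    regions = []
    start = None  # start index of the stretch currently being extended
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                regions.append((start, i))
            start = None
    # Close a stretch that runs to the end of the sequence.
    if start is not None and len(scores) - start >= min_length:
        regions.append((start, len(scores)))
    return regions


# Toy per-base scores, not real phastCons data.
scores = [0.1, 0.8, 0.9, 0.2, 0.7, 0.75, 0.8, 0.1]
regions = conserved_regions(scores, threshold=0.6, min_length=2)
print(regions)  # [(1, 3), (4, 7)]
```

Restricting the TFBS search to the returned ranges is what shrinks the search space.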
TFBS Profiles
• JASPAR 2010: Portales-Casamar et al. Nucleic Acids Research 2009.
• Expanded collection of TFBS profiles
– 130 vertebrate profiles
– 105 insect profiles
– 5 nematode profiles
– 177 yeast profiles
– PBM (104), PBM_HOMEO (176), PBM_BHLH (19)
• Standardized 2-level TF classification (class, family)
Scoring Metrics
• Z scores
– Based on the number of occurrences of the TFBS relative to background
– Normalized for sequence length
– Simple binomial distribution model
• Fisher scores
– Fisher exact probability test
– Fisher score = -log(Fisher p-value)
– Based on the number of genes containing the TFBS relative to background
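As a rough illustration of the Z-score bullets, the sketch below applies a normal approximation to a binomial model. It is a generic textbook form, not necessarily the exact formula oPOSSUM uses: treating each target nucleotide as one trial, the function name `binomial_z_score` and the toy counts are all assumptions made for this example.

```python
import math


def binomial_z_score(target_hits, target_length, background_hits, background_length):
    """Z-score of the target hit count under a simple binomial model.

    Assumption: each nucleotide of the target sequence is one trial, and the
    per-trial hit probability is the background hit rate per nucleotide.
    Dividing by sequence length is what normalizes for length.
    """
    p = background_hits / background_length   # background hit rate
    expected = target_length * p              # binomial mean
    sd = math.sqrt(target_length * p * (1.0 - p))  # binomial standard deviation
    return (target_hits - expected) / sd


# Toy numbers, not taken from the slides.
z = binomial_z_score(target_hits=30, target_length=10_000,
                     background_hits=1_000, background_length=1_000_000)
print(round(z, 2))
```

A large positive value means the motif occurs more often in the target sequences than the background rate predicts.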
Additional Metric for Seq-Based Analysis
• KS scores
– Kolmogorov-Smirnov test
– Compares the empirical distribution of the distances of the binding sites from the maximum point of confidence (MPC) to the background
– Expect real binding sites to be centered around the MPC
– KS score = -log(KS test p-value)
[Figure: foreground and background distributions relative to the MPC.]
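The KS score can be sketched with a two-sample Kolmogorov-Smirnov test written against the standard library. This is an illustration under stated assumptions rather than oPOSSUM's code: the p-value uses the usual asymptotic series approximation, the natural logarithm is assumed for "-log", and the function names and distance values are invented for the example.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_pvalue(d, n, m):
    """Asymptotic two-sided p-value (standard series approximation)."""
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:  # the series is unreliable here; p is effectively 1
        return 1.0
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, 101))
    return min(1.0, max(0.0, 2.0 * total))


def ks_score(foreground, background):
    """KS score = -log(p-value); natural log is an assumption."""
    d = ks_statistic(foreground, background)
    p = ks_pvalue(d, len(foreground), len(background))
    return math.inf if p == 0.0 else -math.log(p)


# Hypothetical distances (bp) of predicted sites from the MPC.
foreground = list(range(1, 21))       # sites clustered near the MPC
background = list(range(5, 200, 10))  # sites spread across the sequence
print(ks_statistic(foreground, background), ks_score(foreground, background))
```

A high score indicates that foreground sites sit closer to the MPC than background sites do, which is what the slide expects of real binding sites.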
Analysis Methods
WALK-THROUGH
http://opossum.cisreg.ca/oPOSSUM3
Human SSA - Input
Human SSA - Results
TF: HNF1A
JASPAR ID: MA0046.1
Class: Helix-Turn-Helix
Family: Homeo
Tax Group: Vertebrates
IC: 15.548
GC Content: 0.259
Target Gene Hits: 19
Target Gene Non-Hits: 36
Background Gene Hits: 1113
Background Gene Non-Hits: 3887
Target TFBS Hits: 41
Target TFBS Nucleotide Rate: 0.0269
Background TFBS Hits: 2127
Background TFBS Nucleotide Rate: 0.009
Z-score: 15.134
Fisher score: 3.646
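The gene hit counts reported for HNF1A can be fed into a one-tailed Fisher exact test to see how a Fisher score is derived. The sketch below makes several assumptions the slides do not confirm: the 2x2 table is built directly from the four gene counts, the test is one-tailed (over-representation only), and "-log" is the natural logarithm. The function name is hypothetical, and the result is meant for comparison with the reported 3.646, not as a guaranteed reproduction.

```python
from math import comb, log


def fisher_one_tailed(target_hits, target_non_hits, bg_hits, bg_non_hits):
    """P(at least `target_hits` hits in the target set) under the
    hypergeometric null, i.e. a one-tailed Fisher exact test."""
    n = target_hits + target_non_hits   # target set size
    k_total = target_hits + bg_hits     # genes with a hit, both sets combined
    total = n + bg_hits + bg_non_hits   # all genes in the table
    upper = min(n, k_total)
    # Exact integer arithmetic; converted to float only at the final division.
    numerator = sum(comb(k_total, k) * comb(total - k_total, n - k)
                    for k in range(target_hits, upper + 1))
    return numerator / comb(total, n)


# Gene hit counts reported on the slide for HNF1A.
p_value = fisher_one_tailed(19, 36, 1113, 3887)
fisher_score = -log(p_value)  # natural log is an assumption
print(p_value, fisher_score)
```

Here 19 of 55 target genes (about 35%) contain the site, against 1113 of 5000 background genes (about 22%), and the test asks how surprising that difference is.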
oPOSSUM methods
Human aCSA - Input
Human aCSA - Results
TFBS Cluster Analysis
[Figure: TFBS profiles and the cluster they form.]
TFBS Cluster Analysis (TCA)
[Figure: TFBSs predicted in conserved regions CR1 to CR4 of a gene are merged into TFBS cluster hits.]
• Over-representation analysis is based on merged TFBS cluster hits
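The merge step can be pictured as collapsing overlapping site predictions from profiles in the same cluster into single cluster hits. The sketch below is an assumed interpretation for illustration only: the slides do not spell out the merge rule, so merging on any overlap or touching coordinates, the function name `merge_cluster_hits` and the coordinates are all hypothetical.

```python
def merge_cluster_hits(hits):
    """Merge overlapping or touching (start, end) hits into cluster hits.

    `hits` are predicted sites for all profiles of one TFBS cluster
    (hypothetical coordinates). Returns the merged list, sorted by start.
    """
    merged = []
    for start, end in sorted(hits):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous cluster hit: extend it.
            previous_start, previous_end = merged[-1]
            merged[-1] = (previous_start, max(previous_end, end))
        else:
            merged.append((start, end))
    return merged


# Toy site predictions from several profiles in one cluster.
hits = [(10, 20), (15, 25), (40, 50), (24, 30)]
print(merge_cluster_hits(hits))  # [(10, 30), (40, 50)]
```

Counting merged hits instead of raw hits keeps similar profiles in a cluster from inflating the enrichment counts for the same stretch of sequence.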
Human TCA – TFBS cluster selection
Human TCA - Results
TFCluster Info Page
Seq SSA - Input
Seq SSA - Results
KS score
Seq TCA - Input
SUMMARY
oPOSSUM-3
• Web-based system for motif enrichment analysis in co-expressed gene sets and sequences from high-throughput experiments
• Important functionalities
– Gene-based vs. sequence-based
– Single site vs. anchored combination site
– Individual vs. clusters of TFBS profiles
– Human, mouse, fly, worm and yeast
Development Team
• Version 1: Ho Sui, SJ; Mortimer, JR; Arenillas, DJ; Brumm, J; Walsh, CJ; Kennedy, BP; Wasserman, WW
• CSA: Huang, S; Fulton, DL; Arenillas, DJ; Perco, P; Ho Sui, SJ; Mortimer, JR; Wasserman, WW
• Version 2: Ho Sui, SJ; Fulton, DL; Arenillas, DJ; Kwon, AT; Wasserman, WW
• Version 3: Kwon, AT; Arenillas, DJ; Worsley Hunt, R; Wasserman, WW
QUESTIONS & ANSWERS
Please take a moment to type questions/comments into the chat box. The questions will be answered shortly.