GREAT improves functional interpretation of cis-regulatory regions
Cory Y McLean (1), Dave Bristor (1,2), Michael Hiller (2), Shoa L Clarke (3), Bruce T Schaar (2), Craig B Lowe (4), Aaron M Wenger (1), Gill Bejerano (1,2)
Departments of 1. Computer Science, 2. Developmental Biology, 3. Genetics, Stanford University
4. Department of Biomolecular Engineering, University of California Santa Cruz
ChIP-seq identifies functional
non-coding regions of the genome
Gene-based enrichment tools
are inappropriate to analyze
cis-regulatory elements
• Variation in gene distribution makes genes with no others nearby
more likely to be selected than genes in clusters [2,3]
• Shown are results from
sets scattered randomly
throughout the genome,
associating regions with the
nearest genes within 1 Mb
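The bias described above can be reproduced with a small simulation. The following is a minimal sketch, not the analysis behind the poster's figure: the toy genome, the gene positions, and the helper names `nearest_gene` and `simulate_hits` are all assumptions made for illustration. Random positions are scattered over the genome and each is assigned to the nearest gene within 1 Mb.

```python
import bisect
import random


def nearest_gene(tss_positions, position, max_distance):
    """Return the index of the nearest TSS within max_distance, or None.

    tss_positions must be sorted in ascending order.
    """
    i = bisect.bisect_left(tss_positions, position)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(tss_positions)]
    best = min(candidates, key=lambda j: abs(tss_positions[j] - position))
    if abs(tss_positions[best] - position) > max_distance:
        return None
    return best


def simulate_hits(tss_positions, genome_length, n_regions, max_distance, seed=0):
    """Scatter n_regions random positions and count hits per gene."""
    rng = random.Random(seed)
    hits = [0] * len(tss_positions)
    for _ in range(n_regions):
        position = rng.randrange(genome_length)
        gene = nearest_gene(tss_positions, position, max_distance)
        if gene is not None:
            hits[gene] += 1
    return hits


# Hypothetical toy genome: a tight cluster of five genes plus one isolated gene.
GENOME_LENGTH = 10_000_000
TSS = [1_000_000, 1_010_000, 1_020_000, 1_030_000, 1_040_000, 6_000_000]

hits = simulate_hits(TSS, GENOME_LENGTH, n_regions=20_000, max_distance=1_000_000)
for tss, count in zip(TSS, hits):
    print(f"gene at {tss:>9,}: {count} random regions assigned")
```

In this toy setup the isolated gene collects far more random regions than any gene inside the cluster, even though no position was chosen with any gene in mind.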
1. Fields, S., Science, 2007
2. Lowe et al., Proc. Natl. Acad. Sci., 2007
3. Taher, L. & Ovcharenko, I., Bioinformatics, 2009
GREAT supports many ontologies
in both human and mouse
• Human: NCBI Build 36.1 (hg18)
• Mouse: NCBI Build 37 (mm9)
• Twenty supported ontologies span Gene Ontology,
pathways, gene expression, regulatory motifs, phenotypes
and human disease, and gene families
Example: GREAT infers many
functions of Serum Response
Factor (SRF) from its binding profile
• ChIP-seq identified SRF binding profile in human [5]
• Gene-based enrichment analysis identified only general
terms as highly enriched [5]
• GREAT detects enrichments for specific functions of SRF
GREAT is easy to navigate and
provides detailed information
• GREAT accurately assesses statistical enrichments of
cis-regulatory sequences such as those generated by
ChIP-seq, open chromatin, comparative genomics, etc.
• GREAT supports 20 diverse ontologies for both human
and mouse
• Application to multiple transcription-associated factors
in a variety of contexts shows both detailed enrichment
for known functions and potential avenues for further
investigation of the assayed factors [4]
• Online tool available at http://great.stanford.edu
• ChIP-seq peaks identify cis-regulatory elements of interest
(transcription factor binding sites, methylation domains, etc.)
• Identified regions work in cis to affect expression of nearby
genes
http://great.stanford.edu
Input: BED regions of interest
Advanced options: Alter association rules
between genomic regions and genes
Output: Ontology Term Enrichments
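As an illustration of the input format, here is a minimal sketch of reading BED regions before submitting them. It assumes only the three required BED columns (chromosome, 0-based start, exclusive end); the `Region` type and `parse_bed` function are hypothetical names for this example and are not part of GREAT.

```python
from typing import Iterable, Iterator, NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def parse_bed(lines: Iterable[str]) -> Iterator[Region]:
    """Yield regions from BED lines, skipping blank, comment and header lines."""
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"line {number}: expected at least 3 columns")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if start < 0 or end < start:
            raise ValueError(f"line {number}: invalid interval {start}-{end}")
        yield Region(chrom, start, end)


# Made-up example input, not data from the poster.
example = [
    "track name=peaks",
    "chr1\t100\t200\tpeak1",
    "",
    "chr2 5 50",
]
regions = list(parse_bed(example))
print(regions)
```

Validating the intervals up front makes it easier to spot malformed lines before any enrichment is computed.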
Summary
4. McLean, C.Y. et al., Nat. Biotechnol., 2010
5. Valouev, A. et al., Nat. Methods, 2008
6. Kent, W. et al., Genome Res., 2002
Gene-based GO Enrichments of SRF Peaks [5]
GREAT Ontology Enrichments of SRF Peaks
[adapted from ref. 1]
Seamless integration with UCSC Genome Browser [6]
visualization tools
Ontology                 Term                                       Binomial P-value
GO: Cellular Component   actin cytoskeleton                         6.91 x 10^-9
GO: Cellular Component   cortical cytoskeleton                      4.03 x 10^-6
GO: Mol. Func.           actin binding                              5.21 x 10^-5
TF Targets               Targets of SRF                             4.97 x 10^-76
TF Targets               Targets of YY1                             1.45 x 10^-6
TF Targets               Targets of E2F4 and p130                   4.73 x 10^-3
Promoter Motifs          SRF variants                               4.54 x 10^-28 to 4.19 x 10^-12
Promoter Motifs          GABPA/GABPB                                4.20 x 10^-9
Promoter Motifs          Motif NGGGACTTTCCA                         1.02 x 10^-4
Promoter Motifs          EGR1                                       1.71 x 10^-4
Pathway Commons          TRAIL signaling                            2.37 x 10^-7
Pathway Commons          Class I PI3K signaling                     9.92 x 10^-7
TreeFam                  FOSL2 / JDP2 / FOS / FOSL1 / FOSB / ATF3   9.66 x 10^-9

Hypothesis column: location (both GO: Cellular Component terms), function (GO: Mol. Func.), co-regulator (six of the seven TF Targets and Promoter Motifs terms), pathway (both Pathway Commons terms), gene family (TreeFam).
The table's fifth column, Experimental support, cites literature or marks a term as novel.
Legend: Novel testable hypothesis; Positive control; Known from literature*
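The poster reports binomial P-values but does not spell out the calculation. One way to compute such a value, sketched here under stated assumptions, is an upper-tail binomial test over the input regions: `n` is the number of input regions, `k` the number that fall in the regulatory domain of a gene annotated with the term, and `p` the fraction of the genome covered by those domains. The function name `binomial_tail` and the example numbers are hypothetical.

```python
import math


def binomial_tail(n: int, k: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p), summed in log space."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if k <= 0:
        return 1.0
    if k > n:
        return 0.0
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * math.log(p) + (n - i) * math.log1p(-p)
        for i in range(k, n + 1)
    ]
    largest = max(log_terms)
    total = math.exp(largest) * sum(math.exp(t - largest) for t in log_terms)
    return min(1.0, total)


# Made-up numbers: 1,000 regions, 1% of the genome annotated, 40 regions hit.
p_value = binomial_tail(n=1000, k=40, p=0.01)
print(f"binomial P-value: {p_value:.3g}")
```

Working in log space keeps the sum usable for thousands of regions, where the binomial coefficients would otherwise overflow a float.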
1. Current methods ignore distal binding
2. Including distal binding using
gene-based tools results in bias
3. Distal binding events comprise a
large fraction of all binding events
Details page for “actin cytoskeleton”
• Genes annotated as “actin cytoskeleton” with associated genomic regions nearby
• Genomic regions annotated as “actin cytoskeleton”
• Frame holding http://www.geneontology.org definition of “actin cytoskeleton”
• All input genomic regions
Experimental support column of the SRF enrichment table, in the order listed on the poster: Miano et al 2007 (three terms); Natesan & Gilman 1995; Novel (four terms); Bertolotto et al 2000; Poser et al 2000; Chai & Tarnawski 2002
• 20 ontologies
• Term statistics
• Display filters
• Data export options
• Multiple hypothesis correction options
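The poster does not name the correction methods on offer. As an illustration of what multiple hypothesis correction does to a list of term P-values, the sketch below implements two standard choices, Bonferroni and Benjamini-Hochberg; selecting these two is an assumption, and the function names and input values are hypothetical.

```python
def bonferroni(p_values):
    """Multiply each P-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down so each adjusted value is monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up P-values for four terms.
raw = [0.01, 0.04, 0.03, 0.005]
print(bonferroni(raw))
print(benjamini_hochberg(raw))
```

Bonferroni controls the chance of any false positive and is the stricter of the two; Benjamini-Hochberg controls the expected fraction of false positives among the reported terms.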
H: human, M: mouse
• SRF, NRSF, GABP from [4]
• Stat3, p300 in ESC from
Chen et al., 2008
• p300 in other tissues from
Visel et al., 2009
The Genomic Regions Enrichment of Annotations Tool (GREAT) [4]
accurately analyzes cis-regulatory element enrichments
* Known interactions typically implicate a small subset of genes that GREAT identifies as
potentially related to the processes of interest.
Version 1.2 interface
Open API allows submission of data from other tools as well
ChIP-seq workflow:
1. Crosslink proteins to DNA and lyse cells
2. Fragment chromatin, add a protein-specific antibody and purify protein-DNA complexes
3. Reverse crosslinks, sequence isolated DNA and map reads to genome