Post on 22-Nov-2014
ENCODE: Encyclopedia of DNA Elements
Outline
What and who is ENCODE
Key ENCODE topics and most important papers for our research
ENCODE data – make use of the encyclopedia…
Maté Ongenaert
What and who is ENCODE
Main aims, funding and the institutions/labs behind the 200 M $
Who?
- International consortium
- Funded by NHGRI (National Human Genome Research Institute): 200 million dollars
Main collaborators (for human data)
- Broad Institute (ChIP-seq)
- HudsonAlpha Institute for Biotechnology (methylation)
- Sanger Institute (RNA-seq)
- Duke University (DNase)
- Yale University (Pol II)
- EBI (data analysis)
Main aims
“Build a comprehensive parts list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control cells and circumstances in which a gene is active”
What’s so hot… It has been running for years?
- Pilot project on 1% of the genome (launched in 2003, results published in 2007)
- 2007-2012: scale-up to the whole genome, with new technologies introduced since the pilot
- Higher throughput
- Genome-wide
- Many more samples and different tissues (different ‘tiers’, see later)
- Better data analysis and integration
Worldwide press attention… and criticisms
- “Popular” media focus on the “junk DNA” aspect
- The authors also claim in their press release that more than 80% of the genome is ‘biologically active’ (‘biologically active’ ≠ may be involved in regulation in one way or another ≠ junk DNA)
- ENCODE reveals for the first time many of the factors in the very complex switchboard controlling expression / …
30 (!) research papers published in three journals at the same time
Key ENCODE topics
Main ENCODE topics and selection of most important papers
Key topics
- Transcription factor binding motifs
- Chromatin patterns at transcription factor binding sites
- Characterization of intergenic regions and gene definitions
- RNA and chromatin modification patterns around promoters
- Epigenetic regulation of RNA processing
- Non-coding RNA characterization
- DNA methylation
- Enhancer discovery and characterization
- 3D connections across the genome
- Characterization of network topology
- Machine learning approaches to genomics
- Impact of functional information on understanding variation
- Impact of evolutionary selection on functional regions
Main paper
- 95% of the genome lies within 8 kilobases (kb) of a DNA–protein interaction
- Classifying the genome into seven chromatin states indicates an initial set of 399,124 regions with enhancer-like features and 70,292 regions with promoter-like features
- It is possible to correlate quantitatively RNA sequence production and processing with both chromatin marks and transcription factor binding at promoters, indicating that promoter functionality can explain most of the variation in RNA expression
- Single nucleotide polymorphisms (SNPs) associated with disease by GWAS are enriched within non-coding functional elements, with a majority residing in or near ENCODE-defined regions that are outside of protein-coding genes. In many cases, the disease phenotypes can be associated with a specific cell type or transcription factor
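A statement such as "95% of the genome lies within 8 kb of a DNA–protein interaction" is a coverage calculation: pad every binding site by the chosen distance, merge the padded intervals, and divide the covered bases by the genome length. The sketch below illustrates that arithmetic on a made-up chromosome. The function names, the toy coordinates and the 20 bp distance are assumptions for illustration only, not the consortium's pipeline or data.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def fraction_within(sites, chrom_length, distance):
    """Fraction of bases lying within `distance` bp of any site."""
    padded = [
        (max(0, start - distance), min(chrom_length, end + distance))
        for start, end in sites
    ]
    covered = sum(end - start for start, end in merge_intervals(padded))
    return covered / chrom_length


# Hypothetical binding sites on a toy 1,000 bp chromosome.
toy_sites = [(100, 110), (150, 160), (900, 905)]
toy_fraction = fraction_within(toy_sites, chrom_length=1000, distance=20)
print(f"{toy_fraction:.1%} of the toy chromosome is within 20 bp of a site")
```

For real data the same calculation would be run per chromosome and summed over the genome, with the distance set to 8,000.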
Techniques used:
- RNA-seq
- ChIP-seq
- DNase-seq
- DNA methylation arrays and bisulfite sequencing
- FAIRE-seq
Tier 1: three cell lines (K562, GM12878, H1 hESC)
Tier 2: cell line panel (HeLa-S3, HepG2, HUVECs)
Tier 3: all other cell types
Total: 1640 datasets / 147 different cell types
Expression – chromatin state
Expression – transcription factors
Chromatin state patterns at transcription-factor binding sites
Co-association between transcription factors (K562)
Insight into genomic variation – allele-specific variation
Overlap SNPs with regulatory elements
Overlap SNPs with regulatory elements and ‘open’ chromatin
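Overlapping SNPs with regulatory elements comes down to an interval lookup: for each SNP position, check whether it falls inside any annotated region on the same chromosome. The sketch below shows one simple way to do this with sorted, merged intervals and a binary search. All identifiers, SNP names and coordinates are invented for illustration; dedicated tools would normally be used for genome-scale data.

```python
from bisect import bisect_right


def build_index(regions):
    """Sort and merge half-open (start, end) regions of one chromosome."""
    merged = []
    for start, end in sorted(regions):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    starts = [start for start, _ in merged]
    return starts, merged


def overlaps(position, index):
    """True if a 0-based position falls inside any indexed region."""
    starts, merged = index
    i = bisect_right(starts, position) - 1  # last region starting at or before position
    return i >= 0 and position < merged[i][1]


# Hypothetical regulatory elements and SNPs (toy coordinates).
regulatory = {"chr1": [(1000, 1500), (1400, 2000), (5000, 5200)]}
snps = [
    ("rs_toy1", "chr1", 1450),
    ("rs_toy2", "chr1", 3000),
    ("rs_toy3", "chr1", 5199),
    ("rs_toy4", "chr2", 100),
]

indexes = {chrom: build_index(regions) for chrom, regions in regulatory.items()}
hits = [
    name
    for name, chrom, pos in snps
    if chrom in indexes and overlaps(pos, indexes[chrom])
]
print(hits)
```

Merging the regions first means each lookup only has to inspect one candidate interval, which keeps the check fast even for many SNPs.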
Other important papers to us
Accessible chromatin landscape
- DNase I treatment
- Combined analysis with TFs and H3K4me3
- Identification of “accessible” chromatin regions
Accessible chromatin landscape – location of accessible regions
Accessible chromatin landscape – association with ChIP-seq and TFs
Accessible chromatin landscape – novel transcripts
Landscape of transcription
RNA-seq
Get a grip on what is transcribed, including novel transcripts and RNAs
Landscape of transcription – nucleolar fraction vs. whole cell
Long-range interaction of promoters
5C mapping (chromatin interaction mapping technology)
Long-range interactions of promoter regions
Transcriptional regulation
ChIP-seq versus expression detection
Predict transcriptional regulation
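The idea of relating ChIP-seq signal to measured expression can be illustrated with the simplest possible model: a Pearson correlation and a one-variable least-squares line between a binding signal at promoters and the expression of the corresponding genes. The numbers below are invented toy values and the function names are assumptions. The ENCODE analyses used far richer models with many factors, so this is only a sketch of the principle.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Toy values: promoter ChIP-seq signal and log expression for five genes.
tf_signal = [1.0, 2.0, 3.0, 4.0, 5.0]
expression = [1.1, 1.9, 3.2, 3.8, 5.0]

r = pearson(tf_signal, expression)
slope, intercept = fit_line(tf_signal, expression)
predicted = [slope * x + intercept for x in tf_signal]
print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}")
```

A high correlation on held-out genes, rather than on the genes used for fitting, is what would support a claim that binding signal predicts expression.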
Transcriptional regulation – predict transcription
Transcriptional regulation – expression prediction
Transcriptional regulation – TFs predict location of histone modifications
Transcriptional regulation – model
Cell-type specific gene expression from open chromatin regions
Cell-type specific TF binding
SNPs in regulatory regions
TF binding - interactions
TF binding – cell-type specificity
Classification of genomic regions
ENCODE data
Data availability
All data is available, from raw data to final processed data
For end users:
- Tracks in the UCSC browser with desired level of detail: visualize tracks and explore genomic context
For end users and bio-IT:
- In the UCSC “Table browser” and other UCSC tools: export genomic information, including processed data
For advanced users and bio-IT:
- Raw data and semi-processed data in GEO and others
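Processed regions exported from the Table browser are commonly saved in BED format, where each line holds a chromosome, a 0-based start, an exclusive end and optional extra columns. The sketch below shows one minimal way to read such an export and summarize it per chromosome. It assumes tab-separated columns; the track name and peak coordinates are made up and stand in for a real downloaded file.

```python
import io
from collections import Counter


def read_bed(handle):
    """Yield (chrom, start, end, name) from BED text.

    Start is 0-based and end is exclusive, as in the BED convention.
    """
    for line in handle:
        line = line.strip()
        # Skip blank lines, comments and browser/track header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        name = fields[3] if len(fields) > 3 else "."
        yield fields[0], int(fields[1]), int(fields[2]), name


# Toy stand-in for an exported file; a real export would be opened with open().
toy_bed = io.StringIO(
    "track name=toy_peaks\n"
    "chr1\t100\t250\tpeak1\n"
    "chr1\t400\t420\tpeak2\n"
    "chr2\t10\t60\tpeak3\n"
)

bases_per_chrom = Counter()
for chrom, start, end, _name in read_bed(toy_bed):
    bases_per_chrom[chrom] += end - start
print(dict(bases_per_chrom))
```

Because the end coordinate is exclusive, the length of a region is simply end minus start, with no off-by-one correction.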
Tracks in the UCSC browser with desired level of detail
Tracks in the UCSC table browser
Raw data