ENCODE project: brief summary of main findings


description

A brief summary of the ENCODE project and its main findings: the most important publications for cancer researchers and how to make use of the ENCODE data.

Transcript of ENCODE project: brief summary of main findings

Page 1: ENCODE project: brief summary of main findings

ENCODE: Encyclopedia of DNA Elements

Outline

What and who is ENCODE

Key ENCODE topics and most important papers for our research

ENCODE data – make use of the encyclopedia…

Maté Ongenaert

Page 2: ENCODE project: brief summary of main findings

What and who is ENCODE

Main aims, funding and the institutions/labs behind the 200 M $

Who?

International consortium

Funded by NHGRI (National Human Genome Research Institute): 200 million dollars

Main collaborators (for human data):

- Broad Institute (ChIP-seq)
- HudsonAlpha Institute for Biotechnology (methylation)
- Sanger Institute (RNA-seq)
- Duke University (DNase)
- Yale University (Pol II)
- EBI (data analysis)

Main aims

“Build a comprehensive parts list of functional elements in the human genome, including elements that act at the protein and RNA levels, and regulatory elements that control cells and circumstances in which a gene is active”

Page 3: ENCODE project: brief summary of main findings

What’s so hot… It has been running for years?

Pilot project (1% of the genome): launched in 2003, results published in 2007

2007-2012: since the pilot, introduction of new technologies:

- Higher throughput
- Genome-wide coverage
- Many more samples and different tissues (different ‘tiers’, see later)
- Better data analysis and integration

Page 4: ENCODE project: brief summary of main findings

Worldwide press attention

Page 5: ENCODE project: brief summary of main findings

Worldwide press attention… and criticisms

“Popular” media focus on the “junk DNA aspect”

The authors also claim in their press release that > 80% of the genome is ‘biologically active’ (versus ‘may be involved in regulation in one way or another’ versus ‘junk DNA’)

ENCODE reveals for the first time many of the factors in the very complex switchboard controlling expression / …

Page 6: ENCODE project: brief summary of main findings

30 (!) research papers published in three journals at the same time

Page 8: ENCODE project: brief summary of main findings

Key ENCODE topics

Main ENCODE topics and selection of most important papers

Key topics

- Transcription factor binding motifs
- Chromatin patterns at transcription factor binding sites
- Characterization of intergenic regions and gene definitions
- RNA and chromatin modification patterns around promoters
- Epigenetic regulation of RNA processing
- Non-coding RNA characterization
- DNA methylation
- Enhancer discovery and characterization
- 3D connections across the genome
- Characterization of network topology
- Machine learning approaches to genomics
- Impact of functional information on understanding variation
- Impact of evolutionary selection on functional regions

Page 9: ENCODE project: brief summary of main findings

Main paper

- 95% of the genome lies within 8 kilobases (kb) of a DNA–protein interaction
- Classifying the genome into seven chromatin states indicates an initial set of 399,124 regions with enhancer-like features and 70,292 regions with promoter-like features
- It is possible to correlate quantitatively RNA sequence production and processing with both chromatin marks and transcription factor binding at promoters, indicating that promoter functionality can explain most of the variation in RNA expression
- Single nucleotide polymorphisms (SNPs) associated with disease by GWAS are enriched within non-coding functional elements, with a majority residing in or near ENCODE-defined regions that are outside of protein-coding genes. In many cases, the disease phenotypes can be associated with a specific cell type or transcription factor
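To make the "within 8 kb of a DNA–protein interaction" statistic concrete, here is a minimal Python sketch of how such a coverage fraction could be computed for one chromosome: pad every binding site by a window, merge the padded intervals, and divide the covered bases by the chromosome length. The function names and the toy coordinates are hypothetical illustrations, not ENCODE data or the consortium's actual pipeline.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def fraction_within(sites, chrom_length, window):
    """Fraction of bases on one chromosome within `window` bp of any site.

    `sites` are half-open (start, end) intervals in 0-based coordinates.
    """
    padded = [
        (max(0, start - window), min(chrom_length, end + window))
        for start, end in sites
    ]
    covered = sum(end - start for start, end in merge_intervals(padded))
    return covered / chrom_length


# Toy data (hypothetical, not ENCODE measurements): three binding sites
# on a 100 kb chromosome, padded by 8 kb on each side.
toy_sites = [(10_000, 10_200), (30_000, 30_150), (90_000, 90_100)]
toy_fraction = fraction_within(toy_sites, chrom_length=100_000, window=8_000)
print(f"Fraction of the toy chromosome within 8 kb of a site: {toy_fraction}")
```

Merging before summing matters: neighbouring sites produce overlapping padded windows, and counting those bases twice would overstate the covered fraction.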

Page 10: ENCODE project: brief summary of main findings

Main paper

Techniques used:

- RNA-seq
- ChIP-seq
- DNase-seq
- DNA methylation arrays and bisulfite sequencing
- FAIRE-seq

- Tier 1: three cell lines (K562, GM12878, H1 hESC)
- Tier 2: cell line panel (HeLa-S3, HepG2, HUVECs)
- Tier 3: all other cell types

Total: 1640 datasets / 147 different cell types

Page 11: ENCODE project: brief summary of main findings

Main paper

Page 13: ENCODE project: brief summary of main findings

Expression – chromatin state

Expression – transcription factors

Page 14: ENCODE project: brief summary of main findings

Expression – transcription factors

Page 15: ENCODE project: brief summary of main findings

Chromatin state patterns at transcription-factor binding sites

Page 16: ENCODE project: brief summary of main findings

Co-association between transcription factors (K562)

Page 17: ENCODE project: brief summary of main findings

Insight in genomic variation – allele specific variation

Page 19: ENCODE project: brief summary of main findings

Overlap SNPs with regulatory elements

Page 20: ENCODE project: brief summary of main findings

Overlap SNPs with regulatory elements and ‘open’ chromatin
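The overlap analysis named on this slide boils down to an interval lookup: for each SNP position, check whether it falls inside an annotated regulatory or open-chromatin region. The sketch below shows one way to do that in Python with a binary search. It assumes the regions on a chromosome do not overlap each other and uses 0-based half-open coordinates; the function name and the toy positions are hypothetical, not ENCODE or GWAS data.

```python
from bisect import bisect_right


def snps_in_regions(snp_positions, regions):
    """Return the SNP positions that fall inside any region.

    `regions` are non-overlapping half-open (start, end) intervals on a
    single chromosome; `snp_positions` are 0-based coordinates.
    """
    regions = sorted(regions)
    starts = [start for start, _ in regions]
    hits = []
    for pos in snp_positions:
        # Index of the last region that starts at or before this position.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < regions[i][1]:
            hits.append(pos)
    return hits


# Toy data (hypothetical): two regulatory regions and six SNPs.
toy_regions = [(100, 200), (500, 650)]
toy_snps = [50, 150, 199, 200, 600, 700]
toy_hits = snps_in_regions(toy_snps, toy_regions)
toy_overlap_fraction = len(toy_hits) / len(toy_snps)
print(toy_hits, toy_overlap_fraction)
```

An enrichment claim additionally needs a background expectation, for example the same fraction computed for matched control SNPs; this sketch covers only the overlap step.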

Page 21: ENCODE project: brief summary of main findings

Other important papers to us

Page 22: ENCODE project: brief summary of main findings

Accessible chromatin landscape

DNase I treatment

Combined analysis with TFs and H3K4me3

Identification of “accessible” chromatin regions

Page 23: ENCODE project: brief summary of main findings

Accessible chromatin landscape – location of accessible regions

Page 24: ENCODE project: brief summary of main findings

Accessible chromatin landscape – association with ChIP-seq and TFs

Page 25: ENCODE project: brief summary of main findings

Accessible chromatin landscape – novel transcripts

Page 26: ENCODE project: brief summary of main findings

Other important papers to us

Page 27: ENCODE project: brief summary of main findings

Landscape of transcription

RNA-seq

Get a grip on what is transcribed, including novel transcripts and RNAs

Page 28: ENCODE project: brief summary of main findings

Landscape of transcription – nucleolar fraction vs. whole cell

Page 29: ENCODE project: brief summary of main findings

Landscape of transcription

Page 30: ENCODE project: brief summary of main findings

Other important papers to us

Page 31: ENCODE project: brief summary of main findings

Long-range interaction of promoters

5C mapping (chromatin interaction mapping technology)

Long-range interactions of promoter regions

Page 32: ENCODE project: brief summary of main findings

Long-range interaction of promoters

Page 33: ENCODE project: brief summary of main findings

Other important papers to us

Page 34: ENCODE project: brief summary of main findings

Other important papers to us

Page 35: ENCODE project: brief summary of main findings

Transcriptional regulation

ChIP-seq versus expression detection

Predict transcriptional regulation

Page 36: ENCODE project: brief summary of main findings

Transcriptional regulation – predict transcription

Page 37: ENCODE project: brief summary of main findings

Transcriptional regulation – expression prediction

Page 38: ENCODE project: brief summary of main findings

Transcriptional regulation – TFs predict location of histone modifications

Page 39: ENCODE project: brief summary of main findings

Transcriptional regulation – model

Page 40: ENCODE project: brief summary of main findings

Other important papers to us

Page 41: ENCODE project: brief summary of main findings

Cell-type specific gene expression from open chromatin regions

Page 42: ENCODE project: brief summary of main findings

Other important papers to us

Page 43: ENCODE project: brief summary of main findings

Cell-type specific TF binding

Page 44: ENCODE project: brief summary of main findings

Other important papers to us

Page 45: ENCODE project: brief summary of main findings

SNPs in regulatory regions

Page 46: ENCODE project: brief summary of main findings

Other important papers to us

Page 47: ENCODE project: brief summary of main findings

TF binding - interactions

Page 48: ENCODE project: brief summary of main findings

TF binding – cell-type specificity

Page 49: ENCODE project: brief summary of main findings

Other important papers to us

Page 50: ENCODE project: brief summary of main findings

Classification of genomic regions

Page 51: ENCODE project: brief summary of main findings

Classification of genomic regions

Page 52: ENCODE project: brief summary of main findings

Classification of genomic regions

Page 54: ENCODE project: brief summary of main findings

ENCODE data

Data availability

All data are available, from raw data to final processed data

For end-level users:

- Tracks in the UCSC browser with the desired level of detail: visualize tracks and explore genomic context

For end-level users and bio-IT:

- In the UCSC “Table browser” and other UCSC tools: export genomic information, including processed data

For high end-level users and bio-IT:

- Raw data and semi-processed data in GEO and others
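For the bio-IT route, processed tracks exported from the UCSC Table Browser are commonly saved as BED text (chromosome, 0-based start, exclusive end, optional further columns). The following is a minimal Python sketch of reading such an export. The function name, the track name and the example lines are hypothetical and stand in for a real downloaded file.

```python
import io


def read_bed(handle):
    """Yield (chrom, start, end) tuples from BED-formatted text lines.

    Skips blank lines, comments, and "track"/"browser" header lines.
    Only the first three columns are used.
    """
    for line in handle:
        line = line.strip()
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        yield fields[0], int(fields[1]), int(fields[2])


# Hypothetical export: in practice this would be open("my_export.bed").
toy_bed = io.StringIO(
    'track name="toy_peaks" description="hypothetical example"\n'
    "chr1\t1000\t1500\tpeak1\n"
    "chr1\t3000\t3200\tpeak2\n"
    "\n"
    "chr2\t500\t900\tpeak3\n"
)
toy_records = list(read_bed(toy_bed))
toy_total_bases = sum(end - start for _, start, end in toy_records)
print(toy_records, toy_total_bases)
```

Because BED intervals are half-open, `end - start` gives the region length directly, with no plus-one correction.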

Page 55: ENCODE project: brief summary of main findings

Tracks in the UCSC browser with desired level of detail

Page 56: ENCODE project: brief summary of main findings

Tracks in the UCSC table browser

Page 57: ENCODE project: brief summary of main findings

Raw data

Page 58: ENCODE project: brief summary of main findings

Raw data
