Post on 22-Feb-2016
Introduction to epigenetics: chromatin modifications, DNA methylation and the CpG Island landscape
Héctor Corrada Bravo
CMSC702 Spring 2013
(many slides courtesy of Rafael Irizarry)
Genetics: the alphabet of life
• Letters of DNA sequence carry the information
• How is this information read and parsed?
• We need grammar!
Differentiation
Different genes are expressed during different stages and in different tissues
(3.4 x 10^-10 meters/bp) x (6 x 10^9 bp/genome) = ~2 meters/genome
Radius of the nucleus is ~10 µm !!!
Klug and Cummings, 1997
[(6 x 10^9 bp/genome) / (195 bp/nucleosome)] = ~30.8 x 10^6 nucleosomes/genome
~5% of nuclear volume
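The packing arithmetic on these slides can be checked in a few lines. This is only a sketch: the constants are the slide's own figures, the nucleus radius is taken as 10 micrometers, and the function names are made up for illustration. The ~5% volume figure is not reproduced here because the slides do not give the nucleosome volume it depends on.

```python
# Constants as quoted on the slides (illustrative names, not from the source).
RISE_PER_BP_M = 3.4e-10      # meters of DNA per base pair
GENOME_BP = 6e9              # base pairs per (diploid) genome
BP_PER_NUCLEOSOME = 195      # base pairs per nucleosome, including linker
NUCLEUS_RADIUS_M = 10e-6     # nucleus radius, taken as 10 micrometers


def dna_length_m() -> float:
    """Total length of the genome if stretched out, in meters."""
    return RISE_PER_BP_M * GENOME_BP


def nucleosome_count() -> float:
    """Approximate number of nucleosomes needed to wrap the genome."""
    return GENOME_BP / BP_PER_NUCLEOSOME


if __name__ == "__main__":
    print(f"DNA length: {dna_length_m():.2f} m")
    print(f"Nucleosomes: {nucleosome_count() / 1e6:.1f} million")
    print(f"Length / nucleus radius: {dna_length_m() / NUCLEUS_RADIUS_M:,.0f}x")
```

The last ratio is the point of the slide: the DNA is roughly five orders of magnitude longer than the compartment that holds it.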
http://www.albany.edu/~achm110/solenoidchriomatin.html
Conformation is dynamic! (we’ll discuss methods to assay this conformation later on…)
We’ll study methods to assay a number of mechanisms of epigenetic regulation
• DNA methylation
• Nucleosome positioning and histone modifications
In eukaryotes, DNA methylation usually occurs at CpG dinucleotides
Transcriptional regulation by nucleosome and histone modification
Nucleosome positioning is mainly repressive
Histone modification can be either active or repressive
[Figure: transcription factors (TF) and a TF target site on nucleosome-bound DNA; histone marks shown: acetylation (H3K9ac) and H3K27me3]
Histone code hypothesis
“… multiple histone modifications, acting in a combinatorial or sequential fashion on one or multiple histone tails, specify unique downstream functions …” (Strahl and Allis, Nature, 2000)
DNA Methylation is sometimes repressive
Robertson and Wolffe, Nat Rev Genet, 2000
Unmethylated CpG dinucleotides
Methylated CpG dinucleotides
Transcription repressors bound to methyl-group
DNA methylation in cancer
DNA methylation, histone modifications and nucleosome positioning are coordinated!
New technologies are allowing us to now assay this coordination
[Brinkman, et al., Genome Research 2012]
Epigenetics: the grammar of life
Epigenetics literally means above the genome
One more thing….
How does a cell retain epigenetic state?
Methylation
[Figure: two copies of the duplex TTCGATTACGA / AAGCTAATGCT, each with methyl groups (CH3) drawn on its CpGs]
What happens during cell replication?
[Figure: the methylated duplex TTCGATTACGA / AAGCTAATGCT is copied into two duplexes, and the CH3 marks on its CpGs are restored on both copies]
DNA methylation is replicated!
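How the marks survive replication can be sketched in code. The slides do not name a mechanism, so this assumes the standard picture: replication is semi-conservative, each daughter duplex starts out methylated on the parental strand only, and the marks are then copied across at each CpG. The `Duplex` class and function names are hypothetical, and methylation is indexed by the position of the C on the top strand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Duplex:
    top: str                 # top strand, read 5'->3'
    top_meth: frozenset      # CpG sites methylated on the top strand
    bottom_meth: frozenset   # CpG sites methylated on the bottom strand


def cpg_sites(seq: str) -> list:
    """Indices of every C that is immediately followed by a G."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def replicate(parent: Duplex) -> tuple:
    """Semi-conservative copy: new strands start without any methyl groups."""
    empty = frozenset()
    keeps_top = Duplex(parent.top, parent.top_meth, empty)
    keeps_bottom = Duplex(parent.top, empty, parent.bottom_meth)
    return keeps_top, keeps_bottom


def maintain(hemi: Duplex) -> Duplex:
    """Copy each mark to the opposite strand at hemimethylated CpG sites."""
    marked = (hemi.top_meth | hemi.bottom_meth) & set(cpg_sites(hemi.top))
    return Duplex(hemi.top, frozenset(marked), frozenset(marked))


if __name__ == "__main__":
    parent = Duplex("TTCGATTACGA", frozenset({2, 8}), frozenset({2, 8}))
    for daughter in replicate(parent):
        print(maintain(daughter) == parent)
```

Because a CpG reads the same on both strands, the parental strand alone is enough to tell the cell where the new marks belong.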
[Figure: liver and brain cells carry the same sequence TTCGATTACGA / AAGCTAATGCT, with CH3 marks drawn on its CpGs in each tissue]
CpG Islands
We said DNA methylation occurs at CpG dinucleotides.
Where are they in the genome?
CpGs are depleted
Remaining ones cluster
Proportion of CpGs stratified by GC content.
Two modes:
• high CpG rate
• low CpG rate
The clusters are referred to as CpG Islands
• CpGs are depleted
• Remaining CpGs cluster into islands enriched near promoters
New CGI definition: Irizarry et al. (2009) Mammalian Genome
CpG Island definition
• Gardiner-Garden and Frommer
– N > 200
– GC-content > 50%
– obs/exp > 0.6
– Lists contain 20,000 CGI
• Irizarry et al. (2009) Mammalian Genome
• Wu et al. (2010) Biostatistics
– Lists contain 100,000 CGI
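The Gardiner-Garden and Frommer thresholds above can be applied to a single window as follows. This is a minimal sketch, not the procedure used to build the published lists: it assumes the usual expected count of (number of C x number of G) / N, it tests one fixed sequence rather than scanning a genome, and the function names are illustrative.

```python
def cpg_island_stats(seq: str) -> tuple:
    """Return (length, GC content, observed/expected CpG ratio) for a window."""
    seq = seq.upper()
    n = len(seq)
    n_c = seq.count("C")
    n_g = seq.count("G")
    observed = seq.count("CG")               # CpG dinucleotides in the window
    gc_content = (n_c + n_g) / n if n else 0.0
    expected = n_c * n_g / n if n else 0.0   # assumes C and G occur independently
    obs_exp = observed / expected if expected else 0.0
    return n, gc_content, obs_exp


def is_cpg_island(seq: str, min_len: int = 200,
                  min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """Apply the three thresholds listed on the slide to one window."""
    n, gc_content, obs_exp = cpg_island_stats(seq)
    return n > min_len and gc_content > min_gc and obs_exp > min_obs_exp


if __name__ == "__main__":
    print(is_cpg_island("CG" * 150))              # CpG-rich window
    print(is_cpg_island("C" * 150 + "G" * 150))   # GC-rich but CpG-depleted
```

The second window shows why GC content alone is not enough: it is 100% GC but contains a single CpG, so its obs/exp ratio is far below 0.6.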
Observed versus expected
[Figure: observed dinucleotides (y-axis) versus expected, %G x %C (x-axis), shown for GpC and CpG]
• Takai and Jones, PNAS 2002, use a stricter definition
• HMM-based definition
How do we measure DNA methylation?
[Figure: many copies of the duplex TTCGATTACGA / AAGCTAATGCT from liver and brain cells, most with CH3 marks on their CpGs]
85% Methylation
chr3:44,031,616-44,031,626
Bisulfite Treatment
GGGGAGCAGCATGGAGGAGCCTTCGGCTGACT
GGGGAGCAGTATGGAGGAGTTTTCGGTTGATT
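The before/after pair above can be reproduced with a small function. Bisulfite treatment converts unmethylated cytosines so that they are read as T, while methylated cytosines stay C. This is a sketch only: the function name is hypothetical, and the protected positions (0-based indices 6 and 23) are read off by comparing the two sequences on the slide, not stated in the source.

```python
def bisulfite_convert(seq: str, methylated_positions: set) -> str:
    """Turn every C into T unless its index is listed as methylated."""
    return "".join(
        "T" if base == "C" and i not in methylated_positions else base
        for i, base in enumerate(seq)
    )


if __name__ == "__main__":
    before = "GGGGAGCAGCATGGAGGAGCCTTCGGCTGACT"
    # Assumption: these are the two cytosines that stay C on the slide.
    protected = {6, 23}
    print(bisulfite_convert(before, protected))
```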
BS-seq
[Figure: bisulfite-converted reads aligned to the reference around one CpG; every read shows CG at the site]
Coverage: 13
Methylation Evidence: 13
Methylation Percentage: 100%
BS-seq
[Figure: the same alignment; some reads now show TG instead of CG at the site]
Coverage: 13
Methylation Evidence: 9
Methylation Percentage: 69%
BS-seq
[Figure: the same alignment; most reads now show TG instead of CG at the site]
Coverage: 13
Methylation Evidence: 4
Methylation Percentage: 31%
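The three summaries on these slides follow from counting the base each read shows at the cytosine of the CpG. This is a minimal sketch with a hypothetical function name. It assumes the reads are already aligned and that only C (methylated) and T (unmethylated) calls count toward coverage, so an N or a mismatch is ignored.

```python
def methylation_summary(bases_at_cpg: list) -> tuple:
    """Return (coverage, methylation evidence, rounded percentage or None)."""
    methylated = sum(1 for base in bases_at_cpg if base == "C")
    unmethylated = sum(1 for base in bases_at_cpg if base == "T")
    coverage = methylated + unmethylated     # other calls carry no information
    percentage = round(100 * methylated / coverage) if coverage else None
    return coverage, methylated, percentage


if __name__ == "__main__":
    # The three slides: 13, 9 and 4 of 13 reads keep a C at the site.
    for kept in (13, 9, 4):
        calls = ["C"] * kept + ["T"] * (13 - kept)
        print(methylation_summary(calls))
```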
BS-seq
• Alignment is much trickier:
– Naïve strategy: do nothing, hope not many CpG in a single read
– Smarter strategy: “bisulfite convert” reference: turn all Cs to Ts
– Smartest strategy: be unbiased and try all combinations of methylated/un-methylated CpGs in each read
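The "smarter" strategy can be sketched as follows: convert every C to T in both the read and the reference, align in that reduced alphabet, then go back to the original read to see which reference cytosines stayed C. This is a toy under loud assumptions: exact matching instead of a real aligner, forward-strand reads only, and hypothetical function names.

```python
def in_silico_convert(seq: str) -> str:
    """Fully bisulfite-convert a sequence in silico: every C becomes T."""
    return seq.upper().replace("C", "T")


def align_converted(read: str, reference: str) -> list:
    """Start positions where the converted read matches the converted reference."""
    conv_ref = in_silico_convert(reference)
    conv_read = in_silico_convert(read)
    hits = []
    start = conv_ref.find(conv_read)
    while start != -1:
        hits.append(start)
        start = conv_ref.find(conv_read, start + 1)
    return hits


def call_cytosines(read: str, reference: str, offset: int) -> dict:
    """Map each reference C under the read to True (still C) or False (now T)."""
    calls = {}
    for i, base in enumerate(read.upper()):
        if reference[offset + i] == "C" and base in "CT":
            calls[offset + i] = base == "C"
    return calls


if __name__ == "__main__":
    reference = "GGGGAGCAGCATGGAGGAGCCTTCGGCTGACT"
    read = "AGTTTTCGGT"   # a bisulfite-converted fragment of the reference
    for offset in align_converted(read, reference):
        print(offset, call_cytosines(read, reference, offset))
```

Aligning in the converted alphabet keeps methylated and unmethylated reads equally easy to place, which is the bias the naïve strategy suffers from. The cost is lower sequence complexity, so more reads match in several places.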
BS-seq
• There are similarities to SNP calling
• EXCEPT: we want to measure percentages
– Use a binomial model to estimate p, percentage of methylation
– Allow for sequencing errors, coverage differences, etc.
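One way to read the binomial model: with M methylated calls out of N reads at a site, the estimate of p is M/N, and its uncertainty shrinks as coverage grows. The sketch below is an assumption-laden illustration, not the method used in the lecture. It uses a Wilson score interval, and it models sequencing error as a symmetric probability `error_rate` (a hypothetical parameter, below 0.5) that any single call is flipped.

```python
import math


def estimate_methylation(m: int, n: int, error_rate: float = 0.0,
                         z: float = 1.96) -> tuple:
    """Return (estimate, lower, upper) for the methylation fraction p."""
    if n <= 0 or not 0 <= m <= n:
        raise ValueError("need 0 <= m <= n and n > 0")
    if not 0 <= error_rate < 0.5:
        raise ValueError("error_rate must be in [0, 0.5)")
    q = m / n                                   # observed fraction of C calls
    denom = 1 + z * z / n
    centre = (q + z * z / (2 * n)) / denom
    half = z * math.sqrt(q * (1 - q) / n + z * z / (4 * n * n)) / denom

    def correct(x: float) -> float:
        # Observed q = e + p * (1 - 2e); invert and clip to [0, 1].
        p = (x - error_rate) / (1 - 2 * error_rate)
        return min(1.0, max(0.0, p))

    return correct(q), correct(centre - half), correct(centre + half)


if __name__ == "__main__":
    print(estimate_methylation(9, 13))     # low coverage: wide interval
    print(estimate_methylation(90, 130))   # same fraction, tighter interval
```

This is the main difference from SNP calling noted above: the target is a continuous fraction, so coverage controls the precision of the estimate rather than the confidence in a discrete genotype.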
Measuring DNA Methylation
• Estimating percentages
• Use “local-likelihood” method
– Based on loess
(Plot courtesy of Kasper Hansen)
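The idea of borrowing strength from nearby CpGs can be illustrated with a much simpler smoother than the one the slide refers to. The sketch below is a local-constant simplification, not the loess-based local-likelihood fit: at each position it maximizes a kernel-weighted binomial likelihood, which reduces to a weighted ratio of methylated calls to coverage. The tricube kernel, the `bandwidth` parameter and the function names are assumptions made for illustration.

```python
def tricube(u: float) -> float:
    """Tricube kernel weight; zero at or beyond one bandwidth."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def smooth_methylation(positions: list, meth: list, cov: list,
                       bandwidth: float) -> list:
    """Kernel-weighted methylation fraction at every CpG position."""
    smoothed = []
    for x0 in positions:
        num = 0.0
        den = 0.0
        for x, m, n in zip(positions, meth, cov):
            w = tricube((x - x0) / bandwidth)
            num += w * m          # weighted methylated calls
            den += w * n          # weighted coverage
        smoothed.append(num / den if den > 0 else float("nan"))
    return smoothed


if __name__ == "__main__":
    positions = [100, 150, 400]
    meth = [9, 4, 13]
    cov = [13, 13, 13]
    print(smooth_methylation(positions, meth, cov, bandwidth=100))
```

High-coverage sites dominate their neighborhood automatically, because the weights multiply counts rather than per-site percentages.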
MeDIP (like ChIP-chip)
[Figure: workflow. Crosslink, then lyse and sonicate. IP sample: immunoprecipitate with antibody, reverse crosslinks, amplify, sequence. Total sample: reverse crosslinks, amplify, sequence.]
Other controls for IP (e.g., no antibody, non-specific antibody)
Next few lectures
• Measuring DNA methylation
– How to find genomic regions that are differentially methylated in two groups (say, cancer and normal)
• Measuring nucleosome occupancy and histone modifications
– First stabs at decoding the histone code
• Determining genomic 3D structure