Post on 24-Feb-2016
Comparative Genomics II: Functional comparisons
Caterino and Hayes, 2007
Overview
I. Comparing genome sequences
• Concepts and terminology
• Methods
  - Whole-genome alignments
  - Quantifying evolutionary conservation (PhastCons, PhyloP, GERP)
  - Identifying conserved elements
• Utility and limitations of conservation
• Available datasets at UCSC
II. Comparative analyses of function
• Evolutionary dynamics of gene regulation
• Case studies
• Insights into regulatory variation within and across species
Functional variation within and among species
Human
Chimp
Rhesus
Mouse
Modularity of developmental gene expression
[Figure: gene A shown in three tissues: forebrain (Brain TFs), neural tube (Neural TFs), limb (Limb TFs)]
Regulatory changes introduce variance without disrupting protein function
Regulatory variation contributes to human phenotypic variation
Lettice et al., Hum Mol Genet 12:1725 (2003); Sagai et al., Development 132:797 (2005)
Regulatory mutations affecting pleiotropic genes cause discrete developmental changes
Neutral Constrained Directional
Patterns of selection on gene expression and regulation
Romero et al., Nat Rev Genet. 13:505 (2012)
Comparative approaches to identify conserved and variant regulatory functions
Visel and Pennacchio, Nat Genet 42:557 (2010)
Regulatory conservation
Regulatory rewiring
Furey and Sethupathy, Science 2013
Genetic drivers of gene regulatory variation
• H3K4me2
• H3K27ac
Comparative analysis of ChIP-seq datasets
Human
Mouse
Compare TF binding, histone modifications, DNase hypersensitivity in equivalent tissues
Requires a statistical framework to reliably quantify changes in ChIP-seq signals
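As a minimal sketch of the first step such a framework needs, the example below scales raw read counts by sequencing depth before comparing signal at one orthologous region. The function names, the counts-per-million scaling and the pseudocount are illustrative assumptions, not the method used in the studies cited here; a real analysis would also model replicate variance.

```python
import math

def counts_per_million(count, library_size):
    """Scale a raw read count by sequencing depth (reads per million mapped)."""
    return count * 1_000_000 / library_size

def log2_fold_change(count_a, lib_a, count_b, lib_b, pseudocount=0.5):
    """log2 ratio of depth-normalized signal at one region.

    The pseudocount (an assumed value) avoids division by zero
    when a region has no reads in one sample.
    """
    cpm_a = counts_per_million(count_a, lib_a) + pseudocount
    cpm_b = counts_per_million(count_b, lib_b) + pseudocount
    return math.log2(cpm_a / cpm_b)

# Hypothetical counts: sample B has twice the reads at the region,
# but was also sequenced twice as deeply, so there is no real change.
no_change = log2_fold_change(100, 10_000_000, 200, 20_000_000)

# Same raw count in both samples, but B was sequenced twice as deeply,
# so the normalized signal in A is roughly twofold higher.
apparent_gain = log2_fold_change(200, 10_000_000, 200, 20_000_000)
```

Comparing raw counts directly would miss the difference in the second case and invent one in the first, which is why normalization comes before any test for change.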
• Input data are noisy: ChIP-seq and RNA-seq data are signal-based and subject to considerable experimental variation
• Using comparable biological states within and across species (e.g., human liver vs. mouse liver) = variation across tissues?
• How do epigenetic states and gene expression diverge among individuals and across species (Neutral? Constrained?)
• Can we identify variants or substitutions that drive regulatory changes?
Issues in comparative functional genomics
• 10 human lymphoblastoid cell lines
  - 3 major population groups: European, East Asian, Nigerian
  - 9 females, 1 male
  - 9 analyzed by HapMap and 1000 Genomes
Science 328: 232 (2010)
• Targets:
  - RNA Polymerase II
  - NFκB
[Figure: NFκB and Pol II. Axis labels: pairwise difference in binding; fraction of regions bound; # individuals]
Variation in TF binding is common
Science 342: 747 (2013)
• 10 human lymphoblastoid cell lines
  - 1 population group (Nigerian)
  - All analyzed by HapMap and 1000 Genomes
• Targets:
  - RNA Polymerase II
  - H3K4me1, H3K4me3, H3K27ac, H3K27me3
  - DNase hypersensitivity
Measuring allelic imbalance in histone modification profiles
G allele
T allele
Need to map reads reliably to individual alleles
ChIP-seq reads
Allelic imbalance
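One common way to test for allelic imbalance at a heterozygous site is an exact binomial test on the allele-specific read counts. The sketch below is an illustration under stated assumptions, not the procedure of the cited study: the function names are hypothetical, and the null ratio of 0.5 assumes reads map equally well to both alleles, which is exactly the mapping problem noted above.

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probability of every outcome that is no more likely
    than the observed count k under the null proportion p.
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= cutoff))

def allelic_imbalance(g_reads, t_reads):
    """Return (fraction of reads on the G allele, p-value vs. a 50:50 null)."""
    n = g_reads + t_reads
    if n == 0:
        raise ValueError("no reads cover this site")
    return g_reads / n, binomial_two_sided_p(g_reads, n)

# Hypothetical counts at one heterozygous G/T site
balanced_ratio, balanced_p = allelic_imbalance(10, 10)  # ratio 0.5, p = 1
skewed_ratio, skewed_p = allelic_imbalance(18, 2)       # ratio 0.9, p is about 0.0004
```

In a genome-wide scan the p-values would also need correction for multiple testing across sites.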
Cis-quantitative trait loci
~1200 identified
Science 328: 1036 (2010)
• Targets:
  - CCAAT/enhancer binding protein α (CEBPA)
  - Hepatocyte nuclear factor 4α (HNF4A)
  - Essential for normal liver development and function
• Tissue: Adult liver from 4 mammal species plus chicken
Lineage-specific gain and loss of CEBPA binding in liver
Lineage-specific: 0 bp overlap in multiple species alignment
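The 0 bp overlap criterion can be sketched as an interval comparison. This is a hedged illustration: it assumes every peak has already been projected into one shared coordinate system from the multiple species alignment and lies on the same chromosome, and the function names and data layout are hypothetical.

```python
def overlap_bp(a, b):
    """Number of bases shared by two half-open intervals (start, end)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))

def is_lineage_specific(peak, other_species_peaks):
    """True if the peak shares 0 bp with every peak from every other species.

    other_species_peaks maps a species name to a list of (start, end)
    intervals in the same alignment coordinates as `peak`.
    """
    return all(
        overlap_bp(peak, other) == 0
        for peaks in other_species_peaks.values()
        for other in peaks
    )

# Hypothetical peaks in shared alignment coordinates
human_peak = (100, 200)
adjacent_only = {"mouse": [(200, 300)], "dog": [(0, 100)]}
shared_with_mouse = {"mouse": [(150, 250)], "dog": [(0, 100)]}

specific = is_lineage_specific(human_peak, adjacent_only)    # True
shared = is_lineage_specific(human_peak, shared_with_mouse)  # False
```

With half-open intervals, peaks that merely touch end to start share 0 bp and therefore still count as non-overlapping.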
Widespread variation in CEBPA binding in mammals
Cell 154: 530 (2013)
Enhancer-associated histone modification
Single TF binding events may not indicate regulatory function
• Many TFs are present at high concentrations in the nucleus
• TF motifs are abundant in the genome
• Single TF binding events may be incidental
Combinatorial TF binding events are more conserved
Many TF binding changes do not have obvious genetic causes
In mammalian liver:
In mouse liver:
Human
Rhesus
Mouse
Bud stage; digit specification
Digit separation
Cell 154: 185 (2013)
Identifying human-lineage changes in promoter and enhancer function
• Compare H3K27ac signal at orthologous sites
• ‘Stable marking’: 1.5-fold or less change in H3K27ac among human, rhesus and mouse
• Human gain: require significant, reproducible gain in human versus all 12 datasets in rhesus and mouse
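The two criteria above can be sketched as simple predicates. This is a minimal illustration, not the published pipeline: the function names are hypothetical, the inputs are assumed to be depth-normalized H3K27ac signals at one orthologous site, and the significance cutoff `alpha` is an assumed value since the slide does not state one.

```python
def is_stably_marked(human, rhesus, mouse, max_fold=1.5):
    """True if signal differs by 1.5-fold or less among the three species."""
    signals = (human, rhesus, mouse)
    if min(signals) <= 0:
        return False  # a fold change is undefined without signal in every species
    return max(signals) / min(signals) <= max_fold

def is_human_gain(comparisons, alpha=0.05):
    """True if human signal is significantly higher in every comparison.

    comparisons holds one (log2_fold_change, adjusted_p) pair for human
    versus each non-human dataset (12 pairs in the design described above).
    alpha is an assumed threshold.
    """
    return len(comparisons) > 0 and all(
        lfc > 0 and p < alpha for lfc, p in comparisons
    )

# Hypothetical values for one orthologous site
stable = is_stably_marked(10.0, 12.0, 14.0)      # 1.4-fold range: True
not_stable = is_stably_marked(10.0, 12.0, 20.0)  # 2-fold range: False

gain = is_human_gain([(1.2, 0.01)] * 12)                     # True
no_gain = is_human_gain([(1.2, 0.01)] * 11 + [(0.3, 0.20)])  # False
```

Requiring the gain against every non-human dataset, rather than against an average, is what makes the call reproducible: a single discordant dataset is enough to reject the site.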
Mapping active promoters and enhancers in human limb
ENCODE cell lines
H3K27ac
Gains in promoter and enhancer activity
• Bone morphogenesis
• Chondrogenesis
• Digit malformations in mouse
Human-specific H3K27ac marking correlates with changes in enhancer function
Epigenetic signatures reflect tissue identity and species relationships
H3K27ac signal in human and mouse
Primate
Mouse
H3K27ac in human, rhesus, mouse
Species:
• Human
• Chimpanzee
• Bonobo
• Gorilla
• Orangutan
• Macaque
• Mouse
• Opossum
• Platypus
• Chicken
Gene sets:
• Custom gene models based on Ensembl + RNA-seq
• 5,636 1:1 orthologs in amniotes
• 13,277 1:1 orthologs in primates
• Only constitutive exons
Nature 478: 343 (2011)
Global patterns of gene expression differences
Gene expression recapitulates species phylogenies
Gene expression divergence rates are tissue-specific
liver
testis
brain
Gene expression divergence increases with evolutionary time
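One simple way to put a number on expression divergence between two species is 1 minus Spearman's rank correlation of expression levels across 1:1 orthologs in the same tissue. The sketch below assumes that measure for illustration only; the function names are hypothetical, and the expression values are made up.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def expression_divergence(expr_a, expr_b):
    """1 - Spearman's rho; entries must be the same orthologs in the same order."""
    return 1 - pearson(average_ranks(expr_a), average_ranks(expr_b))

# Made-up expression levels for four orthologs in one tissue
species_a = [1.0, 5.0, 20.0, 80.0]
same_order = [2.0, 9.0, 30.0, 60.0]      # same ranking of genes
reversed_order = [60.0, 30.0, 9.0, 2.0]  # opposite ranking

low = expression_divergence(species_a, same_order)       # close to 0
high = expression_divergence(species_a, reversed_order)  # close to 2
```

Because the measure uses ranks, it is insensitive to differences in scale between datasets. Computing it for every species pair gives a distance matrix that can be compared against the species phylogeny.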
Conservation of core organ functions restricts divergence
• Comparative functional genomics identifies regulatory differences within and among species
• TF binding is variable within species and highly variable among species
• Epigenetic comparisons provide more insight into biologically relevant regulatory diversity and divergence
• Gene regulation and expression diverge with increasing phylogenetic distance, mirroring the neutral expectation
Summary