Using ENCODE Data To Interpret Disease-associated Genetic Variation
Mike Pazin, National Human Genome Research Institute, NIH
ENCODE Users Meeting, June 8, 2016
Welcome
• Objectives
– We want to tell the community about the ENCODE resource
– We want to hear community experiences and suggestions
Elise Feingold
Dan Gilchrist
Overview
• The ENCODE Resource
• Use of ENCODE to illuminate the role of genetic variation in human disease
• Accessing ENCODE materials
Reading The Human Genome Is Difficult
• Genetic code very powerful for 1% of the human genome
• No correspondingly powerful regulatory code
• Sequence conservation can identify some candidate functional elements (but not when or where they act)
• Regulatory regions aren't always in the same order as gene targets
➢ Need unbiased experimental investigation
[Figure labels: 99% non-coding, 1% coding; IL10, IL19]
Non-coding DNA Is Important For Disease And Gene Regulation
• Vast majority of common disease associations and heritability lie outside of protein-coding regions
• Non-coding DNA variants are known to cause human diseases and alter human traits (FXS, ALS)
Functional information is needed to interpret the role of genetic variation in human disease, and to apply genomics in the clinic.
PMID: 22955828, PMID: 25439723, PMID: 23128226, PMID: 17477822, PMID: 25679767
1,500 Letters Of Our 3 Billion Letter Genome
[Slide shows 1,500 letters of raw genomic DNA sequence]
Maps And Annotation Help Us To Understand The Sequence
[Slide repeats the same 1,500-letter DNA sequence]
Richer Maps Provide More Information
ENCODE: Encyclopedia Of DNA Elements
• Identify all candidate functional elements in the genome
• Make resource freely available to community
– genetic basis of disease
– gene regulation
ENCODE Data Types
Modified from PLoS Biol 9:e1001046, 2011; Science 306:636, 2004
ENCODE Data Are Cell-Type Specific
Stamatoyannopoulos, Cell 154:888, 2013
ENCODE Accomplishments
• Sharing 1000s of data sets
– No embargo
– Unrestricted access
– High quality
– Uniformly processed
• Sharing software
• Data interoperability
Publications Using ENCODE Data
Hundreds of Consortium publications
~1500 community publications using ENCODE data:
~675 Human Disease
~600 Basic Biology
~225 Methods/Software Development
Cancer 38%
Allergy, Autoimmunity 13%
Human Genetics 10%
Neurologic, Psychiatric 9%
Cardiovascular 6%
Metabolic 6%
Summary - ENCODE Resource
• Freely shared catalog of genomic data and candidate genomic functional elements
• ENCODE is built upon established techniques and interpretations developed for the study of gene regulation
• ENCODE maps can be used to make predictions about genome function
Overview
• The ENCODE Resource
• Use of ENCODE to illuminate the role of genetic variation in human disease
• Accessing ENCODE materials
Standard ENCODE Use Cases: Hypothesis Generation
Major use: Hypothesis generation and refinement
• Prediction of causal variants/regulatory elements
• Prediction of target genes
• Prediction of target cell types
• Prediction of upstream regulators
Prediction of Causal Variants
• Multiple variants may be in linkage disequilibrium
• The causal variant may not have been tested during data collection
• Multiple variants may be causal
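One way to act on the points above is to take a lead GWAS variant together with the variants in linkage disequilibrium with it, and ask which of them fall inside candidate regulatory elements. The following is a minimal Python sketch of that overlap step. All variant names, coordinates, and element intervals are hypothetical toy values, not ENCODE data, and the ranking rule (count of overlapping annotations) is an assumption made for illustration.

```python
from typing import NamedTuple


class Element(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive
    label: str


class Variant(NamedTuple):
    name: str
    chrom: str
    pos: int    # 0-based position


def elements_at(elements, chrom, pos):
    """Return labels of all elements that contain the given position."""
    return [
        el.label
        for el in elements
        if el.chrom == chrom and el.start <= pos < el.end
    ]


def prioritize(variants, elements):
    """Annotate each variant, then sort by number of overlapping elements."""
    annotated = [(v.name, elements_at(elements, v.chrom, v.pos)) for v in variants]
    return sorted(annotated, key=lambda item: len(item[1]), reverse=True)


# Hypothetical candidate regulatory elements (toy coordinates).
ELEMENTS = [
    Element("chr1", 1000, 1500, "DNase hypersensitive site"),
    Element("chr1", 1400, 1800, "TF ChIP-seq peak"),
    Element("chr1", 5000, 5200, "H3K27ac peak"),
]

# Hypothetical lead variant plus two variants in LD with it.
LD_BLOCK = [
    Variant("rsA_lead", "chr1", 3000),
    Variant("rsB", "chr1", 1450),
    Variant("rsC", "chr1", 5100),
]

if __name__ == "__main__":
    for name, labels in prioritize(LD_BLOCK, ELEMENTS):
        print(name, labels)
```

In this toy case the lead variant overlaps nothing, while a linked variant overlaps two annotations, which illustrates why the tested variant is not necessarily the best causal candidate. An overlap is only a hypothesis to be refined, not evidence of causality.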
Snyder, Genome Research 22:1748, 2012
Stamatoyannopoulos, Science 337:1190, 2012
Many GWAS Associations Lie In Regions Annotated By ENCODE And CF Epigenomics Data
ENCODE/Epigenomics Data From HaploReg
www.broadinstitute.org/mammals/haploreg/
Ward and Kellis, Nucleic Acids Research 40:D930, 2011
Workshop Session 3
ENCODE Data From RegulomeDB
http://regulomedb.org/
Cherry, Snyder, Genome Research 22:1790, 2012
Workshop Session 3
ENCODE cis-element Browser
https://www.encodeproject.org/data/annotations/
Workshop Session 3
Prediction of Target Genes
• Regulatory regions can operate on multiple, distal genes
• The target gene could be a non-coding RNA
Stamatoyannopoulos, Crawford, Nature 489:75, 2012
Stamatoyannopoulos, Science 337:1190, 2012
Many GWAS Associations Are Predicted To Be Linked To Distal Genes
Prediction of Linkage Between Regulatory Elements and Genes
http://dnase.genome.duke.edu
Furey, Crawford, Stamatoyannopoulos, Genome Res. 23:777, 2013
ENCODE cis-element Browser
https://www.encodeproject.org/data/annotations/
Workshop Session 3
Prediction of Target Cell Types
• Some diseases are known to affect multiple cell types
• The defect may not be intrinsic to the cell type with obvious pathology
• The disease etiology may not be completely known
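A simple way to generate a cell-type hypothesis from the points above is to compare, across cell types, how many trait-associated variants fall inside that cell type's candidate regulatory elements. The Python sketch below shows only the counting step. The cell-type names, variant positions, and intervals are hypothetical toy values, and a real analysis would also need a background model to decide whether a fraction is higher than expected by chance.

```python
def fraction_in_elements(variants, intervals):
    """Fraction of (chrom, pos) variants inside any (chrom, start, end) interval.

    Intervals are treated as 0-based and half-open: start <= pos < end.
    """
    if not variants:
        return 0.0
    hits = sum(
        any(c == chrom and start <= pos < end for c, start, end in intervals)
        for chrom, pos in variants
    )
    return hits / len(variants)


def rank_cell_types(variants, elements_by_cell_type):
    """Return (cell_type, fraction) pairs, highest fraction first."""
    scores = {
        cell_type: fraction_in_elements(variants, intervals)
        for cell_type, intervals in elements_by_cell_type.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical trait-associated variants (toy coordinates).
TRAIT_VARIANTS = [("chr1", 150), ("chr1", 950), ("chr2", 40), ("chr2", 700)]

# Hypothetical open-chromatin intervals per cell type (toy coordinates).
ELEMENTS_BY_CELL_TYPE = {
    "cell type A": [("chr1", 100, 200), ("chr2", 0, 100), ("chr2", 650, 750)],
    "cell type B": [("chr1", 900, 1000)],
    "cell type C": [("chr3", 0, 100)],
}

if __name__ == "__main__":
    for cell_type, fraction in rank_cell_types(TRAIT_VARIANTS, ELEMENTS_BY_CELL_TYPE):
        print(f"{cell_type}: {fraction:.2f}")
```

A cell type that ranks highly here may not be the one with the obvious pathology, which is the situation the second bullet describes.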
Prediction of Linkage Between Regulatory Elements and Cell Type
http://dnase.genome.duke.edu
Furey, Crawford, Stamatoyannopoulos, Genome Res. 23:777, 2013
Prediction of Linkage Between Regulatory Elements and Cell Type
http://regulomedb.org/
Cherry, Snyder, Genome Research 22:1790, 2012
www.broadinstitute.org/mammals/haploreg/
Ward and Kellis, Nucleic Acids Research 40:D930, 2011
Workshop Session 3
ENCODE cis-element Browser
https://www.encodeproject.org/data/annotations/
Workshop Session 3
ENCODE And Epigenomics Data Can Be Used To Predict Cell Types
https://github.com/mauranolab/GWAS_plots
Stamatoyannopoulos, Science 337:1190, 2012
Summary - ENCODE Use Cases
Major use: Hypothesis generation and refinement
• Prediction of causal variants/regulatory elements
• Prediction of target genes
• Prediction of target cell types
• Prediction of upstream regulators
➢ Genetic v. epigenetic
➢ Germline v. somatic
Overview
• The ENCODE Resource
• Use of ENCODE by the research community
• Accessing ENCODE materials
Workshop Sessions 2, 1, 5
ENCODE Data
https://www.encodeproject.org
Workshop Session 2
ENCODE Encyclopedia
https://www.encodeproject.org
Workshop Sessions 1, 5
Publications
https://www.encodeproject.org
ENCODE Data Standards
https://www.encodeproject.org
ENCODE Software Tools
https://www.encodeproject.org
International Human Epigenome Consortium (IHEC)
• Data Portal: http://epigenomesportal.ca/ihec/
• Goal: Coordinate production of 1000 human epigenome maps for cellular states relevant to health and disease
http://ihec-epigenomes.org
• Can view by consortium, by assay, by cell type
• Data from 8 consortia
CEEHRC
Summary - Accessing ENCODE Resources
• ENCODE portal https://www.encodeproject.org
– Display/download ENCODE and Roadmap Epigenomics data
– Data Standards
– Software tools
– Publications
– Encyclopedia prototype
• ENCODE Analysis Tools
– RegulomeDB http://regulomedb.org/
– HaploReg http://www.broadinstitute.org/mammals/haploreg/
– Regulatory Elements Database http://dnase.genome.duke.edu
– RegulomeDB GWAS Database http://www.regulomedb.org/GWAS/
• ENCODE Tutorials
– http://www.genome.gov/27553900
– https://www.encodeproject.org/tutorials/
– http://www.ncbi.nlm.nih.gov/pubmed/25762420
• ENCODE mailing list:
– https://mailman.stanford.edu/mailman/listinfo/encode-announce
• IHEC resources
– IHEC Home Page http://ihec-epigenomes.org
– IHEC Data Portal http://epigenomesportal.ca/ihec/
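For readers who prefer scripted access to the portal over browsing, the Python sketch below builds a search URL without sending any request. The portal base URL comes from the slides. The `/search/` path and the `type`, `format`, and `assay_title` query fields are assumptions about the portal's query interface that the slides do not state, so check them against the portal's own documentation before relying on them. The helper name `build_search_url` is hypothetical.

```python
from urllib.parse import urlencode

# Base URL as given in the slides.
PORTAL = "https://www.encodeproject.org"


def build_search_url(**filters):
    """Build a portal search URL from keyword filters.

    Assumption: the portal exposes a /search/ endpoint that accepts
    'type' and 'format' query fields. No network request is made here.
    """
    params = {"type": "Experiment", "format": "json"}
    params.update(filters)
    return f"{PORTAL}/search/?{urlencode(params)}"


if __name__ == "__main__":
    # 'assay_title' is an assumed field name used for illustration only.
    print(build_search_url(assay_title="DNase-seq"))
```

Keeping URL construction separate from fetching makes the query easy to inspect and test before any data are downloaded.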
Goals Of ENCODE
• Catalog all functional elements in the genome
• Develop freely available resource for research community
ENCODE data are being used in the study of human disease and basic biology