20160208 introduction to bioinformatics - Utrecht...
![Page 1: 20160208 introduction to bioinformatics - Utrecht Universitytheory.bio.uu.nl/BDA/2016/20160208_introduction_to_bioinformatics.… · 2/8/16 1 Introduction to Bioinformatics Bas E.](https://reader035.fdocuments.in/reader035/viewer/2022070712/5eccf386ae5ddd605f29327c/html5/thumbnails/1.jpg)
2/8/16
Introduction to Bioinformatics
Bas E. Dutilh, Systems Biology: Bioinformatic Data Analysis
Utrecht University, February 8th 2016
Info and documentation
• http://tbb.bio.uu.nl/BDA/
• http://www.google.com/ http://www.wikipedia.org/
– …but only for guidance and hints: never take the internet for granted
• Campbell Biology, 9th or 10th edition, Pearson
• Reader
– Printed in black and white
– Download full color PDF at: http://tbb.bio.uu.nl/BDA/BioInf2016.pdf
– Errata: http://tbb.bio.uu.nl/BDA/errata.html
Course evaluation
• Final mark course
– 40% mark of Bioinformatic Data Analysis
• Bas Dutilh
– 10% mark of Basic Maths
• Kirsten ten Tusscher
– 50% mark of Mathematics/Theoretical Biology
• Kirsten ten Tusscher and Rob de Boer
• Bioinformatic Data Analysis exam
– Written exam
– “Cheat sheet” allowed: one hand-written A4, double-sided is OK
– Date: March 14th 2015 at 13:30-16:30 in Educatorium Gamma
• Bioinformatic Data Analysis bonus point
– Complete all exercises and have them signed by your assistant
• This has to be done in the same week as the practical
• In case of emergency: last chance to sign off is on Monday before the lecture
– The maximum mark is a 10
– Mini-article was cancelled
How would you figure out the function of a protein?
Knock-out mouse
X-ray structure
Activity assay
BLAST search
How about for all proteins in a genome?
Genome sizes
Tb: Tera base pairs (10^12)
Gb: Giga base pairs (10^9)
Mb: Mega base pairs (10^6)
Kb: Kilo base pairs (10^3)
Chaos chaos (1.4 Tb, Friz 1968)
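The unit prefixes above can be turned into a small conversion helper. This is only an illustrative sketch: the function names `format_genome_size` and `parse_genome_size` are made up for this example and do not come from the course material.

```python
# Unit prefixes as listed on the slide: Kb = 10^3, Mb = 10^6, Gb = 10^9, Tb = 10^12 base pairs.
UNITS = [("Tb", 10**12), ("Gb", 10**9), ("Mb", 10**6), ("Kb", 10**3)]


def format_genome_size(bp):
    """Express a size in base pairs in the largest unit that keeps the value >= 1."""
    for name, factor in UNITS:
        if bp >= factor:
            return f"{bp / factor:g} {name}"
    return f"{bp} bp"


def parse_genome_size(text):
    """Convert a string such as '1.4 Tb' back to a number of base pairs."""
    value, unit = text.split()
    factors = dict(UNITS, bp=1)
    return round(float(value) * factors[unit])


# The Chaos chaos estimate quoted on the slide (1.4 Tb):
print(parse_genome_size("1.4 Tb"))            # 1400000000000
print(format_genome_size(1_400_000_000_000))  # 1.4 Tb
```

Keeping sizes as plain integers in base pairs and converting only for display avoids mixing units when genomes of very different sizes are compared.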
Gene density and non-coding DNA
• Mammals (including humans) have the lowest gene density
– Number of genes in a given length of DNA
• Introns within genes
• Noncoding DNA between genes
Components of the human genome
• 20,000 – 25,000 protein-coding genes (1.5%)
• Introns (25.9%)
• Transposable elements (44.7%)
– DNA transposons
– Long terminal repeat (LTR) retrotransposons
– Short interspersed nuclear elements (SINEs)
– Long interspersed nuclear elements (LINEs)
– Endogenous retroviruses
– Miniature inverted repeat transposable elements (MITEs)
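As a rough worked example, gene density and the size of each genome component can be computed from the figures above. The sketch assumes a human genome of about 3 Gb; the constant `GENOME_BP` and both function names are illustrative choices, not part of the lecture.

```python
GENOME_BP = 3_000_000_000  # assumption: human genome of roughly 3 Gb


def genes_per_mb(n_genes, genome_bp):
    """Gene density: number of genes per megabase (10^6 bp) of DNA."""
    return n_genes / (genome_bp / 1_000_000)


def percent_to_bp(percent, genome_bp):
    """Number of base pairs covered by a component given as a percentage of the genome."""
    return round(genome_bp * percent / 100)


# Gene density for the lower and upper gene count estimates on the slide.
for n_genes in (20_000, 25_000):
    print(n_genes, "genes:", round(genes_per_mb(n_genes, GENOME_BP), 1), "genes per Mb")

# Base pairs covered by each listed component.
components = {
    "protein-coding": 1.5,
    "introns": 25.9,
    "transposable elements": 44.7,
}
for name, percent in components.items():
    print(name, percent_to_bp(percent, GENOME_BP), "bp")
```

Under these assumptions the human genome carries roughly 7 to 8 protein-coding genes per megabase, and the 1.5% coding fraction corresponds to about 45 Mb.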
Largest genomes
Largest sequenced genome: Loblolly pine (Pinus taeda) 20,000,000,000 bp (20 Gb)
Kinugasasō (Paris japonica) 149,000,000,000 bp (149 Gb)
Smallest genomes
• Eukaryota
– Free: Ostreococcus tauri (12.6 Mb)
– Endosymbiont: Encephalitozoon intestinalis (2.3 Mb)
• Bacteria and Archaea
– Free: Mycoplasma genitalium (580 kb)
– Endosymbiont: Cand. Carsonella ruddii (160 kb)
• Viruses
– Circoviridae (1.8 kb – only two proteins!)
Human genome
• 3,000,000,000 bp (3 Gb)
• Human Genome Project (HGP)
– 1990-2003
– Draft genome sequence complete in 2000
• Reference genome
– Source: blood (female) and sperm (male)
– Samples taken from many donors, but only a few were used to protect donor identities
– Sequence is not from one individual
• >70% from one male donor
• Cost HGP: $3,000,000,000
– Target: $1,000 genome
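To put the two price tags in perspective, the cost per base pair can be worked out directly. This sketch makes a simplifying assumption that the whole HGP budget paid for a single 3 Gb genome; the function name `cost_per_bp` is illustrative.

```python
GENOME_BP = 3_000_000_000     # 3 Gb, as stated on the slide
HGP_COST_USD = 3_000_000_000  # cost of the Human Genome Project
TARGET_COST_USD = 1_000       # the "$1,000 genome" target


def cost_per_bp(total_cost_usd, genome_bp):
    """Average cost in US dollars of sequencing one base pair."""
    return total_cost_usd / genome_bp


hgp_rate = cost_per_bp(HGP_COST_USD, GENOME_BP)        # about one dollar per base pair
target_rate = cost_per_bp(TARGET_COST_USD, GENOME_BP)  # a tiny fraction of a cent
fold_reduction = HGP_COST_USD / TARGET_COST_USD        # three-million-fold cheaper

print(hgp_rate, target_rate, fold_reduction)
```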
Genetic diversity
• Phylogenetic Tree of Life
Figure labels: Prokaryotes, Bacteria, Archaea, Eukaryotes
Genome sequencing
Cloned genomes
Segments known order
Fragment and sequence
Assemble sequences
Consensus genome
Whole Genome Shotgun (WGS) approach
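The "fragment and sequence" and "assemble sequences" steps can be illustrated with a toy greedy overlap assembler. This is a minimal sketch under strong assumptions (error-free reads, a single strand, no repeats) and is not how production assemblers work; the names `overlap` and `greedy_assemble`, the `min_len` parameter, and the example sequence are all made up for illustration. The consensus step is not modelled.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap; return the contigs left."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    # Drop reads that are fully contained in another read.
    reads = [r for r in reads if not any(r != s and r in s for s in reads)]
    while len(reads) > 1:
        best_k, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        if best_k == 0:
            break  # no overlaps left: the remaining reads are separate contigs
        merged = reads[best_i] + reads[best_j][best_k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (best_i, best_j)]
        reads.append(merged)
    return reads


# Hypothetical 21 bp "genome" cut into four overlapping fragments, given in scrambled order.
genome = "ATGGCGTACGTTAGCCTAGGA"
fragments = ["CGTTAGCC", "ATGGCGTA", "AGCCTAGGA", "CGTACGTT"]
contigs = greedy_assemble(fragments)
print(contigs)
```

With these fragments the assembler recovers the original sequence as a single contig. Real genomes contain repeats and sequencing errors, which is why greedy merging alone is not sufficient in practice.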
Personal genome sequences
Craig Venter
James Watson
Reference Genome
~5,000,000 differences
~2,000,000 differences
~5,000,000 differences
Your personal genome sequence
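Counting differences between a personal genome and the reference can be sketched on toy data. The example below assumes two already aligned sequences of equal length and reports only single-base substitutions; real comparisons also involve insertions, deletions and an alignment step. The sequences and function names are hypothetical, and the 3 Gb genome size used for the rate is an assumption.

```python
def substitutions(reference, personal):
    """List (position, reference base, personal base) for every mismatch, 1-based."""
    if len(reference) != len(personal):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (pos, ref, alt)
        for pos, (ref, alt) in enumerate(zip(reference, personal), start=1)
        if ref != alt
    ]


def difference_rate(n_differences, genome_bp):
    """Fraction of positions that differ from the reference."""
    return n_differences / genome_bp


reference = "ACGTACGTAC"  # made-up reference fragment
personal = "ACGTTCGTAA"   # made-up personal fragment
print(substitutions(reference, personal))  # [(5, 'A', 'T'), (10, 'C', 'A')]

# ~5,000,000 differences in an assumed 3 Gb genome is roughly 1 position in 600.
print(difference_rate(5_000_000, 3_000_000_000))
```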
So we have a $200 personal genome…
• …now the million dollar question is:
What can I learn from my 3,000,000,000 A’s, C’s, G’s, and T’s?
Personalized medicine
• From reactive to proactive medicine
– Identify high risk alleles
– Adapt lifestyle (e.g. risk of high blood pressure)
– Preventive screening or treatment (e.g. risk of cancer)
• Pharmacogenomics:
– Impact of genetic variation on response to medication
Sergey Brin
Co-founder
LRRK2 polymorphism on chromosome 12
- 28% risk of Parkinson’s at age 59
- 51% at age 69
- 74% at age 79
Co-investor
Biology is Big Data science
Figure axis: # sequenced genomes
Moore's Law: computer power doubles every ~2 years.
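The doubling rule can be written as a growth factor of 2^(t/2) after t years. A minimal sketch, assuming an exact 2-year doubling time (the slide says "~2 years"); the function name `moores_law_factor` is illustrative.

```python
def moores_law_factor(years, doubling_time_years=2.0):
    """Relative growth in computer power after a number of years of steady doubling."""
    return 2 ** (years / doubling_time_years)


print(moores_law_factor(2))   # 2.0  (one doubling)
print(moores_law_factor(10))  # 32.0 (five doublings)
```

The same function can describe any quantity with a fixed doubling time, which makes it easy to compare Moore's Law with the growth in the number of sequenced genomes once a doubling time for the latter has been estimated from data.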
Figure labels: DNA, RNA, Protein
Omics sciences
• The suffix -ome refers to a totality of some sort
• Gene (genetics)
• Transcript (RNA)
• Protein
• Metabolite
• Lipid
• Microbe
• Genome
• Transcriptome
• Proteome
• Metabolome
• Lipidome
• Microbiome
• Genomics
• Transcriptomics
• Proteomics
• Metabolomics
• Lipidomics
• Microbiomics (?!)
Genomics
• Identify differences in gene content between genomes
• Discover new species: “Biological Dark Matter”
• Analyze genome evolution
• Predict gene functions
Chordata ↔ Echinodermata
1,000,000,000,000 species on earth?
10,000 species cultured
30,000 genomes sequenced
Metagenomics
Sample
Filter
Microbes or viruses
Metagenomic discovery of Lokiarchaeota
Spang et al. Nature 2015
Genetic diversity
• Phylogenetic Tree of Life
Figure labels: Prokaryotes, Bacteria, Archaea, Eukaryotes
Image: Lisa Brown for
Human microbiome and virome
• In your body:
– ~10^13 human cells
– ~10^14 bacteria
– ~10^15 viruses
Bioinformatics
• Bioinformatics: study of informatic processes in biotic systems
– Paulien Hogeweg and Ben Hesper (Utrecht University, 1970)
• Bioinformatic Data Analysis: using computational methods to analyze biological data
Bioinformatics in Utrecht today
Bring your laptop