How to make a monkey: functional adaptation in the primate genome
![Page 1: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/1.jpg)
How to make a monkey: functional adaptation in the primate genome
Rutger Vos
Marie Curie Research Fellow
![Page 2: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/2.jpg)
Outline
• Introduction
– The question
– Primate genomes
– Homology across genomes
– Finding evidence for natural selection
– Characterizing gene function
• Methods
– Computational infrastructure
– Basic workflow steps
– Workflow design
• Results
– Preliminary findings
• Conclusions
• Acknowledgements
![Page 3: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/3.jpg)
The question
Which gene functions were under directional selection in primate evolutionary history?
![Page 4: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/4.jpg)
Primate genomes
Homo sapiens (Human)
Pongo pygmaeus (Orangutan)
Tarsius syrichta (Philippine tarsier)
Pan troglodytes (Chimpanzee)
Macaca mulatta (Rhesus monkey)
Otolemur garnettii (Greater galago)
Gorilla gorilla (Gorilla)
Callithrix jacchus (Common marmoset)
Microcebus murinus (Gray mouse lemur)
![Page 5: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/5.jpg)
Primate genomes
~65 MYA (K/T boundary)
Apes
Old world monkeys
New world monkeys
Tarsiers
Lemurs
Bush babies
![Page 6: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/6.jpg)
Homology: Orthologs and paralogs
![Page 7: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/7.jpg)
Evidence of selection: dN/dS ratio
![Page 8: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/8.jpg)
Evidence of selection: dN/dS ratio
• Also written Ka/Ks or ω: the ratio of the non-synonymous substitution rate (dN) to the synonymous substitution rate (dS)
– dN/dS > 1: positive selection
– dN/dS ≈ 1: neutral evolution?
– dN/dS < 1: stabilizing (purifying) selection
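As a rough illustration of the three regimes listed on this slide, the sketch below computes ω from a pair of rate estimates and assigns a label. The function names, the example values and the tolerance band around 1 are all assumptions made for illustration. A real analysis, such as the HyPhy tests named later in the talk, decides significance with a statistical test rather than a fixed cutoff.

```python
def omega(dn: float, ds: float) -> float:
    """Return dN/dS; dS must be positive for the ratio to be defined."""
    if ds <= 0:
        raise ValueError("dS must be > 0")
    return dn / ds


def classify_omega(w: float, tol: float = 0.1) -> str:
    """Map omega to a regime; tol is an arbitrary illustrative band around 1."""
    if w > 1 + tol:
        return "positive selection"
    if w < 1 - tol:
        return "stabilizing (purifying) selection"
    return "approximately neutral"


# Made-up (dN, dS) pairs, one per regime.
for dn, ds in [(0.02, 0.10), (0.05, 0.05), (0.30, 0.10)]:
    w = omega(dn, ds)
    print(f"dN={dn} dS={ds} omega={w:.2f} -> {classify_omega(w)}")
```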
![Page 9: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/9.jpg)
Gene function: the Gene Ontology
• GO is a hierarchical database of terms for genes
• Terms are structured in a directed acyclic graph
• Terms are organized in three domains: biological process, cellular component and molecular function
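Because the terms form a directed acyclic graph rather than a tree, a term can have more than one parent, and annotating a gene with a term implicitly annotates it with every ancestor. The sketch below walks a toy graph to collect those ancestors. The edges are a hypothetical simplification written for this example, not copied from the Gene Ontology.

```python
# Toy DAG: child term -> list of parent terms. Edges are illustrative only.
PARENTS = {
    "forebrain development": ["brain development"],
    "brain development": ["nervous system development", "organ development"],
    "nervous system development": ["biological process"],
    "organ development": ["biological process"],
    "biological process": [],
}


def ancestors(term, parents):
    """Return all terms reachable by following parent links from term."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:  # a term may be reached via several paths
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


print(sorted(ancestors("forebrain development", PARENTS)))
```

The `seen` set is what makes the multiple-parent case safe: "biological process" is reached along two paths but recorded once.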
![Page 10: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/10.jpg)
Gene function: the Gene Ontology
![Page 11: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/11.jpg)
Methods: Basic workflow steps
1. Protein BLAST all vs. all
2. Find Reciprocal Best protein Hit (RBH) clusters
3. Protein align RBH clusters
4. Backtranslate protein alignments to cDNAs
5. Perform dN/dS ratio tests on all branches
6. Look up GO terms for sequence GIs
7. Interpret results
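Steps 2 and 4 can be sketched in a few lines. The code below is a minimal illustration, not the talk's Perl scripts: it assumes BLAST hits have already been reduced to (query, subject, score) tuples for one pair of species, and that each cDNA is exactly three times the length of its protein, with no stop codon. All identifiers and scores are made up.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject. hits: (query, subject, score)."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {q: s for q, (s, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where each sequence is the other's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


def backtranslate(aligned_protein, cdna):
    """Thread codons from an unaligned cDNA onto one gapped protein alignment row."""
    n_residues = sum(1 for ch in aligned_protein if ch != "-")
    if len(cdna) != 3 * n_residues:
        raise ValueError("cDNA length must be 3x the number of residues")
    codons = iter(cdna[i:i + 3] for i in range(0, len(cdna), 3))
    # Each protein gap becomes a three-base gap; each residue takes the next codon.
    return "".join("---" if ch == "-" else next(codons) for ch in aligned_protein)


human_vs_chimp = [("hs1", "pt1", 250.0), ("hs1", "pt2", 90.0),
                  ("hs2", "pt2", 180.0), ("hs3", "pt1", 60.0)]
chimp_vs_human = [("pt1", "hs1", 245.0), ("pt2", "hs1", 95.0),
                  ("pt2", "hs2", 175.0)]

print(reciprocal_best_hits(human_vs_chimp, chimp_vs_human))
print(backtranslate("M-K", "ATGAAA"))
```

Here `hs3` finds `pt1` as its best hit, but `pt1` prefers `hs1`, so `hs3` is left out of the reciprocal set.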
![Page 12: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/12.jpg)
Methods: Basic workflow design
• Build a single BLAST database of all genomes, then
• To parallelize the analysis:
– Split the data into nine sets (for nine species)
– Split each of nine genomes into files for each gene (~20k files per species)
– Process files in parallel
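The per-gene split can be sketched as follows. This is a minimal example under stated assumptions: the function names and the `.fa` extension are hypothetical, every FASTA header is assumed to start with an identifier, and that identifier is assumed to be safe to use as a file name.

```python
import tempfile
from pathlib import Path


def split_fasta(text):
    """Parse multi-FASTA text into {identifier: sequence}."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # identifier is the first word of the header
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def write_per_gene(records, outdir):
    """Write one FASTA file per record; returns the paths written."""
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    paths = []
    for name, seq in records.items():
        path = outdir / f"{name}.fa"
        path.write_text(f">{name}\n{seq}\n")
        paths.append(path)
    return paths


example = ">gene1 some description\nMKV\nLLA\n>gene2\nMST\n"
with tempfile.TemporaryDirectory() as tmp:
    written = write_per_gene(split_fasta(example), Path(tmp) / "Homo_sapiens")
    print([p.name for p in written])
```

One file per gene means each downstream step has a single small input, which is what makes the later steps independent of one another.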
![Page 13: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/13.jpg)
Methods: File processing
[Diagram labels: per-species job scripts (…, Homo_sapiens.sh, Pan_troglodytes.sh, …), each marked "qsub setenv" and "make -j 4 all"; a Makefile]
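The diagram's `make -j 4` runs up to four independent targets at a time. The same idea can be sketched with a worker pool, as below. This is an analogy rather than the mechanism used in the talk, and `process_gene` is a hypothetical placeholder for whatever one per-gene step does.

```python
from concurrent.futures import ThreadPoolExecutor


def process_gene(name):
    """Placeholder for one per-gene pipeline step (e.g. align, then test)."""
    return f"{name}: done"


def process_all(names, jobs=4):
    """Run process_gene over names with at most `jobs` concurrent workers."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        return list(pool.map(process_gene, names))  # results keep input order


print(process_all(["gene1", "gene2", "gene3", "gene4", "gene5"]))
```

Threads are a reasonable fit only if each step mostly waits on an external program. For CPU-bound Python work, a process pool would be the usual substitute.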
![Page 14: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/14.jpg)
Methods: Software used
• NCBI standalone BLAST (formatdb, blastp, fastacmd)
• Muscle
• GeneWise
• HyPhy
• BioPerl/Bio::Phylo (for parsing, logging and wrapping, all scripts under svn)
![Page 15: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/15.jpg)
Methods: Project organization
From: Noble, W.S., 2009. A Quick Guide to Organizing Computational Biology Projects. PLoS Comput. Biol. 5(7).
![Page 16: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/16.jpg)
Methods: ThamesBlue hardware
• One of the 100 fastest supercomputers in the world
• IBM BladeCenter cluster
• JS21 and JS20 Blade servers with 60TB of storage connected via a Myrinet 2G network.
• SuSE Linux Enterprise Server
• General Parallel File System
• Batch jobs managed with Torque.
![Page 17: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/17.jpg)
Results
• 5952 loci with >= 2 RBHs relative to humans
• 2346 loci with dN/dS deviation somewhere (p<0.05)
[Figure labels: Homo sapiens, Pan troglodytes, Gorilla gorilla, Pongo pygmaeus, Macaca mulatta, Callithrix jacchus, Tarsius syrichta, Microcebus murinus, Otolemur garnettii]
![Page 18: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/18.jpg)
Results: some interesting terms
• Forebrain development, lifespan (and apoptosis), learning and social behavior in apes, including “deep” nodes
• Eye development in “higher” monkeys
• Terms to do with pregnancy
• Terms to do with male-male competition
• Etc. Etc. (…lots of hard-to-interpret molecular processes, of course…)
![Page 19: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/19.jpg)
“Brain genes”
![Page 20: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/20.jpg)
Visual system
• Primates have a highly variable visual system:
– Old World monkeys: three types of cones (unique among mammals)
– New World monkeys: females trichromatic, males dichromatic
![Page 21: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/21.jpg)
Biological conclusions
• Very, very, very, very preliminary: highest dN/dS ratios in functions for which there are multiple “optima” among primates:
– Different placentation systems
– Different mating systems
– Different visual systems
– Different life histories and brain mass investments
![Page 22: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/22.jpg)
Methodological conclusions
• Nine genomes is not that much. As FASTA files, it’s a 14Gb zipped archive (AA+cDNA).
• The problem was trivially parallelizable, so I didn’t use MPI versions of any software.
• Simple, consistent workflow and project design conventions are a lifesaver.
• Make each step small enough so you can rerun it, because you will.
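One way to keep steps cheap to rerun is the rule make itself applies: redo a step only when its output is missing or older than its input. The sketch below is a minimal version of that rule. The helper names and the uppercase "step" are hypothetical stand-ins, not part of the talk's workflow.

```python
import tempfile
from pathlib import Path


def needs_rerun(source, target):
    """True if target is missing or older than source."""
    source, target = Path(source), Path(target)
    return not target.exists() or target.stat().st_mtime < source.stat().st_mtime


def run_step(source, target, step):
    """Apply step to the source text and write target, only when needed.

    Returns True if the step ran, False if it was skipped.
    """
    if not needs_rerun(source, target):
        return False
    Path(target).write_text(step(Path(source).read_text()))
    return True


with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "gene1.fa"
    out = Path(tmp) / "gene1.upper.fa"
    src.write_text(">gene1\nmkv\n")
    print(run_step(src, out, str.upper))  # first call does the work
    print(run_step(src, out, str.upper))  # second call is skipped
```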
![Page 23: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/23.jpg)
Summary
• I discussed:
– Primate evolution and adaptation
– Ortholog-finding
– Alignment (multiple proteins, cDNA to protein)
– Tree-based dN/dS ratio tests
– Gene Ontology term enrichment
– Methodological challenges
![Page 24: How to make a monkey: functional adaptation in the primate genome](https://reader035.fdocuments.in/reader035/viewer/2022062418/5549470ab4c905194d8b5852/html5/thumbnails/24.jpg)
Acknowledgements
• Funding: FP7-PEOPLE-IEF-2008/N°237046
• DBCLS for their kind invitation
• Mark Pagel, Andrew Meade for discussion and help designing the workflow