Cracking the (bio)code -- Professional Development Session at SACNAS 2014
Cracking the (bio)code: Resources for research careers in computational biology & bioinformatics
Felipe Zapata, PhD, Brown University, @zapata_f
Conner Sandefur, PhD, Univ. North Carolina, @oshehoma
Emilia Huerta-Sanchez, PhD, Univ. California, Merced, @emiliahsc
Tracy Heath, PhD, Iowa State Univ., @trayc7
Visit our website: crackingthebiocode.github.io
● Information about the session
● Resources for learning to program: workshops, online courses, tutorials, etc.
● Links to many degree programs in the U.S. for studying computational biology/bioinformatics
● Profiles of computational biologists and bioinformaticians
How small changes can make a big difference
Bioinformatics @UNC-Pembroke: Investigating how changes in gene expression drive system-wide behavior
Computational Biology @UNC-Chapel Hill: Predicting therapies to improve mucus clearance in cystic fibrosis (CF) and chronic obstructive pulmonary disease (COPD)
[Figure: panels labeled "1 hr" and "24 hrs"; axis ticks -4, 0, 4]
Tools I use:
Dr. Conner I. Sandefur
SPIRE Postdoctoral Scholar at UNC-CH
Visiting Assistant Professor at UNCP
PhD Bioinformatics, University of Michigan, Ann Arbor, Michigan
BA Computer Science, George Washington University, Washington, DC
email: [email protected]
web: http://www.unc.edu/~sandefur
twitter: @oshehoma
What is the evolutionary history of species?
Using transcriptomes and genomes to resolve ancient animal radiations
Phylogeny of snails, slugs, and relatives
What genes are homologous?
Using graph-based approaches to infer homology
Gene clusters inferred to be the “same” gene family across multiple species
AGALMA: https://bitbucket.org/caseywdunn/agalma
BitBucket (Git)
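The graph-based idea on this slide can be sketched in a few lines of Python. This is a simplified illustration, not Agalma's implementation: it assumes that pairs of similar sequences have already been found (for example by an all-by-all sequence search), and it groups them by plain connected components, whereas real pipelines typically use more refined clustering. The function name `cluster_homologs` and the gene labels are hypothetical.

```python
from collections import defaultdict


def cluster_homologs(similar_pairs):
    """Group sequences into candidate gene families.

    similar_pairs: iterable of (seq_a, seq_b) tuples, one per pair of
    sequences judged similar. Returns a list of sets, one per connected
    component of the similarity graph.
    """
    # Build an undirected graph: sequences are nodes, similarities are edges.
    graph = defaultdict(set)
    for a, b in similar_pairs:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    families = []
    for start in graph:
        if start in seen:
            continue
        # Depth-first traversal collects everything reachable from start.
        family, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node in family:
                continue
            family.add(node)
            stack.extend(graph[node] - family)
        seen |= family
        families.append(family)
    return families


if __name__ == "__main__":
    # Hypothetical similarity hits between genes from three species.
    pairs = [
        ("snail_g1", "slug_g1"),
        ("slug_g1", "squid_g1"),
        ("snail_g2", "slug_g2"),
    ]
    for family in cluster_homologs(pairs):
        print(sorted(family))
```

Each printed group is a candidate gene family: sequences connected, directly or indirectly, by similarity.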
Dr. Felipe Zapata
Postdoctoral Research Associate
Brown University
COLOMBIA
email: [email protected]
web: http://felipezapata.me
twitter: @zapata_f
PhD Ecology, Evolution & Systematics, University of Missouri-St. Louis, St. Louis, Missouri
BSc Biology, Universidad de Los Andes, Bogotá, Colombia
What does genetics tell us about human history?
Dr. Emilia Huerta-Sanchez
Assistant Professor
UC Merced
email: [email protected]
web: http://www.stat.berkeley.edu/~emiliahs
twitter: @emiliahsc
Postdoc in Integrative Biology and Statistics, UC Berkeley, Berkeley, CA
PhD Applied Mathematics, Cornell University, Ithaca, NY
BA Mathematics & French, Mills College, Oakland, CA
Modeling macro- & molecular evolutionary processes to infer phylogenetic relationships
● How have rates of molecular and morphological evolution changed across the tree of life?
● How do patterns of fossilization, preservation, and recovery change across different taxa?
● Can we detect relationships between geological events and species diversification?
● What are the evolutionary processes acting on different regions of the genome and how have those factors shaped the evolution of different genes?
C++
RevBayes
Probabilistic graphical models
Dr. Tracy A. Heath
Assistant Professor (Jan. 2015)
Iowa State University
email: [email protected]
web: phyloworks.org
twitter: @trayc7
Postdoctoral Fellow, U. Kansas & U.C. Berkeley
PhD Ecology, Evolution & Behavior, University of Texas at Austin
BA Biology, Boston University
What is Computational Biology?
What is Bioinformatics?
http://crackingthebiocode.github.io/
Modeling infectious disease transmission
Compartmental models are one type of mathematical model used to investigate the spread of infectious disease
Susceptible (S) → Infected (I) → Recovered (R), governed by a rate of infection (S to I) and a rate of recovery (I to R)
Change in proportion of Susceptible (S) people over time = -β × Susceptible (S) × Infected (I), that is, dS/dt = -β S I
Infection dynamics for different diseases can be simulated by selecting appropriate parameters
We can use models to predict how interventions change disease transmission dynamics
Infection dynamics with R0 = 2
Infection dynamics after intervention at day 10, which reduced R0 to 0.8
R0 > 1: infection peaks, then disappears
R0 < 1: infection dies out
Simulations run in Python 3.4 (downloaded as part of Anaconda package: http://continuum.io/downloads)
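The simulations on these slides can be approximated with a short, dependency-free Python sketch. It is not the presenters' code. It assumes the standard SIR equations (dS/dt = -βSI, dI/dt = βSI - γI, dR/dt = γI) with R0 = β/γ, a recovery rate γ of 0.2 per day, an initial infected proportion of 0.001, and simple forward Euler integration; all of these values, and the function name `simulate_sir`, are illustrative assumptions.

```python
def simulate_sir(r0, gamma=0.2, i0=0.001, days=160, dt=0.1,
                 intervention_day=None, r0_after=None):
    """Integrate the SIR equations with forward Euler steps.

    r0 is the basic reproduction number (beta / gamma). If intervention_day
    is given, r0 is replaced by r0_after from that day onward.
    Returns a list of (time, S, I, R) tuples; S, I, R are proportions.
    """
    switch_step = None
    if intervention_day is not None:
        if r0_after is None:
            raise ValueError("r0_after is required when intervention_day is set")
        switch_step = int(round(intervention_day / dt))

    s, i, r = 1.0 - i0, i0, 0.0
    history = [(0.0, s, i, r)]
    for step in range(int(round(days / dt))):
        # Use the post-intervention R0 once the intervention has started.
        after = switch_step is not None and step >= switch_step
        beta = (r0_after if after else r0) * gamma
        new_infections = beta * s * i * dt   # -dS over this step
        new_recoveries = gamma * i * dt      # +dR over this step
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append(((step + 1) * dt, s, i, r))
    return history


if __name__ == "__main__":
    scenarios = {
        "R0 = 2": simulate_sir(2.0),
        "R0 = 0.8": simulate_sir(0.8),
        "R0 = 2, reduced to 0.8 at day 10": simulate_sir(
            2.0, intervention_day=10, r0_after=0.8),
    }
    for label, run in scenarios.items():
        peak_time, peak_infected = max(
            ((t, i) for t, _, i, _ in run), key=lambda pair: pair[1])
        print(label, "peak infected proportion:", round(peak_infected, 4),
              "at day", round(peak_time, 1))
```

With R0 above 1 the infected proportion rises to a peak before declining; with R0 below 1 it declines from the start. Plotting each `history` list (for example with matplotlib) reproduces the shape of the curves described on the slides.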
Agalma: automated and reproducible phylogenetic analyses
From… a few key genes (e.g. 16S rRNA, mitochondria, chloroplasts) across many species
To… high-throughput sequencing of 1000s of genes across many species
[Figure: two data matrices, each with axes labeled "genes" and "species"]
Phylogenetics
Challenges to phylogenetics
• Many steps
• Many programs must be used together
• Computationally intensive
• Difficult to reproduce
Automate!
Why automate?
• Results are reproducible
• Results can be easily explored and extended
• Methods can be compared in a controlled setting
• Facilitate method development without reinventing everything
The tool: https://bitbucket.org/caseywdunn/agalma
The paper
The example analysis: https://bitbucket.org/caseywdunn/dunnhowisonzapata2013/
For each transcriptome:
• Quality control
• Assemble transcriptome
• Translate and annotate genes
• Quantify gene expression
• Put sequences in database
Can also:
• Import DNA sequences from national databases (e.g., NCBI)
• Process externally produced assemblies
Across transcriptomes (many species):
• Identify homologous genes
• Build phylogenies using all genes!
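One common way to build a phylogeny from many genes is to first concatenate the per-gene alignments into a single species-by-site matrix, filling in gaps where a species lacks a gene. The sketch below illustrates only that concatenation step under simple assumptions: each gene alignment is already a dictionary of equal-length aligned sequences, and the function name `build_supermatrix` and the toy sequences are hypothetical. It is not taken from Agalma.

```python
def build_supermatrix(gene_alignments, missing="-"):
    """Concatenate per-gene alignments into one row per species.

    gene_alignments: list of dicts mapping species name -> aligned sequence.
    Within one gene, all sequences must have the same length. Species that
    lack a gene receive a run of the `missing` character for that gene.
    """
    species = sorted({name for gene in gene_alignments for name in gene})
    rows = {name: [] for name in species}
    for gene in gene_alignments:
        lengths = {len(seq) for seq in gene.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a gene must have equal length")
        width = lengths.pop()
        for name in species:
            rows[name].append(gene.get(name, missing * width))
    return {name: "".join(parts) for name, parts in rows.items()}


if __name__ == "__main__":
    # Two toy gene alignments; squid lacks gene 1 and slug lacks gene 2.
    genes = [
        {"snail": "ACGT", "slug": "ACGA"},
        {"snail": "GG", "squid": "GC"},
    ]
    for name, row in build_supermatrix(genes).items():
        print(name, row)
```

The resulting matrix has one row per species and one block of columns per gene, which is the kind of input that tree-building programs generally accept.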
Silhouette images from http://phylopic.org/
What tools do you need?
A biological question
programming skills
statistical modeling
C++
a mathematical model
Questions?
• What programming language should I learn?
• How do I get started learning a programming language?
• What is the best way to become proficient in a programming language?
• What is the difference between C++ and Python and Java and R and MATLAB and Ruby and ...?
• What is version control? Do I need to know it?
• Do I need a GitHub account?
• Where are jobs or degree programs in computational biology/bioinformatics listed?
• What does it mean to be open source? Why is it important?
• and ...?
Take-Home Messages
• You don’t have to be an expert programmer to do computational biology.
• Anyone can learn to program; it’s just a matter of getting started.
• Computational skills are extremely helpful for streamlining biology research.
• The skills you need to learn depend heavily on your background and your research interests.
• Quantitative skills (a firm understanding of math and statistics) are important for any research field.
• Don’t be overwhelmed by all there is to know; these skills grow over time. If you consistently seek to improve them and use them in your work, you will be amazed at how your expertise develops.
Find out more! http://crackingthebiocode.github.io/profiles.html