![Page 1: Introduction to Bioinformaticsteaching.healthtech.dtu.dk/.../Intro+bioinformatics.pdf · 2018-08-27 · Introduction to Bioinformatics Henrik Nielsen, Associate Professor Bent Petersen,](https://reader034.fdocuments.in/reader034/viewer/2022042314/5ee1d044ad6a402d666c8e76/html5/thumbnails/1.jpg)
Introduction to Bioinformatics
Henrik Nielsen, Associate Professor
Bent Petersen, Associate Professor
DTU Bioinformatics
Department of Bio and Health Informatics
formerly known as:
Center for Biological Sequence Analysis
Overview
Data & Databases
• Taxonomy
• DNA
• Protein
• Protein structure
Methods
• Alignment
– Pairwise + Multiple
• BLAST (sequence searches)
– DNA / Protein
– PSI-Blast
• LOGOs
• Phylogenetic trees
• PyMOL (3D visualization)
What is bioinformatics?
What are bioinformaticians up to, actually?
• Manage molecular biological data
– Store in databases, organise, formalise, describe...
• Compare molecular biological data
• Find patterns in molecular biological data
– phylogenies
– correlations (sequence / structure / expression / function / disease)
Goals:
• characterise biological patterns & processes
• predict biological properties
– low level data ⇒ high level properties
(e.g., sequence ⇒ function)
Bioinformatics: neighbour disciplines
• Computational biology
– Broader concept: includes computational ecology, physiology, neurology etc...
• -omics:
– Genomics
– Transcriptomics
– Proteomics
• Systems biology
– Putting it all together...
– Building models, identifying control & regulation
Bioinformatics: prerequisites
• Bio- side:
– Molecular biology
– Cell biology
– Genetics
– Evolutionary theory
• -informatics side:
– Computer science
– Statistics
– Theoretical physics
Molecular biology data...
>alpha-D
ATGCTGACCGACTCTGACAAGAAGCTGGTCCTGCAGGTGTGGGAGAAGGTGATCCGCCAC
CCAGACTGTGGAGCCGAGGCCCTGGAGAGGTGCGGGCTGAGCTTGGGGAAACCATGGGCA
AGGGGGGCGACTGGGTGGGAGCCCTACAGGGCTGCTGGGGGTTGTTCGGCTGGGGGTCAG
CACTGACCATCCCGCTCCCGCAGCTGTTCACCACCTACCCCCAGACCAAGACCTACTTCC
CCCACTTCGACTTGCACCATGGCTCCGACCAGGTCCGCAACCACGGCAAGAAGGTGTTGG
CCGCCTTGGGCAACGCTGTCAAGAGCCTGGGCAACCTCAGCCAAGCCCTGTCTGACCTCA
GCGACCTGCATGCCTACAACCTGCGTGTCGACCCTGTCAACTTCAAGGCAGGCGGGGGAC
GGGGGTCAGGGGCCGGGGAGTTGGGGGCCAGGGACCTGGTTGGGGATCCGGGGCCATGCC
GGCGGTACTGAGCCCTGTTTTGCCTTGCAGCTGCTGGCGCAGTGCTTCCACGTGGTGCTG
GCCACACACCTGGGCAACGACTACACCCCGGAGGCACATGCTGCCTTCGACAAGTTCCTG
TCGGCTGTGTGCACCGTGCTGGCCGAGAAGTACAGATAA
>alpha-A
ATGGTGCTGTCTGCCAACGACAAGAGCAACGTGAAGGCCGTCTTCGGCAAAATCGGCGGC
CAGGCCGGTGACTTGGGTGGTGAAGCCCTGGAGAGGTATGTGGTCATCCGTCATTACCCC
ATCTCTTGTCTGTCTGTGACTCCATCCCATCTGCCCCCATACTCTCCCCATCCATAACTG
TCCCTGTTCTATGTGGCCCTGGCTCTGTCTCATCTGTCCCCAACTGTCCCTGATTGCCTC
TGTCCCCCAGGTTGTTCATCACCTACCCCCAGACCAAGACCTACTTCCCCCACTTCGACC
TGTCACATGGCTCCGCTCAGATCAAGGGGCACGGCAAGAAGGTGGCGGAGGCACTGGTTG
AGGCTGCCAACCACATCGATGACATCGCTGGTGCCCTCTCCAAGCTGAGCGACCTCCACG
CCCAAAAGCTCCGTGTGGACCCCGTCAACTTCAAAGTGAGCATCTGGGAAGGGGTGACCA
GTCTGGCTCCCCTCCTGCACACACCTCTGGCTACCCCCTCACCTCACCCCCTTGCTCACC
ATCTCCTTTTGCCTTTCAGCTGCTGGGTCACTGCTTCCTGGTGGTCGTGGCCGTCCACTT
CCCCTCTCTCCTGACCCCGGAGGTCCATGCTTCCCTGGACAAGTTCGTGTGTGCCGTGGG
CACCGTCCTTACTGCCAAGTACCGTTAA
• DNA sequences
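Sequence data like the above is usually stored in FASTA format: a header line starting with ">" followed by one or more lines of sequence. A minimal sketch of reading such text in Python, assuming that convention; the function names `parse_fasta` and `gc_content` are our own, and the toy input holds only the first 30 bases of each sequence shown on the slide:

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]      # drop the leading ">"
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


def gc_content(seq):
    """Fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy input: only the first 30 bases of each sequence on the slide.
example = """>alpha-D
ATGCTGACCGACTCTGACAAGAAGCTGGTC
>alpha-A
ATGGTGCTGTCTGCCAACGACAAGAGCAAC
"""

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq), round(gc_content(seq), 2))
```

For real work, an established library parser is normally a safer choice than a hand-written one.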
Molecular biology data...
• Amino acid sequences
• Protein structure:
– X-ray crystallography
– NMR
Cell biology & proteomics data...
• Subcellular localization
Cell biology & proteomics data...
• Protein-protein interactions
Transcriptomics: DNA microarray technology
Phenotype data: human diseases
Prediction methods
• Homology / Alignment
• Simple pattern (“word”) recognition
• Statistical methods
– Weight matrices: calculate amino acid probabilities
– Other examples: Regression, variance analysis, clustering
• Machine learning
– Like statistical methods, but parameters are estimated by iterative training rather than direct calculation
– Examples: Neural Networks (NN), Hidden Markov Models (HMM), Support Vector Machines (SVM)
• Combinations
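As an illustration of the "direct calculation" behind a weight matrix, the sketch below counts how often each amino acid occurs at each position of a set of equal-length peptides and turns the counts into probabilities. This is a simplified assumption of what such a matrix holds (practical weight matrices usually go on to compute log-odds scores against background frequencies); the function name `position_probabilities` and the peptides are made up for the example:

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def position_probabilities(aligned_seqs, pseudocount=0.0):
    """Return one dict per position mapping amino acid -> probability."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all sequences must have the same length")
    # Denominator: number of sequences plus the pseudocounts added per letter.
    total = len(aligned_seqs) + pseudocount * len(AMINO_ACIDS)
    matrix = []
    for pos in range(length):
        counts = Counter(seq[pos] for seq in aligned_seqs)
        matrix.append(
            {aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS}
        )
    return matrix


# Hypothetical, made-up peptides used only to exercise the function.
peptides = ["ALAK", "ALGK", "AVAK", "SLAR"]
matrix = position_probabilities(peptides)
print(matrix[0]["A"])  # 3 of the 4 peptides have A at the first position
print(matrix[3]["R"])  # 1 of the 4 peptides has R at the last position
```

Every parameter here is computed in one pass over the data, which is the contrast with machine learning methods that estimate their parameters by iterative training.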
The computer
• Everything can be reduced to bits (0 or 1)
Digital information
• A byte = 8 bits
0 1 0 0 0 0 0 1
Can be interpreted as
• The number 65
• The letter "A"
• Part of a machine code instruction
• Part of a colour specification
• Part of a sound encoding
• …
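The first two interpretations can be checked directly in Python. A small sketch using only built-in functions; the variable names are our own:

```python
bits = "01000001"        # the byte shown above, written as a string of bits

value = int(bits, 2)     # interpret the bits as an unsigned integer
letter = chr(value)      # interpret the same value as a character code
raw = bytes([value])     # the same value as a one-byte bytes object

print(value)                 # 65
print(letter)                # A
print(raw)                   # b'A'
print(format(value, "08b"))  # back to the original bit pattern
```

The bits never change; only the interpretation applied to them does.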
Text files
A text file is a file where every byte is interpreted as a character
Examples
Plain text .txt
Program settings .ini
C source code .c
Python script .py
TeX source .tex
Web page source .html
Sequences .fasta
The ASCII table
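To see that a text file really is just a sequence of byte values, one can write a short file and read it back in binary mode. A minimal sketch, assuming a made-up two-line FASTA record and a temporary file named `example.fasta` (both hypothetical):

```python
import os
import tempfile

text = ">seq1\nACGT\n"  # hypothetical two-line FASTA record

# Write the text to a file, then read the same file back as raw bytes.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "example.fasta")
    # newline="\n" stops Python from translating line endings on Windows.
    with open(path, "w", encoding="ascii", newline="\n") as handle:
        handle.write(text)
    with open(path, "rb") as handle:
        raw = handle.read()

byte_values = list(raw)             # one integer (0-255) per byte
print(byte_values)                  # ASCII codes, e.g. 65 for "A", 10 for LF
print(raw.decode("ascii") == text)  # decoding the bytes gives the text back
```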
Extended character sets
There are many ways to interpret characters with
values above 127. Here, you see two of them.
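A quick way to see the ambiguity is to decode one and the same byte with different 8-bit encodings. The sketch below uses three encodings that ship with Python; the choice of these three is ours and does not necessarily match the two tables on the slide:

```python
import unicodedata

byte = bytes([0xE6])  # a single byte with value 230, above the ASCII range

encodings = ("latin-1", "cp437", "mac_roman")
decoded = {enc: byte.decode(enc) for enc in encodings}

for enc, char in decoded.items():
    # Print the Unicode code point and name rather than the glyph itself,
    # so the output does not depend on the terminal's own encoding.
    print(enc, hex(ord(char)), unicodedata.name(char))
```

Each encoding maps the byte to a different character, which is why a file with values above 127 can look wrong when opened with the wrong character set.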
Text files—line endings
• UNIX standard (including Mac OS X):
• 10 — LF ("Line feed" char).
• Old Mac (Mac OS 9 and before):
• 13 — CR ("Carriage Return" char).
• DOS/Windows:
• 13, 10 — both CR and LF.
A good text editor can handle all three systems.
Older versions of Notepad for Windows (before Windows 10 version 1809) cannot!
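The three conventions are easy to tell apart once a file is read as bytes. A minimal sketch that counts each kind of line ending and converts everything to the UNIX convention; the function names `detect_line_endings` and `to_unix` are our own:

```python
def detect_line_endings(data: bytes) -> dict:
    """Count CRLF, lone CR and lone LF line endings in data."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf  # CRs that are not part of a CRLF pair
    lf = data.count(b"\n") - crlf  # LFs that are not part of a CRLF pair
    return {"CRLF": crlf, "CR": cr, "LF": lf}


def to_unix(data: bytes) -> bytes:
    """Convert all line endings to LF. CRLF is replaced first, so the
    CR of a CRLF pair is not turned into a second line break."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


sample = b"unix\nold mac\rwindows\r\n"
print(detect_line_endings(sample))  # {'CRLF': 1, 'CR': 1, 'LF': 1}
print(to_unix(sample))              # b'unix\nold mac\nwindows\n'
```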
jEdit
www.jedit.org