BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin [email protected]
![Page 1: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/1.jpg)
BIOC3010: Bioinformatics - Revision lecture
Dr. Andrew C.R. Martin
http://www.bioinf.org.uk/
![Page 2: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/2.jpg)
Data Creation
Analysis
Prediction
Presentation
Searching
Organizing
Sequences
DNA
Protein
Computers
Structures
![Page 3: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/3.jpg)
Introductory Lecture
![Page 4: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/4.jpg)
Introduction
• helps you create data
• example of fragment assembly
Bioinformatics…
![Page 5: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/5.jpg)
Introduction
• provides tools to store and search data
• databases and databanks
• primary/secondary/composite/gateways
Bioinformatics…
![Page 6: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/6.jpg)
Introduction
• allows you to make predictions
• prediction techniques
– moving windows
– computer learning
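As a rough illustration of the moving-window idea, the sketch below averages a per-residue property over a fixed-width window that slides along a protein sequence. The Kyte-Doolittle hydropathy scale, the function name `window_averages` and the default window of 7 are assumptions chosen for the example, not something the lecture specifies.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this illustration).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_averages(sequence, window=7):
    """Return the mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in sequence.upper()]
    # One value per window start; windows that would run off the end are skipped.
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    for start, value in enumerate(window_averages("MKTLLILAVVAAALA", window=5)):
        print(start, round(value, 2))
```

A run of high averages would suggest a hydrophobic stretch. Real predictors typically combine several such properties or feed the windowed values into a learned model.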
Bioinformatics…
![Page 7: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/7.jpg)
Introduction
• allows you to create 3D models
• separate lecture
Bioinformatics…
![Page 8: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/8.jpg)
Introduction
• allows transfer of annotations
• homologous proteins likely to perform similar functions
Bioinformatics…
![Page 9: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/9.jpg)
Introduction
Annotations…
• Pre-genome world
• Post-genome world
• Annotations will change
![Page 10: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/10.jpg)
Genomes and Gene Prediction Lectures
![Page 11: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/11.jpg)
Genome structure
• C-value paradox
• Compare prokaryotes and eukaryotes
• Complexity of eukaryotes:
– Introns/exons
– Repeated sequences
– Transposable elements
– Pseudogenes
• Problems introduced by these...
![Page 12: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/12.jpg)
ORF Scanning in Eukaryotes
5’ - exon - intron - exon - 3’
Intron/exon splice sites
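To make the difficulty concrete, here is a minimal sketch of naive ORF scanning on the forward strand: in each of the three reading frames it looks for an ATG followed in frame by a stop codon. The function name `find_orfs` and the `min_codons` threshold are illustrative assumptions. A scan like this ignores splice sites, so an intron that interrupts a coding region will break or truncate the ORF it reports.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end) index pairs of forward-strand ORFs.

    start is the index of the A in ATG; end is one past the stop codon.
    min_codons counts codons from ATG up to, but not including, the stop.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (pos - start) // 3 >= min_codons:
                    orfs.append((start, pos + 3))
                start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    sequence = "CCATGAAATTTTAGGG"
    for begin, end in find_orfs(sequence):
        print(begin, end, sequence[begin:end])
```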
![Page 13: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/13.jpg)
Finding Genes in Genomic DNA
Ab initio methods
Similarity-based methods
Integrated approaches
[Figure: example region labelled TRY4]
![Page 14: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/14.jpg)
Prediction accuracy
• Nucleotide level
• Exon level
• Measures for assessment
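The slide does not list the measures, so the following is only a sketch of one common convention at the nucleotide level: sensitivity as the fraction of truly coding positions that were predicted, and specificity as the fraction of predicted positions that are truly coding (the usage often seen in gene-prediction assessments, which other fields call precision). The function name and the representation of annotations as sets of positions are assumptions.

```python
def nucleotide_accuracy(true_coding, predicted_coding):
    """Return (sensitivity, specificity) from two collections of coding positions."""
    true_coding = set(true_coding)
    predicted_coding = set(predicted_coding)
    tp = len(true_coding & predicted_coding)   # coding and predicted coding
    fn = len(true_coding - predicted_coding)   # coding but missed
    fp = len(predicted_coding - true_coding)   # predicted but not coding
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    # Hypothetical annotation: true exon at 10..19, predicted exon at 15..24.
    print(nucleotide_accuracy(range(10, 20), range(15, 25)))
```

The same counting can be repeated at the exon level by treating each exon (with exact boundaries) as a single item instead of each nucleotide.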
![Page 15: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/15.jpg)
Computing Lecture
![Page 16: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/16.jpg)
Computers
![Page 17: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/17.jpg)
Operating systems
• What is an operating system?
• Examples of operating systems
• Choice of operating systems for different areas of research
![Page 18: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/18.jpg)
Computers and computer science
Must handle:
• Data structures and information retrieval
– Relational databases
– Design of databases to reduce errors in data
• Simple examples of SQL and structuring data into tables
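A small sketch of structuring data into tables and querying it with SQL, using Python's built-in `sqlite3` module. The table names, column names and accession codes are invented for illustration. The point is that each species name is stored once and referenced by key, which avoids the inconsistent spellings that creep in when the name is retyped for every protein.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE species (
    id   INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL
);
CREATE TABLE protein (
    accession  TEXT PRIMARY KEY,
    name       TEXT NOT NULL,
    species_id INTEGER NOT NULL REFERENCES species(id)
);
-- Hypothetical rows, not real database entries.
INSERT INTO species VALUES (1, 'Homo sapiens'), (2, 'Mus musculus');
INSERT INTO protein VALUES
    ('X0001', 'example protein A', 1),
    ('X0002', 'example protein B', 1),
    ('X0003', 'example protein C', 2);
""")

# Join the two tables to list all proteins from one species.
rows = conn.execute(
    """
    SELECT p.accession, p.name
    FROM protein AS p
    JOIN species AS s ON p.species_id = s.id
    WHERE s.name = ?
    ORDER BY p.accession
    """,
    ("Homo sapiens",),
).fetchall()

for accession, name in rows:
    print(accession, name)
```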
![Page 19: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/19.jpg)
Computers and computer science
Must handle:
• Algorithms: how to solve a problem
– Defined an algorithm
– Looked at an example
![Page 20: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/20.jpg)
Computers and computer science
Must handle:
• Data mining and machine learning
– Extract patterns, etc. from data
– Computer software which learns from examples and is then able to make predictions
![Page 21: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/21.jpg)
Comparative Modelling Lecture
![Page 22: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/22.jpg)
What is comparative modelling?
• Build a three-dimensional (3D) model of a protein...
• …based on known structure of a (generally) homologous protein sequence
• "Homology Modelling" is misleading:
– fold recognition and threading allow recognition of non-homologous sequences which adopt the same fold
![Page 23: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/23.jpg)
Stages in CM
1. Identify templates (or ‘parents’)
2. Align the target sequence with the parent(s)
3. Find:
– structurally conserved regions (SCRs)
– structurally variable regions (SVRs)
4. Inherit the SCRs from the parent(s)
5. Build the SVRs
6. Build the sidechains
7. Refine the model
8. Evaluate errors in the model
![Page 24: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/24.jpg)
Align target with parent(s)
The correct alignment is the structural alignment: the optimal alignment based on structural equivalents between the structure of the target and the structure of the parent.
We don’t have the structure of the target!
Guess the structural alignment from the sequence alignment.
![Page 25: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/25.jpg)
An example MLSA
![Page 26: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/26.jpg)
Sequence alignment quality
![Page 27: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/27.jpg)
Assessing the model
• Ideal is to compare the model with the true target structure - 4-6Å; 2Å; 0.5Å
$$\mathrm{RMS} = \sqrt{\frac{\sum_{i=1}^{N} d_i^2}{N}}$$
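The RMS deviation can be sketched in a few lines, assuming the model and the true structure have already been superposed and that the two coordinate lists hold equivalent atoms in the same order (for example, matching Cα atoms). The function name `rmsd` is an illustrative choice.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Sum of squared distances d_i^2 over all equivalent atom pairs.
    total = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    target = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
    print(rmsd(model, target))
```

In practice the optimal superposition has to be found first. This sketch deliberately leaves that step out.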
![Page 28: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/28.jpg)
Model quality
The main factors are:
• The sequence identity with the primary parent
• The number and size of indels
• The quality of the alignment
• The amount of change which has been necessary to the parent(s) to create the model
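Two of these factors can be read directly off a pairwise alignment. The sketch below assumes the target and parent are given as equal-length aligned strings with `-` for gaps, and it computes identity over the columns where both sequences have a residue. That denominator is one of several conventions in use, and the function name is an assumption.

```python
def alignment_stats(aligned_target, aligned_parent):
    """Return (percent identity, number of gapped columns) for an aligned pair."""
    if len(aligned_target) != len(aligned_parent):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    aligned_pairs = 0
    gap_columns = 0
    for t, p in zip(aligned_target, aligned_parent):
        if t == "-" or p == "-":
            gap_columns += 1          # column belongs to an indel
        else:
            aligned_pairs += 1
            if t == p:
                identical += 1
    identity = 100 * identical / aligned_pairs if aligned_pairs else 0.0
    return identity, gap_columns


if __name__ == "__main__":
    print(alignment_stats("ACD-FG", "ACDEYG"))
```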
![Page 29: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/29.jpg)
Summary of CASP2 results
CASP8 ran summer 2008
http://predictioncenter.gc.ucdavis.edu
![Page 30: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/30.jpg)
Medical Applications Lecture
![Page 31: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/31.jpg)
Mutations, Alleles & Polymorphisms
• Mutation:
– any change in DNA sequence
• Allele:
– alternative form of a genetic locus; one inherited from each parent
– e.g. eye colour locus - brown and blue alleles
• Polymorphism:
– genetic variation present in >= 1% of a normal population
![Page 32: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/32.jpg)
How are SNPs useful?
• Understanding evolution
– Some alleles may be advantageous in one environment, but disadvantageous in another
• DNA fingerprinting
• Markers to map traits
– diseases, characteristics
• Pharmacogenomics
– genotype-specific medications
![Page 33: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/33.jpg)
Drug responses
Drug efficacy may be affected by:
• transporters
• metabolism
• receptors
• signalling pathways, etc.
![Page 34: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/34.jpg)
Potentially lethal SNPs
First described ~2000 years ago
“What is food to some men may be fierce poison to others”
Lucretius Carus
![Page 35: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/35.jpg)
DNA Sequence → Protein Sequence → Protein Structure → Protein Function
Mutation → Altered Sequence → Altered Structure → Altered Function
Understand Structure & Function → Design Drugs → Restore Structure → Restored Function
![Page 36: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/36.jpg)
Types of mutations
• Looked at p53...
• Local level - effects of mutations
• General classes
– Functional
– Fold Preventing
– Destabilizing
![Page 37: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/37.jpg)
![Page 38: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/38.jpg)
How human?
Chimeric: 67% human
Humanized: 90% human
Mouse: 0% human
![Page 39: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/39.jpg)
Antibody Humanization
![Page 40: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/40.jpg)
Summary
– Diagnosis of disease
– Prediction of disease risk
– Prognosis
– Customized response to disease
– Identifying drug targets - treatments
– Engineering of proteins for therapy
![Page 41: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/41.jpg)
Docking and Drug Design Lecture
![Page 42: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/42.jpg)
Surface complementarity
• Van der Waals forces
• Electrostatic (salt bridge) interaction
• Hydrogen bonds
• Hydrophobic bonding
[Figure: complementary positive and negative charges on the two surfaces]
![Page 43: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/43.jpg)
Docking methods - rigid body
Six degrees of freedom
- protein and ligand both treated as rigid
- 3 rotations / 3 translations
Just like docking the space shuttle with a satellite
Image from NASA
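The six degrees of freedom can be sketched as three rotation angles plus a three-component translation applied to every ligand atom. In the example below the rotation order (about x, then y, then z) and the function names are assumptions. Because the ligand is rigid, all interatomic distances are unchanged by the transformation.

```python
import math


def rotate(point, rx, ry, rz):
    """Rotate a point about the x, then y, then z axis (angles in radians)."""
    x, y, z = point
    y, z = y * math.cos(rx) - z * math.sin(rx), y * math.sin(rx) + z * math.cos(rx)
    x, z = x * math.cos(ry) + z * math.sin(ry), -x * math.sin(ry) + z * math.cos(ry)
    x, y = x * math.cos(rz) - y * math.sin(rz), x * math.sin(rz) + y * math.cos(rz)
    return x, y, z


def place_ligand(coords, rotation, translation):
    """Apply 3 rotations and 3 translations to a rigid set of atom coordinates."""
    rx, ry, rz = rotation
    tx, ty, tz = translation
    placed = []
    for point in coords:
        x, y, z = rotate(point, rx, ry, rz)
        placed.append((x + tx, y + ty, z + tz))
    return placed


if __name__ == "__main__":
    ligand = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
    for atom in place_ligand(ligand, (0.0, 0.0, math.pi / 2), (0.0, 0.0, 5.0)):
        print([round(c, 3) for c in atom])
```

A rigid-body search would try many such (rotation, translation) combinations and score each resulting placement.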
![Page 44: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/44.jpg)
Docking methods - flexible ligand
Treat receptor as static / ligand as flexible
Dock ligand into binding pocket
- generate large number of possible orientations
Evaluate and select by energy function
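The "evaluate and select" step can be illustrated with a toy energy function: a Lennard-Jones 12-6 term plus a Coulomb term summed over all receptor-ligand atom pairs, with the lowest-energy pose selected. Every parameter value here (`epsilon`, `sigma`, the dielectric of 4) and the `(x, y, z, charge)` atom representation are assumptions for illustration. Real docking programs use much more carefully parameterised scoring functions.

```python
import math


def pair_energy(distance, q1, q2, epsilon=0.2, sigma=3.5,
                coulomb_k=332.0, dielectric=4.0):
    """Toy pairwise energy: Lennard-Jones 12-6 plus Coulomb (assumed parameters)."""
    ratio6 = (sigma / distance) ** 6
    lennard_jones = 4 * epsilon * (ratio6 ** 2 - ratio6)
    electrostatic = coulomb_k * q1 * q2 / (dielectric * distance)
    return lennard_jones + electrostatic


def score_pose(receptor_atoms, ligand_atoms):
    """Sum the pair energy over every receptor-ligand atom pair."""
    return sum(
        pair_energy(math.dist(r[:3], l[:3]), r[3], l[3])
        for r in receptor_atoms
        for l in ligand_atoms
    )


def best_pose(receptor_atoms, poses):
    """Select the candidate pose with the lowest energy."""
    return min(poses, key=lambda pose: score_pose(receptor_atoms, pose))


if __name__ == "__main__":
    receptor = [(0.0, 0.0, 0.0, -0.5)]            # atoms are (x, y, z, charge)
    poses = [
        [(1.0, 0.0, 0.0, 0.5)],                   # steric clash
        [(4.0, 0.0, 0.0, 0.5)],                   # close, favourable contact
        [(50.0, 0.0, 0.0, 0.5)],                  # too far to interact
    ]
    print(best_pose(receptor, poses))
```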
![Page 45: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/45.jpg)
Ligand Matching
• Match sphere centres against ligand atoms
• Find possible ligand orientations
• Often >10,000 orientations possible
Find the transformation (rotation + translation) to maximize sphere matching
DOCK
![Page 46: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/46.jpg)
Virtual Screening
• Docking can be used for virtual screening
• Scan a library of potential drug molecules
• Identify leads
![Page 47: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/47.jpg)
De Novo Drug Design
LUDI (InsightII) - find fragments that can bind
GRID - uses molecular mechanics potential to find interaction sites for probe groups
X-site - uses an empirical potential to find interaction sites for probe groups
![Page 48: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/48.jpg)
Stupid mistakes...
• Don't confuse secondary databases with secondary structure!
• Ensure you know the difference between SCOP/PFam functional domains and CATH structural domains
![Page 49: BIOC3010: Bioinformatics - Revision lecture Dr. Andrew C.R. Martin martin@biochem.ucl.ac.uk](https://reader034.fdocuments.in/reader034/viewer/2022051315/56649e205503460f94b0c93c/html5/thumbnails/49.jpg)
Summary
• Find pockets
• Principles for docking - complementarity
• Docking
– rigid body / ligand flexibility
• Virtual screening
• Identifying probe interaction sites
– build ligands de novo