CS273 Algorithms for Structure and Motion in Biology
![Page 1: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/1.jpg)
CS273 Algorithms for Structure and Motion in Biology
Instructors:
Serafim Batzoglou and Jean-Claude Latombe
Teaching Assistant: Sam Gross
| serafim | latombe | ssgross | @ cs.stanford.edu
Spring 2006 – http://www.stanford.edu/class/cs273/
![Page 2: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/2.jpg)
Need a Scribe!!
![Page 3: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/3.jpg)
Range of Bio-CS Interaction
Biological scales: gene, molecules, cells, tissue/organs, body system
Computational topics: sequence alignment; molecular structures, similarities, and motions; simulation of cell interaction; soft-tissue simulation and surgical training; robotic surgery
CS273
Enormous range over space and time
![Page 4: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/4.jpg)
Focus on Proteins
Proteins are the workhorses of all living organisms
They perform many vital functions, e.g.:
• Catalysis of reactions
• Transport of molecules
• Building blocks of muscles
• Storage of energy
• Transmission of signals
• Defense against intruders
![Page 5: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/5.jpg)
Proteins are also of great interest from a computational viewpoint
• They are large molecules (a few hundred to several thousand atoms)
• They are made of building blocks drawn from a small “library” of 20 amino-acids
• They have an unusual kinematic structure: a long serial linkage (backbone) with short side-chains
![Page 6: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/6.jpg)
Proteins are associated with many challenging problems
• Predict folded structures and motion pathways
• Understand why some proteins misfold or partially fold, causing diseases such as cystic fibrosis, Parkinson's disease, and Creutzfeldt-Jakob disease (mad cow)
• Find structural similarities among proteins and classify proteins
• Find functional structural motifs in proteins
• Predict how proteins bind to other proteins and to smaller molecules
• Design new drugs
• Engineer and design proteins and protein-like structures (polymers)
![Page 7: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/7.jpg)
Central Dogma of Molecular Biology
![Page 8: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/8.jpg)
Central Dogma of Molecular Biology
transcription
translation
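The two labeled steps can be illustrated with a minimal Python sketch: transcription copies the DNA coding strand into mRNA (T becomes U), and translation reads the mRNA three bases at a time. The function names, the example DNA string, and the deliberately partial codon table are assumptions made for illustration; they are not from the course material.

```python
# Partial codon table from the standard genetic code.
# Only the codons used in the example are listed (an illustrative assumption).
CODON_TABLE = {
    "AUG": "M",  # methionine, also the start codon
    "UUU": "F",  # phenylalanine
    "GGC": "G",  # glycine
    "AAA": "K",  # lysine
    "UAA": "*",  # stop codon
}


def transcribe(dna_coding_strand: str) -> str:
    """DNA coding strand -> mRNA: replace thymine (T) with uracil (U)."""
    return dna_coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Read codons from position 0 until a stop codon or the end of the string."""
    residues = []
    for i in range(0, len(mrna) - 2, 3):
        # Raises KeyError for codons missing from the partial table above.
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        residues.append(amino_acid)
    return "".join(residues)


# Hypothetical 15-base coding sequence.
mrna = transcribe("ATGTTTGGCAAATAA")
protein = translate(mrna)
print(mrna, protein)
```

A real translation routine would use the full 64-codon table and locate the start codon first; both are left out here to keep the sketch short.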
![Page 9: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/9.jpg)
Protein Sequence
[Figure: polypeptide backbone with N and O atoms labeled; residue i-1 marked]
Long sequence of amino-acids (dozens to thousands), also called residues
Dictionary of 20 amino-acids (several billion years old)
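From a computational viewpoint, a protein sequence is a string over a 20-letter alphabet. The sketch below validates such a string against the standard one-letter amino-acid codes and counts residues. The function name and the example fragment are hypothetical.

```python
from collections import Counter

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def residue_counts(sequence: str) -> Counter:
    """Count residues, rejecting letters outside the 20-letter alphabet."""
    seq = sequence.upper()
    unknown = set(seq) - AMINO_ACIDS
    if unknown:
        raise ValueError(f"not standard amino-acid codes: {sorted(unknown)}")
    return Counter(seq)


# Hypothetical 10-residue fragment, used only as an example input.
counts = residue_counts("MKTAYIAKQR")
print(counts.most_common(3))
```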
![Page 10: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/10.jpg)
Protein Sequence
[Figure: polypeptide backbone with N and O atoms labeled]
Peptide bond (partial double-bond character)
![Page 11: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/11.jpg)
Central Dogma of Molecular Biology
Physiological conditions: aqueous solution, 37°C, pH 7, atmospheric pressure
![Page 12: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/12.jpg)
Levels of Protein Structures
hemoglobin (4 polypeptide chains)
Quaternary
![Page 13: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/13.jpg)
Mostly α-helices
Mostly β-sheets
Mixed
![Page 14: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/14.jpg)
Folding
Unfolded (denatured) state
Intermediate states
Folded (native) state
Many pathways
![Page 15: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/15.jpg)
http://www-shakh.harvard.edu/ProFold2.html
How (we think) a protein folds ...
$\Delta G = \Delta H - T\,\Delta S$
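The free-energy relation on this slide, read as ΔG = ΔH − TΔS, can be evaluated directly. The sketch below uses made-up values for ΔH and ΔS (they are not measurements for any real protein) at 37°C, to show how a negative entropy change partly cancels a favorable enthalpy change.

```python
def gibbs_free_energy(delta_h: float, temperature: float, delta_s: float) -> float:
    """Return delta_G = delta_H - T * delta_S (units must be consistent)."""
    return delta_h - temperature * delta_s


T_BODY = 310.15  # 37 degrees Celsius in kelvin

# Hypothetical folding values: delta_H in kJ/mol, delta_S in kJ/(mol*K).
delta_g = gibbs_free_energy(delta_h=-200.0, temperature=T_BODY, delta_s=-0.6)

# A negative delta_G means the folded state is favored under these assumptions.
print(f"delta_G = {delta_g:.2f} kJ/mol")
```

With these illustrative numbers the enthalpy gain of about 200 kJ/mol is almost entirely offset by the entropy loss, leaving a small net stabilization.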
![Page 16: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/16.jpg)
![Page 17: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/17.jpg)
![Page 18: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/18.jpg)
![Page 19: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/19.jpg)
![Page 20: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/20.jpg)
Motion of Proteins in Folded State
HIV-1 protease
![Page 21: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/21.jpg)
Structural variability of the overall ensemble of native ubiquitin structures
[Shehu, Kavraki, Clementi, 2005]
![Page 22: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/22.jpg)
Amylosucrase
Flexible Loop
Loop 7
![Page 23: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/23.jpg)
Central Dogma of Molecular Biology
![Page 24: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/24.jpg)
Binding
Inhibitor binding to HIV protease
Protein-protein binding
Ligand-protein binding
![Page 25: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/25.jpg)
Binding of Pyruvate to LDH
(reduction of pyruvate to lactate)
[Figure: pyruvate in the lactate dehydrogenase environment. Labeled residues: ASP-195, HIS-193, ASP-166, ARG-169, THR-245, GLN-101, ARG-106 (loop). Coenzyme: NADH, nicotinamide adenine dinucleotide.]
![Page 26: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/26.jpg)
What is CS273 about?
Algorithms and computational schemes for molecular biology problems
Molecular biology seen by computer scientists
![Page 27: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/27.jpg)
The Shock of Two Cultures
y = f(x)
• Biologists like experiments, specifics, and classifications
  They would rather know many (x_i, y_i), i.e., facts, and classify them than know f
• Computer scientists like simulation, abstractions, and general algorithms
  They want to know f, the explanation of the facts, and efficient ways to compute it, but rarely care about any particular (x_i, y_i)
One challenge of Computational Biology is to fuse these two cultures
![Page 28: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/28.jpg)
Two Views of a BioComputation Class
Where are IT resources for biology available and how to use them
How to design efficient data structures and algorithms for biology
![Page 29: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/29.jpg)
Main Ideas Behind CS273
1. The information is in the sequence
• Sequence → Structure (shape) → Function
• Sequence similarity → Structural/functional similarity
• Sequences are related by evolution
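Sequence similarity is usually measured by alignment. As a hedged illustration, the sketch below computes a global alignment score with the classic Needleman-Wunsch dynamic program. The scoring parameters (match +1, mismatch -1, gap -2) are arbitrary choices for the example, not the scheme used in the course; practical protein alignment uses substitution matrices and more elaborate gap penalties.

```python
def global_alignment_score(a: str, b: str,
                           match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Needleman-Wunsch global alignment score, keeping one DP row at a time."""
    # Row 0: aligning the empty prefix of `a` against prefixes of `b` costs gaps only.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # prefix of `a` aligned against the empty prefix of `b`
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            gap_in_b = prev[j] + gap      # a[i-1] aligned to a gap
            gap_in_a = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diagonal, gap_in_b, gap_in_a))
        prev = curr
    return prev[-1]


# Hypothetical short sequences; the second lacks the residue D.
score = global_alignment_score("ACDE", "ACE")
print(score)
```

The running time is proportional to the product of the two sequence lengths, and memory is linear because only the previous row is kept.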
![Page 30: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/30.jpg)
2. Biomolecules move and bind to achieve their functions
• Deformation → folded structures of proteins
• Motion + deformation → multi-molecule complexes
• One cannot just “jump” from sequence to function
Protein folding
Ligand-protein binding
![Page 31: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/31.jpg)
Sequence → Structure → Function
sequence similarity
structure similarity
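One common way to quantify structure similarity is the root-mean-square deviation (RMSD) between corresponding atoms. The sketch below assumes the two structures are already superimposed and that atom i in one corresponds to atom i in the other; finding the optimal superposition is a separate step not shown here. The function name and coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) points, no superposition."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Two hypothetical 2-atom structures; the second is shifted by 1 along z.
structure_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
structure_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(structure_a, structure_b))
```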
![Page 32: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/32.jpg)
CS273 is about algorithms for sequence, structure and motion
- Finding sequence and shape similarities
- Relating structure to function
- Extracting structure from experimental data
- Computing and analyzing motion pathways
![Page 33: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/33.jpg)
Vision Underlying CS273
Goal of computational biology: low-cost, high-bandwidth in-silico biology
Requirements:
• Reliable models
• Efficient algorithms
Algorithmic efficiency by exploiting properties of molecules and processes:
• Proteins are long kinematic chains
• Atoms cannot bunch up together
• Forces have relatively short ranges
Computational Biology is more than applying computers to biological problems or mimicking nature (e.g., performing MD simulation)
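To illustrate how short-range forces and the fact that atoms cannot bunch up can be exploited, the sketch below finds all pairs of points within a cutoff distance using a uniform grid (a cell list) instead of testing every pair. Because each grid cell can hold only a bounded number of atoms, the work per atom stays roughly constant. This is a generic technique offered as an example, not necessarily the method presented in the course; the function name and test points are hypothetical.

```python
from collections import defaultdict
from itertools import product
import math


def close_pairs(points, cutoff):
    """Return sorted index pairs (i, j), i < j, with distance <= cutoff."""
    # Bin every point into a cubic cell whose side equals the cutoff.
    grid = defaultdict(list)
    for idx, p in enumerate(points):
        cell = tuple(math.floor(c / cutoff) for c in p)
        grid[cell].append(idx)

    pairs = set()
    for cell, members in grid.items():
        # Two points within the cutoff lie in the same cell or in adjacent cells.
        for offset in product((-1, 0, 1), repeat=3):
            neighbor = tuple(c + o for c, o in zip(cell, offset))
            # .get avoids creating new keys while iterating over the grid.
            for i in members:
                for j in grid.get(neighbor, ()):
                    if i < j and math.dist(points[i], points[j]) <= cutoff:
                        pairs.add((i, j))
    return sorted(pairs)


# Hypothetical coordinates: two well-separated clusters of two points each.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0), (5.0, 5.0, 5.5)]
print(close_pairs(atoms, cutoff=1.5))
```

The all-pairs alternative takes time quadratic in the number of atoms, which becomes the bottleneck for molecules with thousands of atoms.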
![Page 34: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/34.jpg)
Tentative Schedule

| # | Date | Topic |
|---|------|-------|
| 1 | April 5 | Introduction |
| 2 | April 10 | Protein geometric and kinematic models |
| 3 | April 12 | Conformational space |
| 4 | April 17 | Inverse kinematics and applications |
| 5 | April 19 | Sequence similarity |
| 6 | April 24 | Sequence similarity |
| 7 | April 26 | Sequence similarity |
| 8 | May 1 | Structure comparison |
| 9 | May 3 | Structure comparison |
| 10 | May 8 | Protein phylogeny, clustering, and classification |
| 11 | May 10 | Protein phylogeny, clustering, and classification |
| 12 | May 15 | Energy maintenance |
| 13 | May 17 | Energy maintenance |
| 14 | May 22 | Structure prediction |
| 15 | May 24 | Roadmap methods |
| 16 | May 31 | Structure prediction |
| 17 | June 5 | Structure prediction |
| 18 | June 7 | TBA |
| 19 | June 12 | Project presentations (2 hours) |
![Page 35: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/35.jpg)
Instructors and TAs
Instructors:
– Serafim Batzoglou
– Jean-Claude Latombe
TA:
– Sam Gross
Emails: | serafim | latombe | ssgross | @ cs.stanford.edu
Class website: http://cs273.stanford.edu
![Page 36: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/36.jpg)
Expected Work
Regular attendance at lectures and active participation
Class scribing (assignments will depend on the number of students)
Exciting programming project: http://www.stanford.edu/class/cs273/project/project.html
- Structure prediction
- Clustering and distance metrics
- Protein design
- Something else
![Page 37: CS273 Algorithms for Structure and Motion in Biology](https://reader036.fdocuments.in/reader036/viewer/2022062810/56815ae7550346895dc8aa81/html5/thumbnails/37.jpg)
Questions?