Workshop in Computational Structural Biology 2015 81855 & 81813, 4 points Ora Schueler-Furman TA:...
Workshop in Computational Structural Biology 2015
81855 & 81813, 4 points
Ora Schueler-Furman
TA: Orly Marcu
Introduction – When, Where, How?
• When & Where:
– Thursdays, Givat Ram
– Lecture: 14:00-15:45, Sprinzak 25
– Exercise: 16:00-18:45, Sprinzak computer class #2
– Lectures & exercises available in Moodle
• How:
– Make sure you have an account in CS ✓
• Exercises
– Submit 7/10 exercises
– Due within 2 weeks
– Submit by email to [email protected]
– 1/3 of grade
• Contact: Ora 87094 [email protected], or Orly 87063
Acknowledgements: Sources of figures and slides include slides from Branden & Tooze; some slides have been adapted from members of the Rosetta Community, especially from Jens Meiler. Exercises in PyRosetta have been adapted from teaching material by Jeff Gray.
What will we learn:
Part I: Protein structure in the eye of the computational biologist
1. Introduction to computational structural biology
• The basics of protein structure
• Challenges in computational biology and bioinformatics
• Protein structure prediction and design
2. Introduction to Rosetta and structural modeling
• Approaches for structural modeling of proteins
• The Rosetta framework and its prediction modes
• Cartesian and polar coordinates
• Sampling (find the structure) and scoring (select the structure)
3. Optimization techniques
• Energy minimization
• Monte Carlo (MC) sampling
• MC with minimization (MCM)
Part II: Protein modeling and design
4. Ab initio modeling: principles and approaches
5. Full-atom refinement
• Local optimization
• Side chain modeling
– The representation of side chains as rotamers
– Rotamer and off-rotamer sampling
– Finding minimum energy rotamer combinations
6. Homology modeling
• Selection of template and alignment of query sequence to template
• Loop modeling approaches (modeling of unaligned regions)
7. Protein design
• The theoretical basis of protein design; how different design goals are achieved
• Success and challenge in computational design
Part III: Protein interactions
8. Protein-protein docking
• Challenges and approaches in protein docking
• The theoretical basis of low-resolution and high-resolution docking
9. Interface analysis and design
• Determinants of binding affinity and specificity
• Identification of interface residue hotspots: computational alanine scanning
• Success and challenge in interface design
10. Summary
What will we learn: Exercises
Exercises will span a variety of subjects and involve both Rosetta and other widely-used protocols
• Basic introduction: how to look at proteins
• Protein structure evaluation and classification: What does my protein do, how good is its structure?
• Structure comparison
• Running Rosetta
• PyRosetta and RosettaScripts: running and programming
• Ab initio modeling
• Homology modeling
• Structure refinement
• Modeling side chains
• Loop modeling
• Protein docking
• Interface analysis – computational alanine scanning
• Protein design and protein interface design
1. Introduction to Computational Structural Biology
The Basics of Protein Structure
The central dogma
The code: 4 bases, 64 triplets, 20 amino acids
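The numbers on this slide follow from simple counting: four bases taken three at a time give 4 × 4 × 4 = 64 triplets, which is more than the 20 amino acids, so the code is redundant. A minimal Python sketch of that count (the variable names are illustrative only):

```python
from itertools import product

BASES = "ACGU"  # the four RNA bases

# Every ordered triplet of bases is one codon: 4 * 4 * 4 = 64.
codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]

print(len(codons))         # number of possible triplets
print(len(codons) / 20.0)  # average number of triplets available per amino acid
```

In the standard genetic code three of the 64 triplets are stop signals, so 61 triplets are shared among the 20 amino acids.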
4 Hierarchies of protein structure
• Anfinsen: sequence determines structure
The building blocks: 20 amino acids
• Differ in size, polarity, charge, secondary structure propensity …
Special amino acids
• Glycine: the simplest aa; no sc; very flexible bb
• Proline: cyclic aa; sc connects to bb N; very constrained bb
[Figure: chemical structures of glycine and proline]
Aliphatic amino acids
• sc contains only carbon and hydrogen atoms• hydrophobic
Amino acids with hydroxyl group
Negatively charged amino acids
Different size → different tendency for secondary structure
Amide amino acids
Positively charged amino acids
• large sc
• pKa 11.1 • pKa 12
Aromatic amino acids
• sc contains aromatic ring
• pKa 7
• benzene ring
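The pKa values quoted for the ionizable side chains (about 7, 11.1 and 12) can be related to charge state with the Henderson-Hasselbalch equation. The sketch below assumes an ideal, isolated titratable group; the function name and the choice of pH 7 are illustrative and not taken from the slides, and pKa values inside a folded protein can shift considerably.

```python
def fraction_protonated(pka: float, ph: float) -> float:
    """Fraction of a titratable group in its protonated form (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))

# pKa values as quoted on the slides; pH 7 is an assumed, roughly physiological value.
for pka in (7.0, 11.1, 12.0):
    print(f"pKa {pka:4.1f}: {fraction_protonated(pka, ph=7.0):.4f} protonated at pH 7")
```

Under these assumptions a group with pKa 7 is half protonated at pH 7, while groups with pKa 11.1 or 12 are almost fully protonated, which is why such side chains carry a positive charge.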
Amino acids with sulfur
Cystine
Oxidation of sulfur atoms creates a covalent disulfide bond (S-S bond) between two cysteines
S-S bonds stabilize the protein
Insulin
A chain: G I V E Q C C A S V C S L Y Q L E N E N Y C N
B chain: F V N Q H L C G S H L V E A L Y L V C G E R G F..
[Figure: S-S bonds linking cysteines within the A chain and between the A and B chains; N and C termini of both chains marked]
Post-translational modifications
• Processing (pro-insulin/insulin)– control of protein activity
• Glycosylation– protein trafficking
• Phosphorylation (Tyr, Ser, Thr) – regulation of signaling
• Methylation, Acetylation – histone tagging
• ….
Metal binding proteins
• aa: H, C, D, E
• Metals: Fe, Zn, Mg, Ca
• Fe
– blood: red hemoglobin
– electron transfer: cytochrome c
• Zn
– in DNA-binding “Zn-finger” proteins
– alcohol dehydrogenase: oxidation of alcohol
Important bonds for protein folding and stability
• Van der Waals force: dipole moments attract each other (transient and very weak: 0.1-0.2 kcal/mol)
• Hydrophobic interaction: hydrophobic groups/molecules tend to cluster together and shield themselves from the hydrophilic solvent
Hydrogen bonding potential of amino acids
Primary sequence: concatenated amino acids
Formation of a peptide bond
[Figure: formation of a peptide bond. Atoms shown in CPK colors: O - oxygen, N - nitrogen, H - hydrogen, C - carbon]
The geometry of the peptide backbone
• Peptide bond length and angles do not change
• Peptide dihedral angles define structure
• The peptide bond is planar & polar: ω = 180° (trans) or 0° (cis)
Dihedral angles
• Dihedral angle: defines geometry of 4 consecutive atoms (given bond lengths and angles)
• Dihedral angles χ1-χ4 define side chain conformation
[Figure from Wikipedia]
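A dihedral angle can be computed directly from the Cartesian coordinates of its four atoms. The following is a minimal sketch using only the Python standard library; the function name and the test points are illustrative and not taken from the slides. It projects the two outer bonds onto the plane perpendicular to the central bond and measures the signed angle between the projections.

```python
import math

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points given as (x, y, z) tuples."""
    def sub(a, b):
        return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

    def dot(a, b):
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

    def cross(a, b):
        return (a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0])

    b0 = sub(p0, p1)  # from atom 2 back to atom 1
    b1 = sub(p2, p1)  # central bond, the rotation axis
    b2 = sub(p3, p2)  # from atom 3 to atom 4
    norm = math.sqrt(dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)

    # Remove the components along the central bond, leaving the two
    # outer bonds as seen when looking down that bond.
    v = sub(b0, tuple(dot(b0, b1) * c for c in b1))
    w = sub(b2, tuple(dot(b2, b1) * c for c in b1))

    x = dot(v, w)
    y = dot(cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

# Atoms 1 and 4 on the same side of the central bond (cis) and on opposite sides (trans).
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1)))
```

The two calls correspond to the cis (0°) and trans (±180°) arrangements; applied to backbone atoms the same function gives φ, ψ and ω, and applied to side chain atoms it gives the χ angles.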
Ramachandran plot
[Figure: Ramachandran plots for glycine (flexible backbone) and for all residues except glycine]
Secondary structure: local interactions
Secondary structure – built from backbone hydrogen bonds
α-helix
• discovered 1951 by Pauling
• 5-40 aa long
• average: 10 aa
• right-handed
• O(i)-NH(i+4): bb atoms satisfied
• π helix: i - i+5
• 3₁₀ helix: i - i+3
• rise: 1.5 Å/res
Favored: Ala, Leu, Arg, Met, Lys
Disfavored: Asn, Thr, Cys, Asp, Gly
α-helix: dipole
• binds negative charges at N-terminus
[Figure: view down one helical turn]
α-helix: side chains point out
Frequent amino acids at the N-terminus of helices
• Pro: blocks the continuation of the helix by its side chain
• Asn, Ser: block the continuation of the helix by hydrogen bonding with the donor (NH) of N3
Positions: Ncap, N1, N2, N3 …….Ccap
Helices of different character
1. buried
2. partially exposed: amphipathic helix
3. exposed
Representation: helical wheel
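A helical wheel places each residue at its angular position when the helix is viewed down its axis; for an α-helix with 3.6 residues per turn, consecutive residues are 100° apart. The sketch below is illustrative only: the hydrophobic residue set, the `hydrophobic_bias` measure and the example peptide are assumptions made for this example, not material from the slides.

```python
import math

DEGREES_PER_RESIDUE = 100.0    # alpha-helix: 3.6 residues per 360-degree turn
HYDROPHOBIC = set("AVLIMFWC")  # simple illustrative classification (assumption)

def wheel_angles(sequence):
    """Angular position of each residue on the helical wheel, in degrees."""
    return [(i * DEGREES_PER_RESIDUE) % 360.0 for i in range(len(sequence))]

def hydrophobic_bias(sequence):
    """Length (0..1) of the mean unit vector over the hydrophobic residues.

    Values near 1 mean the hydrophobic residues cluster on one face of the
    helix (amphipathic); values near 0 mean they are spread around the wheel.
    """
    angles = [math.radians(angle)
              for angle, aa in zip(wheel_angles(sequence), sequence)
              if aa in HYDROPHOBIC]
    if not angles:
        return 0.0
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    return math.hypot(x, y)

peptide = "LKELLKKLLEELLKKL"  # hypothetical sequence, not from the slides
for aa, angle in zip(peptide, wheel_angles(peptide)):
    print(f"{aa} {angle:5.0f}")
print(f"hydrophobic bias: {hydrophobic_bias(peptide):.2f}")
```

A buried helix would be hydrophobic all around the wheel, an exposed helix polar all around, and a partially exposed (amphipathic) helix shows the hydrophobic residues grouped on one side.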
β-sheet
• Involves several regions in sequence
• O(i)-NH(j)
• Parallel and anti-parallel sheets
Favored: Tyr, Thr, Ile, Phe, Trp
Disfavored: Glu, Ala, Asp, Gly, Pro
Antiparallel β-sheet
• Parallel H-bonds
• Residue side chains point up/down/up ..
• Pleated
Parallel β-sheet
• less stable than antiparallel sheet
• angled H-bonds
Connecting elements of secondary structure define tertiary structure
Loops
• connect helices and strands
• at surface of molecule
• more flexible
• contain functional sites
Hairpin loops (β turns)
• Connect strands in antiparallel sheet
[Figure labels: G,N,D; G; G; S,T]
Super secondary structures – Greek key motif
• Most common topology for 2 hairpins
Super secondary structures – β-α-β motif
• connects strands in parallel sheet
• always right-handed
Repeated β-α-β motif creates β-meander: TIM barrel
Tertiary structure defines protein function
The quaternary structure of a protein defines its biological functional unit
Quaternary structure: Hemoglobin consists of 4 distinct chains
Quaternary structure: assembly of protein domains (from two distinct protein chains, or two domains in one protein sequence)
Glyceraldehyde phosphate dehydrogenase:
• domain 1 binds the substrate to be metabolized
• domain 2 binds a cofactor
1. Introduction to Computational Structural Biology
Experimental determination of protein structure: X-ray diffraction and NMR
Experimental determination of structure
X-ray crystallography
• Determines electron density – positions of atoms in structure
• Highly accurate
• Static: depends on crystal
NMR
• Determines constraints between labeled spins
• Allows measure of structure in solution
• Resolution not defined: more constraints – better defined structure
X-ray diffraction
• If the direction satisfies Bragg's law (nλ = 2d sin θ) → constructive addition → reflection spot in the diffraction pattern
• Wavelength of X-ray ~ crystal plane separations
• Rotation of crystal relative to beam allows recording of different diffractions
• Diffraction maps are translated to electron density maps using Fourier transform
• Resolution is determined by the diffraction angles measured (high-angle peaks – high-resolution data)
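The link between diffraction angle and resolution can be made concrete with Bragg's law, nλ = 2d sin θ, where d is the spacing between crystal planes and θ the angle between the beam and the planes. A minimal sketch, assuming a Cu Kα laboratory wavelength of 1.5418 Å; the function name and the chosen angles are illustrative and not from the slides:

```python
import math

def bragg_spacing(wavelength, theta_deg, order=1):
    """Plane spacing d (same units as wavelength) from n*lambda = 2*d*sin(theta)."""
    return order * wavelength / (2.0 * math.sin(math.radians(theta_deg)))

CU_K_ALPHA = 1.5418  # Angstrom; assumed wavelength for this example

for theta in (5.0, 15.0, 30.0):
    d = bragg_spacing(CU_K_ALPHA, theta)
    print(f"theta = {theta:4.1f} deg -> d = {d:.2f} A")
```

Larger angles correspond to smaller spacings d, which is why reflections recorded at high angle carry the high-resolution information.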
X-ray diffraction
• Iterative refinement allows improvement of structure
• R-factor measures quality
– Fo – observed
– Fc – calculated
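The crystallographic R-factor is commonly defined as R = Σ| |Fo| − |Fc| | / Σ|Fo|, summed over all reflections, so a lower value means the model reproduces the measured data better. The sketch below uses that standard definition with made-up amplitudes; it leaves out the scale factor that real refinement programs apply between Fo and Fc.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over matching reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

# Made-up structure factor amplitudes, for illustration only.
f_obs = [10.0, 20.0, 30.0]
f_calc = [9.0, 22.0, 30.0]
print(f"R = {r_factor(f_obs, f_calc):.3f}")
```

Each refinement cycle adjusts the model so that the calculated amplitudes move closer to the observed ones and R decreases.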
X-ray diffraction
• 1950s: first protein structure solved by Kendrew & Perutz: sperm whale myoglobin
• Today: ~107,000 structures solved, most by X-ray crystallography
• Challenges
– Grow crystal
– Determine phase
NMR (Nuclear Magnetic Resonance)
• NMR-active nuclei (with spin): 1H, 13C
• Application of magnetic field reorients spins – measure resonance between close nuclei
• Extract constraints & determine structure
1. Introduction to Computational Structural Biology
Challenges in Computational Structural Biology
Protein structure prediction and design
• Protein sequence → protein structure: protein structure prediction
• Protein structure → protein sequence: protein design
FASTA
>2180 hSERT
METTPLNSQKQ……
PDB
ATOM    490  N   GLN A  31      52.013 -87.359  -8.797  1.00  7.06           N
ATOM    491  CA  GLN A  31      52.134 -87.762 -10.201  1.00  8.67           C
ATOM    492  C   GLN A  31      51.726 -89.222 -10.343  1.00 10.90           C
ATOM    493  O   GLN A  31      51.015 -89.601 -11.275  1.00  9.63           O
…..
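PDB ATOM records are fixed-column text, so fields are read by character position rather than by splitting on spaces. The sketch below rebuilds the four records shown above from their values and parses them back; `format_atom`, `parse_atom` and the dictionary keys are illustrative names chosen for this example, and only the most common fields are handled.

```python
import math

def format_atom(serial, name, res, chain, resseq, x, y, z, occ, b, elem):
    """Build a fixed-column ATOM record (name is the 4-character atom-name field)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{occ:6.2f}{b:6.2f}          {elem:>2s}")

def parse_atom(line):
    """Read the main fields of an ATOM record by column position."""
    return {
        "serial": int(line[6:11]),
        "name": line[12:16].strip(),
        "res": line[17:20].strip(),
        "chain": line[21],
        "resseq": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        "occupancy": float(line[54:60]),
        "b_factor": float(line[60:66]),
        "element": line[76:78].strip(),
    }

# Values copied from the four ATOM records on the slide.
records = [
    (490, " N", "GLN", "A", 31, 52.013, -87.359, -8.797, 1.00, 7.06, "N"),
    (491, " CA", "GLN", "A", 31, 52.134, -87.762, -10.201, 1.00, 8.67, "C"),
    (492, " C", "GLN", "A", 31, 51.726, -89.222, -10.343, 1.00, 10.90, "C"),
    (493, " O", "GLN", "A", 31, 51.015, -89.601, -11.275, 1.00, 9.63, "O"),
]
lines = [format_atom(*record) for record in records]
atoms = {atom["name"]: atom for atom in map(parse_atom, lines)}

def distance(a, b):
    """Euclidean distance between two parsed atoms, in Angstrom."""
    return math.dist(a["xyz"], b["xyz"])

print(lines[0])
print(f"N-CA distance: {distance(atoms['N'], atoms['CA']):.2f} A")
print(f"CA-C distance: {distance(atoms['CA'], atoms['C']):.2f} A")
```

The printed distances are covalent bond lengths within one residue, a quick sanity check that the coordinates were read from the right columns.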
Additional topics in computational structural biology
• Nucleic acids – prediction of binding and structure
– RNA stem & loops, pseudoknots; protein-RNA binding
– DNA curvature; protein-DNA binding
• Prediction of macromolecular structures
– Reconstruction of protein assemblies from low-resolution cryo-EM maps
• Protein-ligand interactions
– Docking of small ligands
– Design of inhibitors
… and many many more!