Bio-modeling. introduction molecular biology biotechnology bioMEMS bioinformatics bio-modeling...


Page 1: Bio-modeling.  introduction  molecular biology  biotechnology  bioMEMS  bioinformatics  bio-modeling  cells and e-cells  transcription and regulation.

bio-modeling


course layout

introduction
molecular biology
biotechnology
bioMEMS
bioinformatics
bio-modeling
cells and e-cells
transcription and regulation
cell communication
neural networks
DNA computing
fractals and patterns
the birds and the bees ... and ants


introduction


far and away in the past

Newton’s equations of motion (17th-18th century) → Molecular dynamics (MD)

Boltzmann’s statistics (19th century) → Monte Carlo (MC)

Schrödinger/Heisenberg’s quantum mechanics (20th century)


birth of simulation in chemistry

1950’s: do it by hand (or mechanical calculator)!
Tried to solve Newton’s equation of motion for small systems (e.g. three-atom system)
Didn’t take very long before they saw computers

1970’s: Age of punchcards
1980’s: Better IO devices
Workstations dominated as research platforms


first generation (1980’s – 1990’s)

Gas phase reaction, e.g. H + H2 → H2 + H (MD)

[Figure: axes R_{A-B} and R_{B-C}]


Liquid simulation (e.g.) Lennard-Jones Fluid MD/MC

first generation (1980’s – 1990’s)
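The Lennard-Jones fluid named on this slide is built from a single pair potential. A minimal sketch of that potential in reduced units follows; the function name and the default values epsilon = sigma = 1 are assumptions made for illustration, not taken from the slides.

```python
def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Pair energy 4*epsilon*[(sigma/r)^12 - (sigma/r)^6] at separation r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

# For sigma = 1 the minimum sits at r = 2^(1/6), where the energy is -epsilon.
r_min = 2.0 ** (1.0 / 6.0)
e_min = lennard_jones(r_min)

# The potential crosses zero at r = sigma.
e_zero = lennard_jones(1.0)
```

An MD or MC code would sum this term over all particle pairs, usually with a distance cutoff.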


Proteins on lattice MC

first generation (1980’s – 1990’s)


Quantum mechanical structure calculation (semi-empirical, ab initio, …)

first generation (1980’s – 1990’s)


revolution (~ 1995)

Workstation-like PCs
100 hr Cray time, 64 MB / 150 MHz Pentium
“Cheap and fast”

Impacts: two directions
1) More accurate methods
2) Larger systems

Start of bio-simulations


impact on “non-bio” simulations

[Figure: axes R_{A-B} and R_{B-C}]

Better surfaces
Revisions of existing surfaces
Dynamics on quantum mechanical surfaces
Quantum wavepacket dynamics
Time-dependent Schrödinger equation instead of Newton’s equation
Totally quantum (can’t be more accurate)
Some people still do this for hydride/proton transfer in enzyme dynamics


Impacts on bio-simulations

Proteins got free from the lattice!
Off-lattice model (still, each residue as a bead)
United atom approach (e.g. CH3 as one atom)
All atom approach
With water (explicit solvent)
Without water (implicit solvent)

What to look at?
Kinetics: dynamic characteristics (e.g. folding simulation)
Thermodynamics: equilibrium characteristics (e.g. binding affinity of protein & drug)


solvent models

Implicit solvent
Solvent accessible surface area (SASA)
Solvation free energy
Cheaper than explicit
Discrete nature of solvent not included
Different methods for SASA/free-energy calculation:
Generalized Born model (GB/SA)
Poisson-Boltzmann model (PB/SA)
Distance dependent dielectric (DD/SA)


solvent models

Explicit solvent
Water as individual molecules
Expensive calculation
Periodic boundary conditions usually necessary
Rigid/flexible, polarizable/non-polarizable
SPC, TIP3P, TIP4P, TIP5P, …


impacts on bio-simulations

Proteins got free from the lattice!
Off-lattice model (each residue as a bead)
United atom approach (e.g. CH3 as one atom)
All atom approach
With water (explicit solvent)
Without water (implicit solvent)

What to look at?
Kinetics: dynamic characteristics (e.g. folding simulation)
Thermodynamics: equilibrium characteristics (e.g. binding affinity of protein & drug)

Remember, proteins are still big!


off-lattice Gō model

Developed from lattice model: “funnel concept”
Nature has developed proteins to fold (evolution)
Proteins can be modeled to fold
Native-contact energy surface
Matches experimental observations


united atom/implicit model folding

“Statistical folding”
Starts from many independent trajectories
Lucky trajectories fold

N_folded / N_total = k_fold × time
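The relation above can be turned into a rate estimate. The sketch below assumes every trajectory has the same length and that only a small fraction folds, so the linear early-time form applies; the function name and the numbers are hypothetical.

```python
def estimate_k_fold(n_folded, n_total, t_sim):
    """Estimate the folding rate from N_folded / N_total = k_fold * time.

    t_sim is the length of each independent trajectory, in seconds.
    """
    if n_total <= 0 or t_sim <= 0:
        raise ValueError("n_total and t_sim must be positive")
    return n_folded / (n_total * t_sim)

# Hypothetical run: 12 of 1000 trajectories fold within 20 ns each.
k_fold = estimate_k_fold(12, 1000, 20e-9)  # rate in 1/s
mean_folding_time = 1.0 / k_fold           # in seconds
```

The appeal of this approach is that many short, independent trajectories parallelize trivially, whereas a single long trajectory does not.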


all atom unfolding

Folding inferred from unfolding
At high T, unfolding is fast (~ 1 ns)
Full atomistic detail from folded state to unfolded state


binding free energy: docking

“Molecular modeling”
Binding free energy is calculated based on the shape of ligand and protein
Drug design


binding free energy: more accurate versions

Free energy: potential + entropy factor
P + L ⇌ PL

Thermodynamic integration (TI)
Free energy perturbation (FEP)
Jarzynski’s equality

Extremely expensive calculations


free energy landscape method

Kinetic information is inferred from free energy surface

Rough free energy surface can be obtained faster by parallelization

“Trajectory by intuition”


current limitation

Accuracy of models
Force field
Solvent models

Speed
For small proteins (<50 amino acids): 1 ns ~ 1 day
Biologically relevant event timescale > 1 s

Size
Many proteins are not just large: they are huge!
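The size of the speed gap follows from simple arithmetic. This sketch takes the slide's figures at face value (1 ns of simulated time per day of computing, a 1 s target) and uses variable names chosen here for illustration.

```python
NS_SIMULATED_PER_DAY = 1.0   # throughput quoted on the slide
TARGET_SECONDS = 1.0         # biologically relevant timescale

target_ns = TARGET_SECONDS * 1e9
days_needed = target_ns / NS_SIMULATED_PER_DAY
years_needed = days_needed / 365.25
```

At that throughput, one second of simulated time corresponds to a billion days of computing, which is why faster hardware and better sampling methods are both needed.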


responses to the challenges

Accuracy: blend with quantum mechanical calculation
QM/MM, QM-trajectory method (e.g. CPMD)

Speed
E.g. compute on video card

Size
E.g. umbrella sampling


Biological systems are complex; thus, a combination of experimental and computational approaches is needed.

computational biology


Computational Biology Bioinformatics
More than sequences, database searches, statistics or image analysis.

A part of Computational Science
Using mathematical modeling, simulation and visualization
Complementing theory and experiment

computational biology


A → B

irreversible, one-molecule reaction
examples: all sorts of decay processes, e.g. radioactive, fluorescence, activated receptor returning to inactive state

any metabolic pathway can be described by a combination of processes of this type (including reversible reactions and, in some respects, multi-molecule reactions)

simplest chemical reaction


A → B

various levels of description:
homogeneous system, large numbers of molecules = ordinary differential equations, kinetics
small numbers of molecules = probabilistic equations, stochastics
spatial heterogeneity = partial differential equations, diffusion
small number of heterogeneously distributed molecules = single-molecule tracking (e.g. cytoskeleton modelling)

simplest chemical reaction
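For the "small numbers of molecules" level, the decay A → B can be sketched as a probabilistic simulation in which each molecule decays independently in every time step. The fixed-step scheme, the function name and the parameter values are assumptions made for illustration; the per-step probability k*dt is only a good approximation when k*dt is much smaller than 1.

```python
import random

def stochastic_decay(n0, k, dt, steps, seed=0):
    """Simulate A -> B for n0 molecules; returns the count of A after each step."""
    rng = random.Random(seed)
    p_decay = k * dt  # probability that one molecule decays during dt
    n = n0
    counts = [n]
    for _ in range(steps):
        decayed = sum(1 for _ in range(n) if rng.random() < p_decay)
        n -= decayed
        counts.append(n)
    return counts

# Hypothetical example: 100 molecules, k = 0.1 per unit time, 200 steps of 0.1.
counts = stochastic_decay(100, 0.1, 0.1, 200, seed=1)
```

Repeating the run with different seeds gives different trajectories, which is the fluctuation that an ordinary differential equation averages away.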


Imagine a box containing N molecules. How many will decay during time t? k N
Imagine two boxes containing N/2 molecules each. How many decay? k N
Imagine two boxes containing N molecules each. How many decay? 2k N

In general:

$$\frac{dn(t)}{dt} = -k\,n(t)$$

differential equation (ordinary, linear, first-order)

$$n(t) = N_0\,e^{-kt}$$

exact solution (in more complex cases replaced by a numerical approximation)

kinetic description
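The remark about numerical approximation can be illustrated by integrating the first-order decay equation with the explicit Euler method and comparing against the exponential solution. The rate constant, step size and function names are hypothetical choices, and Euler is used here only because it is the simplest scheme.

```python
import math

def euler_decay(n0, k, dt, steps):
    """Integrate dn/dt = -k*n with explicit Euler steps of size dt."""
    n = n0
    trajectory = [n]
    for _ in range(steps):
        n += -k * n * dt
        trajectory.append(n)
    return trajectory

def exact_decay(n0, k, t):
    """Closed-form solution n(t) = n0 * exp(-k*t)."""
    return n0 * math.exp(-k * t)

# Hypothetical example: N0 = 1000, k = 0.5, integrated up to t = 2.
trajectory = euler_decay(1000.0, 0.5, 0.001, 2000)
numerical = trajectory[-1]
exact = exact_decay(1000.0, 0.5, 2.0)
relative_error = abs(numerical - exact) / exact
```

Halving the step size roughly halves the error, as expected for a first-order method.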


what is bio-modeling?


DNA: GAA GTT GAA AAT CAG GCG AAC CCA CGA CTG

RNA: GAA GUU GAA AAU CAG GCG AAC CCA CGA CUG

PROTEIN: GLU VAL GLU ASN GLN ALA ASN PRO ARG LEU

biological building blocks
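The DNA → RNA → protein mapping on this slide can be sketched in a few lines. The codon table below contains only the codons that occur in the example, and transcription is simplified to replacing T with U on the strand as written; both are assumptions made to keep the sketch short.

```python
# Partial codon table: only the codons used in the slide's example.
CODON_TABLE = {
    "GAA": "GLU", "GUU": "VAL", "AAU": "ASN", "CAG": "GLN", "GCG": "ALA",
    "AAC": "ASN", "CCA": "PRO", "CGA": "ARG", "CUG": "LEU",
}

def transcribe(dna):
    """Simplified transcription: replace T with U."""
    return dna.replace("T", "U")

def translate(rna):
    """Read the RNA three bases at a time and look up each codon."""
    usable = len(rna) - len(rna) % 3
    return [CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3)]

dna = "GAAGTTGAAAATCAGGCGAACCCACGACTG"
rna = transcribe(dna)
protein = translate(rna)
```

A full implementation would use the complete 64-codon table and handle start and stop codons.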


GLU VAL GLU ASN GLN ALA ASN PRO ARG LEU . . .

[Figure: the same residue chain drawn in a folded arrangement]

protein folding


some fundamental questions

Question #1: Given a protein or DNA molecule, what is the geometric structure of the molecule?

Question #2: Why and how does a protein fold into a unique three-dimensional structure?

Question #3: Given a set of distances between pairs of atoms, how can we determine the coordinates of the atoms?

Question #4: Given the magnitudes of the structure factors of a protein, how can we determine the phases of the structure factors?

Question #5: Given two proteins, how can we compare their geometric structures?

Question #6: …


Protein X-ray Crystallography
Nuclear Magnetic Resonance
Potential Energy Minimization
Molecular Dynamics Simulation
Homology Modeling
Fold Recognition
Inverse Protein Folding

methods for structure prediction and determination


empirical structure determination

Two major experimental methods for determining protein structure

X-ray Crystallography
Requires growing a crystal of the protein (impossible for some, never easy)
Diffraction pattern can be inverse-Fourier transformed to characterize electron densities (phase problem)

Nuclear Magnetic Resonance (NMR) spectroscopy
Provides distance constraints, but it can be hard to find a corresponding structure
Works only for relatively small proteins


X-ray crystallography

X-rays, since the wavelength is near the distance between bonded carbon atoms
Maps electron density, not atoms directly
Crystal to get a lot of spatially aligned atoms
Have to invert Fourier transform to get structure, but only have amplitudes, not phases


In X-ray crystallography, the protein first needs to be purified and crystallized, which may take months or years and can still fail.

After that, the protein crystal is placed in an X-ray apparatus to record a diffraction image. The diffraction image can be used to determine the three-dimensional structure of the protein.

The process is time-consuming, and some proteins cannot be crystallized at all.

X-ray crystallography computing


X-ray crystallography computing

A mathematical problem, called the phase problem, needs to be solved before a crystal structure can be fully determined from the diffraction data.

80% of the structures in the Protein Data Bank (PDB) were determined by X-ray crystallography.


NMR structure determination

The NMR approach is based on the fact that nuclei have spin and generate magnetic fields. When two nuclei are close, their spins interact. The intensity of the interaction depends on the distance between the nuclei. Therefore, the distances between certain pairs of atoms can be estimated by measuring the intensities of the nuclear spin-spin couplings.

The distance data obtained from the NMR experiment can be used to deduce the structural information for the molecule. One way of achieving such a goal is based on molecular distance geometry.


NMR structure determination

Not all distances between pairs of atoms can be detected. In practice, only lower and upper bounds on the distances can be obtained.

The structure can be determined by solving a distance geometry problem with the distance data from the NMR experiments.

15% of the structures in the Protein Data Bank (PDB) were determined by NMR spectroscopy.
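In the idealized case where all pairwise distances are known exactly, coordinates can be rebuilt directly. The sketch below places four atoms from their six distances: the first at the origin, the second on the x axis, the third in the xy plane, and the fourth above that plane. It assumes exact distances and that the first three atoms are not collinear; real NMR data provide only bounds on a subset of distances, so this is not the method used in practice. The function name is hypothetical.

```python
import math

def embed_four_atoms(d):
    """Return coordinates for 4 atoms from a symmetric 4x4 distance matrix d."""
    d12, d13, d14 = d[0][1], d[0][2], d[0][3]
    d23, d24, d34 = d[1][2], d[1][3], d[2][3]
    p1 = (0.0, 0.0, 0.0)
    p2 = (d12, 0.0, 0.0)
    # Atom 3: intersect spheres around atoms 1 and 2, restricted to z = 0.
    ax = (d13**2 + d12**2 - d23**2) / (2.0 * d12)
    ay = math.sqrt(max(d13**2 - ax**2, 0.0))
    p3 = (ax, ay, 0.0)
    # Atom 4: intersect spheres around atoms 1, 2 and 3; choose z >= 0.
    bx = (d14**2 + d12**2 - d24**2) / (2.0 * d12)
    by = (d14**2 - d34**2 + ax**2 + ay**2 - 2.0 * ax * bx) / (2.0 * ay)
    bz = math.sqrt(max(d14**2 - bx**2 - by**2, 0.0))
    return [p1, p2, p3, (bx, by, bz)]

# Hypothetical test structure: compute its distances, then rebuild it.
original = [(1.0, 2.0, 3.0), (2.0, 0.0, 1.0), (0.0, 1.0, 5.0), (3.0, 3.0, 2.0)]
distances = [[math.dist(a, b) for b in original] for a in original]
rebuilt = embed_four_atoms(distances)
```

The rebuilt coordinates differ from the original by a rotation, translation and possibly a reflection, but they reproduce every pairwise distance.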


potential energy minimization

Hypothesis: the native structure of a protein has the lowest, or nearly the lowest, potential energy. It can therefore be located at the global minimum of the protein’s potential energy.


potential energy minimization

A reasonably accurate potential energy function needs to be constructed.

Given such a function, a local minimum is easy to find, but a global one is hard, especially if the function has many local minima. No completely satisfactory algorithm has been developed yet for minimizing proteins.

Potential energy minimization has been used successfully for structure refinement though.
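The difference between local and global minima can be seen on a one-dimensional toy function. The double-well "energy" below is hypothetical and has nothing to do with a real protein potential; gradient descent with a fixed step stands in for a local minimizer.

```python
def energy(x):
    """Toy double-well energy with one deep and one shallow minimum."""
    return x**4 - 3.0 * x**2 + x

def gradient(x):
    return 4.0 * x**3 - 6.0 * x + 1.0

def descend(x, step=0.01, iterations=5000):
    """Plain gradient descent: always ends in the basin it started in."""
    for _ in range(iterations):
        x -= step * gradient(x)
    return x

from_right = descend(2.0)    # converges to the shallow minimum near x = 1.13
from_left = descend(-2.0)    # converges to the deep minimum near x = -1.30
best = min((from_right, from_left), key=energy)
```

Restarting from several points, as done here with two, is the simplest way to look for the global minimum, but the number of starts needed grows quickly with the number of local minima.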


molecular dynamics

Folding can be simulated by following the movement of the atoms in protein according to Newton’s second law of motion.


The time step has to be small, on the order of femtoseconds, to achieve accuracy.

Current computing technology can reach only picoseconds to microseconds of simulated time, while protein folding may take seconds or even longer.

Molecular dynamics simulation has been used successfully to study other types of dynamical behavior of proteins.

molecular dynamics
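Following atoms under Newton's second law means integrating the equations of motion step by step. The sketch below uses the velocity Verlet scheme, a common choice in MD codes, on a single particle attached to a harmonic spring; the one-dimensional system, the reduced units and all names are assumptions made for illustration.

```python
import math

def velocity_verlet(x, v, force, mass, dt, steps):
    """Integrate m*a = force(x); returns a list of (position, velocity)."""
    a = force(x) / mass
    trajectory = [(x, v)]
    for _ in range(steps):
        x = x + v * dt + 0.5 * a * dt * dt
        a_new = force(x) / mass
        v = v + 0.5 * (a + a_new) * dt
        a = a_new
        trajectory.append((x, v))
    return trajectory

K_SPRING = 1.0  # hypothetical force constant
MASS = 1.0      # hypothetical mass

def total_energy(x, v):
    return 0.5 * MASS * v * v + 0.5 * K_SPRING * x * x

# Start at x = 1 with zero velocity and integrate up to t = 10.
trajectory = velocity_verlet(1.0, 0.0, lambda x: -K_SPRING * x, MASS, 0.01, 1000)
energies = [total_energy(x, v) for x, v in trajectory]
```

With this step size the total energy stays close to its initial value of 0.5; a step that is too large relative to the fastest vibration makes the integration unstable, which is why femtosecond steps are needed for bond vibrations.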


limitations of MD simulations

Full atomic representation → noise → difficulty in discerning the dominant mechanisms of motion → need for methods for filtering out the noise, such as Essential Dynamics.

Empirical force fields: limited by the accuracy of the potentials.

Time steps constrained by the fastest motion (vibrations in bond lengths occur in the femtosecond (fs) time range and necessitate timesteps of 1-5 fs).

Inefficient sampling of the complete space of conformations.

Limited to small proteins (100s of residues) and/or short times (subnanoseconds).


sequence structure alignment

Homology Modeling: Sequence to Sequence
Fold Recognition: Structure to Sequence
Inverse Protein Folding: Sequence to Structure

Known Sequences / Structures
Ranking Sequences / Structures
Sequence Structure Alignment


Scoring functions may not be able to distinguish between good and bad matches.

Computing the best alignment is NP-hard in general when gaps are allowed.

The results are not exact and carry only a certain level of confidence.

sequence structure alignment


what is biomolecular modeling?

Application of computational models to understand the structure, dynamics, and thermodynamics of biological molecules

The models must be tailored to the question at hand: Schrödinger equation is not the answer to everything! Reductionist view bound to fail!

This implies that biomolecular modeling must be both multidisciplinary and multiscale


an odd remark

"Every attempt to employ mathematical methods in the study of chemical questions must be considered profoundly irrational and contrary to the spirit in chemistry. If mathematical analysis should ever hold a prominent place in chemistry - an aberration which is happily almost impossible - it would occasion a rapid and widespread degeneration of that science."

A. Comte (1830)


1992 Nobel Prize in Chemistry: Rudolph Marcus (Theory of Electron Transfer)

1998 Nobel Prize in Chemistry: John Pople (ab initio), Walter Kohn (DFT, density functional theory)

a Nobel remark


growth of biological databases

3D structures growth: http://www.rcsb.org/pdb/holdings.html


molecular modeling


structure-property relationships

Molecular Model → Mathematical model → Predictions: Structure, Properties

“First Principles”
$\hat{H}\Psi = E\Psi$ (QM)
$-\,dE/dr_i = m_i\,d^2 r_i/dt^2$ (MD)
Folding simulations

Empirical Correlations: {property} = k {Descriptors}
$E = E_{bonded} + E_{nonbonded}$ (MM)
$\log(1/C)$ fitted with coefficients $k$, $k'$, $k''$ (QSAR)
Fold recognition


molecular geometry and molecular properties

Conformational energy (potential energy)

$$E_{total} = E_{valence} + E_{nonbond}$$

$E_{valence} = E_{bond} + E_{angle} + E_{torsion} + E_{oop}$
bond stretching ($E_{bond}$)
valence angle bending ($E_{angle}$)
dihedral angle torsion ($E_{torsion}$)
out-of-plane interactions ($E_{oop}$)

$E_{nonbond} = E_{vdW} + E_{Coulomb} + E_{hbond}$
van der Waals ($E_{vdW}$)
electrostatic ($E_{Coulomb}$)
hydrogen bond ($E_{hbond}$)

F.Melani Molecular Modeling in Chimica Farmaceutica


Force-field

definition of atom types
atomic charges
force constants, equilibrium values
energy equations

Σ Force fields

conformational energy

(potential energy)

molecular geometry and molecular properties


standard force field


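bond stretching ($E_{bond}$)

Morse: $k\left[1 - e^{-\alpha (r - r_0)}\right]^2$
quadratic: $k\,(r - r_0)^2$
quartic: $k_2 (r - r_0)^2 + k_3 (r - r_0)^3 + k_4 (r - r_0)^4$

valence angle bending ($E_{angle}$)

quadratic: $k\,(\theta - \theta_0)^2$
quartic: $k_2 (\theta - \theta_0)^2 + k_3 (\theta - \theta_0)^3 + k_4 (\theta - \theta_0)^4$

dihedral angle torsion ($E_{torsion}$)

$k\left[1 + \cos(n\phi - \phi_0)\right]$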

molecular geometry and molecular properties


out-of-plane interactions ($E_{oop}$)

[Figure: planar centre with substituents H, R, R' and O]

$E_{oop} = k\,\chi^2$

molecular geometry and molecular properties


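nonbond term ($E_{nonbond}$)

van der Waals ($E_{vdW}$)

$$E_{vdW} = \sum_i \sum_j \varepsilon_{ij}\left[\left(\frac{r^0_{ij}}{r_{ij}}\right)^{12} - 2\left(\frac{r^0_{ij}}{r_{ij}}\right)^{6}\right] = \sum_i \sum_j \left[\frac{C_{ij}}{r_{ij}^{12}} - \frac{D_{ij}}{r_{ij}^{6}}\right]$$

hydrogen bond ($E_{hbond}$)

$$E_{hbond} = \sum_i \sum_j \varepsilon_{ij}\left[5\left(\frac{r^0_{ij}}{r_{ij}}\right)^{12} - 6\left(\frac{r^0_{ij}}{r_{ij}}\right)^{10}\right] = \sum_i \sum_j \left[\frac{C_{ij}}{r_{ij}^{12}} - \frac{D_{ij}}{r_{ij}^{10}}\right]$$

electrostatic ($E_{Coulomb}$)

$$E_{Coulomb} = \sum_i \sum_j \frac{q_i q_j}{r_{ij}}$$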

molecular geometry and molecular properties


Example: H2O (potential energy)

$$E = K_{OH}\,(b - b^0_{OH})^2 + K_{OH}\,(b' - b^0_{OH})^2 + K_{HOH}\,(\theta - \theta^0_{HOH})^2$$

$K_{OH}$, $b^0_{OH}$, $K_{HOH}$, and $\theta^0_{HOH}$ are parameters of the force field
b is the current length of one O-H bond
b' is the length of the other O-H bond
θ is the H-O-H angle

molecular geometry and molecular properties
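The water example translates directly into code: two quadratic bond-stretch terms that share the O-H parameters plus one quadratic angle term. The parameter values below are placeholders chosen for illustration and are not taken from the slides or from any particular force field.

```python
import math

def water_energy(b, b_prime, theta, k_oh, b0_oh, k_hoh, theta0_hoh):
    """Valence energy of one water molecule: two O-H stretches and one H-O-H bend."""
    stretch = k_oh * (b - b0_oh) ** 2 + k_oh * (b_prime - b0_oh) ** 2
    bend = k_hoh * (theta - theta0_hoh) ** 2
    return stretch + bend

# Hypothetical parameters (lengths in angstrom, angles in radians).
PARAMS = dict(k_oh=450.0, b0_oh=0.9572, k_hoh=55.0, theta0_hoh=math.radians(104.52))

e_equilibrium = water_energy(0.9572, 0.9572, math.radians(104.52), **PARAMS)
e_stretched = water_energy(1.0572, 0.9572, math.radians(104.52), **PARAMS)
```

At the equilibrium geometry every term vanishes, so the energy measures only the strain relative to the reference values.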


DOCKING

$$E_{int} = \sum_i \sum_j \left[\frac{C_{ij}}{r_{ij}^{12}} - \frac{D_{ij}}{r_{ij}^{6}} + \frac{q_i q_j}{r_{ij}}\right]$$

The objective: searching for the orientations with low interaction energies.

molecular geometry and molecular properties
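A docking search evaluates the interaction energy for many candidate placements of the ligand and keeps the lowest ones. The sketch below sums a 12-6 term and a Coulomb term over all ligand-receptor atom pairs and scans a few rigid translations. It uses one C and one D for every pair instead of per-pair values, scans translations rather than full orientations, and uses made-up coordinates and charges; all of these are simplifying assumptions.

```python
import math

def interaction_energy(ligand, receptor, c12=1.0, d6=2.0):
    """Sum C/r^12 - D/r^6 + q_i*q_j/r over all ligand-receptor atom pairs.

    Atoms are (x, y, z, charge) tuples.
    """
    energy = 0.0
    for xi, yi, zi, qi in ligand:
        for xj, yj, zj, qj in receptor:
            r = math.dist((xi, yi, zi), (xj, yj, zj))
            energy += c12 / r**12 - d6 / r**6 + qi * qj / r
    return energy

def shift_x(atoms, dx):
    """Rigidly translate atoms along x."""
    return [(x + dx, y, z, q) for x, y, z, q in atoms]

# Hypothetical two-atom receptor and one-atom ligand.
receptor = [(0.0, 0.0, 0.0, -0.5), (1.5, 0.0, 0.0, 0.5)]
ligand = [(0.0, 1.0, 0.0, 0.5)]

offsets = [i * 0.25 for i in range(-8, 9)]
scored = [(interaction_energy(shift_x(ligand, dx), receptor), dx) for dx in offsets]
best_energy, best_offset = min(scored)
```

Real docking programs also sample rotations and ligand flexibility, and use faster search strategies than an exhaustive scan.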


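MEP (molecular electrostatic potential)

$$V(\mathbf{r}_p) = \sum_{A}^{nuclei} \frac{Z_A}{|\mathbf{R}_A - \mathbf{r}_p|} - \int \frac{\rho(\mathbf{r})}{|\mathbf{r} - \mathbf{r}_p|}\,d\mathbf{r}$$

$$V(\mathbf{r}_p) = \sum_i \frac{q_i}{|\mathbf{r}_i - \mathbf{r}_p|}$$

electronic density

$$\rho(\mathbf{r}) = \sum_{\mu\nu}^{basis\ functions} P_{\mu\nu}\,\phi_\mu(\mathbf{r})\,\phi_\nu(\mathbf{r})$$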

molecular geometry and molecular properties


molecular vibration


protein structure


Most proteins will fold spontaneously in water, so amino acid sequence alone should be enough to determine protein structure

However, the physics are daunting:
20,000+ protein atoms, plus equal amounts of water
Many non-local interactions
Can take seconds (most chemical reactions take place ~10^12, i.e. 1,000,000,000,000x, faster)

Empirical determinations of protein structure are advancing rapidly.

protein structure


Proteins are polymers of amino acids linked by peptide bonds.

Properties of proteins are determined by both the particular sequence of amino acids and by the conformation (fold) of the protein.

Flexibility in the bonds around Cα: φ (phi), ψ (psi), side chain

protein structure


Protein structure is described in four levels
Primary structure: amino acid sequence
Secondary structure: local (in sequence) ordering into
(α) Helices: compressed, corkscrew structures
(β) Strands: extended, nearly straight structures
(β) Sheets: paired strands, reinforced by hydrogen bonds; parallel (same direction) or antiparallel sheets
Coils, Turns & Loops: changes in direction
Tertiary structure: global ordering (all angles/atoms)
Quaternary structure: multiple, disconnected amino acid chains interacting to form a larger structure

protein structure


helices


2 types of sheets

anti-parallel parallel


turns


combining secondary structures to make motifs

DNA-binding helix-turn-helix Calcium-binding motif


24 ways to arrange adjacent hairpins


alpha/beta domains

Triosephosphate isomerase Dehydrogenase


Ramachandran plot


Ramachandran plot

always glycine


protein structure cartoons


protein structure representations


Proteins are created linearly and then assume their tertiary structure by “folding.”
Exact mechanism is still unknown
Proteins assume the lowest energy structure
Or sometimes an ensemble of low energy structures.
Hydrophobic collapse drives the process
Local (secondary) structure proclivities
Internal stabilizers: hydrogen bonds, disulphide bonds, salt bridges.

protein structure


CaM Kinase II structure

serine-threonine protein kinase
calmodulin regulation
multimer formation
12 subunits with the catalytic domains facing out

Page 86:

unc-43 --------------------MQLQQINSGAFSVVRRCVHKTTGLEFAAKIINTKKLSARDrCaMKII -------MATITCTRFTEEYQLFEELGKGAFSVVRRCVKVLAGQEYPAKIINTKKLSARDhCaMKI MLGAVEGPRWKQAEDIRDIYDFRDVLGTGAFSEVILAEDKRTQKLVAIKCIAKEALEGKErCaMKI MPGAVEGPRWKQAEDIRDIYDFRDVLGTGAFSEVILAEDKRTQKLVAIKCIAKKALEGKE .. **** * . . * * * .. unc-43 FQKLEREARICRKLQHPNIVRLHDSIQEESFHYLVFDLVTGGELFEDIVAREFYSEADASrCaMKII HQKLEREARICRLLKHPNIVRLHDSISEEGHHYLIFDLVTGGELFEDIVAREYYSEADAShCaMKI GS-MENEIAVLHKIKHPNIVALDDIYESGGHLYLIMQLVSGGELFDRIVEKGFYTERDASrCaMKI GS-MENEIAVLHKIKHPNIVALDDIYESGGHLYLIMQLVSGGELFDRIVEKGFYTERDAS .* * . . ..***** * * **. **.*****. ** . .*.* *** unc-43 HCIQQILESIAYCHSNGIVHRDLKPENLLLASKAKGAAVKLADFGLAIEVN-DSEAWHGFrCaMKII HCIQQILEAVLHCHQMGVVHRDLKPENLLLASKLKGAAVKLADFGLAIEVEGEQQRWFGFhCaMKI RLIFQVLDAVKYLHDLGIVHRDLKPENLLYYSLDEDSKIMISDFGLSKMED-PGSVLSTArCaMKI RLIFQVLDAVKYLHDLGIVHRDLKPENLLYYSLDEDSKIMISDFGLSKMED-PGSVLSTA . * *.*... * *.*********** * . . ..****.  unc-43 AGTPGYLSPEVLKKDPYSKPVDIWACGVILYILLVGYPPFWDEDQHRLYAQIKAGAYDYPrCaMKII AGTPGYLSPEVLRKDPYGKPVDLWACGVILYILLVGYPPFWDEDQHRLYQQIKARAYDFPhCaMKI CGTPGYVAPEVLAQKPYSKAVDCWSIGVIAYILLCGYPPFYDENDAKLFEQILKAEYEFDrCaMKI CGTPGYVAPEVLAQKPYSKAVDCWSIGVIAYILLCGYPPFYDENDAKLFEQILKAEYEFD .*****..**** . ** * ** *. *** **** ***** ** .*. ** *..  unc-43 SPEWDTVTPEAKSLIDSMLTVNPKKRITADQALKVPWICNRERVASAIHRQDTVDCLKKFrCaMKII SPEWDTVTPEAKDLINKMLTINPSKRITAAEALKHPWISHRSTVASCMHRQETVDCLKKFhCaMKI SPYWDDISDSAKDFIRHLMEKDPEKRFTCEQALQHPWIAGDTALDKNIH-QSVSEQIKKNrCaMKI SPYWDDISDSAKDFIRHLMEKDPEKRFTCEQALQHPWIAGDTALDKNIH-QSVSEQIKKN ** ** .. ** * .. * ** *. .**. ***. . .* * . .**  unc-43 NARRKLKGAILTTMIATRNLSSKRSYRLTLGAEKLVISMKNIEYWQVLLNKIFATYKIKMrCaMKII NARRKLKGAILTTMLATRNFSGG-----------------------------------KShCaMKI FAKSKWKQAFNATAVVRHMR----------------------------------------rCaMKI FAKSKWKQAFNATAVVRHMR---------------------------------------- *. * * * .* . .

…continued

sequence comparison

Page 87:

unc-43 SPEWDTVTPEAKSLIDSMLTVNPKKRITADQALKVPWICNRERVASAIHRQDTVDCLKKFrCaMKII SPEWDTVTPEAKDLINKMLTINPSKRITAAEALKHPWISHRSTVASCMHRQETVDCLKKFhCaMKI SPYWDDISDSAKDFIRHLMEKDPEKRFTCEQALQHPWIAGDTALDKNIH-QSVSEQIKKNrCaMKI SPYWDDISDSAKDFIRHLMEKDPEKRFTCEQALQHPWIAGDTALDKNIH-QSVSEQIKKN ** ** .. ** * .. * ** *. .**. ***. . .* * . .**  unc-43 NARRKLKGAILTTMIATRNLSSKRSYRLTLGAEKLVISMKNIEYWQVLLNKIFATYKIKMrCaMKII NARRKLKGAILTTMLATRNFSGG-----------------------------------KShCaMKI FAKSKWKQAFNATAVVRHMR----------------------------------------rCaMKI FAKSKWKQAFNATAVVRHMR---------------------------------------- *. * * * .* . .  unc-43 KQCRNLLNKKEQGPPSTIKESSESS-QTIDDNDSEKGGGQLKHENTVVRADGATGIVSSSrCaMKII G--G---NKKNDG----VKESSESTNTTIEDED--------------------------- ***. * .******. **.*.* unc-43 NSSTASKSSSTNLSAQKQDIVRVTQTLLDAISCKDFETYTRLCDTSMTCFEPEALGNLIErCaMKII ------------TKVRKQEIIKVTEQLIEAISNGDFESYTKMCDPGMTAFEPEALGNLVE **.*..**. *..*** ***.**..** **.*********.* unc-43 GIEFHRFYFD--GNRKNQ-VHTTMLNPNVHIIGEDAACVAYVKLTQFLDRNGEAHTRQSQrCaMKII GLDFHRFYFENLWSRNSKPVHTTILNPHIHLMGDESACIAYIRITQYLDAGGIPRTAQSE *..******. * ****.*** .*..*.. **.**...**.** * * **. unc-43 ESRVWSKKQGRWVCVHVHRSTQPSTNTTVSEFrCaMKII ETRVWHRRDGKWQIVHFHRSGAPSVLPH---- *.*** .. *.* **.*** **

…continued (overlapped)

sequence comparison

Page 88:

Protein structure basics

protein structure

Proteins consist mostly of α-helices, β-sheets, and turns.
The α-helices and β-sheets typically form the framework of the protein.
The turns and other atypical structures often play important binding and catalytic roles.
The core of the protein is hydrophobic, whereas the surface is usually polar or charged.
Most turns and kinks have glycines and prolines.

Page 89:

alpha helix

protein structure

Page 90:

three-stranded antiparallel β-sheet

protein structure

Page 91:

three-stranded antiparallel β-sheet, space filled

protein structure

Page 92:

rCaMKII  SPEWDTVTPEAKDLINKMLTINPSKRITAAEALKHPWISHRSTVASCMHRQETVDCLKKF
rCaMKI   SPYWDDISDSAKDFIRHLMEKDPEKRFTCEQALQHPWIAGDTALDKNIH-QSVSEQIKKN 297
** ** .. *** * .. .* ** *. .**.****. . . .* * . .**

rCaMKII  NARRKLKGAILTTMLATRN
rCaMKI   FAKSKWKQAFNATAVVRHM 316
*. * * *. .* . .

substrate binding cleft

protein structure

Page 93:

red - charged
blue - polar
green - hydrophobic

sliced protein

Page 94:

rCaMKII  HQKLEREARICRLLKHPNIVRLHDSISEEGHHYLIFDLVTGGELFEDIVAREYYSEADAS
rCaMKI   GS-MENEIAVLHKIKHPNIVALDDIYESGGHLYLIMQLVSGGELFDRIVEKGFYTERDAS 119
.* * . . .****** * * ** *** .**.*****. ** . .*.* ***

rCaMKII  HCIQQILEAVLHCHQMGVVHRDLKPENLLLASKLKGAAVKLADFGLAIEVEGEQQRWFGF
rCaMKI   RLIFQVLDAVKYLHDLGIVHRDLKPENLLYYSLDEDSKIMISDFGLSKMED-PGSVLSTA 178
. * *.*.** * *.*********** * . . ..****. .

protein structure

Page 95:

rCaMKII  HQKLEREARICRLLKHPNIVRLHDSISEEGHHYLIFDLVTGGELFEDIVAREYYSEADAS
rCaMKI   GS-MENEIAVLHKIKHPNIVALDDIYESGGHLYLIMQLVSGGELFDRIVEKGFYTERDAS 119
.* * . . .****** * * ** *** .**.*****. ** . .*.* ***

rCaMKII  HCIQQILEAVLHCHQMGVVHRDLKPENLLLASKLKGAAVKLADFGLAIEVEGEQQRWFGF
rCaMKI   RLIFQVLDAVKYLHDLGIVHRDLKPENLLYYSLDEDSKIMISDFGLSKMED-PGSVLSTA 178
. * *.*.** * *.*********** * . . ..****. .

protein structure

Page 96:

rCaMKII  HCIQQILEAVLHCHQMGVVHRDLKPENLLLASKLKGAAVKLADFGLAIEVEGEQQRWFGF
rCaMKI   RLIFQVLDAVKYLHDLGIVHRDLKPENLLYYSLDEDSKIMISDFGLSKMED-PGSVLSTA 178
. * *.*.** * *.*********** * . . ..****. .

rCaMKII  AGTPGYLSPEVLRKDPYGKPVDLWACGVILYILLVGYPPFWDEDQHRLYQQIKARAYDFP
rCaMKI   CGTPGYVAPEVLAQKPYSKAVDCWSIGVIAYILLCGYPPFYDENDAKLFEQILKAEYEFD 238
.*****..**** . ** * ** *. *** **** ***** **.. .*..** *.*

protein structure

Page 97:

protein structure

Page 98:

protein structure prediction

Page 99:

protein model

Goodsell, PDB

Page 100:

protein structure prediction

the 3-D structure of proteins is used to understand protein function and design new drugs

Page 101:

Structural Predictions just from raw protein sequence?

1. ggcacgaggc acggctgtgc aggcacgcat gcaggccagc ….
2. atctgcacgt ggttatgctg ccggagtttg ggccgccact….

protein structure prediction

Page 102:

protein structure prediction

1 2

Page 103:

[Figure: per-residue prediction tracks plotted against sequence position (axis ticks at 50 and 100): KD Hydrophobicity, Surface Prob., Flexibility, Antigenic Index, CF Turns, CF Alpha Helices, CF Beta Sheets, GOR Alpha Helices, GOR Turns, GOR Beta Sheets, Glycosylation Sites]

Particular structural features can be recognised in protein sequences

protein structure prediction
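One of the tracks named in the figure, KD hydrophobicity, is simple enough to sketch. The block below is a minimal illustration, assuming the usual Kyte-Doolittle hydropathy scale and a sliding-window mean; the function name `hydropathy_profile` and the window of 7 are choices made here for illustration and are not taken from the slides.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return the mean KD hydropathy of every full window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KD[res] for res in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Fragment copied from the rCaMKII row of the alignment in this deck.
    fragment = "HQKLEREARICRLLKHPNIVRLHD"
    for start, value in enumerate(hydropathy_profile(fragment), start=1):
        print(f"{start:3d} {value:6.2f}")
```

Stretches with a high positive mean are candidates for buried or membrane-spanning segments; strongly negative stretches are more likely to be surface exposed.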

Page 104:

structure prediction

Comparative modeling
Modeling the structure of a protein that has a high degree of sequence identity with a protein of known structure.
Sequence identity must be >30% for the model to be reliable.
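The 30% criterion can be checked directly on a pairwise alignment. The sketch below assumes one common convention (identical residues divided by the number of ungapped aligned columns); other denominators are in use, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # One block of the rCaMKII / rCaMKI alignment shown in this deck.
    rcamkii = "HQKLEREARICRLLKHPNIVRLHDSISEEGHHYLIFDLVTGGELFEDIVAREYYSEADAS"
    rcamki = "GS-MENEIAVLHKIKHPNIVALDDIYESGGHLYLIMQLVSGGELFDRIVEKGFYTERDAS"
    identity = percent_identity(rcamkii, rcamki)
    print(f"{identity:.1f}% identity, template usable: {identity > 30.0}")
```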

Page 105:

statistical methods

Residue conformational preferences:
Glu, Ala, Leu, Met, Gln, Lys, Arg - helix
Val, Ile, Tyr, Cys, Trp, Phe, Thr - strand
Gly, Asn, Pro, Ser, Asp - turn

Chou-Fasman algorithm:
Identification of helix and sheet "nuclei"
helix - 4 out of 6 residues with high helix propensity
sheet - 3 out of 5 residues with high sheet propensity
Propagation until termination criteria are met
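The nucleation step can be sketched as a window scan. This is a simplified illustration, not the full Chou-Fasman method: it uses the residue classes listed above as the "high propensity" sets instead of numeric propensity tables, and it omits the propagation and termination steps. All function names are hypothetical.

```python
HELIX_FORMERS = set("EALMQKR")   # Glu, Ala, Leu, Met, Gln, Lys, Arg
STRAND_FORMERS = set("VIYCWFT")  # Val, Ile, Tyr, Cys, Trp, Phe, Thr


def find_nuclei(sequence, formers, window, required):
    """Return 0-based starts of windows holding at least `required` formers."""
    return [
        start
        for start in range(len(sequence) - window + 1)
        if sum(res in formers for res in sequence[start:start + window]) >= required
    ]


def helix_nuclei(sequence):
    """Helix nucleus: 4 out of 6 residues are helix formers."""
    return find_nuclei(sequence, HELIX_FORMERS, window=6, required=4)


def strand_nuclei(sequence):
    """Sheet nucleus: 3 out of 5 residues are strand formers."""
    return find_nuclei(sequence, STRAND_FORMERS, window=5, required=3)


if __name__ == "__main__":
    seq = "MLDTNMKTQLKAYLEKLTKPVELIATL"  # fragment from the threading slides
    print("helix nuclei start at:", helix_nuclei(seq))
    print("strand nuclei start at:", strand_nuclei(seq))
```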

Page 106:

Threading/fold recognition
Uses known fold structures to predict folds in a primary sequence.

structure prediction

Page 107:

inverse protein folding

Based on the assumption that there is a limited number of structural protein classes (folds). One attempts to assign a new protein sequence to one of these classes.

Page 108:

fold recognition/threading

...MLDTNMKTQL KAYLEKLT KPVELIATL DDSAKSAEIKELL...

structure library

Page 109:

...MLDTNMKTQL KAYLEKLT KPVELIATL DDSAKSAEIKELL...

fold recognition/threading

Page 110:

Ab initio
Predicting structure from primary sequence data.
Generate as many conformations as possible, and assign an energy score to each one.
When the search terminates (usually when resources run out), the one with the lowest energy score is selected.
Usually less robust and less practical; computationally intensive.

structure prediction
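The generate-score-select loop can be illustrated on a toy problem. The sketch below swaps the real physics for the 2-D HP lattice model (H = hydrophobic, P = polar, energy of -1 per non-bonded H-H contact), which is an assumption made here for illustration; real ab initio methods use far richer conformational sampling and energy functions.

```python
import random

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def random_conformation(length, rng):
    """Grow a self-avoiding walk on a square lattice; None on a dead end."""
    path = [(0, 0)]
    occupied = {(0, 0)}
    for _ in range(length - 1):
        x, y = path[-1]
        free = [(x + dx, y + dy) for dx, dy in MOVES
                if (x + dx, y + dy) not in occupied]
        if not free:
            return None
        step = rng.choice(free)
        path.append(step)
        occupied.add(step)
    return path


def energy(hp_sequence, path):
    """-1 for each H-H pair on adjacent lattice sites that are not chain neighbours."""
    index_at = {pos: i for i, pos in enumerate(path)}
    score = 0
    for i, (x, y) in enumerate(path):
        if hp_sequence[i] != "H":
            continue
        for dx, dy in MOVES:
            j = index_at.get((x + dx, y + dy))
            # j > i + 1 counts each non-bonded pair exactly once
            if j is not None and j > i + 1 and hp_sequence[j] == "H":
                score -= 1
    return score


def search(hp_sequence, attempts=2000, seed=0):
    """Sample random conformations and keep the lowest-energy one."""
    rng = random.Random(seed)
    best_path, best_energy = None, None
    for _ in range(attempts):
        path = random_conformation(len(hp_sequence), rng)
        if path is None:
            continue
        e = energy(hp_sequence, path)
        if best_energy is None or e < best_energy:
            best_path, best_energy = path, e
    return best_path, best_energy


if __name__ == "__main__":
    path, e = search("HPHPPHHPHPPHPHHPPHPH")
    print("lowest energy found:", e)
```

The `attempts` budget plays the role of "when resources run out": the search never proves it has found the global minimum, it only reports the best conformation seen.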

Page 111:

function prediction

Key problem: predict the function of protein structures based on sequence and structure information

Function is loosely defined, and can be thought of at many levels:
Atomic or molecular level
Pathways level
Network level
Etc.

Currently, relatively little progress has been made in function prediction, particularly for higher order processes

Page 112:

Experimentation
Experimentally determine the function of proteins and other structures.
The “gold standard” of function determination.
Expensive in terms of time and money.

current methods

function prediction

Page 113:

Annotation transfer
When sequence or structure analysis yields correspondences between structures, the known properties and function of one are used to extrapolate the properties and function of the other.

This method has been extremely successful, but its drawbacks include [Bork et al., 1998]:
Similar sequence or structure does not always imply similar function.
The annotated information about the “known” protein, or its sequence or structure information in the database, may be incomplete or incorrect.
Generally, only molecular functions of a protein can be inferred by analogy (i.e. not higher-level functions).
From a formal point of view, properties derived in this manner must be verified through experimentation.

current methods

function prediction

Page 114:

simulation-based analysis

Simulation-based analysis tests hypotheses with in silico experiments, providing predictions to be tested by in vitro and in vivo studies.

It is faster and more economical than laboratory experimentation. Example: Folding@Home

Page 115:

Folding@Home

Simulates protein folding.
The fold dictates the function of the protein.
Christian Anfinsen showed that an unfolded protein can refold on its own, so the sequence determines the fold.
When proteins do not fold properly, it leads to diseases such as Alzheimer’s disease, mad cow disease, and Parkinson’s disease.

If the fold of the protein is known then it can also be unfolded.

Page 116:

Runs on a distributed system.
Runs as a screensaver.
Downloadable at: http://folding.stanford.edu

Folding@Home

Page 117:

drug design

Page 118:

structure-based drug design

Page 119:

[Flowchart boxes]
Compound databases, microbial broths, plant extracts, combinatorial libraries
3-D ligand databases
Docking, linking or binding
Receptor-ligand complex
Random screening, synthesis
Lead molecule
3-D QSAR
Target enzyme OR receptor
3-D structure by crystallography, NMR, electron microscopy OR homology modeling
Redesign to improve affinity, specificity etc.
Testing

structure-based drug design

Page 120:

3D QSAR

Quantitative structure-activity relationships (QSAR), used to calculate and predict charge distribution, solubility, hydrophobicity, and lipophilicity.

Page 121:

active sites

Page 122:

Glutathione-GR

drug target site

Page 123:

DHFR

drug target site

Page 124:

multiple alignments of DHFRCLUSTAL W (1.81) multiple sequence alignment chabaudi -----------------------E--KAGCFSNKTFKGLGNEGGLPWKCNSVDMKHFSSV 35 vinckei -----------AICACCKVLNSNE--KASCFSNKTFKGLGNAGGLPWKCNSVDMKHFVSV 47 berghei MEDLSETFDIYAICACCKVLNDDE--KVRCFNNKTFKGIGNAGVLPWKCNLIDMKYFSSV 58 yoelii -----------AICACCKVINNNE--KSGSFNNKTFNGLGNAGMLPWKYNLVDMNYFSSV 47 vivax MEDLSDVFDIYAICACCKVAPTSEGTKNEPFSPRTFRGLGNKGTLPWKCNSVDMKYFSSV 60 falciparum -------------------------KKNEVFNNYTFRGLGNKGVLPWKCNSLDMKYFCAV 35 * *. **.*:** * **** * :**::* :* chabaudi TSYVNETNYMRLKWKRDRYMEK---------NNVKLNTDGIPSVDKLQNIVVMGKASWES 86 vinckei TSYVNENNYIRLKWKRDKYIKE---------NNVKVNTDGIPSIDKLQNIVVMGKTSWES 98 berghei TSYINENNYIRLKWKRDKYMEKHNLK-----NNVELNTNIISSTNNLQNIVVMGKKSWES 113 yoelii TSYVNENNYIRLQWKRDKYMGKNNLK-----NNAELNNGELN--NNLQNVVVMGKRNWDS 100 vivax TTYVDESKYEKLKWKRERYLRMEASQGGGDNTSGGDNTHGGDNADKLQNVVVMGRSSWES 120 falciparum TTYVNESKYEKLKYKRCKYLNKET----------VDNVNDMPNSKKLQNVVVMGRTNWES 85 *:*::*.:* :*::** :*: * .:***:****: .*:* chabaudi IPSKFKPLQNRINIILSRTLKKEDLAKEYN------NVIIINSVDDLFPILKCIKYYKCF 140 vinckei IPSKFKPLENRINIILSRTLKKENLAKEYS------NVIIIKSVDELFPILKCIKYYKCF 152 berghei IPKKFKPLQNRINIILSRTLKKEDIVNENN--NENNNVIIIKSVDDLFPILKCTKYYKCF 171 yoelii IPPKFKPLQNRINIILSRTLKKEDIANEDNKNNENGTVMIIKSVDDLFPILKAIKYYKCF 160 vivax IPKQYKPLPNRINVVLSKTLTKEDVK---------EKVFIIDSIDDLLLLLKKLKYYKCF 171 falciparum IPKKFKPLSNRINVILSRTLKKEDFD---------EDVYIINKVEDLIVLLGKLNYYKCF 136 ** ::*** ****::**:**.**:. 
* **..:::*: :* :***** chabaudi I----------------------------------------------------------- 141 vinckei IIGGASVYKEFLDRNLIKKIYFTRINNAYT------------------------------ 182 berghei IIGGSSVYKEFLDRNLIKKIYFTRINNSYNCDVLFPEINENLFKITSISDVYYSNNTTLD 231 yoelii IIGGSYVYKEFLDRNLIKKIYFTRINNSYN------------------------------ 190 vivax IIGGAQVYRECLSRNLIKQIYFTRINGAYPCDVFFPEFDESQFRVTSVSEVYNSKGTTLD 231 falciparum I----------------------------------------------------------- 137 * chabaudi --------- vinckei --------- berghei FIIYSKTKE 240 yoelii --------- vivax FLVYSKVGG 240 falciparum ---------

Page 125:

In the absence of a structure of target-ligand complex, it is not a trivial exercise to locate the binding site!!!

This is followed by Lead optimization.

binding site analysis

Page 126:

lead optimisation

[Figure labels: Lead, Lead Optimization, Active site]

Page 127:

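$\mathrm{LIGAND}\cdot\mathrm{wat}_{n} + \mathrm{PROTEIN}\cdot\mathrm{wat}_{m} \rightleftharpoons \mathrm{LIGAND}\cdot\mathrm{PROTEIN}\cdot\mathrm{wat}_{p} + (n+m-p)\,\mathrm{wat}$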

HYDROGEN BONDING

HYDROPHOBIC EFFECT

ELECTROSTATIC INTERACTIONS

VAN DER WAALS INTERACTIONS

STRAIN IN THE LIGAND (BOUND)

STRAIN IN THE PROTEIN

drug design

factors affecting the affinity of a small molecule for a target protein

Page 128:

difference between inhibitor and drug

Selectivity
Less toxicity
Bioavailability
Slow clearance
Reach the target
Ease of synthesis
Low price
Slow or no development of resistance
Stability upon storage as tablet or solution
Pharmacokinetic parameters
No allergies

Extra requirements of a drug compared to an inhibitor

Page 129:

Proteins that interact with drugs are typically enzymes or receptors.
Drugs may be classified as: substrates/inhibitors (for enzymes), agonists/antagonists (for receptors).
Ligands for receptors normally bind via non-covalent, reversible binding.
Enzyme inhibitors have a wide range of modes: non-covalent reversible, covalent reversible/irreversible, or suicide inhibition.
Enzymes prefer to bind transition states (reaction intermediates) and may not bind substrates optimally, because part of the binding energy is used for catalysis.
In contrast, inhibitors are designed to bind with higher affinity: their affinities often exceed the corresponding substrate affinities by several orders of magnitude!
Agonists are analogous to enzyme substrates: part of the binding energy may be used for signal transduction, inducing a conformation or aggregation shift.

thermodynamics of receptor-ligand binding
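The phrase "orders of magnitude" in affinity can be translated into free energy with the standard relation ΔG° = RT ln K_d (1 M reference state). This relation is textbook thermodynamics and is not taken from the slides; the function name and the example K_d values below are illustrative assumptions.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)


if __name__ == "__main__":
    # Hypothetical values: a millimolar substrate versus a nanomolar inhibitor.
    for label, kd in [("substrate, Kd = 1 mM", 1e-3), ("inhibitor, Kd = 1 nM", 1e-9)]:
        print(f"{label}: dG = {binding_free_energy(kd):.2f} kcal/mol")
```

At room temperature each tenfold gain in affinity is worth roughly 1.4 kcal/mol, so a gap of several orders of magnitude corresponds to only a handful of well-placed interactions.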

Page 130:

To understand what forces are responsible for ligands binding to receptors/enzymes, it is worthwhile considering what forces drive protein folding; they share many common features.

The observed structure of a protein is generally a consequence of the hydrophobic effect!

Secondary amides form much stronger H-bonds to water than to other secondary amides: hydrophobic collapse.

Proteins generally bury hydrophobic residues inside the core, while exposing hydrophilic residues to the exterior. Salt bridges inside.

Ligand binding clefts in proteins often expose hydrophobic residues to solvent and may contain partially desolvated hydrophilic groups that are not paired.

The desolvation penalty is paid for by favourable (hydrophobic) interactions elsewhere in the structure.

thermodynamics of receptor-ligand binding

Page 131:

docking methods

Docking of ligands to proteins is a formidable problem since it entails optimization of the 6 rigid-body degrees of freedom of the ligand (3 translational and 3 rotational).

Rigid vs Flexible
Speed vs Reliability
Manual Interactive Docking
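The six rigid-body degrees of freedom can be written as a pose (tx, ty, tz, rx, ry, rz). The sketch below shows how such a pose moves ligand coordinates; the rotation convention (Rz·Ry·Rx about the origin, angles in radians) and the function names are assumptions chosen for illustration, and a docking program would wrap a scoring function and an optimizer around this step.

```python
import math


def rotation_matrix(rx, ry, rz):
    """Rotation R = Rz(rz) * Ry(ry) * Rx(rx), angles in radians."""
    cx, sx = math.cos(rx), math.sin(rx)
    cy, sy = math.cos(ry), math.sin(ry)
    cz, sz = math.cos(rz), math.sin(rz)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def apply_pose(coords, pose):
    """Rotate ligand atoms about the origin, then translate them."""
    tx, ty, tz, rx, ry, rz = pose
    rot = rotation_matrix(rx, ry, rz)
    moved = []
    for x, y, z in coords:
        moved.append((
            rot[0][0] * x + rot[0][1] * y + rot[0][2] * z + tx,
            rot[1][0] * x + rot[1][1] * y + rot[1][2] * z + ty,
            rot[2][0] * x + rot[2][1] * y + rot[2][2] * z + tz,
        ))
    return moved


if __name__ == "__main__":
    ligand = [(1.0, 0.0, 0.0), (0.0, 1.5, 0.0)]   # hypothetical two-atom ligand
    pose = (1.0, 2.0, 3.0, 0.0, 0.0, math.pi / 2)  # translate, then turn 90 degrees about z
    print(apply_pose(ligand, pose))
```

A rigid pose leaves all interatomic distances within the ligand unchanged; flexible docking adds torsion angles on top of these six parameters.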

Page 132:

GRID based docking methods

Grid Based methods
GRID (Goodford, 1985, J. Med. Chem. 28:849)
GREEN (Tomioka & Itai, 1994, J. Comp. Aided. Mol. Des. 8:347)
MCSS (Miranker & Karplus, 1991, Proteins, 11:29).

Functional groups are placed at regularly spaced (0.3-0.5 Å) lattice points in the active site and their interaction energies are evaluated.
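The lattice evaluation can be sketched with a single spherical probe. This is a toy model, not the energy function of GRID, GREEN or MCSS: it assumes a Lennard-Jones 12-6 term plus a Coulomb term with a constant dielectric, and every parameter value (epsilon, sigma, dielectric, the 0.5 Å distance floor) is an illustrative placeholder.

```python
import math

COULOMB = 332.06  # kcal*Angstrom/(mol*e^2)


def probe_energy(point, atoms, probe_charge=0.0,
                 epsilon=0.1, sigma=3.4, dielectric=4.0):
    """Lennard-Jones plus Coulomb energy (kcal/mol) of a probe at `point`.

    atoms is a list of (x, y, z, partial_charge) tuples in Angstrom.
    """
    total = 0.0
    for x, y, z, charge in atoms:
        r = max(math.dist(point, (x, y, z)), 0.5)  # floor avoids division by zero
        sr6 = (sigma / r) ** 6
        total += 4.0 * epsilon * (sr6 * sr6 - sr6)
        total += COULOMB * probe_charge * charge / (dielectric * r)
    return total


def energy_grid(atoms, origin, counts, spacing=0.5, **probe_kwargs):
    """Evaluate the probe energy at every lattice point; keys are (i, j, k)."""
    grid = {}
    for i in range(counts[0]):
        for j in range(counts[1]):
            for k in range(counts[2]):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                grid[(i, j, k)] = probe_energy(point, atoms, **probe_kwargs)
    return grid


if __name__ == "__main__":
    site = [(0.0, 0.0, 0.0, -0.5), (3.0, 0.0, 0.0, 0.3)]  # hypothetical site atoms
    grid = energy_grid(site, origin=(-4.0, -4.0, -4.0), counts=(17, 17, 17),
                       spacing=0.5, probe_charge=0.5)
    best = min(grid, key=grid.get)
    print("most favourable lattice point:", best, round(grid[best], 2))
```

Because the protein is held fixed, the grid is computed once per probe type and can then be reused for every ligand that contains that functional group.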

Page 133:

automated docking methods

The basic idea is to fill the active site of the target protein with a set of spheres.

Match the centres of these spheres as well as possible with the atoms in a database of small molecules with known 3-D structures.

Examples: DOCK, CAVEAT, AUTODOCK, LEGEND, ADAM, LINKOR, LUDI.

Page 134:

drug binding pocket of L. casei DHFR

Page 135:

prediction & design of new drugs

Prediction of the 3-D structure of PfDHFR using bacterial DHFR and a homology modeling approach.

Search for compounds with bifunctional basic groups that could form stable H-bonds in a plane with the carboxyl group.

Optimize the structure of the small molecules and then dock them on the PfDHFR model.

Toyoda et al. (1997), BBRC 235:515-519, identified two compounds.

Page 136:

These two compounds, a triazinobenzimidazole and a pyridoindole, were found to be active with high Ki against recombinant wild type DHFR.

This demonstrates the use of molecular modeling in malarial drug design.

identifying new leads

Page 137:

physiome project

Page 138:

virtual human

Page 139:

http://www.physiome.org/

virtual human

Simulation of complex models of cells, tissues and organs

Page 140:

“A worldwide effort to define the physiome by developing databases and models which will facilitate the understanding of the integrative functions of cells, organs and organisms.”

Definition: The physiome is the quantitative and integrated description of the functional behavior of the physiological state of an individual or species.

physiome project

Page 141:

main objective: “… to understand and describe the human organism, its physiology and pathophysiology quantitatively, and to use this understanding to improve human health.”

physiome project

Page 142:

Specific Objectives:
1. To develop a database with observations of physiological phenomena and interpret these in terms of mechanism (reductionism).

2. To integrate experimental information into quantitative descriptions of the functioning of humans and other organisms (modern integrative biology glued together via modeling). 

3. To disseminate experimental data and integrative models for teaching and research. 

physiome project

Page 143:

Specific Objectives:
4. To foster collaboration amongst investigators worldwide, in an effort to speed up the discovery of how biological systems work.

5. To determine the most effective targets (molecules or systems) for therapy, either pharmaceutical or genomic.

6. To provide information for the design of tissue-engineered, biocompatible implants.

physiome project

Page 144:

physiome project

Issues being addressed:
1. Markup language -- development of SBML (at Caltech) for representing biochemical networks and CellML for electrophysiology, mechanics, energetics and general pathways.

2. Mathematical models -- development of models that are “anatomically based” and “biophysically based” to link gene, protein, cell, tissue, organ and whole body systems physiology.

Page 145:

Issues being addressed:
3. Web-accessible databases -- For easy data exchange, groups at MIT and UCSD are developing standards for this. Example databases: Genomic Databases, Protein Databases, Material Property Databases, Anatomical Model Databases, Clinical Databases

4. Development of new instrumentation

5. Development of modeling tools, GUIs and web-accessible tools for visualization of complex models.

physiome project

Page 146:

physiome project

http://www.bme.jhu.edu/news/microphys

1. Microcirculation
A common functional system between organs; it provides an important coupling between cells, tissues, and organs.

Page 147:

a b

physiome project

http://www.bioeng.auckland.ac.nz/projects/nerf/skeletal.php

2. Musculo-skeletal system
Continues to extend the database of parameterised bone geometry to individual muscles, ligaments and tendons.

a Anatomically detailed model of Skeleton.

b Rendered finite element mesh for the bones and a subset of the muscles

Page 148:

a b

physiome project

Computational model of the skull and torso.

a The layer of skeletal muscle is highlighted. b The heart and lungs shown within the torso.

Page 149:

physiome project

3. Cardiome Project
An attempt to provide an integrated model of the heart, incorporating electrical activation, mechanical contraction, energy supply and utilization, cell signaling and many other biochemical processes.

Heart model with a textured epicardial surface

Page 150:

a b c

physiome project

Fibrous-sheet architecture of the heart. Ribbons are drawn in the plane of the myocardial sheets (a) on the epicardial surface of the heart, (b) at midwall, and (c) on the endocardial surface. Note the large fibre angle changes. These fibre-sheet material axes are needed for computation of both myocardial activation and ventricular mechanics.

heart structure

Page 151:

physiome project

heart structure

The finite element model of the right and left ventricle of the heart showing various anatomical structures. Geometric information is carried at the nodes of the finite element mesh and interpolated with cubic Hermite basis functions.
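Cubic Hermite interpolation stores both a value and a derivative at each node. The one-dimensional sketch below uses the standard cubic Hermite basis on a local coordinate ξ in [0, 1]; the heart model interpolates geometry in three dimensions, so this is only an illustration of the basis functions, with hypothetical function names and derivatives taken with respect to ξ.

```python
def hermite_basis(xi):
    """Return the four cubic Hermite basis functions evaluated at xi in [0, 1]."""
    h00 = 1.0 - 3.0 * xi ** 2 + 2.0 * xi ** 3   # weights the value at node 0
    h10 = xi * (xi - 1.0) ** 2                  # weights the derivative at node 0
    h01 = xi ** 2 * (3.0 - 2.0 * xi)            # weights the value at node 1
    h11 = xi ** 2 * (xi - 1.0)                  # weights the derivative at node 1
    return h00, h10, h01, h11


def interpolate(xi, u0, du0, u1, du1):
    """Interpolate a nodal field from values and xi-derivatives at both nodes."""
    h00, h10, h01, h11 = hermite_basis(xi)
    return h00 * u0 + h10 * du0 + h01 * u1 + h11 * du1


if __name__ == "__main__":
    # Hypothetical nodal data: radius 20 and 25 with slopes 2 and 0 at the two nodes.
    for step in range(5):
        xi = step / 4
        print(f"xi={xi:.2f} radius={interpolate(xi, 20.0, 2.0, 25.0, 0.0):.3f}")
```

Because neighbouring elements share nodal derivatives as well as values, the interpolated surface has continuous slope across element boundaries, which suits smooth anatomical shapes.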

Page 152:

physiome project

ventricular mechanics

Mechanics of the cardiac cycle, computed by large deformation finite element analysis, at (a) zero pressure state, (b) end-diastole, (c) mid-systole, (d) end-systole. Note the apex to base shortening and the twisting about the long axis. Also note the six generations of discretely modeled coronary vessels embedded within the myocardial elements which are used to compute coronary flow throughout the cardiac cycle.

a b c d

Page 153:

physiome project

ventricular mechanics

The collagenous structure of the extra-cellular myocardial tissue matrix, as revealed by confocal microscopy. The material axes used for defining mechanical and electrical constitutive laws in the continuum modeling of the myocardium are based on these microstructurally defined axes.

physiome project

myocardial activation

Activation wave front computed on the finite element model using finite difference techniques based on grid points which move with the deforming myocardium. Bi-domain current conservation equations are solved with trans-membrane ionic currents. The stimulus in this case is a point on the left ventricular endocardial surface near the apex. The activation sequence is heavily influenced by the fibrous-sheet architecture of the myocardium.
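The idea of computing an activation sequence with finite differences can be sketched on a much simpler problem. The Python example below is not the bi-domain model described above: it assumes a 1D monodomain cable, a fixed (non-deforming) grid, and FitzHugh-Nagumo kinetics standing in for the trans-membrane ionic currents. All parameter values and names are illustrative assumptions. It stimulates one end of the cable and records when each grid point first becomes excited.

```python
import numpy as np


def simulate_activation(n=100, dx=0.5, dt=0.05, t_end=200.0,
                        D=1.0, a=0.1, eps=0.01, b=0.5, n_stim=10):
    """Return the activation time of each grid point on a 1D cable.

    v is a dimensionless membrane potential, w a slow recovery variable.
    Activation time is the first time v exceeds 0.5 (np.inf if never).
    """
    v = np.zeros(n)
    w = np.zeros(n)
    v[:n_stim] = 1.0                      # point-like stimulus at the left end
    act = np.full(n, np.inf)
    act[v > 0.5] = 0.0

    for k in range(1, int(round(t_end / dt)) + 1):
        # Ghost nodes copy the boundary values, giving zero-flux ends.
        vp = np.pad(v, 1, mode="edge")
        lap = (vp[:-2] - 2.0 * v + vp[2:]) / dx**2
        dv = D * lap + v * (v - a) * (1.0 - v) - w   # diffusion + excitation
        dw = eps * (b * v - w)                       # slow recovery
        v = v + dt * dv                              # explicit Euler step
        w = w + dt * dw
        newly = (v > 0.5) & np.isinf(act)
        act[newly] = k * dt
    return act


if __name__ == "__main__":
    act = simulate_activation()
    for i in (0, 30, 60, 90):
        print(f"node {i}: activated at t = {act[i]:.2f}")
```

In the full model the scalar diffusion coefficient D would be replaced by conductivity tensors aligned with the fibre-sheet axes, which is how the tissue architecture shapes the activation sequence.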

physiome project

coronary perfusion

Computed flow in the coronary vasculature

Figure panels: Epicardial Fibers (FEM Model); Endocardial Fibers (FEM Model)

www.ccmb.jhu.edu

physiome project

ventricular fluid flow

physiome project

A human torso model has been developed that includes the heart, the lungs, and the layers of skeletal muscle, fat and skin. Current flow from the heart into the torso is computed in order to predict the body surface potentials arising from activation of the myocardium.

physiome project

4. Lungs: Models are being developed of the integrated function of the various physical processes operating in the lung.

5. Bladder and Prostate: An anatomically detailed model of the bladder and prostate is being developed.

6. Circulation System: A model of the circulation system is being developed based on the Visible Human Project dataset (http://www.nlm.nih.gov/research/visible)

future

Development of Precision Models

Simulation requires the integration of multiple hierarchies of models that have different scales and qualitative properties

Some biological processes take place within milliseconds while others may take hours or days

Example: protein folding vs. cell mitosis

Development of Precision Models

Biological processes can involve the interaction of different types of processes (e.g. biochemical networks coupled to protein transport, chromosome dynamics, cell migration or morphological changes in tissues)

future

Development of Precision Models

Types of modeling:

Using differential equations and stochastic simulation

Many cell biological phenomena require calculation of structural dynamics

Deformation of elastic bodies

Spring-mass models and other physical processes
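As a rough illustration of a spring-mass model of an elastic body, the Python sketch below relaxes a 1D chain of point masses joined by linear springs, fixed to a wall at one end and pulled by a constant force at the other. The chain layout, parameter values and function name are assumptions chosen for illustration and are not taken from any particular cell model. With damping, the chain settles into static equilibrium, where every spring is stretched by force / k.

```python
def relax_chain(n=5, k=10.0, m=1.0, c=1.0, force=1.0, dt=0.01, t_end=40.0):
    """Integrate a damped spring-mass chain and return final displacements.

    n     : number of masses (mass 0 is attached to a fixed wall)
    k     : spring stiffness, m : mass, c : viscous damping coefficient
    force : constant pull applied to the last mass
    """
    u = [0.0] * n      # displacement of each mass from its rest position
    vel = [0.0] * n    # velocity of each mass
    for _ in range(int(round(t_end / dt))):
        f = [0.0] * n
        for i in range(n):
            left = u[i - 1] if i > 0 else 0.0    # the wall does not move
            f[i] -= k * (u[i] - left)            # spring on the left
            if i < n - 1:
                f[i] += k * (u[i + 1] - u[i])    # spring on the right
            f[i] -= c * vel[i]                   # damping
        f[-1] += force                           # external load on the free end
        for i in range(n):
            vel[i] += dt * f[i] / m              # semi-implicit (symplectic) Euler:
            u[i] += dt * vel[i]                  # velocity first, then position
    return u


if __name__ == "__main__":
    for i, ui in enumerate(relax_chain()):
        print(f"mass {i}: displacement {ui:.4f}")
```

The same pattern (accumulate spring forces, then step velocities and positions) extends to 2D or 3D networks of masses, which is one common way to approximate the deformation of a cell membrane or tissue.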

future

the end