Protein-Protein Docking

222
Center for Bioinformatics, Saarbrücken, Germany Celera Genomics, Rockville, MD, USA Protein-Protein Docking Basics and New Applications Oliver Kohlbacher Hans-Peter Lenhof

Transcript of Protein-Protein Docking

Page 1: Protein-Protein Docking

Center for Bioinformatics, Saarbrücken, Germany

Celera Genomics, Rockville, MD, USA

Protein-Protein DockingBasics and New Applications

Oliver Kohlbacher

Hans-Peter Lenhof

Page 2: Protein-Protein Docking

Overview• Introduction

• Basic Algorithms + Scoring Functions

• Integration of Protein Flexibility– Flexibility in Proteins

– Algorithms for Semi-Flexible Docking

• Integration of Experimental Data– NMR Spectroscopy

– NMR-Based Protein Docking

• Outlook + Summary

Page 3: Protein-Protein Docking

Protein-Docking: Was, wie, warum?

Page 4: Protein-Protein Docking

Protein Docking: Introduction

Protein-Protein Docking

Page 5: Protein-Protein Docking

Protein Docking: Introduction

Protein-Protein Docking

Page 6: Protein-Protein Docking

Protein Docking: Introduction

Protein-Protein Docking

• Given two proteins A and B

• Predict complex structure AB

Page 7: Protein-Protein Docking

Protein Docking: Introduction

Protein-Protein Docking

• Uniform chemistry

• Understanding protein

interactions

• Speed-up structure

elucidation

• Predict protein

interactions

Page 8: Protein-Protein Docking

Protein Docking: Introduction

Protein-Ligand Docking

• Small, flexible Ligand

• Ligands are

chemically diverse

• Crucial for Drug

Design

• Virtual screening

Page 9: Protein-Protein Docking

Protein Docking: Introduction

Lock-and-Key Principle

Emil Fischer 1894“ To use an image, I would say that

enzyme and glycoside have to fit into each other like a lock and a key, in order to exert a chemical effect on each other.”

Page 10: Protein-Protein Docking

Protein Docking: Introduction

Lock-and-Key Principle

Page 11: Protein-Protein Docking

Protein Docking: Introduction

Lock-and-Key Principle

Page 12: Protein-Protein Docking

Protein Docking: Introduction

Lock-and-Key Principle

Page 13: Protein-Protein Docking

Protein Docking: Introduction

Lock-and-Key Principle

Page 14: Protein-Protein Docking

Protein Docking: Introduction

Lock-and-Key Principle

Page 15: Protein-Protein Docking

Protein Docking: Introduction

Lock-and-Key Principle

Page 16: Protein-Protein Docking

Protein Docking: Introduction

Lock-and-Key Principle

Page 17: Protein-Protein Docking

Scoring Functions for Protein Docking

Geometry

Chemistry

++- -

Lock-and-Key Principle

Page 18: Protein-Protein Docking

18Algorithms for Protein Docking: Basics

Protein Docking: How?

• Structure generation

Page 19: Protein-Protein Docking

19Algorithms for Protein Docking: Basics

Protein Docking: How?

• Structure generation

• Filtering

Page 20: Protein-Protein Docking

20Algorithms for Protein Docking: Basics

Protein Docking: How?

• Structure generation

• Filtering

Page 21: Protein-Protein Docking

21Algorithms for Protein Docking: Basics

Protein Docking: How?

• Structure generation

• Filtering

• Evaluation

Page 22: Protein-Protein Docking

22Algorithms for Protein Docking: Basics

Protein Docking: How?

< <

• Structure generation

• Filtering

• Evaluation

Page 23: Protein-Protein Docking

Algorithms for Protein Docking: Basics

Connolly SurfaceConnolly 1986 Bacon, Moult 1992Fischer et al. 1995Lin et al. 1994Norel et al. 1995Sandak et al. 1995Campbell et al. 1996Ackermann et al. 1995

Fuzzy LogicExtner, Brickmann 1997

Cube RepresentationJiang, Kim 1991

Graph RepresentationShoichet et al. 1992Shoichet, Kuntz 1996Kasinos et al. 1992

CorrelationKatchalski-Katzir et al.1992Vakser, Aflalo 1994Vakser 1996Gabb et al. 1997Meyer et al. 1996

Monte Carlo ApproachCherfils et al. 1991Totrov, Abagyan 1994

Slices RepresentationWalls, Sternberg 1992Helmer-Citterich et al. 1994Ausiello et al. 1997

Genetic AlgorithmsLevine et al. 1997

DatabaseEster et al. 1995

Overview of Docking Techniques

Page 24: Protein-Protein Docking

Algorithms for Protein Docking: Basics

Connolly SurfaceConnolly 1986 Bacon, Moult 1992Fischer et al. 1995Lin et al. 1994Norel et al. 1995Sandak et al. 1995Campbell et al. 1996Ackermann et al. 1995

Fuzzy LogicExtner, Brickmann 1997

Cube RepresentationJiang, Kim 1991

Graph RepresentationShoichet et al. 1992Shoichet, Kuntz 1996Kasinos et al. 1992

CorrelationKatchalski-Katzir et al.1992Vakser, Aflalo 1994Vakser 1996Gabb et al. 1997Meyer et al. 1996

Monte Carlo ApproachCherfils et al. 1991Totrov, Abagyan 1994

Slices RepresentationWalls, Sternberg 1992Helmer-Citterich et al. 1994Ausiello et al. 1997

Genetic AlgorithmsLevine et al. 1997

DatabaseEster et al. 1995

Overview of Docking Techniques

Page 25: Protein-Protein Docking

Algorithms for Protein Docking: Basics

What are the differences?

• Method for determining tentative structures– Based on surface complementarity

– Based on triangles

– ….

• Scoring functions employed– Contact surface area

– Electrostatics

– Hydrogen bonds

– ….

Page 26: Protein-Protein Docking

Algorithms for Protein Docking: Basics

Structure Generation• Assumption: Proteins are rigid bodies!• Three-dimensional “puzzle”• Six degrees of freedom

– Rotation– Translation

• Identify rigid transformations bringing B in contact with A

• Discretization– Of proteins, protein surface– Rotational/translational space

Page 27: Protein-Protein Docking

Algorithms for Protein Docking: Basics

Structure Generation

• Katchalski-Katzir et al., Proc. Natl. Acad.

Sci. USA, 1992

– Grid-based correlation of A and B

• Lenhof, RECOMB 97

– Mapping of triangles

Page 28: Protein-Protein Docking

Lenhof, RECOMB 1997

Hash Table

Structure Generation

Calculate “critical points” on surface of A

Calculate triangles define by critical points

Calculate triangles defined by atoms of B

Search for triangles with similar side lengths

Page 29: Protein-Protein Docking

Lenhof, RECOMB 1997

Structure Generation

Calculate “critical points” on surface of A

Calculate triangles define by critical points

Calculate triangles defined by atoms of B

Search for triangles with similar side lengths

Page 30: Protein-Protein Docking

Lenhof, RECOMB 1997

Structure Generation

Page 31: Protein-Protein Docking

Scoring Functions for Protein Docking

Scoring Functions

• What distinguishes the true complex structure from “ false positives”?

• Physical chemistry:Complex structure with the lowest binding free

energy is the one observed in nature.

• Caveat: relies on sufficiently complete sampling of conformation space

Page 32: Protein-Protein Docking

Scoring Functions for Protein Docking

Prediction of Free Energy

• Also hardest part in related problems:

– Ligand docking

– Protein structure prediction

• Many methods neglect

– entropic contributions

– solvent effects

Page 33: Protein-Protein Docking

Scoring Functions for Protein Docking

Scoring Functions

• Simplest functions: Geometry

– Key-and-lock principle

– Large contact areas are favorable

– Overlaps between A and B are unfavorable

• More sophisticated: “Chemistry”

– Models based on physicochemistry

– Compromise between complexity and accuracy

Page 34: Protein-Protein Docking

Scoring Functions for Protein Docking

Geometry

Chemistry

++- -

Scoring Function

Page 35: Protein-Protein Docking

Scoring Functions for Protein Docking

Geometry

Chemistry

++- -

Scoring Function

Page 36: Protein-Protein Docking

Scoring Functions for Protein Docking

Geometric Scoring Function

Page 37: Protein-Protein Docking

Scoring Functions for Protein Docking

Geometric Scoring Function

Determine distance between (a, b) (a ∈A, b∈B)|}0.4.),(75.2|),{ (|)( <<= badistbacontact AB

|}75.2.),(|),{ (|)( <= badistbaoverlap AB)()()(_ 21 ABABAB overlapkcontactkscoregeom ⋅−⋅=

Page 38: Protein-Protein Docking

Scoring Functions for Protein Docking

Geometry

Chemistry

++- -

Scoring Function

Page 39: Protein-Protein Docking

Scoring Functions for Protein Docking

Geometry

Chemistry

++- -

Scoring Function

Page 40: Protein-Protein Docking

Scoring Functions for Protein Docking

Geometry

Chemistry

++- -

Scoring Function

Page 41: Protein-Protein Docking

Scoring Functions for Protein Docking

Chemical Scoring

�∈

=contactba

btypeatypeMscorechem),(

)(),()(_ AB

Page 42: Protein-Protein Docking

Scoring Functions for Protein Docking

Geometry

Chemistry

++- -

Scoring Function

Page 43: Protein-Protein Docking

B

A

2PTC

Page 44: Protein-Protein Docking

2PTC AB

Page 45: Protein-Protein Docking

2PTC No. 1 RMSD = 1.94 A

Page 46: Protein-Protein Docking

Scoring Functions for Protein Docking

Results: Docking of 2PTC

score

RM

SD

]

Page 47: Protein-Protein Docking

Katchalski-Katzir et al., PNAS 1992

Basic Ideas• Protein on grid

Page 48: Protein-Protein Docking

Katchalski-Katzir et al., PNAS 1992

Basic Ideas

000outside

01δ > 0surface

0ρ < 0ρ* δ < 0inside

outsidesurface insideA B

• Protein on grid• Assign values

– ai,j,k =• 1 at the surface of A• ρ << 0 inside A• 0 outside A

– bi,j,k =• 1 at the surface of B• δ > 0 inside B• 0 outside B

Page 49: Protein-Protein Docking

Katchalski-Katzir et al., PNAS 1992

Basic Ideas• Protein on grid• Assign values

– ai,j,k =• 1 at the surface of A• ρ << 0 inside A• 0 outside

– bi,j,k =• 1 at the surface of B• δ > 0 inside B• 0 outside B

000outside

01δ > 0surface

0ρ < 0ρ* δ < 0inside

outsidesurface insideA B

Page 50: Protein-Protein Docking

Katchalski-Katzir et al., PNAS 1992

Calculate Correlation cα,β,γ

• Correlation of a and b identifies those translations where A and B are in contact:

� +++⋅=kji

kjikji bac,,

,,,,,, γβαγβα

• Repeat for all translation vectors (α, β, γ)

���� Run time O(N6)!

Page 51: Protein-Protein Docking

Katchalski-Katzir et al., PNAS 1992

From: Katchalski-Katzir et al., PNAS 1992, 2195

Cross Section cα=0,β,γ

Page 52: Protein-Protein Docking

Katchalski-Katzir et al., PNAS 1992

Algorithm

• Calculate ai,j,k and A* = [FFT(a)]*

• For all rotations of B:– Calculate bi,j,k and B = FFT(b)

– Calculate C = A * B

– Calculate ci,j,k = IFT(C)

– Identify tentative transformations (α, β, γ) as strongly positive peaks in c

Page 53: Protein-Protein Docking

From Rigid to Flexible Docking

3122114TPI

31225521TGS

1221802TGP

8166951TEC

5133812SNI

1317272136914SGB

3124924935182SEC

11221612PTC

8122492MHB

912291256722KAI

1012224HVP

5>125106104176373HFM

28252467922HFL

1233345529001243091FDL

51331064CPA

312221CHO

L97AHP+96MWS96NLW+95NLW+95NLW+95PDB ID

Comparison of AlgorithmsComparison of Algorithms

Page 54: Protein-Protein Docking

13122114TPI

131225521TGS

11221802TGP

18166951TEC

15133812SNI

11317272136914SGB

13124924935182SEC

111221612PTC

18122492MHB

1912291256722KAI

11012224HVP

15>125106104176373HFM

128252467922HFL

171233345529001243091FDL

151331064CPA

1312221CHO

L97AHP+96MWS96NLW+95NLW+95NLW+95PDB ID

Comparison of AlgorithmsComparison of Algorithms

Page 55: Protein-Protein Docking

13122114TPI

131225521TGS

11221802TGP

18166951TEC

15133812SNI

11317272136914SGB

13124924935182SEC

111221612PTC

18122492MHB

1912291256722KAI

11012224HVP

15>125106104176373HFM

128252467922HFL

171233345529001243091FDL

151331064CPA

1312221CHO

L97AHP+96MWS96NLW+95NLW+95NLW+95PDB ID

Comparison of AlgorithmsComparison of Algorithms

Page 56: Protein-Protein Docking

RMSD

2PTC_E 2PTC_I

Fit = GeomScore + ChemScore

Page 57: Protein-Protein Docking

One PS in the Life of PTI

Page 58: Protein-Protein Docking
Page 59: Protein-Protein Docking

RMSD

1TPO 4PTI

Fit = GeomScore + ChemScore

Page 60: Protein-Protein Docking

2PTC 1TPO 4PTI

Page 61: Protein-Protein Docking

2PTC 1TPO 4PTI

Page 62: Protein-Protein Docking

2PTC 1TPO 4PTI

Page 63: Protein-Protein Docking

From Rigid to Flexible Docking

Protein Flexibility• Domain movements

– Large scale movements of protein domains

– Mainly backbone movements

– Highly flexible “hinge” regions: hinge bending

• Side-chain flexibility– Rather rigid backbone

– Local rearrangements of side chains

– Flexibility based on side-chain torsions

Page 64: Protein-Protein Docking

From Rigid to Flexible Docking

Semi-Flexible Docking• Assumptions

– Rigid backbone

– Flexible side chains

• Basic Algorithm

– Start with rigid docking candidates

– Demangle overlapping side chains

– Calculate energy

Good approximation for Trypsin/BPTI!

Page 65: Protein-Protein Docking

Protein Docking with Flexible Side Chains

Page 66: Protein-Protein Docking

Protein Docking with Flexible Side Chains

• Apply the RBD algorithm

Page 67: Protein-Protein Docking

Protein Docking with Flexible Side Chains

• Apply the RBD algorithm

• Optimize the side chains in the docking site of the best RBD candidates

Page 68: Protein-Protein Docking

Protein Docking with Flexible Side Chains

• Apply the RBD algorithm

• Optimize the side chains in the docking site of the best RBD candidates

w.r.t. the potential energy (AMBER)

Page 69: Protein-Protein Docking

14.12.99

ARG 39

Protein Docking with Flexible Side Chains

• Apply the RBD algorithm

• Optimize the side chains in the docking site of the best RBD candidates

w.r.t. the potential energy (AMBER)

using a rotamer library

Page 70: Protein-Protein Docking

From Rigid to Flexible Docking

Torsion angle distributionLYS

Page 71: Protein-Protein Docking

From Rigid to Flexible Docking

SER

Rotamer LibrarySide-chain conformational space is adequately represented

by a discrete set of rotamers

Page 72: Protein-Protein Docking
Page 73: Protein-Protein Docking

(1) Determine all side chains in the docking site!

Page 74: Protein-Protein Docking

(1) Determine all side chains in the docking site!

Page 75: Protein-Protein Docking

S1

S2

(1) Determine all side chains in the docking site!

Page 76: Protein-Protein Docking

S1

S2

(1) Determine all side chains in the docking site!(2) Build sets of potential side chain orientation

Page 77: Protein-Protein Docking

S1 R1 R2

S2

(1) Determine all side chains in the docking site!(2) Build sets of potential side chain orientation

Page 78: Protein-Protein Docking

S1 R1 R2

S2 R1 R2

(1) Determine all side chains in the docking site!(2) Build sets of potential side chain orientation

Page 79: Protein-Protein Docking

S1 R1 R2

S2 R1 R2

(1) Determine all side chains in the docking site!(2) Build sets of potential side chain orientation(3) Determine the rotamer combination with minimal energy!

Page 80: Protein-Protein Docking

S1 R1 R2

S2 R1 R2

Page 81: Protein-Protein Docking

S1 R1 R2

S2 R1 R2

X11

X22X21

X12

Page 82: Protein-Protein Docking

S1 R1 R2

S2 R1 R2

X11

X22X21

X12

Xij ∈ { 0, 1}

Page 83: Protein-Protein Docking

S1 R1 R2

S2 R1 R2

X11

X22X21

X12 (X11 , X12 , X21 , X22)

Xij ∈ { 0, 1}

Page 84: Protein-Protein Docking

S1 R1 R2

S2 R1 R2

X11

X22X21

X12 (X11 , X12 , X21 , X22)( 0 , 1 , 1 , 0 )

Xij ∈ { 0, 1}

Page 85: Protein-Protein Docking

S1 R1 R2

S2 R1 R2

X11

X22X21

X12 (X11 , X12 , X21 , X22)( 0 , 1 , 1 , 0 )

Xij ∈ { 0, 1}

� Xij = 1 ∀ ij

Page 86: Protein-Protein Docking

S1 R1 R2

S2 R1 R2

X11

X22X21

X12 (X11 , X12 , X21 , X22)( 0 , 1 , 1 , 0 )

Xij ∈ { 0, 1}

� Xij = 1 ∀ i

min

j

Page 87: Protein-Protein Docking

S1 R1 R2

S2 R1 R2

X11

X22X21

X12 (X11 , X12 , X21 , X22)( 0 , 1 , 1 , 0 )

Xij ∈ { 0, 1}

� Xij = 1 ∀ i

min ET

j

Page 88: Protein-Protein Docking

S1 R1 R2

S2 R1 R2

X11

X22X21

X12 (X11 , X12 , X21 , X22)( 0 , 1 , 1 , 0 )

Xij ∈ { 0, 1}

� Xij = 1 ∀ i

min ET + � Xij Eijij

j

Page 89: Protein-Protein Docking

S1 R1 R2

S2 R1 R2

X11

X22X21

X12 (X11 , X12 , X21 , X22)( 0 , 1 , 1 , 0 )

Xij ∈ { 0, 1}

� Xij = 1 ∀ i

min ET + � Xij Eij + � � Xij Xkl Eijklij klij

k≠i

j

Page 90: Protein-Protein Docking

S1 R1 R2

S2 R1 R2

X11

X22X21

X12 (X11 , X12 , X21 , X22)( 0 , 1 , 1 , 0 )

Xij ∈ { 0, 1}

� Xij = 1 ∀ i

min ET + � Xij Eij + � � Xij Xkl (Eijkl - Emax)ij kl

k≠i

j

ij

Page 91: Protein-Protein Docking

S1 R1 R2

S2 R1 R2

X11

X22X21

X12 (X11 , X12 , X21 , X22)( 0 , 1 , 1 , 0 )

Xij ∈ { 0, 1}

� Xij = 1 ∀ i

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ { 0, 1}

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)ij kl

k≠i

j

ij

Page 92: Protein-Protein Docking

Solve ILP

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ { 0, 1}

� Xij = 1 ∀ i Xij ∈ { 0, 1}

ij ij

j

kl

Page 93: Protein-Protein Docking

Solve ILP

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ { 0, 1}

� Xij = 1 ∀ i Xij ∈ { 0, 1}

ij ij

j

kl

Page 94: Protein-Protein Docking

Solve ILP

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ { 0, 1}

� Xij = 1 ∀ i Xij ∈ { 0, 1}

ij ij

j

kl

Page 95: Protein-Protein Docking

Solve ILP

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ { 0, 1}

� Xij = 1 ∀ i Xij ∈ { 0, 1}

ij ij

j

kl

Page 96: Protein-Protein Docking

Solve LP

Solve ILP

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ { 0, 1}

� Xij = 1 ∀ i Xij ∈ { 0, 1}

ij ij

j

kl

Page 97: Protein-Protein Docking

Solve ILP

Solve LP

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ [ 0, 1]

� Xij = 1 ∀ i Xij ∈ [ 0, 1]

ij ij

j

kl

Page 98: Protein-Protein Docking

Solve ILP

Solve LP

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ [ 0, 1]

� Xij = 1 ∀ i Xij ∈ [ 0, 1]

ij ij

j

kl

Page 99: Protein-Protein Docking

infeasible

Solve ILP

Solve LP

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ [ 0, 1]

� Xij = 1 ∀ i Xij ∈ [ 0, 1]

ij ij

j

kl

Page 100: Protein-Protein Docking

infeasible

Solve ILP

Solve LP

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ [ 0, 1]

� Xij = 1 ∀ i Xij ∈ [ 0, 1]

ij ij

j

kl

Page 101: Protein-Protein Docking

infeasible

Solve ILP

Solve LP

Search for a cutting plane

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ [ 0, 1]

� Xij = 1 ∀ i Xij ∈ [ 0, 1]

ij ij

j

kl

Page 102: Protein-Protein Docking

Search for a cutting plane

infeasible

Solve ILP

Solve LP

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ [ 0, 1]

� Xij = 1 ∀ i Xij ∈ [ 0, 1]

ij ij

j

kl

Page 103: Protein-Protein Docking

add it to the LP

non feasible

Solve ILP

Solve LP

Search for a cutting plane

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ [ 0, 1]

� Xij = 1 ∀ i Xij ∈ [ 0, 1]

ij ij

j

kl

Page 104: Protein-Protein Docking

infeasible

Solve ILP

Solve LP

add it to the LP

Search for a cutting plane

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ [ 0, 1]

� Xij = 1 ∀ i Xij ∈ [ 0, 1]

ij ij

j

kl

Page 105: Protein-Protein Docking

infeasible

Solve ILP

Solve LP

add it to the LP

Search for a cutting plane

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ [ 0, 1]

� Xij = 1 ∀ i Xij ∈ [ 0, 1]

ij ij

j

kl

Page 106: Protein-Protein Docking

feasible

non feasible

Solve ILP

Solve LP

add it to the LP

Search for a cutting plane

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ [ 0, 1]

� Xij = 1 ∀ i Xij ∈ [ 0, 1]

ij ij

j

kl

Page 107: Protein-Protein Docking

feasible

infeasible

Solve ILP

Solve LP

add it to the LP

Search for a cutting plane

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ { 0, 1}

� Xij = 1 ∀ i Xij ∈ { 0, 1}

ij ij

j

kl

Page 108: Protein-Protein Docking

feasible

infeasible

Solve ILP

Solve LP

add it to the LP

Search for a cutting plane

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ { 0, 1}

� Xij = 1 ∀ i Xij ∈ { 0, 1}

ij ij

j

kl

Page 109: Protein-Protein Docking

feasible

infeasible

Solve ILP

Solve LP

add it to the LP

Search for a cutting plane

Solve ILP xi j= 0

Solve ILP xi j= 1

0<Xij<1

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ { 0, 1}

� Xij = 1 ∀ i Xij ∈ { 0, 1}

ij ij

j

kl

Page 110: Protein-Protein Docking

feasible

infeasible

Solve ILP

Solve LP

add it to the LP

Search for a cutting plane

Solve ILP xi j= 0

Solve ILP xi j= 1

0<Xij<1

min ET + � Xij Eij + � � Yijkl (Eijkl - Emax)

Yijkl ≤ Xij , Yijkl ≤ Xkl Yijkl ∈ { 0, 1}

� Xij = 1 ∀ i Xij ∈ { 0, 1}

ij ij

j

kl

Page 111: Protein-Protein Docking

From Rigid to Flexible Docking

Test Case: Trypsin/BPTI

BPTI

Trypsin

first approximation: rank 3

(docking of native structures)

Page 112: Protein-Protein Docking

BALL: Design und Architektur

What’s the problem?

Page 113: Protein-Protein Docking

From Rigid to Flexible Docking

What’s the problem?

•LYS 15 moves on docking

•overlapping atoms

•rigid docking must fail

Page 114: Protein-Protein Docking

From Rigid to Flexible Docking

Semi-Flexible Protein Docking

• Structure generation

• Filtering

• Final energetic evaluation

Page 115: Protein-Protein Docking

From Rigid to Flexible Docking

Semi-Flexible Protein Docking

• Structure generation

• Filtering

• Final energetic evaluation

Page 116: Protein-Protein Docking

From Rigid to Flexible Docking

Semi-Flexible Protein Docking

• Structure generation

• Filtering

• Side-chain demangling

• Final energetic evaluation

Page 117: Protein-Protein Docking

From Rigid to Flexible Docking

Side-Chain Flexibility

• Bond distances and bond angles constant

• Torsion angles variable

Page 118: Protein-Protein Docking

From Rigid to Flexible Docking

Torsion Angle Space

• Instead of 10 - 20 atoms with 3 coordinates

• Up to 4 torsion angles

• Typical binding site:

– 50 amino acids, 600 atoms

– 200 instead of 1800 degrees of freedom

Page 119: Protein-Protein Docking

From Rigid to Flexible Docking

Torsion angle distribution

0 60 120 180 240 300 3600

200

400

600

800

1000

1200co

unts

χ1 [°]

Page 120: Protein-Protein Docking

From Rigid to Flexible Docking

Torsion angle distribution

0 60 120 180 240 300 3600

60

120

180

240

300

360χ 2 [°

]

χ1 [°]

Page 121: Protein-Protein Docking

Flexible Docking: Combinatorial Problem

Combinatorial Problem

• Identify set of rotamers with lowest energy

• Typical binding site: 50 side-chains

• Binding site defined via distance between A/B

• Number of possible combinations: ~1060

Page 122: Protein-Protein Docking

Flexible Docking: Combinatorial Problem

Scoring Function

• No sophisticated energetic evaluation feasible

• Simple and fast scoring function: AMBER

• Decomposition into three components

���<

++=i ij

pwji

i

tpli

tpltotal

srrEEEE ,

Page 123: Protein-Protein Docking

Flexible Docking: Combinatorial Problem

Earlier Approaches

• Leach (1994): A* algorithm for side-chain

optimization (ligand docking)

• Desmet et al. (1992): Dead End Elimination

– Simple inequality

– Iteratively applied

Page 124: Protein-Protein Docking

Flexible Docking: Combinatorial Problem

How to solve the problem?

• Multi Greedy method

– Fast and simple heuristic

– Suboptimal solutions

• ILP formulation

– Optimal solution

Page 125: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Methodroot node

Page 126: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Methodroot node

11 12

Page 127: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Method

11 12

Page 128: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Method

11 12

21 2122 2223 2324 24

Page 129: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Method

11 12

21 2122 2223 2324 24

Page 130: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Method

E1/1 E1/2

E1

���<

++=i ij

pwji

i

tpli

tpltotal

srrEEEE ,

tplEE111 =

pwtpltpl EEEE1111 2,1211/1 ++=

Page 131: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Method

E1

E1/1 E1/2 E1/3 E1/4 E2/1 E2/2 E2/3 E2/4

E2

Etpl

Page 132: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Method

Page 133: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Method

Page 134: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Method

Page 135: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Method

Page 136: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Method

Page 137: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Method

Page 138: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Method

Page 139: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Method

Page 140: Protein-Protein Docking

Flexible Docking: Multi Greedy Approach

Multi Greedy Approach

• Fast

• Correctly demangles side chains for test set

• May yield suboptimal solutions

• How to determine the quality of the solution?

• Optimal algorithm: based on polyhedral optimization, ILP formulation

Page 141: Protein-Protein Docking

ILP-Based Algorithm

Graph Problem

For each rotamer r of side-chain i create a node with weight tplii rr

EvE =)(ri

v

Page 142: Protein-Protein Docking

ILP-Based Algorithm

Graph Problem

V1 V2 Vk……..

For each rotamer r of side-chain i create a node with weight tplii rr

EvE =)(ri

v

Page 143: Protein-Protein Docking

ILP-Based Algorithm

Graph Problem

V1 V2 Vk……..

A k-partite graph with partitions V1…Vk is constructed

Page 144: Protein-Protein Docking

ILP-Based Algorithm

V1

Graph Problem

Pairwise interactions are represented by edges uv with weight

V2 Vk……..

pwji sr

EuvE ,)( =

Page 145: Protein-Protein Docking

ILP-Based Algorithm

V1

Graph Problem

Pairwise interactions are represented by edges uv with weight

V2 Vk……..

pwji sr

EuvE ,)( =

Page 146: Protein-Protein Docking

ILP-Based Algorithm

Graph Problem

V1 V2 Vk……..

Page 147: Protein-Protein Docking

ILP-Based Algorithm

Graph Problem

rotamer graphV1 V2 Vk……..

Page 148: Protein-Protein Docking

ILP-Based Algorithm

Graph Problem

V1 V2 Vk……..

���<

++=i ij

pwji

i

tpli

tpltotal

srrEEEE ,

Page 149: Protein-Protein Docking

ILP-Based Algorithm

Graph Problem

V1 V2 Vk……..

��∈∈

+=RGuvRGv

uvEvERGE )()()(

Page 150: Protein-Protein Docking

ILP-Based Algorithm

Graph Problem

V1 V2 Vk……..

• Binary decision variables for nodes and edges

– xv for each node v

– xuv for each edge uv

• Node is selected if xv = 1

• Edge is selected if xuv = 1

��∈∈

+=RGuvRGv

uvEvERGE )()()(

Page 151: Protein-Protein Docking

ILP-Based Algorithm

Graph Problem

V1 V2 Vk……..

��∈∈

+=Euv

uvVv

v uvExvExE )()(

• Binary decision variables for nodes and edges

– xv for each node v

– xuv for each edge uv

• Node is selected if xv = 1

• Edge is selected if xuv = 1

Page 152: Protein-Protein Docking

ILP-Based Algorithm

Graph Problem

V1 V2 Vk……..

��∈∈

+=Euv

uvVv

v uvExvExE )()(

Page 153: Protein-Protein Docking

ILP-Based Algorithm

Integer Linear Program

��∈∈

+=Euv

uvVv

v uvExvExE )()(

Determine xv, xuv that minimize E

Page 154: Protein-Protein Docking

ILP-Based Algorithm

ILP: Basic Constraint System

��

���

� + ��∈∈ Euv

uvVv

v uvExvEx )()(min

E uvxx

E uvxx

vuv

uuv

∈≤

∈≤

allfor

allfor

}...1{ allfor 1 kixiVv

v ∈=�∈

s.t.

Page 155: Protein-Protein Docking

ILP-Based Algorithm

��

���

� −+− ��∈∈

))(())((min maxmax EuvExEvExEuv

uvVv

v

ILP: Basic Constraint System

E uvxx

E uvxx

vuv

uuv

∈≤

∈≤

allfor

allfor

}...1{ allfor 1 kixiVv

v ∈=�∈

s.t.

Page 156: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

Solve ILP

Branch & Cut

Page 157: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

Branch & Cut

Solve ILP

Page 158: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

Solve ILP

Branch & Cut

Page 159: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

Solve LP

Solve ILP

Branch & Cut

Page 160: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

Solve ILP

Solve LP

Branch & Cut

Page 161: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

Solve ILP

Solve LP

Branch & Cut

Page 162: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

infeasible

Solve ILP

Solve LP

Branch & Cut

Page 163: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

infeasible

Solve ILP

Solve LP

Branch & Cut

Page 164: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

infeasible

Solve ILP

Solve LP

Search for a cutting plane

Branch & Cut

Page 165: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

Search for a cutting plane

infeasible

Solve ILP

Solve LP

Branch & Cut

Page 166: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

add it to the LP

non feasible

Solve ILP

Solve LP

Search for a cutting plane

Branch & Cut

Page 167: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

infeasible

Solve ILP

Solve LP

add it to the LP

Search for a cutting plane

Branch & Cut

Page 168: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

infeasible

Solve ILP

Solve LP

add it to the LP

Search for a cutting plane

Branch & Cut

Page 169: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

feasible

infeasible

Solve ILP

Solve LP

add it to the LP

Search for a cutting plane

Branch & Cut

Page 170: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

feasible

infeasible

Solve ILP

Solve LP

add it to the LP

Search for a cutting plane

Branch & Cut

Page 171: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

feasible

infeasible

Solve ILP

Solve LP

add it to the LP

Search for a cutting plane

Branch & Cut

Page 172: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

feasible

infeasible

Solve ILP

Solve LP

add it to the LP

Search for a cutting plane

Solve ILP xi j= 0

Solve ILP xi j= 1

Branch & Cut

Page 173: Protein-Protein Docking

ILP-Based Algorithm: Branch & Cut

feasible

infeasible

Solve ILP

Solve LP

add it to the LP

Search for a cutting plane

Solve ILP xi j= 0

Solve ILP xi j= 1

Branch & Cut

Page 174: Protein-Protein Docking

Semi-Flexible Docking: Results

Energetic Evaluation

• Based on Jackson & Sternberg

• Energetic evaluation has to be extended:

– inclusion of internal energies

• AMBER torsion energies:

torscavelebind GGGG ∆+∆+∆=∆

Page 175: Protein-Protein Docking

ILP-Based Algorithm

Implementation

• Molecular DS, rotamers, energies: BALL

• Branch-&-Cut: LEDA and ABACUS

• LP solver: CPLEX and SOPLEX

Page 176: Protein-Protein Docking

ILP-Based Algorithm: Results

Docking Trypsin/BPTI - Results

Page 177: Protein-Protein Docking

Semi-Flexible Docking: Results

Running Times(UltraSparc II, 333 MHz, per candidate)

Avg. time [min]Stage

Multi Greedy ILPCalculation of energies 30Combinatorial problem 3 14Side-chain optimization 5Energetic evaluation 70Total 108 119

Page 178: Protein-Protein Docking

Semi-Flexible Docking: Results

Results I

example RMSD/Å

1TPO/4PTI 1.41SBC/2CI2 1.85CHA/2OVO 3.2

Page 179: Protein-Protein Docking

Semi-Flexible Docking: Results

Results IIBPTI (LYS 15)

Page 180: Protein-Protein Docking

Semi-Flexible Docking: Results

Results III

-200 -100 0 100 200 300 4000

10

20

30

40

50

60

RM

SD

[10-1

0 m]

∆Gbind

[kJ/mol]

Trypsin/BPTI

Page 181: Protein-Protein Docking

Holm&Sander 1992 Monte-CarloBruccoleri&Novotny 1992 Exhaustive SearchDesmet et al. 1992 Dead-End-EliminationTotrov&Abagyan 1994a Monte-CarloTotrov&Abagyan 1994b Monte-CarloLaughton 1994 Local Homology ModelingLeach 1994 A*-AlgorithmWeng et al. 1996 Exhaustive SearchLeach&Lemon 1998 A*-Algorithm

Page 182: Protein-Protein Docking

182

Integration of Experimental Data

• Main problem in Docking: Energy Function

• Avoid the energy calculation!

• Include experimental data

– Containing geometric information

– Derived from simple and inexpensive experiment

� Nuclear Magnetic Resonance (NMR)

Page 183: Protein-Protein Docking

NMR-Based Protein Docking

Overview

• What is NMR spectroscopy?

• Defining a NMR-based scoring function– Predicting chemical shifts

– Synthesizing spectra

– Comparing NMR spectra

• Results

Page 184: Protein-Protein Docking

184NMR - Basics

NMR - Basics• 1H nuclei possess spin angular momentum

Page 185: Protein-Protein Docking

185NMR - Basics

NMR - Basics• 1H nuclei possess spin angular momentum

• In an external magnetic field B0, each nucleus can assume one of two spin states: α or β

α

β

0B E∆

Page 186: Protein-Protein Docking

186NMR - Basics

NMR - Basics• 1H nuclei possess spin angular momentum

• In an external magnetic field B0, each nucleus can assume one of two spin states: α or β

• Supplying energy ∆E enables transition

α

β

E∆ν⋅h

Page 187: Protein-Protein Docking

187NMR - Basics

NMR - Basics• 1H nuclei possess spin angular momentum

• In an external magnetic field B0, each nucleus can assume one of two spin states: α or β

• Supplying energy ∆E enables transition

α

β

E∆

Page 188: Protein-Protein Docking

188NMR - Basics

NMR - Basics• ∆E depends on

– magnitude of external field

– electronic structure surrounding the nucleus

δ

Page 189: Protein-Protein Docking

NMR - Basics

NMR: The Hardware

Page 190: Protein-Protein Docking

190NMR - Basics

1D 1H-NMR spectra

Page 191: Protein-Protein Docking

191NMR - Basics

Structural Information in Spectra

• Chemical shift in proteins depends on– Topology (chemical environment)

– Geometry (conformation)

• Certain experiments can– identify which proton causes which peak

(assignment)

– yield distance information (e.g., NOE constraints)

Page 192: Protein-Protein Docking

192NMR-Based Protein Docking

NMR-based Protein Docking

∆ ∆ ∆ ∆

Experiment

Docking

Page 193: Protein-Protein Docking

193NMR-Based Protein Docking

NMR-based Protein Docking

∆ ∆ ∆ ∆

Experiment

Docking

< < <

Page 194: Protein-Protein Docking

194NMR-Based Protein Docking

RCδδ =

Random coil shift

Page 195: Protein-Protein Docking

195NMR-Based Protein Docking

RCδδ =

+

E�

ESRC δδδ +=

Electrostatic field

Page 196: Protein-Protein Docking

196NMR-Based Protein Docking

ESRC δδδ +=

Ring current effect

Page 197: Protein-Protein Docking

197NMR-Based Protein Docking

ESRC δδδ +=

RSB�

RSESRC δδδδ ++=

Ring current effect

Page 198: Protein-Protein Docking

198NMR-Based Protein Docking

RSESRC δδδδ ++=Magnetic Anisotropy

AnisoB�

AnisoRSESRC δδδδδ +++=

Page 199: Protein-Protein Docking

199NMR-Based Protein Docking

Spectrum Synthesis/Comparison

• Removal of rapidly exchanging protons

• Lorentzian peaks of equal width

�−+

=i

Wi

S2

)(1

1)(δδ

δ

• Spectrum comparison via difference area ∆

−=−∆ δδδ dSSSS expcalcexp.calc |)()(|)( ...

Page 200: Protein-Protein Docking

NMR-Based Protein Docking

Synthesized Spectra

Page 201: Protein-Protein Docking

NMR-Based Protein Docking

Results – Test Set

• Bound complex structures (PDB)– 1DT7 A/B

– 1DT7 A/X

– 1CFF

– 1CKK

• NMR assignments (BMRB)– Reconstruction of exp. spectra

Page 202: Protein-Protein Docking

NMR-Based Protein Docking

Results - Overview

121DT7 A/B

121CKK

241CFF

101DT7 A/X

Rank of 1st true positive

# false positive among top 10Complex

Page 203: Protein-Protein Docking

NMR-Based Protein Docking

1DT7 A/X (NMR-based)

Page 204: Protein-Protein Docking

NMR-Based Protein Docking

1DT7 A/X (ACE)

Page 205: Protein-Protein Docking

NMR-Based Protein Docking

Page 206: Protein-Protein Docking

206NMR-Based Protein Docking

Open Problems

• Improving the shift model

– H-bonds

– Solvent effects

• Modeling peak width, couplings?

• Blind predictions of protein complexes

• Validation with larger data set and exp. data

Page 207: Protein-Protein Docking

Summary

• Protein Docking is an interesting problem!

• Key difficulties

– Energetic evaluation

– Protein flexibility

• Integration of experimental evidence

improves scoring

Page 208: Protein-Protein Docking

Scoring Functions for Protein Docking

Free Energy Functions

• Contact-based

– Atomic contact potential

– Residue contact potentials

• Molecular Mechanics

• Surface-based potentials

• …..

Page 209: Protein-Protein Docking

Scoring Functions for Protein Docking

Contributions to Free Energy I

• Hydrogen bonds

• Salt bridges

• Entropic contributions

– Solvent effects

– Loss of side-chain entropy

– Loss of DOF on binding

Page 210: Protein-Protein Docking

Scoring Functions for Protein Docking

Contributions to Free Energy II

• Electrostatic interactions

– Coulomb-based

– Continuum models (solvent effects)

• Hydrophobic interactions

• Van-der-Waals interactions

• ….

Page 211: Protein-Protein Docking

Scoring Functions for Protein Docking

Energetic EvaluationJackson, Sternberg (1994):

– continuum electrostatics

• solution of the Poisson-Boltzmann equation

• includes electrostatic solvent effects

– cavitation free energy

• measure for the hydrophobic interaction

vdWconfcavelebind GGGGG ∆+∆+∆+∆=∆ cavelebind GGG ∆+∆=∆

Page 212: Protein-Protein Docking

Scoring Functions for Protein Docking

Energetic Evaluation

Jackson, Sternberg (1994):

cavelebind GGG ∆+∆=∆

Page 213: Protein-Protein Docking

Scoring Functions for Protein Docking

Energetic Evaluation

Jackson, Sternberg (1994):

cavelebind GGG ∆+∆=∆BA

intBsol

Asolele GGGG −∆+∆∆+∆∆=∆

Page 214: Protein-Protein Docking

BALL: Design und Architektur

Poisson Boltzmann Equation

• Spatially dependent dielectric constant ε(r)– ~2-4 inside protein– 78 outside (water)

• Charge distribution ρ � potential φ• Solve differential equation on grid• Yields total electrostatic free energy ∆GES

( )0

2 )()(sinh)()()(

ερϕκϕε r

Tk

rerrr

0 ����� −=��

��

�−∇∇

Page 215: Protein-Protein Docking

Scoring Functions for Protein Docking

Continuum Electrostatics

AsolG∆∆

BsolG∆∆

BAintG −∆

BAintG −∆

Page 216: Protein-Protein Docking

Scoring Functions for Protein Docking

Energetic Evaluation

Jackson, Sternberg (1994):

cavelebind GGG ∆+∆=∆BA

intBsol

Asolele GGGG −∆+∆∆+∆∆=∆

)( BAABMSMSMScav AAAAG −−=∆=∆ γγ

Page 217: Protein-Protein Docking

From Rigid to Flexible Docking

Docking of Unbound Structures

• All previous examples are “easy”– structures A and B derived from true complex

structure (bound structures)

– Geometric complementarity guaranteed!

• More interesting problem:– Use unbound structures

– Problem: few test sets - A, B, and AB have to be known!

Page 218: Protein-Protein Docking

Protein Docking: Introduction

Why Docking?

Page 219: Protein-Protein Docking

Helmer-Citterich, Tramontano, J. Mol. Biol. 1994

Decomposition of A/B into Disks

• Calculate protein surface• Decompose into disks along z-axis• Decompose disk into segments of equal area• Differential surface: (r2 – r1), (r3 – r2),…

Fig. after Helmer-Citterich, Tramontano, JMB 1994

r1 - rn….r6 – r5r5 – r 4r4 – r3r3 – r2r2 – r1

Page 220: Protein-Protein Docking

Helmer-Citterich, Tramontano, J. Mol. Biol. 1994

Matching of Disc Segments

• Try to match windows, i.e. stretches of successive radii in a disk, from A and B

• Matching windows define tentative transformations

• Aggregate overlapping windows to regions

Page 221: Protein-Protein Docking

Helmer-Citterich, Tramontano, J. Mol. Biol. 1994

Horizontal Windows GrowingA

B

Page 222: Protein-Protein Docking

Helmer-Citterich, Tramontano, J. Mol. Biol. 1994

Overall Algorithm• Determine surface of A and B

• Decompose A into disks along z-axis

• For all rotations of B:– Decompose B into disks

– Find matching windows on differential surface

– Grow windows horizontally

– Grow windows vertically (→ regions)

– Each region defines a transformation