Accurate protein-protein docking with rapid calculation

32
Improvement of the Protein-Protein Docking Prediction by Introducing a Simple Hydrophobic Interaction Model: an Application to Interaction Pathway Analysis Masahito Ohue Yuri Matsuzaki Takashi Ishida Yutaka Akiyama † Graduate School of Information Science and Engineering, Tokyo Institute of Technology ‡ JSPS Research Fellow The 7 th IAPR International Conference on Pattern Recognition in Bioinformatics November 9th, 2012, Tokyo, Japan

description

In PRIB2012 Talk (http://prib2012.org) Masahito Ohue, Yuri Matsuzaki, Takashi Ishida, Yutaka Akiyama: "Improvement of the Protein-Protein Docking Prediction by Introducing a Simple Hydrophobic Interaction Model: an Application to Interaction Pathway Analysis", In Proceedings of The 7th IAPR International Conference on Pattern Recognition in Bioinformatics (PRIB2012), Lecture Note in Bioinformatics 7632, 178-187, Springer Heidelberg, 2012. http://link.springer.com/chapter/10.1007%2F978-3-642-34123-6_16

Transcript of Accurate protein-protein docking with rapid calculation

Page 1: Accurate protein-protein docking with rapid calculation

Improvement of the Protein-Protein Docking Prediction

by Introducing a Simple Hydrophobic Interaction Model:

an Application to Interaction Pathway Analysis

Masahito Ohue† ‡ Yuri Matsuzaki† Takashi Ishida† Yutaka Akiyama†

† Graduate School of Information Science and Engineering,

Tokyo Institute of Technology

‡ JSPS Research Fellow

The 7th IAPR International Conference on

Pattern Recognition in Bioinformatics

November 9th, 2012, Tokyo, Japan

Page 2: Accurate protein-protein docking with rapid calculation

• MEGADOCK is a PPI network prediction system

using all-to-all protein docking calculation

– Required a rapidly calculation for all-to-all prediction

– MEGADOCK is 8.8 times faster than ZDOCK

• We proposed new docking score model

added a hydrophobic interaction keeping rapidness

• We achieved the better level of accuracy without

increasing calculation time

2

Summary

2012/11/09 PRIB2012

Page 3: Accurate protein-protein docking with rapid calculation

Introduction

Page 4: Accurate protein-protein docking with rapid calculation

• Proteins play a key role in biological events

Many proteins display their biological functions

by binding to a specific partner protein at a specific site

4

Background

2012/11/09 PRIB2012

ex) Signal transduction

SOS

Ras Raf1

http://en.wikipedia.org/wiki/Epidermal_growth_factor_receptor

Page 5: Accurate protein-protein docking with rapid calculation

• PDB crystal structures are currently increasing

but interacted structures are not enough

5

Background

2012/11/09 PRIB2012

Structural coverage of interactions

While there is structural data for a large part of the interactors (60–80%), the portion of interactions

having an experimental structure or a template for comparative modeling is limited (less than 30%)

→ Computational protein docking is very important

Modified from Stein A, 2011.

Page 6: Accurate protein-protein docking with rapid calculation

• Protein docking is useful of all-to-all protein-

protein interaction prediction

6

Background

2012/11/09 PRIB2012

Need a fast protein docking software for all-to-all

protein-protein interaction prediction

• 1000 proteins’ combinations

→500,500 (1000x1001/2)

• 500,500 PPI predictions require

43,000 CPU days when using ZDOCK*

• On 400 nodes x 12 cores CPU/node

→ still require 9 days

*Mintseris J, et al. Proteins, 2007.

Page 7: Accurate protein-protein docking with rapid calculation

• MEGADOCK has a rapid protein docking engine

• However, MEGADOCK’s docking accuracy is

worse compared to ZDOCKs

• Our purpose is developing more accurate protein

docking method with rapidness 7

Background

2012/11/09 PRIB2012

0

30

60

90

120

150

MEGADOCK ZDOCK 2.3 ZDOCK 3.0

Avg

do

ckin

g t

ime [

min

]

All-to-all PPI Prediction

in 1000 proteins

9 days → 1 day ZDOCK 3.0 MEGADOCK

Page 8: Accurate protein-protein docking with rapid calculation

8

Protein Docking Problem

2012/11/09 PRIB2012

Protein monomeric

structure pair

Docking calculation

Protein complex

(predicted structure)

Page 9: Accurate protein-protein docking with rapid calculation

9

MEGADOCK Docking Engine Overview

2012/11/09 PRIB2012

Receptor FFT

Ligand FFT

Loop for Ligand rotations

Receptor voxelization

Convolution

Score calculation (Inverse FFT)

Save high scoring docking poses O

pe

nM

P P

arallelizatio

n

Ligand voxelization

Receptor

Ligand Master

Node

Worker

Node 1

Worker

Node 2

Worker

Node 3

MPI Parallelization

Page 10: Accurate protein-protein docking with rapid calculation

• 3-D Grid Convolution

10

3-D Grid Convolution by FFT

2012/11/09 PRIB2012

Katchalski-Katzir E, et al. PNAS, 1992.

Receptor Ligand

Convolution using FFT

Convolution

Docking Score Parallel translation

Δθ is 15 degree → 3600

patterns of Ligand rotations

(2-D image)

Page 11: Accurate protein-protein docking with rapid calculation

Previous Score Model (MEGADOCK)

• MEGADOCK’s docking score

– Calculate shape complementarity and electrostatics

with only 1 convolution

rPSC ELEC

11

Ohue M, et al. IPSJ TOM, 2010.

2012/11/09 PRIB2012

Convolution

Page 12: Accurate protein-protein docking with rapid calculation

rPSC

• rPSC (real Pairwise Shape Complementarity)

– Shape complementarity score using only real value

Ohue M, et al. IPSJ TOM, 2010.

12 2012/11/09 PRIB2012

2 -27 -27 -27 -27 -27 2

3 -27 -27 -27 -27 -27 2

3 -27 -27 -27 5 2

3 -27 -27 -27 5 2

3 -27 -27 -27 -27 -27 2

2 -27 -27 -27 -27 -27 2

2 3 3 3 2

2 3 3 3 2

1 1

1 1

1 1

2 2

2 2

1 1

1

1

Receptor rPSC Ligand rPSC

1 1

1

1

11

Receptor

Ligand

Good point!

Penalty

(2-D image)

1 1

1 1

1 1

1 1

1 1

1 1

1

1

Page 13: Accurate protein-protein docking with rapid calculation

• ZDOCK 3.0 convolution

• MEGADOCK convolution

13

Comparison of Convolutions

2012/11/09 PRIB2012

Shape Complementarity = Fit neatly Clash +i Electrostatics = Coulomb force

Desolvation free energy 1 = atom a’s effect atom b’s effect +i Desolvation free energy 2

・・・

= atom c’s effect atom d’s effect +i

Desolvation free energy 6 = atom k’s effect atom l’s effect +i

Docking Score Shape Complementarity (rPSC) +i = Coulomb force

Mintseris J, et al. Proteins, 2007.

Page 14: Accurate protein-protein docking with rapid calculation

Accurate and Fast Docking

• For more accurate protein docking

– Considering hydrophobic interaction

– Previous methods require high calculation cost

• ZDOCK 2.3・・・2 convolutions

• ZDOCK 3.0・・・8 convolutions

• For keeping rapid calculation

– We should keep 1 time convolution

• We proposed simple hydrophobic interaction

model with keeping 1 convolution

14 2012/11/09 PRIB2012

Page 15: Accurate protein-protein docking with rapid calculation

Materials and Methods

Page 16: Accurate protein-protein docking with rapid calculation

• Implementation of hydrophobic interaction

– Used Atomic Contact Energy score

• Atomic Contact Energy (ACE)

– The empirical atomic pairwise potential for estimating

desolvation free energies

– Used by ZDOCK, FireDock*, etc.

• How to add the ACE to MEGADOCK?

– Add the averaged ACE value (non-pairwise) of surface

atom of Receptor

16

Proposed Model

2012/11/09 PRIB2012

Zhang C, et al. J Mol Biol, 1997.

*Andrusier N, et al. Proteins, 2007.

Modified rPSC Model

Page 17: Accurate protein-protein docking with rapid calculation

Modification of rPSC Model

• Considered hydrophobic surface of the receptor

using non-pairewise ACE

17 2012/11/09 PRIB2012

Sum of near atoms’ ACE (space)

(inside of the receptor)

2+H -27 -27 -27 -27 -27 2+H

3+H -27 -27 -27 -27 -27 2+H

3+H -27 -27 -27 5+H 2+H

3+H -27 -27 -27 5+H 2+H

3+H -27 -27 -27 -27 -27 2+H

2+H -27 -27 -27 -27 -27 2+H

2+H 3+H 3+H 3+H 2+H

2+H 3+H 3+H 3+H 2+H

1 1

1 1

1 1

1 1

1 1

1 1

1

1

1+H 1+H

1+H

1+H

1+H 1+H

Receptor

Ligand

Page 18: Accurate protein-protein docking with rapid calculation

Pairwise Potential

• Using table of contact energy

18

eij N Cα C …

N -0.724 -0.903 -0.722 …

Cα -0.119 -0.842 -0.850 …

C -0.008 -0.077 -0.704 …

… … … … …

Receptor

eiN

Ligand

Set 1 on

position of

N atoms

×

FFT

Receptor

eiCα

Ligand

Set 1 on

position of

Cα atoms

×

FFT

Receptor

eiC

Ligand

Set 1 on

position of

C atoms

×

FFT ←Need (#type of atoms/2) times convolutions

・・・

Page 19: Accurate protein-protein docking with rapid calculation

Averaged (Non-Pairwise) Potential

• Using averaged contact energy

19

eij N Cα C … ei

N -0.724 -0.903 -0.722 … -0.495

Cα -0.119 -0.842 -0.850 … -0.553

C -0.008 -0.077 -0.704 … -0.464

… … … … … … ↑ 1 time FFT

Receptor

ei

Ligand

Set 1 on

position of

all atoms

×

FFT

Pros : high speed calculation

Cons: poor accuracy

Page 20: Accurate protein-protein docking with rapid calculation

• Dataset: Protein-protein docking benchmark 4.0

– 176 complex structures

(include both bound/unbound form)

20

Experimental Settings

2012/11/09 PRIB2012

Hwang H, et al. Proteins, 2010.

PDB ID: 1CGI

1CGI receptor

1CGI ligand

2CGA

1HPT

Predicted 1CGI

using 2CGA and 1HPT

Predicted 1CGI

using 1CGI_r and 1CGI_l

bound

unbound

(re-docking)

(real problem)

Page 21: Accurate protein-protein docking with rapid calculation

Proposed Docking Score

• New docking score

– Calculate 3 physico-chemical effects

by only 1 convolution calculation

21 2012/11/09 PRIB2012

Page 22: Accurate protein-protein docking with rapid calculation

• How to evaluate accuracy? → Success Rank

22

Experimental Settings

2012/11/09 PRIB2012

Native

S=2132.1 35.1Å

S=1802.3 2.3Å

S=1623.5 4.1Å

S=300.8 24.8Å

・・・

1st

2nd

3rd

3600th

Docking score Ligand-RMSD

・・・

Success Rank

of this case is 2

Near native solution

(L-RMSD < 5Å)

Ligand-RMSD (root mean square deviation) RMSD with respect to the X-ray structure, calculated for the all heavy atoms of the ligand residues

when only the receptor proteins (predicted structure and X-ray structure) are superimposed.

Page 23: Accurate protein-protein docking with rapid calculation

Results and Discussions

Page 24: Accurate protein-protein docking with rapid calculation

0

100

200

300

400

MEGADOCK Proposed ZDOCK 2.3 ZDOCK 3.0

To

tal ti

me

fo

r 1

76

do

ck

ing

[h

r]

Docking Calculation Time

24

good

2012/11/09 PRIB2012

On TSUBAME 2.0, 1 CPU core (Intel Xeon 2.93 GHz)

Proposed

Comparable

x8.8 faster than ZDOCK 3.0 / x3.8 faster than ZDOCK 2.3

Page 25: Accurate protein-protein docking with rapid calculation

Accuracy in Bound Set

25

good

2012/11/09 PRIB2012

0%

20%

40%

60%

80%

100%

1 10 100 1000

Success R

ate

Number of Predictions

ZDOCK 3.0

ZDOCK 2.3

Proposed

MEGADOCK

ex) The number of cases of “Success Rank ≦ 100”

is 147 (84%(147/176) of all cases).

Achieved the same level of accuracy as ZDOCK 3.0

Page 26: Accurate protein-protein docking with rapid calculation

26

Accuracy in Unbound Set

2012/11/09 PRIB2012

0%

20%

40%

60%

80%

100%

1 10 100 1000

Success R

ate

Number of Predictions

ZDOCK 3.0

ZDOCK 2.3

Proposed

MEGADOCK

ex) The number of cases of “Success Rank ≦ 1000”

is 46 (26%(46/176) of all cases).

good

Achieved the same level of accuracy as ZDOCK 2.3

Page 27: Accurate protein-protein docking with rapid calculation

Prediction Example

27

PDB ID: 1BVN (Unbound Success Rank:

11(MEGADOCK)→1(Proposed))

Receptor: α-amylase

Ligand: Tendamistat

(α-amylase inhibitor)

2012/11/09 PRIB2012

Hydrophobicity scale*

hydrophilic hydrophobic

*Eisenberg D, et al. J Mol Biol, 1984.

The 1st rank predicted structure

by MEGADOCK (wrong position)

Another angle

Page 28: Accurate protein-protein docking with rapid calculation

• Bacterial chemotaxis signaling pathway prediction

– Predicting “interact or not” from docking results*

• Method

28

Application to Pathway Analysis

2012/11/09 PRIB2012

* Matsuzaki Y, et al. JBCB, 2009.

S1

S2

S3

・・・

S6,000

Mean of S μ

S.D. of S σ

Docking

calculationZR2

ZR3,273

ZR1

・・・

・・・

ZRANKRank of

energy score: ZR

Page 29: Accurate protein-protein docking with rapid calculation

• Bacterial chemotaxis signaling pathway prediction

– Predicting “interact or not” from docking results*

– Results (F-measure)

29

Application to Pathway Analysis

2012/11/09 PRIB2012

* Matsuzaki Y, et al. JBCB, 2009. d A(L) B R W D Y C Z X FliG FliM FliN

Tsr ◆ ◆ ◆ ◆ ◆

A(L) ◆ ◆ ◆ ◆ ◆ ◆ ◆ ◆ Known PPI

B ◆ ◆ ◆ ◆ ◆ Predicted

R ◆ ◆

W ◆ ◆ ◆ ◆ ◆ ◆

D ◆ ◆ ◆ ◆

Y ◆ ◆ ◆ ◆ ◆ ◆ ◆ ◆ ◆ ◆ ◆

C ◆ ◆ ◆ ◆ ◆

Z ◆ ◆ ◆ ◆

X ◆ ◆ ◆ ◆ ◆ ◆

FliG ◆ ◆ ◆ ◆ ◆

FliM ◆

FliN ◆ ◆ ◆

MEGADOCK Proposed ZDOCK 3.0

0.43 0.45 0.49

Page 30: Accurate protein-protein docking with rapid calculation

Conclusion

Page 31: Accurate protein-protein docking with rapid calculation

Conclusion

• We proposed novel docking score including

simple hydrophobic interaction

• We achieved the better level of accuracy without

increasing calculation time

– Same level of accuracy as ZDOCK 2.3

– 8.8 times faster than ZDOCK 3.0

3.8 times faster than ZDOCK 2.3

• Future works

– Developing a hydrophobic interaction model considering

both receptor and ligand protein

– Applying to large PPI network and cross-docking of

structure ensemble

31 2012/11/09 PRIB2012

Page 32: Accurate protein-protein docking with rapid calculation

• Akiyama Lab. Members

• This work was supported in part by

– a Grant-in-Aid for JSPS Fellows, 23・8750

– a Grant-in-Aid for Research and Development of The Next-Generation

Integrated Life Simulation Software

– Educational Academy of Computational Life Sciences

from the Ministry of Education, Culture, Sports, Science and

Technology of Japan (MEXT).

Acknowledgements

32

TECH chan