Ligand Building with ARP/wARP

Post on 07-Jan-2016


Automated Model Building

Given the native X-ray diffraction data and a phase set

To rapidly deliver a complete, accurate and error-free model

Building Ligands from Dummy Atoms / Seed Points

Back to about 2000: a side project for a PhD student

Nearest Neighbour Distance Distribution

Shake: given a coordinate error, the inter-atomic distances in a protein model change:

$$ f(d_{jk}^{obs}) = \frac{1}{\sigma_m \sqrt{\pi}} \, \frac{d_{jk}^{obs}}{d_{jk}^{tar}} \, \exp\left( -\frac{(d_{jk}^{obs})^2 + (d_{jk}^{tar})^2}{4\sigma_m^2} \right) \sinh\left( \frac{d_{jk}^{obs} \, d_{jk}^{tar}}{2\sigma_m^2} \right) $$

[Plot: f(d_obs), vertical axis 0 to 0.6, against d_obs, horizontal axis 0 to 4, compared with the Gaussian N(d_ij^tar, 2σ_m^2)]

Error free distance d_tar is 1.5 Å

Expected rmsd is 1.0 Å
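The nearest neighbour distance distribution can be evaluated numerically. The following is a minimal sketch, assuming the distribution has the form of a prefactor d_obs / (d_tar σ_m √π) multiplied by an exponential and a sinh term; the function names `distance_pdf` and `integrate` and the example values are illustrative and are not part of ARP/wARP. The exponential and sinh are combined algebraically so that small values of σ_m do not overflow.

```python
import math

def distance_pdf(d_obs, d_tar, sigma_m):
    """Density of an observed distance d_obs, given the error-free
    distance d_tar and a coordinate error sigma_m (all in Å)."""
    if d_obs <= 0.0:
        return 0.0
    s2 = sigma_m ** 2
    prefactor = d_obs / (d_tar * sigma_m * math.sqrt(math.pi))
    # exp(-(a^2 + b^2) / (4 s2)) * sinh(a b / (2 s2))
    #   = 0.5 * [exp(-(a - b)^2 / (4 s2)) - exp(-(a + b)^2 / (4 s2))]
    near = math.exp(-((d_obs - d_tar) ** 2) / (4.0 * s2))
    far = math.exp(-((d_obs + d_tar) ** 2) / (4.0 * s2))
    return prefactor * 0.5 * (near - far)

def integrate(func, lo, hi, steps=20000):
    """Plain trapezoidal rule, enough for a sanity check."""
    h = (hi - lo) / steps
    total = 0.5 * (func(lo) + func(hi))
    total += sum(func(lo + i * h) for i in range(1, steps))
    return total * h

# Illustrative values only: d_tar = 1.5 Å, sigma_m = 0.5 Å
area = integrate(lambda d: distance_pdf(d, 1.5, 0.5), 0.0, 10.0)
mean = integrate(lambda d: d * distance_pdf(d, 1.5, 0.5), 0.0, 10.0)
```

Here `area` should come out close to 1, and `mean` is larger than d_tar, which reflects the asymmetry of the distribution compared with a Gaussian centred on d_tar.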

Fit that into that!

Building a Ligand into a Difference Map

Imagine:

a ligand consisting of N atoms

a density map containing M points

the only thing to do is to correctly select N out of M!


A Simple Example: Select 3 out of 4

• The task is to find an equilateral triangle
• Prior knowledge: edges should have a length of 1.0 Å
• Reliability: error on data (distances) is 0.01 Å

[Figure: four points labelled a, b, c, d]

Upper triangle: observed distances. Lower triangle: deviation from 1.0 Å in units of the 0.01 Å error.

     a     b         c         d
a    0     1.07 Å    0.98 Å    1.01 Å
b    7     0         0.85 Å    2.10 Å
c    2     15        0         0.95 Å
d    1     110       5         0

Triangle    Log likelihood    Probability
abc         -278              2.0*10^-108
abd         -12150            0
bcd         -12350            0
acd         -30               0.9999
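The numbers in this example can be reproduced with a few lines of Python. This is a sketch under one assumption inferred from the tabulated values: the log likelihood of a triangle is taken as minus the sum of squared edge deviations, each expressed in units of the 0.01 Å error. The probabilities are then the likelihoods normalised over the four candidate triangles. All names are illustrative.

```python
import math
from itertools import combinations

TARGET = 1.0   # expected edge length, in Å
SIGMA = 0.01   # error on the distances, in Å

# Observed distances from the upper triangle of the matrix, in Å
DIST = {
    ("a", "b"): 1.07, ("a", "c"): 0.98, ("a", "d"): 1.01,
    ("b", "c"): 0.85, ("b", "d"): 2.10, ("c", "d"): 0.95,
}

def log_likelihood(triangle):
    """Minus the sum of squared deviations, in units of SIGMA."""
    return -sum(((DIST[edge] - TARGET) / SIGMA) ** 2
                for edge in combinations(triangle, 2))

def score_triangles(points="abcd"):
    """Return {triangle: (log likelihood, normalised probability)}."""
    lls = {"".join(t): log_likelihood(t) for t in combinations(points, 3)}
    best = max(lls.values())
    # Subtract the best value before exponentiating to avoid underflow
    weights = {name: math.exp(ll - best) for name, ll in lls.items()}
    total = sum(weights.values())
    return {name: (lls[name], weights[name] / total) for name in lls}

scores = score_triangles()
for name, (ll, prob) in scores.items():
    print(f"{name}  {ll:10.0f}  {prob:.2g}")
```

Triangle acd takes essentially all of the probability, because its three edges deviate from 1.0 Å by only 2, 1 and 5 error units.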

Ligand Building as a Label Swapping Problem

N atoms in the ligand molecule

M points in a density map

[Figure: map points W, X, Y, Z and atom labels A, B, C, D]

$$ Q_{assignment} = \sum_{i=1}^{N} \sum_{j=i+1}^{N} \log\left[ P\left( d_{ij}^{obs} \mid d_{ij}^{assigned}, \text{error\_model} \right) \right] $$

• Sources of possible prior information:
– Chemical composition of a ligand
– Bonding distances
– Angle bonded distances
– Chirality
– VdW interactions

Combinatorial Explosion

$$ \frac{N_{points}!}{(N_{points} - N_{atoms})!} $$

Label Swapping

Initial map: 349 grid points, complexity 10^59

Sparse map: 58 grid points, complexity 10^37

22-atom molecule of retinoic acid
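The size of the search space, N_points! / (N_points - N_atoms)!, is the number of ordered selections of N_atoms map points. A minimal sketch for evaluating it on a log10 scale follows; the function name `log10_assignments` is illustrative.

```python
import math

def log10_assignments(n_points, n_atoms):
    """log10 of n_points! / (n_points - n_atoms)!, the number of ways
    to assign n_atoms distinct labels to n_points map points."""
    if n_atoms > n_points:
        raise ValueError("more atoms than map points")
    # math.perm returns an exact integer; math.log10 accepts large integers
    return math.log10(math.perm(n_points, n_atoms))

sparse = log10_assignments(58, 22)    # sparse map, 22-atom ligand
initial = log10_assignments(349, 22)  # initial map, 22-atom ligand
```

For 58 points and 22 atoms this gives roughly 10^37, in line with the sparse-map figure. For 349 points the same formula gives a somewhat smaller exponent than the one quoted for the initial map, which may have been estimated in a different way.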

Topological Extension(a branch and bound approach)

Retinoic acid - topological extension

[Figure: topology of the sparse map; topology of the ligand]
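The idea of topological extension with a branch and bound search can be illustrated as follows. This is only a sketch under stated assumptions and not the ARP/wARP implementation: map points closer than `max_bond` are treated as connected, each new ligand atom may only be placed on a map point connected to the point of an already placed bonded atom, and a partial assignment is abandoned as soon as its penalty reaches that of the best complete assignment found so far. The function name, the penalty and the parameters `sigma` and `max_bond` are hypothetical.

```python
import math

def branch_and_bound(points, bonds, sigma=0.1, max_bond=2.0):
    """points: list of (x, y, z) map points, in Å.
    bonds: {(i, j): target length} over ligand atoms 0..n-1 with i < j,
           such that every atom after atom 0 is bonded to an earlier atom.
    Returns (best penalty, assignment), assignment[atom] = point index."""
    n_atoms = 1 + max(j for _, j in bonds)

    def dist(p, q):
        return math.dist(points[p], points[q])

    # Topology of the sparse map: points within max_bond are connected
    neighbours = {p: [q for q in range(len(points))
                      if q != p and dist(p, q) <= max_bond]
                  for p in range(len(points))}
    best = [math.inf, None]

    def extend(assignment, penalty):
        atom = len(assignment)
        if atom == n_atoms:               # complete and better than best
            best[0], best[1] = penalty, list(assignment)
            return
        partners = [(i, t) for (i, j), t in bonds.items() if j == atom]
        parent = partners[0][0]
        for p in neighbours[assignment[parent]]:
            if p in assignment:
                continue
            new_penalty = penalty + sum(
                ((dist(assignment[i], p) - t) / sigma) ** 2
                for i, t in partners)
            if new_penalty < best[0]:     # bound: penalties never decrease
                assignment.append(p)
                extend(assignment, new_penalty)
                assignment.pop()

    for start in range(len(points)):
        extend([start], 0.0)
    return best[0], best[1]

# Toy example: a 3-atom chain with bonds of 1.5 Å and 1.2 Å, two decoy points
toy_points = [(0, 0, 0), (1.5, 0, 0), (1.5, 1.2, 0), (5, 5, 5), (-1, 0, 0)]
toy_bonds = {(0, 1): 1.5, (1, 2): 1.2}
penalty, assignment = branch_and_bound(toy_points, toy_bonds)
```

Because the penalty only grows as atoms are added, pruning on the partial penalty cannot discard the optimal assignment; it only avoids enumerating branches that are already worse.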

Real Space Fit for Final Selection of the Model

22-atom molecule of retinoic acid: among 100 “top” models:

21 are less than 0.5 Å r.m.s.d. from the final model

the “best” model is 0.14 Å r.m.s.d. from the final model

MTZ file

Protein without ligand

Ligand

Ligand Building Module in ARP/wARP 6.1

Take the largest object in the difference map

Build the ligand there (label assignment)

Real space refinement of the ligand

Ligand Building Module in ARP/wARP 6.1

                                              Location unknown       Location known
Single known ligand                           Yes (if the largest)   No
A ligand out of the list of expected ligands  No                     No
Partially ordered ligand                      No                     No

Working sample

Ligand building

Performance Assessment

Run with default parameters

- PDB and MTZ from the EDS
- Ligand PDB from HICUP
- Exclude DNA
- Exclude ligands covalently bound to the chain
- Exclude ligands with partial occupancies

(3821 structures)

Large-Scale Test


Name-by-name Nearest neighbour

Assume the PDB structure to be correct

Atomic scale (correctly built ligand into correct site)

Ligand scale (correct site, incorrectly built ligand)

Protein scale (incorrect site)
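The labels "name-by-name" and "nearest neighbour" suggest two ways of comparing a built ligand with the deposited one. The sketch below assumes that both are r.m.s.d. values, one computed between atoms with the same name and the other between each reference atom and the closest built atom regardless of name. That reading, the function names and the toy coordinates are assumptions made for illustration.

```python
import math

def rmsd_name_by_name(built, reference):
    """built, reference: {atom name: (x, y, z)} with the same atom names."""
    sq = [math.dist(built[name], reference[name]) ** 2 for name in reference]
    return math.sqrt(sum(sq) / len(sq))

def rmsd_nearest_neighbour(built, reference):
    """Match each reference atom to the closest built atom, ignoring names."""
    sq = [min(math.dist(b, r) for b in built.values()) ** 2
          for r in reference.values()]
    return math.sqrt(sum(sq) / len(sq))

# Toy case: the built ligand sits on the right positions but is flipped
reference = {"C1": (0.0, 0.0, 0.0), "C2": (1.5, 0.0, 0.0)}
built = {"C1": (1.5, 0.0, 0.0), "C2": (0.0, 0.0, 0.0)}
by_name = rmsd_name_by_name(built, reference)
by_neighbour = rmsd_nearest_neighbour(built, reference)
```

In the toy case the name-by-name value is 1.5 Å while the nearest neighbour value is 0 Å, which is the kind of difference that would separate an incorrectly built ligand in the correct site from a correctly built one.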

Accuracy of Ligand Building Process

Size of the Largest Ligand in the Working Sample

2981 structures with ligand size 7

3821 structures

Dependence on Resolution of the Data

Dependence on Ligand Disorder: B factors

Dependence on Ligand Disorder: R.m.s.d. (Ligand_Bfactors)

Dependence on Ligand Size

What is the Ligand Site / Largest Object ?

Typically it is the largest set (cluster) of connected map points where the density is above a threshold

In most cases, however, different thresholds give different (and even non-overlapping) clusters


At each density threshold count the number of clusters.

A maximum is reached at typically ~1.5 sigma density level.
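Counting clusters at each density threshold can be sketched with a simple flood fill. The example assumes a density map stored as a dictionary from grid indices to density values in sigma units, face-connected neighbours, and no crystallographic symmetry or periodic boundaries; these are simplifications, and the names are illustrative.

```python
from collections import deque

def clusters_above(density, threshold):
    """density: {(i, j, k): value in sigma units}.
    Returns the clusters of face-connected grid points with density
    above the threshold, largest cluster first."""
    above = {p for p, value in density.items() if value > threshold}
    clusters = []
    while above:
        seed = above.pop()
        cluster, queue = [seed], deque([seed])
        while queue:
            i, j, k = queue.popleft()
            for n in ((i + 1, j, k), (i - 1, j, k), (i, j + 1, k),
                      (i, j - 1, k), (i, j, k + 1), (i, j, k - 1)):
                if n in above:
                    above.remove(n)
                    cluster.append(n)
                    queue.append(n)
        clusters.append(cluster)
    return sorted(clusters, key=len, reverse=True)

def cluster_counts(density, thresholds):
    """Number of clusters at each density threshold."""
    return {t: len(clusters_above(density, t)) for t in thresholds}

# Toy one-dimensional "map": five grid points along one axis
toy = {(0, 0, 0): 3.0, (1, 0, 0): 1.0, (2, 0, 0): 3.0,
       (3, 0, 0): 0.2, (4, 0, 0): 2.0}
counts = cluster_counts(toy, [0.5, 1.5, 2.5, 3.5])
largest = clusters_above(toy, 0.5)[0]
```

In the toy map one cluster splits in two when the threshold is raised from 0.5 to 1.5, so the cluster count rises before it falls again at higher thresholds. Recording which cluster each fragment came from is what would give a fragmentation tree.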

Density Clusters and a Fragmentation Tree

1ED5 (nitric oxide synthase), 1.8 Å resolution, R factor 21% (with CNS)

Ligands: 2 x HEM and NGR (N-omega-nitro-L-arginine)

Fragmentation Tree: an Example


Scoring of Density Clusters

[Figure panels: looking for HEM, finding HEM; looking for NGR, finding NGR; looking for NGR, finding HEM; looking for HEM, finding NGR]

Selection of Correct Density Cluster

Other Lessons ?


Ligand Building: ARP/wARP 6.1 and perspectives

                                              Location unknown                     Location known
                                              6.1                   Perspective    6.1    Perspective
Single known ligand                           Yes (if the largest)  Yes            No     Yes
A ligand out of the list of expected ligands  No                    Yes            No     Yes
Partially ordered ligand                      No                    No             No     Maybe

ARP/wARP - the people

Developers

EMBL Hamburg: Guillaume Evrard, Johan Hattne, Gerrit Langer, Venkat Parthasarathy, Tilo Strutz, Victor Lamzin and many in-house friends

NKI Amsterdam: Serge Cohen, Diederick De Vries, Marouane Jelloul, Krista Joosten, Tassos Perrakis

Former members and collaborators

Richard Morris, Peter Zwart, Francisco Fernandez, Olga Kirillova, Matheos Kakaris, Gleb Bourenkov, Garib Murshudov, Alexei Vagin, Andrey Lebedev, Peter Briggs, Eleanor Dodson, Keith Wilson, Zbyszek Dauter, Gerard Kleywegt