Don't fffear the buccaneer. Kevin Cowtan, York.


Page 1: Don't fffear the buccaneer

Kevin Cowtan, York.

● Map simulation ⇨ A tool for building robust statistical methods

● 'Pirate' ⇨ A new statistical phase improvement method

● 'Buccaneer' ⇨ A new statistical chain tracing method

● Results ⇨ And a diatribe about their irrelevance

The Royal Society, York Structural Biology Laboratory

Page 2: Map simulation

• Map simulation is a tool to generate problem-specific statistical targets.

[Figure: flow diagram with the labels: known (reference) structure; unknown (work) structure; refined model density; structure factors; phases; scale factors; phase errors; simulated noisy map; target noisy map.]

Page 3: Map simulation: Method

[Figure: two 3 × 3 grids of bins, one per structure; axes: |E| (low, medium, high) and resolution (low, medium, high).]

Transferring the errors:
1. Classify the reflections from both structures by |E| and resolution.

(Note: we use 225 bins, not 9!)

Page 4: Map simulation: Method

[Figure: the same two 3 × 3 grids of bins (|E| vs. resolution), with example FOM values in the bins: 0.1, 0.0, 0.0, 0.0, ... and ... 0.9, 0.8, 0.6, 0.4.]

Transferring the errors:
2. Copy FOMs by bin from work structure to reference.

(We pick a random FOM from the same bin of the work structure for each reflection in the reference structure.)
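Steps 1 and 2 can be sketched roughly as follows. This is only an illustration, not the actual implementation: the 15 × 15 split of the 225 bins, the equal-width bin edges, and the names `bin_index` and `transfer_foms` are all assumptions made for the example.

```python
import random
from collections import defaultdict

N_E, N_RES = 15, 15  # assumed split of the 225 bins: 15 in |E| x 15 in resolution


def bin_index(e_mag, inv_res_sq, e_max, s_max):
    """Map (|E|, 1/d^2) to one of N_E * N_RES bins (equal-width bins, an assumption)."""
    i = min(int(N_E * e_mag / e_max), N_E - 1)
    j = min(int(N_RES * inv_res_sq / s_max), N_RES - 1)
    return i * N_RES + j


def transfer_foms(work, reference, e_max, s_max, rng=random):
    """work: list of (|E|, 1/d^2, FOM); reference: list of (|E|, 1/d^2).

    Returns one FOM per reference reflection, drawn at random from the
    matching bin of the work structure (None if that bin is empty).
    """
    by_bin = defaultdict(list)
    for e_mag, s, fom in work:
        by_bin[bin_index(e_mag, s, e_max, s_max)].append(fom)
    out = []
    for e_mag, s in reference:
        pool = by_bin.get(bin_index(e_mag, s, e_max, s_max))
        out.append(rng.choice(pool) if pool else None)
    return out
```

Drawing a random FOM from the bin, rather than copying the bin average, keeps the spread of FOMs within each bin as well as their mean.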

Page 5: Map simulation: Method

[Figure: plot of the probability P of the phase error, with 0 marked on the horizontal axis.]

Transferring the errors:
3. Simulate a phase error in accordance with the distribution for that FOM.
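Step 3 might look something like the sketch below. The slide does not say which distribution is used; the example assumes a von Mises distribution (the usual form for acentric reflections), whose concentration is chosen so that the mean cosine of the error equals the FOM. Centric reflections are ignored, and all function names are hypothetical.

```python
import math
import random
from functools import lru_cache


def bessel_ratio(kappa, steps=2000):
    """I1(kappa)/I0(kappa): the mean of cos(x) under a density ~ exp(kappa*cos(x))."""
    num = den = 0.0
    for i in range(steps):
        theta = math.pi * (i + 0.5) / steps  # midpoint rule on [0, pi]
        w = math.exp(kappa * (math.cos(theta) - 1.0))  # shifted to avoid overflow
        num += w * math.cos(theta)
        den += w
    return num / den


@lru_cache(maxsize=None)
def kappa_from_fom(fom):
    """Invert bessel_ratio by bisection; FOM is clamped to [0, 0.998]."""
    fom = min(max(fom, 0.0), 0.998)
    lo, hi = 0.0, 500.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if bessel_ratio(mid) < fom:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate_phase_error(fom, rng=random):
    """Draw one phase error (radians, in (-pi, pi]) for a reflection with this FOM."""
    err = rng.vonmisesvariate(0.0, kappa_from_fom(fom))  # angle in [0, 2*pi)
    return err - 2.0 * math.pi if err > math.pi else err
```

The simulated error would then be added to the calculated phase of the reference reflection.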

Page 6: Map simulation: Method

[Figure: two plots of |E|² against resolution.]

Transferring the scales:
Rescale the reference data to match the work data, after accounting for the difference in cell volumes.
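One simple way to match scales is by resolution shell, as in the sketch below. It is an assumption that the scaling is done this way; the equal-width shells and the function names are hypothetical, and the explicit cell-volume correction mentioned on the slide is not modelled here.

```python
import math
from collections import defaultdict


def shell_of(inv_res_sq, s_max, n_shells):
    """Index of the resolution shell (equal width in 1/d^2, an assumption)."""
    return min(int(n_shells * inv_res_sq / s_max), n_shells - 1)


def shell_scales(work, reference, s_max, n_shells=10):
    """work, reference: lists of (1/d^2, |F|).

    Returns a per-shell amplitude scale k such that the mean of (k*|F_ref|)^2
    equals the mean of |F_work|^2 in that shell.
    """
    def mean_intensity(data):
        total, count = defaultdict(float), defaultdict(int)
        for s, f in data:
            sh = shell_of(s, s_max, n_shells)
            total[sh] += f * f
            count[sh] += 1
        return {sh: total[sh] / count[sh] for sh in total}

    iw, ir = mean_intensity(work), mean_intensity(reference)
    return {sh: math.sqrt(iw[sh] / ir[sh]) for sh in ir if sh in iw and ir[sh] > 0}


def rescale(reference, scales, s_max, n_shells=10):
    """Apply the shell scales to the reference amplitudes (unscaled if no match)."""
    return [(s, f * scales.get(shell_of(s, s_max, n_shells), 1.0)) for s, f in reference]
```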

Page 7: Map simulation: Method

Result:

• Map calculated from simulated reference data has same statistical properties as work map.

Notes:

• Need reliable FOMs!

• Can potentially simulate Hendrickson-Lattman (HL) coefficients too.

• Should bin FOMs for centric/acentric data separately (if data available).

Page 8: 'Pirate': Rationale

• Density modification history has been dominated by the solvent mask in one form or another.

• Limitations:

– What do we do with disordered protein?

– What do we do with ordered solvent?

– Need to know solvent content.

– What do we do for non-proteins?

Page 9: 'Pirate': Method

• Divide map into a multi-dimensional continuum of states.

e.g. Local mean and local variance classify map into:

● Electron sparse/dense
● Disordered/ordered

[Figure: map regions labelled: dense, ordered; dense, disordered; sparse, ordered; sparse, disordered.]
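The local mean and local variance in this example could be computed along the lines of the sketch below. It is a simplification: it uses a cubic neighbourhood on a periodic grid, and hard cut-offs instead of the continuum of states described on the slide. Treating high local variance as "ordered" is also an assumption, and the function names are hypothetical.

```python
import itertools


def local_stats(rho, radius=1):
    """rho: 3D nested list holding a periodic density grid.

    Returns two grids of the same shape with the mean and variance of rho
    over a cube of half-width `radius` around each grid point.
    """
    nu, nv, nw = len(rho), len(rho[0]), len(rho[0][0])
    offsets = list(itertools.product(range(-radius, radius + 1), repeat=3))
    mean = [[[0.0] * nw for _ in range(nv)] for _ in range(nu)]
    var = [[[0.0] * nw for _ in range(nv)] for _ in range(nu)]
    for u, v, w in itertools.product(range(nu), range(nv), range(nw)):
        vals = [rho[(u + du) % nu][(v + dv) % nv][(w + dw) % nw]
                for du, dv, dw in offsets]
        m = sum(vals) / len(vals)
        mean[u][v][w] = m
        var[u][v][w] = sum((x - m) ** 2 for x in vals) / len(vals)
    return mean, var


def classify(m, v, mean_cut, var_cut):
    """Label a grid point from its local mean and variance (hard cut-offs)."""
    density = "dense" if m > mean_cut else "sparse"
    order = "ordered" if v > var_cut else "disordered"
    return f"{density}, {order}"
```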

Page 10: 'Pirate': Method

Compare simulated and known map to obtain density distributions for each region, then apply these distributions to the unknown map.

[Figure: panels labelled "Reference structure" and "Work structure".]

Page 11: 'Pirate': Method

• Obtain per-grid density probability distributions

– Also allows NCS, known density etc.

• Transform using equations of Bricogne (1992).

– Similar to Terwilliger (1999).

– Map probability becomes phase probability distribution.

Bricogne (1992) Proc. CCP4 Study Weekend
Bricogne (1997) Methods in Enzymology

[Figure: diagram with axes labelled R and I.]

Page 12: 'Pirate': Method

• Finally, combine new distribution with original HL coefficients, for new phases and maps.

• Gives final 'improved' phase probabilities.

[Figure: two diagrams with axes labelled R and I, with the labels "X" and "ABCD".]
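The combination step can be illustrated with the sketch below, assuming the new distribution has itself been expressed as Hendrickson-Lattman coefficients, so that P(φ) ∝ exp(A cos φ + B sin φ + C cos 2φ + D sin 2φ). Multiplying two such distributions adds their coefficients. The function names are hypothetical.

```python
import math


def hl_combine(hl1, hl2):
    """Multiplying two phase distributions adds their (A, B, C, D) coefficients."""
    return tuple(a + b for a, b in zip(hl1, hl2))


def hl_centroid(hl, steps=720):
    """Return (centroid phase in radians, figure of merit) by numerical integration."""
    a, b, c, d = hl
    phis = [2.0 * math.pi * (i + 0.5) / steps for i in range(steps)]
    logp = [a * math.cos(p) + b * math.sin(p)
            + c * math.cos(2 * p) + d * math.sin(2 * p) for p in phis]
    top = max(logp)  # subtract the maximum before exponentiating, for stability
    wts = [math.exp(lp - top) for lp in logp]
    z = sum(wts)
    x = sum(w * math.cos(p) for w, p in zip(wts, phis)) / z
    y = sum(w * math.sin(p) for w, p in zip(wts, phis)) / z
    return math.atan2(y, x), math.hypot(x, y)
```

When two distributions agree on the phase, the combined figure of merit is higher than either alone; when they disagree, it drops.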

Page 13: 'Pirate': Method

Notes:

• No solvent content required, since reference map is pre-scaled to work map.

• Single step process (for now)

– No solvent mask -> no mask to refine.

• Should work for novel problems too (with related reference structure)

– e.g. No solvent, disordered domains, metalloproteins.

Page 14: 'Buccaneer': Method

Compare simulated map and known model to obtain likelihood target, then search for this target in the unknown map.

[Figure: panels labelled "Reference structure" and "Work structure", with the label "LLK".]

Page 15: 'Buccaneer': Method

• Compile statistics for reference map in 4 Å sphere about Cα => LLK target.

4 Å sphere about Cα also used by 'CAPRA', Ioerger et al. (but different target function).

• Use mean/variance (in future histogram).
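A mean/variance target of this kind could be scored as in the sketch below. It assumes independent Gaussian statistics at each offset within the sphere and ignores the orientation search; the data layout and the names `build_target` and `llk_score` are hypothetical.

```python
def build_target(samples):
    """samples: {offset: [densities seen at that offset about each reference Cα]}.

    Returns a list of (offset, mean, variance), with a small floor on the variance.
    """
    target = []
    for offset, vals in samples.items():
        m = sum(vals) / len(vals)
        v = sum((x - m) ** 2 for x in vals) / len(vals)
        target.append((offset, m, max(v, 1e-6)))
    return target


def llk_score(density_at, centre, target):
    """Negative log-likelihood (up to a constant) of the map about `centre`.

    density_at: callable returning the map density at an (x, y, z) position.
    Lower scores mean a better match to the target.
    """
    x, y, z = centre
    score = 0.0
    for (dx, dy, dz), mean, var in target:
        rho = density_at((x + dx, y + dy, z + dz))
        score += (rho - mean) ** 2 / (2.0 * var)
    return score
```

Offsets where the reference density varies little carry a small variance and so weigh heavily in the score, while poorly determined offsets contribute little.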

Page 16: 'Buccaneer': Method

Find candidate Cα positions using LLK-fffear search (~1 per 3 residues).

Page 17: 'Buccaneer': Method

Extend from candidates using 2 residue lookahead with Ramachandran restraints.

(Same target-fn. but in real space)

Then ARP/wARP?

Lookahead search c.f. Jones, Oldfield, Terwilliger, etc.

Page 18: Results

Problem: “tuning” of one program to another.

Phasing      Ecorr / MPEw / m0      Ph.Impr.     Ecorr / MPEw / m0
'mlphare'    0.508 / 59.1 / 1.35    'dm'         0.700 / 50.6 / 0.61
                                    'resolve'    0.436 / 67.8 / 0.37
'solve'      0.474 / 61.0 / 0.83    'dm'         0.750 / 47.7 / 0.68
                                    'resolve'    0.710 / 48.0 / 0.67

'resolve' version 2.0.5, with 'no build' option in order to compare model-free phasing.

Statistics are:
Ecorr: E-map correlation;
MPEw: weighted Mean Phase Error;
m0: gradient of regression of cos(phase error) vs. FOM.

What other examples of “tuning” are present in this case?
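For reference, the MPEw and m0 statistics defined above could be computed as in the sketch below. The slide does not say which weights enter MPEw or whether the regression has an intercept, so the weights are left to the caller and a line through the origin is assumed; the function names are hypothetical.

```python
import math


def weighted_mpe(phase_errors_deg, weights):
    """Weighted mean of |phase error| in degrees; the weights are supplied by the caller."""
    num = sum(w * abs(e) for e, w in zip(phase_errors_deg, weights))
    return num / sum(weights)


def m0_gradient(phase_errors_deg, foms):
    """Least-squares gradient of cos(phase error) against FOM.

    The fit is forced through the origin, which is an assumption.
    """
    num = sum(f * math.cos(math.radians(e)) for e, f in zip(phase_errors_deg, foms))
    den = sum(f * f for f in foms)
    return num / den
```

Since the FOM is meant to estimate the expected cosine of the phase error, a gradient near 1 suggests well-calibrated FOMs, while a gradient below 1 suggests they are overestimated.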

Page 19: Results

After 'solve', but with other tuning problems:

Ph.Impr.      Ecorr / MPEw / m0
'dm'          0.750 / 47.7 / 0.68
'resolve'     0.710 / 48.0 / 0.67
'pirate' 1    0.775 / 43.2 / 1.08

'pirate' 2, 3, 6, 5, 4 (labels and values in the order listed on the slide):
0.762 / 43.3 / 0.98
0.824 / 37.2 / 1.02
0.788 / 39.7 / 0.94
0.745 / 44.7 / 1.02
0.759 / 42.7 / 0.94

The 'pirate' rows are labelled "Reference structures".

Beta-mannosidase: Boraston, Revett, Boraston, Nurizzo, Davies (2003) Structure

Page 20: Results

[Figure: map panels labelled "SAD" and "'dm'".]

Page 21: Results

[Figure: map panels labelled "'resolve'" and "'pirate'".]

Page 22: Results

Other cases:

– MIRAS:

• Commercial structure phased with MLPHARE.

• Results better than 'dm'.

– High resolution:

• RNAse phase extension to 1.5, 1.0 Å.

• Map improved (unlike 'dm') with appropriate reference structure.

• (But not as good as dual space methods: ACORN).

Page 23: Future

• 'Pirate' available soon: Q1 2004 (after tuning)

• 'Pirate' flexi-domain averaging: Q3 2004

• 'Buccaneer': 2004?

Technology:

Both applications are extremely simple, built using Clipper libraries, less than 1000 lines of code each, less than 2 months development.

Page 24: Conclusions

• Very simple but effective applications can be built with improved statistical targets from map simulation calculations.

• Preliminary results on real data suggest this approach is competitive with the state-of-the-art, even at an early stage of development.

• Need reliable phase probability distributions (figures of merit).

Page 25: Acknowledgements

● G. Bricogne (Original probability transformation eqns.)

● T. Terwilliger (First implementation of statistical dm.)

● E. Dodson (Test data)

● Royal Society (KDC funding)