Don't fffear the buccaneer
Kevin Cowtan, York.
● Map simulation ⇨ A tool for building robust statistical methods
● 'Pirate' ⇨ A new statistical phase improvement method
● 'Buccaneer' ⇨ A new statistical chain tracing method
● Results ⇨ And a diatribe about their irrelevance
The Royal Society
York Structural Biology Laboratory
Map simulation
• Map simulation is a tool to generate problem-specific statistical targets:
[Figure: flow diagram. Labels: known (reference) structure, unknown (work) structure, structure factors, phases, scale factors, phase errors, refined model density, target noisy map, simulated noisy map.]
Map simulation: Method
Transferring the errors:
1. Classify the reflections from both structures by |E| and resolution.
(Note: we use 225 bins, not 9!)
[Figure: two 3 × 3 grids of bins, one per structure, binned by |E| (low, medium, high) and by resolution (low, medium, high).]
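The classification step can be sketched as follows. This is a minimal illustration, not the author's code: it assumes each reflection is a pair (|E|, resolution) and that bin boundaries are chosen so each bin holds roughly the same number of reflections. The slide gives only the total of 225 bins, so the grid shape (for example 15 × 15), the boundary rule and the function names are all assumptions.

```python
from bisect import bisect_right

def quantile_edges(values, n_bins):
    """Interior edges that split `values` into n_bins roughly equal-population bins."""
    ordered = sorted(values)
    return [ordered[(i * len(ordered)) // n_bins] for i in range(1, n_bins)]

def classify(reflections, e_edges, res_edges):
    """Map each (|E|, resolution) pair to a single combined bin index."""
    n_res = len(res_edges) + 1
    bins = []
    for e_mag, resol in reflections:
        e_bin = bisect_right(e_edges, e_mag)        # which |E| band
        res_bin = bisect_right(res_edges, resol)    # which resolution band
        bins.append(e_bin * n_res + res_bin)
    return bins
```

Calling `quantile_edges(values, 15)` for both quantities would give a 15 × 15 grid; the same `classify` call is then applied to the work and the reference reflections.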
Map simulation: Method
Transferring the errors:
2. Copy FOMs by bin from the work structure to the reference.
(We pick a random FOM from the same bin of the work structure for each reflection in the reference structure.)
[Figure: the same two grids of |E| and resolution bins, with example FOM pairs shown in the bins: (0.1, 0.0), (0.0, 0.0), ..., (0.9, 0.8), (0.6, 0.4).]
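A minimal sketch of this random draw, assuming the bin index of every reflection has already been computed for both structures. The function name and the choice to return `None` for a reference reflection whose bin is empty in the work data are illustrative assumptions.

```python
import random
from collections import defaultdict

def transfer_foms(work_bins, work_foms, ref_bins, rng=None):
    """For each reference reflection, pick a random work FOM from the same bin."""
    rng = rng or random.Random()
    pool = defaultdict(list)
    for bin_index, fom in zip(work_bins, work_foms):
        pool[bin_index].append(fom)
    # Bins with no work reflections yield None (an assumption of this sketch).
    return [rng.choice(pool[b]) if b in pool else None for b in ref_bins]
```

Drawing from the pool, rather than copying a bin average, keeps the spread of FOMs within each bin as well as their mean.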
Map simulation: Method
Transferring the errors:
3. Simulate a phase error in accordance with the distribution for that FOM.
[Figure: plot of the phase error probability P, with 0 marked on the axis.]
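One way to draw such a phase error is sketched below. The slide does not state the form of the distribution, so this assumes an acentric reflection with a von Mises distribution, P(Δφ) proportional to exp(X cos Δφ), where X is chosen so that the mean of cos Δφ equals the FOM. Centric reflections are not handled, and all function names are hypothetical.

```python
import math
import random
from functools import lru_cache

def fom_from_x(x, n=720):
    """Mean cos(dphi) when P(dphi) is proportional to exp(x * cos(dphi))."""
    num = den = 0.0
    for k in range(n):
        theta = 2.0 * math.pi * (k + 0.5) / n
        w = math.exp(x * (math.cos(theta) - 1.0))  # shifted exponent avoids overflow
        num += w * math.cos(theta)
        den += w
    return num / den

@lru_cache(maxsize=None)
def x_from_fom(fom, x_max=500.0):
    """Invert fom_from_x by bisection (the FOM rises monotonically with x)."""
    lo, hi = 0.0, x_max
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if fom_from_x(mid) < fom:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def simulate_phase_error(fom, rng=random):
    """Draw one phase error in radians, wrapped to (-pi, pi]."""
    dphi = rng.vonmisesvariate(0.0, x_from_fom(fom))
    return dphi - 2.0 * math.pi if dphi > math.pi else dphi
```

The simulated phase is then the reference phase plus the drawn error. Caching `x_from_fom` matters because FOMs copied from bins repeat often.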
Map simulation: Method
Transferring the scales:
Rescale the reference data to match the work data, after accounting for the difference in cell volumes.
[Figure: two plots of |E|² against resolution.]
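A rough sketch of the rescaling, assuming it is done per resolution shell by matching the mean squared amplitude of the reference data to that of the work data. The cell-volume correction is left out because the slide does not give its form; the shell assignment and function names are assumptions.

```python
from collections import defaultdict

def mean_square_by_shell(amps, shells):
    """Mean squared amplitude in each resolution shell."""
    total, count = defaultdict(float), defaultdict(int)
    for amp, shell in zip(amps, shells):
        total[shell] += amp * amp
        count[shell] += 1
    return {s: total[s] / count[s] for s in total}

def shell_scales(ref_amps, ref_shells, work_amps, work_shells):
    """Per-shell factor k such that k^2 * mean(|F_ref|^2) equals mean(|F_work|^2)."""
    ref_ms = mean_square_by_shell(ref_amps, ref_shells)
    work_ms = mean_square_by_shell(work_amps, work_shells)
    return {s: (work_ms[s] / ref_ms[s]) ** 0.5
            for s in ref_ms if s in work_ms and ref_ms[s] > 0.0}

def rescale(ref_amps, ref_shells, scales):
    """Apply the per-shell factors; shells without a factor are left unscaled."""
    return [amp * scales.get(shell, 1.0) for amp, shell in zip(ref_amps, ref_shells)]
```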
Map simulation: Method
Result:
• Map calculated from simulated reference data has same statistical properties as work map.
Notes:
• Need reliable FOMs!
• Can potentially simulate HL coeffs too.
• Should bin FOMs for centric/acentric data separately (if data available).
'Pirate': Rationale
• Density modification history has been dominated by the solvent mask in one form or another.
• Limitations:
– What do we do with disordered protein?
– What do we do with ordered solvent?
– Need to know solvent content.
– What do we do for non-proteins?
'Pirate': Method
• Divide map into a multi-dimensional continuum of states.
e.g. Local mean and local variance classify map into:
● Electron sparse/dense
● Disordered/ordered
[Figure: map regions labelled dense, ordered; dense, disordered; sparse, ordered; sparse, disordered.]
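The local mean and local variance can be sketched as below for a small periodic map stored as nested lists. The cubic neighbourhood, the hard thresholds and the reading of high local variance as "ordered" are assumptions made for illustration; the slide describes a continuum of states, not four discrete classes.

```python
from itertools import product

def local_stats(grid, radius=1):
    """Local mean and variance at every point of a periodic 3D grid."""
    nu, nv, nw = len(grid), len(grid[0]), len(grid[0][0])
    offsets = list(product(range(-radius, radius + 1), repeat=3))
    mean = [[[0.0] * nw for _ in range(nv)] for _ in range(nu)]
    var = [[[0.0] * nw for _ in range(nv)] for _ in range(nu)]
    for u, v, w in product(range(nu), range(nv), range(nw)):
        # Wrap indices because a crystallographic map is periodic.
        vals = [grid[(u + du) % nu][(v + dv) % nv][(w + dw) % nw]
                for du, dv, dw in offsets]
        m = sum(vals) / len(vals)
        mean[u][v][w] = m
        var[u][v][w] = sum((x - m) ** 2 for x in vals) / len(vals)
    return mean, var

def classify_point(local_mean, local_var, mean_cut, var_cut):
    """Label one grid point; the thresholds are purely illustrative."""
    density = "dense" if local_mean > mean_cut else "sparse"
    order = "ordered" if local_var > var_cut else "disordered"
    return density + ", " + order
```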
'Pirate': Method
Compare simulated and known map to obtain density distributions for each region, then apply these distributions to the unknown map.
[Figure: panels labelled "Reference structure" and "Work structure".]
'Pirate': Method
• Obtain per-grid density probability distributions
– Also allows NCS, known density etc.
• Transform using equations of Bricogne (1992).
– Similar to Terwilliger (1999).
– Map probability becomes phase probability distribution.
Bricogne (1992) Proc. CCP4 Study Weekend
Bricogne (1997) Methods in Enzymology
[Figure: diagram with axes R and I.]
'Pirate': Method
• Finally, combine new distribution with original HL coefficients, for new phases and maps.
• Gives final 'improved' phase probabilities.
[Figure: diagrams with axes R and I, labelled × and ABCD.]
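The combination step can be illustrated with Hendrickson-Lattman coefficients, where P(φ) is proportional to exp(A cos φ + B sin φ + C cos 2φ + D sin 2φ), so multiplying two distributions amounts to adding their coefficients. Whether 'Pirate' combines them in exactly this way is an assumption; the function names are hypothetical.

```python
import math

def combine_hl(hl_a, hl_b):
    """Multiply two phase distributions by adding their (A, B, C, D) coefficients."""
    return tuple(x + y for x, y in zip(hl_a, hl_b))

def centroid_phase_and_fom(hl, n=360):
    """Centroid phase (radians) and figure of merit by numerical integration."""
    a, b, c, d = hl
    angles = [2.0 * math.pi * k / n for k in range(n)]
    logp = [a * math.cos(p) + b * math.sin(p)
            + c * math.cos(2.0 * p) + d * math.sin(2.0 * p) for p in angles]
    top = max(logp)                       # subtract the maximum for stability
    weights = [math.exp(v - top) for v in logp]
    norm = sum(weights)
    re = sum(w * math.cos(p) for w, p in zip(weights, angles)) / norm
    im = sum(w * math.sin(p) for w, p in zip(weights, angles)) / norm
    return math.atan2(im, re), math.hypot(re, im)
```

Two agreeing distributions sharpen each other: combining `(1, 0, 0, 0)` with itself gives a higher figure of merit than either alone.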
'Pirate': Method
Notes:
• No solvent content required, since reference map is pre-scaled to work map.
• Single step process (for now)
– No solvent mask -> no mask to refine.
• Should work for novel problems too (with related reference structure)
– e.g. No solvent, disordered domains, metalloproteins.
'Buccaneer': Method
Compare simulated map and known model to obtain likelihood target, then search for this target in the unknown map.
[Figure: panels labelled "Reference structure" and "Work structure", with the label LLK.]
'Buccaneer': Method
• Compile statistics for reference map in 4 Å sphere about Cα => LLK target.
• Use mean/variance (in future histogram).
4 Å sphere about Cα also used by 'CAPRA', Ioerger et al. (but different target function).
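A mean/variance target of this kind can be sketched as follows. It assumes the density around each reference Cα has already been sampled at a fixed set of offsets inside the sphere, giving one vector per Cα, and it scores a candidate with an independent Gaussian at each offset. That Gaussian form is a simplification chosen for illustration and is not necessarily the target function used by 'fffear' or 'Buccaneer'.

```python
def build_target(samples):
    """Per-offset mean and variance over density vectors from reference C-alphas."""
    n = len(samples)
    columns = list(zip(*samples))
    means = [sum(col) / n for col in columns]
    variances = [sum((x - m) ** 2 for x in col) / n
                 for col, m in zip(columns, means)]
    return means, variances

def llk_score(density, means, variances, min_var=1e-6):
    """Negative log-likelihood up to a constant; lower means a better match."""
    return sum((rho - m) ** 2 / (2.0 * max(v, min_var))
               for rho, m, v in zip(density, means, variances))
```

Offsets where the reference density varies little get a small variance and so weigh heavily in the score, while poorly conserved offsets contribute little.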
'Buccaneer': Method
Find candidate Cα positions using LLK-fffear search. (~1 per 3 residues)
'Buccaneer': Method
Extend from candidates using 2 residue lookahead with Ramachandran restraints.
(Same target function, but in real space.)
Then ARP/wARP?
Lookahead search c.f. Jones, Oldfield, Terwilliger, etc.
Results
Problem: “tuning” of one program to another.
[Table: Ecorr / MPEw / m0 after phasing ('mlphare', 'solve') and after phase improvement ('dm', 'resolve'). Values in extracted order:]
0.508 / 59.1 / 1.35
0.474 / 61.0 / 0.83
0.700 / 50.6 / 0.61
0.436 / 67.8 / 0.37
0.750 / 47.7 / 0.68
0.710 / 48.0 / 0.67
'resolve' version 2.0.5, with 'no build' option in order to compare model-free phasing.
Statistics are:
Ecorr: E-map correlation;
MPEw: weighted mean phase error;
m0: gradient of regression of cos(phase error) vs. FOM.
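Two of these statistics can be sketched as below, given phase errors against a known answer. The slide does not say which weights enter MPEw or whether the regression includes an intercept, so the caller-supplied weights and the ordinary least-squares fit with intercept are assumptions. Since the FOM is the expected cosine of the phase error, a gradient near 1 suggests well-calibrated FOMs.

```python
import math

def weighted_mean_phase_error(phase_errors_deg, weights):
    """Weighted mean of |phase error| in degrees, with errors wrapped to [0, 180]."""
    wrapped = [abs((e + 180.0) % 360.0 - 180.0) for e in phase_errors_deg]
    return sum(w * e for w, e in zip(weights, wrapped)) / sum(weights)

def regression_gradient(foms, phase_errors_deg):
    """Least-squares slope of cos(phase error) against FOM."""
    ys = [math.cos(math.radians(e)) for e in phase_errors_deg]
    n = len(foms)
    mean_x, mean_y = sum(foms) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in foms)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(foms, ys))
    return sxy / sxx
```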
What other examples of “tuning” are present in this case?
Results
After 'solve', but with other tuning problems:
[Table: Ecorr / MPEw / m0 by phase improvement program. Row labels in extracted order: 'pirate' 1, 'resolve', 'dm', 'pirate' 2, 'pirate' 3, 'pirate' 6, 'pirate' 5, 'pirate' 4. Values in extracted order:]
0.750 / 47.7 / 0.68 ('dm')
0.710 / 48.0 / 0.67
0.775 / 43.2 / 1.08
0.762 / 43.3 / 0.98
0.824 / 37.2 / 1.02
0.788 / 39.7 / 0.94
0.745 / 44.7 / 1.02
0.759 / 42.7 / 0.94
Reference structures
Beta-mannosidase (2003) Structure. Boraston, Revett, Boraston, Nurizzo, Davies
Results
[Figure: maps labelled SAD and 'dm'.]
Results
[Figure: maps labelled 'resolve' and 'pirate'.]
Results
Other cases:
– MIRAS:
• Commercial structure phased with MLPHARE.
• Results better than 'dm'.
– High resolution:
• RNAse phase extension to 1.5, 1.0 Å.
• Map improved (unlike 'dm') with appropriate reference structure.
• (But not as good as dual-space methods: ACORN.)
Future
• 'Pirate' available soon: Q1 2004 (after tuning)
• 'Pirate' flexi-domain averaging: Q3 2004
• 'Buccaneer': 2004?
Technology:
Both applications are extremely simple, built using Clipper libraries, less than 1000 lines of code each, less than 2 months development.
Conclusions
• Very simple but effective applications can be built with improved statistical targets from map simulation calculations.
• Preliminary results on real data suggest this approach is competitive with the state-of-the-art, even at an early stage of development.
• Need reliable phase probability distributions (figures of merit).
Acknowledgements
● G. Bricogne (Original probability transformation eqns.)
● T. Terwilliger (First implementation of statistical dm.)
● E. Dodson (Test data)
● Royal Society (KDC funding)