Phasing in Macromolecular Crystallography

What do we do with it now?

How do we get from spots on a screen to a pretty picture of our protein?

Heidi’s Gun4 Data and Structure

BY CALCULATING AMPLITUDES AND PHASES!!

Why Do We Care So Much About Phases?

Diffraction by X-rays

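[Figure: X-rays from the SOURCE strike lattice planes with indices h = 1, k = 2, l = 0 and spacing dhkl at angle θ and leave at angle θ toward the DETECTOR; incident wave vector k0, scattered wave vector k; axes x, y, z]

Bragg’s Law: nλ = 2d sinθ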
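A minimal sketch of Bragg’s Law in use, assuming an illustrative plane spacing of 3.0 Å and the Cu Kα wavelength (1.5418 Å). The function name `bragg_angle_deg` and the numbers are made up for this example; they do not come from the Gun4 data.

```python
import math

def bragg_angle_deg(d_spacing, wavelength, order=1):
    """Solve n*lambda = 2*d*sin(theta) for theta, returned in degrees."""
    s = order * wavelength / (2.0 * d_spacing)
    if s > 1.0:
        # sin(theta) cannot exceed 1, so these planes cannot diffract
        raise ValueError("no diffraction: n*lambda exceeds 2*d")
    return math.degrees(math.asin(s))

# Illustrative values only: d = 3.0 A planes, Cu K-alpha radiation
theta = bragg_angle_deg(d_spacing=3.0, wavelength=1.5418)
print(f"theta = {theta:.2f} deg, detector angle 2*theta = {2 * theta:.2f} deg")
```

Smaller d (higher resolution) gives a larger θ, which is why high-resolution spots sit toward the edge of the detector.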

Diffraction by X-rays

[Figure: diffraction from the (h,k,l) = (2,1,0) planes of the Crystal; SOURCE, incident wave vector k0, scattered wave vector k]

The Phase Problem

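[Figure: two scattered waves (incident wave vector k0, scattered wave vector k) offset from each other by Φ]

A PHASE DIFFERENCE!!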

Addition of Waves

[Figure: three waves f1, f2 and f3 add to give the resultant wave Fs]

f1 + f2 + f3 = Fs

The Phase Problem

[Figure: Argand diagram; the atomic contributions f1, f2 and f3 add head to tail to give Fhkl with phase Φhkl; axes: real, imaginary]

The resulting wave that reaches the detector has a particular phase and amplitude that results from the addition of individual scattering factors from all the atoms in the unit cell, which each have their own phase and amplitude.

Fs = V ∑_{j=1}^{N} fj [cos 2π(hxj + kyj + lzj) + i sin 2π(hxj + kyj + lzj)]
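The sum over atoms can be sketched in a few lines of Python, using the identity cos θ + i sin θ = exp(iθ). The three atoms and their coordinates below are invented for illustration, and the constant prefactor V is left out because it rescales every amplitude equally without changing any phase.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """Sum f_j * exp[2*pi*i*(h*x_j + k*y_j + l*z_j)] over all atoms.

    atoms: list of (f_j, (x_j, y_j, z_j)) with fractional coordinates.
    The constant prefactor V is omitted (assumption: only relative
    amplitudes and phases matter here).
    """
    h, k, l = hkl
    total = 0j
    for f_j, (x, y, z) in atoms:
        total += f_j * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
    return total

# Three made-up atoms; f_j is taken as the electron count (C, N, O)
atoms = [
    (6.0, (0.10, 0.20, 0.30)),
    (7.0, (0.40, 0.15, 0.75)),
    (8.0, (0.65, 0.80, 0.05)),
]

F = structure_factor((1, 2, 0), atoms)
print(f"|F| = {abs(F):.2f}, phase = {math.degrees(cmath.phase(F)):.1f} deg")
```

The detector records only a quantity proportional to |F| squared; the phase printed here is exactly the part that the experiment loses.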

The Phase Problem

The Phase Problem: Each reflection we measure during the diffraction experiment tells us the amplitude of a particular Fs, but not its phase.

Solving the Phase Problem

• How do we figure out the phase of Fs?
– Combine Fs (or FP here) with a wave of known phase (fH) to get a new resultant wave (FPH).

FP (unknown phase) + fH (known phase) = FPH (unknown phase)

FPH = FP + fH

Solving the Phase Problem: Harker Constructions

[Figure: Harker construction; fH, FP and FPH drawn on the complex plane; axes: real, imaginary]

FPH = FP + fH

Hurray!! We’ve solved the phase problem!!

Well, sort of… Now we actually have two phases to choose from.
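The two-fold ambiguity follows from the triangle FPH = FP + fH: the law of cosines gives only the cosine of the angle between FP and fH, so the angle itself can be positive or negative. A minimal sketch, assuming made-up amplitudes and a hypothetical helper name `sir_phase_choices`:

```python
import cmath
import math

def sir_phase_choices(fp_amp, fph_amp, f_h):
    """Return the two FP phases (radians) consistent with |FP|, |FPH| and fH.

    From |FPH|^2 = |FP|^2 + |fH|^2 + 2|FP||fH|cos(alpha_P - alpha_H).
    """
    fh_amp = abs(f_h)
    alpha_h = cmath.phase(f_h)
    cos_term = (fph_amp**2 - fp_amp**2 - fh_amp**2) / (2.0 * fp_amp * fh_amp)
    # Measurement error can push the value slightly outside [-1, 1]
    cos_term = max(-1.0, min(1.0, cos_term))
    delta = math.acos(cos_term)
    return alpha_h + delta, alpha_h - delta

# Made-up "true" values, used only to generate consistent amplitudes
true_fp = cmath.rect(100.0, math.radians(40.0))
f_h = cmath.rect(20.0, math.radians(110.0))
fph_amp = abs(true_fp + f_h)

choices = sir_phase_choices(abs(true_fp), fph_amp, f_h)
print([round(math.degrees(a) % 360.0, 1) for a in choices])
```

One of the two returned phases is the true one; the other is its mirror image across the direction of fH, which is why a single derivative is not enough by itself.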

Solving the Phase Problem

FPH = FP + fH

How do we actually figure out the amplitude and phase of fH?

Amplitude: determined during the diffraction experiment

Phase: determine the real space (x,y,z) location of “H” through the use of the Patterson Function

[Figure: Harker construction with fH, FP and FPH; axes: real, imaginary]

The Patterson Method

P(u,v,w) = 1/V ∑_{h,k,l} |Fhkl|² cos 2π(hu + kv + lw)

The Patterson function is similar to the electron density equation in that they both use amplitudes measured during the diffraction experiment.

However, Patterson functions do not require phases to be computed, only the amplitudes of the (h,k,l) reflections.
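A one-dimensional sketch of the Patterson sum, assuming two invented point atoms at x = 0.10 and x = 0.40 in a cell of unit length. Only the squared amplitudes go into the sum, yet the map still peaks at the interatomic vector u = 0.30 (and at its inverse, u = 0.70).

```python
import numpy as np

# Two made-up atoms: (scattering factor, fractional coordinate x)
atoms = [(30.0, 0.10), (16.0, 0.40)]

h = np.arange(-20, 21)                      # reflections included in the sum
F = sum(f * np.exp(2j * np.pi * h * x) for f, x in atoms)
amp_sq = np.abs(F) ** 2                     # phases are discarded here

V = 1.0                                     # cell "volume" (assumed unit length)
u = np.linspace(0.0, 1.0, 200, endpoint=False)
patterson = (amp_sq[:, None] * np.cos(2 * np.pi * h[:, None] * u[None, :])).sum(axis=0) / V

# Ignore the origin peak (every atom's vector to itself) and find the next one
window = (u > 0.05) & (u < 0.5)
u_peak = u[window][np.argmax(patterson[window])]
print(f"strongest non-origin peak at u = {u_peak:.3f}")
```

The peak position equals the difference of the two coordinates, not either coordinate by itself, which is the sense in which a Patterson map is a map of vectors.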

The Patterson Method

The Patterson map is a map of the interatomic vectors within the unit cell. The height of each peak is proportional to the product of the electron densities at the two positions that the vector connects.

[Figure: three atoms (1, 2, 3) in REAL SPACE give six interatomic vector peaks in PATTERSON SPACE: 1-2, 1-3, 2-1, 2-3, 3-1, 3-2]

The Patterson Method

The Patterson function is especially useful because of two important features:

1. The magnitude of P(u,v,w) is proportional to the product of the atomic numbers of the atoms at the ends of the vector u = (u,v,w).

2. The symmetry within a unit cell is imposed on the peaks in a Patterson map. This means that the vectors between symmetry-related atoms also give peaks in the Patterson map, and those peaks fall on predictable sections.

Harker Sections

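[Figure: a 2-fold axis along y (180° rotation) relates the atom at (x,y,z) to its symmetry mate at (-x,y,-z); axes x, y, z]

(x,y,z) - (-x,y,-z) = (2x,0,2z) = (u,v,w)

Should have a peak on the v=0 Harker section of the Patterson map.

x = u/2, z = w/2 (the section gives no information about y; with only this 2-fold axis the origin along y is free, so y = 0 can be chosen)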

Harker Sections

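[Figure: the v = 0 Harker section of the Patterson map]

(2x,0,2z) = (u,v,w), so x = u/2, z = w/2 (the section gives no information about y; with only this 2-fold axis the origin along y is free, so y = 0 can be chosen)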
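Turning a Harker peak into coordinates is a matter of halving, with one catch: u and w are fractional coordinates known only modulo 1, so x = u/2 and x = u/2 + 1/2 both reproduce the same peak (and likewise for z). A small sketch with a hypothetical helper `harker_candidates` and an invented peak position:

```python
def harker_candidates(u, w):
    """Candidate (x, z) positions for a peak at (u, 0, w) on the v = 0 section.

    Uses u = 2x and w = 2z, each taken modulo 1, for a 2-fold axis along y.
    """
    x0 = (u % 1.0) / 2.0
    z0 = (w % 1.0) / 2.0
    return [(x0 + dx, z0 + dz) for dx in (0.0, 0.5) for dz in (0.0, 0.5)]

# Made-up peak: an atom at x = 0.12, z = 0.83 gives u = 0.24, w = 1.66 -> 0.66
for x, z in harker_candidates(0.24, 0.66):
    print(f"x = {x:.2f}, z = {z:.2f}")
```

With a single site any of the four candidates can be used; once a second site is placed, both sites have to be chosen consistently so that the cross vectors between them also match the Patterson map.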

Methods for Determining Phases

• Isomorphous Replacement
– Single Isomorphous Replacement (SIR)
– Multiple Isomorphous Replacement (MIR)
• Anomalous Dispersion
– Single Wavelength Anomalous Dispersion (SAD)
– Multiple Wavelength Anomalous Dispersion (MAD)
• Molecular Replacement
• Direct Methods

Isomorphous Replacement

• The Goal: Modify our crystal by having it bind a heavy atom.
• Why? A heavy-atom derivative of our crystal will create a change in the intensity of observed reflections relative to our native crystal.

Isomorphous Replacement

• Why do we use heavy atoms?
– Because they have lots of electrons and scatter X-rays more strongly.

ΔIavg / Iavg = (2NH / NN)^{1/2} × (fH / fN)

NH = number of heavy atoms
fH = heavy atom scattering power
NN = number of native atoms
fN = avg. scattering power for native atoms
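To get a feel for the size of the effect, the ratio of the average intensity change to the average intensity, (2NH/NN)^(1/2) × (fH/fN), can be evaluated directly. The numbers below are assumptions chosen for illustration: one mercury atom (about 80 electrons) in a protein of about 1000 non-hydrogen atoms with an average of about 7 electrons each.

```python
import math

def expected_intensity_change(n_heavy, f_heavy, n_native, f_native_avg):
    """Average fractional intensity change: sqrt(2*NH/NN) * (fH/fN)."""
    return math.sqrt(2.0 * n_heavy / n_native) * (f_heavy / f_native_avg)

# Assumed example: 1 Hg (~80 e-) in ~1000 native atoms averaging ~7 e-
one_site = expected_intensity_change(1, 80.0, 1000, 7.0)
# Same heavy atom in a protein ten times larger
big_protein = expected_intensity_change(1, 80.0, 10000, 7.0)

print(f"small protein: {one_site:.2f}, large protein: {big_protein:.2f}")
```

The square-root dependence on NN means the signal from one site fades slowly with protein size, but it does fade, so larger proteins generally need more sites or heavier atoms.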

Isomorphous Replacement

[Figure: two Argand diagrams (axes: real, imaginary). NATIVE: f1, f2 and f3 add to give FP with phase Φhkl. HEAVY-ATOM DERIVATIVE: the same FP plus fH gives FPH]

Isomorphous Replacement

[Figure: Argand diagram for the HEAVY-ATOM DERIVATIVE (axes: real, imaginary); f1, f2 and f3 add to give FP with phase Φhkl, and FP plus fH gives FPH]

Isomorphous Replacement

• Why do we care about intensity differences in the observed diffraction pattern?

P(u,v,w) = 1/V ∑_{h,k,l} |ΔF|² cos 2π(hu + kv + lw)

The differences in amplitude (ΔF = |FPH| – |FP|) can be used as coefficients for the Patterson Function.

This is useful because our Patterson Maps and Harker Sections are now giving us information about the locations of the heavy atoms in our derivatives. THIS IS REALLY IMPORTANT!!

Heavy Atoms in Isomorphous Replacement

• How do we make derivatives of our crystal?
– Trial and error: add a small amount of the metal reagent (0.1-10 mM) to the crystallization condition and soak the crystal (seconds, minutes, hours, days)

Common Heavy Metals and Their Amino Acid Ligands
Platinum Potassium Thiocyanate: His (pH>7), Lys (pH>9)
Gold Cyanide: Cys, His (pH>7), Lys (pH>9)
Potassium Tetrachloroplatinate: Cystines, His (pH>7)
Thimerosal: Cys, His (pH>7)

And many more…

Detecting Derivatives

• Unfortunately, like most things in crystallography, just about everything is a variable when trying to get a derivative.
– So how do we know when we’ve gotten a derivative?
– Answer: Look for differences in spot intensity between native and potential derivative crystals, especially at low resolution where heavy atom differences will be strongest

Detecting Derivatives

• It must take a long time to find a derivative!!!

Fortunately, we don’t have to collect a full dataset for each derivative.

Scaling a few images of a derivative dataset against a native dataset is enough to detect differences in intensity.

Detecting Derivatives

• First things first: A derivative crystal should be different from the native crystal.

Detecting Derivatives

Use Scalepack to scale a native dataset against a few images (~3°) of a potential derivative.

When derivative and native crystals are scaled together, the χ² value should be >1.

Assuming the cell parameters look good, if χ² is…

~50: probably not a derivative (non-isomorphous, wrong indexing, or way too many substitutions)

~10: good chance that you have a derivative

~2-5: well… I would keep looking for derivatives, but keep this one in mind

~1: scales well with native, so you’ve probably got a native crystal (no heavy-atom substitution)
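The idea behind the χ² test can be sketched as follows. This is not Scalepack’s actual algorithm; it assumes a single overall scale factor and invented intensities, and simply asks whether the native and derivative measurements differ by more than their error estimates allow.

```python
import numpy as np

def scaled_chi2(i_native, sig_native, i_deriv, sig_deriv):
    """Mean of (I_nat - k*I_der)^2 / (sig_nat^2 + (k*sig_der)^2).

    k is one least-squares scale factor placing the derivative on the
    native scale (a simplifying assumption for this sketch).
    """
    i_native = np.asarray(i_native, dtype=float)
    sig_native = np.asarray(sig_native, dtype=float)
    i_deriv = np.asarray(i_deriv, dtype=float)
    sig_deriv = np.asarray(sig_deriv, dtype=float)

    k = np.sum(i_native * i_deriv) / np.sum(i_deriv ** 2)
    diff = i_native - k * i_deriv
    variance = sig_native ** 2 + (k * sig_deriv) ** 2
    return float(np.mean(diff ** 2 / variance))

# Made-up intensities for four reflections, sigma = 5 on each measurement
native = [100.0, 200.0, 300.0, 400.0]
soaked = [130.0, 170.0, 340.0, 360.0]
sigmas = [5.0, 5.0, 5.0, 5.0]

print("native vs native:", scaled_chi2(native, sigmas, native, sigmas))
print("native vs soaked:", scaled_chi2(native, sigmas, soaked, sigmas))
```

Two measurements of the same crystal form should differ only by their errors, giving a value near 1 in real data; a value well above 1 means something real has changed, which is good news only if the cell parameters still match.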

Isomorphous Replacement: The Good and the Bad

The Good
- Can be quick since data collection can be done at home
- Works well for proteins purified from sources other than E. coli (e.g. yeast)
- Don’t need high quality data (low resolution is okay)

The Bad
- Getting a derivative in the first place is not trivial
- Getting a derivative crystal that’s isomorphous can be difficult
- Data quality can be poor

The Reality of Isomorphous Replacement

[Figure: Argand diagram of the ideal case; FP (phase Φhkl) plus fH gives FPH; axes: real, imaginary]

FPH = FP + fH

In reality, there’s some error in our measurement of FPH and the location of the heavy atoms.

[Figure: Argand diagram of the real case; FPH,obs and FPH,calc differ by ε; axes: real, imaginary]

FPH ≈ FP + fH

ε = FPH,obs – FPH,calc

ε = lack of closure

The Reality of Isomorphous Replacement

Harker construction of a single reflection from an SIR experiment

Phase Probability: P(α) = exp{-ε²(α)/2E²}

Best Phase: αbest = phase of ∫P(α)exp(iα)dα (the centroid of the phase distribution)

Figure of Merit: m = |∫P(α)exp(iα)dα| / ∫P(α)dα

m = 1: no phase error
m = 0.5: ~60° phase error
m = 0: all phases equally probable

Minimize the phase error by using the centroid of the phase distribution (Best Phase).


By using multiple heavy atom derivatives, we can get a better estimate of the correct phase.

This is Multiple Isomorphous Replacement (MIR).

The Reality of Isomorphous Replacement

[Figure: two panels, SIR and MIR]

Anomalous Dispersion Techniques

What is Anomalous Dispersion?

A phenomenon that occurs when an atom’s electrons absorb and re-emit X-rays whose energy is close to the electrons’ binding energy (an absorption edge of the atom).

What’s the result of this absorption?

The re-emitted X-rays have the same wavelength as the incident radiation, but are now phase shifted by 90°.

Effects of Anomalous Dispersion

Radiation scattered by an atom is actually composed of two components:

1. “Normal” Thomson scattering (no phase change relative to incident radiation)

2. A minor anomalous component phase shifted by π/2

f(λ) = f0 + Δf'(λ) + if''(λ) = f'(λ) + if''(λ)

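[Figure: Argand diagram without anomalous scattering; FP+ plus fH+ gives FPH+, and FP- plus fH- gives FPH-]

|F+| = |F-| (Friedel’s Law)

[Figure: Argand diagram with the anomalous component f'' added to both fH+ and fH-; FPH+ and FPH- no longer have the same length]

|F+| ≠ |F-| (Breakdown in Friedel’s Law)

Bijvoet Differences (That’s “Bi-foot”)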
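The breakdown of Friedel’s Law can be reproduced by giving one atom a complex scattering factor. In this sketch the atoms, coordinates and the selenium values (f' of about -8 and f'' of about 4.5 electrons near the absorption peak) are illustrative assumptions, not measured data.

```python
import cmath
import math

def structure_factor(hkl, atoms):
    """Sum f_j * exp[2*pi*i*(hx + ky + lz)]; f_j may be complex."""
    h, k, l = hkl
    return sum(
        f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
        for f, (x, y, z) in atoms
    )

light = [
    (6.0, (0.10, 0.20, 0.30)),
    (7.0, (0.40, 0.15, 0.75)),
    (8.0, (0.65, 0.80, 0.05)),
]
se_site = (0.25, 0.60, 0.45)

# Selenium as a normal scatterer, then with assumed f' and f'' terms
normal = light + [(34.0, se_site)]
anomalous = light + [(34.0 - 8.0 + 4.5j, se_site)]

hkl, minus_hkl = (1, 2, 0), (-1, -2, 0)
for label, atoms in (("normal", normal), ("anomalous", anomalous)):
    f_plus = abs(structure_factor(hkl, atoms))
    f_minus = abs(structure_factor(minus_hkl, atoms))
    print(f"{label}: |F+| = {f_plus:.2f}, |F-| = {f_minus:.2f}")
```

The difference between |F+| and |F-| is small compared with either amplitude, which is why anomalous phasing puts such a premium on accurate, redundant data.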

Wavelength Dependence of Anomalous Dispersion

Anomalous Signal Increases With Scattering Angle

Using Anomalous Dispersion to Solve the Phase Problem: MAD

P(u,v,w) = 1/V ∑_{h,k,l} (|F+| - |F-|)² cos 2π(hu + kv + lw)

Choosing Appropriate Wavelengths

Peak (f'')

Inflection (f')

Remote

MAD Requirements

• Strong anomalous signal (at least 1 Se per 17 kDa of protein)
• Tunable X-ray source (synchrotron)
• Preferably the ability to measure the absorbance spectrum of your protein
• High solvent content
• Best data possible: high resolution and low Rmerge

Single Wavelength Anomalous Dispersion (SAD)

Solve the phase ambiguity by evaluating the quality of the maps for both solutions using density modification programs.

Basic Requirements:
• Strong anomalous signal (usually at least 1 Se per 17 kDa of protein)
• As usual, best data possible (resolution and error)
• High solvent content (>50%)
• Accurate measurement of phasing errors

Sulfur Anomalous Phasing

Ramagopal et al. Acta Cryst. (2003) D59, 1020-1027

Detecting Anomalous Signals

1. Using Scalepack, scale data as normal, except turn on the ANOMALOUS flag (writes out F+ and F- separately in a .sca file).

2. Rescale this .sca file, but this time with the anomalous flag turned off. This compares F+ and F-.

3. Examine the χ² values. The presence of an anomalous signal should give χ² > 1 (although this could also indicate absorption or detector problems). This is also useful for examining the resolution cut-off of the anomalous signal.

The Good and the Bad of MAD/SAD Structure Determination

• Essentially eliminates the isomorphism problems of SIR/MIR. Generally better phases.
• Ability to use molecular biology to derivatize your protein (SeMet).
– May also be able to use naturally bound anomalous scatterers (Zn, Fe, Ca, etc.)
• Usually need to go to a synchrotron
– However, there’s often an anomalous signal with Cu-Kα; could be useful in SIRAS/MIRAS methods.
• The potential of sulfur anomalous phasing essentially eliminates the need to derivatize your protein
– Method can be limited to especially good data
• Due to the availability of synchrotron resources, anomalous phasing is the primary method of choice for phasing

Calculation of Protein Phases

After solving the real space location of the heavy atom through isomorphous or anomalous difference Pattersons, determine the protein phases:

• Refine the xyz coordinates of the heavy atom through cycles of refining the occupancy and B-factor
• Determine the protein phases from the refined heavy atom coordinates and refine these phases
• Look at the initial map and see how you did
• Then go and talk to Devin to see where to go from here
• Programs that can do these steps include PHASES and MLPHARE (available in the CCP4 package)

Automated Methods: SOLVE

Fortunately, there are automated processes available to do everything from scaling data and locating heavy atoms to determining protein phases. One such program is SOLVE.

It works by converting each decision-making step into an optimization problem through scoring and ranking of possible solutions.

1. Locate and refine heavy atom sites through difference Pattersons or direct methods and generate phases. Converts MAD data to pseudo-SIRAS.

2. Score potential heavy atom sites by four criteria:
A. Agreement between calculated and observed Patterson maps
B. Cross-validation of the heavy-atom sites through difference Fourier analysis: delete a site in a solution and recalculate phases
C. Figure of Merit (m)
D. Non-randomness of the electron density map: identify solvent and protein regions and score based on connectivity in solvent and protein region.

Tutorial

Use Heidi’s Gun4 MAD data to calculate and analyze Patterson Maps and Harker Sections to determine the real space coordinates of the Selenium within the structure.
