Page 157 - Macromolecular Crystallography
structure a good guess of the correct histogram can be obtained, resulting in the following equation:

$$H(\rho(x_{\mathrm{protein}})) = H_{\mathrm{obs}}\left(\frac{1}{V}\sum_{j=0}^{2N} F(h_j)\, e^{i\phi(h_j) - 2i\pi\, x_{\mathrm{protein}} \cdot h_j}\right) \qquad (6)$$

Here $x_{\mathrm{protein}}$ is a real space coordinate within the protein region, $H(\rho(x))$ is the expected, non-Gaussian histogram of the electron density and $H_{\mathrm{obs}}(\rho(x))$ is the observed histogram of protein density, which may or may not have phase errors. Equation 6 cannot be substituted into Eq. 1 and therefore it does not further reduce the number of unknowns. However, it does provide additional equations, their number being determined by the number of independent grid points within the unique protein region. Its effectiveness is determined by the difference between the theoretical histogram of a protein at a given resolution and that of randomly phased data.

10.5 The practice of phase refinement: Fourier cycling

In theory, density modification could produce perfect phases if Eqs 2 to 6 are sufficiently restrictive. Let us illustrate this by an example: assume a crystal with 50% solvent and three-fold non-crystallographic symmetry. There are 2N Fourier summations (Eq. 1) with 4N unknowns. After substitution with Friedel's Law (Eq. 2), only N phases remain unknown, so now there are 3N unknowns in total. For all phases we have experimental information encoded in Hendrickson–Lattman coefficients (Eq. 3), so we can add N equations to our set. As we know the location of the solvent region, we can reduce the number of unknown densities at independent grid points from 2N to N upon substituting with Eq. 4. Non-crystallographic symmetry further reduces the number of unknown densities to N/3 by substituting with Eq. 5. Histogram matching can further reduce the search space of solutions and improve convergence.

If there are more equations than unknowns, why then can we not determine phases accurately without building an atomic model? Well, the additional equations of type 3 and 6 are of a statistical nature and therefore may be less restrictive. Furthermore, inaccuracies in determining the solvent mask and the non-crystallographic symmetry operators and masks will compromise the procedure. Nevertheless, the additional information imposed often leads to substantial improvement.

Unfortunately, the relations between the electron density, the restraints we have discussed here, and the structure factors are non-linear. Thus, the only strategy we can adopt is to use the approximate phases we start out with and improve these iteratively. Even this is not straightforward, mainly because Eq. 1 is expensive to compute. However, there exists a powerful and straightforward procedure that is used in virtually all phase refinement programs: Fourier cycling.

In Fourier cycling, the approximate phases we have available at the beginning of the process of density modification are used to calculate an initial map. The real space restraints (solvent flatness, non-crystallographic symmetry averaging, and histogram matching) are imposed on this initial density map. After Fourier transformation, the structure factors obtained typically no longer obey the reciprocal space constraints such as the measured amplitude and the phase probability distribution. Therefore, existing reciprocal space restraints are recombined with the phase probability distribution obtained after back transformation of the restrained electron density. These modified structure factors are used to calculate a new map. This new map may, in turn, no longer obey the real space restraints, so these are reimposed. The procedure is repeated until it converges on a density map satisfying the equations available as well as is possible. In Fig. 10.1, a flow chart of the process of Fourier cycling is shown.

Before we can understand why Fourier cycling works, we have to deepen our understanding of the Fourier transform. In particular, we need to understand the effects of modifying density on the structure factor amplitudes and phases. The mathematical tool that describes this is the convolution operator.
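
Before turning to convolution, the Fourier cycling loop described above can be made concrete with a toy example. The following is only a minimal sketch under strong assumptions that are not part of the text: the "crystal" is a one-dimensional grid, solvent flattening is the only real space restraint, and the back-transformed phases simply replace the old ones instead of being recombined with a phase probability distribution. The names `fourier_cycle`, `map_from_phases`, and `solvent_std` are hypothetical helpers introduced for illustration, not functions of any phase refinement program.

```python
import numpy as np

def map_from_phases(amplitudes, phases, n_grid):
    """Structure factors -> real 1D density map (toy stand-in for Eq. 1)."""
    # irfft works on the half spectrum, so Friedel's law is built in.
    return np.fft.irfft(amplitudes * np.exp(1j * phases), n=n_grid)

def fourier_cycle(amplitudes, phases, solvent_mask, n_cycles=20):
    """Toy Fourier cycling with solvent flattening as the only restraint."""
    n_grid = solvent_mask.size
    phases = np.array(phases, dtype=float)
    for _ in range(n_cycles):
        # Reciprocal space -> real space, using the measured amplitudes.
        rho = map_from_phases(amplitudes, phases, n_grid)
        # Real space restraint: set the solvent region to its mean density.
        rho[solvent_mask] = rho[solvent_mask].mean()
        # Real space -> reciprocal space; keep only the modified phases.
        # (Assumption: no weighting or phase recombination is done here.)
        phases = np.angle(np.fft.rfft(rho))
    return phases

def solvent_std(amplitudes, phases, solvent_mask):
    """Spread of the density inside the solvent region (0 means flat)."""
    rho = map_from_phases(amplitudes, phases, solvent_mask.size)
    return rho[solvent_mask].std()

# Synthetic test case: two Gaussian "atoms" and 50% flat solvent.
rng = np.random.default_rng(0)
n_grid = 64
x = np.arange(n_grid)
solvent_mask = x >= n_grid // 2
rho_true = np.exp(-0.5 * ((x - 10) / 2.0) ** 2) \
    + 0.7 * np.exp(-0.5 * ((x - 22) / 2.0) ** 2)
rho_true[solvent_mask] = 0.0

f_true = np.fft.rfft(rho_true)
amplitudes = np.abs(f_true)          # plays the role of measured amplitudes
phase_error = rng.normal(0.0, 0.6, size=f_true.size)
phase_error[[0, -1]] = 0.0           # zero-frequency and Nyquist terms are real
start_phases = np.angle(f_true) + phase_error

refined = fourier_cycle(amplitudes, start_phases, solvent_mask)
print("solvent std before:", solvent_std(amplitudes, start_phases, solvent_mask))
print("solvent std after: ", solvent_std(amplitudes, refined, solvent_mask))
```

In this sketch the measured amplitudes are reimposed at the start of every cycle and the solvent restraint is reimposed on every new map, which mirrors the alternation between real space and reciprocal space in the flow chart. A realistic program would add non-crystallographic symmetry averaging, histogram matching, and weighted phase recombination at the marked steps.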

Convolution is a commonly used mathematical technique that takes as input two functions, say A(x) and B(x). To convolute A(x) with B(x), first take the function A(x) and place it at the origin of the second