Page 167 - Macromolecular Crystallography
pack loosely into the crystalline lattice and are surrounded by layers of solvent. Indeed, protein crystals are approximately half liquid, with the fraction of solvent varying from 25 to 85%. In extreme cases, a protein crystal is not too dissimilar from a glass of good wine.

Recent years have witnessed further progress in the development of the underlying methodology for the determination of 3D macromolecular structures. A particular emphasis is given to high-throughput methodologies as an integral part and one of the specific goals of structural biology.

[Figure: plot of f(x) against x, with a local minimum and the global minimum marked.]
Figure 11.1 The basic problem of an optimization problem with a one-dimensional function having more than one minimum. The slope (gradient) can give useful information concerning which direction to seek the minimum and which point to try next. Extrema are characterized by a gradient of zero, so methods that rely solely on this information will halt also in a local minimum.

In this chapter, high-throughput automation efforts being developed to meet the needs of Structural Genomics initiatives will be cast in the framework of an optimization problem. A general overview will be given on optimization techniques
with a bias specifically towards the problem of crystallographic refinement. This picture will be extended as we brush over model building, program flow control, decision-making, validation, and automation. Finer details of different approaches will be painted in a concluding review of some popular software packages and pipelines.

11.2 Basics of model building and refinement

11.2.1 Introduction to optimization

Optimization is an important field of mathematics with applications covering virtually all areas of science, engineering, technology, transport, business, etc. It is hardly surprising that much effort has been invested in this area and that well-matured techniques and methods exist for solving many kinds of optimization problems. State-of-the-art optimization packages are highly complex and fine-tuned software masterpieces that often contain many ingenious ideas, robust heuristics, and decades of manpower spent on the underlying research and work. To run through the theoretical and algorithmic details of these tools is clearly beyond the scope of this chapter. Instead we walk through some basic ideas and considerations.

Optimization is concerned with finding extrema (minima and maxima) of functions (provided that they have them). The function f(.) that should be optimized is called the objective function or cost function. In the general case, the objective function f(.) will depend on several variables, $x = (x_1, \ldots, x_N)$. The basic issue of an optimization problem is shown in Fig. 11.1. The plot depicts a one-dimensional function f(x) and its dependency on the variable $x = (x_1)$. Optimization theory aims to provide methods for determining the values of the variables x such that the objective function is either maximized or minimized. The values of the variables that optimize f(.) are known as optimal values. An important practical shortcut is not necessarily to obtain the optimal values of the variables but to approach them to a satisfactory accuracy (tolerance) within a reasonable amount of computational time.

Without loss of generality, one can formulate all optimization problems as minimization problems, since maximizing a function g(.) = −f(.) is equivalent to minimizing the function f(.). By definition, f(.) has a minimum at point $x^0 = (x_1^0, \ldots, x_N^0)$ if, and only if, $f(x^0) < f(x)$ for all $x \neq x^0$ over which the function is defined. If the condition $f(x^0) < f(x)$ is valid only within a small neighbourhood of $x^0$, then f(x) is said to have a local minimum at $x^0$.

A crystallographic example of optimization would be the minimization of a least-squares or a negative log-likelihood residual as the objective function, using fractional or orthogonal atomic coordinates as the variables. The values of the variables that optimize this objective function constitute the final crystallographic model. However, due to the