Page 168 - Macromolecular Crystallography
MODEL BUILDING, REFINEMENT, AND VALIDATION 157
complexity of the crystallographic objective function, a macromolecular model never truly optimizes the function, that is, the global minimum is never reached. Instead, the model, although typically very good, is an approximation to what Nature has gathered within the macromolecular sample. This point should always be kept in mind and would indeed still be valid even if the model did represent the global minimum of the target function. This should become clearer further down the text.

Optimization problems in crystallographic structure refinement are seldom convex, that is, very rarely characterized by a unimodal function f(x). Regularization of a two-atom model is an example of such a unimodal function, Fig. 11.2a. In contrast, Fig. 11.2b shows a profile of a function for modelling an amino acid side chain; the peaks correspond to the possible rotamers. In this case, the shape of the function f(x) is called multimodal. Such functions arise naturally in structural macromolecular optimization problems and possess a highly complex multiminima energy landscape that does not lend itself favourably to standard robust optimization techniques.

Typically one has a function at hand that may be described by

f(x) : A → ℝ

in which the domain A, the search space, is a subset of Euclidean space ℝⁿ, often specified by a set of constraints in the form of equalities or inequalities that the solutions must satisfy, thereby reducing the effective dimensionality of the search space. Solutions that satisfy all constraints are known as feasible solutions in the sense that they are plausible under given boundary conditions. Examples of feasible solutions in macromolecular crystallography are cell parameters that obey the space group constraints, or bond distances that agree with known stereochemistry, or torsion angles that fall within an allowed region in the Ramachandran plot (Ramakrishnan and Ramachandran, 1965).

In many crystallographic problems, the choice of the variables x is subject to constraints (boundary conditions represented by equations). The problem is then known as a constrained optimization problem. An example would be the refinement of a macromolecule with constraints on the bond lengths and angles, or TLS (translation-libration-screw) or NCS (non-crystallographic symmetry) refinement (e.g. Tronrud, 2004), where a protein model is split into a number of rigid bodies within which the atomic positions and displacement parameters are constrained. Problems with no limitations on the freedom of the variables are called unconstrained optimization problems.

A special but very important topic is the tightness of the boundary conditions. A (holonomic) constraint is an absolutely tight condition (an equality: a precise setting or sharp probability density function of values), which in effect reduces the number of parameters to be refined. We may, however, wish to formulate the problem where the condition is relatively soft (an inequality: a range of plausible values and a potentially broader probability function). For example, within the above-mentioned constrained refinement case we may want to allow bond lengths to vary slightly within reasonable chemical limits, or for the values of atomic displacement parameters (also known as temperature factors) not to jump too sharply from one atom to the next, or to give a preference to one of the rotamers in Fig. 11.2c if it falls into electron density. The degree to which we would like in the latter example to take the density height into account defines the tightness of this additional condition. Such conditions are called restraints. Mathematically, this does not reduce the number of refined parameters but instead increases the number of observations. It now becomes apparent that a restrained optimization problem may deal with observations that have different physical origins (e.g. measured X-ray intensities combined with stereochemistry), different metrics, and different degrees of variability. Thus the term ‘total number of observations’ becomes ambiguous and, if quoted, should be taken with a pinch of salt, unless precisely defined.

Optimization problems and the computational techniques to tackle them are often classified further depending on the properties of these constraints, the objective function, and the domain itself. Linear Programming deals with cases in which the objective function f(x) is linear and the set A is specified through linear equalities and inequalities. If the variables x can only acquire integer values,