Page 168 - Macromolecular Crystallography

MODEL BUILDING, REFINEMENT, AND VALIDATION  157

complexity of the crystallographic objective function, a macromolecular model never truly optimizes the function, that is, the global minimum is never reached. Instead, the model, although typically very good, is an approximation to what Nature has gathered within the macromolecular sample. This point should always be kept in mind and would indeed still be valid even if the model did represent the global minimum of the target function. This should become clearer further down the text.

Optimization problems in crystallographic structure refinement are seldom convex, that is, very rarely characterized by a unimodal function f(x). Regularization of a two-atom model is an example of such a unimodal function, Fig. 11.2a. In contrast, Fig. 11.2b shows a profile of a function for modelling an amino acid side chain, where the peaks correspond to the possible rotamers. In this case, the shape of the function f(x) is called multimodal. Such functions arise naturally in structural macromolecular optimization problems and possess a highly complex multiminima energy landscape that does not lend itself favourably to standard robust optimization techniques.

Typically one has a function at hand that may be described by

$$f(x) : A \rightarrow \mathbb{R}$$

in which the domain A, the search space, is a subset of Euclidean space $\mathbb{R}^n$, often specified by a set of constraints in the form of equalities or inequalities that these solutions must satisfy, thereby reducing the effective dimensionality of the search space. Solutions that satisfy all constraints are known as feasible solutions in the sense that they are plausible under given boundary conditions. Examples of feasible solutions in macromolecular crystallography are cell parameters that obey the space group restraints, or bond distances that agree with known stereochemistry, or torsion angles that fall within an allowed region in the Ramachandran plot (Ramakrishnan and Ramachandran, 1965).

In many crystallographic problems, the choice of the variables x is subject to constraints (boundary conditions represented by equations). The problem is then known as a constrained optimization problem. An example would be the refinement of a macromolecule with constraints on the bond lengths and angles, or TLS (translation-libration-screw) or NCS (non-crystallographic symmetry) refinement (e.g. Tronrud, 2004), where a protein model is split into a number of rigid bodies within which the atomic positions and displacement parameters are constrained. Problems with no limitations on the freedom of the variables are called unconstrained optimization problems.

A special but very important topic is the tightness of the boundary conditions. A (holonomic) constraint is an absolutely tight condition (an equality: a precise setting or sharp probability density function of values), which in effect reduces the number of parameters to be refined. We may, however, wish to formulate the problem where the condition is relatively soft (an inequality: a range of plausible values and a potentially broader probability function). For example, within the above-mentioned constrained refinement case we may want to allow bond lengths to vary slightly within reasonable chemical limits, or for the values of atomic displacement parameters (also known as temperature factors) not to jump too sharply from one atom to the next, or to give a preference to one of the rotamers in Fig. 11.2c if it falls into electron density. The degree to which we would like, in the latter example, to take the density height into account defines the tightness of this additional condition. Such conditions are called restraints. Mathematically, this does not reduce the number of refined parameters but instead increases the number of observations. It now becomes apparent that a restrained optimization problem may deal with observations that have different physical origins (e.g. measured X-ray intensities combined with stereochemistry), different metrics, and different degrees of variability. Thus the term 'total number of observations' becomes ambiguous and, if quoted, should be taken with a pinch of salt, unless precisely defined.

Optimization problems and the computational techniques to tackle them are often classified further depending on the properties of these constraints, the objective function, and the domain itself. Linear Programming deals with cases in which the objective function f(x) is linear and the set A is specified through linear equalities and inequalities. If the variables x can only acquire integer values,