Page 202 - Macromolecular Crystallography
CHAPTER 13
Electron density fitting and structure validation
Mike Carson
13.1 Introduction

The Human Genome Project went three-dimensional in late 2000. ‘Structural genomics’ efforts will determine the structures of thousands of new proteins over the next decade. These initiatives seek to streamline and automate every experimental and computational aspect of the structural determination pipeline, with most of the steps involved covered in previous chapters of this volume. At the end of the pipeline, an atomic model is built and iteratively refined to best fit the observed data. The final atomic model, after careful analysis, is deposited in the Protein Data Bank, or PDB (Berman et al., 2000). About 25,000 unique protein sequences are currently in the PDB. High-throughput and conventional methods will dramatically increase this number and it is crucial that these new structures be of the highest quality (Chandonia and Brenner, 2006).

This chapter will address software systems to interactively fit molecular models to electron density maps and to analyse the resulting models. This chapter is heavily biased toward proteins, but the programs can also build nucleic acid models. First a brief review of molecular modelling and graphics is presented. Next, the best current and freely available programs are discussed with respect to their performance on common tasks. Finally, some views on the future of such software are given.

13.2 Initial molecular models

Small molecule crystal structures solved through direct methods yield very accurate atomic positions at much higher resolution than is typical for proteins. Currently about 200,000 error-free organic compounds with conventional R factors less than 0.05 are available through the Cambridge Structural Database (CSD) (Allen et al., 1979). These data have been mined for conformational analysis, hydrogen bonding directionality, non-bonded packing interaction, and more, as recently reviewed (Allen and Motherwell, 2002). The CSD provides an invaluable source of coordinate geometry for inhibitors and cofactors, which should be trusted more than the energy minimized output of any modelling program.

A common feature of modelling and refinement programs is a dictionary of ideal residues derived from the results of small-molecule crystallography. Ideal bond lengths and angles for the amino acid and nucleic acid building blocks of macromolecules have been gathered from the CSD (Engh and Huber, 1991). The atomic bond and angle parameters are tightly constrained for macromolecular refinement and may be regarded as fixed, with the only degrees of freedom coming from torsional rotation about single bonds.

The favoured dihedral angles for protein main chains were derived from energy considerations of steric clashes in peptides giving the well known Ramachandran plot (Ramachandran and Sasisekharan, 1968). These phi/psi combinations characterize the elements of secondary structure. Accurate main chain models can be constructed from ‘spare parts’, that is short pieces of helices, sheets, turns, and random coils taken from highly refined structures, provided a series of C-alpha positions can be established from the electron density map
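
The phi/psi torsion angles that underlie the Ramachandran plot can each be computed from the positions of four consecutive backbone atoms. The following is a minimal sketch in plain Python, assuming atoms are supplied as (x, y, z) tuples in ångströms. The helper names `dihedral` and `phi_psi` are illustrative only and are not taken from any program discussed in this chapter; the sign follows the usual right-handed convention, in which a cis arrangement gives 0 degrees and a trans arrangement gives 180 degrees.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    """Dot product of two 3D vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1 = cross(b1, b2)  # normal to the plane through p0, p1, p2
    n2 = cross(b2, b3)  # normal to the plane through p1, p2, p3
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


def phi_psi(c_prev, n, ca, c, n_next):
    """phi = C(i-1)-N-CA-C and psi = N-CA-C-N(i+1) for residue i."""
    return dihedral(c_prev, n, ca, c), dihedral(n, ca, c, n_next)


if __name__ == "__main__":
    # Toy geometry (not real protein coordinates): a planar cis arrangement.
    print(dihedral((1, 0, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0)))
```

Because bond lengths and angles are treated as fixed, a list of such torsion angles is enough to describe a main chain conformation, which is why the calculation needs only atomic positions and no dictionary parameters.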