Page 194 - Macromolecular Crystallography
HIGH-THROUGHPUT DATA COLLECTION AT SYNCHROTRONS 183
of the database record for that crystal. At the time that crystals are loaded into the sample-changing robot carousel, the 2D barcode is also read into the database. This two-barcode system provides redundancy in sample identification. And, since the 2D barcode is part of the mounting hardware and its value is stored in the diffraction image file, this identifier is always associated with the crystal.

12.8 Diffraction data processing

Any high-throughput system for protein crystallography requires efficient processes for converting measured diffraction images into experimental electron density maps and structures. Chapters 6 to 11 and 13 cover the various approaches to data analysis in considerable depth.

Several systems have been developed for automatic protein structure determination via X-ray crystallography, including software packages such as PHENIX (Adams et al., 2002), ELVES (Holton and Alber, 2004), Auto-Rickshaw (Panjikar et al., 2005), and ACrS (Brunzelle et al., 2003). These packages utilize existing individual programs to process the diffraction images and data. Additional programs and scripts analyse interim results, and shepherd the output from one program to another. The modularity of these systems permits rapid incorporation of enhancements in crystallographic software.

Data processing occurs in two stages. Initially, diffraction images are reduced to a tabulation of reflection indices and intensities or, after truncation, structure factors. The second stage involves conversion of the observed structure factors into an experimental electron density map. The choice of how to execute the latter step depends on the method used to determine the phase for each reflection. The software packages enumerated above generally focus on exploitation of anomalous signals to overcome the phase problem for a protein of unknown structure.

Structure-based drug discovery involves determination of many cocrystal structures with different ligands bound to the same target protein. In such cases, where the structure of the protein is well known, automated structure determination procedures rely on molecular replacement to supply the necessary phases, followed by fitting of the ligand into the appropriate electron density feature within the map. The Astex AutoSolve® program and the SGX target-to-lead platform have both adopted this approach. When anomalous signals are used, such as in the SGX FAST (Fragments of Active Structures) fragment-based lead discovery process, they help to differentiate between ligands and to provide information on the orientation of the ligand within the target protein. A summary of the SGX automated data processing system is provided in the high-throughput example in Section 12.11.

12.9 Information technology infrastructure

Modern high-throughput data collection permits examination of a large number of samples and generates enormous quantities of diffraction data. With a CCD detector, data collection rates are typically 1 to 5 MB/sec, although higher rates, up to 24 MB/sec, are currently possible at third-generation synchrotrons (Bakaikoa, 2006). The pixel array detectors currently available provide data at rates of 2 to 12 MB/sec (Hülsen et al., 2006).

Because of these collection rates, high-throughput synchrotron operations are as much an issue of sample and data management as of data collection. For this reason, SGX-CAT operations were included in the information management systems for the structural biology platform at SGX at the time of beamline commissioning. These data systems directly link beamline operations to SGX efforts in drug discovery and structural proteomics.

The SGX Laboratory Information Management System (LIMS) is based on an Oracle database platform. As an enterprise-level relational database, Oracle is robust and can be expanded to meet any future needs. The expense to implement and maintain the database and to create the tools to retrieve the data stored within, including an administrator dedicated to managing the system, can be significant. At SGX, use of the LIMS to monitor all of our scientific activities reduces the incremental cost to the beamline to an acceptable level.

The SGX database tracks every aspect of the preparation of crystalline samples, including gene