Page 194 - Macromolecular Crystallography

HIGH-THROUGHPUT DATA COLLECTION AT SYNCHROTRONS  183

of the database record for that crystal. At the time that crystals are loaded into the sample-changing robot carousel, the 2D barcode is also read into the database. This two bar code system provides redundancy in sample identification. And, since the 2D bar code is part of the mounting hardware and its value is stored in the diffraction image file, this identifier is always associated with the crystal.

12.8 Diffraction data processing

Any high-throughput system for protein crystallography requires efficient processes for converting measured diffraction images into experimental electron density maps and structures. Chapters 6 to 11 and 13 cover the various approaches to data analysis in considerable depth.

Several systems have been developed for automatic protein structure determination via X-ray crystallography, including software packages such as PHENIX (Adams et al., 2002), ELVES (Holton and Alber, 2004), Auto-Rickshaw (Panjikar et al., 2005), and ACrS (Brunzelle et al., 2003). These packages utilize existing individual programs to process the diffraction images and data. Additional programs and scripts analyse interim results, and shepherd the output from one program to another. The modularity of these systems permits rapid incorporation of enhancements in crystallographic software.

Data processing occurs in two stages. Initially, diffraction images are reduced to a tabulation of reflection indices and intensities or, after truncation, structure factors. The second stage involves conversion of the observed structure factors into an experimental electron density map. The choice of how to execute the latter step depends on the method used to determine the phase for each reflection. The software packages enumerated above generally focus on exploitation of anomalous signals to overcome the phase problem for a protein of unknown structure.

Structure-based drug discovery involves determination of many cocrystal structures with different ligands bound to the same target protein. In such cases, where the structure of the protein is well known, automated structure determination procedures rely on molecular replacement to supply the necessary phases, followed by fitting of the ligand into the appropriate electron density feature within the map. The Astex AutoSolve® program and the SGX target to lead platform have both adopted this approach. When anomalous signals are used, such as in the SGX FAST (Fragments of Active Structures) fragment-based lead discovery process, they help to differentiate between ligands and to provide information on the orientation of the ligand within the target protein. A summary of the SGX automated data processing system is provided in the high-throughput example in Section 12.11.

12.9 Information technology infrastructure

Modern high-throughput data collection permits examination of a large number of samples and generates enormous quantities of diffraction data. With a CCD detector, data collection rates are typically 1 to 5 MB/sec, although higher rates, up to 24 MB/sec, are currently possible at third generation synchrotrons (Bakaikoa, 2006). The pixel array detectors currently available provide data at rates of 2 to 12 MB/sec (Hülsen et al., 2006).

Because of these collection rates, high-throughput synchrotron operations are as much an issue of sample and data management as of data collection. For this reason, SGX-CAT operations were included in the information management systems for the structural biology platform at SGX at the time of beamline commissioning. These data systems directly link beamline operations to SGX efforts in drug discovery and structural proteomics.

The SGX Laboratory Information Management System (LIMS) is based on an Oracle database platform. As an enterprise level relational database, Oracle is robust and can be expanded to meet any future needs. The expense to implement and maintain the database and to create the tools to retrieve the data stored within, including an administrator dedicated to managing the system, can be significant. At SGX, use of the LIMS to monitor all of our scientific activities reduces the incremental cost to the beamline to an acceptable level.

The SGX database tracks every aspect of the preparation of crystalline samples, including gene