Page 118 - Accounting Information Systems
P. 118

CHAPT E R 2        Introduction to Transaction Processing  89


                         FI G U R E
                           2-41     HASHING TECHNIQUE WITH POINTER TO RELOCATE THE COLLISION RECORD













                                                                                    15943
                               Key Value Being Sought
                               = 15943
                                                                              Collision Pointer

                                 Hashing
                                 Technique

                                    Prime #/ Key
                                    = 99997/15943 = 6.27215705
                                                                                Relocated Record
                                    Residual Translates to:
                                                             Cylinder 272  Record with Same Hashed
                                                             Surface 15  Address as 15943
                                                             Record # 705









                           The hashing structure has two significant disadvantages. First, this technique does not use storage
                       space efficiently. The storage location chosen for a record is a mathematical function of its primary key
                       value. The algorithm will never select some disk locations because they do not correspond to legitimate
                       key values. As much as one-third of the disk pack may be wasted.
                           The second disadvantage is the reverse of the first. Different record keys may generate the same
                       (or similar) residual, which translates into the same address. This is called a collision because two records
                       cannot be stored at the same location. One solution to this problem is to randomly select a location for
                       the second record and place a pointer to it from the first (the calculated) location. The dark arrow in
                       Figure 2-41 represents this technique.
                           The collision problem slows access to records. Locating a record displaced in this manner involves
                       first calculating its theoretical address, searching that location, and then determining the actual address
                       from the pointer contained in the record at that location. This has an additional implication for Operation
                       7 in Table 2-2—deleting a record from a file. If the first record is deleted from the file, the pointer to the
                       second (collision) record will also be deleted and the address of the second record will be lost. This can
                       be dealt with in two ways: (1) After deleting the first record, the collision record can be physically relo-
                       cated to its calculated address, which is now vacant, or (2) the first record is marked deleted but is left in
                       place to preserve the pointer to the collision record.


                       POINTER STRUCTURE
                       Figure 2-42 presents the pointer structure, which in this example is used to create a linked-list file. This
                       approach stores in a field of one record the address (pointer) of a related record. The pointers provide
   113   114   115   116   117   118   119   120   121   122   123