Page 198 -
P. 198

2011/6/1
                                                                     3:17 Page 161
                               11-ch04-125-186-9780123814791
                                                                                    #37
                         HAN
                                                                4.4 Data Warehouse Implementation  161


                  Example 4.7 Bitmap indexing. In the AllElectronics data warehouse, suppose the dimension item at
                               the top level has four values (representing item types): “home entertainment,” “com-
                               puter,” “phone,” and “security.” Each value (e.g., “computer”) is represented by a bit vector
                               in the item bitmap index table. Suppose that the cube is stored as a relation table with
                               100,000 rows. Because the domain of item consists of four values, the bitmap index table
                               requires four bit vectors (or lists), each with 100,000 bits. Figure 4.15 shows a base (data)
                               table containing the dimensions item and city, and its mapping to bitmap index tables
                               for each of the dimensions.


                               Base table       item bitmap index table   city bitmap index table
                                RID  item  city  RID  H    C    P    S      RID  V    T
                                R1   H    V      R1   1    0    0    0      R1   1    0
                                R2   C    V      R2   0    1    0    0      R2   1    0
                                R3   P    V      R3   0    0    1    0      R3   1    0
                                R4   S    V      R4   0    0    0    1      R4   1    0
                                R5   H    T      R5   1    0    0    0      R5   0    1
                                R6   C    T      R6   0    1    0    0      R6   0    1
                                R7   P    T      R7   0    0    1    0      R7   0    1
                                R8   S    T      R8   0    0    0    1      R8   0    1
                               Note: H for “home entertainment,” C for “computer,” P for “phone,” S for “security,”
                               V for “Vancouver,” T for “Toronto.”

                    Figure 4.15 Indexing OLAP data using bitmap indices.

                                 Bitmap indexing is advantageous compared to hash and tree indices. It is especially
                               useful for low-cardinality domains because comparison, join, and aggregation opera-
                               tions are then reduced to bit arithmetic, which substantially reduces the processing time.
                               Bitmap indexing leads to significant reductions in space and input/output (I/O) since a
                               string of characters can be represented by a single bit. For higher-cardinality domains,
                               the method can be adapted using compression techniques.
                                 The join indexing method gained popularity from its use in relational database query
                               processing. Traditional indexing maps the value in a given column to a list of rows having
                               that value. In contrast, join indexing registers the joinable rows of two relations from a
                               relational database. For example, if two relations R(RID, A) and S(B, SID) join on the
                               attributes A and B, then the join index record contains the pair (RID, SID), where RID
                               and SID are record identifiers from the R and S relations, respectively. Hence, the join
                               index records can identify joinable tuples without performing costly join operations.
                               Join indexing is especially useful for maintaining the relationship between a foreign key 2
                               and its matching primary keys, from the joinable relation.
                                 The star schema model of data warehouses makes join indexing attractive for cross-
                               table search, because the linkage between a fact table and its corresponding dimension
                               tables comprises the fact table’s foreign key and the dimension table’s primary key. Join

                               2 A set of attributes in a relation schema that forms a primary key for another relation schema is called
                               a foreign key.
   193   194   195   196   197   198   199   200   201   202   203