
                         Chapter 4 Data Warehousing and Online Analytical Processing



                         and summarized data, which greatly facilitates data mining. For example, rather than
                         storing the details of each sales transaction, a data warehouse may store a summary
                         of the transactions per item type for each branch or, summarized to a higher level,
                         for each country. The capability of OLAP to provide multiple and dynamic views
                         of summarized data in a data warehouse sets a solid foundation for successful data
                         mining.
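The summarization described above can be sketched in a few lines of Python. This is only an illustration: the records, the field layout (branch, country, item type, amount), and the helper name `summarize` are assumptions made for the example, not part of any warehouse product.

```python
from collections import defaultdict

# Hypothetical detailed sales transactions: (branch, country, item_type, amount).
transactions = [
    ("Vancouver", "Canada", "TV", 800.0),
    ("Vancouver", "Canada", "TV", 1200.0),
    ("Toronto", "Canada", "TV", 500.0),
    ("Chicago", "USA", "computer", 1500.0),
    ("Chicago", "USA", "TV", 700.0),
]


def summarize(rows, key):
    """Sum the amount (field 3) of all rows that map to the same key."""
    totals = defaultdict(float)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)


# Summary of the transactions per item type for each branch.
by_branch = summarize(transactions, lambda r: (r[0], r[2]))

# The same data summarized to a higher level: per item type for each country.
by_country = summarize(transactions, lambda r: (r[1], r[2]))
```

A warehouse that keeps only `by_branch` or `by_country` stores far fewer rows than the transaction log, which is the point the paragraph makes about summarized data.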
                           Moreover, we believe that data mining should be a human-centered process.
                         Rather than asking a data mining system to generate patterns and knowledge
                         automatically, a user will often need to interact with the system to perform exploratory data
                         analysis. OLAP sets a good example for interactive data analysis and provides the
                         necessary preparations for exploratory data mining. Consider the discovery of association
                         patterns, for example. Rather than being limited to mining associations at a primitive
                         (i.e., low) data level among transactions, users should be allowed to specify roll-up
                         operations along any dimension.
                           For example, a user may want to roll up on the item dimension to go from viewing the
                         data for particular TV sets that were purchased to viewing the brands of these TVs (e.g.,
                         SONY or Toshiba). Users may also navigate from the transaction level to the customer or
                         customer-type level in the search for interesting associations. Such an OLAP data mining
                         style is characteristic of multidimensional data mining. In our study of the principles
                         of data mining in this book, we place particular emphasis on multidimensional data
                         mining, that is, on the integration of data mining and OLAP technology.
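To make the roll-up example concrete, the sketch below counts co-occurring pairs first at the item level and then at the brand level. It is a simplified illustration, not a full association mining algorithm: the model names, the `item_to_brand` hierarchy, and the helpers `roll_up` and `pair_counts` are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical concept hierarchy on the item dimension: model -> brand.
item_to_brand = {
    "SONY-TV-A": "SONY",
    "SONY-TV-B": "SONY",
    "Toshiba-TV-X": "Toshiba",
    "HP-Printer-1": "HP",
}

# Hypothetical transactions, each a set of purchased items.
transactions = [
    {"SONY-TV-A", "HP-Printer-1"},
    {"SONY-TV-B", "HP-Printer-1"},
    {"Toshiba-TV-X"},
]


def roll_up(transaction, hierarchy):
    """Replace each item by its parent in the hierarchy (item -> brand)."""
    return {hierarchy[item] for item in transaction}


def pair_counts(transactions):
    """Count how many transactions contain each unordered pair of values."""
    counts = Counter()
    for t in transactions:
        for pair in combinations(sorted(t), 2):
            counts[pair] += 1
    return counts


# Co-occurrence at the primitive (item) level.
item_level = pair_counts(transactions)

# Co-occurrence after rolling up the item dimension to the brand level.
brand_level = pair_counts([roll_up(t, item_to_brand) for t in transactions])
```

In this toy data no pair of individual items appears together more than once, while the brand pair ("HP", "SONY") appears in two transactions, so a pattern that is weak at the primitive level becomes visible after the roll-up.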



                 4.4     Data Warehouse Implementation


                         Data warehouses contain huge volumes of data. OLAP servers demand that decision
                         support queries be answered on the order of seconds. Therefore, it is crucial for data
                         warehouse systems to support highly efficient cube computation techniques, access
                         methods, and query processing techniques. In this section, we present an overview
                         of methods for the efficient implementation of data warehouse systems. Section 4.4.1
                         explores how to compute data cubes efficiently. Section 4.4.2 shows how OLAP data
                         can be indexed, using either bitmap or join indices. Next, we study how OLAP queries
                         are processed (Section 4.4.3). Finally, Section 4.4.4 presents various types of warehouse
                         servers for OLAP processing.



                   4.4.1 Efficient Data Cube Computation: An Overview
                         At the core of multidimensional data analysis is the efficient computation of
                         aggregations across many sets of dimensions. In SQL terms, these aggregations are referred to
                         as group-by’s. Each group-by can be represented by a cuboid, where the set of group-by’s
                         forms a lattice of cuboids defining a data cube. In this subsection, we explore issues
                         relating to the efficient computation of data cubes.
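As a small illustration of how group-by's correspond to cuboids, the sketch below enumerates every subset of a set of dimensions and computes one aggregate table per subset. The dimension names, the `sales` measure, and the function names are assumptions made for the example. The approach is deliberately naive: each cuboid is recomputed from the base rows, which is exactly the cost that efficient cube computation methods try to avoid.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical dimensions and base data; "sales" is the measure.
dimensions = ("city", "item", "year")
rows = [
    {"city": "Vancouver", "item": "TV", "year": 2010, "sales": 100},
    {"city": "Vancouver", "item": "TV", "year": 2011, "sales": 150},
    {"city": "Chicago", "item": "TV", "year": 2010, "sales": 200},
    {"city": "Chicago", "item": "phone", "year": 2011, "sales": 50},
]


def compute_cuboid(rows, group_by):
    """Aggregate total sales for one group-by, i.e., one cuboid."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[d] for d in group_by)
        totals[key] += row["sales"]
    return dict(totals)


def compute_cube(rows, dimensions):
    """Compute every cuboid in the lattice, one per subset of the dimensions."""
    cube = {}
    for k in range(len(dimensions) + 1):
        for group_by in combinations(dimensions, k):
            cube[group_by] = compute_cuboid(rows, group_by)
    return cube


cube = compute_cube(rows, dimensions)
```

With three dimensions and no concept hierarchies, the lattice has 2^3 = 8 cuboids, ranging from the empty group-by `()` (a single grand total) to the full group-by on all three dimensions.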