                                                              1.3 What Kinds of Data Can Be Mined?  11





                     [Diagram: data sources in Chicago, New York, Toronto, and Vancouver pass through a
                     clean / integrate / transform / load / refresh stage into the data warehouse, which
                     clients reach through query and analysis tools.]


                     Figure 1.6 Typical framework of a data warehouse for AllElectronics.



                               or sum(sales amount). A data cube provides a multidimensional view of data and allows
                               the precomputation and fast access of summarized data.

                  Example 1.3 A data cube for AllElectronics. A data cube for summarized sales data of AllElectronics
                               is presented in Figure 1.7(a). The cube has three dimensions: address (with city values
                               Chicago, New York, Toronto, Vancouver), time (with quarter values Q1, Q2, Q3, Q4), and
                               item (with item type values home entertainment, computer, phone, security). The aggregate
                               value stored in each cell of the cube is sales amount (in thousands). For example, the total
                               sales for the first quarter, Q1, for the items related to security systems in Vancouver is
                               $400,000, as stored in cell ⟨Vancouver, Q1, security⟩. Additional cubes may be used to store
                               aggregate sums over each dimension, corresponding to the aggregate values obtained using
                               different SQL group-bys (e.g., the total sales amount per city and quarter, or per city and
                               item, or per quarter and item, or per each individual dimension).
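
The relationship between the base cube and its group-by aggregates can be sketched in a few lines of Python. This is only an illustration, not how a warehouse engine stores cubes: the names `base_cells`, `group_by`, and `all_group_bys` are hypothetical, and every sales figure except the Vancouver/Q1/security cell (400, in thousands) is invented for the example.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("city", "quarter", "item")

# Base cells: (city, quarter, item) -> sales amount in thousands.
# Only the Vancouver/Q1/security value (400) comes from the text;
# every other figure is made up for illustration.
base_cells = {
    ("Vancouver", "Q1", "security"): 400,
    ("Vancouver", "Q1", "computer"): 825,   # hypothetical
    ("Vancouver", "Q2", "security"): 512,   # hypothetical
    ("Chicago", "Q1", "security"): 300,     # hypothetical
    ("Chicago", "Q2", "phone"): 150,        # hypothetical
}

def group_by(cells, keep):
    """Sum sales over every dimension not listed in `keep` (like SQL GROUP BY)."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for key, amount in cells.items():
        totals[tuple(key[i] for i in idx)] += amount
    return dict(totals)

def all_group_bys(cells):
    """Precompute one aggregate table per subset of the dimensions."""
    return {
        keep: group_by(cells, keep)
        for r in range(len(DIMENSIONS) + 1)
        for keep in combinations(DIMENSIONS, r)
    }

aggregates = all_group_bys(base_cells)

# Total sales per city and quarter, answered from the precomputed table.
print(aggregates[("city", "quarter")][("Vancouver", "Q1")])
# Total sales per item, and the grand total over all dimensions.
print(aggregates[("item",)][("security",)])
print(aggregates[()][()])
```

With three dimensions there are eight such group-bys (including the base cube and the grand total), which is why precomputing them trades storage for fast lookups.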
                                 By providing multidimensional data views and the precomputation of summarized
                               data, data warehouse systems can provide inherent support for OLAP. Online analytical
                               processing operations make use of background knowledge regarding the domain of
                               the data being studied to allow the presentation of data at different levels of abstraction.
                               Such operations accommodate different user viewpoints. Examples of OLAP operations
                               include drill-down and roll-up, which allow the user to view the data at differing
                               degrees of summarization, as illustrated in Figure 1.7(b). For instance, we can drill
                               down on sales data summarized by quarter to see data summarized by month. Similarly,
                               we can roll up on sales data summarized by city to view data summarized by
                               country.
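
Roll-up and drill-down can be sketched as re-aggregation along a concept hierarchy, which plays the role of the background knowledge mentioned above. The following is a minimal Python sketch under stated assumptions: the function names `roll_up` and `drill_down`, the month-level table `monthly_sales`, and all sales figures are hypothetical; only the city-to-country and month-to-quarter mappings are real-world facts.

```python
from collections import defaultdict

# Concept hierarchies (the background knowledge): city -> country, month -> quarter.
CITY_TO_COUNTRY = {
    "Chicago": "USA", "New York": "USA",
    "Toronto": "Canada", "Vancouver": "Canada",
}
MONTH_TO_QUARTER = {
    "Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
    "Apr": "Q2", "May": "Q2", "Jun": "Q2",
}

# Hypothetical detailed sales: (city, month) -> amount in thousands.
monthly_sales = {
    ("Vancouver", "Jan"): 120,
    ("Vancouver", "Feb"): 130,
    ("Vancouver", "Mar"): 150,
    ("Toronto", "Jan"): 200,
    ("Chicago", "Apr"): 90,
}

def roll_up(cells, city_map=None, time_map=None):
    """Re-aggregate cells at a coarser level along either dimension."""
    totals = defaultdict(int)
    for (city, period), amount in cells.items():
        key = (city_map[city] if city_map else city,
               time_map[period] if time_map else period)
        totals[key] += amount
    return dict(totals)

def drill_down(cells, city, quarter):
    """Return the month-level cells behind one (city, quarter) summary."""
    return {k: v for k, v in cells.items()
            if k[0] == city and MONTH_TO_QUARTER[k[1]] == quarter}

# Roll up months to quarters, then cities to countries as well.
by_city_quarter = roll_up(monthly_sales, time_map=MONTH_TO_QUARTER)
by_country_quarter = roll_up(monthly_sales, CITY_TO_COUNTRY, MONTH_TO_QUARTER)

# Drill down from the Vancouver/Q1 summary to its monthly detail.
detail = drill_down(monthly_sales, "Vancouver", "Q1")
```

Here `by_city_quarter` holds the quarterly view, `by_country_quarter` the view rolled up to country, and `detail` the three monthly cells that sum to the Vancouver/Q1 total. A real OLAP server would usually answer these from precomputed aggregates rather than rescanning detailed rows.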
                                 Although data warehouse tools help support data analysis, additional tools for
                               data mining are often needed for in-depth analysis. Multidimensional data mining
                               (also called exploratory multidimensional data mining) performs data mining in