                                                              1.3 What Kinds of Data Can Be Mined?  13


                               multidimensional space in an OLAP style. That is, it allows the exploration of mul-
                               tiple combinations of dimensions at varying levels of granularity in data mining,
                               and thus has greater potential for discovering interesting patterns representing knowl-
                               edge. An overview of data warehouse and OLAP technology is provided in Chapter 4.
                               Advanced issues regarding data cube computation and multidimensional data mining
                               are discussed in Chapter 5.


                         1.3.3 Transactional Data
                               In general, each record in a transactional database captures a transaction, such as a
                               customer’s purchase, a flight booking, or a user’s clicks on a web page. A transaction typ-
                               ically includes a unique transaction identity number (trans_ID) and a list of the items
                               making up the transaction, such as the items purchased in the transaction. A trans-
                               actional database may have additional tables, which contain other information related
                               to the transactions, such as item description, information about the salesperson or the
                               branch, and so on.

                  Example 1.4 A transactional database for AllElectronics. Transactions can be stored in a table, with
                               one record per transaction. A fragment of a transactional database for AllElectronics is
                               shown in Figure 1.8. From the relational database point of view, the sales table in the
                               figure is a nested relation because the attribute list_of_item_IDs contains a set of items.
                               Because most relational database systems do not support nested relational structures,
                               the transactional database is usually either stored in a flat file in a format similar to
                               the table in Figure 1.8 or unfolded into a standard relation in a format similar to the
                               items_sold table in Figure 1.5.
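The two storage options just described can be illustrated with a minimal Python sketch. It uses only the two transactions visible in Figure 1.8. The function names `unfold` and `to_flat_lines`, the two-column (trans_ID, item_ID) layout of the unfolded relation, and the tab-separated flat-file format are assumptions made for illustration; the exact columns of the table in Figure 1.5 are not reproduced here.

```python
# Nested form: one record per transaction, whose second attribute is a set of items.
# The values are the two rows visible in Figure 1.8.
transactions = {
    "T100": ["I1", "I3", "I8", "I16"],
    "T200": ["I2", "I8"],
}


def unfold(nested):
    """Unfold {trans_ID: [item_ID, ...]} into flat (trans_ID, item_ID) rows.

    Each row holds only atomic values, so the result fits a standard relation.
    The two-column layout is an assumption for this sketch.
    """
    return [
        (trans_id, item_id)
        for trans_id, item_ids in nested.items()
        for item_id in item_ids
    ]


def to_flat_lines(nested):
    """Render one text line per transaction, as it might appear in a flat file.

    The format (trans_ID, a tab, then comma-separated item IDs) is hypothetical.
    """
    return [
        f"{trans_id}\t{', '.join(item_ids)}"
        for trans_id, item_ids in nested.items()
    ]


rows = unfold(transactions)
lines = to_flat_lines(transactions)
```

Here `rows` holds one tuple per purchased item, so transaction T100 contributes four rows and T200 contributes two, while `lines` keeps one line per transaction. The unfolded form is easier to query with ordinary relational operators; the flat-file form stays closer to the layout of Figure 1.8.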

                                 As an analyst of AllElectronics, you may ask, “Which items sold well together?” This
                               kind of market basket data analysis would enable you to bundle groups of items together
                               as a strategy for boosting sales. For example, given the knowledge that printers are
                               commonly purchased together with computers, you could offer certain printers at a
                               steep discount (or even for free) to customers buying selected computers, in the hopes
                               of selling more computers (which are often more expensive than printers). A tradi-
                               tional database system is not able to perform market basket data analysis. Fortunately,
                               data mining on transactional data can do so by mining frequent itemsets, that is, sets




                                 trans_ID    list_of_item_IDs
                                 T100        I1, I3, I8, I16
                                 T200        I2, I8
                                 ...         ...

                     Figure 1.8 Fragment of a transactional database for sales at AllElectronics.