

                                 lattice. Iceberg cubes and shell fragments are examples of partial materialization. An
                                 iceberg cube is a data cube that stores only those cube cells that have an aggregate
                                 value (e.g., count) above some minimum support threshold. For shell fragments of
                                 a data cube, only some cuboids involving a small number of dimensions are com-
                                 puted, and queries on additional combinations of the dimensions can be computed
                                 on-the-fly.
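
To make the iceberg condition concrete, the following is a minimal brute-force sketch in Python, not an efficient cube algorithm. It assumes the measure is count and that a cell qualifies when its count is at least `min_sup`; the function name `iceberg_cube`, the `"*"` marker for an aggregated dimension, and the sample rows are illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def iceberg_cube(rows, min_sup):
    """Return {cell: count} for every cube cell whose count is >= min_sup.

    A cell is a tuple with one entry per dimension; "*" marks a dimension
    that has been aggregated away. Assumption: the threshold is inclusive.
    """
    if not rows:
        return {}
    n_dims = len(rows[0])
    counts = Counter()
    for row in rows:
        # Each base row contributes to one cell per subset of kept dimensions.
        for k in range(n_dims + 1):
            for kept in combinations(range(n_dims), k):
                cell = tuple(row[i] if i in kept else "*" for i in range(n_dims))
                counts[cell] += 1
    # Keep only the cells that satisfy the iceberg condition.
    return {cell: c for cell, c in counts.items() if c >= min_sup}


rows = [("a1", "b1"), ("a1", "b2"), ("a2", "b1")]
cube = iceberg_cube(rows, min_sup=2)
```

With these three rows and a threshold of 2, only the apex cell and the two cells that group on `a1` or on `b1` survive; every base cell has count 1 and is dropped. The sketch enumerates all 2^n generalizations of each row, so it is only practical for a handful of dimensions.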
                                 There are several efficient data cube computation methods. In this chapter, we dis-
                                 cussed four cube computation methods in detail: (1) MultiWay array aggregation for
                                 materializing full data cubes in sparse-array-based, bottom-up, shared computation;
                                 (2) BUC for computing iceberg cubes by exploring ordering and sorting for efficient
                                 top-down computation; (3) Star-Cubing for computing iceberg cubes by integrating
                                 top-down and bottom-up computation using a star-tree structure; and (4) shell-
                                 fragment cubing, which supports high-dimensional OLAP by precomputing only
                                 the partitioned cube shell fragments.
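
The pruning idea behind BUC can be sketched as a short recursion. This is a simplified illustration under stated assumptions, not the published algorithm: it partitions rows with a dictionary instead of sorting, uses count as the measure with an inclusive threshold, and the names `buc`, `start`, and `cell` are hypothetical.

```python
def buc(rows, min_sup, n_dims, start=0, cell=None, out=None):
    """BUC-style recursion: expand a cell only if its partition meets min_sup.

    rows    -- list of tuples, one value per dimension
    start   -- first dimension that may still be specialized
    cell    -- current cell as a list; "*" marks an aggregated dimension
    out     -- dict collecting {cell: count}
    """
    if cell is None:
        cell = ["*"] * n_dims
    if out is None:
        out = {}
    if len(rows) < min_sup:
        # Apriori-style pruning: no more specific cell can reach the threshold.
        return out
    out[tuple(cell)] = len(rows)
    for d in range(start, n_dims):
        # Partition the current rows by their value on dimension d.
        partitions = {}
        for row in rows:
            partitions.setdefault(row[d], []).append(row)
        for value, part in partitions.items():
            if len(part) >= min_sup:
                cell[d] = value
                buc(part, min_sup, n_dims, d + 1, cell, out)
        cell[d] = "*"  # restore before moving to the next dimension
    return out


rows = [("a1", "b1"), ("a1", "b2"), ("a2", "b1")]
result = buc(rows, min_sup=2, n_dims=2)
```

Because a partition that falls below the threshold is never expanded, none of the cells beneath it in the lattice are visited, which is where the savings over full enumeration come from.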
                                 Multidimensional data mining in cube space is the integration of knowledge discov-
                                 ery with multidimensional data cubes. It facilitates systematic and focused knowledge
                                 discovery in large structured and semi-structured data sets. It will continue to give
                                 analysts tremendous flexibility and power for multidimensional, multigranularity
                                 exploratory analysis. This remains a vast open area in which researchers can build
                                 powerful and sophisticated data mining mechanisms.
                                 Techniques for processing advanced queries have been proposed that take advantage
                                 of cube technology. These include sampling cubes for multidimensional analysis on
                                 sample data, and ranking cubes for efficient top-k (ranking) query processing in
                                 large relational data sets.
                                 This chapter highlighted three approaches to multidimensional data analysis with
                                 data cubes. Prediction cubes compute prediction models in multidimensional
                                 cube space. They help users identify interesting data subsets at varying degrees of
                                 granularity for effective prediction. Multifeature cubes compute complex queries
                                 involving multiple dependent aggregates at multiple granularities. Exception-based,
                                 discovery-driven exploration of cube space displays visual cues to indicate discov-
                                 ered data exceptions at all aggregation levels, thereby guiding the user in the data
                                 analysis process.


                       5.6     Exercises



                           5.1 Assume that a 10-D base cuboid contains only three base cells: (1) $(a_1, d_2, d_3, d_4, \ldots, d_9, d_{10})$,
                               (2) $(d_1, b_2, d_3, d_4, \ldots, d_9, d_{10})$, and (3) $(d_1, d_2, c_3, d_4, \ldots, d_9, d_{10})$, where $a_1 \neq d_1$,
                               $b_2 \neq d_2$, and $c_3 \neq d_3$. The measure of the cube is count().

                               (a) How many nonempty cuboids will a full data cube contain?
                              (b) How many nonempty aggregate (i.e., nonbase) cells will a full cube contain?