Page 273 -
P. 273

12-ch05-187-242-9780123814791
                         HAN

          236   Chapter 5 Data Cube Technology              2011/6/1  3:19 Page 236  #50



                         (c) How many nonempty aggregate cells will an iceberg cube contain if the condition of
                            the iceberg cube is “count ≥ 2”?
                        (d) A cell, c, is a closed cell if there exists no cell, d, such that d is a specialization of
                            cell c (i.e., d is obtained by replacing a ∗ in c by a non-∗ value) and d has the same
                            measure value as c. A closed cube is a data cube consisting of only closed cells. How
                            many closed cells are in the full cube?
                     5.2 There are several typical cube computation methods, such as MultiWay [ZDN97], BUC
                         [BR99], and Star-Cubing [XHLW03]. Briefly describe these three methods (i.e., use one
                         or two lines to outline the key points), and compare their feasibility and performance
                         under the following conditions:
                         (a) Computing a dense full cube of low dimensionality (e.g., less than eight
                            dimensions).
                        (b) Computing an iceberg cube of around 10 dimensions with a highly skewed data
                            distribution.
                         (c) Computing a sparse iceberg cube of high dimensionality (e.g., over 100 dimensions).
                     5.3 Suppose a data cube, C, has D dimensions, and the base cuboid contains k distinct
                         tuples.
                         (a) Present a formula to calculate the minimum number of cells that the cube, C, may
                            contain.
                        (b) Present a formula to calculate the maximum number of cells that C may contain.
                         (c) Answer parts (a) and (b) as if the count in each cube cell must be no less than a
                            threshold, v.
                        (d) Answer parts (a) and (b) as if only closed cells are considered (with the minimum
                            count threshold, v).

                     5.4 Suppose that a base cuboid has three dimensions, A, B, C, with the following number
                         of cells: |A| = 1,000,000, |B| = 100, and |C| = 1000. Suppose that each dimension is
                         evenly partitioned into 10 portions for chunking.
                         (a) Assuming each dimension has only one level, draw the complete lattice of the cube.
                        (b) If each cube cell stores one measure with four bytes, what is the total size of the
                            computed cube if the cube is dense?
                         (c) State the order for computing the chunks in the cube that requires the least amount
                            of space, and compute the total amount of main memory space required for
                            computing the 2-D planes.
                     5.5 Often, the aggregate count value of many cells in a large data cuboid is zero, resulting in
                         a huge, yet sparse, multidimensional matrix.
                         (a) Design an implementation method that can elegantly overcome this sparse matrix
                            problem. Note that you need to explain your data structures in detail and discuss the
                            space needed, as well as how to retrieve data from your structures.
   268   269   270   271   272   273   274   275   276   277   278