Page 273 -
P. 273
12-ch05-187-242-9780123814791
HAN
236 Chapter 5 Data Cube Technology 2011/6/1 3:19 Page 236 #50
(c) How many nonempty aggregate cells will an iceberg cube contain if the condition of
the iceberg cube is “count ≥ 2”?
(d) A cell, c, is a closed cell if there exists no cell, d, such that d is a specialization of
cell c (i.e., d is obtained by replacing a ∗ in c by a non-∗ value) and d has the same
measure value as c. A closed cube is a data cube consisting of only closed cells. How
many closed cells are in the full cube?
5.2 There are several typical cube computation methods, such as MultiWay [ZDN97], BUC
[BR99], and Star-Cubing [XHLW03]. Briefly describe these three methods (i.e., use one
or two lines to outline the key points), and compare their feasibility and performance
under the following conditions:
(a) Computing a dense full cube of low dimensionality (e.g., less than eight
dimensions).
(b) Computing an iceberg cube of around 10 dimensions with a highly skewed data
distribution.
(c) Computing a sparse iceberg cube of high dimensionality (e.g., over 100 dimensions).
5.3 Suppose a data cube, C, has D dimensions, and the base cuboid contains k distinct
tuples.
(a) Present a formula to calculate the minimum number of cells that the cube, C, may
contain.
(b) Present a formula to calculate the maximum number of cells that C may contain.
(c) Answer parts (a) and (b) as if the count in each cube cell must be no less than a
threshold, v.
(d) Answer parts (a) and (b) as if only closed cells are considered (with the minimum
count threshold, v).
5.4 Suppose that a base cuboid has three dimensions, A, B, C, with the following number
of cells: |A| = 1,000,000, |B| = 100, and |C| = 1000. Suppose that each dimension is
evenly partitioned into 10 portions for chunking.
(a) Assuming each dimension has only one level, draw the complete lattice of the cube.
(b) If each cube cell stores one measure with four bytes, what is the total size of the
computed cube if the cube is dense?
(c) State the order for computing the chunks in the cube that requires the least amount
of space, and compute the total amount of main memory space required for
computing the 2-D planes.
5.5 Often, the aggregate count value of many cells in a large data cuboid is zero, resulting in
a huge, yet sparse, multidimensional matrix.
(a) Design an implementation method that can elegantly overcome this sparse matrix
problem. Note that you need to explain your data structures in detail and discuss the
space needed, as well as how to retrieve data from your structures.