Page 229 -

P. 229

HAN 12-ch05-187-242-9780123814791

192 Chapter 5 Data Cube Technology 2011/6/1 3:19 Page 192 #6

2
(a 1 , a , *, ..., *) : 20

(a , a , a , ..., a 100 ) : 10 (a , a , b , ..., b 100 ) : 10
2
2
1
3
3
1
Figure 5.2 Three closed cells forming the lattice of a closed cube.
To systematically compress a data cube, we need to introduce the concept of closed
coverage. A cell, c, is a closed cell if there exists no cell, d, such that d is a special-
ization (descendant) of cell c (i.e., where d is obtained by replacing ∗ in c with a
non-∗ value), and d has the same measure value as c. A closed cube is a data cube
consisting of only closed cells. For example, the three cells derived in the preced-
ing paragraph are the three closed cells of the data cube for the data set {(a 1 , a 2 ,
a 3 ,..., a 100 ) : 10, (a 1 , a 2 , b 3 ,..., b 100 ) : 10}. They form the lattice of a closed cube as
shown in Figure 5.2. Other nonclosed cells can be derived from their corresponding
closed cells in this lattice. For example, “(a 1 , ∗ , ∗ ,..., ∗) : 20” can be derived from
“(a 1 , a 2 , ∗ ,..., ∗) : 20” because the former is a generalized nonclosed cell of the latter.
Similarly, we have “(a 1 , a 2 , b 3 , ∗ ,..., ∗) : 10.”
Another strategy for partial materialization is to precompute only the cuboids involv-
ing a small number of dimensions such as three to ﬁve. These cuboids form a cube shell
for the corresponding data cube. Queries on additional combinations of the dimensions
will have to be computed on-the-ﬂy. For example, we could compute all cuboids with
three dimensions or less in an n-dimensional data cube, resulting in a cube shell of size 3.
This, however, can still result in a large number of cuboids to compute, particularly when
n is large. Alternatively, we can choose to precompute only portions or fragments of the
cube shell based on cuboids of interest. Section 5.2.4 discusses a method for computing
shell fragments and explores how they can be used for efﬁcient OLAP query processing.

5.1.2 General Strategies for Data Cube Computation
There are several methods for efﬁcient data cube computation, based on the vari-
ous kinds of cubes described in Section 5.1.1. In general, there are two basic data
structures used for storing cuboids. The implementation of relational OLAP (ROLAP)
uses relational tables, whereas multidimensional arrays are used in multidimensional
OLAP (MOLAP). Although ROLAP and MOLAP may each explore different cube
computation techniques, some optimization “tricks” can be shared among the different

224 225 226 227 228 229 230 231 232 233 234