Optimization Technique 4: The Apriori pruning method can be exploited to compute iceberg cubes efficiently. The Apriori property,³ in the context of data cubes, can be stated as follows: If a given cell does not satisfy minimum support, then no descendant of the cell (i.e., a more specialized cell) will satisfy minimum support either. This property can be used to substantially reduce the computation of iceberg cubes.
   Recall that the specification of iceberg cubes contains an iceberg condition, which is a constraint on the cells to be materialized. A common iceberg condition is that the cells must satisfy a minimum support threshold, such as a minimum count or sum. In this situation, the Apriori property can be used to prune away the exploration of a cell's descendants. For example, if the count of a cell, c, in a cuboid is less than a minimum support threshold, v, then the count of any of c's descendant cells in the lower-level cuboids can never be greater than or equal to v; those descendant cells can therefore be pruned.
   In other words, if a condition (e.g., the iceberg condition specified in the having clause) is violated for some cell c, then every descendant of c will also violate that condition. Measures that obey this property are known as antimonotonic.⁴ This form of pruning was made popular in frequent pattern mining, yet also aids in data cube computation by cutting processing time and disk space requirements. It can lead to a more focused analysis because cells that cannot pass the threshold are unlikely to be of interest.
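   To make the pruning concrete, the following is a minimal Python sketch of count-based Apriori pruning during a bottom-up, BUC-style expansion of a toy three-dimensional fact table. The table contents, the MIN_SUP threshold, and the expand() helper are illustrative assumptions only, not the implementation of any particular method discussed in this chapter.

    # Toy fact table: each tuple holds values for three dimensions (A, B, C).
    rows = [
        ("a1", "b1", "c1"), ("a1", "b1", "c2"),
        ("a1", "b2", "c1"), ("a2", "b1", "c1"),
    ]
    MIN_SUP = 2      # iceberg condition: count(*) >= 2 (illustrative threshold)
    ALL = "*"        # '*' marks a dimension that has been aggregated away

    def expand(rows, cell, dim, results):
        # A cell is a tuple such as ("a1", "*", "*"). Because count is
        # antimonotonic, a cell whose count falls below MIN_SUP is not
        # specialized further: none of its descendants can reach the threshold.
        if len(rows) < MIN_SUP:              # Apriori pruning happens here
            return
        results[cell] = len(rows)            # materialize the qualifying cell
        for d in range(dim, len(cell)):      # specialize each remaining dimension
            groups = {}
            for row in rows:
                groups.setdefault(row[d], []).append(row)
            for value, group in groups.items():
                child = cell[:d] + (value,) + cell[d + 1:]
                expand(group, child, d + 1, results)

    results = {}
    expand(rows, (ALL, ALL, ALL), 0, results)
    for cell, count in sorted(results.items()):
        print(cell, count)   # only cells satisfying the iceberg condition appear

   On this toy table, only the seven cells whose count reaches 2 are materialized; descendants of failing cells such as ("a2", "*", "*") are never examined at all, which is precisely the saving that the Apriori property provides.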

   In the following sections, we introduce several popular methods for efficient cube computation that exploit these optimization strategies.


                 5.2     Data Cube Computation Methods


                         Data cube computation is an essential task in data warehouse implementation. The pre-
                         computation of all or part of a data cube can greatly reduce the response time and
                         enhance the performance of online analytical processing. However, such computation
                         is challenging because it may require substantial computational time and storage
                         space. This section explores efficient methods for data cube computation. Section 5.2.1
                         describes the multiway array aggregation (MultiWay) method for computing full cubes.
                         Section 5.2.2 describes a method known as BUC, which computes iceberg cubes from
                         the apex cuboid downward. Section 5.2.3 describes the Star-Cubing method, which
                         integrates top-down and bottom-up computation.
                           Finally, Section 5.2.4 describes a shell-fragment cubing approach that computes shell
                         fragments for efficient high-dimensional OLAP. To simplify our discussion, we exclude


³ The Apriori property was proposed in the Apriori algorithm for association rule mining by Agrawal and Srikant [AS94b]. Many algorithms in association rule mining have adopted this property (see Chapter 6).
⁴ Antimonotone is based on condition violation. This differs from monotone, which is based on condition satisfaction.