Page 254 -

P. 254

12-ch05-187-242-9780123814791
2011/6/1
HAN
3:19 Page 217
#31
5.2 Data Cube Computation Methods 217

for tuple count(), or by fetching item count() and sum() from the ID measure array to
compute average()) for the ﬁnal set of cells.

Example 5.12 Point query. Suppose a user wants to compute the point query ha 2 , b 1 , c 1 , d 1 , ∗: count()?i
for our database in Table 5.4 and that the shell fragments for the partitions (A, B, C)
and (D, E) are precomputed as described in Example 5.10. The query is broken down
into two subqueries based on the precomputed fragments: ha 2 , b 1 , c 1 , ∗ , ∗i and h∗, ∗ ,
∗ , d 1 , ∗i. The best-ﬁt precomputed shell fragments for the two subqueries are ABC and
D. The fetch of the TID lists for the two subqueries returns two lists: {4, 5} and {1, 3,
4, 5}. Their intersection is the list {4, 5}, which is of size 2. Thus, the ﬁnal answer is
count() = 2.

“How can we use shell fragments to answer a subcube query?” A subcube query returns
a local data cube based on the instantiated and inquired dimensions. Such a data cube
needs to be aggregated in a multidimensional way so that online analytical processing
(drilling, dicing, pivoting, etc.) can be made available to users for ﬂexible manipulation
and analysis. Because instantiated dimensions usually provide highly selective constants
that dramatically reduce the size of the valid TID lists, we should make maximal use of
the precomputed shell fragments by ﬁnding the fragments that best ﬁt the set of instan-
tiated dimensions, and fetching and intersecting the associated TID lists to derive the
reduced TID list. This list can then be used to intersect the best-ﬁtting shell fragments
consisting of the inquired dimensions. This will generate the relevant and inquired base
cuboid, which can then be used to compute the relevant subcube on-the-ﬂy using an
efﬁcient online cubing algorithm.
Let the subcube query be of the form hα i , α j , A k ?, α p , A q ? : M?i, where α i , α j , and
α p represent a set of instantiated values of dimension A i , A j , and A p , respectively, and A k
and A q represent two inquired dimensions. First, we check the shell fragment schema
to determine which dimensions among (1) A i , A j , and A p , and (2) A k and A q are in
the same fragment partition. Suppose A i and A j belong to the same fragment, as do A k
and A q , but that A p is in a different fragment. We fetch the corresponding TID lists in
the precomputed 2-D fragment for A i and A j using the instantiations α i and α j , then
fetch the TID list on the precomputed 1-D fragment for A p using instantiation α p , and
then fetch the TID lists on the precomputed 2-D fragments for A k and A q , respectively,
using no instantiations (i.e., all possible values). The obtained TID lists are intersected
to derive the ﬁnal TID lists, which are used to fetch the corresponding measures from
the ID measure array to derive the “base cuboid” of a 2-D subcube for two dimensions
(A k , A q ). A fast cube computation algorithm can be applied to compute this 2-D cube
based on the derived base cuboid. The computed 2-D cube is then ready for OLAP
operations.

Example 5.13 Subcube query. Suppose that a user wants to compute the subcube query, ha 2 , b 1 , ?, ∗
, ? : count()?i, for our database shown earlier in Table 5.4, and that the shell fragments
have been precomputed as described in Example 5.10. The query can be broken into
three best-ﬁt fragments according to the instantiated and inquired dimensions: AB, C,

249 250 251 252 253 254 255 256 257 258 259