184 Chapter 4 Data Warehousing and Online Analytical Processing
4.16 A data cube, C, has n dimensions, and each dimension has exactly p distinct values
in the base cuboid. Assume that there are no concept hierarchies associated with the
dimensions.
(a) What is the maximum number of cells possible in the base cuboid?
(b) What is the minimum number of cells possible in the base cuboid?
(c) What is the maximum number of cells possible (including both base cells and
aggregate cells) in the data cube, C?
(d) What is the minimum number of cells possible in C?
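For small n and p, the counts asked about in Exercise 4.16 can be explored by brute force. The following is a minimal sketch, not a solution taken from the text: the function names, the "*" marker for an aggregated dimension, and the two sample base cuboids (a fully dense one and a "diagonal" one in which every dimension still shows all p values) are assumptions made for illustration.

```python
from itertools import product

STAR = "*"  # hypothetical marker for a dimension generalized to "all"


def count_cube_cells(base_cells):
    """Count the distinct cells (base plus aggregate) generated by base_cells."""
    cells = set()
    for cell in base_cells:
        # Each dimension is either kept as-is or generalized to STAR.
        for mask in product((False, True), repeat=len(cell)):
            cells.add(tuple(STAR if m else v for v, m in zip(cell, mask)))
    return len(cells)


def dense_base(n, p):
    """Base cuboid in which every combination of values is present."""
    return list(product(range(p), repeat=n))


def diagonal_base(n, p):
    """Sparse base cuboid (assumed layout): value v appears only in cell (v, ..., v)."""
    return [(v,) * n for v in range(p)]


if __name__ == "__main__":
    n, p = 3, 2
    print(len(dense_base(n, p)), count_cube_cells(dense_base(n, p)))
    print(len(diagonal_base(n, p)), count_cube_cells(diagonal_base(n, p)))
```

Running the sketch for several small values of n and p, and comparing the printed counts against a conjectured closed form, is one way to check an answer before arguing it in general.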
4.17 What are the differences between the three main types of data warehouse usage:
information processing, analytical processing, and data mining? Discuss the motivation
behind OLAP mining (OLAM).
4.8 Bibliographic Notes
There are a good number of introductory-level textbooks on data warehousing and
OLAP technology—for example, Kimball, Ross, Thornthwaite, et al. [KRTM08];
Imhoff, Galemmo, and Geiger [IGG03]; and Inmon [Inm96]. Chaudhuri and Dayal
[CD97] provide an early overview of data warehousing and OLAP technology. A set of
research papers on materialized views and data warehouse implementations was collected
in Materialized Views: Techniques, Implementations, and Applications by Gupta
and Mumick [GM99].
The history of decision support systems can be traced back to the 1960s. However,
the proposal to construct large data warehouses for multidimensional data analysis is
credited to Codd [CCS93], who coined the term OLAP for online analytical processing.
The OLAP Council was established in 1995. Widom [Wid95] identified several research
problems in data warehousing. Kimball and Ross [KR02] provide an overview of the
deficiencies of SQL regarding the ability to support comparisons that are common in the
business world, and present a good set of application cases that require data warehousing
and OLAP technology. For an overview of OLAP systems versus statistical databases, see
Shoshani [Sho97].
Gray et al. [GCB+97] proposed the data cube as a relational aggregation operator
generalizing group-by, crosstabs, and subtotals. Harinarayan, Rajaraman, and Ullman
[HRU96] proposed a greedy algorithm for the partial materialization of cuboids in the
computation of a data cube. Data cube computation methods have been investigated by
numerous studies such as Sarawagi and Stonebraker [SS94]; Agarwal et al. [AAD+96];
Zhao, Deshpande, and Naughton [ZDN97]; Ross and Srivastava [RS97]; Beyer and
Ramakrishnan [BR99]; Han, Pei, Dong, and Wang [HPDW01]; and Xin, Han, Li, and
Wah [XHLW03]. These methods are discussed in depth in Chapter 5.
The concept of iceberg queries was first introduced in Fang, Shivakumar,
Garcia-Molina, et al. [FSGM+98]. The use of join indices to speed up relational
query processing
was proposed by Valduriez [Val87]. O’Neil and Graefe [OG95] proposed a bitmapped