1.3 What Kinds of Data Can Be Mined? 11
[Diagram: data sources in Chicago, New York, Toronto, and Vancouver pass through clean, integrate, transform, load, and refresh steps into the data warehouse, which clients access through query and analysis tools.]
Figure 1.6 Typical framework of a data warehouse for AllElectronics.
or sum(sales amount). A data cube provides a multidimensional view of data and allows
the precomputation and fast access of summarized data.
Example 1.3 A data cube for AllElectronics. A data cube for summarized sales data of AllElectronics
is presented in Figure 1.7(a). The cube has three dimensions: address (with city values
Chicago, New York, Toronto, Vancouver), time (with quarter values Q1, Q2, Q3, Q4), and
item (with item type values home entertainment, computer, phone, security). The aggregate
value stored in each cell of the cube is sales amount (in thousands). For example, the total
sales for the first quarter, Q1, for the items related to security systems in Vancouver is
$400,000, as stored in cell ⟨Vancouver, Q1, security⟩. Additional cubes may be used to store
aggregate sums over each dimension, corresponding to the aggregate values obtained using
different SQL group-bys (e.g., the total sales amount per city and quarter, or per city and
item, or per quarter and item, or per each individual dimension).
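
The precomputation described in Example 1.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how a warehouse engine stores cubes: the names DIMENSIONS, base_cells, group_by, and all_cuboids are hypothetical, and every cell value except the $400,000 for ⟨Vancouver, Q1, security⟩ is invented for the sketch.

```python
from collections import defaultdict
from itertools import combinations

DIMENSIONS = ("city", "quarter", "item")

# Base cube: (city, quarter, item) -> sales amount in thousands.
# Only the (Vancouver, Q1, security) cell comes from the text;
# the other values are invented for illustration.
base_cells = {
    ("Vancouver", "Q1", "security"): 400,
    ("Vancouver", "Q1", "computer"): 825,
    ("Vancouver", "Q2", "security"): 512,
    ("Toronto", "Q1", "security"): 350,
    ("Toronto", "Q2", "computer"): 610,
}


def group_by(cells, keep):
    """Sum sales over every dimension that is not named in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for key, amount in cells.items():
        totals[tuple(key[i] for i in idx)] += amount
    return dict(totals)


def all_cuboids(cells):
    """Precompute one aggregate table per subset of the dimensions,
    mirroring the different SQL group-bys mentioned in the text."""
    return {
        keep: group_by(cells, keep)
        for r in range(len(DIMENSIONS) + 1)
        for keep in combinations(DIMENSIONS, r)
    }


cuboids = all_cuboids(base_cells)
```

With the tables precomputed, a summarized query becomes a lookup rather than a scan: `cuboids[("city", "quarter")][("Vancouver", "Q1")]` returns the total for Vancouver in Q1 across all items, and `cuboids[()][()]` holds the grand total.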
By providing multidimensional data views and the precomputation of summarized
data, data warehouse systems can provide inherent support for OLAP. Online analytical
processing operations make use of background knowledge regarding the domain of
the data being studied to allow the presentation of data at different levels of abstraction.
Such operations accommodate different user viewpoints. Examples of OLAP operations
include drill-down and roll-up, which allow the user to view the data at differing
degrees of summarization, as illustrated in Figure 1.7(b). For instance, we can drill
down on sales data summarized by quarter to see data summarized by month. Similarly,
we can roll up on sales data summarized by city to view data summarized by
country.
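
Roll-up and drill-down can be illustrated with a small Python sketch. It assumes the background knowledge takes the form of two concept hierarchies (city to country, month to quarter) and that sales are stored at the month level; the names CITY_TO_COUNTRY, month_to_quarter, aggregate, and drill_down are hypothetical, and the sales figures are invented.

```python
from collections import defaultdict

# Concept hierarchy for location: the background knowledge that
# lets city-level data be viewed at the country level.
CITY_TO_COUNTRY = {
    "Chicago": "USA", "New York": "USA",
    "Toronto": "Canada", "Vancouver": "Canada",
}


def month_to_quarter(month):
    """Concept hierarchy for time: months 1-3 -> Q1, 4-6 -> Q2, and so on."""
    return f"Q{(month - 1) // 3 + 1}"


# Finest-grained data: (city, month) -> sales in thousands (invented values).
monthly_sales = {
    ("Chicago", 1): 120, ("Chicago", 2): 130, ("Chicago", 4): 90,
    ("Toronto", 1): 80, ("Toronto", 3): 70,
    ("Vancouver", 2): 60, ("Vancouver", 5): 110,
}


def aggregate(cells, city_level=lambda c: c, time_level=lambda m: m):
    """Re-summarize the data after mapping each dimension to a coarser level."""
    totals = defaultdict(int)
    for (city, month), amount in cells.items():
        totals[(city_level(city), time_level(month))] += amount
    return dict(totals)


def drill_down(cells, city, quarter):
    """Show the monthly figures behind one (city, quarter) summary cell."""
    return {
        month: amount
        for (c, month), amount in cells.items()
        if c == city and month_to_quarter(month) == quarter
    }


# Roll up on time: month -> quarter.
by_city_quarter = aggregate(monthly_sales, time_level=month_to_quarter)
# Roll up on location as well: city -> country.
by_country_quarter = aggregate(monthly_sales, CITY_TO_COUNTRY.get, month_to_quarter)
```

Here `by_country_quarter` is the rolled-up view of `by_city_quarter`, while `drill_down(monthly_sales, "Chicago", "Q1")` moves the other way and exposes the monthly cells that a single quarterly total summarizes.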
Although data warehouse tools help support data analysis, additional tools for
data mining are often needed for in-depth analysis. Multidimensional data mining
(also called exploratory multidimensional data mining) performs data mining in