Chapter 4 Data Warehousing and Online Analytical Processing
Data mining supports knowledge discovery by finding hidden patterns and associations,
constructing analytical models, performing classification and prediction, and
presenting the mining results using visualization tools.
“How does data mining relate to information processing and online analytical
processing?” Information processing, based on queries, can find useful information. However,
answers to such queries reflect the information directly stored in databases or
computable by aggregate functions. They do not reflect sophisticated patterns or regularities
buried in the database. Therefore, information processing is not data mining.
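As a minimal sketch of this distinction, the Python snippet below runs a selection query and an aggregate over a small, made-up sales table. The records, the field names, and the helper names `query` and `total_sales` are hypothetical and chosen only for illustration. Both answers are either read directly from stored values or computed by a plain sum; nothing in the code looks for regularities across records.

```python
# Hypothetical sales records; all values are invented for illustration.
SALES = [
    {"item": "TV", "city": "Vancouver", "quarter": "Q1", "amount": 400},
    {"item": "TV", "city": "Toronto", "quarter": "Q1", "amount": 300},
    {"item": "phone", "city": "Vancouver", "quarter": "Q1", "amount": 150},
    {"item": "phone", "city": "Vancouver", "quarter": "Q2", "amount": 250},
]


def query(records, **conditions):
    """Return the records whose fields equal all of the given conditions."""
    return [
        r for r in records
        if all(r[field] == value for field, value in conditions.items())
    ]


def total_sales(records, **conditions):
    """Aggregate function: sum of 'amount' over the matching records."""
    return sum(r["amount"] for r in query(records, **conditions))


# Information directly stored in the table.
vancouver_q1 = query(SALES, city="Vancouver", quarter="Q1")

# Information computable by an aggregate function.
vancouver_total = total_sales(SALES, city="Vancouver")
```

The results answer exactly the question that was asked. Any pattern, such as which items tend to sell together, would have to be requested explicitly by the user.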
Online analytical processing comes a step closer to data mining because it can derive
information summarized at multiple granularities from user-specified subsets of a data
warehouse. Such descriptions are equivalent to the class/concept descriptions discussed
in Chapter 1. Because data mining systems can also mine generalized class/concept
descriptions, this raises some interesting questions: “Do OLAP systems perform data
mining? Are OLAP systems actually data mining systems?”
The functionalities of OLAP and data mining can be viewed as disjoint: OLAP is a
data summarization/aggregation tool that helps simplify data analysis, while data mining
allows the automated discovery of implicit patterns and interesting knowledge hidden
in large amounts of data. OLAP tools are targeted toward simplifying and supporting
interactive data analysis, whereas the goal of data mining tools is to automate as much
of the process as possible, while still allowing users to guide the process. In this sense,
data mining goes one step beyond traditional online analytical processing.
An alternative and broader view of data mining may be adopted in which data mining
covers both data description and data modeling. Because OLAP systems can present
general descriptions of data from data warehouses, OLAP functions are essentially for
user-directed data summarization and comparison (by drilling, pivoting, slicing,
dicing, and other operations). These are, though limited, data mining functionalities. Yet
according to this view, data mining covers a much broader spectrum than simple OLAP
operations, because it performs not only data summarization and comparison but also
association, classification, prediction, clustering, time-series analysis, and other data
analysis tasks.
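The user-directed operations named above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how an OLAP server is implemented: the fact table, the dimension names, and the functions `roll_up`, `slice_`, and `dice` are all hypothetical, and pivoting is omitted for brevity. Roll-up is shown here in its dimension-reduction form, where dimensions are aggregated away.

```python
from collections import defaultdict

# Hypothetical fact table: (city, quarter, item, sales amount).
FACTS = [
    ("Vancouver", "Q1", "TV", 400),
    ("Vancouver", "Q1", "phone", 150),
    ("Vancouver", "Q2", "phone", 250),
    ("Toronto", "Q1", "TV", 300),
    ("Toronto", "Q2", "TV", 100),
]
DIMENSIONS = ("city", "quarter", "item")


def roll_up(facts, keep):
    """Sum sales by the dimensions in `keep`, aggregating away the rest."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def slice_(facts, dimension, value):
    """Slice: fix one dimension to a single value."""
    i = DIMENSIONS.index(dimension)
    return [row for row in facts if row[i] == value]


def dice(facts, **allowed):
    """Dice: restrict several dimensions to sets of allowed values."""
    return [
        row for row in facts
        if all(row[DIMENSIONS.index(d)] in vals for d, vals in allowed.items())
    ]


by_city = roll_up(FACTS, ["city"])                      # coarse summary
by_city_quarter = roll_up(FACTS, ["city", "quarter"])   # drill down one level
q1_by_item = roll_up(slice_(FACTS, "quarter", "Q1"), ["item"])
vancouver_subset = dice(FACTS, city={"Vancouver"}, item={"phone", "TV"})
```

Every step is chosen by the analyst: which dimensions to keep, which values to fix, and which subset to inspect. The code summarizes and compares, but it does not propose associations, classes, or clusters on its own.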
Data mining is not confined to the analysis of data stored in data warehouses. It may
analyze data existing at more detailed granularities than the summarized data provided
in a data warehouse. It may also analyze transactional, spatial, textual, and multimedia
data that are difficult to model with current multidimensional database technology. In
this context, data mining covers a broader spectrum than OLAP with respect to both
functionality and the complexity of the data handled.
Because data mining involves more automated and deeper analysis than OLAP, it
is expected to have broader applications. Data mining can help business managers find
and reach more suitable customers, as well as gain critical business insights that may help
drive market share and raise profits. In addition, data mining can help managers
understand customer group characteristics and develop optimal pricing strategies accordingly.
It can correct item bundling based not on intuition but on actual item groups derived
from customer purchase patterns, reduce promotional spending, and at the same time
increase the overall net effectiveness of promotions.
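To make the item-bundling idea concrete, the sketch below counts how often pairs of items appear in the same basket and keeps the pairs that meet a minimum support threshold. It is a simple pair-counting illustration, not a full frequent-itemset algorithm, and the transactions, the function name `frequent_pairs`, and the threshold of 0.4 are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase transactions; each set is one customer basket.
TRANSACTIONS = [
    {"computer", "printer", "paper"},
    {"computer", "printer"},
    {"computer", "software"},
    {"printer", "paper"},
    {"computer", "printer", "software"},
]


def frequent_pairs(transactions, min_support):
    """Return item pairs whose support (fraction of baskets) >= min_support."""
    counts = Counter()
    for basket in transactions:
        # Sort so that each pair has one canonical ordering.
        counts.update(combinations(sorted(basket), 2))
    n = len(transactions)
    return {
        pair: count / n
        for pair, count in counts.items()
        if count / n >= min_support
    }


# Candidate bundles: pairs bought together in at least 40% of baskets.
bundles = frequent_pairs(TRANSACTIONS, min_support=0.4)
```

With this toy data, the pair (computer, printer) appears in three of the five baskets, so it is a stronger bundling candidate than a pair chosen by intuition alone. A real analysis would also weigh measures such as confidence or lift before acting on a pair.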