Methods to assess pattern interestingness, and their use to improve data mining
efficiency, are discussed throughout the book with respect to each kind of pattern
that can be mined.
1.5 Which Technologies Are Used?
As a highly application-driven domain, data mining has incorporated many techniques
from other domains such as statistics, machine learning, pattern recognition, database
and data warehouse systems, information retrieval, visualization, algorithms,
high-performance computing, and many application domains (Figure 1.11). The
interdisciplinary nature of data mining research and development contributes
significantly to the success of data mining and its extensive applications. In this
section, we give examples of several disciplines that strongly influence the
development of data mining methods.
1.5.1 Statistics
Statistics studies the collection, analysis, interpretation or explanation, and presentation
of data. Data mining has an inherent connection with statistics.
A statistical model is a set of mathematical functions that describe the behavior of
the objects in a target class in terms of random variables and their associated
probability distributions. Statistical models are widely used to model data and data classes.
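As a minimal sketch of this definition, assume each target class is described by a single numeric attribute that follows a normal distribution. Both the normality assumption and the sample values below are hypothetical and chosen only for illustration; the text does not prescribe a particular distribution. The sketch fits one distribution per class and then assigns a new value to the class whose fitted density is highest.

```python
from statistics import NormalDist


def fit_class_model(values):
    """Fit a univariate normal distribution to the values of one class."""
    return NormalDist.from_samples(values)


def classify(x, models):
    """Return the label whose fitted distribution gives x the highest density."""
    return max(models, key=lambda label: models[label].pdf(x))


# Hypothetical measurements for two target classes (illustrative data only).
samples = {
    "class_a": [150.0, 152.0, 155.0, 149.0, 151.0],
    "class_b": [170.0, 172.0, 168.0, 175.0, 171.0],
}

# One statistical model (a random variable with its distribution) per class.
models = {label: fit_class_model(values) for label, values in samples.items()}

if __name__ == "__main__":
    for label, model in models.items():
        print(label, "mean:", model.mean, "stdev:", model.stdev)
    print("151.0 is assigned to", classify(151.0, models))
```

Here each fitted `NormalDist` plays the role of the statistical model of its class: its mean and standard deviation characterize the class, and comparing densities across the models gives a simple classifier.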
For example, in data mining tasks like data characterization and classification, statistical
[Diagram: Data Mining at the center, drawing on Statistics, Machine learning, Pattern recognition, Database systems, Data warehouse, Information retrieval, Visualization, Algorithms, High-performance computing, and Applications.]
Figure 1.11 Data mining adopts techniques from many domains.