1.7 Major Issues in Data Mining
Life is short but art is long. – Hippocrates
Data mining is a dynamic and fast-expanding field with great strengths. In this section,
we briefly outline the major issues in data mining research, partitioning them into
five groups: mining methodology, user interaction, efficiency and scalability, diversity of
data types, and data mining and society. Many of these issues have been addressed to
some extent in recent data mining research and development and are now considered
data mining requirements; others are still at the research stage. The issues continue
to stimulate further investigation and improvement in data mining.
1.7.1 Mining Methodology
Researchers have been vigorously developing new data mining methodologies. This
involves the investigation of new kinds of knowledge, mining in multidimensional
space, integrating methods from other disciplines, and the consideration of semantic ties
among data objects. In addition, mining methodologies should consider issues such as
data uncertainty, noise, and incompleteness. Some mining methods explore how
user-specified measures can be used to assess the interestingness of discovered patterns
and to guide the discovery process. Let us look at each of these aspects of mining
methodology.
Mining various and new kinds of knowledge: Data mining covers a wide spectrum of
data analysis and knowledge discovery tasks, from data characterization and
discrimination to association and correlation analysis, classification, regression, clustering,
outlier analysis, sequence analysis, and trend and evolution analysis. These tasks may
use the same database in different ways and require the development of numerous
data mining techniques. Due to the diversity of applications, new mining tasks
continue to emerge, making data mining a dynamic and fast-growing field. For example,
for effective knowledge discovery in information networks, integrated clustering and
ranking may lead to the discovery of high-quality clusters and object ranks in large
networks.
Mining knowledge in multidimensional space: When searching for knowledge in large
data sets, we can explore the data in multidimensional space. That is, we can search
for interesting patterns among combinations of dimensions (attributes) at varying
levels of abstraction. Such mining is known as (exploratory) multidimensional data
mining. In many cases, data can be aggregated or viewed as a multidimensional data
cube. Mining knowledge in cube space can substantially enhance the power and
flexibility of data mining.
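As a rough illustration of what exploring combinations of dimensions at varying levels of abstraction can mean, the following Python sketch aggregates a handful of sales records into every cuboid of a small data cube and then rolls the city dimension up to country. The records, the dimension names (city, quarter, item), the city-to-country hierarchy, and the helper functions are all hypothetical, chosen only for this example; practical systems rely on far more efficient cube computation methods than this brute-force enumeration.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sales records; dimensions are city, quarter, and item,
# and the measure is amount. None of this data comes from the text.
SALES = [
    {"city": "Vancouver", "quarter": "Q1", "item": "phone", "amount": 400},
    {"city": "Vancouver", "quarter": "Q2", "item": "laptop", "amount": 900},
    {"city": "Toronto", "quarter": "Q1", "item": "phone", "amount": 350},
    {"city": "Chicago", "quarter": "Q1", "item": "laptop", "amount": 1000},
    {"city": "Chicago", "quarter": "Q2", "item": "phone", "amount": 450},
]

# Hypothetical concept hierarchy: city -> country (a higher abstraction level).
CITY_TO_COUNTRY = {"Vancouver": "Canada", "Toronto": "Canada", "Chicago": "USA"}


def roll_up(records, mapping, source, target):
    """Return copies of the records with `source` replaced by the coarser `target`."""
    rolled = []
    for record in records:
        new_record = {k: v for k, v in record.items() if k != source}
        new_record[target] = mapping[record[source]]
        rolled.append(new_record)
    return rolled


def cuboid(records, dims, measure="amount"):
    """Sum the measure for each distinct combination of values of `dims`."""
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[d] for d in dims)
        totals[key] += record[measure]
    return dict(totals)


def data_cube(records, dims, measure="amount"):
    """Compute one cuboid per subset of `dims`, from the grand total to the full detail."""
    return {
        subset: cuboid(records, subset, measure)
        for size in range(len(dims) + 1)
        for subset in combinations(dims, size)
    }


# All 2^3 = 8 cuboids over the three base dimensions.
cube = data_cube(SALES, ("city", "quarter", "item"))

# The same data viewed at a higher level of abstraction: country by quarter.
by_country = cuboid(
    roll_up(SALES, CITY_TO_COUNTRY, "city", "country"), ("country", "quarter")
)
```

Here `cube[("quarter", "item")]` gives sales totals for each quarter and item pair regardless of city, while `by_country` shows the same measure after the city dimension has been generalized to country. Searching such aggregates for interesting patterns is the kind of exploration the paragraph above describes.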
Data mining as an interdisciplinary effort: The power of data mining can be
substantially enhanced by integrating new methods from multiple disciplines. For example,