
                       1.7     Major Issues in Data Mining


                                                                    Life is short but art is long. – Hippocrates

                               Data mining is a dynamic and fast-expanding field with great strengths. In this section,
                               we briefly outline the major issues in data mining research, partitioning them into
                               five groups: mining methodology, user interaction, efficiency and scalability, diversity of
                               data types, and data mining and society. Many of these issues have been addressed to
                               some extent in recent data mining research and development and are now considered
                               data mining requirements; others are still at the research stage. The issues continue
                               to stimulate further investigation and improvement in data mining.


                         1.7.1 Mining Methodology

                               Researchers have been vigorously developing new data mining methodologies. This
                               involves the investigation of new kinds of knowledge, mining in multidimensional
                               space, integrating methods from other disciplines, and the consideration of semantic ties
                               among data objects. In addition, mining methodologies should consider issues such as
                               data uncertainty, noise, and incompleteness. Some mining methods explore how user-
                               specified measures can be used to assess the interestingness of discovered patterns as
                               well as guide the discovery process. Let’s have a look at these various aspects of mining
                               methodology.

                                 Mining various and new kinds of knowledge: Data mining covers a wide spectrum of
                                 data analysis and knowledge discovery tasks, from data characterization and discrim-
                                 ination to association and correlation analysis, classification, regression, clustering,
                                 outlier analysis, sequence analysis, and trend and evolution analysis. These tasks may
                                 use the same database in different ways and require the development of numerous
                                 data mining techniques. Due to the diversity of applications, new mining tasks con-
                                 tinue to emerge, making data mining a dynamic and fast-growing field. For example,
                                 for effective knowledge discovery in information networks, integrated clustering and
                                 ranking may lead to the discovery of high-quality clusters and object ranks in large
                                 networks.

                                 Mining knowledge in multidimensional space: When searching for knowledge in large
                                 data sets, we can explore the data in multidimensional space. That is, we can search
                                 for interesting patterns among combinations of dimensions (attributes) at varying
                                 levels of abstraction. Such mining is known as (exploratory) multidimensional data
                                 mining. In many cases, data can be aggregated or viewed as a multidimensional data
                                 cube. Mining knowledge in cube space can substantially enhance the power and
                                 flexibility of data mining.
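
                                 As a rough illustration of searching combinations of dimensions at varying
                                 levels of abstraction, the sketch below aggregates a tiny, made-up sales table
                                 into every group-by (cuboid) and then rolls cities up to countries. The data,
                                 the dimension names, the city-to-country hierarchy, and the helper functions
                                 (aggregate, full_cube, roll_up_city) are all assumptions made for this example;
                                 it is a minimal sketch, not an efficient cube-computation method.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sales records: (city, quarter, item, units_sold).
SALES = [
    ("Vancouver", "Q1", "computer", 10),
    ("Vancouver", "Q2", "computer", 15),
    ("Toronto", "Q1", "computer", 7),
    ("Toronto", "Q1", "printer", 3),
    ("Chicago", "Q2", "printer", 5),
]
DIMENSIONS = ("city", "quarter", "item")

# Hypothetical concept hierarchy for one dimension: city -> country.
CITY_TO_COUNTRY = {"Vancouver": "Canada", "Toronto": "Canada", "Chicago": "USA"}


def aggregate(records, dims):
    """Sum units_sold grouped by the chosen dimensions (one cuboid)."""
    idx = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for rec in records:
        key = tuple(rec[i] for i in idx)
        totals[key] += rec[-1]
    return dict(totals)


def full_cube(records):
    """Compute every cuboid: one group-by per subset of the dimensions."""
    cube = {}
    for k in range(len(DIMENSIONS) + 1):
        for dims in combinations(DIMENSIONS, k):
            cube[dims] = aggregate(records, dims)
    return cube


def roll_up_city(records):
    """Replace each city with its country (a coarser abstraction level)."""
    return [(CITY_TO_COUNTRY[c], q, i, u) for c, q, i, u in records]


cube = full_cube(SALES)

# After the roll-up, the "city" slot of each record holds a country name.
by_country_quarter = aggregate(roll_up_city(SALES), ("city", "quarter"))

for dims, cells in cube.items():
    print(dims or "(all)", cells)
print("country x quarter:", by_country_quarter)
```

                                 With three dimensions there are 2^3 = 8 cuboids, from the fully detailed
                                 (city, quarter, item) view down to the single grand total. A pattern that is
                                 invisible at one level, such as a per-city difference, may appear or vanish
                                 once the data are viewed per country.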

                                 Data mining—an interdisciplinary effort: The power of data mining can be substantially
                                 enhanced by integrating new methods from multiple disciplines. For example,