

                                 Methods to assess pattern interestingness, and their use to improve data mining
                               efficiency, are discussed throughout the book with respect to each kind of pattern
                               that can be mined.

                       1.5     Which Technologies Are Used?



                               As a highly application-driven domain, data mining has incorporated many techniques
                               from other domains such as statistics, machine learning, pattern recognition, database
                               and data warehouse systems, information retrieval, visualization, algorithms,
                               high-performance computing, and many application domains (Figure 1.11). The
                               interdisciplinary nature of data mining research and development contributes
                               significantly to the success of data mining and its extensive applications. In this
                               section, we give examples of several disciplines that strongly influence the
                               development of data mining methods.

                         1.5.1 Statistics

                               Statistics studies the collection, analysis, interpretation or explanation, and presentation
                               of data. Data mining has an inherent connection with statistics.
                                 A statistical model is a set of mathematical functions that describe the behavior of
                               the objects in a target class in terms of random variables and their associated
                               probability distributions. Statistical models are widely used to model data and data classes.
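
As a minimal sketch of this definition, assume a target class described by a single numeric attribute that is approximately normally distributed. The fragment below estimates a normal distribution from sample values and uses its density function to describe the class. The attribute values and the names `heights_cm`, `class_model`, and `density_at` are hypothetical and serve only as illustration; they are not taken from the text.

```python
from statistics import NormalDist

# Hypothetical attribute values (heights in cm) observed for
# objects belonging to one target class.
heights_cm = [168.0, 172.5, 175.0, 169.5, 180.0, 177.5]

# Treat the attribute as a random variable and estimate the
# parameters of a normal distribution from the sample.
class_model = NormalDist.from_samples(heights_cm)


def density_at(value: float) -> float:
    """Return the model's probability density at the given value."""
    return class_model.pdf(value)


if __name__ == "__main__":
    print(f"estimated mean:  {class_model.mean:.2f}")
    print(f"estimated stdev: {class_model.stdev:.2f}")
    # Values near the mean are more typical of the class than distant ones.
    print(f"density at 174: {density_at(174.0):.4f}")
    print(f"density at 195: {density_at(195.0):.4f}")
```

Here the "set of mathematical functions" is simply the normal density with its two estimated parameters. A real data class would usually involve several attributes and might call for a different family of distributions.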
                               For example, in data mining tasks like data characterization and classification, statistical



                               [Figure: "Data Mining" at the center, surrounded by the contributing domains:
                               statistics, machine learning, pattern recognition, database systems, visualization,
                               data warehouse, algorithms, information retrieval, applications, and
                               high-performance computing.]

                    Figure 1.11 Data mining adopts techniques from many domains.