






                                                                              Foreword



















                               Analyzing large amounts of data is a necessity. Even popular science books, like
                               “Super Crunchers,” give compelling cases where large amounts of data yield discoveries
                               and intuitions that surprise even experts. Every enterprise benefits from collecting and
                               analyzing its data: Hospitals can spot trends and anomalies in their patient records, search
                               engines can do better ranking and ad placement, and environmental and public health
                               agencies can spot patterns and abnormalities in their data. The list continues, with
                               cybersecurity and computer network intrusion detection; monitoring of the energy
                               consumption of household appliances; pattern analysis in bioinformatics and
                               pharmaceutical data; financial and business intelligence data; trend spotting in blogs
                               and on Twitter; and many more. Storage is inexpensive and getting ever cheaper, as are
                               data sensors. Thus, collecting and storing data is easier than ever before.
                                 The problem then becomes how to analyze the data. This is exactly the focus of this
                               Third Edition of the book. Jiawei, Micheline, and Jian give encyclopedic coverage of all
                               the related methods, from the classic topics of clustering and classification, to database
                               methods (e.g., association rules, data cubes) to more recent and advanced topics (e.g.,
                               SVD/PCA, wavelets, support vector machines).
                                 The exposition is extremely accessible to beginners and advanced readers alike. The
                               book gives the fundamental material first and the more advanced material in follow-up
                               chapters. It also has numerous rhetorical questions, which I found extremely helpful for
                               maintaining focus.
                                 We have used the first two editions as textbooks in data mining courses at Carnegie
                               Mellon and plan to continue to do so with this Third Edition. The new version has
                               significant additions: Notably, it has more than 100 citations to works from 2006
                               onward, focusing on more recent material such as graphs and social networks,
                               sensor networks, and outlier detection. This book has a new section for visualization, has
                               expanded outlier detection into a whole chapter, and has separate chapters for advanced


