


                               outlier analysis. Give examples of each data mining functionality, using a real-life
                               database that you are familiar with.
                           1.4 Present an example where data mining is crucial to the success of a business. What data
                               mining functionalities does this business need (e.g., think of the kinds of patterns that
                               could be mined)? Can such patterns be generated alternatively by data query processing
                               or simple statistical analysis?
                           1.5 Explain the difference and similarity between discrimination and classification, between
                               characterization and clustering, and between classification and regression.
                           1.6 Based on your observations, describe another possible kind of knowledge that needs to
                               be discovered by data mining methods but has not been listed in this chapter. Does it
                               require a mining methodology that is quite different from those outlined in this chapter?
                           1.7 Outliers are often discarded as noise. However, one person’s garbage could be another’s
                               treasure. For example, exceptions in credit card transactions can help us detect the
                               fraudulent use of credit cards. Using fraud detection as an example, propose two
                               methods that can be used to detect outliers and discuss which one is more reliable.
                           1.8 Describe three challenges to data mining regarding data mining methodology and user
                               interaction issues.
                           1.9 What are the major challenges of mining a huge amount of data (e.g., billions of tuples)
                               in comparison with mining a small amount of data (e.g., a data set of a few hundred
                               tuples)?
                          1.10 Outline the major research challenges of data mining in one specific application domain,
                               such as stream/sensor data analysis, spatiotemporal data analysis, or bioinformatics.



                    1.10       Bibliographic Notes


                               The book Knowledge Discovery in Databases, edited by Piatetsky-Shapiro and Frawley
                               [P-SF91], is an early collection of research papers on knowledge discovery from data.
                               The book Advances in Knowledge Discovery and Data Mining, edited by Fayyad,
                               Piatetsky-Shapiro, Smyth, and Uthurusamy [FPSS+96], is a collection of later research
                               results on knowledge discovery and data mining. There have been many data mining
                               books published in recent years, including The Elements of Statistical Learning
                               by Hastie, Tibshirani, and Friedman [HTF09]; Introduction to Data Mining by Tan,
                               Steinbach, and Kumar [TSK05]; Data Mining: Practical Machine Learning Tools and
                               Techniques with Java Implementations by Witten, Frank, and Hall [WFH11]; Predictive
                               Data Mining by Weiss and Indurkhya [WI98]; Mastering Data Mining: The Art
                               and Science of Customer Relationship Management by Berry and Linoff [BL99]; Principles
                               of Data Mining (Adaptive Computation and Machine Learning) by Hand, Mannila,
                               and Smyth [HMS01]; Mining the Web: Discovering Knowledge from Hypertext Data by
                               Chakrabarti [Cha03a]; Web Data Mining: Exploring Hyperlinks, Contents, and Usage