Chapter 2 Getting to Know Your Data



                         statistics-based visualization of data using boxplots, quantile plots, quantile–quantile
                         plots, scatter plots, and loess curves, see Cleveland [Cle93].
                           Pioneering work on data visualization techniques is described in The Visual
                         Display of Quantitative Information [Tuf83], Envisioning Information [Tuf90], and Visual
                         Explanations: Images and Quantities, Evidence and Narrative [Tuf97], all by Tufte, in
                         addition to Graphics and Graphic Information Processing by Bertin [Ber81], Visualizing
                         Data by Cleveland [Cle93], and Information Visualization in Data Mining and Knowledge
                         Discovery edited by Fayyad, Grinstein, and Wierse [FGW01].
                           Major conferences and symposiums on visualization include ACM Human Factors
                         in Computing Systems (CHI), IEEE Visualization, and the International Symposium on
                         Information Visualization. Research on visualization is also published in IEEE
                         Transactions on Visualization and Computer Graphics, Journal of Computational and
                         Graphical Statistics, and IEEE Computer Graphics and Applications.
                           Many graphical user interfaces and visualization tools have been developed and can
                         be found in various data mining products. Several books on data mining (e.g., Data
                         Mining Solutions by Westphal and Blaxton [WB98]) present many good examples and
                         visual snapshots. For a survey of visualization techniques, see “Visual techniques for
                         exploring databases” by Keim [Kei97].
                           Similarity and distance measures among various variables have been introduced in
                         many textbooks that study cluster analysis, including Hartigan [Har75]; Jain and Dubes
                         [JD88]; Kaufman and Rousseeuw [KR90]; and Arabie, Hubert, and de Soete [AHS96].
                         Methods for combining attributes of different types into a single dissimilarity matrix
                         were introduced by Kaufman and Rousseeuw [KR90].
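
To make the idea of a single dissimilarity matrix over mixed attribute types concrete, the following is a minimal sketch rather than the exact procedure of [KR90]. It assumes only numeric and nominal attributes, scales each numeric difference by the observed range of that attribute, scores a nominal mismatch as 1, and averages over the attributes for which both values are present. The names mixed_dissimilarity and dissimilarity_matrix, the "numeric"/"nominal" labels, and the use of None for a missing value are all illustrative choices, not taken from the references.

```python
from typing import List, Optional, Sequence


def mixed_dissimilarity(a: Sequence, b: Sequence,
                        kinds: Sequence[str],
                        ranges: Sequence[Optional[float]]) -> float:
    """Average per-attribute dissimilarity in [0, 1] between two records."""
    total = 0.0
    count = 0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if x is None or y is None:
            continue  # a missing value contributes nothing to the average
        if kind == "numeric":
            # Difference scaled by the attribute's observed range
            d = abs(x - y) / rng if rng else 0.0
        elif kind == "nominal":
            d = 0.0 if x == y else 1.0
        else:
            raise ValueError(f"unsupported attribute kind: {kind}")
        total += d
        count += 1
    # Assumption of this sketch: records with no comparable attributes score 0.0
    return total / count if count else 0.0


def dissimilarity_matrix(records: Sequence[Sequence],
                         kinds: Sequence[str]) -> List[List[float]]:
    """Symmetric n-by-n matrix of pairwise mixed-type dissimilarities."""
    ranges: List[Optional[float]] = []
    for f, kind in enumerate(kinds):
        if kind == "numeric":
            values = [r[f] for r in records if r[f] is not None]
            ranges.append(max(values) - min(values) if values else None)
        else:
            ranges.append(None)  # range is only used for numeric attributes
    n = len(records)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = mixed_dissimilarity(records[i], records[j], kinds, ranges)
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Hypothetical records: (age, favorite color)
records = [(25, "red"), (35, "red"), (45, "blue")]
matrix = dissimilarity_matrix(records, kinds=("numeric", "nominal"))
```

With these three hypothetical records, the first two differ only in age (half the observed range), so their dissimilarity is 0.25, while the first and third differ maximally on both attributes and score 1.0. Ordinal and asymmetric binary attributes, which the cited methods also handle, are left out of this sketch.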