Chapter 2 Getting to Know Your Data
statistics-based visualization of data using boxplots, quantile plots, quantile–quantile
plots, scatter plots, and loess curves, see Cleveland [Cle93].
Pioneering work on data visualization techniques is described in The Visual Display
of Quantitative Information [Tuf83], Envisioning Information [Tuf90], and Visual
Explanations: Images and Quantities, Evidence and Narrative [Tuf97], all by Tufte, in
addition to Graphics and Graphic Information Processing by Bertin [Ber81], Visualizing
Data by Cleveland [Cle93], and Information Visualization in Data Mining and Knowledge
Discovery edited by Fayyad, Grinstein, and Wierse [FGW01].
Major conferences and symposiums on visualization include the ACM Conference on Human
Factors in Computing Systems (CHI), IEEE Visualization, and the International Symposium on
Information Visualization. Research on visualization is also published in IEEE Transactions on
Visualization and Computer Graphics, Journal of Computational and Graphical Statistics,
and IEEE Computer Graphics and Applications.
Many graphical user interfaces and visualization tools have been developed and can
be found in various data mining products. Several books on data mining (e.g., Data
Mining Solutions by Westphal and Blaxton [WB98]) present many good examples and
visual snapshots. For a survey of visualization techniques, see “Visual techniques for
exploring databases” by Keim [Kei97].
Similarity and distance measures for various types of variables are covered in
many textbooks on cluster analysis, including Hartigan [Har75]; Jain and Dubes
[JD88]; Kaufman and Rousseeuw [KR90]; and Arabie, Hubert, and de Soete [AHS96].
Methods for combining attributes of different types into a single dissimilarity matrix
were introduced by Kaufman and Rousseeuw [KR90].