Page 11 -
P. 11
x Contents HAN 03-toc-ix-xviii-9780123814791 2011/6/1 3:32 Page x #2
1.6 Which Kinds of Applications Are Targeted? 27
1.6.1 Business Intelligence 27
1.6.2 Web Search Engines 28
1.7 Major Issues in Data Mining 29
1.7.1 Mining Methodology 29
1.7.2 User Interaction 30
1.7.3 Efficiency and Scalability 31
1.7.4 Diversity of Database Types 32
1.7.5 Data Mining and Society 32
1.8 Summary 33
1.9 Exercises 34
1.10 Bibliographic Notes 35
Chapter 2 Getting to Know Your Data 39
2.1 Data Objects and Attribute Types 40
2.1.1 What Is an Attribute? 40
2.1.2 Nominal Attributes 41
2.1.3 Binary Attributes 41
2.1.4 Ordinal Attributes 42
2.1.5 Numeric Attributes 43
2.1.6 Discrete versus Continuous Attributes 44
2.2 Basic Statistical Descriptions of Data 44
2.2.1 Measuring the Central Tendency: Mean, Median, and Mode 45
2.2.2 Measuring the Dispersion of Data: Range, Quartiles, Variance,
Standard Deviation, and Interquartile Range 48
2.2.3 Graphic Displays of Basic Statistical Descriptions of Data 51
2.3 Data Visualization 56
2.3.1 Pixel-Oriented Visualization Techniques 57
2.3.2 Geometric Projection Visualization Techniques 58
2.3.3 Icon-Based Visualization Techniques 60
2.3.4 Hierarchical Visualization Techniques 63
2.3.5 Visualizing Complex Data and Relations 64
2.4 Measuring Data Similarity and Dissimilarity 65
2.4.1 Data Matrix versus Dissimilarity Matrix 67
2.4.2 Proximity Measures for Nominal Attributes 68
2.4.3 Proximity Measures for Binary Attributes 70
2.4.4 Dissimilarity of Numeric Data: Minkowski Distance 72
2.4.5 Proximity Measures for Ordinal Attributes 74
2.4.6 Dissimilarity for Attributes of Mixed Types 75
2.4.7 Cosine Similarity 77
2.5 Summary 79
2.6 Exercises 79
2.7 Bibliographic Notes 81