



                         Organization of the Book
                         Since the publication of the first two editions of this book, great progress has been
                         made in the field of data mining. Many new data mining methodologies, systems, and
                         applications have been developed, especially for handling new kinds of data, including
                         information networks, graphs, complex structures, and data streams, as well as text,
                         Web, multimedia, time-series, and spatiotemporal data. Such rapid development and rich
                         new technical content make it difficult to cover the full spectrum of the field in a single
                         book. Instead of continuously expanding the coverage of this book, we have decided to
                         cover the core material in sufficient scope and depth, and leave the handling of complex
                         data types to a separate forthcoming book.
                           The third edition substantially revises the first two editions of the book, with numerous
                         enhancements and a reorganization of the technical content. The core technical
                         material, which handles mining on general data types, is expanded and substantially
                         enhanced. Several topics that each had a single chapter in the second edition (e.g., data
                         preprocessing, frequent pattern mining, classification, and clustering) are now augmented,
                         and each is split into two chapters in this new edition. For these topics, one chapter
                         encapsulates the basic concepts and techniques while the other presents advanced concepts
                         and methods.
                           Chapters from the second edition on mining complex data types (e.g., stream data,
                         sequence data, graph-structured data, social network data, and multirelational data,
                         as well as text, Web, multimedia, and spatiotemporal data) are now reserved for a new
                         book that will be dedicated to advanced topics in data mining. Still, to support readers
                         in learning such advanced topics, we have placed an electronic version of the relevant
                         chapters from the second edition on the book’s Web site as companion material for
                         the third edition.
                           The chapters of the third edition are described briefly as follows, with emphasis on
                         the new material.
                           Chapter 1 provides an introduction to the multidisciplinary field of data mining. It
                         discusses the evolutionary path of information technology, which has led to the need
                         for data mining, and the importance of its applications. It examines the data types to be
                         mined, including relational, transactional, and data warehouse data, as well as complex
                         data types such as time-series, sequences, data streams, spatiotemporal data, multimedia
                         data, text data, graphs, social networks, and Web data. The chapter presents a general
                         classification of data mining tasks, based on the kinds of knowledge to be mined, the
                         kinds of technologies used, and the kinds of applications that are targeted. Finally, major
                         challenges in the field are discussed.
                           Chapter 2 introduces the general features of data. It first discusses data objects and
                         attribute types and then introduces typical measures for basic statistical data
                         descriptions. It gives an overview of data visualization techniques for various kinds of
                         data. In addition to methods of numeric data visualization, methods for visualizing text,
                         tags, graphs, and multidimensional data are introduced. Chapter 2 also introduces ways
                         to measure similarity and dissimilarity for various kinds of data.