1 Introduction
This book is an introduction to the young and fast-growing field of data mining (also known
as knowledge discovery from data, or KDD for short). The book focuses on fundamental
data mining concepts and techniques for discovering interesting patterns from data in
various applications. In particular, we emphasize prominent techniques for developing
effective, efficient, and scalable data mining tools.
This chapter is organized as follows. In Section 1.1, you will learn why data mining is
in high demand and how it is part of the natural evolution of information technology.
Section 1.2 defines data mining with respect to the knowledge discovery process. Next,
you will learn about data mining from many aspects, such as the kinds of data that can
be mined (Section 1.3), the kinds of knowledge to be mined (Section 1.4), the kinds of
technologies to be used (Section 1.5), and targeted applications (Section 1.6). In this
way, you will gain a multidimensional view of data mining. Finally, Section 1.7 outlines
major data mining research and development issues.
1.1 Why Data Mining?
Necessity, who is the mother of invention. – Plato
We live in a world where vast amounts of data are collected daily. Analyzing such data
is an important need. Section 1.1.1 looks at how data mining can meet this need by
providing tools to discover knowledge from data. In Section 1.1.2, we observe how data
mining can be viewed as a result of the natural evolution of information technology.
1.1.1 Moving toward the Information Age
“We are living in the information age” is a popular saying; however, we are actually living
in the data age.¹ Terabytes or petabytes of data pour into our computer networks, the
World Wide Web (WWW), and various data storage devices every day from business,
¹ A petabyte is a unit of information or computer storage equal to 1 quadrillion bytes, or a thousand
terabytes, or 1 million gigabytes.
Data Mining: Concepts and Techniques
© 2012 Elsevier Inc. All rights reserved.