

          Chapter 1 Introduction



                         of items that are frequently sold together. The mining of such frequent patterns from
                         transactional data is discussed in Chapters 6 and 7.


                   1.3.4 Other Kinds of Data
                         Besides relational database data, data warehouse data, and transaction data, there are
                         many other kinds of data that have versatile forms and structures and rather different
                         semantic meanings. Such kinds of data can be seen in many applications: time-related
                         or sequence data (e.g., historical records, stock exchange data, and time-series and
                         biological sequence data), data streams (e.g., video surveillance and sensor data, which are
                         continuously transmitted), spatial data (e.g., maps), engineering design data (e.g., the
                         design of buildings, system components, or integrated circuits), hypertext and
                         multimedia data (including text, image, video, and audio data), graph and networked data
                         (e.g., social and information networks), and the Web (a huge, widely distributed
                         information repository made available by the Internet). These applications bring about new
                         challenges, like how to handle data carrying special structures (e.g., sequences, trees,
                         graphs, and networks) and specific semantics (such as ordering, image, audio and video
                         contents, and connectivity), and how to mine patterns that carry rich structures and
                         semantics.
                           Various kinds of knowledge can be mined from these kinds of data. Here, we list
                         just a few. Regarding temporal data, for instance, we can mine banking data for
                         changing trends, which may aid in the scheduling of bank tellers according to the volume of
                         customer traffic. Stock exchange data can be mined to uncover trends that could help
                         you plan investment strategies (e.g., the best time to purchase AllElectronics stock). We
                         could mine computer network data streams to detect intrusions based on anomalies in
                         message flows, which may be discovered by clustering, by dynamically constructing stream
                         models, or by comparing the current frequent patterns with those from a previous time.
                         With spatial data, we may look for patterns that describe changes in metropolitan
                         poverty rates based on city distances from major highways. The relationships among
                         a set of spatial objects can be examined in order to discover which subsets of objects
                         are spatially autocorrelated or associated. By mining text data, such as literature on data
                         mining from the past ten years, we can identify the evolution of hot topics in the field. By
                         mining user comments on products (which are often submitted as short text messages),
                         we can assess customer sentiments and understand how well a product is embraced by
                         a market. From multimedia data, we can mine images to identify objects and classify
                         them by assigning semantic labels or tags. By mining video data of a hockey game, we
                         can detect video sequences corresponding to goals. Web mining can help us learn about
                         the distribution of information on the WWW in general, characterize and classify web
                         pages, and uncover web dynamics as well as associations and other relationships among
                         different web pages, users, communities, and web-based activities.
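
One of the ideas above, detecting anomalies in message flows by comparing the current frequent patterns with those from a previous time, can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the method used in later chapters: the helper names (frequent_itemsets, pattern_shift), the brute-force counting, the 0.5 support threshold, and the event labels in the two windows are all hypothetical and chosen for readability.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: relative support} for itemsets of up to max_size
    items whose support is at least min_support (brute-force counting)."""
    n = len(transactions)
    if n == 0:
        return {}
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def pattern_shift(previous, current, min_support=0.5):
    """Compare the frequent itemsets of two time windows and return the
    itemsets that newly became frequent and those that stopped being so."""
    prev = frequent_itemsets(previous, min_support)
    curr = frequent_itemsets(current, min_support)
    emerged = set(curr) - set(prev)
    vanished = set(prev) - set(curr)
    return emerged, vanished


# Hypothetical message-flow windows: each transaction is the set of
# (made-up) event labels observed for one connection.
previous_window = [
    {"http", "port80"},
    {"http", "port80"},
    {"dns", "port53"},
    {"http", "port80"},
]
current_window = [
    {"ssh", "port22", "failed_login"},
    {"ssh", "port22", "failed_login"},
    {"http", "port80"},
    {"ssh", "port22", "failed_login"},
]

emerged, vanished = pattern_shift(previous_window, current_window)
for itemset in sorted(emerged, key=lambda s: (len(s), sorted(s))):
    print("newly frequent:", sorted(itemset))
```

In this toy example, patterns involving repeated failed SSH logins become frequent in the current window while the usual web-traffic pattern drops below the threshold; a shift of this kind is what an analyst might flag for closer inspection. A realistic system would use an efficient frequent pattern mining algorithm and a stream-aware windowing scheme rather than exhaustive counting.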
                           It is important to keep in mind that, in many applications, multiple types of data
                         are present. For example, in web mining, there often exist text data and multimedia
                         data (e.g., pictures and videos) on web pages, graph data like web graphs, and map
                         data on some web sites. In bioinformatics, genomic sequences, biological networks, and