                         1.3 What Kinds of Data Can Be Mined?


                         1.3.1 Database Data
                               A database system, also called a database management system (DBMS), consists of a
                               collection of interrelated data, known as a database, and a set of software programs to
                               manage and access the data. The software programs provide mechanisms for defining
                               database structures and data storage; for specifying and managing concurrent, shared,
                               or distributed data access; and for ensuring consistency and security of the information
                               stored despite system crashes or attempts at unauthorized access.
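One of these mechanisms, keeping stored data consistent when an operation fails partway through, can be sketched with a transaction. The example below is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in DBMS, reduced versions of the purchases and items sold tables from Figure 1.5, a hypothetical helper record_sale, and an assumed rule (not from the text) that quantities must be positive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE purchases (trans_ID TEXT PRIMARY KEY, cust_ID TEXT, amount REAL)"
)
conn.execute(
    "CREATE TABLE items_sold (trans_ID TEXT, item_ID TEXT, "
    "qty INTEGER CHECK (qty > 0))"  # assumed rule: quantities must be positive
)


def record_sale(conn, trans_id, cust_id, amount, lines):
    """Store a sale and its line items as one unit (hypothetical helper).

    `lines` is a list of (item_ID, qty) pairs. Returns True if everything
    was stored, False if a constraint failed and nothing was stored.
    """
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO purchases VALUES (?, ?, ?)",
                (trans_id, cust_id, amount),
            )
            conn.executemany(
                "INSERT INTO items_sold VALUES (?, ?, ?)",
                [(trans_id, item_id, qty) for item_id, qty in lines],
            )
        return True
    except sqlite3.IntegrityError:
        return False


def count(conn, table):
    """Number of tuples currently stored in `table`."""
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


# Made-up data: the second sale contains an invalid quantity (0).
ok = record_sale(conn, "T100", "C1", 1199.0, [("I3", 1), ("I8", 2)])
bad = record_sale(conn, "T200", "C2", 50.0, [("I1", 1), ("I2", 0)])
print(ok, bad, count(conn, "purchases"), count(conn, "items_sold"))
```

The second sale fails on its last line item, so the rollback also discards the purchases row and the first line item that had already been inserted for it. A full DBMS applies the same all-or-nothing idea to recovery after a crash, which this in-memory sketch does not attempt to show.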
                                 A relational database is a collection of tables, each of which is assigned a unique
                               name. Each table consists of a set of attributes (columns or fields) and usually stores
                               a large set of tuples (records or rows). Each tuple in a relational table represents an
                               object identified by a unique key and described by a set of attribute values. A semantic
                               data model, such as an entity-relationship (ER) data model, is often constructed for
                               relational databases. An ER data model represents the database as a set of entities and
                               their relationships.
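As a rough illustration of these terms, the sketch below builds one small relational table with Python's built-in sqlite3 module. The table name customer, its four attributes, and the two tuples are made up for the example; the point is only to show attributes, tuples, and a unique key in action.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets us read attribute values by column name

# A table has a unique name and a fixed set of attributes (columns).
conn.execute("""
    CREATE TABLE customer (
        cust_ID       TEXT PRIMARY KEY,  -- the unique key identifying each tuple
        name          TEXT,
        age           INTEGER,
        annual_income REAL
    )
""")

# Each tuple (row) describes one object by its attribute values (made-up data).
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?)",
    [
        ("C1", "Smith, Sandy", 31, 78000.0),
        ("C2", "Lee, Jordan", 45, 52000.0),
    ],
)

# List the table's attributes, then fetch one tuple by its key.
attributes = [row["name"] for row in conn.execute("PRAGMA table_info(customer)")]
sandy = conn.execute(
    "SELECT * FROM customer WHERE cust_ID = ?", ("C1",)
).fetchone()

# A second tuple with the same key is rejected, so keys stay unique.
try:
    conn.execute(
        "INSERT INTO customer VALUES (?, ?, ?, ?)", ("C1", "Duplicate", 20, 0.0)
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(attributes, sandy["name"], duplicate_rejected)
```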


                  Example 1.2 A relational database for AllElectronics. The fictitious AllElectronics store is used to
                               illustrate concepts throughout this book. The company is described by the following
                               relation tables: customer, item, employee, and branch. The headers of the tables described
                               here are shown in Figure 1.5. (A header is also called the schema of a relation.)

                                 The relation customer consists of a set of attributes describing the customer
                                 information, including a unique customer identity number (cust_ID), customer name,
                                 address, age, occupation, annual income, credit information, and category.
                                 Similarly, each of the relations item, employee, and branch consists of a set of
                                 attributes describing the properties of these entities.
                                 Tables can also be used to represent the relationships between or among multiple
                                 entities. In our example, these include purchases (customer purchases items, creating
                                 a sales transaction handled by an employee), items_sold (lists items sold in a given
                                 transaction), and works_at (employee works at a branch of AllElectronics).




                               customer  (cust_ID, name, address, age, occupation, annual_income, credit_information,
                                        category, ...)
                                   item  (item_ID, brand, category, type, price, place_made, supplier, cost, ...)
                               employee  (empl_ID, name, category, group, salary, commission, ...)
                                 branch  (branch_ID, name, address, ...)
                              purchases  (trans_ID, cust_ID, empl_ID, date, time, method_paid, amount)
                             items_sold  (trans_ID, item_ID, qty)
                               works_at  (empl_ID, branch_ID)
                     Figure 1.5 Relational schema for a relational database, AllElectronics.
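To make the schema concrete, the following sketch declares a reduced version of it with Python's built-in sqlite3 module and follows the relationship tables from a customer to the items that customer bought. Several things here are assumptions rather than part of Figure 1.5: only a few attributes per table are kept, the REFERENCES clauses and the composite key on the items sold table are one plausible reading of how the tables link together, the helper items_bought is hypothetical, and all rows are made-up data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Entity tables (attribute lists shortened for the example).
    CREATE TABLE customer (cust_ID TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE item     (item_ID TEXT PRIMARY KEY, brand TEXT, price REAL);
    CREATE TABLE employee (empl_ID TEXT PRIMARY KEY, name TEXT);

    -- Relationship tables. The REFERENCES clauses are an assumption; SQLite
    -- only enforces them when PRAGMA foreign_keys = ON is set.
    CREATE TABLE purchases (
        trans_ID TEXT PRIMARY KEY,
        cust_ID  TEXT REFERENCES customer(cust_ID),
        empl_ID  TEXT REFERENCES employee(empl_ID),
        amount   REAL
    );
    CREATE TABLE items_sold (
        trans_ID TEXT REFERENCES purchases(trans_ID),
        item_ID  TEXT REFERENCES item(item_ID),
        qty      INTEGER,
        PRIMARY KEY (trans_ID, item_ID)
    );

    -- Made-up rows.
    INSERT INTO customer VALUES ('C1', 'Sandy'), ('C2', 'Jordan');
    INSERT INTO item     VALUES ('I3', 'BrandA', 999.0), ('I8', 'BrandB', 100.0);
    INSERT INTO employee VALUES ('E55', 'Robin');
    INSERT INTO purchases VALUES
        ('T100', 'C1', 'E55', 1199.0),
        ('T200', 'C2', 'E55', 100.0);
    INSERT INTO items_sold VALUES
        ('T100', 'I3', 1), ('T100', 'I8', 2), ('T200', 'I8', 1);
""")


def items_bought(conn, cust_id):
    """Return (item_ID, brand, qty) for every item the customer purchased.

    Walks customer -> purchases -> items_sold -> item by joining on the
    shared key attributes (hypothetical helper).
    """
    query = """
        SELECT i.item_ID, i.brand, s.qty
        FROM customer c
        JOIN purchases  p ON p.cust_ID  = c.cust_ID
        JOIN items_sold s ON s.trans_ID = p.trans_ID
        JOIN item       i ON i.item_ID  = s.item_ID
        WHERE c.cust_ID = ?
        ORDER BY i.item_ID
    """
    return conn.execute(query, (cust_id,)).fetchall()


print(items_bought(conn, "C1"))
```

Keeping purchases and items sold as separate tables means a single transaction can list any number of items without repeating the customer or employee details on every line.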