Page 166 -
P. 166

3:17
                                                             2011/6/1
                                                                           Page 129
                                                                                    #5
                          HAN 11-ch04-125-186-9780123814791
                                                                 4.1 Data Warehouse: Basic Concepts  129


                                 Database design: An OLTP system usually adopts an entity-relationship (ER) data
                                 model and an application-oriented database design. An OLAP system typically
                                 adopts either a star or a snowflake model (see Section 4.2.2) and a subject-oriented
                                 database design.
                                 View: An OLTP system focuses mainly on the current data within an enterprise or
                                 department, without referring to historic data or data in different organizations. In
                                 contrast, an OLAP system often spans multiple versions of a database schema, due to
                                 the evolutionary process of an organization. OLAP systems also deal with informa-
                                 tion that originates from different organizations, integrating information from many
                                 data stores. Because of their huge volume, OLAP data are stored on multiple storage
                                 media.
                                 Access patterns: The access patterns of an OLTP system consist mainly of short,
                                 atomic transactions. Such a system requires concurrency control and recovery mech-
                                 anisms. However, accesses to OLAP systems are mostly read-only operations (because
                                 most data warehouses store historic rather than up-to-date information), although
                                 many could be complex queries.

                                 Other features that distinguish between OLTP and OLAP systems include database
                               size, frequency of operations, and performance metrics. These are summarized in
                               Table 4.1.


                         4.1.3 But, Why Have a Separate Data Warehouse?
                               Because operational databases store huge amounts of data, you may wonder, “Why not
                               perform online analytical processing directly on such databases instead of spending addi-
                               tional time and resources to construct a separate data warehouse?” A major reason for such
                               a separation is to help promote the high performance of both systems. An operational
                               database is designed and tuned from known tasks and workloads like indexing and
                               hashing using primary keys, searching for particular records, and optimizing “canned”
                               queries. On the other hand, data warehouse queries are often complex. They involve the
                               computation of large data groups at summarized levels, and may require the use of spe-
                               cial data organization, access, and implementation methods based on multidimensional
                               views. Processing OLAP queries in operational databases would substantially degrade
                               the performance of operational tasks.
                                 Moreover, an operational database supports the concurrent processing of multiple
                               transactions. Concurrency control and recovery mechanisms (e.g., locking and logging)
                               are required to ensure the consistency and robustness of transactions. An OLAP query
                               often needs read-only access of data records for summarization and aggregation. Con-
                               currency control and recovery mechanisms, if applied for such OLAP operations, may
                               jeopardize the execution of concurrent transactions and thus substantially reduce the
                               throughput of an OLTP system.
                                 Finally, the separation of operational databases from data warehouses is based on
                               the different structures, contents, and uses of the data in these two systems. Decision
   161   162   163   164   165   166   167   168   169   170   171