
          Chapter 4 Data Warehousing and Online Analytical Processing



                         OLAP operations may be performed on the target and contrasting classes as deemed
                         necessary by the user in order to adjust the abstraction levels of the final description.

                           In summary, attribute-oriented induction for data characterization and
                         generalization offers an alternative to the data cube approach for data
                         generalization. It is not confined to relational data because such an induction
                         can be performed on spatial, multimedia, sequence, and other kinds of data sets.
                         In addition, there is no need to precompute a data cube because generalization
                         can be performed online upon receiving a user’s query.
                           Moreover, automated analysis can be added to such an induction process to
                         automatically filter out irrelevant or unimportant attributes. However, because
                         attribute-oriented induction automatically generalizes data to a higher level, it
                         cannot efficiently support the process of drilling down to levels deeper than
                         those provided in the generalized relation. The integration of data cube
                         technology with attribute-oriented induction may provide a balance between
                         precomputation and online computation. This would also support fast online
                         computation when it is necessary to drill down to a level deeper than that
                         provided in the generalized relation.
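
                           The online generalization described above can be illustrated with a minimal
                         sketch. It assumes the usual steps of attribute-oriented induction: an attribute
                         with no concept hierarchy is removed, the remaining attributes are generalized
                         to a higher level, and identical generalized tuples are merged while a count is
                         accumulated. The concept hierarchy, the age ranges, the sample tuples, and the
                         function names are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical concept hierarchy: each city generalizes to its country.
CITY_TO_COUNTRY = {
    "Vancouver": "Canada",
    "Toronto": "Canada",
    "Chicago": "USA",
    "Seattle": "USA",
}


def age_range(age):
    """Generalize a numeric age to a coarse, illustrative range label."""
    return "under_25" if age < 25 else "25_and_over"


def induce(tuples):
    """Generalize each (name, city, age) tuple and merge identical results.

    Returns a dict mapping each generalized tuple to its accumulated count.
    """
    counts = Counter()
    for _name, city, age in tuples:
        # Attribute removal: 'name' is dropped (assumed to have many distinct
        # values and no concept hierarchy to generalize it).
        generalized = (CITY_TO_COUNTRY[city], age_range(age))
        # Merging identical generalized tuples accumulates their count.
        counts[generalized] += 1
    return dict(counts)


# Hypothetical initial working relation.
students = [
    ("Jim", "Vancouver", 21),
    ("Scott", "Toronto", 23),
    ("Laura", "Seattle", 27),
    ("Ann", "Chicago", 22),
]

generalized_relation = induce(students)
for row, count in sorted(generalized_relation.items()):
    print(row, count)
```

                           Because the generalized relation keeps only country-level and range-level
                         values, the city and exact age of each tuple are no longer available from it.
                         This is why drilling down below the generalized level requires returning to the
                         base data or to a precomputed cube.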


                 4.6     Summary


                           A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile data
                           collection organized in support of management decision making. Several factors
                           distinguish data warehouses from operational databases. Because the two systems
                           provide quite different functionalities and require different kinds of data, it is
                           necessary to maintain data warehouses separately from operational databases.

                           Data warehouses often adopt a three-tier architecture. The bottom tier is a
                           warehouse database server, which is typically a relational database system. The
                           middle tier is an OLAP server, and the top tier is a client that contains query and
                           reporting tools.

                           A data warehouse contains back-end tools and utilities for populating and
                           refreshing the warehouse. These cover data extraction, data cleaning, data
                           transformation, loading, refreshing, and warehouse management.

                           Data warehouse metadata are data defining the warehouse objects. A metadata
                           repository provides details regarding the warehouse structure, data history, the
                           algorithms used for summarization, mappings from the source data to the warehouse
                           form, system performance, and business terms and issues.

                           A multidimensional data model is typically used for the design of corporate data
                           warehouses and departmental data marts. Such a model can adopt a star schema,
                           snowflake schema, or fact constellation schema. The core of the multidimensional
                           model is the data cube, which consists of a large set of facts (or measures) and a
                           number of dimensions. Dimensions are the entities or perspectives with respect to
                           which an organization wants to keep records and are hierarchical in nature.
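
                           As a rough illustration of the last point, the sketch below models a tiny star
                           schema as one fact table that references two dimension tables, and aggregates the
                           measure at different levels of the location hierarchy. All table contents, keys,
                           and the names LOCATION, TIME, SALES, and aggregate are hypothetical, and the
                           in-memory dictionaries stand in for what would normally be relational tables.

```python
from collections import defaultdict

# Hypothetical dimension tables: surrogate key -> attributes at each level.
LOCATION = {
    1: {"city": "Vancouver", "country": "Canada"},
    2: {"city": "Toronto", "country": "Canada"},
    3: {"city": "Chicago", "country": "USA"},
}
TIME = {
    10: {"quarter": "Q1", "year": 2010},
    11: {"quarter": "Q2", "year": 2010},
}

# Hypothetical fact table: (location_key, time_key, dollars_sold).
SALES = [
    (1, 10, 400.0),
    (2, 10, 300.0),
    (3, 10, 500.0),
    (1, 11, 250.0),
    (3, 11, 150.0),
]


def aggregate(facts, location_level, time_level):
    """Sum the measure for each combination of the chosen dimension levels."""
    cells = defaultdict(float)
    for location_key, time_key, dollars_sold in facts:
        group = (
            LOCATION[location_key][location_level],
            TIME[time_key][time_level],
        )
        cells[group] += dollars_sold
    return dict(cells)


# One view of the facts at the (city, quarter) level.
by_city = aggregate(SALES, "city", "quarter")

# Climbing the location hierarchy from city to country gives a coarser view.
by_country = aggregate(SALES, "country", "quarter")

for cell, total in sorted(by_country.items()):
    print(cell, total)
```

                           Each call produces one summarized view of the same facts. Moving from "city" to
                           "country" merges the Vancouver and Toronto cells into a single Canada cell per
                           quarter, which is what the hierarchical nature of a dimension makes possible.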