

                                 Operational metadata, which include data lineage (history of migrated data and the
                                 sequence of transformations applied to it), currency of data (active, archived, or
                                 purged), and monitoring information (warehouse usage statistics, error reports, and
                                 audit trails).

                                 The algorithms used for summarization, which include measure and dimension
                                 definition algorithms, data on granularity, partitions, subject areas, aggregation,
                                 summarization, and predefined queries and reports.

                                 Mapping from the operational environment to the data warehouse, which includes
                                 source databases and their contents, gateway descriptions, data partitions, data
                                 extraction, cleaning, transformation rules and defaults, data refresh and purging
                                 rules, and security (user authorization and access control).

                                 Data related to system performance, which include indices and profiles that improve
                                 data access and retrieval performance, in addition to rules for the timing and
                                 scheduling of refresh, update, and replication cycles.

                                 Business metadata, which include business terms and definitions, data ownership
                                 information, and charging policies.
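To make the kinds of metadata listed above more concrete, here is a minimal Python sketch of what one repository entry might look like. All class names, field names, and sample values (such as `MetadataEntry`, `pos_db.orders`, and `sales_fact`) are hypothetical and chosen only for illustration; a real metadata repository would be considerably richer.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Currency(Enum):
    """Currency of data: active, archived, or purged."""
    ACTIVE = "active"
    ARCHIVED = "archived"
    PURGED = "purged"


@dataclass
class OperationalMetadata:
    source: str                  # hypothetical source the data were migrated from
    transformations: List[str]   # lineage: ordered transformations applied
    currency: Currency
    error_reports: List[str] = field(default_factory=list)  # monitoring information


@dataclass
class BusinessMetadata:
    term: str        # business term
    definition: str  # its definition
    owner: str       # data ownership information


@dataclass
class MetadataEntry:
    table: str                    # warehouse table the entry describes
    operational: OperationalMetadata
    business: BusinessMetadata
    summarization_algorithm: str  # how the measure is aggregated
    refresh_schedule: str         # timing rule for refresh cycles


def lineage(entry: MetadataEntry) -> str:
    """Render the lineage as source -> transformation steps -> warehouse table."""
    steps = [entry.operational.source, *entry.operational.transformations, entry.table]
    return " -> ".join(steps)


# Hypothetical entry, for illustration only.
entry = MetadataEntry(
    table="sales_fact",
    operational=OperationalMetadata(
        source="pos_db.orders",
        transformations=["clean_nulls", "convert_currency"],
        currency=Currency.ACTIVE,
    ),
    business=BusinessMetadata(
        term="dollars_sold",
        definition="Total sales amount in dollars",
        owner="Sales department",
    ),
    summarization_algorithm="sum(dollars_sold) grouped by day",
    refresh_schedule="nightly",
)

print(lineage(entry))  # pos_db.orders -> clean_nulls -> convert_currency -> sales_fact
```

Keeping operational and business metadata in separate records mirrors the distinction drawn in the list; it is a design choice of this sketch, not a requirement.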

                               A data warehouse contains several types of data, of which metadata are one.
                               Other types include current detailed data (which are almost always on disk), older
                               detailed data (which are usually on tertiary storage), lightly summarized data, and highly
                               summarized data (which may or may not be physically housed).
                                 Metadata play a very different role than other data warehouse data and are important
                               for many reasons. For example, metadata are used as a directory to help the decision
                               support system analyst locate the contents of the data warehouse, and as a guide to
                               the data mapping when data are transformed from the operational environment to the
                               data warehouse environment. Metadata also serve as a guide to the algorithms used for
                               summarization between the current detailed data and the lightly summarized data, and
                               between the lightly summarized data and the highly summarized data. Metadata should
                               be stored and managed persistently (i.e., on disk).
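The role of metadata as a guide to summarization can be sketched in a few lines of Python. The example below assumes a tiny, made-up set of detailed sales rows and a hypothetical `summarization_metadata` dictionary that records, for each level, which level it is derived from and how dates are grouped. It is an illustration of the idea, not a description of how any particular warehouse stores these rules.

```python
from collections import defaultdict

# Hypothetical current detailed data: one row per sale (date, item, dollars_sold).
detailed = [
    ("2011-06-01", "TV", 400.0),
    ("2011-06-01", "TV", 600.0),
    ("2011-06-02", "phone", 250.0),
    ("2011-07-15", "TV", 500.0),
]

# Hypothetical metadata: for each summarized level, the level it is derived
# from and the rule mapping a date string to its group key.
summarization_metadata = {
    "lightly_summarized": {
        "source": "detailed",
        "group_by": "day",
        "key": lambda date: date,        # keep the full date
    },
    "highly_summarized": {
        "source": "lightly_summarized",
        "group_by": "month",
        "key": lambda date: date[:7],    # keep only YYYY-MM
    },
}


def summarize(rows, key):
    """Sum dollars_sold per (group key, item), returning sorted rows."""
    totals = defaultdict(float)
    for date, item, dollars in rows:
        totals[(key(date), item)] += dollars
    return [(group, item, total) for (group, item), total in sorted(totals.items())]


# Each level is computed from the level named in its metadata.
light = summarize(detailed, summarization_metadata["lightly_summarized"]["key"])
high = summarize(light, summarization_metadata["highly_summarized"]["key"])

print(light)
print(high)
```

Because the grouping rules live in the metadata rather than in the summarization routine, an analyst can inspect how each level was derived without reading the aggregation code.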



                       4.2     Data Warehouse Modeling: Data Cube and OLAP

                               Data warehouses and OLAP tools are based on a multidimensional data model. This
                               model views data in the form of a data cube. In this section, you will learn how data cubes
                               model n-dimensional data (Section 4.2.1). In Section 4.2.2, various multidimensional
                               models are shown: star schema, snowflake schema, and fact constellation. You will also
                               learn about concept hierarchies (Section 4.2.3) and measures (Section 4.2.4) and how
                               they can be used in basic OLAP operations to allow interactive mining at multiple levels
                               of abstraction. Typical OLAP operations such as drill-down and roll-up are illustrated