Page 177 -
P. 177

11-ch04-125-186-9780123814791
                         HAN
                                                            2011/6/1
          140   Chapter 4 Data Warehousing and Online Analytical Processing  3:17 Page 140  #16



                               time               sales               item
                           Dimension table       Fact table       Dimension table
                         time_key              time_key           item_key
                         day                   item_key           item_name
                         day_of_the_week       branch_key         brand
                         month                 location_key       type
                         quarter               dollars_sold       supplier_type
                         year                  units_sold




                             branch                                  location
                          Dimension table                         Dimension table
                          branch_key                             location_key
                          branch_name                            street
                          branch_type                            city
                                                                 province_or_state
                                                                 country


               Figure 4.6 Star schema of sales data warehouse.


                         with two measures: dollars sold and units sold. To minimize the size of the fact table,
                         dimension identifiers (e.g., time key and item key) are system-generated identifiers.

                           Notice that in the star schema, each dimension is represented by only one table, and
                         each table contains a set of attributes. For example, the location dimension table contains
                         the attribute set {location key, street, city, province or state, country}. This constraint may
                         introduce some redundancy. For example, “Urbana” and “Chicago” are both cities in the
                         state of Illinois, USA. Entries for such cities in the location dimension table will create
                         redundancy among the attributes province or state and country; that is, (..., Urbana, IL,
                         USA) and (..., Chicago, IL, USA). Moreover, the attributes within a dimension table may
                         form either a hierarchy (total order) or a lattice (partial order).
                         Snowflake schema: The snowflake schema is a variant of the star schema model,
                           where some dimension tables are normalized, thereby further splitting the data into
                           additional tables. The resulting schema graph forms a shape similar to a snowflake.

                           The major difference between the snowflake and star schema models is that the
                         dimension tables of the snowflake model may be kept in normalized form to reduce
                         redundancies. Such a table is easy to maintain and saves storage space. However, this
                         space savings is negligible in comparison to the typical magnitude of the fact table. Fur-
                         thermore, the snowflake structure can reduce the effectiveness of browsing, since more
                         joins will be needed to execute a query. Consequently, the system performance may be
                         adversely impacted. Hence, although the snowflake schema reduces redundancy, it is not
                         as popular as the star schema in data warehouse design.
   172   173   174   175   176   177   178   179   180   181   182