Page 177 -

P. 177

11-ch04-125-186-9780123814791
HAN
2011/6/1
140 Chapter 4 Data Warehousing and Online Analytical Processing 3:17 Page 140 #16

time sales item
Dimension table Fact table Dimension table
time_key time_key item_key
day item_key item_name
day_of_the_week branch_key brand
month location_key type
quarter dollars_sold supplier_type
year units_sold

branch location
Dimension table Dimension table
branch_key location_key
branch_name street
branch_type city
province_or_state
country

Figure 4.6 Star schema of sales data warehouse.

with two measures: dollars sold and units sold. To minimize the size of the fact table,
dimension identiﬁers (e.g., time key and item key) are system-generated identiﬁers.

Notice that in the star schema, each dimension is represented by only one table, and
each table contains a set of attributes. For example, the location dimension table contains
the attribute set {location key, street, city, province or state, country}. This constraint may
introduce some redundancy. For example, “Urbana” and “Chicago” are both cities in the
state of Illinois, USA. Entries for such cities in the location dimension table will create
redundancy among the attributes province or state and country; that is, (..., Urbana, IL,
USA) and (..., Chicago, IL, USA). Moreover, the attributes within a dimension table may
form either a hierarchy (total order) or a lattice (partial order).
Snowﬂake schema: The snowﬂake schema is a variant of the star schema model,
where some dimension tables are normalized, thereby further splitting the data into
additional tables. The resulting schema graph forms a shape similar to a snowﬂake.

The major difference between the snowﬂake and star schema models is that the
dimension tables of the snowﬂake model may be kept in normalized form to reduce
redundancies. Such a table is easy to maintain and saves storage space. However, this
space savings is negligible in comparison to the typical magnitude of the fact table. Fur-
thermore, the snowﬂake structure can reduce the effectiveness of browsing, since more
joins will be needed to execute a query. Consequently, the system performance may be
adversely impacted. Hence, although the snowﬂake schema reduces redundancy, it is not
as popular as the star schema in data warehouse design.

172 173 174 175 176 177 178 179 180 181 182