Chapter 4 Data Warehousing and Online Analytical Processing
forward at considerably less expense and to evaluate the technological benefits before
making significant commitments. In the combined approach, an organization can
exploit the planned and strategic nature of the top-down approach while retaining the
rapid implementation and opportunistic application of the bottom-up approach.
From the software engineering point of view, the design and construction of a data
warehouse may consist of the following steps: planning, requirements study, problem
analysis, warehouse design, data integration and testing, and finally deployment of the
data warehouse. Large software systems can be developed using one of two
methodologies: the waterfall method or the spiral method. The waterfall method performs
a structured and systematic analysis at each step before proceeding to the next, much
like a waterfall falling from one step to the next. The spiral method involves the rapid
generation of increasingly functional systems, with short intervals between successive
releases. The spiral method is considered a good choice for data warehouse development,
especially for data marts, because the turnaround time is short, modifications can be
made quickly, and new designs and technologies can be adapted in a timely manner.
In general, the warehouse design process consists of the following steps:
1. Choose a business process to model (e.g., orders, invoices, shipments, inventory,
account administration, sales, or the general ledger). If the business process is
organizational and involves multiple complex object collections, a data warehouse model
should be followed. However, if the process is departmental and focuses on the
analysis of one kind of business process, a data mart model should be chosen.
2. Choose the business process grain, which is the fundamental, atomic level of data
to be represented in the fact table for this process (e.g., individual transactions
or individual daily snapshots).
3. Choose the dimensions that will apply to each fact table record. Typical dimensions
are time, item, customer, supplier, warehouse, transaction type, and status.
4. Choose the measures that will populate each fact table record. Typical measures are
numeric additive quantities like dollars sold and units sold.
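
The four design steps above can be sketched as a small star schema. The following is a minimal illustration only, assuming a sales process, a grain of one row per individual transaction, three of the typical dimensions (time, item, customer), and the two measures named in step 4. The table names, column names, and sample rows are hypothetical and are not taken from the text; SQLite is used purely for convenience.

```python
import sqlite3


def build_sales_star(conn):
    """Create a tiny star schema: one fact table plus three dimension tables."""
    conn.executescript("""
        -- Step 3: dimensions that apply to each fact record (hypothetical columns).
        CREATE TABLE dim_time     (time_key INTEGER PRIMARY KEY, day TEXT, quarter TEXT);
        CREATE TABLE dim_item     (item_key INTEGER PRIMARY KEY, item_name TEXT);
        CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT);

        -- Steps 1 and 2: business process = sales; grain = one row per transaction.
        CREATE TABLE fact_sales (
            time_key     INTEGER REFERENCES dim_time(time_key),
            item_key     INTEGER REFERENCES dim_item(item_key),
            customer_key INTEGER REFERENCES dim_customer(customer_key),
            dollars_sold REAL,    -- Step 4: additive measure
            units_sold   INTEGER  -- Step 4: additive measure
        );
    """)


def load_sample(conn):
    """Insert a few made-up rows so the roll-up below has something to sum."""
    conn.executemany("INSERT INTO dim_time VALUES (?, ?, ?)",
                     [(1, "2011-01-15", "Q1"), (2, "2011-04-02", "Q2")])
    conn.executemany("INSERT INTO dim_item VALUES (?, ?)",
                     [(1, "TV"), (2, "phone")])
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                     [(1, "Ann"), (2, "Bob")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
                     [(1, 1, 1, 500.0, 1),
                      (1, 2, 2, 300.0, 2),
                      (2, 1, 2, 1000.0, 2)])


def totals_by_quarter(conn):
    """Roll the additive measures up from transaction grain to quarter."""
    return conn.execute("""
        SELECT t.quarter, SUM(f.dollars_sold), SUM(f.units_sold)
        FROM fact_sales AS f
        JOIN dim_time AS t ON f.time_key = t.time_key
        GROUP BY t.quarter
        ORDER BY t.quarter
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_sales_star(conn)
load_sample(conn)
print(totals_by_quarter(conn))  # [('Q1', 800.0, 3), ('Q2', 1000.0, 2)]
```

Because the measures are additive, summing them along any dimension (here, time rolled up to quarter) gives a meaningful total. Choosing a coarser grain, such as daily snapshots, would shrink the fact table but would lose the ability to drill down to individual transactions.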
Because data warehouse construction is a difficult and long-term task, its
implementation scope should be clearly defined. The goals of an initial data warehouse
implementation should be specific, achievable, and measurable. This involves
determining the time and budget allocations, the subset of the organization that is to be
modeled, the number of data sources selected, and the number and types of departments
to be served.
Once a data warehouse is designed and constructed, the initial deployment of
the warehouse includes initial installation, roll-out planning, training, and
orientation. Platform upgrades and maintenance must also be considered. Data warehouse
administration includes data refreshment, data source synchronization, planning for
disaster recovery, managing access control and security, managing data growth,
managing database performance, and data warehouse enhancement and extension. Scope