114 Part II • Descriptive Analytics
deal with naming conflicts and discrepancies among units of measure. A data ware-
house is presumed to be totally integrated.
• Time variant (time series). A warehouse maintains historical data. The data
do not necessarily provide current status (except in real-time systems). They are
used to detect trends, deviations, and long-term relationships for forecasting and
comparisons, which in turn support decision making. Every data warehouse has a
temporal quality. Time is the one important dimension that all data warehouses must
support. Data for analysis from multiple sources contain multiple time points (e.g.,
daily, weekly, monthly views).
• Nonvolatile. After data are entered into a data warehouse, users cannot change or
update the data. Obsolete data are discarded, and changes are recorded as new data.
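The time-variant and nonvolatile characteristics can be illustrated with a minimal sketch. It assumes a hypothetical in-memory store (the names PriceRecord and AppendOnlyStore, the product ID, and the prices are invented for illustration and do not come from any real warehouse product). A change is recorded as a new dated row rather than as an update, so the history stays available for comparison. The discarding of obsolete data is not modeled here.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a stored row cannot be modified in place
class PriceRecord:
    product_id: str
    price: float
    effective_date: date  # the time dimension attached to every row


class AppendOnlyStore:
    """Hypothetical stand-in for a warehouse table: rows are added, never edited."""

    def __init__(self):
        self._rows = []

    def record(self, row):
        # A change in the source system arrives as a new row.
        self._rows.append(row)

    def history(self, product_id):
        # All recorded versions of a product's price, oldest first.
        matches = [r for r in self._rows if r.product_id == product_id]
        return sorted(matches, key=lambda r: r.effective_date)

    def as_of(self, product_id, when):
        # The price in effect on a given date, or None if nothing was recorded yet.
        earlier = [r for r in self.history(product_id) if r.effective_date <= when]
        return earlier[-1] if earlier else None


store = AppendOnlyStore()
store.record(PriceRecord("P-100", 19.99, date(2013, 1, 1)))
store.record(PriceRecord("P-100", 21.49, date(2013, 6, 1)))  # price change = new row

print(len(store.history("P-100")))                     # 2
print(store.as_of("P-100", date(2013, 3, 15)).price)   # 19.99
```

Because the earlier row is never overwritten, a query for any past date still returns the value that was valid at that time.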
These characteristics enable data warehouses to be tuned almost exclusively for data
access. Some additional characteristics may include the following:
• Web based. Data warehouses are typically designed to provide an efficient
computing environment for Web-based applications.
• Relational/multidimensional. A data warehouse uses either a relational struc-
ture or a multidimensional structure. A recent survey on multidimensional structures
can be found in Romero and Abelló (2009).
• Client/server. A data warehouse uses the client/server architecture to provide
easy access for end users.
• Real time. Newer data warehouses provide real-time, or active, data-access and
analysis capabilities (see Basu, 2003; and Bonde and Kuckuk, 2004).
• Include metadata. A data warehouse contains metadata (data about data) about
how the data are organized and how to effectively use them.
Whereas a data warehouse is a repository of data, data warehousing is literally the
entire process (see Watson, 2002). Data warehousing is a discipline that results in appli-
cations that provide decision support capability, allows ready access to business infor-
mation, and creates business insight. The three main types of data warehouses are data
marts, operational data stores (ODS), and enterprise data warehouses (EDW). In addition
to discussing these three types of warehouses next, we also discuss metadata.
Data Marts
Whereas a data warehouse combines databases across an entire enterprise, a data mart
is usually smaller and focuses on a particular subject or department. A data mart is a
subset of a data warehouse, typically consisting of a single subject area (e.g., marketing,
operations). A data mart can be either dependent or independent. A dependent data
mart is a subset that is created directly from the data warehouse. It has the advantages
of using a consistent data model and providing quality data. Dependent data marts sup-
port the concept of a single enterprise-wide data model, but the data warehouse must be
constructed first. A dependent data mart ensures that the end user is viewing the same
version of the data that is accessed by all other data warehouse users. The high cost of
data warehouses limits their use to large companies. As an alternative, many firms use a
lower-cost, scaled-down version of a data warehouse referred to as an independent data
mart. An independent data mart is a small warehouse designed for a strategic business
unit (SBU) or a department, but its source is not an EDW.
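The difference between the two kinds of data mart can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any particular product: the rows, subject areas, amounts, and the function build_dependent_mart are all hypothetical. The dependent mart is derived from the warehouse and therefore agrees with it; the independent mart is loaded from a separate source, so nothing guarantees agreement.

```python
# Hypothetical enterprise warehouse rows; subject areas and figures are invented.
warehouse = [
    {"subject": "marketing", "region": "East", "amount": 1200},
    {"subject": "operations", "region": "East", "amount": 800},
    {"subject": "marketing", "region": "West", "amount": 950},
]


def build_dependent_mart(warehouse_rows, subject):
    """Derive a mart from the warehouse by selecting a single subject area."""
    return [dict(row) for row in warehouse_rows if row["subject"] == subject]


# Dependent mart: created directly from the warehouse.
marketing_mart = build_dependent_mart(warehouse, "marketing")

# Independent mart: loaded from a department's own source, not from the warehouse.
independent_mart = [
    {"subject": "marketing", "region": "East", "amount": 1150},  # differs from warehouse
]

warehouse_total = sum(r["amount"] for r in warehouse if r["subject"] == "marketing")
dependent_total = sum(r["amount"] for r in marketing_mart)
independent_total = sum(r["amount"] for r in independent_mart)

print(dependent_total == warehouse_total)    # True
print(independent_total == warehouse_total)  # False
```

The first comparison holds by construction, which is the sense in which a dependent data mart shows the same version of the data as the warehouse.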
Operational Data Stores
An operational data store (ODS) provides a fairly recent form of customer information
file (CIF). This type of database is often used as an interim staging area for a data ware-
house. Unlike the static contents of a data warehouse, the contents of an ODS are updated
throughout the course of business operations. An ODS is used for short-term decisions