112 Part II • Descriptive Analytics
The motivations that led to the development of data warehousing technologies go back to
the 1970s, when the computing world was dominated by mainframes. Real business
data-processing applications, the ones run on corporate mainframes, stored their data in
complicated file structures using early-generation databases (not the table-oriented
relational databases most applications use today). Although these applications did
a decent job of performing routine transactional data-processing functions, the data
created as a result of these functions (such as information about customers, the products
they ordered, and how much money they spent) was locked away in the depths of the
files and databases. When aggregated information such as sales trends by region and by
product type was needed, one had to formally request it from the data-processing
department, where the request was put on a waiting list with a couple hundred other
report requests (Hammergren and Simon, 2009). Even though the need for information and
the data that could be used to generate it both existed, the database technology to
satisfy that need did not. Figure 3.1 shows a timeline of some of the significant events
that led to the development of data warehousing.
Later in that decade, commercial hardware and software companies began to emerge
with solutions to this problem. Between 1976 and 1979, the concept for a new company,
Teradata, grew out of research at the California Institute of Technology (Caltech), driven
by discussions with Citibank’s advanced technology group. The founders worked to design
a database management system for parallel processing with multiple microprocessors,
targeted specifically at decision support. Teradata was incorporated on July 13, 1979, and
started in a garage in Brentwood, California. The name Teradata was chosen to symbolize
the ability to manage terabytes (trillions of bytes) of data.
The 1980s were the decade of personal computers and minicomputers. Before anyone
knew it, real computer applications were no longer only on mainframes; they were
all over the place, everywhere you looked in an organization. That led to a portentous
problem called islands of data. The proposed solution to this problem was a new type of
software, called a distributed database management system, which would magically pull the
requested data from databases across the organization, bring all the data back to the same
place, and then consolidate it, sort it, and do whatever else was necessary to answer the
user’s question. Although the concept was a good one and early results from research
were promising, the outcome was plain and simple: these systems just didn’t work
efficiently in the real world, and the islands-of-data problem still existed.
1970s: Mainframe computers; Simple data entry; Routine reporting; Primitive database structures; Teradata incorporated
1980s: Mini/personal computers (PCs); Business applications for PCs; Distributed DBMS; Relational DBMS; Teradata ships commercial DBs; Business Data Warehouse coined
1990s: Centralized data storage; Data warehousing was born; Inmon, Building the Data Warehouse; Kimball, The Data Warehouse Toolkit; EDW architecture design
2000s: Exponentially growing data Web data; Consolidation of DW/BI industry; Data warehouse appliances emerged; Business intelligence popularized; Data mining and predictive modeling; Open source software; SaaS, PaaS, Cloud computing
2010s: Big Data analytics; Social media analytics; Text and Web analytics; Hadoop, MapReduce, NoSQL; In-memory, in-database
Figure 3.1 A List of Events That Led to Data Warehousing Development.