
112  Part II  •  Descriptive Analytics

                                         The motivations that led to developing data warehousing technologies go back to
                                    the 1970s, when the computing world was dominated by mainframes. Real business
                                    data-processing applications, the ones run on the corporate mainframes, had complicated
                                    file structures using early-generation databases (not the table-oriented relational databases
                                    most applications use today) in which they stored data. Although these applications did
                                    a decent job of performing routine transactional data-processing functions, the data cre-
                                    ated as a result of these functions (such as information about customers, the products
                                    they ordered, and how much money they spent) was locked away in the depths of the
                                    files and databases. When aggregated information such as sales trends by region and by
                                    product type was needed, one had to formally request it from the data-processing depart-
                                    ment, where it was put on a waiting list with a couple hundred other report requests
                                    (Hammergren and Simon, 2009). Even though the need for information and the data that
                                    could be used to generate it existed, the database technology was not there to satisfy it.
                                    Figure 3.1 shows a timeline of some of the significant events that led to the development
                                    of data warehousing.
                                         Later in that decade, commercial hardware and software companies began to emerge
                                    with solutions to this problem. Between 1976 and 1979, the concept for a new company,
                                    Teradata, grew out of research at the California Institute of Technology (Caltech), driven
                                    by discussions with Citibank’s advanced technology group. The founders worked to design
                                    a database management system for parallel processing with multiple microprocessors,
                                    targeted specifically for decision support. Teradata was incorporated on July 13, 1979, and
                                    started in a garage in Brentwood, California. The name Teradata was chosen to symbolize
                                    the ability to manage terabytes (trillions of bytes) of data.
                                         The 1980s were the decade of personal computers and minicomputers. Before any-
                                    one knew it, real computer applications were no longer only on mainframes; they were
                                    all over the place—everywhere you looked in an organization. That led to a portentous
                                    problem called islands of data. The solution to this problem led to a new type of soft-
                                    ware, called a distributed database management system, which would magically pull the
                                    requested data from databases across the organization, bring all the data back to the same
                                    place, and then consolidate it, sort it, and do whatever else was necessary to answer the
                                    user’s question. Although the concept was a good one and early research results were
                                    promising, the outcome was plain and simple: these systems just didn’t work efficiently in the
                                    real world, and the islands-of-data problem still existed.


                                    1970s: Mainframe computers; Simple data entry; Routine reporting; Primitive database
                                           structures; Teradata incorporated
                                    1980s: Mini/personal computers (PCs); Business applications for PCs; Distributed DBMS;
                                           Relational DBMS; Teradata ships commercial DBs; Business Data Warehouse coined
                                    1990s: Centralized data storage; Data warehousing was born; Inmon, Building the Data
                                           Warehouse; Kimball, The Data Warehouse Toolkit; EDW architecture design
                                    2000s: Exponentially growing data, Web data; Consolidation of DW/BI industry; Data
                                           warehouse appliances emerged; Business intelligence popularized; Data mining
                                           and predictive modeling; Open source software; SaaS, PaaS, Cloud computing
                                    2010s: Big Data analytics; Social media analytics; Text and Web analytics; Hadoop,
                                           MapReduce, NoSQL; In-memory, in-database

                                    Figure 3.1  A List of Events That Led to Data Warehousing Development.







