Page 59 - Data Architecture

Chapter 1.4: Demographics of Corporate Data
           created. And suppose you are looking for phone calls relating to terrorism. Out of the
           millions and millions of phone calls made, only a handful will relate to activities of
           terrorism.


           The same phenomenon is true of clickstream data, analog data, metering data, and so
           forth. There are, however, records that are not directly business-relevant but are
           potentially business-relevant. These records are not immediately useful to the
           business but could become useful under other circumstances.


           Now, let's consider the business relevancy of nonrepetitive unstructured data.
           Nonrepetitive unstructured data are made up of records such as e-mail, call center data,
           conversations, and insurance claims. Fig. 1.4.8 depicts nonrepetitive unstructured data.





















               Fig. 1.4.8 Business relevancy.


           In nonrepetitive unstructured data, there are data such as spam, blather, and stop words.
           These types of data are not business-relevant. But much of the data found in the
           nonrepetitive unstructured category are business-relevant (or are at least potentially
           business-relevant).
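
As a rough illustration of separating the non-relevant material from the rest, the following minimal sketch drops spam records and strips stop words from what remains. The stop-word list, the spam markers, and the function names are hypothetical and chosen only for this example; a real system would use far richer rules, and this is not a method the text prescribes.

```python
# Illustrative sketch only: STOP_WORDS and SPAM_MARKERS are made-up
# assumptions for this example, not lists taken from the text.
STOP_WORDS = {"a", "an", "and", "the", "to", "of", "is", "for"}
SPAM_MARKERS = ("free money", "click here")


def is_spam(text: str) -> bool:
    """Return True if the record contains any assumed spam marker."""
    lowered = text.lower()
    return any(marker in lowered for marker in SPAM_MARKERS)


def strip_stop_words(text: str) -> list[str]:
    """Lowercase the text, trim punctuation, and drop stop words."""
    words = (w.strip(".,!?;:").lower() for w in text.split())
    return [w for w in words if w and w not in STOP_WORDS]


def relevant_terms(records: list[str]) -> list[list[str]]:
    """Keep non-spam records and reduce each to its remaining terms."""
    return [strip_stop_words(r) for r in records if not is_spam(r)]


if __name__ == "__main__":
    sample = [
        "Click here for FREE MONEY!",
        "The claim for the policy is approved.",
    ]
    print(relevant_terms(sample))
```

What survives the filter is the text that is business-relevant, or at least potentially so, and it is this remainder that is worth analyzing further.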


           Now, let's stop and take a look at the demographics of business relevancy as they relate
           to unstructured data (big data). Fig. 1.4.9 shows where business relevancy lies.














