Page 47 - Data Architecture

Chapter 1.3: The “Great Divide”










































               Fig. 1.3.6 An infinite amount of data.


           The emphasis, then, is on performing the normal tasks of data management in the Hadoop
           environment, where each process must be able to handle very large amounts of data.



           Nonrepetitive Unstructured Data



           The emphasis in the nonrepetitive unstructured environment is quite different from the
           emphasis on managing Hadoop big data technology. In the nonrepetitive
           unstructured environment, the emphasis is on “textual disambiguation” (or
           “textual ETL”). This emphasis is shown in Fig. 1.3.7.














