Page 48 - Data Architecture

Chapter 1.3: The “Great Divide”

               Fig. 1.3.7 Textual disambiguation centric unstructured data.

           Textual disambiguation is the process of taking nonrepetitive unstructured data and
           manipulating it into a format that can be analyzed by standard analytic software. There
           are many facets to textual disambiguation, but perhaps the most important is
           “contextualization.” Contextualization is the process by which text is read and
           analyzed and its context is derived. Once the context is derived, the text is
           reformatted into a standard database format, where it can be read and analyzed by
           standard “business intelligence” software.
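
As a rough illustration of contextualization, the sketch below reads a sentence, looks up each word in a small taxonomy to derive its context, and writes the results as rows in a relational table that ordinary SQL tools can query. It is a minimal sketch under stated assumptions, not a description of any particular product: the `TAXONOMY` dictionary, the `contextualize` and `load_rows` functions, and the `term_context` table layout are all hypothetical names chosen for the example, and a dictionary lookup is only one of many possible ways to derive context.

```python
import re
import sqlite3

# Hypothetical taxonomy (an assumption for this sketch): maps a raw term
# to the context it implies. A real taxonomy would be far larger.
TAXONOMY = {
    "honda": "car",
    "toyota": "car",
    "oak": "tree",
    "elm": "tree",
}


def contextualize(doc_id, text):
    """Return (doc_id, char_pos, term, context) rows for every taxonomy hit."""
    rows = []
    for match in re.finditer(r"[A-Za-z]+", text):
        term = match.group().lower()
        context = TAXONOMY.get(term)
        if context is not None:
            # Keep the character position so the row can be traced to the source text.
            rows.append((doc_id, match.start(), term, context))
    return rows


def load_rows(rows):
    """Store the rows in an in-memory SQLite table (a standard database format)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE term_context ("
        "doc_id TEXT, char_pos INTEGER, term TEXT, context TEXT)"
    )
    conn.executemany("INSERT INTO term_context VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


rows = contextualize("note-1", "She parked the Honda under an oak.")
conn = load_rows(rows)

# Once the text is in table form, ordinary SQL can analyze it.
counts = conn.execute(
    "SELECT context, COUNT(*) FROM term_context "
    "GROUP BY context ORDER BY context"
).fetchall()
print(rows)
print(counts)
```

The point of the design is the hand-off: the free-form sentence goes in, and what comes out is a table of terms with their derived contexts, which is the kind of structure that standard analytic software expects.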


           The process of textual disambiguation is shown in Fig. 1.3.8.

               Fig. 1.3.8 From unstructured to structured data.


           There are many facets to textual disambiguation. Textual disambiguation is completely