Page 48 - Data Architecture

Chapter 1.3: The “Great Divide”

Fig. 1.3.7 Textual disambiguation centric unstructured data.

Textual disambiguation is the process of taking nonrepetitive unstructured data and
manipulating it into a format that can be analyzed by standard analytic software. There
are many facets to textual disambiguation, but perhaps the most important is the one
called “contextualization.” Contextualization is the process of reading and analyzing
text in order to derive its context. Once the context is derived, the text is reformatted
into a standard database format, where it can be read and analyzed by standard
“business intelligence” software.
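
To make the idea of contextualization concrete, here is a minimal sketch in Python. It is not the method the book describes, only one simple way to illustrate it. It assumes that context can be derived by looking words up in a small taxonomy. The taxonomy contents, the function names contextualize and load, and the table name term_context are all hypothetical and chosen only for illustration.

```python
import re
import sqlite3

# Hypothetical taxonomy: maps a raw word to the context it implies.
# A real taxonomy would be far larger; this one is illustrative only.
TAXONOMY = {
    "honda": "car",
    "toyota": "car",
    "aspirin": "medication",
}


def contextualize(doc_id, text):
    """Yield one (doc_id, char_offset, word, context) row per taxonomy hit."""
    for match in re.finditer(r"[A-Za-z]+", text):
        word = match.group().lower()
        context = TAXONOMY.get(word)
        if context is not None:
            yield (doc_id, match.start(), word, context)


def load(rows):
    """Store the rows in an in-memory SQLite table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE term_context ("
        "doc_id TEXT, char_offset INTEGER, word TEXT, context TEXT)"
    )
    conn.executemany("INSERT INTO term_context VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


# Unstructured text in, structured rows out.
rows = list(contextualize("note-1", "Patient drove a Honda and takes aspirin daily."))
conn = load(rows)

# Once the text is in a table, ordinary SQL can analyze it.
counts = conn.execute(
    "SELECT context, COUNT(*) FROM term_context "
    "GROUP BY context ORDER BY context"
).fetchall()
print(counts)
```

Once each word sits in a row next to its derived context, the text can be queried like any other structured data, which is what allows standard analytic tools to work with it.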

The process of textual disambiguation is shown in Fig. 1.3.8.

Fig. 1.3.8 From unstructured to structured data.

There are many facets to textual disambiguation. Textual disambiguation is completely