Page 187 - Data Architecture

Chapter 4.7: Taxonomies
           But the largest advantage of a commercially created taxonomy is that it does not
           require a large investment in building the taxonomy. An organization that decides to
           create its own taxonomies manually is inviting disaster, because it cannot reliably
           estimate how much effort is required to build and maintain the taxonomies that it needs.


           Dynamics of Taxonomies and Textual Disambiguation



           The dynamics of how a taxonomy interacts with textual disambiguation are illustrated in
           the simple example seen in Fig. 4.7.6.

























               Fig. 4.7.6 The application of a taxonomy to raw text.


           In Fig. 4.7.6, raw text is shown. The raw text is passed against two taxonomies: one for
           car and one for motor thoroughfare. The output shows that where the word
           “Porsche” is encountered, it is recognized to be part of the taxonomy for car. The word
           “Porsche” is changed to the expression “Porsche/car” in the output. The same processing
           occurs for “Volkswagen” and “Honda.”


           Using the taxonomy for thoroughfare, the term “highway” is seen to be a form of “road.”
           The output for “highway” is written out as “highway/road.”
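
The substitution described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method used by any particular textual disambiguation product. The taxonomy contents, the function names, and the sample sentence are assumptions made for the example, since the raw text of the figure is not reproduced here.

```python
import re

# Hypothetical taxonomies for illustration: each category lists the terms
# that belong to it. Real commercial taxonomies are far larger.
TAXONOMIES = {
    "car": ["Porsche", "Volkswagen", "Honda"],
    "road": ["highway"],
}


def build_lookup(taxonomies):
    """Invert the taxonomies into a lowercase term -> category lookup."""
    return {
        term.lower(): category
        for category, terms in taxonomies.items()
        for term in terms
    }


def apply_taxonomies(text, taxonomies):
    """Rewrite every recognized term as 'term/category'."""
    lookup = build_lookup(taxonomies)
    if not lookup:
        return text
    # Longest terms first, so a longer term wins over a shorter one
    # that is a prefix of it.
    terms = sorted(lookup, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    # Keep the word as it appeared in the raw text and append its category.
    return pattern.sub(
        lambda m: f"{m.group(0)}/{lookup[m.group(0).lower()]}", text
    )


if __name__ == "__main__":
    raw = "She drove her Porsche down the highway."  # invented sample text
    print(apply_taxonomies(raw, TAXONOMIES))
```

The sketch matches whole words only, so a plural such as "highways" would pass through unchanged. Handling inflections, multiword terms, and terms that belong to more than one taxonomy is part of what makes real usage more elaborate.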


           The example in the figure is very simple, but it serves to illustrate how a taxonomy
           interacts with raw text inside the textual disambiguation process. In practice,
           taxonomies are usually applied in much more sophisticated and elaborate ways than in
           this simple example.


           The use of taxonomies described here is the key to opening the door to