Page 175 - Data Architecture

Chapter 4.6: Textual Disambiguation



















               Fig. 4.6.10 Filtering emails.


           Spreadsheets



           Spreadsheets are another special case. They are ubiquitous. Sometimes the information on a
           spreadsheet is purely numerical, but on other occasions a spreadsheet also holds
           character-based information. As a rule, textual disambiguation does not process numerical
           information from a spreadsheet, because there are no metadata that accurately describe
           the numeric values. (Note: the numbers on a spreadsheet may have formulas behind them, but
           spreadsheet formulas are almost worthless as metadata descriptions of what the numbers
           mean.) For this reason, the only spreadsheet data that make their way into textual ETL
           are the character-based descriptive data.


           To this end, an interface allows the useful data on the spreadsheet to be formatted
           into a working database. From the working database, the data are then sent into textual
           disambiguation, as seen in Fig. 4.6.11.
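
           The flow from spreadsheet to working database can be illustrated with a minimal
           sketch. It assumes the spreadsheet has been exported as CSV text and that the working
           database is an SQLite table. The names is_numeric, load_text_cells, and working_text are
           hypothetical, as is the rule used to decide what counts as a number. This is not the
           interface the text refers to, only one way such a filter might look.

```python
import csv
import io
import re
import sqlite3

# Assumption: a cell counts as "numerical" if it is a plain number such as
# 42, -7, 3.5 or 1,200. Dates, codes and mixed text count as character data.
_NUMERIC = re.compile(r"[-+]?(\d[\d,]*(\.\d*)?|\.\d+)")


def is_numeric(value: str) -> bool:
    """Return True when the whole cell is a plain number."""
    return _NUMERIC.fullmatch(value) is not None


def load_text_cells(csv_text: str, conn: sqlite3.Connection, source: str) -> int:
    """Copy only the character-based cells into the working table.

    Returns the number of cells kept. Table and column names are hypothetical.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS working_text "
        "(source TEXT, row_num INTEGER, col_num INTEGER, cell_text TEXT)"
    )
    kept = 0
    for row_num, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        for col_num, cell in enumerate(row, start=1):
            text = cell.strip()
            # Skip empty cells and purely numerical cells.
            if not text or is_numeric(text):
                continue
            conn.execute(
                "INSERT INTO working_text VALUES (?, ?, ?, ?)",
                (source, row_num, col_num, text),
            )
            kept += 1
    conn.commit()
    return kept


# Made-up sample data standing in for an exported spreadsheet.
SAMPLE = (
    "item,qty,note\n"
    "widget,12,customer asked for rush delivery\n"
    "gasket,3.5,\n"
)

if __name__ == "__main__":
    working_db = sqlite3.connect(":memory:")
    load_text_cells(SAMPLE, working_db, "demo.csv")
    for record in working_db.execute("SELECT * FROM working_text"):
        print(record)
```

           Each kept cell carries its source name, row, and column, so that a later step can still
           trace the text back to its place on the spreadsheet. The numeric cells are simply left
           behind, in line with the rule described above.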

























