• Token matching or token transformation – this uses automated token matching via textual analysis, or leverages existing ontologies, to provide correspondences between previously unrelated ontological entities. Most of the tools in Table 8.2 use this kind of matching at some level (see the sketch after this list).
• Graph analysis of the ontology – this includes formal concept analysis (FCA), which uses graphs to link informationally related items (Yang and Feng 2012), as well as more general graph matching or analysis algorithms such as those in S-Match (Giunchiglia et al. 2012).
• Machine learning – examples include GLUE (Doan et al. 2004) and the more recent YAM++ (Ngo and Bellahsene 2012), both of which use machine learning to create correspondences between ontological elements.
• Information flow (IF) or semantic information content – this approach has been around for quite a while but remains largely theoretical (Barwise and Seligman 1997). Premised on the externalist assumption and on the assumption that information is veridical, it treats information and its relations using category theory (Kalfoglou and Schorlemmer 2003). There are, as yet, no practical implementations of this methodology.
• Some combination of all of the above.
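To give a flavour of the simplest of these, the following is a minimal sketch of label-based token matching in base R. The class labels and the 0.5 threshold are invented for illustration; the tools in Table 8.2 are considerably more sophisticated:

    # Pairwise label similarity between two invented sets of class labels,
    # using normalised Levenshtein distance (base R's adist)
    domain_labels <- c("LandParcel", "Farmer", "CropRotation")
    model_labels <- c("Patch", "FarmerAgent", "Rotation")
    d <- adist(domain_labels, model_labels)
    longest <- outer(nchar(domain_labels), nchar(model_labels), pmax)
    similarity <- 1 - d / longest
    dimnames(similarity) <- list(domain_labels, model_labels)
    # Propose correspondences where similarity exceeds a chosen threshold
    which(similarity > 0.5, arr.ind = TRUE)

In practice, a token matcher would also normalise case, split camel-case labels into their constituent words and consult synonym lists before comparing.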
Token matching and graph analysis are the most prevalent. A further review of the available systems and software for ontology interoperability can be found in Shvaiko and Euzenat (2013), and some older, but still useful, methodologies in Jean-Mary et al. (2009).
Potentially, therefore, the tools and infrastructure exist to evaluate interoperability between domain and model ontologies. The ‘validation data’ would comprise a pre-existing domain ontology not used to build the model, or a domain ontology obtained through a second knowledge elicitation exercise with experts or stakeholders. The model’s ontology could be extracted automatically (e.g. using tools such as Polhill’s (2015) NetLogo extension, or appropriately designed object-oriented programs exploiting one-to-one mappings from UML to OWL) or manually, and applications such as those in Table 8.2 then used to assess their interoperability. Such an exercise requires rather more effort than fit-to-data validation: the field is far from mature enough for this to be simply a matter of invoking a function in the appropriate R library, as in the examples in Appendix 2.
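As a toy illustration of what such an assessment might involve, the sketch below combines token and structural information, reusing the normalised Levenshtein similarity from the earlier sketch. All class names and weights are invented, and real matchers such as those in Table 8.2 exploit far more than this (instances, logical axioms, background ontologies):

    # Toy ontologies as (class, superclass) pairs; all names invented
    domain <- data.frame(sub = c("Farmer", "CropRotation"),
                         super = c("Agent", "Practice"),
                         stringsAsFactors = FALSE)
    model <- data.frame(sub = c("FarmerAgent", "Rotation"),
                        super = c("Turtle", "Practice"),
                        stringsAsFactors = FALSE)
    # Normalised Levenshtein label similarity
    sim <- function(a, b) 1 - adist(a, b) / outer(nchar(a), nchar(b), pmax)
    # Weight label similarity of classes with that of their superclasses,
    # a crude stand-in for using ontology structure as well as tokens
    combined <- 0.7 * sim(domain$sub, model$sub) +
                0.3 * sim(domain$super, model$super)
    dimnames(combined) <- list(domain$sub, model$sub)
    combined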
There is also the issue that the model is effectively assessed twice: once with respect to its fit-to-data (which is still information, even if arguably not dependable as a sole indicator of how ‘good’ a model is) and once with respect to its ontology. If we are not to assume that a richer ontology automatically leads to a better fit-to-data, the trade-off between fit-to-data and ontological interoperability is not a trivial choice to make. Even in more established model assessment metrics that use some information about model structure, the differences in parameter penalties between the AIC and the BIC illustrate the scope for potential controversy.
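To make the penalty difference concrete: AIC = 2k − 2 ln L while BIC = k ln(n) − 2 ln L, where k is the number of parameters, n the number of observations and L the maximised likelihood. BIC thus charges ln(n) per parameter against AIC’s constant 2, making it the stricter criterion whenever n > e² ≈ 7.4. A minimal base-R illustration (the two nested models here are arbitrary):

    # Two nested linear models on R's built-in cars data (n = 50, log(50) ≈ 3.9)
    m1 <- lm(dist ~ speed, data = cars)
    m2 <- lm(dist ~ poly(speed, 3), data = cars)  # two extra parameters
    AIC(m1); AIC(m2)  # per-parameter penalty of 2
    BIC(m1); BIC(m2)  # stiffer per-parameter penalty of log(n)

Indeed,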
Brewer et al. (2016) argue that the choice of which of these to use is sensitive