8.1 Introduction
The chapter argues for the importance of the ontological structure in social
simulation – that is, what basic entities exist, their attributes and their relationships
with each other. In particular, simply getting a good fit of the outcomes to data
is not enough to establish the adequacy of the model. To make this point vivid,
it considers the opposite extreme, an example of a machine learning algorithm
where the ‘model’ is simply induced from the data – where there is the minimum
predefined ontological structure. The example chosen is that of neural networks,
though almost any black-box machine learning approach would have done as
well.
Neural networks are universal function approximators (Hornik et al. 1989). This
means that given a set of data, they can approximate it to within an arbitrary degree
of accuracy simply by adding more parameters. Though it may seem strange to
compare neural networks with agent-based models for the purposes of validation
and generalization, there are useful lessons from so doing that illustrate where agent-
based models add value to traditional modelling approaches and why validation
is not so straightforward. The main contrast between neural networks and agent-
based models comes down to the ‘ontology’. Essentially, apart from the labels
assigned to the input and output units of a neural network, neural networks don’t
have an ontology at all. What they do have is a mathematical structure that allows
the number of parameters to be arbitrarily varied and, with that, arbitrary degrees
of fit to a set of data to be achieved. By contrast, agent-based models have a rich
and highly descriptive ontology but, like neural networks, potentially have a large
number of parameters that can be varied (especially if we consider each agent
uniquely).
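To make this concrete, the following is a minimal sketch in Python (using numpy); the target function, network size and training settings are illustrative assumptions of our own rather than anything drawn from the chapter or the literature. It shows a one-hidden-layer network whose fit to a fixed data set typically improves as more hidden units – and hence more parameters – are added.

# Illustrative sketch: a one-hidden-layer tanh network trained by plain
# gradient descent on y = sin(x). Training error typically shrinks as the
# number of hidden units (parameters) grows. All settings are assumptions
# made for illustration only.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(x)

def fit_mlp(n_hidden, epochs=10000, lr=0.02):
    """Train a 1-n_hidden-1 network and return its final training MSE."""
    W1 = rng.normal(0, 1, (1, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 1, (n_hidden, 1))
    b2 = np.zeros(1)
    for _ in range(epochs):
        h = np.tanh(x @ W1 + b1)      # hidden-layer activations
        yhat = h @ W2 + b2            # linear output unit
        err = yhat - y
        # Backpropagate the squared-error loss
        gW2 = h.T @ err / len(x)
        gb2 = err.mean(axis=0)
        dh = (err @ W2.T) * (1 - h ** 2)
        gW1 = x.T @ dh / len(x)
        gb1 = dh.mean(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return np.mean((np.tanh(x @ W1 + b1) @ W2 + b2 - y) ** 2)

for n in (1, 3, 10, 30):
    print(f"hidden units = {n:3d}  training MSE = {fit_mlp(n):.5f}")

Running the loop for increasing numbers of hidden units shows the training error shrinking towards zero – which is all that fit-to-data on its own can demonstrate, and is the point the chapter goes on to press against.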
In this chapter, we examine some approaches to validation and generalization in
neural networks and consider what they tell us about agent-based modelling. We
argue that validation needs to look beyond the relatively trivial question
of fit-to-data, especially in non-ergodic complex systems. Rather than being a
weakness of agent-based modelling, the challenges of validation and generalization
point to its strengths, especially in social systems, where the language used to
describe them is influenced by evolving cultural considerations.
The chapter starts with an introduction to neural networks, followed by an
account of how they are calibrated and validated. It then discusses the issue at
the heart of the chapter: the importance of predetermined model bias – that is,
the imposed structure derived from knowledge about what is being modelled. It
uses a particular measure (the VC dimension) to show that the amount of data
needed to infer a good model without imposing such a bias is typically
infeasibly large. It then summarizes the various measures one might use for
checking fit-to-data. This paves the way for a discussion of validating
ontologies, covering a number of approaches and the tools that might be useful
for this.