188 N. David et al.
in order to determine which ones have the greatest effect on output behaviour
(Evans et al. 2017). This information can be used to improve model accuracy and
reduce output variance (issues directly related to model validation) and also to
promote model parsimony by fixing inconsequential parameters and simplifying
assumptions, thereby reducing the dimensionality of the input parameter space and the model’s
computational cost (Law 2015; Lee et al. 2015). Conversely, sensitivity analysis
may also point to underspecified assumptions, which may require additional detail
in order to accurately represent some aspect of the target system (Law 2015). If the
output remains unpredictable even under controlled changes, the modeller should be
cautious about making claims based on the model.
A number of techniques for sampling the solution space are described in the
modelling and simulation literature. The one-factor-at-a-time (OFAT) approach is
one of the simplest sampling techniques. The effects of individual assumptions
(factors) on model behaviour are analysed in isolation by iterating each one over a
set of discretised levels while keeping the other factors unchanged (Lee et al. 2015).
Unfortunately, this technique ignores possible interactions between factors (Law
2015). This issue is handled by factorial-type designs, for which the different factor
levels are combined in specific configurations (e.g. full factorial, fractional factorial
or central composite designs) (Pereda et al. 2015). Space-filling designs are another
type of sampling technique, and aim to cover the solution space more evenly (Pereda
et al. 2015). Monte Carlo random sampling is probably the most common space-
filling approach, and consists of sampling each parameter range at random. However,
care should be taken with this approach, since clustered observations and empty
regions are bound to appear by chance. Space-filling alternatives such as quasi-Monte
Carlo or Latin Hypercube Sampling (McKay et al. 1979) cover the input space more
evenly and are often preferred. In turn, sampling based on meta-heuristics, such
as genetic algorithms, can search for pre-specified output behaviours. Thus, such
techniques are commonly used when the researcher wishes to estimate parameters
for calibration and/or optimisation purposes (Miller 1998; Calvez and Hutzler 2005;
Stonedahl and Wilensky 2010).
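The difference between the OFAT, full factorial and Latin Hypercube designs described above can be illustrated with a minimal Python sketch. The factor names (density, tolerance), their levels and ranges, and all function names are hypothetical and chosen only for illustration; they are not taken from any of the cited works.

```python
import itertools
import random


def ofat_design(baseline, levels):
    """One-factor-at-a-time: vary one factor over its levels, keep the rest at baseline."""
    design = []
    for name, values in levels.items():
        for value in values:
            point = dict(baseline)
            point[name] = value
            design.append(point)
    return design


def full_factorial_design(levels):
    """Full factorial: every combination of every factor level."""
    names = list(levels)
    combos = itertools.product(*(levels[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def latin_hypercube(ranges, n_samples, rng):
    """Latin Hypercube: split each range into n_samples equal strata and
    use every stratum exactly once per factor, in shuffled order."""
    columns = {}
    for name, (low, high) in ranges.items():
        strata = list(range(n_samples))
        rng.shuffle(strata)
        width = (high - low) / n_samples
        # One uniformly random point inside each stratum.
        columns[name] = [low + (s + rng.random()) * width for s in strata]
    return [{name: columns[name][i] for name in ranges} for i in range(n_samples)]


# Hypothetical factors, levels and ranges (assumptions for illustration only).
levels = {"density": [0.5, 0.7, 0.9], "tolerance": [0.2, 0.4, 0.6, 0.8]}
baseline = {"density": 0.7, "tolerance": 0.4}
ranges = {"density": (0.5, 0.9), "tolerance": (0.2, 0.8)}

ofat = ofat_design(baseline, levels)                   # 3 + 4 design points
factorial = full_factorial_design(levels)              # 3 * 4 design points
lhs = latin_hypercube(ranges, 5, random.Random(42))    # 5 stratified points
```

In this sketch the OFAT design never moves both factors away from the baseline at once, which is why it cannot reveal interactions, whereas the factorial design grows multiplicatively with the number of factors and levels. The Latin Hypercube keeps the number of points fixed while guaranteeing that each factor's range is covered stratum by stratum.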
Since the vast majority of the models of interest in social simulation are
stochastic, one should also consider the issue of having to perform several runs
with different seeds for each sampled assumption set in order to reduce the
uncertainty about the expected output value. Consequently, there is a trade-off
between assumption space coverage and output accuracy, which can severely limit
the exploration of models with long execution times (Pereda et al. 2015). This issue
can be minimised with the use of metamodels, which can act as computationally
inexpensive proxies of more complex models (Lee et al. 2015). A metamodel, or a
model of a model, can be used for predicting the original model’s response for non-
simulated assumption sets or finding combinations of assumptions that optimise (i.e.
minimise or maximise) a response (Law 2015). A metamodel usually takes the form
of a regression function relating inputs to an output response, typically a statistic
representative of model behaviour. Statistical learning techniques such as regression
analysis, Gaussian process modelling (Kriging), neural networks or random forests
are commonly used for building the metamodel function (Law 2015; Pereda et al.
2015).
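As a rough illustration of replicated runs and of a regression metamodel, the following Python sketch averages several seeded runs of a toy stochastic model at each design point and then fits a straight line to the averaged responses. The toy model, its coefficients, the design points and the number of replications are all assumptions made for this example; a real study would substitute its own simulation and would likely need a richer metamodel (e.g. Kriging or a random forest).

```python
import random
import statistics


def toy_model(x, seed):
    """Hypothetical stochastic simulation: linear response plus Gaussian noise."""
    rng = random.Random(seed)
    return 2.0 + 3.0 * x + rng.gauss(0.0, 0.5)


def mean_response(x, n_runs, base_seed):
    """Average the output of n_runs replications, each run with a different seed."""
    return statistics.fmean(toy_model(x, base_seed + r) for r in range(n_runs))


def fit_linear_metamodel(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar = statistics.fmean(xs)
    y_bar = statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical design: 5 input levels, 200 replications each (1000 runs in total).
design = [0.0, 0.25, 0.5, 0.75, 1.0]
n_runs = 200
responses = [
    mean_response(x, n_runs, base_seed=1000 * i)  # disjoint seed blocks per point
    for i, x in enumerate(design)
]

intercept, slope = fit_linear_metamodel(design, responses)


def predict(x):
    """Cheap proxy for the expected model output at a non-simulated input."""
    return intercept + slope * x
```

The total budget here is the number of design points multiplied by the number of replications, which makes the trade-off between coverage and output accuracy explicit. Once fitted, `predict` can be evaluated at inputs that were never simulated, such as `predict(0.6)`, at negligible cost.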