

            in order to determine which ones have the greatest effect on output behaviour
            (Evans et al. 2017). This information can be used to improve model accuracy and
            reduce output variance—issues directly related to model validation—and also to
            promote model parsimony by fixing inconsequential parameters and simplifying
            assumptions, thus reducing the dimensionality of the input parameter space and the model's
            computational cost (Law 2015; Lee et al. 2015). Conversely, sensitivity analysis
            may also point to underspecified assumptions, which may require additional detail
            in order to accurately represent some aspect of the target system (Law 2015). If the
            output remains unpredictable even under controlled changes, the modeller should
            be cautious about making claims based on the model.
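            As a toy illustration, the sketch below estimates the share of output variance
            each input accounts for in a hypothetical two-parameter model, by comparing the
            total output variance with the variance remaining when one input is fixed. The
            model, ranges and nominal values are assumptions, and the estimator is only a
            crude first-order approximation.

```python
import numpy as np

rng = np.random.default_rng(42)

def model(alpha, beta):
    """Hypothetical stand-in for a simulation output statistic."""
    return alpha ** 2 + np.sin(beta)

# Sample both inputs over their assumed ranges.
n = 10_000
alpha = rng.uniform(0.0, 1.0, n)
beta = rng.uniform(0.0, np.pi, n)
total_var = model(alpha, beta).var()

# Crude first-order estimate: the variance that disappears when one
# input is held at an assumed nominal value is attributed to that input.
for name, fixed_output in [("alpha", model(0.5, beta)),
                           ("beta", model(alpha, 1.0))]:
    share = 1.0 - fixed_output.var() / total_var
    print(f"output variance attributable to {name}: ~{share:.0%}")
```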
              A number of techniques for sampling the solution space are described in the
            modelling and simulation literature. The one-factor-at-a-time (OFAT) approach is
            one of the simplest sampling techniques. The effects of individual assumptions
            (factors) on model behaviour are analysed in isolation by iterating each one over a
            set of discretised levels while keeping the other factors unchanged (Lee et al. 2015).
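            A minimal OFAT sketch for a hypothetical two-parameter stochastic model might
            look as follows; the model function, baseline values and levels are assumptions
            for demonstration purposes only.

```python
import numpy as np

def model(alpha, beta, seed):
    """Hypothetical stand-in for a single stochastic simulation run."""
    rng = np.random.default_rng(seed)
    return alpha ** 2 + np.sin(beta) + rng.normal(scale=0.1)

# Assumed baseline values and discretised levels for each factor.
baseline = {"alpha": 0.5, "beta": 1.0}
levels = {"alpha": np.linspace(0.0, 1.0, 5),
          "beta": np.linspace(0.0, np.pi, 5)}

# OFAT: vary one factor at a time, keeping the others at baseline.
for factor, factor_levels in levels.items():
    for level in factor_levels:
        params = dict(baseline, **{factor: level})
        output = model(params["alpha"], params["beta"], seed=1)
        print(f"{factor}={level:.2f} -> output={output:.3f}")
```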
            Unfortunately, this technique ignores possible interactions between factors (Law
            2015). This issue is handled by factorial-type designs, in which the different factor
            levels are combined in specific configurations (e.g. full factorial, fractional factorial
            or central composite designs) (Pereda et al. 2015).
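            A full factorial design simply enumerates every combination of factor levels, as
            in the following sketch, which reuses the hypothetical model and assumed levels
            of the OFAT example.

```python
import itertools
import numpy as np

def model(alpha, beta, seed):
    """Hypothetical stand-in for a single stochastic simulation run."""
    rng = np.random.default_rng(seed)
    return alpha ** 2 + np.sin(beta) + rng.normal(scale=0.1)

# Assumed discretised levels for each factor.
alpha_levels = np.linspace(0.0, 1.0, 3)
beta_levels = np.linspace(0.0, np.pi, 3)

# Full factorial design: simulate every combination of factor levels,
# so interactions between factors show up in the results.
for alpha, beta in itertools.product(alpha_levels, beta_levels):
    output = model(alpha, beta, seed=1)
    print(f"alpha={alpha:.2f}, beta={beta:.2f} -> output={output:.3f}")
```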
            Space-filling designs are another type of sampling technique, and aim to cover the
            solution space more evenly (Pereda et al. 2015). Monte Carlo random sampling is
            probably the most common space-filling approach, and consists of sampling each
            parameter range randomly. However, care should be taken with this approach, since
            clustered observations and empty spaces are bound to appear by chance. Space-filling
            alternatives such as quasi-Monte Carlo or Latin Hypercube Sampling (McKay et al.
            1979) cover the input space more evenly and are often preferred.
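            For instance, a Latin hypercube design can be generated with SciPy's quasi-Monte
            Carlo utilities; the two parameters, their bounds and the sample size below are
            assumptions for illustration.

```python
import numpy as np
from scipy.stats import qmc

# Latin Hypercube Sampling in the unit hypercube (two factors assumed).
sampler = qmc.LatinHypercube(d=2, seed=42)
unit_sample = sampler.random(n=10)

# Rescale each column to the assumed parameter ranges:
# alpha in [0, 1] and beta in [0, pi].
sample = qmc.scale(unit_sample, l_bounds=[0.0, 0.0], u_bounds=[1.0, np.pi])

# Each row is one assumption set to simulate; every marginal range
# is covered evenly by construction.
print(sample)
```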
            In turn, sampling based on meta-heuristics, such as genetic algorithms, can search
            for pre-specified output behaviours. Thus, such techniques are commonly used when
            the researcher wishes to estimate parameters for calibration and/or optimisation
            purposes (Miller 1998; Calvez and Hutzler 2005; Stonedahl and Wilensky 2010).
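            A minimal genetic algorithm for calibration might look like the sketch below; the
            model, the target output statistic and all GA settings (population size, mutation
            scale, number of generations) are assumptions for illustration, not a definitive
            implementation.

```python
import numpy as np

rng = np.random.default_rng(42)

def model(alpha, beta, seed):
    """Hypothetical stand-in for a single stochastic simulation run."""
    local = np.random.default_rng(seed)
    return alpha ** 2 + np.sin(beta) + local.normal(scale=0.1)

TARGET = 1.5  # assumed pre-specified output behaviour to search for

def fitness(params):
    """Negative distance between model output and the target statistic."""
    alpha, beta = params
    return -abs(model(alpha, beta, seed=1) - TARGET)

# Initial population: random points in the assumed parameter ranges.
pop = rng.uniform([0.0, 0.0], [1.0, np.pi], size=(20, 2))

for generation in range(50):
    scores = np.array([fitness(p) for p in pop])
    # Selection: keep the better half of the population.
    parents = pop[np.argsort(scores)[-10:]]
    # Crossover: average randomly paired parents.
    idx = rng.integers(0, 10, size=(10, 2))
    children = (parents[idx[:, 0]] + parents[idx[:, 1]]) / 2
    # Mutation: small Gaussian perturbation, clipped to the ranges.
    children += rng.normal(scale=0.05, size=children.shape)
    children = np.clip(children, [0.0, 0.0], [1.0, np.pi])
    pop = np.vstack([parents, children])

best = pop[np.argmax([fitness(p) for p in pop])]
print("calibrated parameters:", best)
```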
              Since the vast majority of the models of interest in social simulation are
            stochastic, one should also consider the issue of having to perform several runs
            with different seeds for each sampled assumption set in order to reduce the
            uncertainty about the expected output value. Consequently, there is a trade-off
            between assumption space coverage and output accuracy, which can severely limit
            the exploration of models with long execution times (Pereda et al. 2015).
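            For a single assumption set, the replication step might look like the following
            sketch, where the model and the number of replications are assumptions; more
            replications shrink the standard error but multiply the total computational cost.

```python
import numpy as np

def model(alpha, beta, seed):
    """Hypothetical stand-in for a single stochastic simulation run."""
    rng = np.random.default_rng(seed)
    return alpha ** 2 + np.sin(beta) + rng.normal(scale=0.1)

# Replicate one sampled assumption set with different seeds.
outputs = np.array([model(0.5, 1.0, seed=s) for s in range(30)])

# Mean output and its standard error across replications.
mean = outputs.mean()
sem = outputs.std(ddof=1) / np.sqrt(len(outputs))
print(f"expected output: {mean:.3f} +/- {sem:.3f}")
```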
              This issue can be minimised with the use of metamodels, which can act as
            computationally inexpensive proxies of more complex models (Lee et al. 2015). A
            metamodel, or a
            model of a model, can be used for predicting the original model’s response for non-
            simulated assumption sets or finding combinations of assumptions that optimise (i.e.
            minimise or maximise) a response (Law 2015). A metamodel usually takes the form
            of a regression function relating inputs to an output response, typically a statistic
            representative of model behaviour. Statistical learning techniques such as regression
            analysis, Gaussian process modelling (Kriging), neural networks or random forests
            are commonly used for building the metamodel function (Law 2015; Pereda et al.
            2015).
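            As an example, a Gaussian process (Kriging) metamodel could be fitted to a
            handful of simulated design points with scikit-learn; the model function, design
            and kernel settings below are assumptions carried over from the sketches above.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def model(alpha, beta, seed):
    """Hypothetical stand-in for a single stochastic simulation run."""
    rng = np.random.default_rng(seed)
    return alpha ** 2 + np.sin(beta) + rng.normal(scale=0.1)

# Simulated design points (e.g. from a Latin hypercube) and responses.
rng = np.random.default_rng(42)
X = rng.uniform([0.0, 0.0], [1.0, np.pi], size=(25, 2))
y = np.array([model(a, b, seed=1) for a, b in X])

# Fit the metamodel: a cheap regression proxy for the simulation.
gp = GaussianProcessRegressor(kernel=RBF(), alpha=1e-2).fit(X, y)

# Predict the response at a non-simulated assumption set,
# together with an uncertainty estimate.
pred, std = gp.predict(np.array([[0.3, 2.0]]), return_std=True)
print(f"predicted response: {pred[0]:.3f} +/- {std[0]:.3f}")
```

            In practice, the metamodel would typically be fitted to responses averaged over
            several replications per design point, rather than to single runs as in this sketch.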