
220                                                     A. Evans et al.

            3. Tracking the causal processes through the model.
              It may seem obvious, and yet it is worth pointing out, that model outputs can only
            causally relate to model inputs, not additional data in the real world. Plainly, insights
            into the system can come from comparison with external data that is correlated or
            miscorrelated with model outputs, but this is not the same as understanding your
            model and the way it currently represents the system. One would imagine that this
            means that understanding of a model cannot be facilitated by comparing it with
            other, external, data, and yet it can often be worth:
            4. Comparing model results with real-world data, because the relationships between
              real data and both model inputs and model outputs may be clearer than the
              relationships between these two things within the model.
              Let’s imagine, for example, a model that predicts the location of burglaries across
            a day in a city region where police corruption is rife. The model inputs are known
            offenders’ homes, potential target locations and attractiveness, the position of the
            owners of these targets and the police, who prefer to serve the wealthy. We may
            be able to recognise a pattern of burglaries that moves, over the course of the day,
            from the suburbs to the city centre. Although we have built into our model the fact
            that police respond faster to richer people, we may find, using (1), that our model
            doesn’t show fewer burglaries in rich areas, because the rich areas are so spatially
            distributed that the police response times are stretched between them. We can then
            alter the weighting of the bias away from the wealthy (2) to see if it actually reduces
            the burglary rate in the rich areas by placing police nearer these neighbourhoods
            as an ancillary effect of responding to poor people more. We may be able to fully
            understand this aspect of the model and how it arises (3), but still have a higher
            than expected burglary rate in wealthy areas. Finally, it may turn out (4) that there
            is a strong relationship between these burglaries and real data on petrol sales, for no
            other reason than both are high at transition times in this social system, when the
            police would be most stretched between regions—suggesting in turn that the change
            in police locations over time is as important as their positions at any one time.
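The comparison in (4) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' method: the two hourly series below are invented for the example (they are not real burglary or petrol data), and the helper names `pearson` and `lagged_correlation` are hypothetical. The idea is simply to correlate a model output with an external series, and to repeat the comparison at a few time lags, since the relationship described above depends on timing.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences of at least 2 values")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)


def lagged_correlation(model, external, max_lag):
    """Correlate model[t] with external[t - lag] for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        # Drop the first `lag` model values and the last `lag` external values
        # so that the two slices stay aligned and equal in length.
        result[lag] = pearson(model[lag:], external[:len(external) - lag])
    return result


# ASSUMPTION: made-up hourly values for one day, for illustration only.
# Both series are constructed to peak around commuting hours.
burglaries = [1, 1, 0, 0, 1, 2, 5, 8, 9, 4, 3, 3,
              3, 3, 4, 5, 8, 9, 7, 4, 3, 2, 2, 1]
petrol_sales = [2, 1, 1, 1, 2, 4, 9, 14, 15, 8, 6, 6,
                7, 6, 7, 9, 14, 16, 12, 7, 5, 4, 3, 2]

r = pearson(burglaries, petrol_sales)
lags = lagged_correlation(burglaries, petrol_sales, max_lag=3)
best_lag = max(lags, key=lags.get)
print(f"r = {r:.2f}, strongest correlation at a lag of {best_lag} hour(s)")
```

A high coefficient here would say nothing about causation; as in the petrol example, it may only point to a shared driver, such as the time of day, that is worth tracing back through the model.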
              Let us look at each of these methodologies for developing understanding in turn.
            Correlation
              Most social scientists will be familiar with linear regression as a
            means for describing data or testing for a relationship between two variables; there is
            a long scientific tradition of correlating data between models and external variables,
            and this tradition is equally applicable to intra-model comparisons. Correlating
            datasets is one of the areas where automation can be applied. As an exploratory
            tool, regression modelling has its attractions, not least its simplicity in both concept
            and execution. Simple regressions can be achieved in desktop applications like
            Microsoft Excel, as well as all the major statistical packages (R, SAS, SPSS, etc.).
            Standard methodologies are well known for cross-correlation of both continuous
            normal data and time series. However, even for simple analyses with a single
            input and single output variable, linear regression is not always an appropriate
            technique. For example, logistic regression models will be more appropriate for
            binary response data, Poisson models will be superior when values in the dependent