210 A. Evans et al.
Stata.¹ Entropy statistics can also be used to describe flows across networks. In this
sense they provide a valuable addition to network statistics: most network statistics
concentrate on structure rather than on the values of variables across the network. Unless
modellers are looking specifically at the formation of networks over time, or the relationship
between some other variable and network structure, they are relatively bereft
of techniques to look at variation on a network.
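As a rough illustration of how an entropy statistic might summarise flows on a network, the sketch below computes the Shannon entropy of the share of total flow carried by each link, scaled by its maximum so that 1 indicates flow spread evenly and values near 0 indicate flow concentrated on a few links. The function names, the link labels and the flow values are hypothetical, and the choice of normalised Shannon entropy is an assumption rather than the specific statistic discussed above.

```python
import math


def flow_entropy(flows):
    """Shannon entropy (in nats) of the share of total flow on each link.

    `flows` maps a link, e.g. an (origin, destination) pair, to a
    non-negative flow. Links with zero flow contribute nothing.
    """
    total = sum(flows.values())
    if total <= 0:
        raise ValueError("total flow must be positive")
    shares = [f / total for f in flows.values() if f > 0]
    return -sum(p * math.log(p) for p in shares)


def normalised_flow_entropy(flows):
    """Entropy divided by its maximum, log(number of links), giving 0..1."""
    if len(flows) < 2:
        return 0.0
    return flow_entropy(flows) / math.log(len(flows))


# Hypothetical flows on the four links of a small network.
even_flows = {("a", "b"): 10.0, ("b", "c"): 10.0, ("a", "c"): 10.0, ("c", "d"): 10.0}
skewed_flows = {("a", "b"): 37.0, ("b", "c"): 1.0, ("a", "c"): 1.0, ("c", "d"): 1.0}

print(normalised_flow_entropy(even_flows))    # flow spread evenly across links
print(normalised_flow_entropy(skewed_flows))  # flow dominated by one link
```

Both networks here have identical structure, so a purely structural statistic could not distinguish them; the entropy value separates them on the basis of the flows alone.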
In the case where variability is caused and constrained by neighbourhood effects,
we would expect the variation to be smoother across a region. We generally expect
objects in space under neighbourhood effects to obey Tobler’s first law of geography
(Tobler 1970) that everything is related, but closer things are related more. This
leads to spatial auto- or cross-correlation, in which the values of variables at a
point reflect those of their neighbours. Statistics for quantifying such spatial auto-
or cross-correlation at the global level, or for smaller regions, such as Moran’s I and
Geary’s C, are well established in the geography literature (e.g. Haining 1990); a
useful summary can be found in Getis (2007).
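To make the two global statistics concrete, the following is a minimal sketch of Moran's I and Geary's C in their standard textbook forms, applied to a row of six cells with binary neighbour weights. The helper `chain_weights`, the weighting scheme and the example values are assumptions made for illustration; real applications would usually rely on an established spatial statistics library and include significance testing.

```python
def morans_i(values, weights):
    """Global Moran's I: positive when neighbouring values are similar."""
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    num = sum(weights[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi * zi for zi in z)
    return (n / s0) * num / den


def gearys_c(values, weights):
    """Global Geary's C: below 1 for positive autocorrelation, above 1 for negative."""
    n = len(values)
    mean = sum(values) / n
    s0 = sum(sum(row) for row in weights)
    num = sum(weights[i][j] * (values[i] - values[j]) ** 2
              for i in range(n) for j in range(n))
    den = sum((v - mean) ** 2 for v in values)
    return ((n - 1) / (2 * s0)) * num / den


def chain_weights(n):
    """Binary contiguity weights for n cells in a row (hypothetical helper)."""
    w = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        w[i][i + 1] = w[i + 1][i] = 1.0
    return w


w = chain_weights(6)
smooth = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]       # values change gradually
alternating = [1.0, 6.0, 1.0, 6.0, 1.0, 6.0]  # neighbours are dissimilar

print(morans_i(smooth, w), gearys_c(smooth, w))            # I = 0.6, C is about 0.14
print(morans_i(alternating, w), gearys_c(alternating, w))  # I = -1.0, C is about 1.67
```

The smooth gradient behaves as Tobler's law would suggest (positive I, C below 1), while the alternating pattern gives the opposite signature.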
Such global statistics can be improved on by giving some notion of the direction
of change of the auto- or cross-correlation. Classically this is achieved through
semivariograms, which map out the intensity of correlation in each direction traversed
across a surface (for details, see Isaaks and Srivastava 1990). In the case where it
is believed that local relationships hold between variables, local linear correlations
can be determined, for example, using geographically weighted regression (GWR;
for details, see Fotheringham et al. 2002). GWR is a technique which allows the
mapping of R²s calculated within moving windows across a multivariate surface
and, indeed, mapping of the regression parameter weights. For example, it would
be possible in our retail results to produce a map of the varying relationship
between the amount of A purchased by customers and the population density, if
we believed these were related. GWR would reveal not just a global relationship,
but also how this relationship changed across a country. One
important but somewhat overlooked capability of GWR is its ability to assess how
the strength of correlations varies with scale by varying the window size. This can
be used to calculate the key scales at which there is sufficient overlap between the
geography of variables to generate strong relationships (though some care is needed
in interpreting such correlations, as correlation strength generally increases with
scale: Robinson 1950; Gehlke and Biehl 1934). Plainly, identifying the key scale at
which the correlations between variables improve gives us some ability to recognise
key distance scales at which causality plays out. In our example, we may be able to
see that the scale at which there is a strong relationship between sales of A and the
local population density increases as the population density decreases, suggesting
that rural consumers have to travel further and that there is a concomitant
non-linearity in the model components directing competition.
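The moving-window idea behind GWR can be sketched with a single predictor: at each location a kernel-weighted least-squares line is fitted, giving a local slope and a local R², and the fit is then repeated over a range of window sizes. The code below is a simplified illustration under stated assumptions, not a full GWR implementation: the Gaussian kernel, the function names, the bandwidth values and the synthetic "density" and "sales" figures are all hypothetical, and no bandwidth selection or significance testing is attempted.

```python
import math


def local_fit(coords, x, y, target, bandwidth):
    """Kernel-weighted least-squares fit of y = a + b*x centred on `target`.

    Returns (intercept, slope, local R^2). Weights decay with distance
    from `target` following a Gaussian kernel (an assumption).
    """
    w = [math.exp(-0.5 * (math.hypot(cx - target[0], cy - target[1]) / bandwidth) ** 2)
         for cx, cy in coords]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    syy = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, y))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("no local variation inside the window")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r2 = (sxy * sxy) / (sxx * syy)  # weighted R^2 for a one-predictor fit
    return intercept, slope, r2


def mean_local_r2(coords, x, y, bandwidth):
    """Average local R^2 over all locations for one window size."""
    fits = [local_fit(coords, x, y, c, bandwidth) for c in coords]
    return sum(f[2] for f in fits) / len(fits)


# Synthetic transect of 30 locations; the sales/density slope rises eastwards.
coords = [(float(i), 0.0) for i in range(30)]
density = [float((7 * i) % 11) for i in range(30)]
sales = [(0.5 + 0.05 * i) * d for i, d in enumerate(density)]

west = local_fit(coords, density, sales, coords[3], bandwidth=3.0)
east = local_fit(coords, density, sales, coords[26], bandwidth=3.0)
print("local slopes:", west[1], east[1])

# Repeat the fit over several window sizes to see how fit strength varies with scale.
r2_by_scale = {b: mean_local_r2(coords, density, sales, b) for b in (2.0, 4.0, 8.0, 16.0)}
print(r2_by_scale)
```

With a narrow window the western and eastern slopes differ, which is what a map of GWR parameters would show; with a very wide window every local fit converges on the global regression. Any comparison of R² across window sizes is subject to the caveat noted above about correlation strength changing with scale.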
¹ Confusingly, “generalised entropy” methods are also widely used in econometrics for the
estimation of missing data. Routines which provide this capability, e.g. in SAS, are not helpful
in the description of simulation model outputs!