212 A. Evans et al.
between wave amplitude and propagation distance. Viboud et al. (2006) provide
a particularly nice example of such a use, looking at the strength of the propagation
of influenza epidemics as influenced by city size and average human travel distances
in the USA. Other more traditional statistics, such as the Rayleigh statistic (Fisher
et al. 1987; Korie et al. 1998), can also be used to assess the significance of diffusion
from point sources.
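As an illustration, the circular (two-dimensional) form of the Rayleigh test can be
sketched in a few lines. This is a minimal sketch, not necessarily the formulation used
in the works cited above: it assumes the data are bearings, in radians, from a
hypothetical point source to each affected location, and it uses the simple large-sample
approximation p ≈ exp(-Z). The function name `rayleigh_test` is our own.

```python
import cmath
import math


def rayleigh_test(angles):
    """Rayleigh test of uniformity for directions given in radians.

    Returns (mean resultant length, Z statistic, approximate p-value).
    The p-value uses the first-order large-sample approximation exp(-Z),
    which is an assumption of this sketch and is crude for small samples.
    """
    n = len(angles)
    if n == 0:
        raise ValueError("need at least one direction")
    # Treat each direction as a unit vector in the complex plane and sum them.
    resultant = sum(cmath.exp(1j * a) for a in angles)
    r_bar = abs(resultant) / n   # 0 = no preferred direction, 1 = all identical
    z = n * r_bar ** 2           # Rayleigh's Z statistic
    p = math.exp(-z)             # approximate p-value under uniformity
    return r_bar, z, p


# Hypothetical bearings: spread roughly evenly, so no directional bias is expected.
bearings = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
r_bar, z, p = rayleigh_test(bearings)
```

A small p-value suggests that spread away from the source has a preferred direction;
a mean resultant length near zero suggests diffusion that is roughly isotropic.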
Beyond global and regional aggregate statistics of single variables or
cross-correlations, model outputs may simply have too many dimensions for patterns
to be recognised and related to model inputs. At this point it becomes necessary
to reduce the dimensionality, for instance through multidimensional scaling. If each
individual has more than four characteristics, multidimensional scaling methods can
be used to represent the individuals in two or three dimensions. In essence, the problem is
to represent the relation between individuals such that those which are most similar
in n-dimensions still appear to be closest in a lower-dimensional space which can
be visualised more easily. One popular technique is Sammon mapping. This
method relies on optimising an error function which relates the original distances
between individuals in high-dimensional space to their distances in the transformed
space. This can be achieved using standard optimisation methods within packages
such as MATLAB or using a number of bespoke R packages. Multidimensional scaling
can be useful in
visualising the relative position of different individuals within a search space, for
exploring variations in a multi-criteria objective function within a parameter space
or for comparing individual search paths within different simulations (Pohlheim
2006).
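The idea of optimising an error function can be made concrete with a small sketch.
The code below minimises Sammon's stress, which compares each original pairwise
distance with its low-dimensional counterpart, weighted by the inverse of the original
distance. It is illustrative only: it uses plain gradient descent with a simple step-size
adjustment rather than the pseudo-Newton update of Sammon's original method, it assumes
all input points are distinct, and the function names are our own.

```python
import math
import random


def pairwise(points):
    """Euclidean distance matrix for a list of equal-length tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def sammon_stress(high_d, low_pts):
    """Sammon's error between original distances and a low-dimensional layout."""
    n = len(low_pts)
    total = 0.0
    scale = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            d_low = math.dist(low_pts[i], low_pts[j])
            total += (high_d[i][j] - d_low) ** 2 / high_d[i][j]
            scale += high_d[i][j]
    return total / scale


def sammon_map(points, dims=2, iters=200, step=1.0, seed=0):
    """Return (layout, stress). Assumes no two input points coincide."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    high_d = pairwise(points)
    n = len(points)
    scale = sum(high_d[i][j] for i in range(n) for j in range(i + 1, n))
    low = [[rng.uniform(-1, 1) for _ in range(dims)] for _ in range(n)]
    stress = sammon_stress(high_d, low)
    for _ in range(iters):
        # Gradient of the stress with respect to every low-dimensional coordinate.
        grad = [[0.0] * dims for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                d_low = max(math.dist(low[i], low[j]), 1e-12)
                coeff = -2.0 * (high_d[i][j] - d_low) / (high_d[i][j] * d_low * scale)
                for k in range(dims):
                    grad[i][k] += coeff * (low[i][k] - low[j][k])
        trial = [[low[i][k] - step * grad[i][k] for k in range(dims)] for i in range(n)]
        trial_stress = sammon_stress(high_d, trial)
        if trial_stress < stress:
            low, stress = trial, trial_stress  # accept the step and grow it slightly
            step *= 1.2
        else:
            step *= 0.5                        # reject the step and shrink it
    return low, stress


# Hypothetical agents, each described by five characteristics.
agents = [(0, 0, 0, 0, 0), (1, 0, 0, 0, 1), (0, 2, 0, 1, 0),
          (3, 1, 1, 0, 0), (1, 1, 4, 0, 2), (2, 0, 1, 3, 1)]
layout, stress = sammon_map(agents)
```

The returned `layout` holds one two-dimensional position per agent, ready for a
scatter plot; the final `stress` indicates how faithfully distances were preserved,
with zero meaning a perfect match.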
Eigenvector methods offer another route to reducing dimensionality. Any
representation of data in n-dimensional space can be transformed into an
equivalent space defined by n orthogonal eigenvectors. The main significance of
this observation is that the principal eigenvector constitutes the most efficient way
to summarise a multidimensional space along a single dimension. For example,
Moon et al. (2006) use the concept of “eigenvector centrality”
within a social network to compute a univariate measure of relative position based
on a number of constituent factors.
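To show how a principal eigenvector yields a single score per individual, the sketch
below computes eigenvector centrality for a small undirected network by power iteration.
It follows the standard textbook definition and is not a reconstruction of the
calculation in Moon et al. (2006); the function name and the example network are
assumptions made for illustration.

```python
import math


def eigenvector_centrality(adjacency, iters=1000, tol=1e-10):
    """Principal eigenvector of a symmetric adjacency matrix by power iteration.

    Assumes a connected, undirected network. Returns a unit-length vector of scores.
    """
    n = len(adjacency)
    x = [1.0] * n
    for _ in range(iters):
        # Multiply by (A + I): this has the same eigenvectors as A, and the shift
        # prevents the iteration from oscillating on bipartite networks.
        new = [x[i] + sum(adjacency[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in new))
        new = [v / norm for v in new]
        if max(abs(a - b) for a, b in zip(new, x)) < tol:
            return new
        x = new
    return x


# Hypothetical star network: agent 0 is linked to agents 1, 2 and 3.
star = [[0, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0]]
scores = eigenvector_centrality(star)
```

In this example the hub receives the highest score and the three peripheral agents
receive equal, lower scores, so each agent's position in the network is reduced to
one number.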
Eigenvector analyses, however, can be nonintuitive to those not used to them.
Somewhat simpler presentations of multidimensional data can be made using clustering
techniques. These collapse multidimensional data so that individual cases are
members of a single group or cluster, classified on the basis of a similarity metric.
The method may therefore be appropriate if the modeller wishes to understand the
distribution of an output variable in relation to the combination of several input
variables. Cluster analysis is easy to implement in all the major statistics packages
(R, SAS, SPSS). The technique is likely to be most useful in empirical applications
with a relatively large number of agent characteristics (i.e. six or more) rather than
in idealised simulations with simple agent rules. One advantage of this technique
over others is that it is possible to represent statistical variation within the cluster
space, for example, by displaying the interquartile variation in the attribute variable
within clusters.
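A minimal sketch of this workflow follows: agents are grouped on six input
characteristics and the interquartile range of an output variable is then reported
for each cluster. The choice of k-means (Lloyd's algorithm) is an assumption, since no
particular clustering method is prescribed here, and the data, function names and the
seeding of centroids from the first k cases are purely illustrative.

```python
import math
import statistics


def kmeans(points, k, iters=100):
    """Lloyd's algorithm; the first k points seed the centroids (an arbitrary choice)."""
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assign every case to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def quartiles_by_cluster(labels, values):
    """Map each cluster label to (Q1, median, Q3) of an output variable.

    Each cluster is assumed to contain at least two cases.
    """
    summary = {}
    for c in sorted(set(labels)):
        vals = [v for v, lab in zip(values, labels) if lab == c]
        summary[c] = tuple(statistics.quantiles(vals, n=4))
    return summary


# Hypothetical agents with six characteristics each, plus one output per agent.
agents = [(0, 0, 0, 0, 0, 0), (10, 10, 10, 10, 10, 10),
          (1, 0, 0, 1, 0, 0), (9, 10, 10, 9, 10, 10),
          (0, 1, 0, 0, 1, 0), (10, 9, 10, 10, 9, 10),
          (0, 0, 1, 0, 0, 1), (10, 10, 9, 10, 10, 9)]
outputs = [1.0, 10.0, 2.0, 20.0, 3.0, 30.0, 4.0, 40.0]

labels, centroids = kmeans(agents, k=2)
spread = quartiles_by_cluster(labels, outputs)
```

The resulting quartiles can be drawn as one box per cluster, which shows the variation
of the output within each group rather than only its central value.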