The reason is that ABM studies are often simplified representations of reality. Therefore, the effect of a treatment is rarely the desired outcome, as the value obtained from an ABM will generally not match the value observed in reality. Second, even when the outcome of an ABM study is of interest in itself, it is rarely the case that one has a precise idea of what the width of a confidence interval should be. This may be different whenever the outcome variable is measured on a well-known scale, as is often the case in the disciplines in which AIPE is an established alternative to power analysis. Schönbrodt and Perugini (2013) (see also Lakens and Evers 2014) provide an interesting example, based on Cohen (1988), of how to determine the width of an interval, but this seems difficult to generalize to other situations.
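To see why a target width matters, consider the simplest AIPE setting, in which the number of observations is chosen so that the confidence interval for a mean attains a prespecified width. The following is a minimal sketch, assuming the large-sample formula for the confidence interval of a mean; the standard deviation and the target width are hypothetical inputs that the modeler must supply, which is precisely the difficulty noted above.

```python
# Minimal AIPE sketch: smallest n such that the two-sided confidence interval
# for a mean, of width 2 * z * sigma / sqrt(n), is no wider than a target width.
# sigma and width are hypothetical values chosen for illustration.
import math
from scipy import stats

def aipe_n(sigma, width, conf=0.95):
    z = stats.norm.ppf(1 - (1 - conf) / 2)   # two-sided critical value
    return math.ceil((2 * z * sigma / width) ** 2)

# Example: outcome standard deviation of 10, desired interval width of 2.
print(aipe_n(sigma=10, width=2))             # -> 385
```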
11.5.2 Concluding Remarks
The message of this chapter is that statistical power analysis can help modelers refine their ideas on how many times their ABM simulation should be performed. We first offered a few notes on the importance of determining the number of runs, and then turned our attention to the types of models that would benefit the most from this approach. The focus then moved to testing theory, so as to provide an appropriate statistical background for the approach. Finally, some practical examples showed the risks and perils of under- or over-estimating the number of runs in a simulation. The implications were then discussed further at the beginning of this section.
As a way to summarize this chapter and, at the same time, help modelers clarify what under- and over-power imply, Table 11.4 shows power calculations for α = 0.01 and 1 − β = 0.95, using the formula that we developed, which also appears in the Appendix.
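For readers who wish to reproduce this kind of calculation, the following is a minimal sketch in Python. It assumes the standard one-way ANOVA power computation, with the G parameter configurations as groups and Cohen's f as the effect size; the Appendix formula may differ in its approximations, so the resulting counts need not match Table 11.4 exactly. The function name and the simple search loop are illustrative choices.

```python
# Sketch: smallest number of runs per parameter configuration such that a
# one-way ANOVA across G configurations reaches the target power.
# Assumes Cohen's effect size f and the noncentral-F power computation;
# this illustrates the kind of calculation behind Table 11.4, not the
# Appendix formula itself.
from scipy import stats

def runs_per_configuration(G, f, alpha=0.01, power=0.95, n_max=10**6):
    for n in range(2, n_max + 1):                  # n >= 2 for a within-group variance
        df1, df2 = G - 1, G * (n - 1)              # ANOVA degrees of freedom
        nc = G * n * f**2                          # noncentrality parameter
        f_crit = stats.f.ppf(1 - alpha, df1, df2)  # critical value under H0
        if stats.ncf.sf(f_crit, df1, df2, nc) >= power:
            return n
    raise ValueError("target power not reached within n_max runs")
```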
The left column in Table 11.4 shows the hypothetical number of parameter configurations (or groups G) that a potential ABM could have. Determining the appropriate number of configurations is a complex issue that falls beyond the scope of this chapter; however, sensitivity and steady-state analyses can provide sound support (Thiele et al. 2015). The table reports the number of runs necessary to reach 1 − β = 0.95 at α = 0.01 for six different effect sizes: ultra-micro (0.01), micro (0.05), small (0.1), medium (0.2), large (0.4), and huge (0.8). These calculations confirm, in greater detail, that small simulations with few configurations of parameters (up to 10) need to be performed many times unless the effect size is large or very large. As the number of configurations grows, the number of runs to perform decreases markedly, to the point where one run per configuration is enough when the runs are spread across very many configurations (1,000 and up) in the presence of large and very large effect sizes.
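The pattern just described can be checked with the sketch above, here for a large effect size; the exact counts depend on the formula actually used.

```python
# Runs per configuration shrink as the number of configurations G grows,
# shown here for a large effect size (f = 0.4); uses runs_per_configuration
# from the sketch above.
for G in (5, 100, 1000):
    print(G, runs_per_configuration(G, f=0.4))
```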