
It is important to recall that the Neyman–Pearson theory of testing is essentially designed to provide the researcher with a decision rule guaranteeing, in the long run, a specified error probability under the null hypothesis. The decision rule equating the rejection of H0 with the occurrence of a value of t inside A makes sure that, when a large number of tests are performed, the null hypothesis is incorrectly rejected 100α percent of the time, but it does not guarantee a good performance in the case of a single test. Otherwise stated, in the Neyman–Pearson approach a controlled long-run performance is obtained if the researcher chooses α and A and decides on the basis of whether t belongs to A or not (or, equivalently, on the basis of whether the p-value is larger than α or not). In general, it is also expected that the researcher sets a value of β, chooses a value of d on the basis of experience or pilot runs, and computes N from these values.
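To make the decision rule concrete, the following Python sketch (not taken from the chapter; the data, the choice of a two-sample t-test and all numerical values are placeholder assumptions) compares the outputs of two simulation conditions and shows that rejecting H0 when t falls in the rejection region A is the same as rejecting it when the p-value does not exceed α.

```python
import numpy as np
from scipy import stats

alpha = 0.05                        # Type I error probability, fixed in advance

# Placeholder outputs of 50 runs under two simulation conditions
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=50)   # condition 1
y = rng.normal(0.4, 1.0, size=50)   # condition 2

# Two-sided two-sample t-test of H0: the two conditions have equal means
t, p = stats.ttest_ind(x, y)

# Rejection region A = {|t| >= t_crit}; deciding with A is equivalent to
# deciding with the p-value, i.e. rejecting H0 exactly when p <= alpha
t_crit = stats.t.ppf(1 - alpha / 2, df=len(x) + len(y) - 2)
reject_by_region = abs(t) >= t_crit
reject_by_pvalue = p <= alpha
assert reject_by_region == reject_by_pvalue
print(f"t = {t:.3f}, p = {p:.4f}, reject H0: {bool(reject_by_pvalue)}")
```

Under this logic, α is fixed before seeing the data, and the output of the procedure is the binary decision itself rather than a graded measure of evidence.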
However, this is not the way in which tests are generally performed in practice. Indeed, it is customary that the researcher computes the test statistic t and the p-value and uses the latter as a measure of the support for the null hypothesis. For example, it is quite common that a p-value just under 5% is treated differently from a p-value under 1%, the latter providing a stronger evidential value against the null hypothesis. This is so widespread that some researchers do not report the p-value but only p < 1% or p < 5%. From the point of view of the Neyman–Pearson theory of testing this is nonsensical. However, it has entered common practice and has evolved into an approach of its own, different from both the Fisher and the Neyman–Pearson approaches, yet gathering aspects of both, called Null-Hypothesis Significance Testing (NHST). This approach takes from the Fisher approach the emphasis on the p-value and its disregard for power; from the Neyman–Pearson theory, it takes the emphasis on the threshold values of α.
In this chapter, we follow more closely the original Neyman–Pearson theory than the NHST. The elements of this approach are the two probabilities of error α and β, a measure of the effect under scrutiny or of the distance between the alternative and the null hypothesis, d, and the sample size N. These quantities are linked by some equations. We will see below that determining a value of N amounts to choosing values for the quantities α, β and d, whose interpretation is generally simpler than that of N.
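As a hypothetical illustration of such a link (the chapter develops the equations it actually uses below), for a two-sided one-sample test of a standardized effect of size d, the usual normal approximation gives

N \approx \left( \frac{z_{1-\alpha/2} + z_{1-\beta}}{d} \right)^2 ,

where z_q denotes the q-quantile of the standard normal distribution; once α, β and d are fixed, N follows (for two independent samples, the right-hand side is multiplied by 2 and gives the number of runs per condition).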




            11.4 The Use of Power in Practice: Two Examples

            In order to show how power analysis can help to determine the number of runs in
            a simulation, we decided to select a model and to proceed with some calculations.
            The simulation we selected for this computational exercise is an agent-based model
            that was developed by Fioretti and Lomi (2008, 2010) on the basis of the famous
“garbage can” model (Cohen et al. 1972), hereafter GCM.
              There are several reasons that led to the selection of this ABM. One of the
            obvious reasons is that it describes a very well-known model that informed the