
It is important to recall that the Neyman–Pearson theory of testing is essentially designed to provide the researcher with a decision rule guaranteeing, in the long run, a specified error probability under the null hypothesis. The decision rule equating the rejection of H0 with the occurrence of a value of t inside A makes sure that, when a large number of tests are performed, the null hypothesis is incorrectly rejected 100α percent of the time, but it does not guarantee a good performance in the case of a single test. Otherwise stated, in the Neyman–Pearson approach a controlled long-run performance is obtained if the researcher chooses α and A and decides on the basis of whether t belongs to A or not (or, equivalently, on the basis of whether the p-value is larger than α or not). In general, it is also expected that the researcher sets a value of β, chooses a value of d on the basis of experience or pilot runs, and computes N from these values.
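To make the decision rule concrete, the following Python sketch (not taken from the chapter; the data, the choice of a two-sample t-test and all numerical values are placeholder assumptions) compares the outputs of two simulation conditions and shows that rejecting H0 when t falls in the rejection region A is the same as rejecting it when the p-value does not exceed α.

```python
import numpy as np
from scipy import stats

alpha = 0.05                        # Type I error probability, fixed in advance

# Placeholder outputs of 50 runs under two simulation conditions
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=50)   # condition 1
y = rng.normal(0.4, 1.0, size=50)   # condition 2

# Two-sided two-sample t-test of H0: the two conditions have equal means
t, p = stats.ttest_ind(x, y)

# Rejection region A = {|t| >= t_crit}; deciding with A is equivalent to
# deciding with the p-value, i.e. rejecting H0 exactly when p <= alpha
t_crit = stats.t.ppf(1 - alpha / 2, df=len(x) + len(y) - 2)
reject_by_region = abs(t) >= t_crit
reject_by_pvalue = p <= alpha
assert reject_by_region == reject_by_pvalue
print(f"t = {t:.3f}, p = {p:.4f}, reject H0: {bool(reject_by_pvalue)}")
```

Under this logic, α is fixed before seeing the data, and the output of the procedure is the binary decision itself rather than a graded measure of evidence.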
However, this is not the way in which tests are generally performed in practice. Indeed, it is customary that the researcher computes the test statistic t and the p-value and uses the latter as a measure of the support for the null hypothesis. For example, it is quite common that a p-value just under 5% is treated differently from a p-value under 1%, the latter providing a stronger evidential value against the null hypothesis. This is so widespread that some researchers do not report the p-value but only p < 1% or p < 5%. From the point of view of the Neyman–Pearson theory of testing this is nonsensical. However, it has entered common practice and has evolved into an approach of its own, different from both the Fisher and the Neyman–Pearson approaches, yet gathering aspects of both, called Null-Hypothesis Significance Testing (NHST). This approach takes from the Fisher approach the emphasis on the p-value and its disregard for power; from the Neyman–Pearson theory, it takes the emphasis on the threshold values of α.
In this chapter, we follow more closely the original Neyman–Pearson theory than the NHST. The elements of this approach are the two probabilities of error α and β, a measure of the effect under scrutiny or of the distance between the alternative and the null hypothesis, d, and the sample size N. These quantities are linked by some equations. We will see below that determining a value of N amounts to choosing values for the quantities α, β and d, whose interpretation is generally simpler than that of N.
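As a hypothetical illustration of such a link (the chapter develops the equations it actually uses below), for a two-sided one-sample test of a standardized effect of size d, the usual normal approximation gives

N \approx \left( \frac{z_{1-\alpha/2} + z_{1-\beta}}{d} \right)^2 ,

where z_q denotes the q-quantile of the standard normal distribution; once α, β and d are fixed, N follows (for two independent samples, the right-hand side is multiplied by 2 and gives the number of runs per condition).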




            11.4 The Use of Power in Practice: Two Examples

            In order to show how power analysis can help to determine the number of runs in
            a simulation, we decided to select a model and to proceed with some calculations.
            The simulation we selected for this computational exercise is an agent-based model
            that was developed by Fioretti and Lomi (2008, 2010) on the basis of the famous
“garbage can” model (Cohen et al. 1972), hereafter GCM.
              There are several reasons that led to the selection of this ABM. One of the
            obvious reasons is that it describes a very well-known model that informed the