Page 124 - Foundations of Cognitive Psychology : Core Readings

Experimental Design in Psychological Research  127

               tests are beyond the scope of this chapter, and the reader is referred to the
               statistics textbooks mentioned earlier.
               Significance Testing  Suppose you wish to observe differences in interval
               identification ability between brass players and string players. The question is
               whether the difference you observe between the two groups can be wholly
               accounted for by measurement and performance error, or whether a difference of
               the size you observe indicates a true difference in the abilities of these musicians.
                 Significance tests provide the user with a “p value,” the probability that a
               result at least as large as the one observed could have arisen by chance alone.
               By convention, if the p value is less than .05, meaning that such a result would
               arise by chance less than 5% of the time, scientists accept the result as
               statistically significant. Of course, the criterion p < .05 is arbitrary, and it
               doesn’t deal directly with the opposite case, the probability that a genuine
               effect exists but the statistical test failed to detect it (a power analysis is
               required for this). In many studies, the probability of failing to detect an
               effect, when it exists, can soar to 80% (Schmidt 1996). An additional problem
               with a criterion of 5% is that a researcher who measures 20 different effects is
               likely to measure one as significant by chance, even if no real effect actually
               exists.
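
               The arithmetic behind the 20-effects warning can be sketched in a few lines
               of Python. This is a minimal illustration, not part of the original text: the
               function names are hypothetical, and the calculation assumes that the 20 tests
               are independent and that no real effect exists in any of them.

```python
# Sketch of the multiple-testing problem at a 5% criterion.
# Assumptions (not from the text): the tests are independent and
# every null hypothesis is true, so any "significant" result is a fluke.

def expected_false_positives(num_tests: int, alpha: float = 0.05) -> float:
    """Average number of tests that reach significance purely by chance."""
    return num_tests * alpha


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability that at least one test reaches significance by chance."""
    # Chance that a single test stays non-significant is (1 - alpha);
    # all num_tests stay non-significant with probability (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    for k in (1, 5, 20):
        print(k, expected_false_positives(k), round(familywise_error_rate(k), 3))
```

               Under these assumptions, 20 tests yield one chance "finding" on average, and
               the probability of at least one is well over one half.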
                 Statistical significance tests, such as the analysis of variance (ANOVA), the
               F-test, chi-square test, and t-test, are methods to determine the probability that
               observed values in an experiment differ only as a result of measurement errors.
               For details about how to choose and conduct the appropriate tests, or to learn
               more about the theory behind them, consult a statistics textbook (e.g., Daniel
               1990; Glenberg 1988; Hayes 1988).
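
               To make the logic of such a test concrete, here is a rough sketch for the
               brass-versus-string example. It uses a permutation test, which is not one of the
               tests named above; it is swapped in only because it needs nothing beyond the
               Python standard library. The function name and the scores are hypothetical,
               invented purely for illustration.

```python
import random


def permutation_p_value(group_a, group_b, num_shuffles=10_000, seed=0):
    """Two-sided permutation test on the difference between group means.

    Estimates how often a difference at least as large as the observed one
    appears when the group labels are assigned at random.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))
    hits = 0
    for _ in range(num_shuffles):
        rng.shuffle(pooled)  # randomly reassign scores to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (num_shuffles + 1)


# Hypothetical interval-identification scores (invented for illustration).
brass = [14, 15, 13, 16, 15, 14, 17, 15]
strings = [16, 18, 17, 15, 19, 17, 18, 16]

p = permutation_p_value(brass, strings)
print(f"estimated p value: {p:.4f}")
```

               A small p value here means only that random relabeling rarely produces a
               difference this large; it says nothing about how large or important the
               difference is.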

               Alternatives to Classical Significance Testing  Because of problems with
               traditional significance testing, there is a movement, at the vanguard of applied
               statistics and psychology, away from “p value” tests and toward alternative
               methods, such as Bayesian inference, effect sizes, confidence intervals, and
               meta-analyses (refer to Cohen 1994; Hunter and Schmidt 1990; Schmidt 1996).
               Yet many people persist in clinging to the belief that the most important thing
               to do with experimental data is to test them for statistical significance. There
               is great pressure from peer-reviewed journals to perform significance tests,
               because so many people were taught to use them. The fact is, the whole point of
               significance testing is to determine whether a result is repeatable when one
               doesn’t have the resources to repeat an experiment.
                 Let us return to the hypothetical example mentioned earlier, in which we
               examined the effect of music on study habits using a “within-subjects” design
               (each subject is in each condition). One possible outcome is that the mean test
               scores did not differ significantly between conditions by an analysis of
               variance (ANOVA). Yet suppose that, ignoring the means, every subject had a
               higher score in the music-listening condition than in the no-music condition.
               We are not interested in the size of the difference now, only in the direction
               of the difference. The null hypothesis predicts that the manipulation would have
               no effect at all, and that half of the subjects should show a difference in one
               direction and half in the other. The probability of all 10 subjects showing an
               effect in the same direction is (1/2)^10, or about 0.001, which is highly