groups of participants take part in the experiment and each group only uses one specific
type of keyboard. If the task is to type a document of 500 words, then each participant
types one document using one of the keyboards.
In contrast, a within-group design (also called “within-subject design”) requires
each participant to be exposed to multiple experimental conditions. Only one group of
participants is needed for the entire experiment. If we use the keyboard experiment as an
example, as shown in Figure 3.4, one group of participants uses all three types of keyboard
during the experiment. If the task is to type a document of 500 words, then each
participant types three documents, using each of the three keyboards for one document.
FIGURE 3.4
Within-group design. (One group of participants uses all three keyboards: QWERTY, DVORAK,
and Alphabetic.)
Note that data collected under the two designs must be analyzed with different statistical
approaches. The details of statistical analysis are discussed in Chapter 4.
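The difference between the two designs can be sketched as a simple assignment plan. This is a minimal illustration, not a procedure prescribed by the text: the participant labels, the group size of 12, the function names, and the round-robin split are all assumptions made for the example, and the sketch does not address the order in which a within-group participant meets the conditions.

```python
import random

KEYBOARDS = ["QWERTY", "DVORAK", "Alphabetic"]


def between_group_plan(participants, conditions, seed=0):
    """Assign each participant to exactly one condition (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = participants[:]
    rng.shuffle(shuffled)  # random assignment to groups
    # Deal participants round-robin so group sizes differ by at most one.
    return {p: [conditions[i % len(conditions)]] for i, p in enumerate(shuffled)}


def within_group_plan(participants, conditions):
    """Expose every participant to every condition (hypothetical helper)."""
    return {p: list(conditions) for p in participants}


# Twelve made-up participant labels, P1 through P12.
participants = [f"P{i}" for i in range(1, 13)]

between = between_group_plan(participants, KEYBOARDS)
within = within_group_plan(participants, KEYBOARDS)

# Each typed document corresponds to one (participant, keyboard) pair.
docs_between = sum(len(conds) for conds in between.values())
docs_within = sum(len(conds) for conds in within.values())

print("between-group documents:", docs_between)  # 12
print("within-group documents:", docs_within)    # 36
```

With the same 12 people, the between-group plan yields one document per participant (four participants per keyboard), while the within-group plan yields three documents per participant.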
3.3.1.1 Advantages and disadvantages of between-group design
From a statistical perspective, between-group design is the cleaner design. Because each
participant is exposed to only one condition, participants cannot carry over what they learn
under one condition to another, so the design avoids the learning effect. In addition,
because each participant completes tasks under only one condition, each participant needs
much less time to complete the experiment than in a within-group design. As a
result, confounding factors such as fatigue and frustration can be effectively controlled.
On the other hand, between-group design also has notable disadvantages. In a
between-group experiment, we compare the performance of one group of participants
against the performance of another group of participants. The results are therefore
strongly affected by individual differences: the difference between conditions that we
expect to observe can be buried in a high level of “noise” caused by individual
differences. This makes significant differences harder to detect, so Type II errors
(failing to detect a difference that actually exists) are more likely to occur.
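A small simulation can make the "noise" argument concrete. The sketch below is an illustration under stated assumptions, not an analysis from the text: the typing speeds, the true effect of 2 words per minute, the standard deviations, and the function name are all invented. It also deliberately leaves out learning and fatigue effects, so it shows only how individual differences affect the estimated difference between two conditions.

```python
import random
import statistics


def simulate_spreads(n=10, true_effect=2.0, person_sd=10.0, noise_sd=2.0,
                     trials=2000, seed=1):
    """Return how much the estimated effect varies across repeated experiments.

    All parameter values are hypothetical. person_sd models individual
    differences in baseline typing speed; noise_sd models measurement noise.
    """
    rng = random.Random(seed)
    between_estimates, within_estimates = [], []
    for _ in range(trials):
        # Between-group: two separate sets of n people, one condition each.
        group_a = [rng.gauss(40, person_sd) + rng.gauss(0, noise_sd)
                   for _ in range(n)]
        group_b = [rng.gauss(40, person_sd) + true_effect + rng.gauss(0, noise_sd)
                   for _ in range(n)]
        between_estimates.append(statistics.mean(group_b) - statistics.mean(group_a))

        # Same people under both conditions: each baseline cancels out
        # when we take the per-person difference.
        diffs = []
        for _ in range(n):
            baseline = rng.gauss(40, person_sd)
            score_a = baseline + rng.gauss(0, noise_sd)
            score_b = baseline + true_effect + rng.gauss(0, noise_sd)
            diffs.append(score_b - score_a)
        within_estimates.append(statistics.mean(diffs))

    return statistics.stdev(between_estimates), statistics.stdev(within_estimates)


between_spread, within_spread = simulate_spreads()
print(f"spread of between-group estimates: {between_spread:.2f}")
print(f"spread of same-participant estimates: {within_spread:.2f}")
```

Under these assumptions the between-group estimate of the effect fluctuates far more from one experiment to the next, because each group mean carries the baseline differences of whoever happened to be assigned to it. A true effect of the same size is therefore easier to miss, which is the Type II error risk described above.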