
40     CHAPTER 2  Experimental research




                         or the rewards offered for participation. This phenomenon, called the “Hawthorne
                         effect,” was documented in the late 1950s (Landsberger, 1958). In many cases,
                         being observed can cause users to make short-term improvements that typically do
                         not last once the observation is over.
                            However, the contexts of the Hawthorne studies and of HCI-related
                         experiments differ significantly (Macefield, 2007). First, the
                         Hawthorne studies were all longitudinal, while most HCI experiments are not.
                         Second, all the participants in the Hawthorne studies were experts in the tasks
                         being observed, while most HCI experiments observe novice users. Third, the
                         Hawthorne studies primarily focused on efficiency, while HCI experiments also value
                         other important measures, such as error rates. Finally, the participants in the
                         Hawthorne studies had a vested interest in a successful outcome, since the study
                         was a point of contact between them and their senior management. In contrast,
                         most HCI studies do not carry this motivation. For these reasons, we believe
                         that the gap between the results observed in HCI experiments and actual
                         performance is smaller than that observed in the Hawthorne studies. Still, we
                         should keep this potential risk in mind and take precautions to avoid or alleviate the
                         impact of a possible Hawthorne effect.




                           EMPIRICAL EVALUATION IN HCI
                           Some researchers have questioned the validity of empirical experiments and
                           quantitative evaluation in HCI research. They argue that the nature
                           of research in HCI is very different from that of traditional scientific fields, such as
                           physics or chemistry, and, therefore, that the results of experimental studies
                           suggesting one interface is better than another may not be truly valid.
                             The major concern with the use of empirical experiments in HCI is the control
                           of all possible related factors (Lieberman, 2007). In experiments in physics or
                           chemistry, it is possible to strictly control all major related factors so that multiple
                           experimental conditions differ only in the states of the independent
                           variables. However, in HCI experiments, it is very difficult to control all potential
                           factors and create experimental conditions that are identical except for the
                           independent variable. For instance, it is almost impossible
                           to recruit two or more groups of participants with exactly the same age,
                           educational background, and computer experience. All three factors may affect
                           the interaction experience as well as performance. It is argued that the use
                           of significance tests in the data analysis stage provides only a veneer of validity
                           when the potentially influential factors are not fully controlled (Lieberman, 2007).
                             We agree that experimental research has its limitations and deficiencies, just
                           as any other research method does. But we believe that the overall validity of
                           experimental research in the field of HCI is well-grounded. Simply observing
                           a few users trying two interfaces does not provide convincing results on the