web-page instrumenting techniques (Chapter 12) to collect mouse and keyboard usage data sufficient for building "task fingerprints" capable of predicting performance (Rzeszotarski and Kittur, 2011).
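
As a rough illustration of this kind of instrumentation (the techniques themselves are discussed in Chapter 12), the sketch below buffers mouse and keyboard events in the worker's browser and periodically ships them to a collection endpoint. The /log endpoint is an assumption for illustration, not part of the cited study, and logging raw key values raises privacy concerns that a real study would need to address.

type UiEvent = { kind: string; t: number; x?: number; y?: number; key?: string };

const buffer: UiEvent[] = [];

// Record low-level interaction events with high-resolution timestamps.
document.addEventListener("mousemove", (e) =>
  buffer.push({ kind: "move", t: performance.now(), x: e.clientX, y: e.clientY }));
document.addEventListener("click", (e) =>
  buffer.push({ kind: "click", t: performance.now(), x: e.clientX, y: e.clientY }));
document.addEventListener("keydown", (e) =>
  buffer.push({ kind: "key", t: performance.now(), key: e.key }));

// Flush periodically so a dropped connection loses little data.
// "/log" is a hypothetical collection endpoint.
setInterval(() => {
  if (buffer.length === 0) return;
  navigator.sendBeacon("/log", JSON.stringify(buffer.splice(0, buffer.length)));
}, 5000);
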
Successful design of a crowdsourced study does not end with the design of individual tasks. Although some studies, particularly online evaluations of user interface designs, may be based on large numbers of workers completing very similar tasks, more complex control structures have been used in crowdsourcing studies to decompose large problems, to introduce feedback (whereby responses to some questions influence the content of subsequent questions), or to shape workflows. Edith Law and Luis von Ahn provide a summary of different workflow strategies in their in-depth review of human computation (Law and von Ahn, 2011).
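
As a minimal sketch of such feedback, with hypothetical question texts and routing rules (this is not a workflow drawn from Law and von Ahn's review), the code below chooses the next question a worker sees based on an earlier answer.

type Question = { id: string; text: string };
type Answer = { questionId: string; value: string };

const questions: Record<string, Question> = {
  q1: { id: "q1", text: "Is the label on this image accurate?" },
  q2: { id: "q2", text: "What would a better label be?" },
  q3: { id: "q3", text: "How confident are you in the label?" },
};

// Feedback step: a "no" on q1 routes to a correction question (q2);
// any other answer routes to a confidence rating (q3).
function nextQuestion(answer: Answer): Question | null {
  if (answer.questionId === "q1") {
    return answer.value === "no" ? questions.q2 : questions.q3;
  }
  return null; // workflow complete
}

// Example: a worker who rejects the label is asked for a better one.
console.log(nextQuestion({ questionId: "q1", value: "no" })?.text);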

                         14.3.2.3   Pros and cons of crowdsourced studies
Easy to create, potentially inexpensive, and backed by services that simplify recruitment and enrollment of participants, crowdsourced studies can be very appealing. They may also offer decreased bias and increased validity, as participants who do not interact directly with researchers, or who do not even know that they are participating in an experiment, might be less susceptible to implicit or explicit pressures (Paolacci et al., 2010). Although the use of services like Mechanical Turk does remove some knowledge about participants (Kittur et al., 2008), some have argued that Turk users may be demographically similar to broader populations (Paolacci et al., 2010). These advantages come with drawbacks. Network delays between workers' browsers and study servers may inflate measured task completion times, threatening the validity of timing data from crowdsourced experiments (see Chapter 12). Finally, the lack of direct interaction with participants eliminates the possibility of gaining any insight from direct observation of task completion. Pairing studies, as discussed earlier, provides one possible means of avoiding this lack of feedback: a small lab study might give you the insight associated with direct interaction with users, while a companion human computation study helps you enroll larger numbers of participants.
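
One way to reduce the impact of network delays on timing data, offered here as a sketch rather than a prescription, is to measure completion time in the worker's browser and submit it alongside the response. The /submit endpoint and payload shape below are assumptions for illustration.

let taskStart = 0;

// Call when the task is first displayed to the worker.
function onTaskShown(): void {
  taskStart = performance.now(); // monotonic, millisecond-resolution clock
}

// Call when the worker submits; elapsedMs excludes network round-trips.
function onTaskSubmitted(response: unknown): void {
  const elapsedMs = performance.now() - taskStart;
  fetch("/submit", { // "/submit" is a hypothetical endpoint
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ response, elapsedMs }),
  });
}
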
Before jumping into studies using systems like Mechanical Turk, you should take care to ensure that your software components are implemented and tested correctly, and that you understand the social dynamics of the workers. Online forums for Mechanical Turk users, including Turkopticon (https://turkopticon.ucsd.edu) (Irani and Silberman, 2013) and Turker Nation (http://turkernation.com), provide workers with the opportunity to discuss interesting tasks, problems with task requesters, and other topics of interest to workers trying to earn money through Mechanical Turk. These groups can provide valuable resources and feedback to researchers using human computation in their work. Brian McInnis and Gilly Leshed described how interactions with these groups proved particularly useful when software errors prevented tasks from working correctly and workers from being paid. Interactions with the participant community helped resolve the issues and provide fair payment, thus