2012, 2013); to enable synchronous and longitudinal studies (Mao et al., 2012); to
use Turk workers to plan the contents of tasks to be completed by subsequent workers
(Kulkarni et al., 2011); and to simplify the construction of tasks (Little et al., 2010),
potentially including components for predicting and evaluating confidence level and
costs (Barowy et al., 2016).
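For simple designs, the platform's own requester API may be all you need. The sketch below is a minimal illustration, assuming the Python boto3 client and a hypothetical external task URL; the title, reward, and timing values are placeholders rather than recommendations.

```python
import boto3

# Minimal sketch: post a single microtask ("HIT") to Mechanical Turk via boto3.
# The sandbox endpoint lets you test tasks without paying real workers.
mturk = boto3.client(
    "mturk",
    region_name="us-east-1",
    endpoint_url="https://mturk-requester-sandbox.us-east-1.amazonaws.com",
)

# An ExternalQuestion embeds an existing web application (served over HTTPS)
# in the worker's task page; the URL here is a hypothetical placeholder.
external_question = """
<ExternalQuestion xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd">
  <ExternalURL>https://example.org/my-study-task</ExternalURL>
  <FrameHeight>600</FrameHeight>
</ExternalQuestion>
"""

response = mturk.create_hit(
    Title="Rate the readability of a short paragraph",
    Description="Read one paragraph and answer three questions (about 2 minutes).",
    Keywords="survey, reading, research",
    Reward="0.25",                       # US dollars, passed as a string
    MaxAssignments=5,                    # distinct workers per task
    LifetimeInSeconds=3 * 24 * 60 * 60,  # how long the task stays available
    AssignmentDurationInSeconds=15 * 60, # time allotted to each worker
    Question=external_question,
)
print("Created HIT:", response["HIT"]["HITId"])
```

Approving assignments, qualifying workers, and retrieving results involve similar calls; the add-on libraries cited above wrap these details in higher-level abstractions.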
The infrastructure provided by Mechanical Turk and similar crowdsourcing platforms offers many advantages over “roll-your-own” designs. As any experienced HCI researcher knows well, the challenges of recruiting, enrolling, and consenting participants can consume substantial amounts of time. Even if you are able to build your own web application for the job, you may find that leveraging these platforms, particularly with one of the add-on libraries, simplifies your life considerably. These advantages aside, commercial crowdsourcing tools have potential downsides. Financially, payment for microtasks might be more expensive than the gifts or small payments traditionally made to study participants. Be sure to estimate your costs before you embark on a study. Technical challenges may also arise: integrating complex, preexisting web applications with the APIs provided by the crowdsourcing platforms can be a substantial undertaking. Consider a preliminary study to prove the concept and test the tasks thoroughly before launching the full study. If the commercial platform does not work out, you might want to fall back on homegrown tools. In any case, you will still need to decide which tasks you want users to complete and how to design those tasks to ensure high-quality responses.
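As a concrete illustration of cost estimation, the sketch below assumes a flat platform fee of roughly 20% of worker payments; actual commission rates vary (for example, Mechanical Turk charges more for tasks with many assignments), so check current pricing before budgeting.

```python
# Rough back-of-the-envelope budget for a crowdsourced study. The 20% platform
# fee is an assumption; verify your platform's current commission structure.
def estimate_study_cost(num_tasks, assignments_per_task, reward_per_assignment,
                        platform_fee_rate=0.20):
    """Return the estimated total cost in dollars, including platform fees."""
    worker_payments = num_tasks * assignments_per_task * reward_per_assignment
    return worker_payments * (1 + platform_fee_rate)

# Example: 200 tasks, each completed by 5 workers at $0.25 per assignment.
print(f"${estimate_study_cost(200, 5, 0.25):.2f}")  # $300.00
```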
14.3.2.2 Tasks and study design
Law and von Ahn present a framework for developing appropriate tasks for human
computation studies (Law and Ahn, 2011). Tasks can be seen as containing three
main elements: introductory description, clear definitions of success criteria, and in-
centives (financial for Mechanical Turk and other systems, entertainment for games,
access to services for CAPTCHA). Each task will involve multiple design decisions,
including which information is presented to encourage completion of tasks without
bias; tradeoffs in granularity between the value of the result and the time required
to complete; whether tasks are completed individually or collaboratively; which incentives are offered; and how quality is ensured (Law and Ahn, 2011). Law and von Ahn's in-depth discussion of these and related issues (Law and Ahn, 2011) is highly recommended. An alternative model is presented by
Alexander Quinn and Benjamin Bederson, who developed a multidimensional clas-
sification taxonomy. The Quinn-Bederson model describes human computation sys-
tems in terms of motivations for participation, quality control measures, techniques
for aggregating responses, required human skills, orders and workflows for process-
ing tasks, and the cardinality of tasks to requests (how many users are mapped to
each task) (Quinn and Bederson, 2011).
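As a small illustration of the aggregation dimension, the sketch below implements one of the simplest techniques, majority voting over redundant assignments; the task identifiers and label values are hypothetical.

```python
from collections import Counter

# Minimal sketch of response aggregation by majority vote: each task is
# completed by several workers, and the most common answer wins.
def majority_vote(responses_by_task):
    """Map each task id to the answer given by the largest number of workers."""
    aggregated = {}
    for task_id, responses in responses_by_task.items():
        winner, _count = Counter(responses).most_common(1)[0]
        aggregated[task_id] = winner
    return aggregated

# Example: three workers labeled each image; disagreements resolved by vote.
labels = {
    "image-01": ["cat", "cat", "dog"],
    "image-02": ["dog", "dog", "dog"],
}
print(majority_vote(labels))  # {'image-01': 'cat', 'image-02': 'dog'}
```

More sophisticated schemes weight each worker's vote by an estimate of that worker's reliability, which connects aggregation directly to the quality control measures discussed next.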
Concerns over quality control have led to a variety of approaches in task design
to attempt to ensure high-quality results from crowdsourcing studies (Table 14.1).
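One widely used approach, seeding the task stream with “gold standard” items whose answers are known in advance, can be sketched as a simple post hoc filter; the item names and accuracy threshold below are hypothetical.

```python
# Minimal sketch of gold-standard quality control: exclude workers who answer
# too few of the known-answer ("gold") questions correctly. All identifiers
# and the 80% threshold are illustrative assumptions.
GOLD_ANSWERS = {"gold-01": "blue", "gold-02": "7"}

def passes_gold_check(worker_responses, min_accuracy=0.8):
    """Return True if the worker's accuracy on gold questions meets the bar."""
    gold_items = [q for q in worker_responses if q in GOLD_ANSWERS]
    if not gold_items:
        return True  # worker saw no gold questions; nothing to judge
    correct = sum(worker_responses[q] == GOLD_ANSWERS[q] for q in gold_items)
    return correct / len(gold_items) >= min_accuracy

# Example: one of two gold questions answered correctly (50%), so this
# worker's responses would be excluded or flagged for review.
print(passes_gold_check({"gold-01": "blue", "gold-02": "9", "task-14": "cat"}))  # False
```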