
14.3  Human computation




Table 14.1  Quality Control Measures for Crowdsourcing Studies

Strategy: Question design
  - Include questions with known answers (Kittur et al., 2008)
  - Make accurate answers easy to provide (Kittur et al., 2008)

Strategy: Study design
  - Develop predictive models based on question types to determine how many responses are needed to ensure high-quality answers for each question type (Barowy et al., 2016)
  - Use micro-diversions or other distracters to offset declines in response quality as users get bored or tired (Dai et al., 2015)

Strategy: Task performance data analysis
  - Look for patterns indicating answers that might have been faked or rushed, including repeated free text or questions answered too quickly (Kittur et al., 2008)
  - Use task completion metadata to develop predictive models of individual workers (Ipeirotis et al., 2010) and tasks (Rzeszotarski and Kittur, 2011; Zhu et al., 2012)


Aniket Kittur, Ed Chi, and Bongwon Suh (Kittur et al., 2008) made three suggestions for designing high-quality crowdsourcing tasks. (1) Each task should include questions with known answers that can be easily checked. Asking participants to count the number of images on the page, or to answer a simple question based on the text in the page, can help determine whether they are answering seriously or simply rushing through. (2) Accurate answers should be no harder to provide than rushed, inaccurate answers. For example, a task asking users to summarize a site might be easily subverted by short one-word answers, but an explicit requirement that users provide a certain number of keywords to describe content might be easier to fill out accurately. (3) Look for other ways to find low-quality answers, such as by identifying tasks that are completed too quickly or have answers repeated across multiple tasks (Kittur et al., 2008). Having multiple users complete each task and using agreement on results as a measure of quality, just as described earlier for CAPTCHA, is another possibility, but redundancy can be expensive (Ipeirotis et al., 2010). Alternatively, models of the complexity of different response types (checkboxes, radio buttons, free text) can be used to predict the number of responses needed to arrive at high-quality levels with high confidence (Barowy et al., 2016). "Micro-diversions," games or other entertaining distractions designed to disrupt the monotony of performing multiple repeated tasks over long periods of time, might also help improve response quality (Dai et al., 2015).
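   To make these suggestions concrete, the sketch below shows how gold-standard questions, completion times, and repeated free text might be screened programmatically after responses are collected. It is a minimal illustration rather than a procedure from any of the cited studies: the field names (answers, duration_sec, free_text, worker_id, assignment_id) and the 20-second threshold are assumptions chosen for the example.

```python
# Illustrative screening of crowdsourced responses using the checks
# described above. All field names and thresholds are hypothetical.

def flag_low_quality(responses, gold_answers, min_duration_sec=20):
    """Return (assignment_id, reasons) pairs for responses that fail basic checks."""
    seen_text = {}   # free-text answer -> workers who have already submitted it
    flagged = []
    for r in responses:
        reasons = []
        # Suggestion 1: verify questions with known ("gold") answers.
        for question, expected in gold_answers.items():
            if r["answers"].get(question) != expected:
                reasons.append(f"missed gold question: {question}")
        # Suggestion 3: flag tasks completed suspiciously quickly.
        if r["duration_sec"] < min_duration_sec:
            reasons.append("completed too quickly")
        # Suggestion 3: flag identical free text repeated across tasks.
        text = r.get("free_text", "").strip().lower()
        if text:
            seen_text.setdefault(text, []).append(r["worker_id"])
            if len(seen_text[text]) > 1:
                reasons.append("free text repeated across tasks")
        if reasons:
            flagged.append((r["assignment_id"], reasons))
    return flagged

# Example use with two invented responses and one gold question.
responses = [
    {"assignment_id": "A1", "worker_id": "W1", "duration_sec": 240,
     "answers": {"image_count": "3"}, "free_text": "A site about hiking trails."},
    {"assignment_id": "A2", "worker_id": "W2", "duration_sec": 12,
     "answers": {"image_count": "7"}, "free_text": "good"},
]
print(flag_low_quality(responses, gold_answers={"image_count": "3"}))
```

   In a real study, the flagged responses would typically be reviewed or re-collected rather than discarded automatically.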
   Other studies have used task completion metadata to develop predictive models suitable for identifying invalid answers. Noting that Mechanical Turk collects detailed data on each task, including measures of start and end time, Zhu and colleagues built predictive models based on initial estimates of task performance and data from actual tasks. They then used these models to classify subsequent responses as either valid or invalid (Zhu et al., 2012). Other efforts have explored building models of individual workers (Ipeirotis et al., 2010) and using JavaScript to capture fine-grained behavioral traces, such as mouse movements and key presses, that can help predict task quality (Rzeszotarski and Kittur, 2011).
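   As a rough illustration of this style of analysis (not the cited authors' actual models), the sketch below trains a simple classifier on task completion metadata and uses it to label a new response as valid or invalid. The feature set, toy training values, and labels are invented for the example, and scikit-learn's LogisticRegression stands in for whatever model a real study would choose and validate.

```python
# Hypothetical sketch: classify crowdsourcing responses as valid or invalid
# from task completion metadata, loosely in the spirit of Zhu et al. (2012).
# Feature values and labels below are invented for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Each row: [task duration (s), keystrokes, mouse events, scroll events]
X_train = np.array([
    [310.0, 220, 540, 35],   # careful responses, judged valid in pilot data
    [280.0, 190, 610, 40],
    [295.0, 205, 580, 30],
    [ 25.0,  10,  30,  2],   # rushed responses, judged invalid
    [ 18.0,   4,  15,  1],
    [ 22.0,   7,  25,  3],
])
y_train = np.array([1, 1, 1, 0, 0, 0])   # 1 = valid, 0 = invalid

model = LogisticRegression().fit(X_train, y_train)

# Classify a newly collected response from its metadata alone.
new_response = np.array([[20.0, 6, 18, 2]])
label = model.predict(new_response)[0]
print("valid" if label == 1 else "invalid")
```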