
272    CHAPTER 10  Usability testing




                         may include wireframes or paper prototypes, also known as low-fidelity prototypes
                         (Dumas and Fox, 2007). This type of usability testing is often more informal, with
                         more communication between test moderators and participants (Rubin and Chisnell,
                         2008). In early exploratory testing, there is more of a focus on how the user perceives
                         an interface component rather than on how well the user completes a task (Rubin and
                         Chisnell, 2008). Paper prototypes are especially useful, because they are low cost and
                         multiple designs can be quickly presented and evaluated by participants. In addition,
                         because paper prototypes involve little development time, designers and developers
                         tend not to become committed to a specific design early on. Users may also feel more
                         comfortable giving feedback or criticizing the interface when they see that little
                         work has been done on it so far. With fully functional prototypes, users may
                         be hesitant to criticize, since they feel that the system is already finished and their
                         feedback won’t matter much. More information on paper prototyping can be
                         found in Snyder (2003).
                            Usability testing that takes place when there is a more formal prototype ready, when
                         high-level design choices have already been made, is known as a summative test. The
                         goal is to evaluate the effectiveness of specific design choices. These mostly functional
                         prototypes are also known as high-fidelity prototypes (Dumas and Fox, 2007).
                            Finally, a usability test sometimes takes place right before an interface is released
                         to the general user population. In this type of test, known as a validation test, the
                         new interface is compared to a set of benchmarks for other interfaces. The goal is to
                         ensure that, for instance, 90% of users can complete each task within 1 minute (if
                         that statistic is an important benchmark). Validation testing is far less common than
                         formative or summative testing.
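As a rough illustration of the benchmark check described above, the following minimal Python sketch computes, for each task, the share of participants who finished within a time limit and compares it to a target rate. The function names, the task names, and the timing data are all hypothetical. The sketch also assumes that a participant who did not finish is recorded as `None` and that a time exactly equal to the limit counts as "within" it.

```python
from typing import Dict, List, Optional


def completion_rate(times_sec: List[Optional[float]], limit_sec: float = 60.0) -> float:
    """Return the fraction of participants who completed a task within limit_sec.

    A value of None means the participant did not complete the task
    (an assumption of this sketch, not a convention from the text).
    """
    if not times_sec:
        raise ValueError("at least one participant is required")
    within_limit = sum(1 for t in times_sec if t is not None and t <= limit_sec)
    return within_limit / len(times_sec)


def meets_benchmark(
    results: Dict[str, List[Optional[float]]],
    limit_sec: float = 60.0,
    target: float = 0.90,
) -> Dict[str, bool]:
    """Map each task name to True if its completion rate reaches the target."""
    return {
        task: completion_rate(times, limit_sec) >= target
        for task, times in results.items()
    }


# Hypothetical completion times in seconds for ten participants per task.
sample = {
    "find_contact_page": [32.0, 41.5, 58.0, 29.9, 47.2, 55.0, 38.1, 60.0, 44.4, 51.3],
    "submit_form": [48.0, 75.2, 52.1, None, 59.0, 40.3, 66.8, 57.7, 49.9, 45.0],
}

verdicts = meets_benchmark(sample)
for task, passed in verdicts.items():
    print(f"{task}: {'meets' if passed else 'misses'} the 90% / 1 minute benchmark")
```

With this invented data, the first task meets the benchmark (all ten participants finish within 60 seconds), while the second does not (seven of ten do). In practice the limit and the target rate would come from whatever benchmarks matter for the interface being validated.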
                            The structure of a usability test can vary, regardless of the type of usability
                         test or the stage of interface development. In general, the data collected in a
                         validation test or summative test will tend to be much more quantitative, and less
                         focused on users “thinking aloud.” Formative testing, on earlier prototypes, will
                         tend to rely more on thinking aloud and on qualitative data. None of these
                         tendencies is absolute, however. With well-developed paper prototypes, you could
                         in theory measure task performance quantitatively, and you could use the thinking
                         aloud protocol when an interface is fully developed. The key thing to remember is
                         that the more users “think aloud” and speak, the more their cognitive flow will be
                         interrupted, and the longer a task will take to complete
                         (Hertzum, 2016; Van Den Haak et al., 2003). It is also important to remember that,
                         at first, individual child participants involved in usability testing may not feel
                         comfortable criticizing an interface out loud (Hourcade, 2007), but pairs of children
                         doing usability testing may be more effective (Als et al., 2005). Usability testing is
                         flexible and needs to be structured around the activities that are most likely to result
                         in actual changes in the interface being evaluated.
                            Different authors use different definitions for these terms. For instance, we have
                         used the definitions from Rubin and Chisnell. West and Lehman, however, define
                         formative tests as those that find specific interface problems to fix and summative
                         tests as those that have a goal of benchmarking an interface’s usability against other similar