
288    CHAPTER 10  Usability testing




                         Chapter 16. Generally, if researchers do not intervene, this means that the data col-
                         lection is over, and that would be a missed opportunity to learn more about other
                         aspects of an interface or other aspects of data collection. An intervention is when a
                         researcher helps the participant move forward by providing advice or suggesting an
                         action. Before beginning any usability testing, a researcher should have a clear deci-
                         sion on whether any interventions will be allowed, under what circumstances, how
                         they will be documented, and how this will be accounted for in reporting the results.
                         Typically, the researchers (moderators) don’t get involved with providing advice to
                         users, and interventions are not a frequent methodological occurrence. However, the
                         benefit of interventions is that they allow for the maximal amount of feedback about
                         what aspects of the interface need improvement. The details of the intervention should
                         be clearly noted in any data results or write-up (Dumas and Fox, 2007).


                         10.5.6   MEASUREMENT
                         There are many different types of data that can be collected during usability testing.
                         The three most common quantitative measurements are task performance, time perfor-
                         mance, and user satisfaction. Task performance or correctness means how many tasks
                         were correctly completed (and the related metrics of how many tasks were attempted but
                         not successfully completed). Time performance means how long each task took to suc-
                         cessfully complete (and the related metrics of how long people spent on incorrect tasks
                         before they gave up). User satisfaction is often measured by a standardized, validated
                         survey tool. See Section 5.8 for a list of standard survey tools for measuring satisfaction.
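As a rough illustration of the first two measurements, the sketch below computes task performance and time performance from a small set of recorded attempts. The record layout (`TaskAttempt`), the task name, and the sample timings are hypothetical and chosen only for illustration; real studies would use whatever logging format the researchers have set up.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskAttempt:
    """One participant's attempt at one task (hypothetical record layout)."""
    participant: str
    task: str
    completed: bool   # True if the task was completed correctly
    seconds: float    # time spent on the attempt, successful or not


def summarize(attempts):
    """Return task-performance and time-performance figures for the attempts."""
    successes = [a for a in attempts if a.completed]
    failures = [a for a in attempts if not a.completed]
    return {
        "attempted": len(attempts),
        "completed": len(successes),
        "completion_rate": len(successes) / len(attempts) if attempts else 0.0,
        # Time on successfully completed tasks
        "mean_seconds_success": mean(a.seconds for a in successes) if successes else None,
        # Time spent before giving up on tasks that were not completed
        "mean_seconds_failure": mean(a.seconds for a in failures) if failures else None,
    }


# Made-up sample data, for illustration only
attempts = [
    TaskAttempt("P1", "find_contact_page", True, 40.0),
    TaskAttempt("P2", "find_contact_page", True, 60.0),
    TaskAttempt("P3", "find_contact_page", False, 120.0),
    TaskAttempt("P4", "find_contact_page", True, 50.0),
]

summary = summarize(attempts)
print(summary)
```

Reporting time on successful and unsuccessful attempts separately keeps the two related metrics described above apart, since averaging them together would blur how long a task takes with how long people persist before giving up.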
                            While these are the three most common quantitative measurements in usability
                         testing, there are many other metrics that could be useful. For instance, additional
                         metrics might include the number of errors, average time to recover from an error,
                         time spent using the help feature, and number of visits to the search feature or index.
                         Depending on the purpose of the usability testing, additional specific metrics might
                         be useful. For instance, if you have redesigned the search engine on a website and
                         the usability testing tasks are focused on the search engine, then an important metric
                         might be something like the average number of search engine responses clicked on,
                         or the average search ranking of the choice that provided the solution. If you utilize
                         key logging, there are many metrics that can be easily analyzed, such as the time
                         spent on specific web pages, the number of web pages viewed, mouse movements, and
                         typing speed (Atterer and Schmidt, 2007). See Chapter 12 for information on key
                         logging. Eye tracking used to be prohibitively expensive, but as costs have come
                         down, eye tracking has become more prevalent for usability testing. For more infor-
                         mation about eye tracking, see Chapter 13.
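The search-engine metrics mentioned above (average number of search results clicked, and average ranking of the result that provided the solution) could be computed along the following lines. This is only a sketch: the log structure, field names, and values are assumptions made for illustration, not a format prescribed by any particular logging tool.

```python
from statistics import mean

# Hypothetical log, one entry per search task:
#   clicked_ranks - ranks of the results the participant clicked (1 = top result)
#   solution_rank - rank of the result that solved the task, or None if unsolved
search_tasks = [
    {"clicked_ranks": [1, 3], "solution_rank": 3},
    {"clicked_ranks": [2], "solution_rank": 2},
    {"clicked_ranks": [1, 2, 5], "solution_rank": None},
    {"clicked_ranks": [1], "solution_rank": 1},
]


def search_metrics(tasks):
    """Summarize click counts and solution rankings across search tasks."""
    clicks = [len(t["clicked_ranks"]) for t in tasks]
    solved = [t["solution_rank"] for t in tasks if t["solution_rank"] is not None]
    return {
        "mean_results_clicked": mean(clicks) if clicks else None,
        # Averaged over solved tasks only; unsolved tasks are counted separately
        "mean_solution_rank": mean(solved) if solved else None,
        "unsolved_tasks": len(tasks) - len(solved),
    }


metrics = search_metrics(search_tasks)
print(metrics)
```

Unsolved tasks are excluded from the ranking average and reported as a separate count, because they have no solution ranking to contribute.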
                            In usability testing, especially formative usability testing (on early-stage de-
                         signs), qualitative data is often just as important as quantitative data. For instance,
                         users are often encouraged to “think aloud” as they are going through the interface
                         (known as the “thinking aloud” protocol). This is more common in formative usabil-
                         ity testing than in summative usability testing (when users may be expected to focus
                         more on task completion). When users state their feelings, their frustrations, and their