288 CHAPTER 10 Usability testing
Chapter 16. Generally, if researchers do not intervene, this means that the data collection
is over, and that would be a missed opportunity to learn more about other
aspects of an interface or other aspects of data collection. An intervention is when a
researcher helps the participant move forward by providing advice or suggesting an
action. Before beginning any usability testing, a researcher should have a clear decision
on whether any interventions will be allowed, under what circumstances, how
they will be documented, and how this will be accounted for in reporting the results.
Typically, the researchers (moderators) do not give advice to users, and
interventions are uncommon. However, the benefit of interventions is that they
allow researchers to gather the maximum amount of feedback about which aspects
of the interface need improvement. The details of any intervention should
be clearly noted in any data results or write-up (Dumas and Fox, 2007).
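
One way to document interventions and account for them when reporting results is sketched below. This is only an illustration: the `Intervention` record, its field names, and the choice to report unassisted completions separately from completions that followed an intervention are assumptions made for this example, not a procedure prescribed by the text or by Dumas and Fox.

```python
from dataclasses import dataclass


@dataclass
class Intervention:
    # All field names are hypothetical, chosen for illustration.
    participant: str
    task: str
    seconds_into_task: float
    reason: str   # the circumstance that triggered the intervention
    advice: str   # what the moderator said or suggested


def completion_report(outcomes, interventions):
    """Split completed tasks into unassisted and post-intervention counts.

    outcomes maps (participant, task) -> True if the task was completed.
    """
    assisted = {(i.participant, i.task) for i in interventions}
    completed = [key for key, done in outcomes.items() if done]
    unassisted = [key for key in completed if key not in assisted]
    return {
        "completed_total": len(completed),
        "completed_unassisted": len(unassisted),
        "completed_after_intervention": len(completed) - len(unassisted),
    }


# Made-up session data for the example.
outcomes = {
    ("P1", "checkout"): True,
    ("P2", "checkout"): True,
    ("P3", "checkout"): False,
}
interventions = [
    Intervention("P2", "checkout", 95.0,
                 "stuck on payment form", "suggested scrolling down"),
]
report = completion_report(outcomes, interventions)
```

Keeping the two completion counts separate lets a reader of the report see how much of the measured success depended on moderator help.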
10.5.6 MEASUREMENT
There are many different types of data that can be collected during usability testing.
The three most common quantitative measurements are task performance, time performance,
and user satisfaction. Task performance, or correctness, means how many tasks
were correctly completed (and the related metric of how many tasks were attempted but
not successfully completed). Time performance means how long each task took to complete
successfully (and the related metric of how long people spent on unsuccessful tasks
before they gave up). User satisfaction is often measured by a standardized, validated
survey tool. See Section 5.8 for a list of standard survey tools for measuring satisfaction.
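
The task performance and time performance measures described above can be computed from a simple record of task attempts. The following is a minimal sketch; the `TaskAttempt` record, its field names, and the sample data are assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskAttempt:
    # Hypothetical record of one participant attempting one task.
    participant: str
    task: str
    completed: bool   # True if the task was correctly completed
    seconds: float    # time spent on the attempt


def summarize(attempts):
    """Compute task performance and time performance for a list of attempts."""
    successes = [a for a in attempts if a.completed]
    failures = [a for a in attempts if not a.completed]
    return {
        "attempted": len(attempts),
        "completed": len(successes),
        "completion_rate": len(successes) / len(attempts) if attempts else None,
        # Time performance: how long successful tasks took.
        "mean_seconds_success": (
            mean(a.seconds for a in successes) if successes else None
        ),
        # Related metric: time spent before giving up on unsuccessful tasks.
        "mean_seconds_before_giving_up": (
            mean(a.seconds for a in failures) if failures else None
        ),
    }


# Made-up data for the example.
attempts = [
    TaskAttempt("P1", "find_hours", True, 40.0),
    TaskAttempt("P2", "find_hours", True, 60.0),
    TaskAttempt("P3", "find_hours", False, 120.0),
    TaskAttempt("P4", "find_hours", True, 50.0),
]
summary = summarize(attempts)
```

Successful and unsuccessful attempts are timed separately here because the text treats them as distinct metrics; averaging them together would hide how long participants struggled before giving up.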
While these are the three most common quantitative measurements in usability
testing, there are many other metrics that could be useful. For instance, additional
metrics might include the number of errors, average time to recover from an error,
time spent using the help feature, and number of visits to the search feature or index.
Depending on the purpose of the usability testing, additional specific metrics might
be useful. For instance, if you have redesigned the search engine on a website and
the usability testing tasks are focused on the search engine, then an important metric
might be something like the average number of search engine responses clicked on,
or the average search ranking of the choice that provided the solution. If you utilize
key logging, there are many metrics that can be easily analyzed, such as the time
spent on specific web pages, the number of web pages viewed, mouse movements, and
typing speed (Atterer and Schmidt, 2007). See Chapter 12 for information on key
logging. Eye tracking used to be prohibitively expensive, but as costs have come
down, it has become more prevalent in usability testing. For more information
about eye tracking, see Chapter 13.
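
Several of the additional metrics mentioned above (number of errors, average time to recover from an error, time spent using help, visits to the search feature, and the average search ranking of clicked results) can be derived from a timestamped event log. The sketch below assumes a hypothetical log format of `(seconds, event_name, value)` tuples; the event names and sample data are invented for illustration and do not correspond to any particular logging tool.

```python
from statistics import mean

# Hypothetical event log: (seconds since session start, event name, optional value).
# For "result_click", the value is the search ranking of the clicked result.
events = [
    (5.0, "search", None),
    (9.0, "result_click", 3),
    (20.0, "error", None),
    (32.0, "recovered", None),
    (40.0, "help_open", None),
    (55.0, "help_close", None),
    (60.0, "search", None),
    (63.0, "result_click", 1),
    (70.0, "error", None),
    (78.0, "recovered", None),
]


def paired_durations(events, start, end):
    """Return the elapsed time between each start event and the next end event."""
    durations = []
    opened_at = None
    for t, name, _ in events:
        if name == start and opened_at is None:
            opened_at = t
        elif name == end and opened_at is not None:
            durations.append(t - opened_at)
            opened_at = None
    return durations


def session_metrics(events):
    recoveries = paired_durations(events, "error", "recovered")
    help_spans = paired_durations(events, "help_open", "help_close")
    ranks = [v for _, name, v in events if name == "result_click"]
    return {
        "errors": sum(1 for _, name, _ in events if name == "error"),
        "mean_seconds_to_recover": mean(recoveries) if recoveries else None,
        "seconds_in_help": sum(help_spans),
        "search_visits": sum(1 for _, name, _ in events if name == "search"),
        "mean_clicked_rank": mean(ranks) if ranks else None,
    }


metrics = session_metrics(events)
```

Which of these metrics is worth reporting depends on the purpose of the test; for a redesigned search engine, for example, the search-related values would be the ones of interest.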
In usability testing, especially formative usability testing (on early-stage designs),
qualitative data is often just as important as quantitative data. For instance,
users are often encouraged to “think aloud” as they are going through the interface
(known as the “thinking aloud” protocol). This is more common in formative usability
testing than in summative usability testing (when users may be expected to focus
more on task completion). When users state their feelings, their frustrations, and their