348 CHAPTER 12 Automated data collection methods
INSTRUMENTED SOFTWARE FOR HCI DATA COLLECTION—CONT'D
Instrumenting Open-Source Software
As few commercial products offer customization tools comparable to those
found in Microsoft Office, instrumentation of open-source software has
proven to be a fruitful alternative. One study of web navigation patterns used
an instrumented version of the Firefox web browser to collect data on the use
of browser features such as the “back” button, history views, and bookmarks.
This relatively small study (25 users) combined instrumented software with
web proxies in order to identify new patterns in web browser feature usage and
browsing behavior, some of which may have been related to the rise in tabbed
browsing and other relatively new browser features (Obendorf et al., 2007).
Ingimp provides an example of broader use of instrumented open-source
software for HCI data collection. Short for “instrumented GIMP,” ingimp was
an instrumented version of the GNU Image Manipulation Program, a powerful
open-source tool for photo editing and image processing. Created by a group
from the University of Waterloo, ingimp was widely publicized in the hope of
motivating users to participate in the study.
Ingimp collected a variety of data, including usage timing, the number of
windows and layers open at a time, command usage, and task-switching details.
Instrumenting GIMP to collect this data required modifying the open-source
program to record appropriate events and transmit them to a central server.
Interaction data was transmitted at the end of each session. If the software crashed
before a log was transmitted, the incomplete log was detected and sent to the server
the next time the program was used.
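
The transmit-at-session-end behavior, with recovery after a crash, can be illustrated with a short sketch. This is not ingimp's actual code: the class and function names, the one-event-per-line JSON file format, and the `send` callable standing in for the upload to a central server are all assumptions made for illustration.

```python
import json
from pathlib import Path


class SessionLog:
    """Append interaction events to a local file; upload them at session end."""

    def __init__(self, log_dir, session_id, send):
        # Hypothetical naming scheme: a ".pending" file has not been uploaded yet.
        self.path = Path(log_dir) / f"{session_id}.pending.jsonl"
        self.send = send  # hypothetical callable that uploads one session log
        self.path.touch()

    def record(self, event_type, **details):
        # One event per line, so a crash can damage only the final line.
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps({"type": event_type, **details}) + "\n")

    def close(self):
        # Normal exit: upload the whole session, then delete the local copy.
        upload_pending(self.path, self.send, complete=True)


def read_events(path):
    events = []
    for line in path.read_text(encoding="utf-8").splitlines():
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError:
            break  # a crash may have truncated the last line
    return events


def upload_pending(path, send, complete):
    send({"complete": complete, "events": read_events(path)})
    path.unlink()


def recover_incomplete_logs(log_dir, send):
    """At startup, upload any logs left behind by a session that crashed."""
    leftovers = sorted(Path(log_dir).glob("*.pending.jsonl"))
    for path in leftovers:
        upload_pending(path, send, complete=False)
    return len(leftovers)
```

In this sketch, an application would call `recover_incomplete_logs` once at startup, before creating a new `SessionLog`, and call `close` on normal exit. Flagging recovered logs as incomplete lets later analysis treat crashed sessions separately.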
The ingimp instrumentation approach involved several privacy protection
measures. Although mouse events and key press events were recorded, specific
details (which key was pressed or where the mouse was moved) were not
recorded. A dialog box on startup gave users the option of disabling
event logging for the current session. As GIMP is an open-source project, the
developers of ingimp made all the source code available. Knowledgeable users
could inspect “patches”: descriptions of the differences between the original
GIMP and ingimp. These differences revealed where the logging code had been
added and what details it logged. Although few (if any) users were likely to take the
trouble to read them, publishing the patches represented a thorough attempt at full disclosure.
Ingimp's developers used the collected data to improve the usability of GIMP
and other free or open-source software tools (Terry et al., 2008).
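
The privacy measures described above (recording that a key press or mouse event occurred without recording which key or where, and letting users disable logging for a session) can be sketched as follows. This is an illustrative sketch only, not ingimp's implementation: the `SAFE_FIELDS` allow-list, the event field names, and the `PrivateLogger` class are hypothetical.

```python
# Hypothetical allow-list: the only fields that may be logged per event type.
SAFE_FIELDS = {
    "key_press": set(),    # record that a key was pressed, not which key
    "mouse_move": set(),   # record that the mouse moved, not where
    "command": {"name"},   # command names are assumed safe to keep
}


def sanitize(raw_event):
    """Return a loggable copy of raw_event with sensitive details removed."""
    kind = raw_event["type"]
    allowed = SAFE_FIELDS.get(kind, set())  # unknown event types keep no details
    kept = {k: v for k, v in raw_event.items() if k in allowed}
    return {"type": kind, "time": raw_event.get("time"), **kept}


class PrivateLogger:
    def __init__(self, enabled=True):
        # In a real tool this flag would come from the startup dialog box.
        self.enabled = enabled
        self.events = []

    def record(self, raw_event):
        # When the user has opted out for this session, nothing is stored.
        if self.enabled:
            self.events.append(sanitize(raw_event))


log = PrivateLogger()
log.record({"type": "key_press", "time": 1.0, "key": "a"})
print(log.events)  # the "key" field has been dropped
```

Using an allow-list rather than a block-list means that any field not explicitly approved is dropped by default, which is the more cautious choice when new event types are added later.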
Whichever approach you select for implementing data collection instruments,
you should think carefully about the data that you are collecting. Although you
may be tempted to collect as much data as possible, doing so may not be beneficial.
Instrumenting every possible interaction in a complex application may require
a great deal of effort, and as the amount of data collected may increase with the