Page 358 -
P. 358

348    CHAPTER 12  Automated data collection methods





                           INSTRUMENTED SOFTWARE FOR HCI DATA COLLECTION—CONT'D
                             Instrumenting Open-Source Software
                           As few commercial products offer customization tools comparable to those
                           found in Microsoft Office, instrumentation of open-source software has
                           proven to be a fruitful alternative. One study of web navigation patterns used
                           an instrumented version of the Firefox web browser to collect data on the use
                           of browser features such as the “back” button, history views, and bookmarks.
                           This relatively small study (25 users) combined instrumented software with
                           web proxies in order to identify new patterns in web browser feature usage and
                           browsing behavior, some of which may have been related to the rise in tabbed
                           browsing and other relatively new browser features (Obendorf et al., 2007).
                             Ingimp provides an example of broader use of instrumented open-source
                           software for HCI data collection. Short for “instrumented GIMP,” ingimp was
                           an instrumented version of the Gnu Image Manipulation Program, a powerful
                           open-source tool for photo editing and image processing. Created by a group
                           from the University of Waterloo, ingimp was widely publicized in the hope of
                           motivating users to participate in the study.
                             Ingimp collected a variety of data, including usage timing, the number of
                           windows and layers open at a time, command usage, and task-switching details.
                           Instrumenting GIMP to collect this data required modifying the open-source
                           program to record appropriate events and transmit them to a central server.
                           Interaction data is transmitted at the end of each session. If the software crashes
                           before a log is transmitted, the incomplete log is detected and sent to the server
                           when the program is next used.
                             The ingimp instrumentation approach involved several privacy protection
                           measures. Although mouse events and key press events are recorded, specific
                           details—which key was pressed or where the mouse was moved—are not
                           recorded. A dialog box on startup provides users with the option of disabling
                           event logging for the current session. As GIMP is an open-source project, the
                           developers of ingimp made all the source code available. Knowledgeable users
                           can investigate “patches”—descriptions of the differences between the original
                           GIMP and ingimp. These differences reveal where the logging code has been
                           added and what details it logs. Although few (if any) users are likely to take the
                           trouble to do this, this does represent a thorough attempt at full disclosure.
                             Ingimp's developers used this information to improve the usability of GIMP
                           and other free or open-source software tools (Terry et al., 2008).


                            Whichever approach you select for implementing data collection instruments,
                         you should think carefully about the data that you are collecting. Although you
                         may be tempted to collect as much data as possible, doing so may not be benefi-
                         cial. Instrumenting every possible interaction in a complex application may require
                         a great deal of effort, and as the amount of data collected may increase with the
   353   354   355   356   357   358   359   360   361   362   363