362 CHAPTER 12 Automated data collection methods

There are three broad categories of question that might benefit from automated
methods of data collection:

• Retrospective analyses of information management behavior: These studies look
at artifacts of computer use, including location of documents, email folders, and
other structures created during the course of using and managing information, in
order to understand how people use these tools.
• Controlled experiments: Web server logs and completely customized software
can be used to collect timing and related data for experiments. Because web logs
contain an entry for each link selection event, they are most useful for studies
of how users select web links. With proper design, pages of web links can
be used to model menu layouts and similar selection tasks. Fully customized software
may be needed if additional data (such as mouse movements) is required, but
hybrids may be useful. For example, JavaScript embedded in a web page might
be used to record mouse movements and translate those movements into events
stored in a log file alongside the basic server logs.
• Usability studies and other explorations of how users work with tools: Web
server logs, proxy server logs, keystroke loggers, and activity loggers record
user interaction events with one or more websites, applications, or operating
environments. The interactions can be used to examine which features of a tool
a user used and when. With appropriate analysis, this data can be used to find
interaction problems and identify opportunities for usability improvements.
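As a rough illustration of how server logs can yield timing data for link selection, the sketch below parses a few log lines and computes the seconds elapsed between consecutive requests from the same host. It assumes the logs use the Common Log Format and that each host corresponds to one participant; the sample entries, paths, and function names are hypothetical and are not taken from any particular study.

```python
import re
from datetime import datetime

# Common Log Format: host identity user [timestamp] "request" status size
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)


def parse_line(line):
    """Return host, timestamp, and path for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return {"host": match.group("host"), "time": timestamp, "path": match.group("path")}


def selection_intervals(lines):
    """Seconds elapsed between consecutive requests made by the same host."""
    last_seen = {}
    intervals = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip malformed lines rather than guessing at their content
        previous = last_seen.get(entry["host"])
        if previous is not None:
            elapsed = (entry["time"] - previous["time"]).total_seconds()
            intervals.append((entry["host"], previous["path"], entry["path"], elapsed))
        last_seen[entry["host"]] = entry
    return intervals


# Hypothetical log entries for a single participant working through a menu hierarchy.
sample_log = [
    '192.0.2.10 - - [10/Oct/2016:13:55:36 +0000] "GET /menu/start.html HTTP/1.1" 200 2326',
    '192.0.2.10 - - [10/Oct/2016:13:55:41 +0000] "GET /menu/file.html HTTP/1.1" 200 1187',
    '192.0.2.10 - - [10/Oct/2016:13:55:44 +0000] "GET /menu/file/save.html HTTP/1.1" 200 954',
]
intervals = selection_intervals(sample_log)
for host, source, destination, seconds in intervals:
    print(host, source, "->", destination, seconds)
```

Timestamps in this log format are recorded to the nearest second, so intervals computed this way are coarse. Experiments that need finer timing would likely require client-side or fully customized logging.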

Successful use of any of these approaches requires careful consideration of the
appropriate granularity of data to be collected and the tools to be used for data
analysis. As with other data collection approaches, the key is to precisely identify
the data that is needed and collect only that data.
Tools that collect data on user activities have potential privacy implications. This
is particularly important when the goal is to study how users work with tools to
complete real tasks: providing artificial tasks in the hopes of reducing privacy
concerns may decrease the realism of the data. Studies that collect this kind of data
should be carefully designed, in consultation with appropriate institutional review
boards (see Chapter 15), to avoid violations of participant privacy and trust.
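One way to act on both points (collecting only the data that is needed and limiting privacy exposure) is to filter each logged event before it is stored. The sketch below is a minimal illustration under stated assumptions: the field names, the sample event, and the use of a keyed hash to pseudonymize participant identifiers are all hypothetical choices, not a procedure prescribed here. Any real protocol should be worked out with the relevant review board.

```python
import hashlib
import hmac

# Hypothetical: the only fields this particular study needs for its analysis.
FIELDS_NEEDED = ("timestamp", "event", "target")


def pseudonymize(participant_id, secret_key):
    """Map an identifier to a keyed hash so records can be linked without storing the ID."""
    digest = hmac.new(secret_key, participant_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def minimize(event, secret_key):
    """Keep only the needed fields and replace the participant ID with a pseudonym."""
    reduced = {name: event[name] for name in FIELDS_NEEDED if name in event}
    reduced["participant"] = pseudonymize(event["participant_id"], secret_key)
    return reduced


# Hypothetical raw event from an activity logger, including fields the study does not need.
raw_event = {
    "participant_id": "alice@example.org",
    "timestamp": 1476107736.25,
    "event": "click",
    "target": "menu-file-save",
    "window_title": "Draft letter to landlord",  # sensitive and not needed
    "typed_text": "hunter2",  # sensitive and not needed
}
key = b"study-specific secret kept off the logging machine"
stored = minimize(raw_event, key)
print(sorted(stored))
```

Dropping unneeded fields at collection time, rather than during analysis, means the sensitive values never reach the stored log at all.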

DISCUSSION QUESTIONS

1. Online spreadsheets, word processors, and other office productivity tools
blur the line between websites and traditional software. In doing so, they
provide both opportunities and challenges for HCI researchers. As the
software infrastructure for online tools resides completely on the hosting
server, researchers can easily modify and redeploy interfaces without having
to update individual computers. As with traditional web interfaces, requests
for content from the server can be logged and resulting files can be analyzed.
However, as client-side interactions (usually executed through JavaScript code)
