lets you fine-tune your analysis—looking only at high-level events, such as menu
selections; only at low-level events, such as mouse movements; or perhaps some
hybrid approach that examines low-level events that precede or follow interesting
high-level events.
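The hybrid approach can be sketched in a few lines of Python. This is only an illustrative sketch, not the output format of any particular logging tool: the `Event` record, its `level` and `name` fields, and the `context_around` helper are all hypothetical names chosen for this example, and the time window is an arbitrary assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical log record; real tools will use their own formats."""
    timestamp: float  # seconds since the start of the session
    level: str        # "high" (e.g. menu selection) or "low" (e.g. mouse movement)
    name: str


def context_around(events, target_name, window=2.0):
    """For each high-level event named target_name, collect the low-level
    events that occur within `window` seconds before or after it."""
    low_level = [e for e in events if e.level == "low"]
    matches = []
    for event in events:
        if event.level == "high" and event.name == target_name:
            nearby = [e for e in low_level
                      if abs(e.timestamp - event.timestamp) <= window]
            matches.append((event, nearby))
    return matches


log = [
    Event(0.5, "low", "mouse_move"),
    Event(3.0, "low", "mouse_move"),
    Event(3.4, "low", "mouse_click"),
    Event(3.5, "high", "menu_select"),
    Event(4.0, "low", "key_press"),
    Event(9.0, "low", "mouse_move"),
]

# Low-level events within one second of each menu selection.
matches = context_around(log, "menu_select", window=1.0)
```

Widening or narrowing `window` is one simple way to adjust how much low-level context is examined around each high-level event.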
As with any HCI research, proper attention to pilot testing is important.
Pilot testing of both the data collection and data analysis pieces of the experiment
can help you verify that the data you are collecting actually tells you what you want
it to. Analyzing the pilot data may help you verify that you are collecting data of the
appropriate granularity.
All of the approaches to automatic data collection raise potential privacy and security
concerns. Logs of web browser activity can say a good deal about a person browsing
the web. This information might be used to infer sensitive or embarrassing details
about a person's habits, interests, or medical concerns. Although the potential harm
from the logs of any single website may be relatively minimal, proxy servers can be
configured to capture all of the interactions with every website visited by a given
computer. Indirect (and sometimes nonexistent) links between people and computers
make matters even worse in this regard. Web logs track the identity (in terms of
the IP address) of the computer that makes each request. An address in a web server
log may correspond to the computer on your desk, but this does not mean that you
were the person who was at the computer when the browser visited embarrassing
websites.
Activity loggers and keystroke loggers raise even greater concerns. By tracking
every input action, these tools collect enough data to reconstruct documents, emails,
calendars, and other potentially damaging material. These tools have been surreptitiously used
in criminal investigations and divorce proceedings. Regardless of your views on the
appropriateness of using secretive software to spy on family members, you should
take care to ensure that your data collection tools do not gather data that others would
find sensitive, damaging, or otherwise private. One approach is to customize
your tools so that they do not record potentially problematic data, such as the specific keys that are pressed
(as opposed to simply noting that a key was pressed) and window titles (which may
contain document titles).
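The kind of customization described above might look like the following minimal sketch, which strips sensitive fields from an event before it is written to a log. The dictionary layout and the field names (`key`, `window_title`, `application`) are assumptions made for illustration; an actual activity logger will expose its own event structure and configuration options.

```python
# Hypothetical field names; adjust to whatever your logging tool records.
SENSITIVE_FIELDS = {"key", "window_title"}


def sanitize_event(raw_event):
    """Return a copy of raw_event without the fields listed in
    SENSITIVE_FIELDS. The event type and timestamp are kept, so the log
    still notes that a key was pressed, but not which key."""
    return {field: value
            for field, value in raw_event.items()
            if field not in SENSITIVE_FIELDS}


raw = {
    "timestamp": 12.7,
    "type": "key_press",
    "key": "p",
    "application": "editor",
    "window_title": "Resignation letter.docx",
}

safe = sanitize_event(raw)
```

Filtering before the data is stored, rather than during analysis, means the sensitive values never reach the log file at all.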
12.9 SUMMARY
Automated data collection systems give researchers the ability to easily collect
detailed user interaction information. Appropriately configured software tools can be
used to replace labor-intensive approaches such as manual observation or coding of
events on video. The result is a qualitative, as well as quantitative, difference: not only
can more data be collected, but the increased ease of data collection allows researchers
to conduct experiments that otherwise would be too difficult or expensive. These
strengths make automated data collection a clear first choice for many HCI research
efforts.