338 CHAPTER 12 Automated data collection methods
12.2.2 STORED APPLICATION DATA
As we use computers, we leave traces that provide valuable information about how
we interact with applications and store and manage information. The tools that we
use collect substantial data trails that implicitly and explicitly describe user activities.
Examples include (but are not limited to):
• File systems: The files and folders that we create and use present a model of
how we organize information. Do we separate work activities from home? Do
we have many folders, each containing a small number of files, or only a few
folders, each with many files?
• Graphical user interface (GUI) desktops: Some people have dozens of icons on
their desktops, while others have only a few. Does this say anything about their
organizational preferences?
• Email programs: Many people use an email “inbox” as a to-do list, reminding
them of tasks that must be completed. Some users make extensive use of filing
and filtering capabilities, while others leave all messages in one folder.
• Web bookmarks: Collections of bookmarks can also be more or less organized.
• Social networking tools such as Facebook or LinkedIn provide detailed
perspectives on how people connect to each other and why.
Each of these domains (and others) can be (and have been) studied to understand
usage patterns and to potentially inform new designs. This research is a form of HCI
archeology—digging through artifacts to understand complex behavior patterns.
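The file system questions raised in the list above (many small folders versus a few large ones) can be turned into simple counts. The following is a minimal sketch, not a method prescribed by this chapter: the function name `summarize_tree` and the choice of metrics are assumptions made for illustration. It records only counts and depths, never file or folder names, which keeps the collected data away from potentially sensitive content.

```python
import os
from statistics import mean


def summarize_tree(root):
    """Return basic organization metrics for the folder tree under `root`.

    Hypothetical helper for illustration. The root folder itself is
    counted as one folder at depth 0.
    """
    root = os.path.abspath(root)
    files_per_folder = []
    max_depth = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Depth is the number of path components below the root.
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        max_depth = max(max_depth, depth)
        files_per_folder.append(len(filenames))
    return {
        "folders": len(files_per_folder),
        "files": sum(files_per_folder),
        "mean_files_per_folder": mean(files_per_folder),
        "max_depth": max_depth,
    }
```

Calling `summarize_tree` on a participant's documents folder (with informed consent) would yield one row of metrics per participant. A high folder count with a low mean number of files per folder suggests a finely divided hierarchy, while the opposite suggests a flat one.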
There are attractive aspects to using existing data that is stored by tools that users
work with on a daily basis. Interference with the user's work or habits is minimal.
Users do not have to participate in experimental sessions to be part of the study and
no training is necessary.
The generality of this approach is limited by the tools involved and the data
that they collect. The example tools given earlier (file explorers, email clients, web
browser bookmark tools, GUI desktops, etc.) all provide features that can be used to
manipulate and maintain organizations of information. As a result, their stored data
can be used to identify which structures exist, which categories items might be
placed in, etc. Because more transient activities, such as selections of menu items,
are generally not recorded, this approach is not well suited for the study of specific
implementations. Instead, this approach to data analysis is best suited for the study
of long-term patterns of ongoing tasks such as those described earlier.
As the analyses may involve exploration of potentially sensitive matters such as
email messages, file system content, and web bookmarks, investigators using these
approaches should be sensitive to privacy concerns. In addition to properly inform-
ing participants of the privacy risk (see Chapter 15), researchers should exercise dis-
cretion when examining potentially sensitive data. Investigations should be limited
to only the data that is strictly necessary. An exploration of email communication
patterns might reduce privacy risks by examining message headers instead of message
bodies. If this is not sufficient, anonymizing the content so that it simply indicates
that A had an email conversation with B can provide further privacy protection.
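The header-only, anonymized analysis described above might be sketched as follows. This is an illustrative assumption rather than a procedure given in the text: the function names `pseudonym` and `conversation_counts`, the use of a salted SHA-256 hash, and the eight-character labels are all hypothetical choices. Only the header block of each message is parsed, and each address is replaced by an opaque label before anything is counted. Note that salted hashing is a form of pseudonymization; the salt must be kept secret (or discarded) for the labels to offer real protection.

```python
import hashlib
from collections import Counter
from email.parser import HeaderParser
from email.utils import getaddresses


def pseudonym(address, salt):
    """Map an email address to an opaque label (hypothetical scheme)."""
    digest = hashlib.sha256((salt + address.lower()).encode("utf-8")).hexdigest()
    return "P" + digest[:8]


def conversation_counts(raw_messages, salt):
    """Count sender-to-recipient pairs using message headers only."""
    parser = HeaderParser()  # parses headers; the body is never inspected
    counts = Counter()
    for raw in raw_messages:
        msg = parser.parsestr(raw)
        senders = getaddresses(msg.get_all("From", []))
        recipients = getaddresses(msg.get_all("To", []) + msg.get_all("Cc", []))
        for _name, sender in senders:
            for _name, recipient in recipients:
                if sender and recipient:
                    pair = (pseudonym(sender, salt), pseudonym(recipient, salt))
                    counts[pair] += 1
    return counts
```

The resulting counter records only that one pseudonymous party wrote to another, and how often, which corresponds to the "A had an email conversation with B" level of detail. Subject lines, display names, and message bodies never enter the analysis.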