
12.8 Challenges of computerized data collection




However, effective use of these tools requires addressing several important challenges. As with any method for data collection, automated methods work best if their use is carefully considered in the context of the specific situation that is being studied and the research questions that are being asked. Before collecting data (or designing a system to collect data), you should ask yourself what you hope to learn from your research and how the data that you collect will help you answer those questions. Collecting too much data, too little data, or the wrong data will not be particularly helpful.
Computer use can be considered over a wide range of time scales, with vastly different interpretations. At one end of the spectrum, individual keyboard and mouse interactions can take place as frequently as 10 times per second. At the other extreme, repeated uses of information resources and tools in the context of ongoing projects may occur over the course of years (Hilbert and Redmiles, 2000). Successful experiments must be designed to collect data that is appropriate for the questions being asked. If you want to understand usage patterns that occur over months and years, you probably do not want to collect every mouse event and keystroke; the volume of data would simply be overwhelming. Similarly, understanding the dynamics of menu choices within specific applications requires more detailed information than simply which applications were used and when.
The level of detail in the collected data is generally referred to as its granularity or resolution. Fine-grained, high-resolution data involves every possible user interaction event; coarse-grained, low-resolution data contains fewer events, perhaps involving specific menu items, selection of specific buttons, or interaction with specific dialog boxes.
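To make the distinction concrete, the following Python sketch shows how a logger's resolution setting might determine which interaction events are actually recorded. It is not taken from any particular logging tool; the names (EVENT_LEVELS, InteractionLogger) and the event vocabulary are hypothetical.

    import time

    # Map event types to how fine-grained they are: 1 = coarse, 3 = fine.
    # These categories are illustrative assumptions.
    EVENT_LEVELS = {
        "key_press": 3, "mouse_move": 3,         # fine-grained, high-resolution
        "button_click": 2, "menu_select": 2,     # widget-level
        "dialog_open": 1, "document_save": 1,    # coarse-grained, low-resolution
    }

    class InteractionLogger:
        def __init__(self, max_level=2):
            # max_level controls resolution: 1 records only coarse events,
            # 3 records every key press and mouse movement.
            self.max_level = max_level
            self.records = []

        def log(self, event_type, detail=""):
            # Record the event only if it is coarse enough for this study.
            if EVENT_LEVELS.get(event_type, 3) <= self.max_level:
                self.records.append((time.time(), event_type, detail))

    logger = InteractionLogger(max_level=2)      # ignore raw key/mouse events
    logger.log("key_press", "a")                 # filtered out at this resolution
    logger.log("menu_select", "Insert > Table")  # recorded

A study of long-term usage patterns might set the resolution lower still, while a study of low-level input behavior would raise it and accept the resulting data volume.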
The specificity of the questions that you are asking may help determine the granularity of data that you need to collect. Many experiments involve structured questions regarding closed tasks on specific interfaces: which version of a web-based menu layout is better? To support these studies, automated data collection tools must collect data indicating which links are clicked, and when (see the Simultaneous vs Sequential Menus sidebar for an example). Web server logs are very well suited for such studies.
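As a rough illustration of why server logs suit such studies, the Python sketch below extracts "which link, and when" pairs from an access log in Common Log Format. The file name access.log and the "/menu/" URL prefix are assumptions made purely for the example.

    import re
    from datetime import datetime

    # Match the timestamp and requested path in a Common Log Format line, e.g.
    # 127.0.0.1 - frank [10/Oct/2023:13:55:36 -0700] "GET /menu/a.html HTTP/1.1" 200 512
    LOG_LINE = re.compile(
        r'\S+ \S+ \S+ \[(?P<ts>[^\]]+)\] "GET (?P<path>\S+) HTTP'
    )

    clicks = []
    with open("access.log") as log:          # assumed log file name
        for line in log:
            m = LOG_LINE.match(line)
            if m and m.group("path").startswith("/menu/"):
                when = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
                clicks.append((when, m.group("path")))

    # clicks now holds (timestamp, link) pairs for the menu condition under study.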
Open-ended studies aimed at understanding patterns of user interactions may pose greater challenges. To study how someone works with a word processor, we may need to determine which functions are used and when. Activity loggers that track individual mouse movements and key presses may help us understand some user actions, but they do not record the structured, higher-order details that may be necessary to understand how these individual actions come together in the completion of meaningful tasks. Put another way, if we want to understand sequences of important operations in word-processing tasks, we do not necessarily want a list of keystrokes and mouse clicks. Instead, we would like to know that the user formatted text, inserted a table, and then viewed a print preview. Still higher-level concepts are even harder to track: how do we know when a user has completed a task?
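One simple way to approach this, sketched below in Python, is to map logged application commands onto the higher-level actions the researcher actually cares about. The command names and groupings are illustrative assumptions, not those of any specific word processor.

    # Collapse logged low-level commands into higher-level action labels.
    # The mapping itself is a hypothetical example.
    COMMAND_TO_ACTION = {
        "SetFontBold": "formatted text",
        "SetFontItalic": "formatted text",
        "ApplyParagraphStyle": "formatted text",
        "InsertTable": "inserted a table",
        "FilePrintPreview": "viewed a print preview",
    }

    def summarize(command_log):
        """Turn a sequence of commands into higher-level actions, dropping
        consecutive duplicates so repeated formatting commands read as a
        single 'formatted text' step."""
        actions = []
        for command in command_log:
            action = COMMAND_TO_ACTION.get(command)
            if action and (not actions or actions[-1] != action):
                actions.append(action)
        return actions

    # Prints: ['formatted text', 'inserted a table', 'viewed a print preview']
    print(summarize(["SetFontBold", "ApplyParagraphStyle",
                     "InsertTable", "FilePrintPreview"]))

A real study would need much richer rules than a fixed lookup table, and even then the harder question remains of deciding when a sequence of actions amounts to a completed task.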
Researchers have tried a variety of approaches for inferring higher-level tasks from low-level interaction events. Generally, these approaches involve combining