12.3 Activity-logging software
sites that are potentially embarrassing or inappropriate. Records of this sort should
be treated carefully, including detailed and clear explanations in any consent forms.
Possible approaches for avoiding embarrassment include anonymization of users or
websites.
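One way to read the anonymization suggestion above is to replace user identifiers and site names with keyed hashes before log entries are stored. The sketch below is only an illustration under that assumption: the key `STUDY_KEY` and the helpers `pseudonym` and `anonymize_entry` are hypothetical names, and keyed hashing is strictly pseudonymization, which may not be sufficient on its own for every study.

```python
import hashlib
import hmac
from urllib.parse import urlsplit

# Hypothetical secret key (an assumption for this sketch); a real study would
# generate it randomly and store it separately from the logs.
STUDY_KEY = b"replace-with-a-random-secret"


def pseudonym(value: str, key: bytes = STUDY_KEY, length: int = 12) -> str:
    """Return a stable keyed hash, so equal inputs map to equal tokens."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def anonymize_entry(user_id: str, url: str) -> dict:
    """Replace the user ID and host name with tokens; drop the path and query."""
    host = urlsplit(url).hostname or ""
    return {"user": pseudonym(user_id), "site": pseudonym(host)}
```

Because the tokens are stable, visits by the same user to the same site can still be counted and compared, while the stored records no longer name the user or the site directly.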
Logging of web requests is only the tip of the proxy iceberg. Several researchers
have made extensive—and creative—use of web proxies to collect web usage data.
WebQuilt (Hong et al., 2001) was a proxy server specifically designed to aid in the
collection of usability data. WebQuilt combines logging facilities with an engine for
transforming log file entries into inferred user actions, a tool for aggregating log files
into graph structures, and visualization components that display graphs of user
paths through a site.
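The idea of aggregating log files into graph structures can be illustrated with a minimal sketch. This is not WebQuilt's actual algorithm; it assumes each log entry has already been reduced to a hypothetical `(user, timestamp, url)` tuple, and it simply counts how often users moved from one page to the next.

```python
from collections import Counter, defaultdict


def build_path_graph(entries):
    """Count page-to-page transitions across all users.

    entries: iterable of (user, timestamp, url) tuples (an assumed format).
    Returns a Counter mapping (from_url, to_url) edges to traversal counts.
    """
    visits_by_user = defaultdict(list)
    for user, timestamp, url in entries:
        visits_by_user[user].append((timestamp, url))

    edges = Counter()
    for visits in visits_by_user.values():
        visits.sort()  # order each user's visits by timestamp
        for (_, src), (_, dst) in zip(visits, visits[1:]):
            edges[(src, dst)] += 1
    return edges
```

The resulting edge counts are the kind of data a path visualization could draw on, for example by rendering frequently traversed edges with thicker lines.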
Unlike general-purpose web proxies, WebQuilt was designed to collect data on a
site-specific basis. To run a usability test for a given website, the experimenter
asked users to visit a URL specifically designed to support proxy-based
access to the site under investigation. The WebQuilt proxy handled all requests for
the site, including the modification of page content to route subsequent requests for
that site through the proxy. As a result, WebQuilt did not require any configuration
of the browser software.
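The link rewriting described above can be sketched as follows. This is a simplified illustration, not WebQuilt's implementation: the endpoint `PROXY_PREFIX` and the function `rewrite_links` are hypothetical, and the regular expression handles only double-quoted `href` attributes, whereas a real proxy would likely need a full HTML parser and would also have to handle forms, scripts, and redirects.

```python
import re
from urllib.parse import quote, urljoin, urlsplit

# Hypothetical proxy endpoint, assumed for this sketch only.
PROXY_PREFIX = "http://proxy.example.org/fetch?url="

HREF_PATTERN = re.compile(r'href="([^"]*)"')


def rewrite_links(html: str, page_url: str) -> str:
    """Route same-site links through the proxy; leave other links untouched."""
    site_host = urlsplit(page_url).hostname

    def replace(match):
        # Resolve relative links against the URL of the page being served.
        absolute = urljoin(page_url, match.group(1))
        if urlsplit(absolute).hostname != site_host:
            return match.group(0)
        return 'href="' + PROXY_PREFIX + quote(absolute, safe="") + '"'

    return HREF_PATTERN.sub(replace, html)
```

Because every same-site link in a served page already points back at the proxy, each subsequent click is logged without any change to the browser's settings.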
Proxy servers can be quite powerful, but they present numerous technical chal-
lenges. Installing, configuring, and managing a proxy server can be difficult. Your
proxy server must have the processing power and network bandwidth for effective
operation. If your study involves only a small set of users for a short time frame, a
single machine might be sufficient. Large-scale studies involving many users for
extended time periods might need a more robust solution, involving many machines
and more bandwidth. Cutting corners on proxy capabilities might jeopardize your
study: if users find that the proxy is too slow for effective web use, they might tem-
porarily or permanently stop using the proxy, effectively removing their data from
your study.
Instrumentation software can often provide an attractive alternative to prox-
ies. MouseTracks (Arroyo et al., 2006), UsaProxy (Atterer et al., 2006), and other
similar systems (Kiciman and Livshits, 2010; Carta et al., 2011a,b; Huang et al.,
2011, 2012) modified pages with JavaScript code that recorded low-level interac-
tion data including mouse movements. This data was sent to the server for logging
and visualization. Similar approaches have also been used to track touch interac-
tions on mobile devices (Buschek et al., 2015). Although somewhat less flexible
than JavaScript, which can be delivered solely from the server hosting relevant web
pages, browser plugins can also be used to record detailed user interactions (Guo
and Agichtein, 2009).
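As a rough illustration of how a page might be modified to load logging code, the sketch below inserts a script tag before the closing `</body>` tag. It does not reproduce any of the cited systems: the script URL `TRACKER_SRC` and the function `inject_tracker` are hypothetical, and the JavaScript that would actually record mouse movements and send them to the server is assumed to live in that external file and is not shown.

```python
import re

# Hypothetical URL of a logging script hosted by the study team (an assumption).
TRACKER_SRC = "https://logger.example.org/track.js"

BODY_CLOSE = re.compile(r"</body\s*>", re.IGNORECASE)


def inject_tracker(html: str, src: str = TRACKER_SRC) -> str:
    """Insert a script tag just before the last closing body tag, if any."""
    tag = '<script src="' + src + '"></script>'
    matches = list(BODY_CLOSE.finditer(html))
    if not matches:
        # No closing body tag found: append the script at the end of the page.
        return html + tag
    index = matches[-1].start()
    return html[:index] + tag + html[index:]
```

A function of this kind could run either in a proxy that rewrites pages in transit or on the server hosting the pages, depending on which of the two deployment styles a study adopts.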
Selection of appropriate tools for tracking web interactions will likely require
tradeoffs between expressive power and complexity. Proxies are relatively easy to
configure. An organizational proxy can transparently collect and log access informa-
tion for multiple web sites, without requiring any changes to those sites. Capture of
more fine-grained data, through JavaScript, plugins, or any of the research idea tools