
14.2  Online research  421




                     If your data source is inaccessible because of business concerns, the lack of an
                  open API, or unacceptable costs, you might consider reframing your study to match
                  what can be accomplished within your means. Substituting smaller-scale studies or
                  qualitative research for broad examinations of usage patterns is one approach.
                  For example, one study used a set of interviews with Facebook users to understand
                  how the content, layout, and functionality of the site influenced communication of
                  health information (Menefee et al., 2016). Although smaller qualitative studies lack
                  the broad appeal of analyses of millions of posts, they might be more economical
                  to complete.
                     If you are lucky enough to get your hands on a large dataset relevant to your
                  interests, the techniques you use will depend on your research goals. Be prepared
                  to spend some time on data cleaning and extraction, potentially taking textual
                  representations of tweets, posts, or other data and formatting them in a
                  normalized pattern suitable for querying or text searching (Baeza-Yates and
                  Ribeiro-Neto, 2011). Once the data is ready for analysis,
                  you may use any of a range of techniques. Possibilities include natural-language
                  processing approaches that try to extract key concepts and relationships from
                  free text (Hedegaard and Simonsen, 2013), and information retrieval techniques
                  (Baeza-Yates and Ribeiro-Neto, 2011) to model similarities between documents
                  and common concepts and terms. Other approaches have used descriptive statistics
                  tracking types of activities and relationships (Kittur and Kraut, 2008) or
                  relative frequencies of different types of events (White et al., 2013), among
                  other techniques. For social media analysis, you might
                  build networks indicating relationships between individuals, topics, and other
                  items of interest. Graph algorithms might be used to find network members who
                  are “hubs,” that is, outliers in terms of number of connections or presence on
                  important paths (Scott, 2013). The Social Media Research Foundation
                  (http://www.smrfoundation.org) has developed a tool known as NodeXL, which
                  supports the development of networks, calculation of centrality measures, and
                  visualization, all through spreadsheet data (Bonsignore et al., 2009; Hansen
                  and Shneiderman, 2010).
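
The cleaning and network-building steps described above can be sketched in a few lines of Python. This is a minimal illustration only, not the method used by NodeXL or by any of the cited studies. The sample posts, the user names, the function names, and the 0.75 hub threshold are all hypothetical, and degree centrality (the share of other members a person is connected to) stands in for the broader family of centrality measures.

```python
import re
from collections import defaultdict

# Hypothetical (author, text) pairs invented for illustration.
POSTS = [
    ("alice", "Great talk by @Bob and @carol! https://example.org/talk"),
    ("bob", "Thanks @alice   :) #HCI"),
    ("dave", "Reading @bob's notes on #hci"),
    ("carol", "@bob any slides?"),
    ("erin", "RT @bob: slides are up http://example.org/s"),
]

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@(\w+)")


def normalize(text):
    """Lowercase the text, drop URLs, and collapse runs of whitespace."""
    text = URL_RE.sub("", text.lower())
    return " ".join(text.split())


def build_mention_network(posts):
    """Return an undirected adjacency map: user -> set of connected users.

    Two users are connected if either one mentions the other.
    """
    neighbors = defaultdict(set)
    for author, text in posts:
        author = author.lower()
        neighbors[author]  # make sure authors with no mentions still appear
        for mentioned in MENTION_RE.findall(normalize(text)):
            if mentioned != author:
                neighbors[author].add(mentioned)
                neighbors[mentioned].add(author)
    return dict(neighbors)


def degree_centrality(neighbors):
    """Degree of each user divided by the maximum possible degree (n - 1)."""
    n = len(neighbors)
    if n < 2:
        return {user: 0.0 for user in neighbors}
    return {user: len(adj) / (n - 1) for user, adj in neighbors.items()}


def find_hubs(centrality, threshold=0.75):
    """Return users whose centrality meets an assumed, arbitrary threshold."""
    return sorted(user for user, c in centrality.items() if c >= threshold)


if __name__ == "__main__":
    network = build_mention_network(POSTS)
    scores = degree_centrality(network)
    for user in sorted(scores, key=scores.get, reverse=True):
        print(f"{user}: {scores[user]:.2f}")
    print("hubs:", find_hubs(scores))
```

With the sample data, only bob is flagged as a hub, because every other account is connected to him. On real data the threshold would need to be chosen from the observed distribution of scores, and path-based measures such as betweenness may be more appropriate when "presence on important paths" is the quantity of interest.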
                     In a refrain that should be familiar to readers who have made it this far, any
                  of these data sources can be augmented by analyzing it alongside related data
                  collected through other modalities. Examples include the use of surveys to
                  understand user practices and beliefs with regard to searches for health
                  information (White, 2013) and the use of instrumented web pages (Chapter 12)
                  (Huang et al., 2012) or eye tracking (Chapter 13) (Huang et al., 2011) to capture
                  fine-grained data correlated with search engine interactions. Approaches like
                  these also open search engine interaction research to those who are not working
                  directly with the relevant companies, as logging toolkits can be deployed and
                  eye-tracking experiments conducted in usability labs that lack access to large
                  volumes of search interaction logs.