11.4 Analyzing text content
measure. Due to its high subjectivity, face validity is more susceptible to bias and
is a weaker criterion compared to construct validity and criterion validity. Although
face validity should be viewed with a critical eye, it can serve as a helpful technique
to detect suspicious data in the findings that need further investigation (Blandford
et al., 2016).
Criterion validity assesses how accurately a new measure can predict a previously validated concept or criterion. For example, if we developed a new tool for
measuring workload, we might ask participants to complete a set of tasks while using the new tool to measure their workload. We would also ask the participants to complete the well-established NASA Task Load Index (NASA-TLX) to assess their perceived workload. We can then calculate the correlation between the two measures to find out how effectively the new tool predicts the NASA-TLX results.
A higher correlation coefficient would suggest higher criterion validity. There are
three subtypes of criterion validity, namely predictive validity, concurrent validity,
and retrospective validity. For more details regarding each subtype, see Chapter 9, "Reliability and Validity," in Wrench et al. (2013).
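As an illustration, the correlation check described above can be sketched in a few lines. The scores below are invented for illustration only; they are not data from a real study:

```python
import numpy as np

# Hypothetical workload scores from a new measurement tool and from
# the NASA-TLX for the same eight participants (illustrative numbers).
new_tool_scores = np.array([32, 45, 51, 60, 68, 72, 80, 85])
nasa_tlx_scores = np.array([30, 48, 50, 63, 65, 75, 78, 88])

# Pearson correlation coefficient between the two measures.
# A coefficient near 1 suggests the new tool effectively predicts
# the NASA-TLX results, i.e., high criterion validity.
r = np.corrcoef(new_tool_scores, nasa_tlx_scores)[0, 1]
print(f"criterion validity (Pearson r) = {r:.3f}")
```

In practice one would also report a significance test and confidence interval for the coefficient, and use a sample far larger than eight participants.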
Construct or factorial validity is usually adopted when a researcher believes that
no valid criterion is available for the research topic under investigation. Construct
validity is a validity test of a theoretical construct and examines “What constructs
account for variance in test performance?” (Cronbach and Meehl, 1955). In
Section 11.4.1.1 we discussed the development of potential theoretical constructs
using the grounded theory approach. The last stage of the grounded theory method
is the formation of a theory. The theoretical construct derived from a study needs to be validated through construct validity. From a technical perspective, construct or factorial validity is based on the statistical technique of factor analysis, which allows researchers to identify the groups of items, or factors, in a measurement instrument.
In a recent study, Suh and her colleagues developed a model for user burden that
consists of six constructs and, on top of the model, a User Burden Scale. They used
both criterion validity and construct validity to measure the efficacy of the model
and the scale (Suh et al., 2016).
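To make the idea of factor analysis concrete, here is a minimal sketch of factor extraction on simulated questionnaire data. The six items, the two underlying latent factors, and the eigenvalue-greater-than-one retention rule are all illustrative assumptions, not the procedure used in the Suh et al. study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 200 participants answering 6 questionnaire items.
# Items 0-2 are driven by one latent factor, items 3-5 by another.
f1 = rng.normal(size=200)
f2 = rng.normal(size=200)
noise = lambda: 0.3 * rng.normal(size=200)
items = np.column_stack([
    f1 + noise(), f1 + noise(), f1 + noise(),
    f2 + noise(), f2 + noise(), f2 + noise(),
])

# Principal-axis style extraction: eigendecompose the item correlation
# matrix and retain factors whose eigenvalue exceeds 1 (Kaiser rule).
corr = np.corrcoef(items, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]
n_factors = int(np.sum(eigvals > 1.0))

# Loadings show how strongly each item belongs to each retained factor.
loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
print(f"factors retained: {n_factors}")
```

With data this clean the analysis recovers the two groups of items; real questionnaire data is noisier, and dedicated routines (with rotation, fit statistics, and sample-size checks) would be used instead of this bare eigendecomposition.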
In HCI research, establishing validity implies constructing a multifaceted argu-
ment in favor of your interpretation of the data. If you can show that your interpreta-
tion is firmly grounded in the data, you go a long way towards establishing validity.
The first step in this process is often the construction of a database (Yin, 2014) that
includes all the materials that you collect and create during the course of the study,
including notes, documents, photos, and tables. Procedures and products of your analysis, including summaries, explanations, and tabular presentations of data, can be included in the database as well.
If your raw data is well organized in your database, you can trace the analytic
results back to the raw data, verifying that relevant details behind the cases and the
circumstances of data collection are similar enough to warrant comparisons between
observations. This linkage forms a chain of evidence, indicating how the data sup-
ports your conclusions (Yin, 2014). Analytic results and descriptions of this chain of
evidence can be included in your database, providing a roadmap for further analysis.
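One way to keep that chain of evidence explicit is to store each analytic result together with pointers back to the raw materials it rests on. The structure below is a hypothetical sketch of such a study database, not a prescribed format; all record names and contents are invented:

```python
# A minimal case-study database in the spirit of Yin (2014): every
# analytic result keeps explicit links to the raw items supporting it,
# forming a traceable chain of evidence.
database = {
    "raw": {
        "note-01": "Field note: participant struggled with menu navigation.",
        "photo-03": "Photo of annotated paper prototype.",
        "doc-07": "Transcript of interview with participant P4.",
    },
    "analysis": [],
}

def add_finding(db, summary, evidence_ids):
    """Record an analytic result only if its evidence exists in the database."""
    missing = [e for e in evidence_ids if e not in db["raw"]]
    if missing:
        raise ValueError(f"evidence not in database: {missing}")
    db["analysis"].append({"summary": summary, "evidence": evidence_ids})

add_finding(database,
            "Navigation is a recurring usability barrier.",
            ["note-01", "doc-07"])

# Trace each conclusion back to the raw data that supports it.
for finding in database["analysis"]:
    print(finding["summary"], "<-", finding["evidence"])
```

Requiring every finding to name its supporting raw items makes it straightforward to verify, during later analysis or review, that conclusions are grounded in the collected data.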