scale, and the ability of two parallel forms to measure the same concept (Wimmer & Dominick, 1997, pp. 54-55).¹
Validity frequently is defined as whether a test measures what it is supposed to measure. There are, however, many other forms of validity that can affect an assessment. Evaluators look to see whether a test has face validity, predictive validity, concurrent validity, and construct validity to determine whether a test's questions fairly measure what they ask of the respondent (Wimmer & Dominick, 1997, pp. 55-56). More precisely, in performance situations such as a writing exam, validity addresses the significance of test scores. Samuel Messick, of ETS, following educational researcher L. J. Cronbach's view of test validation, stated that

these scores are a function not only of the items or stimulus conditions but also of the persons responding as well as the context of the assessment. In particular, what needs to be valid is the meaning or interpretation of the scores as well as any implications for action that this meaning entails. (1989, p. 15)
Generally speaking, for most genuine writing evaluation circumstances, validity is not a totalizing situation; validity depends on the evaluators' skill in judging whether an item measures what it is supposed to. Even Messick (1989) supported this position. For Messick, va-
¹ Briefly, following Wimmer and Dominick's explanations (1997, pp. 55-56), face validity describes whether, on the face of an exam or an assignment, the question measures what it is supposed to measure. Predictive validity examines an assessment instrument against a future outcome. In writing assessment, if a multiple-choice exam on grammar can predict the success of students in a first-year composition (FYC) course because the exam correlated positively with passing scores in FYC, then faculty can say that the exam has high predictive validity even though the face validity is extremely low. This is because the multiple-choice exam is not testing the students' writing, only a subset of skills. Concurrent validity evaluates how a measurement tool performs against an established criterion. For instance, if writing teachers wanted to gauge the validity of an editing exam, they could administer the exam to a group of professional copyeditors and a group of students. As Wimmer and Dominick noted, if the exam shows a clear discrimination between the two groups (and, of course, it should, based on predictive ability), then faculty can claim that the editing exam has concurrent validity. Construct validity connects the measurement tool to a theoretical structure to show a connection to other items in that structure. Linking this idea to composition classes, an assessment instrument needs to relate to the program's or the instructor's pedagogical practices to indicate some relation between what is being measured and other variables in the course. The converse is also possible: an assessment method can have construct validity if it does not relate to other variables in the course when there is no theoretical or pedagogical reason for a relationship to exist. System validity describes the relation that the exam or evaluation has to a larger structure, such as a writing curriculum or institution, to ensure that what is being assessed bears a relation to the stated goals outlined by a program, department, or institution.
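(For readers who want a concrete sense of the quantitative evidence behind two of these terms, the following is a minimal illustrative sketch, not drawn from Wimmer and Dominick or from the source text. It assumes hypothetical score data and shows, in Python, the kind of computation often offered as predictive evidence, a correlation between exam scores and later course outcomes, and as concurrent evidence, a score gap between criterion groups such as the copyeditors and students in the footnote's example.)

```python
# Illustrative sketch only: hypothetical data and variable names,
# showing two statistics commonly cited as validity evidence.
from statistics import mean, stdev


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))


# Predictive validity: do grammar-exam scores track later FYC course grades?
grammar_exam = [62, 71, 80, 55, 90, 68, 77]          # hypothetical placement scores
fyc_grades = [2.3, 2.7, 3.3, 2.0, 3.8, 2.7, 3.0]     # hypothetical course outcomes
print("predictive evidence (r):", round(pearson_r(grammar_exam, fyc_grades), 2))

# Concurrent validity: does an editing exam separate professional copyeditors
# from students, the two criterion groups in the footnote's example?
copyeditors = [88, 92, 85, 90, 94]
students = [70, 74, 65, 72, 68]
print("concurrent evidence (mean difference):",
      round(mean(copyeditors) - mean(students), 1))
```

A high positive correlation in the first case, or a clear gap between groups in the second, would be taken as supporting evidence; neither statistic by itself establishes the meaning or interpretation of the scores, which is Messick's larger point.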