Page 97 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R
76 2 Presenting and Summarising the Data
objects to be assigned to one of c categories (columns). Furthermore, assume that k
judges assigned the objects to the categories, with $n_{ij}$ representing the number of
judges that assigned object i to category j.
The counts along each row sum to k. Let $c_j$ denote the sum of the
counts along column j. If all the judges were in perfect agreement, each row would
have one cell filled in with k and the others with zeros; in the extreme case where
they also place every object in the same category, one of the $c_j$ would be
rk and the others zero. The proportion of objects assigned to the jth category is:
$$p_j = c_j / (rk).$$
If the judges make their assignments at random, the expected proportion of
agreement for each category is $p_j^2$ and the total expected agreement for all
categories is:
$$P(E) = \sum_{j=1}^{c} p_j^2. \qquad (2.30)$$
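As a minimal sketch of formula 2.30, the following Python fragment computes the proportions $p_j$ and the expected agreement P(E) from a table of counts $n_{ij}$. The function name `expected_agreement` and the small 3 x 3 table are hypothetical, chosen only for illustration; the table is not taken from any dataset in the text.

```python
def expected_agreement(counts):
    """Return (p, P_E) for an r x c table of counts n_ij.

    Assumes every row sums to the same number of judges k.
    """
    r = len(counts)          # number of objects (rows)
    c = len(counts[0])       # number of categories (columns)
    k = sum(counts[0])       # judges per object
    # Column sums c_j, then proportions p_j = c_j / (r k)
    col_sums = [sum(row[j] for row in counts) for j in range(c)]
    p = [cj / (r * k) for cj in col_sums]
    # Formula 2.30: P(E) is the sum of the squared proportions
    return p, sum(pj ** 2 for pj in p)


# Hypothetical table: r = 3 objects, c = 3 categories, k = 4 judges
counts = [[4, 0, 0],
          [2, 2, 0],
          [0, 1, 3]]
p, P_E = expected_agreement(counts)
print(p, P_E)
```

For this illustrative table the column sums are 6, 3 and 3, so the proportions are 0.5, 0.25 and 0.25 and P(E) = 0.375.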
The extent of agreement, $s_i$, concerning the ith object, is the ratio of the
number of pairs of judges that agree to the number of possible pairs of judges:
$$s_i = \sum_{j=1}^{c} \binom{n_{ij}}{2} \Big/ \binom{k}{2}.$$
The total proportion of agreement is the average of these proportions across all
objects:
$$P(A) = \frac{1}{r} \sum_{i=1}^{r} s_i. \qquad (2.31)$$
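The pair counting behind $s_i$ and formula 2.31 can be sketched in Python as follows. This is an illustrative fragment, not the book's own procedure: the function name `observed_agreement` and the 3 x 3 table of counts are assumptions made for the example.

```python
from math import comb


def observed_agreement(counts):
    """Return (s, P_A) for an r x c table of counts n_ij.

    Assumes every row sums to the same number of judges k (k >= 2).
    """
    k = sum(counts[0])
    possible_pairs = comb(k, 2)
    # s_i: agreeing pairs of judges for object i over all possible pairs;
    # comb(n, 2) is 0 whenever n < 2, so sparse cells contribute nothing
    s = [sum(comb(n, 2) for n in row) / possible_pairs for row in counts]
    # Formula 2.31: average of the s_i over the r objects
    return s, sum(s) / len(s)


# Hypothetical table: r = 3 objects, c = 3 categories, k = 4 judges
counts = [[4, 0, 0],
          [2, 2, 0],
          [0, 1, 3]]
s, P_A = observed_agreement(counts)
print(s, P_A)
```

With k = 4 there are 6 possible pairs per object. The three rows of this illustrative table contain 6, 2 and 3 agreeing pairs, so the $s_i$ are 1, 1/3 and 1/2, and P(A) = 11/18.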
The κ (kappa) statistic, based on the formulas 2.30 and 2.31, is defined as:
$$\kappa = \frac{P(A) - P(E)}{1 - P(E)}. \qquad (2.32)$$
If there is complete agreement among the judges, then κ = 1 (P(A) = 1, whatever
the value of P(E) < 1). If there is no agreement among the judges other than what would be
expected by chance, then κ = 0 (P(A) = P(E)).
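Formulas 2.30 to 2.32 can be combined into a short Python sketch that starts from the raw labels given by each judge. The helper names `ratings_to_counts` and `kappa`, and the small rating lists, are hypothetical and serve only to illustrate the computation; the sketch assumes every object is rated by the same number of judges.

```python
from math import comb


def ratings_to_counts(ratings, categories):
    """Build the r x c table n_ij from per-object lists of judge labels."""
    return [[row.count(cat) for cat in categories] for row in ratings]


def kappa(counts):
    """Kappa statistic of formula 2.32 for an r x c table of counts.

    Undefined (division by zero) when P(E) = 1, i.e. when every judge
    places every object in the same category.
    """
    r, c = len(counts), len(counts[0])
    k = sum(counts[0])  # judges per object, assumed constant
    p = [sum(row[j] for row in counts) / (r * k) for j in range(c)]
    P_E = sum(pj ** 2 for pj in p)                          # formula 2.30
    P_A = sum(sum(comb(n, 2) for n in row) / comb(k, 2)
              for row in counts) / r                        # formula 2.31
    return (P_A - P_E) / (1 - P_E)                          # formula 2.32


# Hypothetical ratings: 3 objects, 4 judges, categories 0, 1, 2
ratings = [[0, 0, 0, 0],
           [0, 0, 1, 1],
           [1, 2, 2, 2]]
k_partial = kappa(ratings_to_counts(ratings, [0, 1, 2]))

# Perfect agreement on every object
unanimous = [[0] * 4, [1] * 4, [2] * 4]
k_perfect = kappa(ratings_to_counts(unanimous, [0, 1, 2]))
print(round(k_partial, 4), round(k_perfect, 4))
```

For the first set of ratings, P(A) = 11/18 and P(E) = 3/8, which gives κ = 17/45, roughly 0.378. For the unanimous ratings P(A) = 1 while P(E) = 1/3, and κ = 1.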
Example 2.11
Q: Consider the FHR dataset, which includes 51 foetal heart rate cases, classified
by three human experts (E1C, E2C, E3C) and an automatic diagnostic system
(SPC) into three categories: normal (0), suspect (1) and pathologic (2). Determine
the degree of agreement among all 4 classifiers (experts and automatic system).