Page 97 - Applied Statistics Using SPSS, STATISTICA, MATLAB and R

76 2 Presenting and Summarising the Data

objects to be assigned to one of c categories (columns). Furthermore, assume that k
judges assigned the objects to the categories, with $n_{ij}$ representing the number of
judges that assigned object i to category j.
The counts along each row sum to k. Let $c_j$ denote the sum of the
counts along column j. If all the judges were in perfect agreement, each row would
have one cell filled in with k and the others with zeros; if, in addition, every object
were placed in the same category, one of the $c_j$ would be rk and the others zero.
The proportion of objects assigned to the jth category is:

$$p_j = \frac{c_j}{rk} .$$

If the judges make their assignments at random, the expected proportion of
agreement for each category is $p_j^2$ and the total expected agreement for all
categories is:

$$P(E) = \sum_{j=1}^{c} p_j^2 . \tag{2.30}$$

The extent of agreement, $s_i$, concerning the ith object, is the ratio of the
number of pairs of judges who agree to the number of possible pairs of judges:

$$s_i = \sum_{j=1}^{c} \binom{n_{ij}}{2} \Big/ \binom{k}{2} .$$
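
As a quick check of this ratio, consider a hypothetical object (not taken from any dataset in the text) rated by k = 4 judges, with counts (3, 1, 0) over c = 3 categories. Three judges agreeing yield three agreeing pairs, out of six possible pairs:

$$s_i = \frac{\binom{3}{2} + \binom{1}{2} + \binom{0}{2}}{\binom{4}{2}} = \frac{3 + 0 + 0}{6} = 0.5 .$$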

The total proportion of agreement is the average of these proportions across all
objects:

$$P(A) = \frac{1}{r} \sum_{i=1}^{r} s_i . \tag{2.31}$$

The κ (kappa) statistic, based on formulas 2.30 and 2.31, is defined as:

$$\kappa = \frac{P(A) - P(E)}{1 - P(E)} . \tag{2.32}$$

If there is complete agreement among the judges, then κ = 1 (P(A) = 1, whatever
the value of P(E) < 1). If there is no agreement among the judges other than what
would be expected by chance, then κ = 0 (P(A) = P(E)).
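
The computation in formulas 2.30 to 2.32 can be sketched in a few lines of Python. This is only an illustrative sketch, not one of the book's tool-specific commands: the function name `kappa` and the small count table are assumptions made for the example, and the table is made-up data with r = 4 objects, c = 3 categories and k = 4 judges.

```python
from math import comb

def kappa(counts):
    """Kappa for a table where counts[i][j] is the number of judges
    that assigned object i to category j (hypothetical helper)."""
    r = len(counts)          # number of objects (rows)
    c = len(counts[0])       # number of categories (columns)
    k = sum(counts[0])       # number of judges
    if k < 2 or any(sum(row) != k for row in counts):
        raise ValueError("every row must sum to the same k >= 2")

    # Proportion of assignments falling in each category: p_j = c_j / (r k)
    p = [sum(row[j] for row in counts) / (r * k) for j in range(c)]
    p_e = sum(pj ** 2 for pj in p)                       # formula 2.30

    # Agreement per object: agreeing pairs over possible pairs of judges
    s = [sum(comb(n, 2) for n in row) / comb(k, 2) for row in counts]
    p_a = sum(s) / r                                     # formula 2.31

    # Undefined (division by zero) when all objects fall in one category
    return (p_a - p_e) / (1 - p_e)                       # formula 2.32

# Made-up count table, for illustration only
toy_counts = [
    [4, 0, 0],
    [0, 4, 0],
    [2, 2, 0],
    [3, 1, 0],
]
print(round(kappa(toy_counts), 4))  # 0.4074
```

For this toy table P(A) = 17/24 and P(E) = 65/128, giving κ = 11/27 ≈ 0.41. The sketch raises an error when the rows do not all sum to the same k, since the formulas assume that every object is rated by all k judges.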

Example 2.11
Q: Consider the FHR dataset, which includes 51 foetal heart rate cases, classified
by three human experts (E1C, E2C, E3C) and an automatic diagnostic system
(SPC) into three categories: normal (0), suspect (1) and pathologic (2). Determine
the degree of agreement among all 4 classifiers (experts and automatic system).
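
Before κ can be computed for a problem of this kind, the per-classifier labels have to be tabulated into the counts $n_{ij}$. A minimal Python sketch of that tabulation step follows; the four label lists are invented for illustration and are not the FHR data, and the helper name `count_table` is an assumption.

```python
from collections import Counter

# Hypothetical labels, NOT the FHR data: one list per classifier,
# one entry per case, coded 0 (normal), 1 (suspect), 2 (pathologic).
ratings = {
    "E1C": [0, 1, 2, 0],
    "E2C": [0, 1, 1, 0],
    "E3C": [0, 2, 2, 1],
    "SPC": [0, 1, 2, 0],
}
categories = [0, 1, 2]

def count_table(ratings, categories):
    """Return rows of counts: row i holds, for each category, the number
    of classifiers that assigned case i to it."""
    table = []
    for labels in zip(*ratings.values()):   # labels given to one case
        tally = Counter(labels)
        table.append([tally[cat] for cat in categories])
    return table

counts = count_table(ratings, categories)
for row in counts:
    print(row)
```

Each row of `counts` sums to the number of classifiers (here k = 4), which is the layout the κ formulas expect.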
