Chapter 8 ■ Classification 301
entries, tomatoes both, and the area and the green component are used in a
classification, then the data points are:
P = (1634, 46)
Q = (1384, 53)
The Euclidean distance between these two is:
$$\sqrt{(1634 - 1384)^2 + (46 - 53)^2} = \sqrt{62500 + 49} \approx 250.1$$
Now change the green component of P by 1 to (1634, 45). The distance
between P and Q is now 250.13. Changing the area component by 1 so that
P = (1635, 46) changes the P − Q distance to 251.1. This shows that a change
in the first coordinate makes a bigger difference in the distance than does a
change in the second. Or in other words, the scales of the two coordinate axes
are different. This is very common in computer vision problems, and it really
does make sense. Why would we expect that each of the measurements would
have units of the same size?
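The arithmetic above can be checked with a short sketch. The function name `euclidean` and the variable names are illustrative choices, not taken from the text; the numbers are the ones given for P and Q.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

P = (1634, 46)   # (area, green component)
Q = (1384, 53)

base = euclidean(P, Q)                  # original distance
green_shift = euclidean((1634, 45), Q)  # green component of P changed by 1
area_shift = euclidean((1635, 46), Q)   # area component of P changed by 1

# Print the three distances and how much each one-unit change moved the result.
print(f"base={base:.2f} green={green_shift:.2f} area={area_shift:.2f}")
print(f"green delta={green_shift - base:.3f} area delta={area_shift - base:.3f}")
```

A one-unit change in the area moves the distance by about 1.0, while a one-unit change in the green component moves it by only about 0.03, because the difference along the area axis already dominates the sum of squares.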
Normalizing with respect to scale can be done using statistics. The standard
deviation is a measure of variability: it indicates how widely the values of a
feature spread around their mean. Dividing sample values by the standard
deviation of their feature narrows the range of values and expresses every
feature in the same units, multiples of its own standard deviation. This is the
basic idea behind Mahalanobis distance. For example, consider the same points
P and Q as before and the normalized points P′ and Q′. The overall standard
deviations are:
$$s_{\text{area}} = 429.5 \qquad s_{\text{green}} = 25.2$$
The points are:
P  = (1634, 46)    Q  = (1384, 53)    distance(P, Q)   = 250.1
P′ = (3.8, 1.83)   Q′ = (3.2, 2.1)    distance(P′, Q′) = 0.64
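The normalization can be reproduced with a minimal sketch that divides each coordinate by the standard deviation of its feature and then takes the ordinary Euclidean distance. The helper names `normalize` and `euclidean` are assumptions made for illustration; the standard deviations are the values quoted above.

```python
import math

S_AREA = 429.5    # standard deviation of the area feature (from the text)
S_GREEN = 25.2    # standard deviation of the green feature (from the text)

def normalize(point, stds):
    """Divide each coordinate by the standard deviation of its feature."""
    return tuple(v / s for v, s in zip(point, stds))

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

stds = (S_AREA, S_GREEN)
P_norm = normalize((1634, 46), stds)
Q_norm = normalize((1384, 53), stds)
d_norm = euclidean(P_norm, Q_norm)

print(tuple(round(v, 2) for v in P_norm))
print(tuple(round(v, 2) for v in Q_norm))
print(round(d_norm, 2))
```

After normalization the two axes contribute on comparable scales: the area difference is about 0.58 standard deviations and the green difference about 0.28.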
The standard deviations are used to normalize the raw sample values before
computing distance. It’s actually more complex than that, because the features
can also vary together, so the full covariance matrix is needed rather than one
standard deviation per axis. The formula for computing the Mahalanobis distance
between P and Q is:
$$d_M(P, Q) = \sqrt{(P - Q)^T S^{-1} (P - Q)} \qquad \text{(EQ 8.5)}$$
which is a matrix equation, in which P and Q are the points (vectors) for which
the distance is being computed, $(P - Q)^T$ is the transpose of the difference of
the vectors, and S is the covariance matrix.
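As a rough illustration of the matrix form, the sketch below evaluates the quadratic form for two-dimensional points and takes its square root, which is the usual convention for Mahalanobis distance. The function name `mahalanobis_2d` is hypothetical. The text does not give the covariance between area and green, so the first call assumes it is zero (a diagonal S built from the quoted standard deviations) and the second uses an invented off-diagonal value purely to show that correlation changes the result.

```python
import math

def mahalanobis_2d(p, q, cov):
    """sqrt((p - q)^T S^-1 (p - q)) for 2-D points and a 2x2 covariance S."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx, dy = p[0] - q[0], p[1] - q[1]
    quad = (dx * (inv[0][0] * dx + inv[0][1] * dy)
            + dy * (inv[1][0] * dx + inv[1][1] * dy))
    return math.sqrt(quad)

P = (1634, 46)
Q = (1384, 53)

# Assumption: uncorrelated features, so S holds only the two variances.
S_diag = ((429.5 ** 2, 0.0), (0.0, 25.2 ** 2))
d_diag = mahalanobis_2d(P, Q, S_diag)

# Hypothetical covariance of 2000.0 between area and green (not from the text).
S_corr = ((429.5 ** 2, 2000.0), (2000.0, 25.2 ** 2))
d_corr = mahalanobis_2d(P, Q, S_corr)

print(round(d_diag, 2), round(d_corr, 2))
```

With the diagonal matrix the result matches the distance between the normalized points, about 0.64; a nonzero off-diagonal term gives a different value, which is the extra information the covariance matrix contributes.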
The variance is the mean of the squared distances between a value and the
mean of those values:
$$\mathrm{VAR} = \frac{\sum_{i=1}^{n} (P_i - \mu_i)(P_i - \mu_i)}{n - 1} \qquad \text{(EQ 8.6)}$$
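A minimal sketch of the variance computation follows, assuming a single mean taken over all n values and the n − 1 divisor shown in the formula. The function name `sample_variance` and the sample list are made up for illustration and are not data from the text; the result is compared against the standard library's `statistics.variance`, which uses the same divisor.

```python
import statistics

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    return sum((v - mean) * (v - mean) for v in values) / (n - 1)

values = [2.0, 4.0, 4.0, 6.0]          # hypothetical measurements
var = sample_variance(values)
std = var ** 0.5                        # standard deviation is its square root

print(var, statistics.variance(values))
print(std)
```

The standard deviations used earlier for normalization are the square roots of variances computed this way, one per feature.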

