Page 326 -

P. 326

300 Chapter 8 ■ Classiﬁcation

8.2.1 Distance Metrics

The common, intuitive deﬁnition of distance is called Euclidean distance,
because of Euclid’s connection with many other common geometric con-
cepts. It should be (and was in the past) called the Pythagorean distance because
it uses the famous formula for the hypotenuse. The distance between a point
P = (p 1 , p 2 ) and a point Q = (q 1 , q 2 )is:

2
d = (p 1 − q 1 ) + (p 2 − q 2 ) 2 (EQ 8.1)
For points in a space having more than two dimensions, say N dimensions,
this formula generalizes as:
%
& N
1 &
2 2 2 2
((p 1 − q 1 ) + (p 2 + q 2 ) + ··· + p N + q N ) ) 2 = ' (p i − q i ) (EQ 8.2)
i = 1
This is the distance ‘‘as the crow ﬂies,’’ and while it makes sense in everyday
life, there are problems with it in images. The main one is that pixel locations are
integers, whereas the distance between pixels can be ﬂoating point. Another
practical problem is that this calculation requires a square root operation,
which is likely to take a hundred times longer to calculate than a simple
integer operation. It is true that computers are faster than they used to be, but
images have gotten bigger, too. Therefore, it is usual to omit the square root
2
and work with d whenever possible.
A commonly used distance measure when using pixels is the 8-distance.
This is the maximum of the horizontal and vertical difference between the
coordinates of the pixel; or, for the previously deﬁned P and Q:

d 8 = max(|p 1 − q 1 |, |p 2 − q 2 |) (EQ 8.3)

One way to think of this is as the number of pixels between P and Q.It
is called 8-distance because the path traced between P and Q uses the eight
discrete directions that are possible on a discrete grid.
If there is an 8-distance, then why not a 4-distance? There is, and it is
also called Manhattan distance or city block distance. It is the distance in pixels
between P and Q using only up/down and left/right directions (4 connected
pixels). Mathematically:

d 4 =|p 1 − q 1 |+ |p 2 − q 2 | (EQ 8.4)

Finally, at least for the purposes here, there is the most exotic, complex,
and useful distance measure: Mahanalobis distance. It is difﬁcult to explain
in the general case, but for speciﬁc examples in classiﬁcation it is more
obvious. Consider the data in Table 8.1 again. If P and Q are the ﬁrst two

321 322 323 324 325 326 327 328 329 330 331