300 Chapter 8 ■ Classification
8.2.1 Distance Metrics
The common, intuitive definition of distance is called Euclidean distance,
because of Euclid’s connection with many other common geometric concepts.
It should be (and was in the past) called the Pythagorean distance because
it uses the famous formula for the hypotenuse. The distance between a point
$P = (p_1, p_2)$ and a point $Q = (q_1, q_2)$ is:
$$d = \sqrt{(p_1 - q_1)^2 + (p_2 - q_2)^2}$$ (EQ 8.1)
For points in a space having more than two dimensions, say N dimensions,
this formula generalizes as:
$$\left((p_1 - q_1)^2 + (p_2 - q_2)^2 + \cdots + (p_N - q_N)^2\right)^{1/2} = \sqrt{\sum_{i=1}^{N} (p_i - q_i)^2}$$ (EQ 8.2)
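As a minimal sketch of EQ 8.2, the following Python function computes the Euclidean distance between two points given as equal-length sequences of coordinates. The function name `euclidean_distance` is chosen here for illustration and does not come from the text; the two-dimensional case of EQ 8.1 is simply the call with two coordinates per point.

```python
import math

def euclidean_distance(p, q):
    """Euclidean distance between two points of equal dimension (EQ 8.2)."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of dimensions")
    # Sum the squared coordinate differences, then take the square root.
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

print(euclidean_distance((0, 0), (3, 4)))        # 5.0 (the 3-4-5 right triangle)
print(euclidean_distance((1, 2, 2), (0, 0, 0)))  # 3.0
```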
This is the distance “as the crow flies,” and while it makes sense in everyday
life, it has drawbacks when applied to images. The main one is that pixel
locations are integers, whereas the distance between pixels is generally not
an integer and so requires floating-point arithmetic. Another practical
problem is that this calculation requires a square root operation, which can
take many times longer to calculate than a simple integer operation. It is
true that computers are faster than they used to be, but
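images have gotten bigger, too. Therefore, it is usual to omit the square root
and work with $d^2$ whenever possible. Because the square root is a monotonic
function, comparing values of $d^2$ ranks distances in exactly the same order
as comparing values of $d$.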
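The idea of working with the squared distance can be sketched as follows, assuming the task is to find which of several candidate pixels is closest to a target pixel. The helper names `squared_distance` and `nearest` are hypothetical and used only for this illustration. For integer pixel coordinates, every intermediate value stays an integer and no square root is taken.

```python
def squared_distance(p, q):
    """Squared Euclidean distance; an integer when the coordinates are integers."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))

def nearest(target, candidates):
    """Return the candidate closest to target, comparing d^2 instead of d."""
    return min(candidates, key=lambda c: squared_distance(target, c))

pixels = [(10, 10), (3, 4), (6, 1)]
# Squared distances from (0, 0) are 200, 25, and 37 respectively.
print(nearest((0, 0), pixels))  # (3, 4)
```

The winner is the same one that a comparison of true Euclidean distances would select, since taking the square root would not change the ordering.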
A commonly used distance measure when using pixels is the 8-distance.
This is the maximum of the horizontal and vertical differences between the
coordinates of the two pixels; or, for the previously defined P and Q:
$$d_8 = \max(|p_1 - q_1|, |p_2 - q_2|)$$ (EQ 8.3)
One way to think of this is as the number of pixel steps needed to move from
P to Q when diagonal steps are allowed. It is called 8-distance because the
path traced between P and Q uses the eight discrete directions that are
possible on a discrete grid.
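A short sketch of the 8-distance, under the assumption that pixels are given as (x, y) integer pairs. The function `distance_8` implements EQ 8.3 directly; the second function, `walk_8`, is a hypothetical helper added only to illustrate the "number of steps" interpretation by actually stepping from P to Q through 8-connected neighbors and counting the moves.

```python
def distance_8(p, q):
    """8-distance between two 2-D pixels (EQ 8.3)."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def walk_8(p, q):
    """Count the steps from p to q when each step moves to one of the 8 neighbors."""
    x, y = p
    steps = 0
    while (x, y) != tuple(q):
        x += sign(q[0] - x)  # move one column toward q, if needed
        y += sign(q[1] - y)  # move one row toward q, if needed
        steps += 1
    return steps

print(distance_8((2, 3), (7, 5)))  # 5
print(walk_8((2, 3), (7, 5)))      # 5
```

The two results agree because each step can reduce both the horizontal and the vertical difference by one, so the larger of the two differences determines the step count.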
If there is an 8-distance, then why not a 4-distance? There is, and it is
also called Manhattan distance or city block distance. It is the distance in pixels
between P and Q using only up/down and left/right directions (4-connected
pixels). Mathematically:
$$d_4 = |p_1 - q_1| + |p_2 - q_2|$$ (EQ 8.4)
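The 4-distance of EQ 8.4 can be sketched in the same way, again assuming (x, y) integer pixel pairs and an illustrative function name, `distance_4`. The example also computes the 8-distance and the Euclidean distance for the same pair of pixels, to show that in two dimensions the three measures satisfy $d_8 \le d \le d_4$.

```python
import math

def distance_4(p, q):
    """4-distance (Manhattan / city block) between two 2-D pixels (EQ 8.4)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

p, q = (2, 3), (7, 5)
d4 = distance_4(p, q)                          # 5 + 2 = 7
d8 = max(abs(p[0] - q[0]), abs(p[1] - q[1]))   # max(5, 2) = 5
d = math.hypot(p[0] - q[0], p[1] - q[1])       # sqrt(29), between d8 and d4

print(d8, d, d4)
assert d8 <= d <= d4
```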
Finally, at least for the purposes here, there is the most exotic, complex,
and useful distance measure: Mahalanobis distance. It is difficult to explain
in the general case, but for specific examples in classification it is more
obvious. Consider the data in Table 8.1 again. If P and Q are the first two

