Experiment 3: A similar experiment was conducted for a double exponential waveform as

$$x(t) = a\, e^{-|t-m|/\tau} , \tag{6.118}$$

where the three parameters are uniformly distributed in

$$0.7 \le a \le 1.3 , \quad 0.3 \le m \le 0.7 , \quad 0.3 \le \tau \le 0.6 . \tag{6.119}$$
Using eight sampling points and 250 waveforms, the intrinsic dimensionality
of the data was estimated, and the results are shown in Table 6-4. Again,
fairly accurate estimates of the intrinsic dimensionality (which is 3) were
obtained.
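To make the procedure concrete, the following Python sketch reproduces this experiment. Since (6.115) is not reproduced in this excerpt, the estimator below assumes the kNN-distance form n ≈ d̄_k / [k(d̄_{k+1} − d̄_k)], which follows from the average kth nearest neighbor distance growing as k^(1/n); the sampling grid on [0, 1] and the choice k = 3 are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Generate 250 double-exponential waveforms x(t) = a*exp(-|t - m|/tau),
# sampled at 8 points, with the parameter ranges of (6.119).
# The grid t in [0, 1] is an illustrative assumption.
n_waves, n_samples = 250, 8
t = np.linspace(0.0, 1.0, n_samples)
a   = rng.uniform(0.7, 1.3, n_waves)
m   = rng.uniform(0.3, 0.7, n_waves)
tau = rng.uniform(0.3, 0.6, n_waves)
X = a[:, None] * np.exp(-np.abs(t[None, :] - m[:, None]) / tau[:, None])

def intrinsic_dim(X, k=3):
    """kNN-distance estimate of intrinsic dimensionality (sketch).

    Assumes n ~ dbar_k / (k * (dbar_{k+1} - dbar_k)), which follows from
    E[d_kNN] being proportional to k^(1/n); the exact form of (6.115)
    may differ in detail.
    """
    # Pairwise Euclidean distances; sorting each row puts the
    # zero distance to self first, then the NN distances in order.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    D.sort(axis=1)
    dbar_k  = D[:, k].mean()       # average kth-NN distance
    dbar_k1 = D[:, k + 1].mean()   # average (k+1)th-NN distance
    return dbar_k / (k * (dbar_k1 - dbar_k))

print(f"estimated intrinsic dimensionality: {intrinsic_dim(X):.2f}")  # ~3
```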
Experiment 4: The intrinsic dimensionalities of Data RADAR were
estimated by (6.115). They were found to be 19.8 for Chevrolet Camaro and
17.7 for Dodge Van, down from the original dimensionality of 66. This indicates that the number of features could be reduced significantly. Although this technique does not suggest how to reduce the number of features, the above numbers can serve as a guide to how small the number of features should be.
Very Large Number of Classes
Another application in which the kNN distance is useful is a
classification scenario where the number of classes is very large, perhaps in the
hundreds. For simplicity, let us assume that we have N classes whose expected
vectors Mi (i = 1, ..., N) are distributed uniformly with a covariance matrix I,
and each class is distributed normally with the covariance matrix σ²I.
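As a concrete picture of this model, here is a minimal sampling sketch. Drawing the means uniformly from a cube of half-width √3 (which has unit variance per coordinate) is one way to realize "uniform with covariance I"; the values of N, the dimensionality, and σ are illustrative assumptions, not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative model parameters (not values from the text).
N, dim, sigma = 200, 8, 0.2

# Class means: uniform on a cube of half-width sqrt(3), which has
# variance 1 per coordinate, i.e. covariance matrix I.
M = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=(N, dim))

def sample_class(i, n=10):
    """Draw n samples from class omega_i ~ N(M_i, sigma^2 I)."""
    return M[i] + sigma * rng.standard_normal((n, dim))
```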
When only a pair of classes, ωi and ωj, is considered, the Bayes classifier becomes a bisector between Mi and Mj, and the resulting error is

$$\varepsilon_p = \int_{d(M_i,M_j)/2\sigma}^{\infty} \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}\, dx \quad \text{(pairwise error)} , \tag{6.120}$$
where d(Mi, Mj) is the Euclidean distance between Mi and Mj. Equation (6.120) indicates that εp depends only on the signal-to-noise ratio, d(Mi, Mj)/σ.
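The sketch below evaluates (6.120) with the standard normal tail function and checks it by Monte Carlo against the bisector rule; the pair of means and the noise level are made-up illustrative values.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

def pairwise_error(Mi, Mj, sigma):
    """Eq. (6.120): Gaussian tail beyond d(Mi, Mj) / (2*sigma)."""
    d = np.linalg.norm(Mi - Mj)
    return norm.sf(d / (2.0 * sigma))  # sf(x) = integral of N(0,1) from x to inf

# Illustrative pair of means and noise level (not values from the text).
Mi, Mj, sigma = np.array([0.0, 0.0]), np.array([1.0, 0.5]), 0.4
eps_p = pairwise_error(Mi, Mj, sigma)

# Monte Carlo check: classify samples of omega_i with the bisector rule,
# i.e. assign each sample to whichever mean is closer.
X = Mi + sigma * rng.standard_normal((200_000, 2))
err = np.mean(np.linalg.norm(X - Mj, axis=1) < np.linalg.norm(X - Mi, axis=1))
print(f"analytic eps_p = {eps_p:.4f}, simulated = {err:.4f}")
```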
When the number of classes is increased, Mi is surrounded by many neighboring classes, as seen in Fig. 6-5, where MkNN is the center of the kth nearest