Page 299 - Introduction to Statistical Pattern Recognition
6 Nonparametric Density Estimation 281
The conventional technique used to measure the dimensionality is to
compute the eigenvalues and eigenvectors of the covariance matrix and count
the number of dominant eigenvalues. The corresponding eigenvectors form the
effective subspace. Although this technique is powerful, it is limited because it
is based on a linear transformation. For example, in Fig. 6-3, a
one-dimensional distribution is shown by a solid line. The eigenvalues and
eigenvectors of this distribution are the same as those of the two-dimensional
normal distribution shown by the dotted line. Thus, the conventional technique
fails to reveal the intrinsic dimensionality, which is one for this example.

Fig. 6-3 Intrinsic dimensionality and linear mapping.
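
The limitation of the eigenvalue count can be illustrated numerically. The following is a minimal sketch, not the procedure behind Fig. 6-3: the circle and the straight line are stand-in data sets chosen for this illustration, the function names are hypothetical, and the rule "dominant means at least 0.1 times the largest eigenvalue" is an arbitrary assumption.

```python
import math

def covariance_2d(points):
    """Return (sxx, sxy, syy), the entries of the sample covariance matrix."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return sxx, sxy, syy

def eigenvalues_sym_2x2(sxx, sxy, syy):
    """Closed-form eigenvalues of [[sxx, sxy], [sxy, syy]], largest first."""
    mean = 0.5 * (sxx + syy)
    half_gap = math.hypot(0.5 * (sxx - syy), sxy)
    return mean + half_gap, mean - half_gap

def count_dominant(eigenvalues, threshold=0.1):
    """Count eigenvalues at least `threshold` times the largest (assumed rule)."""
    largest = max(eigenvalues)
    return sum(1 for lam in eigenvalues if lam >= threshold * largest)

# A one-dimensional curve (a circle) embedded in two dimensions.
circle = [(math.cos(2 * math.pi * i / 200), math.sin(2 * math.pi * i / 200))
          for i in range(200)]
# A straight line, which a linear transformation can describe exactly.
line = [(t / 100.0, 2.0 * t / 100.0) for t in range(200)]

circle_dim = count_dominant(eigenvalues_sym_2x2(*covariance_2d(circle)))
line_dim = count_dominant(eigenvalues_sym_2x2(*covariance_2d(line)))
print(circle_dim, line_dim)
```

Under these assumptions the count is 2 for the circle and 1 for the line, although both sets of samples lie on one-dimensional curves. The covariance matrix of the circle is indistinguishable from that of an isotropic two-dimensional distribution.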
The intrinsic dimensionality is, in essence, a local characteristic of the
distribution, as shown in Fig. 6-4. If we establish small local regions around
X1, X2, X3, etc., the dimensionality within the local region must be close to 1
[19],[20]. Because of this, the intrinsic dimensionality is sometimes called the
local dimensionality. This approach is similar to the local linearization of a
nonlinear function.
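
The local view can be sketched in the same spirit: restrict the eigenvalue count to the samples that fall within a small region around a chosen point. This is only an illustration under stated assumptions. The circle data, the region radius of 0.2, the 0.1 dominance threshold, and the function name `dominant_count_in_region` are all choices made here and are not taken from [19] or [20].

```python
import math

def dominant_count_in_region(points, center, radius, threshold=0.1):
    """Count dominant covariance eigenvalues of the points near `center`."""
    # Keep only the samples inside the local region.
    region = [p for p in points
              if math.hypot(p[0] - center[0], p[1] - center[1]) <= radius]
    n = len(region)
    mx = sum(x for x, _ in region) / n
    my = sum(y for _, y in region) / n
    sxx = sum((x - mx) ** 2 for x, _ in region) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in region) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in region) / (n - 1)
    # Closed-form eigenvalues of the symmetric 2x2 covariance matrix.
    mean = 0.5 * (sxx + syy)
    half_gap = math.hypot(0.5 * (sxx - syy), sxy)
    eigenvalues = (mean + half_gap, mean - half_gap)
    # Assumed rule: dominant means at least `threshold` times the largest.
    return sum(1 for lam in eigenvalues if lam >= threshold * eigenvalues[0])

# Samples on a circle: one-dimensional locally, two-dimensional globally.
samples = [(math.cos(2 * math.pi * i / 400), math.sin(2 * math.pi * i / 400))
           for i in range(400)]

# Small regions around three sample points, in the role of X1, X2, X3.
centers = [samples[0], samples[100], samples[250]]
local_counts = [dominant_count_in_region(samples, c, radius=0.2)
                for c in centers]

# One region large enough to contain every sample.
global_count = dominant_count_in_region(samples, (0.0, 0.0), radius=2.0)
print(local_counts, global_count)
```

With these settings each small region gives a count of 1, while the region that contains all samples gives 2. The result depends on the region size: a region that is too large picks up the curvature, and one that is too small holds too few samples for a stable covariance estimate.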
When k nearest neighbors are used to estimate dimensionality, the esti-
mate relies on the local properties of the distribution and is not related to the
global properties. Thus, the estimated dimensionality must be the intrinsic
dimensionality. Keeping this in mind, let us compute the ratio of two NN dis-
tances from (6.108)-(6.110)