Page 299 - Introduction to Statistical Pattern Recognition

6  Nonparametric Density Estimation                           281



                        The  conventional  technique  used  to  measure  the  dimensionality is  to
                   compute the eigenvalues and eigenvectors of the covariance matrix and count
                   the number of dominant eigenvalues.  The corresponding eigenvectors form the
                   effective subspace.  Although this technique is powerful, it is limited because it
                    is based on a linear transformation.  For example, in Fig. 6-3, a
                    one-dimensional distribution is shown by a solid line.  The eigenvalues and
                    eigenvectors of this distribution are the same as those of the two-dimensional
                    normal distribution shown by the dotted line.  Thus, the conventional technique
                    fails to reveal the intrinsic dimensionality, which is one for this example.

                              Fig. 6-3  Intrinsic dimensionality and linear mapping.

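
The limitation illustrated by Fig. 6-3 can be reproduced numerically. The following is a minimal sketch, not the book's procedure: it assumes a unit circle as a stand-in for the one-dimensional curve of the figure, and the helper names (`covariance_2d`, `eigenvalues_2x2`, `count_dominant`) as well as the 5% dominance threshold are hypothetical choices made only for illustration.

```python
import math

def covariance_2d(points):
    """Return the sample covariance entries (sxx, sxy, syy) of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return sxx, sxy, syy

def eigenvalues_2x2(sxx, sxy, syy):
    """Eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]], largest first."""
    mean = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    return mean + radius, mean - radius

def count_dominant(eigenvalues, ratio=0.05):
    """Count eigenvalues larger than `ratio` times the largest (assumed threshold)."""
    largest = max(eigenvalues)
    return sum(1 for lam in eigenvalues if lam > ratio * largest)

# Assumed example: equally spaced samples on a unit circle, which is a
# one-dimensional curve embedded in a two-dimensional space.
N = 400
curve = [(math.cos(2 * math.pi * i / N), math.sin(2 * math.pi * i / N))
         for i in range(N)]

lam1, lam2 = eigenvalues_2x2(*covariance_2d(curve))
global_dim = count_dominant((lam1, lam2))
print(lam1, lam2, global_dim)
```

Both eigenvalues come out essentially equal, so counting dominant eigenvalues reports two dimensions although every sample lies on a curve whose intrinsic dimensionality is one.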
                        The  intrinsic dimensionality  is,  in  essence, a  local  characteristic of  the
                    distribution, as  shown in  Fig. 6-4.  If  we  establish small local regions around
                    X1, X2, X3, etc., the dimensionality within the local region must be close to  1
                    [19],[20]. Because of this, the intrinsic dimensionality is sometimes called the
                    local dimensionality.  This approach is  similar to the  local  linearization of  a
                    nonlinear function.
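
The local view can be sketched in the same way. The code below is an illustrative assumption rather than the method of [19],[20]: it takes the nearest samples around a few chosen points as the "small local region", computes the covariance of that region only, and counts its dominant eigenvalues. The function `local_dimension`, the unit-circle data, the region size `m=10`, and the 5% threshold are all hypothetical.

```python
import math

def local_dimension(points, center, m, ratio=0.05):
    """Count dominant covariance eigenvalues among the m points nearest to center."""
    # The local region: the m samples closest to the center (Euclidean distance).
    region = sorted(points, key=lambda p: math.dist(p, center))[:m]
    n = len(region)
    mx = sum(p[0] for p in region) / n
    my = sum(p[1] for p in region) / n
    sxx = sum((p[0] - mx) ** 2 for p in region) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in region) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in region) / (n - 1)
    # Closed-form eigenvalues of the symmetric 2x2 matrix [[sxx, sxy], [sxy, syy]].
    mean = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    largest, smallest = mean + radius, mean - radius
    return sum(1 for lam in (largest, smallest) if lam > ratio * largest)

# Assumed data: samples on a unit circle (intrinsic dimensionality one).
N = 400
curve = [(math.cos(2 * math.pi * i / N), math.sin(2 * math.pi * i / N))
         for i in range(N)]

# Three arbitrary centers playing the role of X1, X2, X3.
centers = [curve[0], curve[100], curve[250]]
local_dims = [local_dimension(curve, c, m=10) for c in centers]

# Using every sample as the "region" falls back to the global estimate.
global_dim = local_dimension(curve, curve[0], m=N)
print(local_dims, global_dim)
```

Within each small region the curve is nearly a straight segment, so only one eigenvalue dominates, while the region covering all samples again yields two. The region size matters: if it grows large enough for the curvature to become visible, the second eigenvalue is no longer negligible.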
                         When  k  nearest neighbors are used  to estimate dimensionality, the esti-
                    mate relies on  the local properties of  the distribution and is not related to the
                    global  properties.  Thus,  the  estimated  dimensionality must  be  the  intrinsic
                    dimensionality.  Keeping this in mind, let us compute the ratio of two NN dis-
                    tances from (6.108)-(6.110)