
110     4 Statistical Classification


                           reproducibility are less accessible to formal analysis, and in  practice they have to
                           be assessed in a purely experimental way.


                           4.3.1 The Parzen Window Method

                           As explained previously, in order to estimate p(x) we will choose a sufficiently
                           small region R centred at x. Let us select R as a d-dimensional hypercube whose
                           edge h(n) varies with n. By varying h(n) we are able to select an appropriate
                           hypercube size depending on how many training patterns we have available. The
                           volume of R is:

                              $$V(n) = h(n)^d$$

                              Let us define the following counting function:

                              $$\varphi(\mathbf{x}) = \begin{cases} 1 & \text{if } |x_k| \le 1/2, \quad k = 1, \ldots, d; \\ 0 & \text{otherwise.} \end{cases}$$

                              Therefore, if a point falls inside the unit hypercube centred at the origin, it is
                           counted; otherwise, it is not counted. Using this function we can express compactly
                           the number k(n) of points x_i falling inside the hypercube of edge h(n) centred at x, as:

                              $$k(n) = \sum_{i=1}^{n} \varphi\!\left(\frac{\mathbf{x} - \mathbf{x}_i}{h(n)}\right)$$

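                              In this formula the function $\varphi$ is scaled by the hypercube edge length h(n), and the
                           counting criterion depends on the difference vector x - x_i between a feature vector
                           x and a training set feature vector x_i.
                              With this k(n) we can now express equation (4-32) as:

                              $$\hat{p}(\mathbf{x}, n) = \frac{k(n)}{n\,V(n)} = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{V(n)}\, \varphi\!\left(\frac{\mathbf{x} - \mathbf{x}_i}{h(n)}\right) \qquad (4\text{-}36)$$
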
                              From formula (4-36) we see that if there is a large agglomeration of points in the
                            immediate neighbourhood of x one obtains a high value of $\hat{p}(\mathbf{x}, n)$. If the number
                            of such points is small, the value of $\hat{p}(\mathbf{x}, n)$ is also small.
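
The hypercube estimate can be sketched in a few lines of Python. This is a minimal illustration, not the book's implementation: it assumes NumPy is available and that the training patterns are stored as an array of shape (n, d). The names `parzen_hypercube`, `samples` and `h` are hypothetical and chosen only for this example.

```python
import numpy as np

def parzen_hypercube(x, samples, h):
    """Parzen window estimate at x with a hypercube window of edge h.

    samples: array of shape (n, d), one training pattern per row (assumed layout).
    """
    samples = np.asarray(samples, dtype=float)
    n, d = samples.shape
    # Scaled difference vectors (x - x_i) / h, one row per training pattern.
    u = (np.asarray(x, dtype=float) - samples) / h
    # Counting function: 1 if every component lies within the unit hypercube.
    inside = np.all(np.abs(u) <= 0.5, axis=1)
    k = np.count_nonzero(inside)      # number of patterns inside the window
    volume = h ** d                   # hypercube volume
    return k / (n * volume)

# Toy data, made up for illustration only.
samples = np.array([[0.0, 0.0], [0.1, -0.2], [0.4, 0.3], [2.0, 2.0]])
print(parzen_hypercube([0.0, 0.0], samples, h=1.0))  # 3 of 4 points inside, V = 1    -> 0.75
print(parzen_hypercube([0.0, 0.0], samples, h=0.5))  # 2 of 4 points inside, V = 0.25 -> 2.0
```

Shrinking the edge h lowers the count k but also shrinks the volume, so the estimate can rise or fall; this is the trade-off that makes the choice of h(n) depend on the number of available training patterns.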
                              Figure 4.27 illustrates the Parzen window method applied to a one-dimensional
                            distribution  and  using  a  rectangular  window. The pdf estimates  at  the  regularly
                            spaced marks are given by the height of the solid bars, proportional to the number
                            of points (circles) that fall inside the window associated with a specific position.
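
A small worked case shows how one bar height is obtained in one dimension. The numbers are made up for illustration and are not the data of Figure 4.27. The calculation assumes a rectangular window of width h = 1 and takes the estimate to be the fraction of points inside the window divided by the window volume:

```latex
\begin{align*}
&\text{Training points: } x_i \in \{1.1,\ 1.3,\ 1.6,\ 2.9,\ 4.0\}, \quad n = 5, \quad h = 1, \quad V = h^1 = 1 \\
&x = 1.5:\quad |x - x_i| = 0.4,\ 0.2,\ 0.1,\ 1.4,\ 2.5 \;\Rightarrow\; k = 3, \quad \hat{p}(1.5) = \frac{3}{5 \cdot 1} = 0.6 \\
&x = 3.0:\quad |x - x_i| = 1.9,\ 1.7,\ 1.4,\ 0.1,\ 1.0 \;\Rightarrow\; k = 1, \quad \hat{p}(3.0) = \frac{1}{5 \cdot 1} = 0.2 \\
&x = 5.5:\quad \text{no point within } h/2 = 0.5 \;\Rightarrow\; k = 0, \quad \hat{p}(5.5) = 0
\end{align*}
```

A point is counted only when its distance to the mark is at most h/2, so the bar at x = 1.5, where three points cluster, is three times as high as the bar at x = 3.0.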