Page 187 - Computational Statistics Handbook with MATLAB




                             Structure Removal
                             In PPEDA, we locate a projection that provides a maximum of the projection
                             index. We have no reason to assume that there is only one interesting projec-
                             tion, and there might be other views that reveal insights about our data. To
                             locate other views, Friedman [1987] devised a method called structure
                             removal. The overall procedure is to perform projection pursuit as outlined
                             above, remove the structure found at that projection, and repeat the projec-
                             tion pursuit process to find a projection that yields another maximum value
                             of the projection pursuit index. Proceeding in this manner will provide a
                             sequence of projections providing informative views of the data.
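The overall loop described above can be sketched as follows, in plain Python rather than the book's MATLAB. This is only an outline under stated assumptions: `find_projection` and `remove_structure` are hypothetical callables standing in for the projection pursuit search and the structure removal step, and stopping after a fixed number of views (`n_views`) is an assumption, not something the text prescribes.

```python
def pursue_with_structure_removal(data, find_projection, remove_structure, n_views):
    """Collect a sequence of projections, removing structure after each one.

    find_projection(data)   -> (projection, index_value)   [hypothetical]
    remove_structure(data, projection) -> transformed data  [hypothetical]
    """
    views = []
    current = data
    for _ in range(n_views):
        # Search for a projection that maximizes the projection index.
        projection, index_value = find_projection(current)
        views.append((projection, index_value))
        # Remove the structure found there before searching again.
        current = remove_structure(current, projection)
    return views
```

Passing the two steps in as arguments keeps the sketch independent of any particular projection index or removal transformation.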
                              Structure removal in two dimensions is an iterative process. The procedure
                             repeatedly transforms data that are projected to the current solution plane
                             (the one that maximized the projection pursuit index) to standard normal
                             until they stop becoming more normal. We can measure ‘more normal’ using
                             the projection pursuit index.
                              We start with a d × d matrix $U^*$, where the first two rows of the matrix are
                             the vectors of the projection obtained from PPEDA. The rest of the rows of $U^*$
                             have ones on the diagonal and zeros elsewhere. For example, if d = 4, then

                             \[
                             U^* =
                             \begin{bmatrix}
                             \alpha_1^* & \alpha_2^* & \alpha_3^* & \alpha_4^* \\
                             \beta_1^*  & \beta_2^*  & \beta_3^*  & \beta_4^*  \\
                             0 & 0 & 1 & 0 \\
                             0 & 0 & 0 & 1
                             \end{bmatrix}.
                             \]

                             We use the Gram-Schmidt process [Strang, 1988] to make $U^*$ orthonormal.
                             We denote the orthonormal version as U.
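As an illustration of this construction, here is a minimal sketch in plain Python (standard library only, not the book's MATLAB). The helper names `build_u_star` and `gram_schmidt` and the sample vectors `alpha` and `beta` are made up for this example; the sample vectors are chosen to be orthonormal already, so orthonormalization leaves the first two rows unchanged.

```python
import math

def build_u_star(alpha, beta):
    """Stack alpha and beta on top of the last d-2 rows of the d x d identity."""
    d = len(alpha)
    rows = [list(alpha), list(beta)]
    for i in range(2, d):
        rows.append([1.0 if j == i else 0.0 for j in range(d)])
    return rows

def gram_schmidt(rows):
    """Orthonormalize the rows in order (modified Gram-Schmidt)."""
    ortho = []
    for v in rows:
        w = list(v)
        for q in ortho:
            # Subtract the component of w along each earlier row.
            coeff = sum(wi * qi for wi, qi in zip(w, q))
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm < 1e-12:
            raise ValueError("rows are linearly dependent")
        ortho.append([wi / norm for wi in w])
    return ortho

# Illustrative projection vectors for d = 4 (assumed values, not from the text).
alpha = [0.5, 0.5, 0.5, 0.5]
beta = [0.5, -0.5, 0.5, -0.5]
U = gram_schmidt(build_u_star(alpha, beta))
```

Because the rows are processed in order, the first two rows of `U` still span the projection plane; only the identity rows are adjusted to be orthogonal to it.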
                              The next step in the structure removal process is to transform the Z matrix
                             using the following

                             \[
                             T = U Z^T. \tag{5.17}
                             \]
                             In Equation 5.17, T is d × n, so each column of the matrix corresponds to a d-
                             dimensional observation. With this transformation, the first two dimensions
                             (the first two rows of T) of every transformed observation are the projection
                             onto the plane given by $(\alpha^*, \beta^*)$.
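A small numerical sketch of Equation 5.17, in plain Python rather than MATLAB, may help make the shapes concrete. It assumes Z stores one observation per row (n × d), which is what the stated d × n size of T implies. The matrix `U`, the data in `Z`, and the helpers `matmul` and `transpose` are all made up for illustration, with d = 3 and n = 4.

```python
import math

def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

s = 1.0 / math.sqrt(2.0)
# Orthonormal U: rows 1 and 2 play the role of the projection vectors.
U = [[s,   s,   0.0],
     [s,  -s,   0.0],
     [0.0, 0.0, 1.0]]
# Four illustrative 3-dimensional observations, one per row.
Z = [[ 1.0,  2.0,  3.0],
     [ 0.0, -1.0,  4.0],
     [ 2.0,  2.0, -1.0],
     [-3.0,  1.0,  0.5]]

T = matmul(U, transpose(Z))   # d x n: one column per observation
```

Here `T[0][k]` and `T[1][k]` are the coordinates of observation `k` in the projection plane, and the third row of `T` carries the remaining dimension.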
                              We now remove the structure that is represented by the first two dimen-
                             sions. We let Θ be a transformation that maps the first two rows of T to
                             a standard normal distribution and leaves the remaining rows unchanged.
                             This is where we actually remove the structure, making the data normal in
                             that projection (the first two rows). Letting $T_1$ and $T_2$ represent the
                             first two rows of T, we define the transformation as follows





                            © 2002 by Chapman & Hall/CRC