Page 186 - Computational Statistics Handbook with MATLAB

Chapter 5: Exploratory Data Analysis


                             steps with no improvement in the value of the projection pursuit index.
                             When the neighborhood becomes sufficiently small, the optimization
                             process is terminated.
                              A summary of the steps for the exploratory projection pursuit algorithm is
                             given here. Details on how to implement these steps are provided in
                             Example 5.27 and in Appendix C. The complete search for the best plane
                             involves repeating steps 2 through 9 of the procedure m times, using m
                             random starting planes. Keep in mind that the best plane $(\alpha^*, \beta^*)$
                             is the plane where the projected data exhibit the greatest departure from
                             normality.

                             PROCEDURE - PROJECTION PURSUIT EXPLORATORY DATA ANALYSIS


                                1. Sphere the data using the following transformation

                                   $$Z_i = \boldsymbol{\Lambda}^{-1/2} Q^T (X_i - \hat{\boldsymbol{\mu}}), \qquad i = 1, \ldots, n,$$

                                   where the columns of Q are the eigenvectors obtained from
                                   $\hat{\boldsymbol{\Sigma}}$, $\boldsymbol{\Lambda}$ is a diagonal matrix of corresponding
                                   eigenvalues, and $X_i$ is the i-th observation.
                                2. Generate a random starting plane, $(\alpha_0, \beta_0)$. This is the
                                   current best plane, $(\alpha^*, \beta^*)$.
                                3. Evaluate the projection index $PI_{\chi^2}(\alpha_0, \beta_0)$ for the
                                   starting plane.
                                4. Generate two candidate planes $(a_1, b_1)$ and $(a_2, b_2)$ according to
                                   Equation 5.16.
                                5. Evaluate the value of the projection index for these planes,
                                   $PI_{\chi^2}(a_1, b_1)$ and $PI_{\chi^2}(a_2, b_2)$.
                                6. If one of the candidate planes yields a higher value of the projection
                                   pursuit index, then that one becomes the current best plane
                                   $(\alpha^*, \beta^*)$.
                                7. Repeat steps 4 through 6 while there are improvements in the
                                   projection pursuit index.
                                8. If the index does not improve for half steps, where half is the
                                   specified number of steps with no improvement, then decrease the
                                   value of c by half.
                                9. Repeat steps 4 through 8 until c is some small number set by the
                                   analyst.
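
The search in steps 2 through 9 can be sketched in a few lines. What follows is a minimal Python/NumPy illustration under stated assumptions, not the implementation from Example 5.27 or Appendix C. Equation 5.16 is not reproduced in this section, so the candidate rule used here (perturb the first vector by plus or minus c times a random unit vector v, then re-orthonormalize the second vector against it) is an assumption. The function names, the argument names c_min and seed, and the index kurtosis_index are hypothetical; kurtosis_index is only a placeholder so the sketch runs and is not the chi-square index.

```python
import numpy as np

def random_plane(d, rng):
    """Step 2: two orthonormal d-vectors spanning a random plane."""
    q, _ = np.linalg.qr(rng.standard_normal((d, 2)))
    return q[:, 0], q[:, 1]

def candidate(alpha, beta, v, c):
    """Assumed stand-in for Equation 5.16: shift alpha by c*v and
    re-orthonormalize beta against the new first vector."""
    a = alpha + c * v
    a /= np.linalg.norm(a)
    b = beta - (a @ beta) * a
    b /= np.linalg.norm(b)
    return a, b

def kurtosis_index(Z, a, b):
    """Placeholder index (NOT the chi-square index): sum of the
    absolute excess kurtosis of the two projections."""
    total = 0.0
    for w in (a, b):
        p = Z @ w
        p = (p - p.mean()) / p.std()
        total += abs(np.mean(p ** 4) - 3.0)
    return total

def ppeda_search(Z, index, c=1.0, half=30, c_min=1e-3, m=4, seed=0):
    """Run steps 2-9 from m random starting planes on sphered data Z.
    Returns (best index value, alpha*, beta*)."""
    rng = np.random.default_rng(seed)
    d = Z.shape[1]
    best = (-np.inf, None, None)
    for _ in range(m):
        alpha, beta = random_plane(d, rng)        # step 2
        value = index(Z, alpha, beta)             # step 3
        step, stall = c, 0
        while step > c_min:                       # step 9
            v = rng.standard_normal(d)
            v /= np.linalg.norm(v)
            # step 4: two candidates built from the current best plane
            cands = [candidate(alpha, beta, v, s * step) for s in (1.0, -1.0)]
            vals = [index(Z, a, b) for a, b in cands]   # step 5
            k = int(np.argmax(vals))
            if vals[k] > value:                   # step 6
                (alpha, beta), value = cands[k], vals[k]
                stall = 0
            else:
                stall += 1
            if stall >= half:                     # step 8: halve c
                step /= 2.0
                stall = 0
        if value > best[0]:
            best = (value, alpha, beta)
    return best

# Usage on synthetic data (standard normal draws stand in for sphered data).
rng = np.random.default_rng(1)
Z_demo = rng.standard_normal((200, 4))
value, alpha, beta = ppeda_search(Z_demo, kurtosis_index, half=5, c_min=0.05, m=2)
```

Passing the index as a function keeps the loop independent of the particular projection index, so the chi-square index can be substituted without changing the search itself.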
                              Note that in PPEDA we are working with sphered or standardized versions
                             of the original data. Some researchers in this area [Huber, 1985] discuss the
                             benefits and the disadvantages of this approach.
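
To make the sphering transformation of step 1 concrete, here is a minimal Python/NumPy sketch. It assumes that the estimated mean is the sample mean, that the estimated covariance is the sample covariance with an n - 1 denominator, and that this covariance has full rank. The function name sphere is hypothetical.

```python
import numpy as np

def sphere(X):
    """Sphere the rows of X: Z_i = Lambda^{-1/2} Q^T (X_i - mu_hat).

    Assumes the sample covariance of X has full rank, so that every
    eigenvalue is strictly positive.
    """
    mu_hat = X.mean(axis=0)                   # sample mean (assumed estimator)
    sigma_hat = np.cov(X, rowvar=False)       # sample covariance, n - 1 denominator
    eigvals, Q = np.linalg.eigh(sigma_hat)    # columns of Q are eigenvectors
    # Row-wise form of the transformation: (X - mu_hat) Q Lambda^{-1/2}
    return (X - mu_hat) @ Q / np.sqrt(eigvals)

# Usage: correlated synthetic data becomes uncorrelated with unit variance.
rng = np.random.default_rng(0)
A = np.array([[2.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.5, 0.3, 0.7]])
X_demo = rng.standard_normal((500, 3)) @ A.T + np.array([1.0, -2.0, 3.0])
Z_sphered = sphere(X_demo)
```

After this transformation, the sample covariance of the sphered data is the identity matrix up to rounding error, and its sample mean is zero.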

                            © 2002 by Chapman & Hall/CRC