Page 183 - Computational Statistics Handbook with MATLAB



                             Posse [1995a, 1995b] uses a random search to locate the global optimum of the
                             projection index and combines it with the structure removal of Friedman
                             [1987] to get a sequence of interesting 2-D projections. Each projection found
                             shows a structure that is less important (in terms of the projection index) than
                             the previous one. Before we describe this method for PPEDA, we give a summary
                             of the notation that we use in projection pursuit exploratory data analysis.


                             NOTATION - PROJECTION PURSUIT EXPLORATORY DATA ANALYSIS
                                X is an $n \times d$ matrix, where each row ($X_i$) corresponds to a d-dimensional
                                   observation and n is the sample size.
                                Z is the sphered version of X.
                                $\hat{\boldsymbol{\mu}}$ is the $1 \times d$ sample mean:

                                   $$\hat{\boldsymbol{\mu}} = \sum X_i / n . \tag{5.10}$$
                                $\hat{\boldsymbol{\Sigma}}$ is the sample covariance matrix:

                                   $$\hat{\boldsymbol{\Sigma}}_{ij} = \frac{1}{n - 1} \sum (X_i - \hat{\boldsymbol{\mu}})(X_j - \hat{\boldsymbol{\mu}})^T . \tag{5.11}$$
                                $\alpha, \beta$ are orthonormal ($\alpha^T \alpha = 1 = \beta^T \beta$ and $\alpha^T \beta = 0$) d-dimensional
                                   vectors that span the projection plane.
                                $P(\alpha, \beta)$ is the projection plane spanned by $\alpha$ and $\beta$.
                                $z_i^{\alpha}, z_i^{\beta}$ are the sphered observations projected onto the vectors $\alpha$ and
                                   $\beta$:

                                   $$z_i^{\alpha} = z_i^T \alpha , \qquad z_i^{\beta} = z_i^T \beta . \tag{5.12}$$
                                $(\alpha^*, \beta^*)$ denotes the plane where the index is maximum.
                                $PI_{\chi^2}(\alpha, \beta)$ denotes the chi-square projection index evaluated using
                                   the data projected onto the plane spanned by $\alpha$ and $\beta$.
                                $\phi_2$ is the standard bivariate normal density.
                                $c_k$ is the probability evaluated over the k-th region using the standard
                                   bivariate normal,

                                   $$c_k = \iint_{B_k} \phi_2 \, dz_1 \, dz_2 . \tag{5.13}$$
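
The quantities in the notation list above can be sketched in a few lines of code. What follows is a minimal illustration in plain Python (standard library only), not the handbook's own MATLAB implementation. The function names `sample_mean`, `sample_cov`, `project`, `std_normal_cdf`, and `box_probability` are hypothetical. Two assumptions are made: the covariance uses the usual sample form with the 1/(n - 1) factor, and the region B_k in (5.13) is taken to be an axis-aligned rectangle, which this page does not specify.

```python
import math


def sample_mean(X):
    """Sample mean of the rows of X, as in (5.10): sum of X_i divided by n."""
    n, d = len(X), len(X[0])
    return [sum(row[j] for row in X) / n for j in range(d)]


def sample_cov(X):
    """d x d sample covariance with the 1/(n - 1) factor of (5.11).

    ASSUMPTION: the usual form, i.e. the sum over observations of the
    outer products of the mean-centered rows.
    """
    n, d = len(X), len(X[0])
    mu = sample_mean(X)
    S = [[0.0] * d for _ in range(d)]
    for row in X:
        dev = [row[j] - mu[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                S[a][b] += dev[a] * dev[b]
    return [[S[a][b] / (n - 1) for b in range(d)] for a in range(d)]


def project(Z, alpha, beta):
    """Project each row z_i of Z onto alpha and beta, as in (5.12).

    Z is assumed to be already sphered; alpha and beta are assumed orthonormal.
    """
    z_alpha = [sum(z * a for z, a in zip(row, alpha)) for row in Z]
    z_beta = [sum(z * b for z, b in zip(row, beta)) for row in Z]
    return z_alpha, z_beta


def std_normal_cdf(x):
    """Standard univariate normal CDF written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def box_probability(lo1, hi1, lo2, hi2):
    """Probability of a rectangle under the standard bivariate normal.

    ASSUMPTION: B_k = [lo1, hi1] x [lo2, hi2]. The two coordinates of a
    standard bivariate normal are independent, so the double integral in
    (5.13) factors into a product of two univariate probabilities.
    """
    p1 = std_normal_cdf(hi1) - std_normal_cdf(lo1)
    p2 = std_normal_cdf(hi2) - std_normal_cdf(lo2)
    return p1 * p2
```

For instance, `project(Z, [1.0, 0.0], [0.0, 1.0])` simply returns the first and second coordinates of each row of a two-column `Z`, and `box_probability(0.0, math.inf, 0.0, math.inf)` gives the probability of the positive quadrant. For regions that are not rectangles, the product form no longer applies and the integral would have to be evaluated differently.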

                            © 2002 by Chapman & Hall/CRC