Page 189 - Computational Statistics Handbook with MATLAB
Once the structure is removed using this process, we must transform the
data back using
$$Z' = U^{T}\,\Theta(U Z^{T}). \qquad (5.21)$$
In other words, we transform back using the transpose of the orthonormal
matrix U. From matrix theory [Strang, 1988], we see that all directions
orthogonal to the structure (i.e., all rows of T other than the first two)
have not been changed, whereas the structure has been Gaussianized and then
transformed back.
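A minimal MATLAB sketch of this property follows. It is an illustration only, not the book's code: the names a, b, U, Z, T, Tg and Zp are hypothetical, the data are random numbers stored one observation per row (an assumption about the layout), and a simple shrink of the first two rows stands in for the Gaussianizing map Θ.

```matlab
% Sketch: transforming back with U' leaves orthogonal directions untouched.
% All variable names are illustrative and not taken from the book's code.
n = 50; d = 4;
Z = randn(n, d);                 % stand-in for sphered data, one row per point

% Two orthonormal directions standing in for alpha* and beta*.
[Q, R] = qr(randn(d, 2), 0);     % economy QR gives orthonormal columns
a = Q(:, 1)';
b = Q(:, 2)';

% Complete {a, b} to an orthonormal basis of R^d. null() returns an
% orthonormal basis for the directions orthogonal to both a and b.
U = [a; b; null([a; b])'];

T = U * Z';                      % d-by-n; rows 1 and 2 span the structure plane

% Stand-in for Theta (assumption): change rows 1 and 2 only.
Tg = T;
Tg(1:2, :) = 0.5 * T(1:2, :);

Zp = (U' * Tg)';                 % transform back, stored again as n-by-d

% Coordinates along rows 3..d of U are the same before and after.
err_orth = max(max(abs(U(3:end, :) * Zp' - T(3:end, :))));
% U is orthonormal, so U*U' should be the identity matrix.
err_U = max(max(abs(U * U' - eye(d))));
```

Both err_orth and err_U should be at the level of floating-point round-off, which is the numerical counterpart of the statement that only the first two rows of T are altered.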
PROCEDURE - STRUCTURE REMOVAL
1. Create the orthonormal matrix U, where the first two rows of U
contain the vectors α* and β*.
2. Transform the data Z using Equation 5.17 to get T.
3. Using only the first two rows of T, rotate the observations using
Equation 5.19.
4. Normalize each rotated point according to Equation 5.20.
5. For angles of rotation γ = 0, π/4, π/8, 3π/8, repeat steps 3
through 4.
6. Evaluate the projection index using $z_j^{1(t+1)}$ and $z_j^{2(t+1)}$, after
going through an entire cycle of rotation (Equation 5.19) and
normalization (Equation 5.20).
7. Repeat steps 3 through 6 until the projection pursuit index stops
changing.
8. Transform the data back using Equation 5.21.
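Steps 3 through 7 can be sketched in MATLAB as follows. This is a hedged illustration, not the functions from Appendix C. Equations 5.19 and 5.20 are not reproduced on this page, so the sketch assumes a plane rotation by γ followed by a rank-based normal-score transform (the form used in Friedman's structure removal). The variables t1, t2, gam and phinv are hypothetical, the two rows are synthetic, and a fixed number of cycles replaces the check on the projection index in step 7.

```matlab
% Sketch of the rotate-and-normalize cycle on the first two rows of T.
% Assumed forms for the rotation and normalization; names are illustrative.
n = 200;
t1 = [randn(1, n/2) - 2, randn(1, n/2) + 2];  % bimodal row: clustered structure
t2 = randn(1, n);                             % second row of the plane
gam = [0, pi/4, pi/8, 3*pi/8];                % rotation angles from step 5

% Standard normal quantile function written with erfinv (base MATLAB).
phinv = @(p) sqrt(2) * erfinv(2*p - 1);

for cycle = 1:5          % fixed count (assumption); the text stops on the index
    for g = gam
        % Rotate the two coordinates through the angle g.
        r1 = t1*cos(g) + t2*sin(g);
        r2 = t2*cos(g) - t1*sin(g);

        % Normalize: replace each value by the normal score of its rank.
        [srt, idx] = sort(r1);
        rk = zeros(1, n);
        rk(idx) = 1:n;
        t1 = phinv((rk - 0.5) / n);

        [srt, idx] = sort(r2);
        rk = zeros(1, n);
        rk(idx) = 1:n;
        t2 = phinv((rk - 0.5) / n);
    end
end
```

After each normalization the marginal distribution of a row is, by construction, a set of standard normal scores. Cycling through several angles is what pushes the joint distribution in the plane toward a Gaussian. In a full implementation the outer loop would evaluate the projection pursuit index after each cycle and stop once it no longer changes.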
Example 5.27
We use a synthetic data set to illustrate the MATLAB functions used for
PPEDA. The source code for the functions used in this example is given in
Appendix C. These data contain two structures, both of which are clusters. So
we will search for two planes that maximize the projection pursuit index.
First we load the data set that is contained in the file called ppdata. This
loads a matrix X containing 400 six-dimensional observations. We also set up
the constants we need for the algorithm.
% First load up a synthetic data set.
% This has structure in two planes - clusters.
% Note that the data is in ppdata.mat
load ppdata
© 2002 by Chapman & Hall/CRC