Page 187 - Computational Statistics Handbook with MATLAB


Structure Removal
In PPEDA, we locate a projection that provides a maximum of the projection
index. We have no reason to assume that there is only one interesting projec-
tion, and there might be other views that reveal insights about our data. To
locate other views, Friedman [1987] devised a method called structure
removal. The overall procedure is to perform projection pursuit as outlined
above, remove the structure found at that projection, and repeat the projec-
tion pursuit process to find a projection that yields another maximum value
of the projection pursuit index. Proceeding in this manner yields a
sequence of projections, each providing an informative view of the data.
Structure removal in two dimensions is an iterative process. The procedure
repeatedly transforms data that are projected to the current solution plane
(the one that maximized the projection pursuit index) to standard normal
until they stop becoming more normal. We can measure ‘more normal’ using
the projection pursuit index.
We start with a d × d matrix U*, where the first two rows of the matrix are
the vectors of the projection obtained from PPEDA. The rest of the rows of U*
have ones on the diagonal and zeros elsewhere. For example, if d = 4, then

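\[
U^* =
\begin{bmatrix}
\alpha_1^* & \alpha_2^* & \alpha_3^* & \alpha_4^* \\
\beta_1^*  & \beta_2^*  & \beta_3^*  & \beta_4^*  \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix} .
\]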
We use the Gram-Schmidt process [Strang, 1988] to make U* orthonormal.
We denote the orthonormal version as U.
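
The construction of U* and its orthonormalization can be sketched in MATLAB as follows. This is a minimal illustration, not the handbook's own code: the vectors alpha and beta are made-up stand-ins for the PPEDA solution (deliberately not orthogonal, so that the Gram-Schmidt step visibly changes them), and the variable names Ustar and U are assumptions chosen to mirror the notation in the text. The loop applies a modified Gram-Schmidt pass to the rows, from top to bottom.

```matlab
% Hypothetical example values; in practice alpha and beta come from PPEDA.
d = 4;
alpha = [1; 1; 0; 0];   % stand-in for alpha*
beta  = [1; 0; 1; 0];   % stand-in for beta*

% Build U*: the first two rows are the projection vectors; the remaining
% rows have ones on the diagonal and zeros elsewhere.
Ustar = eye(d);
Ustar(1,:) = alpha';
Ustar(2,:) = beta';

% Modified Gram-Schmidt on the rows of U*.
U = zeros(d);
for i = 1:d
    v = Ustar(i,:);
    for j = 1:i-1
        % Remove the component of v along the j-th orthonormal row.
        v = v - (v*U(j,:)')*U(j,:);
    end
    U(i,:) = v/norm(v);   % normalize to unit length
end

% Departure from orthonormality; should be near machine precision.
orthoError = norm(U*U' - eye(d));
```

Because the rows are processed in order, the first row of U is alpha scaled to unit length, and the first two rows of U span the same plane as alpha and beta.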
The next step in the structure removal process is to transform the Z matrix
using the following

\[
T = U Z^T . \tag{5.17}
\]
In Equation 5.17, T is d × n , so each column of the matrix corresponds to a d-
dimensional observation. With this transformation, the first two dimensions
(the first two rows of T) of every transformed observation are the projection
onto the plane given by (α*, β*).
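
As a quick check of Equation 5.17, the sketch below applies the transformation to a small random matrix and confirms that the first two rows of T are the projections onto the two direction vectors. The data Z here are random numbers standing in for the sphered data, and alpha and beta are hypothetical orthonormal vectors rather than an actual PPEDA result; all variable names are assumptions for illustration.

```matlab
% Stand-in data: n observations in d dimensions (rows are observations).
rng('default');
n = 5;
d = 4;
Z = randn(n, d);

% Hypothetical orthonormal projection vectors (not a real PPEDA solution).
alpha = [1;  1; 1;  1]/2;
beta  = [1; -1; 1; -1]/2;

% Build U* and orthonormalize its rows with modified Gram-Schmidt.
Ustar = eye(d);
Ustar(1,:) = alpha';
Ustar(2,:) = beta';
U = zeros(d);
for i = 1:d
    v = Ustar(i,:);
    for j = 1:i-1
        v = v - (v*U(j,:)')*U(j,:);
    end
    U(i,:) = v/norm(v);
end

% Equation 5.17: T is d x n, one column per observation.
T = U*Z';

% The first two rows of T should match the projections of the
% observations onto alpha and beta.
err1 = norm(T(1,:) - (Z*alpha)');
err2 = norm(T(2,:) - (Z*beta)');
```

Since alpha and beta are already orthonormal in this example, Gram-Schmidt leaves the first two rows of U equal to alpha and beta, so err1 and err2 should both be at the level of rounding error.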
We now remove the structure that is represented by the first two dimensions.
We let Θ be a transformation that maps the first two rows of T to a
standard normal and leaves the remaining rows unchanged. This is where we
actually remove the structure, making the data normal in that projection (the
first two rows). Letting T_1 and T_2 represent the first two rows of T, we
define the transformation as follows
