Page 193 - Computational Statistics Handbook with MATLAB
The fact that the pseudo grand tour is easily reversible enables the analyst to
recover the projection for further analysis. Two versions of the pseudo grand
tour are available: one that projects onto a line and one that projects onto a
plane.
As with projection pursuit, we need unit vectors that comprise the desired
projection. In the 1-D case, we require a unit vector $\boldsymbol{\alpha}(t)$ such that

$$\|\boldsymbol{\alpha}(t)\|^2 = \sum_{i=1}^{d} \alpha_i^2(t) = 1$$

for every t, where t represents a point in the sequence of projections. For the
pseudo grand tour, $\boldsymbol{\alpha}(t)$ must be a continuous function of t and should
produce all possible orientations of a unit vector.
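
To make the unit-norm requirement concrete, the following minimal sketch builds one candidate vector and checks its squared norm at many values of t. The paired sine/cosine form, the function names `alpha` and `squared_norm`, and the frequency values are assumptions chosen for illustration; they are not taken from the text above. The sketch only demonstrates continuity in t and unit length, and makes no claim about reaching every orientation.

```python
import math

def alpha(t, freqs):
    """Unit vector alpha(t) in d = 2 * len(freqs) dimensions (assumed form).

    Each frequency contributes a (sin, cos) pair. The common factor
    sqrt(2 / d) makes the squared entries sum to one for every t.
    """
    d = 2 * len(freqs)
    scale = math.sqrt(2.0 / d)
    vec = []
    for w in freqs:
        vec.append(scale * math.sin(w * t))
        vec.append(scale * math.cos(w * t))
    return vec

def squared_norm(v):
    """Sum of squared components of v."""
    return sum(x * x for x in v)

# Hypothetical frequencies for d = 4 (illustrative values only)
freqs = [math.sqrt(2.0), math.sqrt(3.0)]

# Squared norm of alpha(t) at t = 0, 0.01, 0.02, ...
norms = [squared_norm(alpha(0.01 * k, freqs)) for k in range(1000)]
```

Every entry of `norms` should equal one up to floating-point rounding, because each (sin, cos) pair contributes exactly 2/d to the sum and there are d/2 pairs.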
We obtain the projection of the data using

$$z_i^{\boldsymbol{\alpha}(t)} = \boldsymbol{\alpha}^T(t)\,\mathbf{x}_i, \tag{5.22}$$

where $\mathbf{x}_i$ is the i-th d-dimensional data point. To get the movie view of the
pseudo grand tour, we plot $z_i^{\boldsymbol{\alpha}(t)}$ on a fixed 1-D coordinate system, re-displaying
the projected points as t increases.
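
The 1-D projection is an inner product of each data point with the current unit vector. The sketch below computes it for a short sequence of t values. The data values, the helper names `project` and `direction_at`, and the simple rotation used as the path are all hypothetical and serve only to illustrate Equation 5.22; a display routine would replot each list in `frames` on the same axis.

```python
import math

def project(data, direction):
    """Return z_i = direction^T x_i for every row x_i of data."""
    return [sum(a * x for a, x in zip(direction, row)) for row in data]

# Made-up 3-D data points, one per row
data = [
    [1.0, 0.0, 2.0],
    [0.0, 3.0, -1.0],
    [2.0, 2.0, 2.0],
]

def direction_at(t):
    """Hypothetical path: a unit vector rotating in the first two coordinates."""
    return [math.cos(t), math.sin(t), 0.0]

# One list of projected values per time step t = 0, 0.1, 0.2, ...
frames = [project(data, direction_at(0.1 * k)) for k in range(5)]
```

At t = 0 the direction is the first coordinate axis, so the first frame simply returns the first coordinate of each point.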
The grand tour in two dimensions is similar. We need a second unit vector
$\boldsymbol{\beta}(t)$ that is orthonormal to $\boldsymbol{\alpha}(t)$,

$$\|\boldsymbol{\beta}(t)\|^2 = \sum_{i=1}^{d} \beta_i^2(t) = 1, \qquad \boldsymbol{\alpha}^T(t)\,\boldsymbol{\beta}(t) = 0.$$

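
The two conditions on the second vector can be checked numerically. The sketch below assumes a paired sine/cosine construction for both vectors, in which the second vector swaps each (sin, cos) pair to (cos, -sin); this form, the function names, and the frequency values are illustrative assumptions and are not given in the text above.

```python
import math

def alpha(t, freqs):
    """Assumed first unit vector: scaled (sin, cos) pairs."""
    scale = math.sqrt(2.0 / (2 * len(freqs)))
    vec = []
    for w in freqs:
        vec += [scale * math.sin(w * t), scale * math.cos(w * t)]
    return vec

def beta(t, freqs):
    """Assumed second unit vector: scaled (cos, -sin) pairs."""
    scale = math.sqrt(2.0 / (2 * len(freqs)))
    vec = []
    for w in freqs:
        vec += [scale * math.cos(w * t), -scale * math.sin(w * t)]
    return vec

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Hypothetical frequencies for d = 6 (illustrative values only)
freqs = [math.sqrt(2.0), math.sqrt(3.0), math.sqrt(5.0)]

# For each t, record (squared norm of beta, inner product of alpha and beta)
checks = [
    (dot(beta(t, freqs), beta(t, freqs)), dot(alpha(t, freqs), beta(t, freqs)))
    for t in [0.05 * k for k in range(200)]
]
```

Within each pair the products sin·cos and cos·(-sin) cancel, so the inner product is zero for every t, and the squared norm is one for the same reason as for the first vector.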
We project the data onto the second vector using

$$z_i^{\boldsymbol{\beta}(t)} = \boldsymbol{\beta}^T(t)\,\mathbf{x}_i. \tag{5.23}$$

To obtain the movie view of the 2-D pseudo grand tour, we display $z_i^{\boldsymbol{\alpha}(t)}$ and
$z_i^{\boldsymbol{\beta}(t)}$ in a 2-D scatterplot, replotting the points as t increases.
The basic idea of the grand tour is to project the data onto a 1-D or 2-D
space and plot the projected data, repeating this process many times to provide
many views of the data. It is important for viewing purposes to make
the time steps small to provide a nearly continuous path and to provide
smooth motion of the points. The reader should note that the grand tour is an
interactive approach to EDA. The analyst must stop the tour when an interesting
projection is found.
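
The loop described above can be sketched as a generator that steps t in small increments and yields the 2-D coordinates for each frame. Because each frame is determined by t alone, recording t at the moment the analyst stops is enough to rebuild the projection vectors later. The vector construction, the function names, the step size, the data, and the stopping frame are all assumptions made for illustration; plotting is left out so the sketch stays self-contained.

```python
import math

def tour_vectors(t, freqs):
    """Return (alpha(t), beta(t)) from an assumed paired sine/cosine form."""
    scale = math.sqrt(2.0 / (2 * len(freqs)))
    a, b = [], []
    for w in freqs:
        s, c = math.sin(w * t), math.cos(w * t)
        a += [scale * s, scale * c]
        b += [scale * c, -scale * s]
    return a, b

def project_2d(data, t, freqs):
    """Project every row of data onto alpha(t) and beta(t)."""
    a, b = tour_vectors(t, freqs)
    return [
        (sum(ai * x for ai, x in zip(a, row)),
         sum(bi * x for bi, x in zip(b, row)))
        for row in data
    ]

def tour_frames(data, freqs, step=0.01, n_steps=200):
    """Yield (t, projected points) for t = 0, step, 2 * step, ..."""
    for k in range(n_steps):
        t = k * step
        yield t, project_2d(data, t, freqs)

# Made-up 4-D data and hypothetical frequencies
data = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 0.0, 2.0]]
freqs = [math.sqrt(2.0), math.sqrt(3.0)]

frames = list(tour_frames(data, freqs))

# Suppose the analyst stops at frame 120: t alone recovers the projection
t_stop, points_stop = frames[120]
a_stop, b_stop = tour_vectors(t_stop, freqs)
```

A smaller `step` moves each point a shorter distance between frames, which is what gives the smooth motion; the cost is that more frames are needed to cover the same range of t.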
Asimov [1985] contends that we are viewing more than one or two dimensions
because the speed vectors provide further information. For example,
the further away a point is from the computer screen, the faster the point
© 2002 by Chapman & Hall/CRC