Page 80 - Statistics and Data Analysis in Geology

Statistics and Data Analysis in Geology - Chapter 3

Each eigenvector can be regarded as a set of coordinates in five-dimensional
space that defines the “direction” of a semiaxis of a hyperellipsoid. The length of
each semiaxis is given by the corresponding eigenvalue. The first semiaxis is twice
as long as the second, which is almost twice the length of the third. The fourth
axis is very short, and the fifth axis is almost nonexistent; the hyperellipsoid defined
by the correlation matrix, R, is really only a three-dimensional disk embedded in a
space of five dimensions.
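
The idea that near-zero eigenvalues mark "missing" dimensions can be sketched numerically. The example below is only an illustration: it uses a hypothetical 3 x 3 correlation matrix (not the five-variable R discussed here), relies on NumPy's `numpy.linalg.eigh`, and applies an arbitrary 5% cutoff to decide which semiaxes are too short to matter.

```python
import numpy as np

# Hypothetical 3 x 3 correlation matrix (NOT the book's R): variables 1 and 2
# are almost perfectly correlated, and variable 3 is unrelated to both.
R = np.array([[1.00, 0.99, 0.00],
              [0.99, 1.00, 0.00],
              [0.00, 0.00, 1.00]])

# eigh is meant for symmetric matrices; it returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(R)

# Reorder so the longest semiaxis comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Share of the total (the trace of R) associated with each semiaxis.
share = eigenvalues / eigenvalues.sum()

# Count the axes that are not negligibly short (the 5% cutoff is an assumption).
effective_dims = int(np.sum(share > 0.05))

print("eigenvalues:", np.round(eigenvalues, 4))
print("share of total:", np.round(share, 4))
print("effective dimensions:", effective_dims)
```

Here the third eigenvalue is close to zero, so the ellipsoid in three dimensions is effectively a flat, two-dimensional disk, in the same sense that the five-dimensional hyperellipsoid above is effectively three-dimensional.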

The slope of a line drawn from the origin of a graph through a point is defined
by the ratio between the two coordinates of the point, and not by the actual mag-
nitudes of the coordinates. Similarly, the absolute magnitudes of the elements in
eigenvectors are not significant, only the ratios between the elements. An eigen-
vector can be scaled by multiplying by any arbitrary constant, and it will still define
the same direction in multidimensional space. Different computer programs may
return different eigenvectors for the same matrix; the eigenvectors simply have
been scaled in different ways. Most programs normalize each eigenvector, that is,
scale it so the sum of the squares of its elements will be equal to 1.0. Others
scale each eigenvector so the sum of its elements will be equal to its eigenvalue.
Although such results appear to be different, the ratios between pairs of elements
in the eigenvectors remain the same, and the vectors they define point in the same
“direction.” Also, you may note that the pattern of signs on the elements of the
eigenvectors seems to be different for two otherwise identical sets of eigenvectors.
This merely means that one set of vectors has been multiplied by (-1), reversing
its “direction” but not changing its orientation in multivariate space.
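
A short numerical check may make the point about scaling and sign concrete. The sketch below assumes NumPy and a small hypothetical symmetric matrix `A` (chosen only for illustration); the helper `is_eigenvector` is likewise a made-up name. It rescales one eigenvector in the two ways described above, flips its sign, and confirms that every version still satisfies the eigenvector equation and keeps the same ratio between its elements.

```python
import numpy as np

# Small symmetric matrix chosen for illustration (hypothetical, not from the text).
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eigh(A)  # eigenvalues in ascending order
lam = eigenvalues[-1]          # largest eigenvalue
v_unit = eigenvectors[:, -1]   # returned with sum of squares equal to 1

# The same eigenvector under other conventions.
v_sum = v_unit * (lam / v_unit.sum())  # sum of elements equals the eigenvalue
v_flipped = -v_unit                    # multiplied by -1


def is_eigenvector(matrix, vector, eigenvalue):
    """Return True if matrix @ vector equals eigenvalue * vector within rounding."""
    return bool(np.allclose(matrix @ vector, eigenvalue * vector))


candidates = {"unit": v_unit, "sum": v_sum, "flipped": v_flipped}
checks = {name: is_eigenvector(A, v, lam) for name, v in candidates.items()}
ratios = {name: v[0] / v[1] for name, v in candidates.items()}

for name in candidates:
    print(name, candidates[name], "eigenvector:", checks[name],
          "ratio:", round(ratios[name], 6))
```

All three vectors pass the check and share the same element ratio, which is why output from two programs can look different and still describe the same axis.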
Increasingly, computer programs for multivariate analysis employ alternative
techniques for obtaining eigenvalues and eigenvectors. Rather than reducing a rect-
angular data matrix to a symmetrical, square correlation or covariance matrix and
then extracting the desired eigenvalues and eigenvectors as we have done, these
programs obtain results directly from the data matrix by singular value decom-
position (SVD). An excellent description of SVD is given by Jackson (1991); Press
and others (1992) provide a more compact presentation, as well as computer pro-
gram listings. We will delay a discussion of this procedure until Chapter 6, where
we can provide a motivation for our interest. Now, we merely note that an $n \times m$
rectangular matrix, $X$, can be decomposed into three other matrices:

$$X = W \Lambda V^{T}$$

where $W$ contains eigenvectors of the major product matrix, $XX^{T}$; $V$ contains
the eigenvectors of the minor product matrix, $X^{T}X$; and $\Lambda$ is an $m \times m$ diagonal
matrix whose diagonal elements are the singular values of $X$, which are the square
roots of the eigenvalues of $X^{T}X$ (the nonzero eigenvalues of $XX^{T}$ and $X^{T}X$ are
identical, but the $n \times n$ matrix $XX^{T}$ will have $n - m$ extra eigenvalues, all equal to zero).
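
The relationship between the decomposition and the two product matrices can be checked numerically. The following is a minimal sketch, assuming NumPy and a small random data matrix generated only for illustration (it is not a data set from this book). Note that `numpy.linalg.svd` returns singular values, whose squares are the eigenvalues of the product matrices, and returns the third factor already transposed.

```python
import numpy as np

# Hypothetical data: n = 6 observations on m = 3 variables.
rng = np.random.default_rng(0)
n, m = 6, 3
X = rng.normal(size=(n, m))

# Thin SVD: W is n x m, s holds the m singular values, Vt is the transpose of V.
W, s, Vt = np.linalg.svd(X, full_matrices=False)

# The three factors multiply back to the original matrix.
X_rebuilt = W @ np.diag(s) @ Vt

# Eigenvalues of the minor (m x m) and major (n x n) product matrices,
# sorted from largest to smallest.
minor_eigs = np.sort(np.linalg.eigvalsh(X.T @ X))[::-1]
major_eigs = np.sort(np.linalg.eigvalsh(X @ X.T))[::-1]

print("reconstruction ok:", np.allclose(X, X_rebuilt))
print("squared singular values:", np.round(s ** 2, 4))
print("eigenvalues of X'X:     ", np.round(minor_eigs, 4))
print("eigenvalues of XX':     ", np.round(major_eigs, 4))
```

The squared singular values should agree with the eigenvalues of the minor product matrix and with the first m eigenvalues of the major product matrix; the remaining n - m eigenvalues of the major product matrix are zero apart from rounding error.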
If you have worked through the small examples in this chapter, you can readily
appreciate that the computational labor involved in dealing with large matrices can
be formidable, even though the underlying, individual mathematical steps are sim-
ple. A modest data set such as 1STRIA.m will present a challenge to those who
attempt to analyze the data by hand. Fortunately, there are many powerful compu-
tational tools available at modest cost (at least for student versions), and they run
on almost any type of personal computer. A numerical computation package such
as MATLAB®, Mathcad®, or MATHEMATICA®, and even some statistical packages,
