where the following relationships are used:
$(\Phi^T)^T = \Phi$ ,   (2.85)
$\Phi^{-1} = \Phi^T$   [from (2.80)]   (2.86)
Equation (2.84) leads to the following important conclusions:
(1) The transformation of (2.83) may be broken down into $n$ separate equations $y_i = \phi_i^T X$ $(i = 1, \ldots, n)$. Since $\phi_i^T X = \|\phi_i\| \, \|X\| \cos\theta = \|X\| \cos\theta$ (the eigenvectors $\phi_i$ are normalized to unit length), where $\theta$ is the angle between the two vectors $\phi_i$ and $X$, $y_i$ is the projected value of $X$ on $\phi_i$. Thus, $Y$ represents $X$ in the new coordinate system spanned by $\phi_1, \ldots, \phi_n$, and (2.83) may be interpreted as a coordinate transformation.
(2) We can find a linear transformation to diagonalize a covariance matrix in
the new coordinate system. This means that we can obtain uncorrelated random
variables in general and independent random variables for normal distributions.
(3) The transformation matrix is the eigenvector matrix of $\Sigma_X$. Since the eigenvectors are the ones that maximize the variance of the projected variable, we are actually selecting the principal components of the distribution as the new coordinate axes. A two-dimensional example is given in Fig. 2-1.
(4) The eigenvalues are the variances of the transformed variables, the $y_i$'s (verified numerically in the sketch following this list).
(5) This transformation is called an orthonormal transformation, because
(2.80) is satisfied. In orthonormal transformations, Euclidean distances are
preserved since
$\|Y\|^2 = Y^T Y = X^T \Phi \Phi^T X = X^T X = \|X\|^2$ .   (2.87)
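Each of these conclusions is easy to check numerically. The following is a minimal sketch in Python, not from the text: it assumes a hypothetical 2x2 covariance matrix, obtains $\Phi$ and $\Lambda$ with numpy's eigh, applies the transformation $Y = \Phi^T X$ of (2.83) to simulated samples, and verifies the diagonalization of (2.84), the orthonormality of (2.86), and the distance preservation of (2.87). All variable names (Sigma_x, Phi, lam) are illustrative choices, not the book's notation.

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-D covariance matrix (symmetric, positive definite).
Sigma_x = np.array([[3.0, 1.0],
                    [1.0, 2.0]])

# Eigendecomposition: Sigma_X Phi = Phi Lambda, with orthonormal Phi.
lam, Phi = np.linalg.eigh(Sigma_x)

# (2.86): Phi^{-1} = Phi^T, i.e. Phi^T Phi = I.
assert np.allclose(Phi.T @ Phi, np.eye(2))

# Zero-mean samples with covariance Sigma_X; columns are the vectors X.
X = rng.multivariate_normal(np.zeros(2), Sigma_x, size=100_000).T

# Coordinate transformation (2.83): Y = Phi^T X, i.e. y_i = phi_i^T X.
Y = Phi.T @ X

# Conclusions (2) and (4): the sample covariance of Y is nearly diagonal,
# with the eigenvalues of Sigma_X (the variances of the y_i's) on it.
print(np.round(np.cov(Y), 2))   # approximately diag(lam)
print(np.round(lam, 2))

# Conclusion (5), eq. (2.87): Euclidean norms are preserved.
assert np.allclose(np.linalg.norm(Y, axis=0), np.linalg.norm(X, axis=0))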
Whitening Transformation
After applying the orthonormal transformation of (2.83), we can add another
transformation that will make the covariance matrix equal to I.
$Y = \Lambda^{-1/2} \Phi^T X = (\Phi \Lambda^{-1/2})^T X$ ,   (2.88)
$\Sigma_Y = \Lambda^{-1/2} \Phi^T \Sigma_X \Phi \Lambda^{-1/2} = \Lambda^{-1/2} \Lambda \Lambda^{-1/2} = I$ .   (2.89)
This transformation is called the whitening transformation or the whitening process.
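Continuing the sketch above (again with illustrative names, not the book's notation; the combined matrix $\Phi \Lambda^{-1/2}$ of (2.88) is formed explicitly here), the whitening transformation can be checked the same way:

import numpy as np

rng = np.random.default_rng(1)

Sigma_x = np.array([[3.0, 1.0],
                    [1.0, 2.0]])
lam, Phi = np.linalg.eigh(Sigma_x)

# Whitening matrix of (2.88): Y = Lambda^{-1/2} Phi^T X = (Phi Lambda^{-1/2})^T X.
A = Phi @ np.diag(lam ** -0.5)   # A = Phi Lambda^{-1/2}

X = rng.multivariate_normal(np.zeros(2), Sigma_x, size=100_000).T
Y = A.T @ X

# (2.89): the covariance of the whitened samples is (approximately) I.
print(np.round(np.cov(Y), 2))    # ~ identity matrix

Note that, unlike the orthonormal transformation of (2.83), the whitening transformation rescales each new axis by $\lambda_i^{-1/2}$ and therefore does not preserve Euclidean distances.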