
            Listing 7.6
            MATLAB code for calling the EM algorithm for mixture of Gaussians
            estimation.


            load nutsbolts_unlabeled;     % Load the data set z
            z = setlabtype(z,'soft');     % Set probabilistic labels
            [lab,w] = emclust(z,qdc,4);   % Cluster using EM
            figure; clf; scatterd(z);
            plotm(w,[],0.2:0.2:1);        % Plot results



            7.2.4  Mixture of probabilistic PCA

            An interesting variant of the mixture of Gaussians is the mixture of
            probabilistic principal component analyzers (Tipping and Bishop,
            1999). Each individual model is still a Gaussian, as in (7.13), but
            its covariance matrix is constrained:

                $$ \mathbf{C}'_k = \mathbf{W}_k \mathbf{W}_k^{T} + \sigma_k^2 \mathbf{I} \tag{7.29} $$
            where the N × D matrix W_k has the D eigenvectors corresponding to
            the largest eigenvalues of C_k as its columns, and the noise level
            outside the subspace spanned by W_k is estimated from the remaining
            eigenvalues λ_m:
                $$ \sigma_k^2 = \frac{1}{N-D} \sum_{m=D+1}^{N} \lambda_m \tag{7.30} $$
            The EM algorithm to fit a mixture of probabilistic principal
            component analyzers proceeds just as for a mixture of Gaussians,
            using C'_k instead of C_k. At the end of the M-step, the parameters
            W_k and σ_k^2 are re-estimated for each cluster k by applying
            normal PCA to C_k and by (7.30), respectively.
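
              Written out in plain MATLAB (not PRTools), the per-cluster
            re-estimation amounts to an eigendecomposition of C_k. The
            following is a minimal sketch of (7.29) and (7.30); the toy data
            and all variable names are illustrative only:

            N = 5; D = 2;                         % data and subspace dimensions
            X = randn(100,N)*diag([3 2 1 .5 .5]); % toy data for cluster k
            Ck = cov(X);                          % sample covariance matrix C_k
            [V,L] = eig(Ck);                      % eigendecomposition of C_k
            [lambda,idx] = sort(diag(L),'descend');
            V = V(:,idx);                         % eigenvectors, sorted by eigenvalue
            Wk = V(:,1:D);                        % D leading eigenvectors as columns
            sigma2k = mean(lambda(D+1:N));        % noise level, cf. (7.30)
            Ck_prime = Wk*Wk' + sigma2k*eye(N);   % constrained covariance, cf. (7.29)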
              The mixture of probabilistic principal component analyzers
            introduces a new parameter, the subspace dimension D. Nevertheless,
            it uses far fewer parameters (when D ≪ N) than the standard mixture
            of Gaussians: each constrained covariance matrix requires roughly
            ND + 1 parameters instead of the N(N + 1)/2 of a full covariance
            matrix (for N = 10 and D = 2, that is 21 instead of 55). Still, it
            is possible to model nonlinear data, which cannot be done using
            normal PCA. Finally, an advantage over normal PCA is that the
            mixture is a full probabilistic model, i.e. it can be used directly
            as a density estimate.
              In PRTools, probabilistic PCA is implemented in qdc: an additional
            parameter specifies the number of subspace dimensions to use. To train a
            mixture, one can use emclust, as is illustrated in Listing 7.7.
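
              Listing 7.7 itself follows on the next page. As a hedged sketch,
            a call with the same structure as Listing 7.6 might look as
            follows, assuming that the subspace dimension is passed as the
            fourth argument of qdc (the qdc(A,R,S,DIM) form) and that
            one-dimensional subspaces are wanted:

            load nutsbolts_unlabeled;        % Load the data set z
            z = setlabtype(z,'soft');        % Set probabilistic labels
            u = qdc([],[],[],1);             % Untrained PPCA model, D = 1 (assumed syntax)
            [lab,w] = emclust(z,u,4);        % Cluster using EM, 4 clusters
            figure; clf; scatterd(z);
            plotm(w,[],0.2:0.2:1);           % Plot results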