Page 173 - Computational Statistics Handbook with MATLAB


                              It has been shown [Andrews, 1972; Embrechts and Herzberg, 1991] that
                             because of the mathematical properties of the trigonometric functions, the
                             Andrews curves preserve means, distance (up to a constant) and variances.
                             One consequence of this is that Andrews curves showing functions close
                             together suggest that the corresponding data points will also be close
                             together. Thus, one use of Andrews curves is to look for clustering of the data
                             points.
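
                             As a rough numerical illustration of the distance property (a sketch
                             that is not part of the original example), we can check that the
                             integral over (-pi, pi) of the squared difference between two Andrews
                             curves equals pi times the squared Euclidean distance between the two
                             points. The points x and y, the grid size m, and all variable names
                             below are arbitrary choices made for illustration. We assume the
                             p = 4 basis functions 1/sqrt(2), sin(t), cos(t) and sin(2t).

```matlab
% Illustrative check only; x, y and m are arbitrary assumptions.
x = [1 2 3 4];          % first hypothetical 4-D observation
y = [2 0 1 5];          % second hypothetical 4-D observation
% Equally spaced grid over one full period, without the
% duplicate endpoint at pi.
m = 1000;
t = linspace(-pi,pi,m+1);
t = t(1:m);
% Each row of B is one basis function evaluated on the grid.
B = [ones(1,m)/sqrt(2); sin(t); cos(t); sin(2*t)];
% Andrews curves for the two points.
fx = x*B;
fy = y*B;
% Approximate the integral of the squared difference
% with a Riemann sum.
dt = 2*pi/m;
lhs = sum((fx - fy).^2)*dt;
% Pi times the squared Euclidean distance.
rhs = pi*sum((x - y).^2);
fprintf('integral = %g, pi*dist^2 = %g\n',lhs,rhs);
```

                             The two printed values should agree to within rounding error, which is
                             the sense in which curves that lie close together indicate data points
                             that lie close together.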

                             Example 5.23
                             We show how to construct Andrews curves for the iris data, using only the
                             observations for Iris setosa and Iris virginica. We plot the curves
                             for each species in a different line style to see if there is evidence that we can
                             distinguish between the species using these variables.
                                load iris
                                % This defines the domain that will be plotted.
                                theta = (-pi+eps):0.1:(pi-eps);
                                n = 50;
                                p = 4;
                                % There will be n curves plotted,
                                % one for each data point. Each curve is
                                % evaluated at every angle in theta.
                                ysetosa = zeros(n,length(theta));
                                yvirginica = zeros(n,length(theta));
                                % Evaluate sin and cos functions at each angle theta.
                                % Row k of ang holds the p function values for angle k.
                                ang = zeros(length(theta),p);
                                k = 0;
                                for t = theta
                                   k = k+1;
                                   ang(k,:) = [1/sqrt(2) sin(t) cos(t) sin(2*t)];
                                end
                                % Now generate a 'y' for each observation.
                                for i = 1:n
                                  for j = 1:length(theta)
                                     % Find dot product with observation.
                                     ysetosa(i,j)=setosa(i,:)*ang(j,:)';
                                     yvirginica(i,j)=virginica(i,:)*ang(j,:)';
                                  end
                                end
                                % Do all of the plots.
                                plot(theta,ysetosa(1,:),'r',...
                                      theta,yvirginica(1,:),'b-.')
                                legend('Iris Setosa','Iris Virginica')
                                hold on
                                for i = 2:n


                            © 2002 by Chapman & Hall/CRC