
            figure; plotyy(R(:,1),R(:,2),R(:,1),R(:,4));    % Plot the learn curves
            [w,R] = bpxnc(z,[100 100],1000);                % Train a larger network
            figure; scatterd(z); plotc(w);                  % Plot the classifier
            figure; plotyy(R(:,1),R(:,2),R(:,1),R(:,4));    % Plot the learn curves



            5.4   EMPIRICAL EVALUATION

            In the preceding sections, various methods for training a classifier have
            been discussed. These methods have led to different types of classifiers
            and different types of learning rules. However, none of these methods
            can claim overall superiority over the others, because their applicability
            and effectiveness are largely determined by the specific nature of the
            problem at hand. Therefore, rather than relying on just one method
            selected at the beginning of the design process, the designer often
            examines several methods and selects the one that appears most
            suitable. For that purpose, each classifier has to be evaluated.
              Another reason for performance evaluation stems from the fact that
            many classifiers have their own parameters that need to be tuned. The
            optimization of a design criterion using only training data holds the risk
            of overfitting the design, leading to an inadequate ability to generalize.
            The behaviour of the classifier becomes too specific for the training data
            at hand, and is less appropriate for future measurement vectors coming
            from the same application. In particular, if there are many parameters
            relative to the size of the training set and the dimension of the measure-
            ment vector, the risk of overfitting becomes large (see also Figure 5.13
            and Chapter 6). Performance evaluation based on a validation set (test
            set, evaluation set), independent of the training set, can be used as a
            stopping criterion for the parameter tuning process.
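              A simple way to realize such a criterion is to split the available data
            into a training part and a validation part, and to monitor the validation
            error while the complexity of the classifier is varied. The following
            sketch is merely illustrative: the split fraction and the candidate
            numbers of hidden units are arbitrary choices, and the PRTools routines
            gendat and testc are assumed to be available.

            [ztrn,zval] = gendat(z,0.7);       % Random split: 70% training,
                                               % 30% validation
            units = [2 5 10 20 50];            % Candidate numbers of hidden units
            for i = 1:length(units)
              w = bpxnc(ztrn,units(i),1000);   % Train the network
              e_trn(i) = testc(ztrn*w);        % Apparent (training) error
              e_val(i) = testc(zval*w);        % Validation error
            end
            [dummy,best] = min(e_val);         % Select the size with the lowest
                                               % validation error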
              A third motivation for performance evaluation is that, in any case, we
            would like to have reliable specifications of the design.
              There are many criteria for the performance of a classifier. The prob-
            ability of misclassification, i.e. the error rate, is the most popular one. The
            analytical expression for the error rate as given in (2.16) is not very useful
            because, in practice, the conditional probability densities are unknown.
            However, we can easily obtain an estimate of the error rate by subjecting
            the classifier to a validation set. The estimated error rate is the fraction of
            misclassified samples with respect to the size of the validation set.
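              As an illustration, suppose a validation set is available (or is created
            by splitting the data with gendat). In PRTools the error rate can then be
            estimated with testc, which counts the fraction of misclassified
            validation samples. The classifier ldc below is used only as an example;
            any trained classifier could take its place.

            [ztrn,zval] = gendat(z,0.5);       % Independent training and
                                               % validation sets
            w = ldc(ztrn);                     % Train some classifier, e.g. a
                                               % linear one
            e = testc(zval*w);                 % Estimated error rate: fraction of
                                               % misclassified validation samples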