

It basically means that the influence of any single object on the description of the classifier is limited. This upper bound prevents noisy objects with very large weights from completely determining the weight vector, and thus the classifier. The parameter C has a large influence on the final solution, in particular when the classification problem contains overlapping class distributions, so it should be set carefully. Unfortunately, it is not clear beforehand what a suitable value for C will be; it depends on both the data and the type of kernel function that is used. No generally applicable number can be given. The only option in a practical application is to run cross-validation (Section 5.4) to optimize C.
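As a minimal sketch of such a cross-validation (an illustration, not one of the book's listings: it assumes MATLAB's Statistics and Machine Learning Toolbox with fitcsvm, rather than the PRTools toolbox used elsewhere in this book, and a training set X with labels y), one could grid-search C as follows:

    % Sketch: select C by 5-fold cross-validation.
    % Assumes training data X (N x D) and labels y (N x 1).
    Cgrid = 10.^(-2:3);                    % candidate values for C
    err   = zeros(size(Cgrid));
    for i = 1:numel(Cgrid)
        mdl = fitcsvm(X, y, 'KernelFunction', 'rbf', ...
                      'BoxConstraint', Cgrid(i));
        cv  = crossval(mdl, 'KFold', 5);   % 5-fold partition
        err(i) = kfoldLoss(cv);            % cross-validation error
    end
    [~, best] = min(err);
    Cbest = Cgrid(best);                   % C with the lowest CV error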
The support vector classifier has many advantages. A unique global optimum for its parameters can be found using standard optimization software. Nonlinear boundaries can be used without much extra computational effort. Moreover, its performance is very competitive with other methods. A drawback is that the problem complexity is not of the order of the dimension of the samples, but of the order of the number of samples. For large sample sizes (N_S > 1000), general quadratic programming software will often fail, and special-purpose optimizers using problem-specific speedups have to be used to solve the optimization.
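To make this scaling concrete, the dual problem can be written as a generic quadratic program. The following sketch (an illustration only, not the book's code) assumes data X, labels y in {-1, +1}, a chosen trade-off C, and MATLAB's quadprog from the Optimization Toolbox; the N_S-by-N_S Hessian is what defeats general-purpose solvers for large N_S.

    % Sketch: the support vector dual as a generic QP.
    % The Hessian H is N x N, so memory and time grow with the
    % number of samples, not with the dimension of the data.
    N = size(X, 1);
    K = (X*X' + 1).^2;                     % quadratic polynomial kernel
    H = (y*y') .* K;                       % dual Hessian (N x N)
    f = -ones(N, 1);
    alpha = quadprog(H, f, [], [], ...     % minimize 0.5*a'*H*a + f'*a
                     y', 0, ...            % subject to y'*a = 0
                     zeros(N,1), C*ones(N,1));  % and 0 <= a <= C
    sv = find(alpha > 1e-6);               % indices of support vectors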
A second drawback is that, like the perceptron, the classifier is basically a two-class classifier. The simplest solution for obtaining a classifier with more than two classes is to train K classifiers, each distinguishing one class from the rest (similar to the place coding mentioned above). The classifier with the highest output w_k^T z + b then determines the class label. Although this solution is simple to implement and works reasonably well, it can lead to problems because the output value of a support vector classifier is determined only by the margin between the two classes it is trained for; it is not optimized for use as a confidence estimate. Other methods train the K classifiers simultaneously, incorporating the one-class-against-the-rest labelling directly into the constraints of the optimization. This again gives a quadratic optimization problem, but the number of constraints increases significantly, which complicates the optimization.
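The one-class-against-the-rest scheme can be sketched as follows (again an illustration, not the book's code: it assumes fitcsvm from the Statistics and Machine Learning Toolbox, and the helper name ovr_svm is ours, saved as ovr_svm.m):

    % Sketch: K one-against-the-rest classifiers; the class with
    % the highest output then determines the label.
    function pred = ovr_svm(Xtrain, ytrain, Xtest)
    classes = unique(ytrain);
    K = numel(classes);
    scores = zeros(size(Xtest,1), K);
    for k = 1:K
        yk = double(ytrain == classes(k)); % class k against the rest
        mdl = fitcsvm(Xtrain, yk, 'KernelFunction', 'polynomial', ...
                      'PolynomialOrder', 2);
        [~, s] = predict(mdl, Xtest);      % s(:,2): score for class k
        scores(:,k) = s(:,2);
    end
    [~, idx] = max(scores, [], 2);         % highest output wins
    pred = classes(idx);
    end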

Example 5.6 Classification of mechanical parts, support vector classifiers

Decision boundaries found by support vector classifiers for the mechanical parts example are shown in Figure 5.11. These plots were generated by the code shown in Listing 5.7. In Figure 5.11(a), the kernel used was a polynomial one with degree d = 2 (a quadratic kernel); in Figure 5.11(b), it was a Gaussian kernel with a width