
Chapter 8 ■ Classification  309


                               Vapnik, and Witten for more details. Working software can be found
                               in many places on the Internet, including the WEKA system
                               (www.cs.waikato.ac.nz/ml/weka/), SVMlight (http://svmlight.joachims.org/), and
                               LIBSVM (www.csie.ntu.edu.tw/~cjlin/libsvm/), for starters. Many more
                               links are listed at www.support-vector.net/software.html.



                               8.5 Multiple Classifiers—Ensembles


                               In complex situations, where there are many classes and many features, it is
                               often true that some classifiers work better for some of the classes than others.
                               One classifier may be able to identify cars in an image, for example, while
                               another is better at trucks, or perhaps even hatchbacks. It may also be that
                               some classifiers work better in some kinds of lighting, or in the presence of
                               specific sorts of noise. In those situations it may be desirable to use more than
                               one kind of classifier, and to merge the results after classification. These are
                               referred to as ensemble classifiers.
                                 The key with an ensemble is to find a way to merge the diverse results
                               from the individual classifiers. They may be of quite different types and have
                               very different methods, but all have the same basic goal, even if the problem
                               has been distributed. In the following description, the hand-printed digit
                               recognition problem of Chapter 9 will be developed. In this problem, an
                               image is presented to the classifier that contains a single hand-printed digit, 0
                               through 9, which has been scanned or otherwise converted into image form.
                               The question: what digit is this?


                               8.5.1 Merging Multiple Methods

                               A classifier can produce one of three kinds of output. The simplest and
                               probably the most common is a basic, unqualified expression of the class
                               determined for the data object. For a digit-classification scheme, this would
                               mean that the classifier might simply state, “This is a SIX,” for example; this
                               will be called a type 1 response [Xu, 1992]. A classifier may also produce a
                               ranking of the possible classes for a data object. In this case, the classifier
                               may say, “This is most likely a FIVE, but could be a THREE, and is even less
                               likely to be a TWO.” Probabilities are not associated with the ranking. This
                               will be called a type 2 response. Finally, a classifier may give a probability or
                               other such confidence rating to each of the possible classes. This is the most
                               specific case of all, since either a ranking or a classification can be produced
                               from it. In this case, each possible digit would be given a confidence number
                               that can be normalized to any specific range. This will be called a type 3
                               response.
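
                               The relationship among the three response types can be sketched in a
                               few lines of Python. This is only an illustration: the function names
                               and the confidence values are hypothetical, and the sum-to-one
                               normalization is just one assumed choice of range. It shows how a
                               type 3 response can be reduced to a type 2 ranking and then to a
                               type 1 decision.

```python
# Illustrative sketch only: the function names and confidence values
# below are hypothetical and do not come from any particular classifier.

def normalize(confidences):
    """Rescale non-negative confidence scores so that they sum to 1."""
    total = sum(confidences.values())
    if total <= 0:
        raise ValueError("at least one confidence must be positive")
    return {label: score / total for label, score in confidences.items()}

def to_ranking(confidences):
    """Type 3 -> type 2: order the classes from most to least likely.

    The scores themselves are discarded; only the order survives.
    Ties keep their original order because sorted() is stable.
    """
    return sorted(confidences, key=confidences.get, reverse=True)

def to_decision(confidences):
    """Type 3 -> type 1: keep only the single most likely class."""
    return to_ranking(confidences)[0]

# A made-up type 3 response for one digit image: one score per digit 0-9.
type3 = {0: 0.0, 1: 0.0, 2: 1.0, 3: 3.0, 4: 0.0,
         5: 6.0, 6: 0.0, 7: 0.0, 8: 0.0, 9: 0.0}

type3_normalized = normalize(type3)   # same information, scaled to sum to 1
type2 = to_ranking(type3)             # ranking without probabilities
type1 = to_decision(type3)            # single unqualified class

print("type 3:", type3_normalized)
print("type 2:", type2)
print("type 1:", type1)
```

                               The conversions run in one direction only. A ranking cannot be turned
                               back into confidences, and a single decision cannot be turned back
                               into a ranking, which is why the type 3 response is the most specific
                               of the three.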