Machine Learning for Subsurface Characterization
Chapter 9: Classification of sonic wave

FIG. 9.18 Implementation of AdaBoost classifier on a dataset that has two features and two classes. Weak learner #2 improves on the mistakes made by weak learner #1, such that the decision boundaries learnt by the two weak learners can be combined to form a strong learner. In this case, each weak learner is a decision tree, and the AdaBoost classifier (i.e., strong learner) combines the weak learners in series.

The weight of a sample misclassified by the previous tree is boosted so that the subsequent tree focuses on correctly classifying the previously misclassified samples. Classification accuracy increases as more weak classifiers are added in series to the model; however, adding too many weak learners may lead to severe overfitting and a drop in generalization capability. AdaBoost is well suited for imbalanced datasets but underperforms in the presence of noise, because noisy or mislabeled samples are repeatedly assigned larger weights. AdaBoost is also slower to train than a random forest, because its trees must be built sequentially rather than in parallel, and hyperparameter optimization of AdaBoost is much more difficult than for the RF classifier (Fig. 9.18).
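The boosting workflow of Fig. 9.18 can be sketched as follows. This is a minimal illustration, assuming scikit-learn is available; the synthetic two-feature, two-class dataset and all settings are illustrative, not the chapter's data. By default, each weak learner is a depth-1 decision tree (a stump) fitted on a reweighted copy of the training set.

```python
# Sketch of an AdaBoost classifier built from decision-tree weak learners
# (assumes scikit-learn; dataset and hyperparameters are illustrative).
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Synthetic dataset with two features and two classes, as in Fig. 9.18.
X, y = make_classification(n_samples=500, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each boosting round fits a shallow decision tree (the default weak
# learner) in series; misclassified samples get larger weights so the
# next tree focuses on them.
clf = AdaBoostClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```

Increasing `n_estimators` adds more weak learners in series, which illustrates the trade-off described above: training accuracy keeps rising while test accuracy can eventually degrade through overfitting.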


4.1.6 Naïve Bayes (NB) classifier
Naïve Bayes classifier is a probabilistic classifier based on Bayes' theorem. It makes the "naïve" assumption that the features are mutually independent and do not interact with each other, such that each feature independently and equally contributes to the probability of a sample belonging to a specific class.
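Under this independence assumption, Bayes' theorem for a sample with features $x_1, \dots, x_n$ reduces to a product of per-feature terms:

```latex
P(y_i \mid x_1, \dots, x_n) \propto P(y_i) \prod_{j=1}^{n} P(x_j \mid y_i)
```

and the predicted class is the $y_i$ that maximizes this product.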
NB classifier is simple to implement, computationally fast, and performs well on large, high-dimensional datasets. It is well suited for real-time applications and is not sensitive to noise. The NB classifier processes the training dataset to calculate the class probabilities P(y_i) and the conditional probabilities, where each conditional probability is the frequency of a feature value for a given class divided by the frequency of instances with that class. NB classifier performs best when correlated features are removed because correlated features
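The frequency counting that produces these probability tables can be sketched in plain Python. The feature values and class labels below are illustrative placeholders, not data from this chapter:

```python
# Sketch of the probability tables an NB classifier derives from training
# frequencies: class probabilities P(y_i) and conditional probabilities
# P(x | y_i) for one categorical feature. Labels are illustrative.
from collections import Counter

# (feature value, class label) pairs from a toy training set.
data = [("sand", "pay"), ("sand", "pay"), ("shale", "nonpay"),
        ("shale", "nonpay"), ("sand", "nonpay"), ("shale", "pay")]

class_counts = Counter(label for _, label in data)
n = len(data)

# Class probabilities P(y_i): frequency of each class in the training set.
p_class = {c: cnt / n for c, cnt in class_counts.items()}

# Conditional probabilities P(x | y_i): frequency of each feature value
# for a given class divided by the frequency of instances of that class.
pair_counts = Counter(data)
p_cond = {(v, c): cnt / class_counts[c]
          for (v, c), cnt in pair_counts.items()}

print(p_class["pay"])           # 3/6 = 0.5
print(p_cond[("sand", "pay")])  # 2/3
```

A real implementation would also apply smoothing (e.g., Laplace smoothing) so that feature values unseen for a class do not produce zero probabilities.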