AdaBoost, Naïve Bayes, ANN, and voting classifier. Seventy percent of the samples in each of the four datasets (40,000 or 80,000 samples with 28 features and one target label) are randomly selected as training samples, and the remaining samples form the testing dataset. Grid search with cross validation is used to tune the hyperparameters of the classifiers to avoid overfitting. The accuracy of the trained classifiers on the testing dataset, also referred to as the generalization performance, is shown in Table 9.5.
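A minimal sketch of this 70/30 split and grid-search tuning workflow, assuming scikit-learn; the placeholder data, the RBF-SVM example, and the parameter grid are illustrative assumptions rather than the study's actual setup:

# Sketch of the 70/30 split and grid-search tuning described above.
# The data below is a placeholder standing in for the study's dataset
# of 28 features and an orientation class label per sample.
import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 28))      # placeholder for the 28 features
y = rng.integers(0, 4, size=1000)    # placeholder orientation classes

# 70% of the samples for training, the remaining 30% for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, random_state=42, stratify=y)

# Scale the features before fitting an RBF SVM
pipe = Pipeline([("scale", StandardScaler()),
                 ("svm", SVC(kernel="rbf"))])

# Illustrative hyperparameter grid (not the grid used in the study)
param_grid = {"svm__C": [1, 10, 100],
              "svm__gamma": ["scale", 0.1, 0.01]}

# Grid search with 5-fold cross validation on the training split
search = GridSearchCV(pipe, param_grid, cv=5, scoring="accuracy", n_jobs=-1)
search.fit(X_train, y_train)

# Accuracy on the held-out test split, i.e., the generalization performance
print("Best hyperparameters:", search.best_params_)
print("Test accuracy:", search.best_estimator_.score(X_test, y_test))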
The nine classifiers have high generalization performance for Datasets #1 and #2 and low performance for Datasets #3 and #4. The classifiers perform nearly perfectly for the four classes in Dataset #2, which has a dispersion of ±20 degrees around the primary orientations, such that SVM, ANN, random forest, and voting classifiers have the best performance with a classification accuracy of 0.98 or higher. With the increase in dispersion to ±50 degrees, the classification accuracy for the four classes in Dataset #1 is around 0.91. At a dispersion of ±20 degrees, SVM, ANN, and voting classifiers can achieve

TABLE 9.5 Classification accuracy of the nine classifiers on the test dataset, also referred to as the generalization performance, for the classification-based noninvasive characterization of material containing static discontinuities of various primary orientations.
                     Accuracy for         Accuracy for         Accuracy for         Accuracy for
                     Dataset #1:          Dataset #2:          Dataset #3:          Dataset #4:
                     four orientations,   four orientations,   eight orientations,  eight orientations,
Classifier           kappa = 10           kappa = 50           kappa = 10           kappa = 50
KNN                  0.87                 0.95                 0.57                 0.69
Linear SVM           0.92                 0.99                 0.67                 0.88
RBF SVM              0.92                 0.99                 0.68                 0.86
Decision tree        0.82                 0.95                 0.55                 0.77
Random forest        0.90                 0.98                 0.66                 0.86
AdaBoost             0.91                 0.98                 0.64                 0.85
Naïve Bayes          0.81                 0.95                 0.56                 0.73
ANN                  0.91                 0.99                 0.65                 0.88
Voting classifier    0.92                 0.99                 0.69                 0.89
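The voting classifier in Table 9.5 combines the predictions of the individual classifiers. A minimal sketch of such an ensemble, assuming scikit-learn with soft voting over three of the base models and reusing the 70/30 split from the earlier sketch; the ensemble members, their settings, and the voting scheme are assumptions for illustration:

# Sketch of a voting ensemble over a few of the base classifiers.
# Reuses X_train, y_train, X_test, y_test from the earlier split.
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

voting = VotingClassifier(
    estimators=[("rf", RandomForestClassifier(n_estimators=200)),
                ("svm", SVC(kernel="rbf", probability=True)),
                ("ann", MLPClassifier(hidden_layer_sizes=(64, 32),
                                      max_iter=500))],
    voting="soft")   # average the predicted class probabilities

voting.fit(X_train, y_train)
print("Voting classifier test accuracy:", voting.score(X_test, y_test))

Soft voting averages the predicted class probabilities of the members, which typically helps when the base classifiers are individually well calibrated; hard (majority) voting is the simpler alternative.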