Page 49 - Machine Learning for Subsurface Characterization


            Appendix B Confusion matrix to quantify the inlier and outlier
            detections by the unsupervised ODTs

            See Fig. 1.B1.

            FIG. 1.B1 Confusion matrices for (A) DBSCAN applied to subset FS1 of Dataset #1, (B) IF
            applied to subset FS4 of Dataset #2, (C) LOF applied to subset FS1 of Dataset #3, and (D)
            OCSVM applied to Dataset #4. Of these four cases, IF applied to subset FS4 of Dataset #2
            performs best at detecting outliers, and OCSVM applied to Dataset #4 performs worst.
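
As a rough illustration of how such a confusion matrix can be tallied, the sketch below counts true and false detections with outliers treated as the positive class. The label convention (+1 for inliers, -1 for outliers) follows scikit-learn's `predict` output for IF, LOF, and OCSVM; the function name `confusion_counts` and the label lists are hypothetical and are not taken from the datasets in this chapter.

```python
def confusion_counts(y_true, y_pred, outlier_label=-1):
    """Return (tp, fp, fn, tn), treating outliers as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == outlier_label and truth == outlier_label:
            tp += 1  # outlier correctly flagged
        elif pred == outlier_label:
            fp += 1  # inlier wrongly flagged as outlier
        elif truth == outlier_label:
            fn += 1  # outlier missed by the detector
        else:
            tn += 1  # inlier correctly kept
    return tp, fp, fn, tn


# Hypothetical labels for ten samples (+1 = inlier, -1 = outlier).
# DBSCAN marks noise as -1 but returns cluster ids (0, 1, ...) for the
# remaining samples, so its labels would first need mapping to +1.
y_true = [1, 1, 1, -1, -1, 1, 1, -1, 1, 1]
y_pred = [1, 1, -1, -1, 1, 1, 1, -1, 1, 1]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print("outliers detected:", tp, "| outliers missed:", fn)
print("inliers kept:", tn, "| inliers flagged:", fp)
```

If scikit-learn is available, `sklearn.metrics.confusion_matrix` produces the same counts in matrix form.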

            Appendix C Values of important hyperparameters of the
            unsupervised ODT models

            Model                 Hyperparameters
            Isolation forest      n_estimators = 100, max_samples = 256, contamination = 'auto' (a),
                                  max_features = 1 (default value in scikit-learn)
            One-class SVM         gamma = 'auto' (b), nu = 0.1
            Local outlier factor  n_neighbors = 20, metric = 'euclidean', contamination = 'auto'
            DBSCAN                eps = 0.5, min_samples = 5, metric = 'euclidean'

            (a) Contamination refers to the fraction of outlier samples in the dataset. When it is set
            to 'auto', the model uses its default threshold. When it is set to a value x (0 < x < 1),
            the model labels a fraction x of the samples in the dataset as outliers, chosen by their
            anomaly scores.
            (b) Gamma set to 'auto' means that gamma = 1/(number of features).

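
The settings above map directly onto scikit-learn constructor arguments. The following is a minimal sketch, not the authors' code. The names `HYPERPARAMS`, `build_models`, `resolve_gamma`, and `flag_outliers` are hypothetical. `max_features` is written as the float 1.0 on the assumption that the table's "1 (default value)" refers to scikit-learn's documented default, which uses all features. The two helpers only illustrate the footnotes: `flag_outliers` is a simplified stand-in for the library's internal thresholding.

```python
# Hyperparameters from the table, keyed by scikit-learn class name.
HYPERPARAMS = {
    "IsolationForest": {"n_estimators": 100, "max_samples": 256,
                        "contamination": "auto",
                        "max_features": 1.0},  # assumption: table's "1" = default 1.0
    "OneClassSVM": {"gamma": "auto", "nu": 0.1},
    "LocalOutlierFactor": {"n_neighbors": 20, "metric": "euclidean",
                           "contamination": "auto"},
    "DBSCAN": {"eps": 0.5, "min_samples": 5, "metric": "euclidean"},
}


def build_models():
    """Instantiate the four detectors (requires scikit-learn)."""
    from sklearn.cluster import DBSCAN
    from sklearn.ensemble import IsolationForest
    from sklearn.neighbors import LocalOutlierFactor
    from sklearn.svm import OneClassSVM

    classes = {"IsolationForest": IsolationForest, "OneClassSVM": OneClassSVM,
               "LocalOutlierFactor": LocalOutlierFactor, "DBSCAN": DBSCAN}
    return {name: classes[name](**params) for name, params in HYPERPARAMS.items()}


def resolve_gamma(gamma, n_features):
    """Footnote (b): gamma='auto' means 1 / (number of features)."""
    return 1.0 / n_features if gamma == "auto" else float(gamma)


def flag_outliers(scores, contamination):
    """Footnote (a), simplified: flag the fraction `contamination` of samples
    with the lowest scores (lower score = more anomalous)."""
    n_out = round(contamination * len(scores))
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return sorted(order[:n_out])


print(resolve_gamma("auto", 4))                            # gamma for 4 features
print(flag_outliers([0.1, -0.5, 0.3, -0.2, 0.0], 0.4))     # indices flagged
```

Calling `build_models()` returns unfitted estimators. IF, LOF, and OCSVM label samples through `fit_predict` (+1 inlier, -1 outlier), while DBSCAN marks noise points with -1 in `labels_`.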
            Appendix D Receiver operating characteristic (ROC) and
            precision-recall (PR) curves for various unsupervised ODTs
            on Dataset #1

            See Figs. 1.D1–1.D3.
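
For readers who want to reproduce curves of this kind, the sketch below shows one way ROC and PR points can be computed by sweeping a threshold over anomaly scores. It is an illustration under stated assumptions, not the procedure used for Figs. 1.D1–1.D3: the labels and scores are hypothetical, outliers are coded as 1 (positive class), and a higher score means more anomalous. scikit-learn's `score_samples` uses the opposite sign (lower = more abnormal), so its output would need to be negated first.

```python
def roc_pr_points(y_true, scores):
    """Sweep a threshold from high to low score.

    y_true: 1 = outlier (positive), 0 = inlier.
    Returns (roc, pr), where roc holds (FPR, TPR) pairs and
    pr holds (recall, precision) pairs.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    roc, pr = [(0.0, 0.0)], []
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample in a run of tied scores.
        if i + 1 < len(ranked) and ranked[i + 1][0] == score:
            continue
        roc.append((fp / n_neg, tp / n_pos))
        pr.append((tp / n_pos, tp / (tp + fp)))
    return roc, pr


def auc_trapezoid(points):
    """Area under a piecewise-linear curve given as (x, y) pairs."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical example: six samples, two of which are true outliers.
y_true = [1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]

roc, pr = roc_pr_points(y_true, scores)
auc = auc_trapezoid(roc)
print("ROC points:", roc)
print("PR points:", pr)
print("ROC AUC:", auc)
```

With scikit-learn installed, `sklearn.metrics.roc_curve` and `sklearn.metrics.precision_recall_curve` return equivalent points.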