
26   Machine learning for subsurface characterization


            5 Performance of unsupervised ODTs on the four validation datasets

            In this section, the performances of the four unsupervised ODTs (IF, OCSVM,
            LOF, and DBSCAN) are evaluated by comparing their unsupervised detections
            against the known labels in the validation datasets. The performance of each
            model is expressed in terms of the balanced accuracy score, F1 score, and
            ROC-AUC score (the area under the ROC curve). The balanced accuracy score is
            high when large fractions of the actual outliers and inliers in the data are
            correctly detected as outliers and inliers, respectively. For outlier
            detection, a good F1 score indicates both good recall and good precision,
            meaning that a large fraction of the actual outliers in the data are correctly
            detected as outliers and a large fraction of the detected outliers are truly
            outliers rather than inliers. A ROC-AUC score close to 1 indicates that a
            large fraction of the actual outliers and inliers are correctly detected
            without much sensitivity to the decision threshold of the unsupervised ODT
            model. These three metrics are simple evaluation metrics; for a more robust
            assessment, they should be appropriately weighted to account for the
            outlier-inlier imbalance (i.e., the number of positives/outliers is much
            smaller than the number of negatives/inliers). Appendix B presents the true
            positives, true negatives, false positives, and false negatives for certain
            unsupervised ODTs on certain datasets in the form of confusion matrices.
            Appendix C lists the hyperparameter values of the various models used for
            unsupervised outlier detection. Because our goal is to find the most reliable
            unsupervised outlier-detection method, these hyperparameters are not
            tuned/modified, and any remaining hyperparameters are kept at their default
            values. The values listed in Appendix C are held constant for all the
            numerical experiments on the four datasets. In a real-world scenario, without
            labels against which to compare and evaluate the outlier/inlier detections,
            the hyperparameters would need to be tuned based on a manual inspection of
            the detected outliers and inliers.
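
            The three metrics can be computed with scikit-learn; the following is a
            minimal sketch on a small, purely illustrative imbalanced label set (the
            labels and anomaly scores below are made up, not taken from the datasets
            in this chapter). Following the convention above, outliers are the
            positive class.

```python
# Illustrative sketch: scoring an unsupervised outlier detector against
# known validation labels. All labels/scores below are made-up toy data.
from sklearn.metrics import balanced_accuracy_score, f1_score, roc_auc_score

# 1 = outlier (positive class), 0 = inlier; note the outlier-inlier imbalance.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Hard outlier/inlier labels assigned by the unsupervised ODT.
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

# Continuous anomaly scores from the same ODT (higher = more anomalous),
# used for the threshold-independent ROC-AUC score.
y_score = [0.9, 0.4, 0.2, 0.1, 0.3, 0.2, 0.1, 0.1, 0.2, 0.8]

print(balanced_accuracy_score(y_true, y_pred))  # mean of outlier and inlier recall
print(f1_score(y_true, y_pred))                 # harmonic mean of precision and recall
print(roc_auc_score(y_true, y_score))           # area under the ROC curve
```

            Because the class imbalance here is mild compared with real well-log data,
            the gap between the balanced accuracy score and the F1 score is small; on
            strongly imbalanced datasets the two can diverge sharply, which is why the
            text recommends weighting the metrics.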


            5.1 Performance on Dataset #1 containing noisy measurements

            The unsupervised ODT model performance is evaluated for three feature subsets
            referred to as FS1, FS2, and FS2*, where FS1 contains GR, RHOB, and DTC;
            FS2 contains GR, RHOB, and the logarithm of RT; and FS2* contains GR, RHOB,
            and RT. For the subsets FS1 and FS2* of Dataset #1, DBSCAN performs better
            than the other models, as indicated by the balanced accuracy score. For the
            subset FS1 of Dataset #1, DBSCAN correctly labels 176 of the 200 introduced
            noise samples as outliers and 3962 of the 4037 “normal” data points as
            inliers; consequently, DBSCAN achieves a balanced accuracy score of 0.93 and
            an F1 score of 0.78. For the subset FS2 of Dataset #1, the log transform of
            resistivity negatively impacts the outlier-detection performance, because the
            logarithmic transformation reduces the variability in the feature. On using
            deep resistivity (RT) as is (i.e., without logarithmic transformation) in the
            subset FS2*,
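
            As a consistency check, the DBSCAN scores reported above for subset FS1
            follow directly from the stated detection counts; the sketch below applies
            the standard definitions of balanced accuracy and F1 to those counts
            (only the counts come from the text, the arithmetic is generic).

```python
# Confusion counts for DBSCAN on subset FS1 of Dataset #1, as stated in the text.
tp = 176            # introduced noise samples correctly detected as outliers
fn = 200 - 176      # introduced noise samples missed (labeled inliers)
tn = 3962           # "normal" samples correctly labeled inliers
fp = 4037 - 3962    # "normal" samples wrongly labeled outliers

recall = tp / (tp + fn)             # true positive rate (outlier recall)
specificity = tn / (tn + fp)        # true negative rate (inlier recall)
precision = tp / (tp + fp)

balanced_accuracy = (recall + specificity) / 2
f1 = 2 * precision * recall / (precision + recall)

print(round(balanced_accuracy, 2))  # 0.93, matching the reported score
print(round(f1, 2))                 # 0.78, matching the reported score
```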