Page 48 - Machine Learning for Subsurface Characterization

Unsupervised outlier detection techniques Chapter  1 33


             For any specific decision threshold, balanced accuracy score and F1 score
             should be used together to evaluate the reliability and precision of the
             unsupervised ODT.
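The joint use of the two scores can be illustrated with a minimal sketch. It assumes scikit-learn is available, that labels are coded 1 for outlier and 0 for inlier, and that a higher score means a more anomalous sample. The helper name `evaluate_at_threshold` and the toy scores are hypothetical and are not taken from the chapter's own experiments.

```python
import numpy as np
from sklearn.metrics import balanced_accuracy_score, f1_score


def evaluate_at_threshold(scores, y_true, threshold):
    """Return (balanced accuracy, F1) for one decision threshold.

    Assumed conventions: y_true uses 1 = outlier, 0 = inlier, and
    higher scores indicate more anomalous samples.
    """
    y_pred = (np.asarray(scores) >= threshold).astype(int)
    bal_acc = balanced_accuracy_score(y_true, y_pred)
    f1 = f1_score(y_true, y_pred, pos_label=1)
    return bal_acc, f1


# Toy example: 8 inliers and 2 outliers with hand-picked anomaly scores.
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
scores = np.array([0.10, 0.20, 0.15, 0.30, 0.70, 0.25, 0.05, 0.40, 0.90, 0.60])

bal_acc, f1 = evaluate_at_threshold(scores, y_true, threshold=0.5)
print(f"balanced accuracy = {bal_acc:.4f}, F1 = {f1:.4f}")
```

In this toy case both outliers are caught but one inlier is falsely flagged, so the balanced accuracy stays high (0.9375) while the F1 score drops to 0.8. The single false alarm costs a third of the precision because outliers are rare, which is why the two scores are more informative together than either one alone.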
                DBSCAN is the most effective at detecting noise in data as outliers, while
             IF and OCSVM perform slightly worse at flagging noisy data points as
             outliers and have lower precision. DBSCAN, IF, and OCSVM are suitable for
             detecting point outliers, when outliers are scattered around the inlier
             zone. None of these methods is suitable when outliers occur as dense regions
             in the feature space, that is, as collective outliers. Isolation forest performs
             very well in detecting contextual outliers when there are zones affected by
             bad-hole conditions. Isolation forest also proved efficient in detecting outliers
             when there is a mixture of outliers due to noise (point outliers) and bad-hole
             conditions (contextual outliers) in the presence of an infrequently occurring but
             relevant and distinct subgroup (which should not be treated as an outlier merely
             because of its rare occurrence and distinct characteristics). Isolation forest is
             by far the most robust and reliable in detecting outliers and inliers in the log
             data. The performance of unsupervised ODTs depends on the selection of features,
             especially when detecting contextual outliers, a task that also requires
             hyperparameter tuning for optimum performance. For example, shallow-sensing logs
             improve the detection of depths where logs are adversely affected by bad holes.
             Local outlier factor (LOF) is computationally expensive and needs careful
             hyperparameter tuning for reliable and robust performance; by far, LOF is the
             worst-performing unsupervised ODT.
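A comparison of this kind can be set up along the following lines. This is only a sketch under stated assumptions: it uses scikit-learn's implementations of the four methods, a synthetic two-feature dataset with scattered point outliers in place of real well logs, and hand-picked hyperparameters (`eps`, `min_samples`, `n_neighbors`, and a contamination level taken as known). It does not reproduce the datasets, settings, or rankings reported in this chapter.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.ensemble import IsolationForest
from sklearn.metrics import balanced_accuracy_score
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

# Synthetic stand-in for log data (an assumption, not the chapter's dataset):
# a tight inlier cluster plus point outliers scattered around it.
rng = np.random.default_rng(0)
inliers = rng.normal(loc=0.0, scale=0.3, size=(200, 2))
outliers = rng.uniform(low=-6.0, high=6.0, size=(20, 2))
X = np.vstack([inliers, outliers])
y_true = np.r_[np.zeros(200, dtype=int), np.ones(20, dtype=int)]  # 1 = outlier

# Outlier fraction, assumed known here; in practice it has to be estimated.
contamination = 20 / 220

# Illustrative hyperparameters; each method would need tuning on real logs.
detectors = {
    "IF": IsolationForest(contamination=contamination, random_state=0),
    "OCSVM": OneClassSVM(nu=contamination, gamma="scale"),
    "LOF": LocalOutlierFactor(n_neighbors=20, contamination=contamination),
    "DBSCAN": DBSCAN(eps=0.5, min_samples=5),
}

results = {}
for name, model in detectors.items():
    # All four estimators mark outliers (or DBSCAN noise points) with -1.
    labels = model.fit_predict(X)
    y_pred = (labels == -1).astype(int)
    results[name] = balanced_accuracy_score(y_true, y_pred)

for name, score in results.items():
    print(f"{name}: balanced accuracy = {score:.3f}")
```

Because the synthetic outliers are scattered point outliers, this setup says nothing about collective or contextual outliers. Testing those cases would require data with dense outlier regions or depth-dependent features such as shallow-sensing logs.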



             Appendix A Popular methods for outlier detection
             See Fig. 1.A1.





















             FIG. 1.A1 Popular methods for outlier detection.