Page 39 - Machine Learning for Subsurface Characterization



               The values range from 0 to 1, such that an F1 score of 1 indicates a perfect
            prediction (provided that there is no imbalance between the two labels) and 0
            indicates a total failure of the model. Consider the case discussed earlier, wherein
            the dataset contains 1000 inliers and 100 outliers, and the unsupervised ODT
            detects only 1 outlier, which is a true outlier. In that case, the precision is 1,
            the recall is 1/100, the specificity is 1, the balanced accuracy is close to 0.5,
            and the F1 score is close to 0.02. The F1 score and the balanced accuracy score
            therefore expose the poor performance of the unsupervised ODT, which the
            precision and specificity alone would hide. For the purpose of outlier detection,
            a good F1 score indicates good recall and good precision, meaning that a large
            portion of the outliers in the data are detected as outliers and a large portion
            of the detected outliers are truly outliers and not inliers.
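
            The arithmetic of this example can be checked with a minimal sketch. The
            function name detection_metrics and the confusion-matrix counts (1 true
            positive, 0 false positives, 1000 true negatives, 99 false negatives) are
            assumptions derived from the case described above, not code from the book.

```python
def detection_metrics(tp, fp, tn, fn):
    """Return standard detection metrics from confusion-matrix counts.

    Outliers are treated as the positive class (an assumption consistent
    with the example in the text).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    balanced_accuracy = (recall + specificity) / 2
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "balanced_accuracy": balanced_accuracy,
        "f1": f1,
    }


# 1000 inliers and 100 outliers; only one sample is flagged, and it is a
# true outlier, so 99 outliers are missed and no inlier is misclassified.
metrics = detection_metrics(tp=1, fp=0, tn=1000, fn=99)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

            The precision and specificity both equal 1, while the recall of 0.01 pulls the
            balanced accuracy down to 0.505 and the F1 score down to about 0.0198.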


            4.4.6 Receiver operating characteristic (ROC) curve and
            ROC-AUC score
            Unsupervised ODT generally assigns a score to each sample, such that the score
            represents the likelihood of the sample being an outlier. Each unsupervised ODT
            implements a specific decision threshold to determine whether a sample is an
            outlier, such that samples with scores on one side of the decision threshold are
            labeled as outliers and the remaining samples are labeled as inliers. A robust
            unsupervised ODT should be insensitive to variations in the decision threshold,
            that is, the outliers and inliers detected by the unsupervised ODT should not
            change much when the decision threshold changes. The ROC curve is a plot of the
            true positive rate (TPR; recall) vs the false positive rate (FPR; 1 − specificity)
            of the unsupervised ODT on a dataset at different decision/probability
            thresholds. When the threshold of an unsupervised ODT is altered, its
            performance changes; tracing the resulting (FPR, TPR) pairs over all thresholds
            produces the ROC curve. For instance, the isolation forest iteratively partitions
            the feature space to isolate a sample and assigns an outlier score based on the
            average path length. Samples with shorter path lengths are given a higher outlier
            score and are considered more likely to be outliers because it is easy to isolate
            them. A threshold is set for the isolation forest by defining the score beyond
            which a sample will be considered an outlier. For the isolation forest, the
            reported anomaly scores typically range from −1 to 1, with lower values being
            more anomalous and the threshold set at 0 by default, such that negative values
            (<0) are labeled outliers and positive values (>0) are labeled inliers.
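
            The threshold sweep that traces an ROC curve, and the area under that curve
            (the ROC-AUC score named in the section title), can be sketched in plain Python.
            The function names roc_points and auc and the toy scores and labels are
            hypothetical, and the sketch assumes that a higher score means a more
            outlier-like sample; scores that follow the opposite sign convention, such as
            the isolation forest values described above, would need to be negated first.

```python
def roc_points(scores, labels):
    """Sweep the decision threshold over every observed score.

    scores: outlier scores; higher means more outlier-like (assumed).
    labels: 1 for a true outlier, 0 for a true inlier.
    Returns (FPR, TPR) points ordered from the strictest threshold
    (nothing flagged) to the loosest (everything flagged).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    for threshold in sorted(set(scores), reverse=True):
        # Samples scoring at or above the threshold are flagged as outliers.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the piecewise-linear curve (trapezoidal rule)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data: three true outliers (label 1) among eight samples.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0]

points = roc_points(scores, labels)
roc_auc = auc(points)
print(points)
print(f"ROC-AUC: {roc_auc:.4f}")
```

            In this toy case one inlier (score 0.7) outranks one outlier (score 0.4), so
            14 of the 15 outlier-inlier pairs are ranked correctly and the ROC-AUC is
            14/15, or about 0.933.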
               For reliable and robust outlier detection, an unsupervised ODT should
            have high recall (high TPR) and high specificity (low FPR) that are relatively
            insensitive to changes in the decision threshold. In such scenarios, the
            ROC curve shifts toward the top left corner of the plot shown in
            Fig. 1.6A, which indicates a robust performance. As the ROC curve shifts toward
            the top left corner, the area under the curve (AUC) tends to 1, which represents
            perfect outlier detection for various choices of threshold. At an ROC-AUC
            close to 1, an unsupervised ODT is robust and reliable: its recall and specificity
            are close to 1 and relatively independent of the choice of threshold. ROC curve