Page 39 - Machine Learning for Subsurface Characterization


The values range from 0 to 1, such that an F1 score of 1 indicates a perfect
prediction (provided that there is no imbalance between the two labels) and 0 indicates
a total failure of the model. Consider the case discussed earlier, wherein the
dataset contains 1000 inliers and 100 outliers, and the unsupervised ODT
detects only 1 outlier and that outlier is correctly detected by the model. In that
case, the precision is 1, the recall is 1/100, specificity is 1, balanced accuracy is
close to 0.5, and F1 score is close to 0.02. F1 score and balanced accuracy score
help detect the poor performance of the unsupervised ODT. For the purpose of
outlier detection, a good F1 score indicates good recall and good precision,
meaning that a large portion of the outliers in the data are accurately detected
as outliers and a large portion of the detected outliers are truly outliers and
not inliers.
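
The numbers quoted for the 1000-inlier, 100-outlier case can be checked with a few lines of arithmetic. The sketch below is only an illustration: it treats outliers as the positive class, and the variable names (tp, fp, fn, tn) are chosen here and do not come from the text.

```python
# Confusion-matrix counts for the example in the text,
# assuming that outliers are the positive class.
tp = 1      # outliers correctly detected as outliers
fp = 0      # inliers wrongly detected as outliers
fn = 99     # outliers that the model missed
tn = 1000   # inliers correctly left as inliers

precision = tp / (tp + fp)
recall = tp / (tp + fn)
specificity = tn / (tn + fp)
balanced_accuracy = (recall + specificity) / 2
f1 = 2 * precision * recall / (precision + recall)

print(f"precision         = {precision:.3f}")
print(f"recall            = {recall:.3f}")
print(f"specificity       = {specificity:.3f}")
print(f"balanced accuracy = {balanced_accuracy:.3f}")
print(f"F1 score          = {f1:.3f}")
```

Precision and specificity both equal 1 because no inlier is flagged, so only the recall-dependent scores (balanced accuracy and F1) expose the 99 missed outliers.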

4.4.6 Receiver operating characteristics (ROC) curve and ROC-AUC score
Unsupervised ODT generally assigns a score to each sample, such that the score
represents the likelihood of the sample to be an outlier. Each unsupervised ODT
implements a specific decision threshold to determine whether a sample is an
outlier, such that all samples with scores beyond the decision threshold are
labeled as outliers and the remaining samples are labeled as inliers. A robust
unsupervised ODT should be insensitive to variations in the decision threshold;
that is, the outliers and inliers detected by the unsupervised ODT should not
change much with changes in the decision threshold. The ROC curve is a plot of
the true positive rate (TPR; recall) vs the false positive rate
(FPR; 1 - specificity) of the unsupervised ODT on a dataset at different
decision/probability thresholds. As the threshold of an unsupervised ODT is
altered, its TPR and FPR change, and the resulting (FPR, TPR) pairs trace out
the ROC curve. For instance, the isolation forest iteratively partitions the
feature space to isolate a sample and assigns an outlier score based on the aver-
age path length. Samples with shorter path lengths are given a higher outlier
score and are considered more likely to be outliers because it is easy to isolate
them. A threshold is set for the isolation forest by defining the score beyond
which a sample will be considered an outlier. For the isolation forest, the
anomaly scores typically range from -1 to 1 with the threshold set at 0 by
default, such that negative values (<0) are labeled outliers and positive values
(>0) are labeled inliers.
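
The threshold sweep behind an ROC curve can be sketched in plain Python. The scores and labels below are made-up toy values for illustration only, not results from any dataset in this chapter, and the helper names roc_points and auc are hypothetical. The sketch assumes the sign convention described above for the isolation forest, in which a sample is flagged as an outlier when its score falls below the threshold.

```python
import math

# Hypothetical anomaly scores (lower = more outlier-like) and true labels.
scores = [-0.30, -0.10, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
is_outlier = [True, True, False, True, False, False, False, False]


def roc_points(scores, is_outlier):
    """Return (FPR, TPR) pairs obtained by sweeping the decision threshold.

    A sample is flagged as an outlier when its score is below the threshold.
    """
    n_out = sum(is_outlier)
    n_in = len(is_outlier) - n_out
    # One threshold per distinct score, plus +inf so that every sample is flagged.
    thresholds = sorted(set(scores)) + [math.inf]
    points = []
    for t in thresholds:
        flagged = [s < t for s in scores]
        tp = sum(f and o for f, o in zip(flagged, is_outlier))
        fp = sum(f and not o for f, o in zip(flagged, is_outlier))
        points.append((fp / n_in, tp / n_out))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


points = roc_points(scores, is_outlier)
for fpr, tpr in points:
    print(f"FPR = {fpr:.2f}, TPR = {tpr:.2f}")
print(f"ROC-AUC = {auc(points):.3f}")
```

In this toy case one outlier scores above one inlier, so the area comes out slightly below 1. With scikit-learn, the same quantities are available through roc_curve and roc_auc_score in sklearn.metrics; because those functions expect higher scores for the positive class, the output of IsolationForest.decision_function would need to be negated when outliers are the positive class.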
For reliable and robust outlier detection, an unsupervised ODT should
have high recall (high TPR) and high specificity (low FPR) that are relatively
insensitive to changes in the decision threshold. In such scenarios, the
ROC curve shifts toward the top left corner of the plot shown in
Fig. 1.6A, which indicates robust performance. As the ROC curve shifts toward
the top left corner, the area under the curve (AUC) tends to 1, which represents
perfect outlier detection for various choices of threshold. An ROC-AUC close
to 1 indicates that the unsupervised ODT is robust and reliable, with recall and
specificity that are close to 1 and relatively independent of the choice of
threshold. ROC curve
