Page 38 - Machine Learning for Subsurface Characterization

Unsupervised outlier detection techniques Chapter  1 23


                It is an important metric in this work as it ensures that inliers are not wrongly
             labeled as outliers. Specificity should be used together with recall to evaluate
             the performance of a model. Ideally, we want high recall close to 1 and high
             specificity close to 1. A high specificity on its own does not indicate good
             performance of the unsupervised ODT. For example, if a model labels every
             data point as an inlier, the specificity is 1 but the recall is 0, indicating
             poor performance of the unsupervised ODT.
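The degenerate case above can be checked with a minimal sketch. The confusion-matrix counts (TP, FN, TN, FP) follow the standard definitions, and the 1000-inlier/100-outlier counts are illustrative, not from a specific dataset:

```python
def recall(tp, fn):
    # Recall (sensitivity): fraction of true outliers detected as outliers.
    return tp / (tp + fn)

def specificity(tn, fp):
    # Specificity: fraction of true inliers labeled as inliers.
    return tn / (tn + fp)

# A model that labels every point as an inlier: no positives at all,
# so every outlier becomes a false negative and every inlier a true negative.
tn, fp, tp, fn = 1000, 0, 0, 100
print(specificity(tn, fp))  # 1.0 -- looks perfect in isolation
print(recall(tp, fn))       # 0.0 -- reveals that no outlier was found
```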

             4.4.3 Balanced accuracy score
             Balanced accuracy score is the arithmetic mean of recall and specificity.
             Because it combines both metrics into a single value, it overcomes the
             limitations of using either one in isolation to evaluate the outlier
             detection model. Balanced accuracy score is expressed as

                                                 Recall + Specificity
                           Balanced accuracy score =                     (1.5)
                                                         2
                Its values range from 0 to 1, such that 1 indicates a perfect ODT that cor-
             rectly detects all the inliers and outliers in the dataset. Consider a dataset con-
             taining 1000 inliers and 100 outliers. When an unsupervised ODT model detects
             each sample as an outlier, the recall is 1, the specificity is 0, and the balanced
             accuracy score is 0.5. Balanced accuracy score is high only when a large
             portion of the outliers and inliers in the data are accurately detected as
             outliers and inliers, respectively.
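The worked example above can be reproduced with a short sketch of Eq. (1.5), using the same 1000-inlier/100-outlier dataset and the model that flags every sample as an outlier:

```python
def balanced_accuracy(recall, specificity):
    # Eq. (1.5): arithmetic mean of recall and specificity.
    return (recall + specificity) / 2.0

# Model that labels every sample as an outlier:
tp, fn = 100, 0     # all 100 true outliers are caught -> recall = 1
tn, fp = 0, 1000    # all 1000 inliers are mislabeled -> specificity = 0
recall = tp / (tp + fn)
specificity = tn / (tn + fp)
print(balanced_accuracy(recall, specificity))  # 0.5
```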

             4.4.4 Precision
             Precision is a measure of the reliability of the outlier labels assigned by the
             unsupervised ODT. It represents the fraction of correctly predicted outlier
             points among all the predicted outliers. It is expressed mathematically as
                                                  TP
                                     Precision =                         (1.6)
                                                TP + FP
                Similar to recall, precision should not be used in isolation to assess
             performance. For a dataset containing 1000 inliers and 100 outliers, if the
             unsupervised ODT detects only one point as an outlier and that point happens
             to be a true outlier, the precision of the model is 1, even though 99 of the
             100 outliers were missed.
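The pitfall described above can be made concrete with a sketch of Eq. (1.6), again using the illustrative 1000-inlier/100-outlier counts:

```python
def precision(tp, fp):
    # Eq. (1.6): fraction of predicted outliers that are true outliers.
    return tp / (tp + fp)

# Model predicts a single outlier, which happens to be a true one:
tp, fp, fn = 1, 0, 99
print(precision(tp, fp))   # 1.0 -- precision looks perfect
print(tp / (tp + fn))      # recall = 0.01 -- exposes the poor coverage
```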


             4.4.5 F1 score
             F1 score is the harmonic mean of recall and precision; like the balanced
             accuracy score, it combines both metrics to overcome their individual
             limitations. It is expressed as

                                          2 × Precision × Recall
                                F1 score =                               (1.7)
                                            Precision + Recall
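A minimal sketch of Eq. (1.7) shows how the harmonic mean penalizes the imbalance in the single-prediction example from Section 4.4.4 (precision 1.0, recall 0.01):

```python
def f1_score(precision, recall):
    # Eq. (1.7): harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# High precision but very low recall yields a near-zero F1 score.
print(round(f1_score(1.0, 0.01), 4))  # 0.0198
```

Because the harmonic mean is dominated by the smaller of the two values, a model cannot achieve a high F1 score by excelling at only one of the metrics.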