Machine Learning for Subsurface Characterization, Chapter 1: Unsupervised outlier detection techniques
It is an important metric in this work because it measures how well the model avoids wrongly labeling inliers as outliers. Specificity should be used together with recall to evaluate the performance of a model. Ideally, we want both recall and specificity close to 1. High specificity on its own does not indicate good performance of the unsupervised ODT. For example, if a model labels every data point as an inlier, the specificity is 1 but the recall is 0, indicating that the performance of the unsupervised ODT is poor.
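The pitfall above can be sketched with a minimal pure-Python example; the function names and the 1-for-outlier label convention are illustrative choices, not taken from the text:

```python
# Illustrative convention: 1 = outlier, 0 = inlier
def recall(y_true, y_pred):
    # Fraction of true outliers that were detected: TP / (TP + FN)
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)

def specificity(y_true, y_pred):
    # Fraction of true inliers that were kept as inliers: TN / (TN + FP)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tn / (tn + fp)

y_true = [0] * 8 + [1] * 2          # 8 inliers, 2 outliers
y_pred = [0] * 10                   # model labels everything an inlier
print(specificity(y_true, y_pred))  # 1.0
print(recall(y_true, y_pred))       # 0.0 -- high specificity, useless model
```

The all-inlier predictor attains perfect specificity while missing every outlier, which is why the two metrics must be read together.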
4.4.3 Balanced accuracy score
Balanced accuracy score is the arithmetic mean of the specificity and recall. It is a more informative metric than either alone because it combines recall and specificity into a single score for evaluating the performance of the outlier detection model. Balanced accuracy score is expressed as
Balanced accuracy score = (Recall + Specificity)/2    (1.5)
Its values range from 0 to 1, where 1 indicates a perfect ODT that correctly detects all the inliers and outliers in the dataset. Consider a dataset containing 1000 inliers and 100 outliers. When an unsupervised ODT model labels every sample as an outlier, the recall is 1, the specificity is 0, and the balanced accuracy score is 0.5. The balanced accuracy score is high only when a large portion of the outliers and a large portion of the inliers are each detected correctly.
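The 1000-inlier/100-outlier example from the text can be verified directly. This is a minimal sketch using the same 1-for-outlier label convention as before; the function name is an illustrative choice:

```python
def balanced_accuracy(y_true, y_pred):
    # Counts over binary labels (1 = outlier, 0 = inlier)
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    recall = tp / (tp + fn)            # all 100 outliers flagged -> 1.0
    specificity = tn / (tn + fp)       # all 1000 inliers flagged -> 0.0
    return (recall + specificity) / 2  # Eq. (1.5)

y_true = [0] * 1000 + [1] * 100   # 1000 inliers, 100 outliers
y_pred = [1] * 1100               # every sample flagged as an outlier
print(balanced_accuracy(y_true, y_pred))  # 0.5
```

A degenerate all-outlier predictor scores only 0.5, matching the worked example, whereas either recall or specificity alone would have reported a perfect 1.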
4.4.4 Precision
Precision is a measure of the reliability of the outlier labels assigned by the unsupervised ODT. It represents the fraction of correctly predicted outlier points among all the points predicted as outliers. It is expressed mathematically as
Precision = TP/(TP + FP)    (1.6)
Similar to recall, precision should not be used in isolation to assess performance. For a dataset containing 1000 inliers and 100 outliers, if the unsupervised ODT detects only one point as an outlier and that point happens to be a true outlier, then the precision of the model is 1, even though 99 of the 100 true outliers go undetected.
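That single-detection scenario can be reproduced in a few lines; as before, the label convention (1 = outlier) and function name are illustrative:

```python
def precision(y_true, y_pred):
    # Fraction of predicted outliers that are true outliers: TP / (TP + FP)
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fp)

y_true = [0] * 1000 + [1] * 100            # 1000 inliers, 100 outliers
# Flag exactly one point, and it happens to be a true outlier
y_pred = [0] * 1000 + [1] + [0] * 99
print(precision(y_true, y_pred))           # 1.0, yet 99 outliers are missed
```

Perfect precision here coexists with a recall of only 0.01, which is exactly why precision must be paired with recall.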
4.4.5 F1 score
F1 score is the harmonic mean of the recall and precision. Like the balanced accuracy score, it combines both metrics to overcome their individual limitations. It is expressed as
F1 score = 2 × Precision × Recall/(Precision + Recall)    (1.7)
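Because the harmonic mean is dominated by the smaller of its two inputs, the one-detection model above (precision 1, recall 0.01) still earns a near-zero F1. A minimal sketch, with an added guard for the degenerate all-zero case (an assumption, since Eq. 1.7 is undefined there):

```python
def f1_score(precision, recall):
    # Harmonic mean of precision and recall, per Eq. (1.7)
    if precision + recall == 0:
        return 0.0  # convention when both terms are zero (assumed, not in text)
    return 2 * precision * recall / (precision + recall)

# Precision 1.0 but recall 0.01 (one detected outlier out of 100):
print(f1_score(1.0, 0.01))  # ~0.0198, close to the weaker metric
```

Unlike the balanced (arithmetic) mean, which would report 0.505 here, the harmonic mean punishes the imbalance, so F1 stays low unless both precision and recall are high.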