Page 319 - Machine Learning for Subsurface Characterization
a classification accuracy of 0.91 for the eight classes in Dataset #4.
Classification of eight classes in the presence of dispersion of +50 degrees
exhibits low generalization performance. KNN, Naïve Bayes, and decision trees
are the poorest-performing classifiers for all the datasets. SVM and voting
classifiers are the best-performing classifiers for all the datasets.
5.2.3 Sensor/receiver importance
The compressional wavefront travel times originate from one source and are
measured at 28 sensors/receivers located along the three boundaries of the
material, as shown in Figs. 9.22 and 9.23. Each sample used for training or
testing the classifiers is a 28-dimensional feature vector representing the
travel times measured at the 28 receivers. The feature permutation method can
be used to compute the importance of each sensor/receiver to the proposed
classification-based noninvasive fracture characterization. The method
evaluates the importance of a sensor/receiver with respect to a specific
trained classifier. The values of one feature (i.e., one sensor/receiver
measurement) in the test dataset are replaced with random noise
(noninformative data) having the same statistical distribution as the original
feature values, and the resulting drop in the performance of the trained
classifier on the modified test data is quantified. The random noise is
generated by shuffling the original values of the feature being replaced; this
preserves the mean, standard deviation, and other statistical properties of
that feature. The features (i.e., sensor measurements) are replaced one at a
time and ranked by the drop in performance, such that an important feature
causes a significant drop in the generalization performance of the pretrained
classifier when applied to the test data.
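The feature permutation procedure described above can be sketched in a few
lines of Python. This is a minimal illustration and not the code used for the
results in this chapter. The helper names (accuracy, permutation_importance),
the toy four-receiver dataset, and the threshold rule standing in for a trained
classifier are all assumptions made for the example. In practice, the predict
function would wrap a pretrained classifier and the test data would hold the 28
travel-time features.

```python
import random


def accuracy(predict, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(predict, X_test, y_test, n_repeats=10, seed=0):
    """Mean drop in test accuracy when each feature column is shuffled.

    predict is a callable mapping one feature vector to a class label; it
    stands in for any classifier that has already been trained.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, X_test, y_test)
    n_features = len(X_test[0])
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            # Shuffling reuses the original values of feature j, so its mean
            # and standard deviation are preserved while its link to the
            # labels is destroyed.
            column = [row[j] for row in X_test]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:]
                      for row, v in zip(X_test, column)]
            drops.append(baseline - accuracy(predict, X_perm, y_test))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Hypothetical toy data: 4 "receivers", of which only receiver 2 determines
# the class label.
data_rng = random.Random(42)
X_test = [[data_rng.random() for _ in range(4)] for _ in range(200)]
y_test = [int(row[2] > 0.5) for row in X_test]


def toy_predict(row):
    """Stand-in for a trained classifier (assumed, for illustration only)."""
    return int(row[2] > 0.5)


baseline, importances = permutation_importance(toy_predict, X_test, y_test)
ranking = sorted(range(len(importances)), key=lambda j: -importances[j])
print("baseline accuracy:", baseline)
print("receivers ranked by importance:", ranking)
```

Averaging over several shuffles (n_repeats) reduces the dependence of the score
on any single random permutation. The scikit-learn library provides comparable
functionality through sklearn.inspection.permutation_importance, which may be
preferable when the classifier is a scikit-learn estimator.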
The importance of sensors for classification-based noninvasive
characterization of material containing static discontinuities of various
primary orientations is shown in Fig. 9.24. The importance measures are
determined by computing the sensitivity of a model to random permutations
of feature values (i.e., sensor measurements). The importance score quantifies
the contribution of a feature to the predictive performance of a model in
terms of how much a chosen evaluation metric changes when the feature
becomes noninformative. In Fig. 9.24, the sensors located on the boundary
opposite the source are important for classification. Wavefront arrival times
measured by sensors on the boundaries adjacent to the transmitter-bearing
boundary are not as important as those measured on the opposite boundary. When
the network of discontinuities is horizontal (parallel to the x-axis), the
sonic wave minimally interacts with the discontinuities. However, when the
network of discontinuities is vertical (parallel to the y-axis), the sonic wave
significantly interacts with the discontinuities. Consequently, changes in
orientation of the discontinuities will alter the

