Page 319 - Machine Learning for Subsurface Characterization

Classification of sonic wave Chapter  9 279


             a classification accuracy of 0.91 for the eight classes in Dataset #4.
             Classification of eight classes in the presence of dispersion of +50 degrees
             exhibits low generalization performance. KNN, Naïve Bayes, and decision trees
             are the poorest-performing classifiers for all the datasets. SVM and voting
             classifiers are the best-performing classifiers for all the datasets.


             5.2.3 Sensor/receiver importance
             The compressional wavefront travel times originate from one source and are
             measured at 28 sensors/receivers located along the three boundaries of the
             material, as shown in Figs. 9.22 and 9.23. Each sample used for training or
             testing the classifiers comprises a 28-dimensional feature vector representing
             the travel times measured at the 28 receivers. The feature permutation method
             can be used to compute the importance of each sensor/receiver to the proposed
             classification-based noninvasive fracture characterization. The method
             evaluates the importance of a sensor/receiver with respect to a specific
             classifier in two steps. First, the values of one feature (i.e., one
             sensor/receiver measurement) in the test dataset are replaced with random
             noise (noninformative data) that has the same statistical distribution as the
             original feature values. Second, the drop in the performance of the trained
             classifier is quantified on the test data containing the replaced,
             noninformative feature values. The random noise used for the replacement is
             generated by shuffling the original values of the feature across the test
             samples; this preserves the mean, standard deviation, and other statistical
             properties of the feature while breaking its association with the class
             labels. Each feature in the dataset is replaced one at a time, and the
             features are ranked by the resulting drop in performance: an important feature
             (i.e., a sensor measurement) causes a significant drop in the generalization
             performance of the pretrained classifier when it is applied to the test data.
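             The shuffling procedure described above can be sketched in a few lines of
             Python. This is a minimal illustration, not the authors' implementation: the
             function names, the accuracy metric, the number of repeats, and the
             three-receiver toy data (standing in for the 28 travel times) are all
             assumptions made for the example, and toy_predict is a hypothetical stand-in
             for a pretrained classifier.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X_test, y_test, n_repeats=10, seed=0):
    """Mean drop in test accuracy when each feature column is shuffled.

    predict : callable mapping a list of feature rows to a list of labels
              (stands in for any already-trained classifier).
    X_test  : list of rows, one travel-time value per receiver.
    """
    rng = random.Random(seed)
    baseline = accuracy(y_test, predict(X_test))
    n_features = len(X_test[0])
    importances = []
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only: its mean, standard deviation, and
            # histogram are unchanged, but its link to the labels is broken.
            column = [row[j] for row in X_test]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:]
                      for row, v in zip(X_test, column)]
            drops.append(baseline - accuracy(y_test, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return baseline, importances


# Toy stand-in for the travel-time data: 3 "receivers" instead of 28.
# Only receiver 0 carries class information; receivers 1 and 2 are noise.
data_rng = random.Random(42)
y_test = [i % 2 for i in range(200)]
X_test = [[label + data_rng.gauss(0, 0.1),
           data_rng.gauss(0, 1),
           data_rng.gauss(0, 1)]
          for label in y_test]


def toy_predict(rows):
    """Hypothetical 'trained' classifier: thresholds receiver 0 at 0.5."""
    return [1 if row[0] > 0.5 else 0 for row in rows]


baseline, importances = permutation_importance(toy_predict, X_test, y_test)
ranking = sorted(range(len(importances)),
                 key=lambda j: importances[j], reverse=True)
print("baseline accuracy:", baseline)
print("importance per receiver:", importances)
print("receivers ranked by importance:", ranking)
```

             Because the toy classifier reads only receiver 0, shuffling the other two
             columns leaves its predictions unchanged and their importance is zero.
             Averaging the drop over several shuffles is a design choice of this sketch
             that reduces the dependence of the ranking on any single random permutation.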
                The importance of sensors for classification-based noninvasive
             characterization of material containing static discontinuities of various
             primary orientations is shown in Fig. 9.24. The importance measures are
             determined by computing the sensitivity of a model to random permutations
             of feature values (i.e., sensor measurements). The importance score quantifies
             the contribution of a feature to the predictive performance of a model in
             terms of how much a chosen evaluation metric deviates when the feature
             becomes noninformative. In Fig. 9.24, the sensors located on the boundary
             opposite to the source are important for classification. Wavefront arrival
             times measured by sensors on the boundaries adjacent to the
             transmitter-bearing boundary are not as important as those measured on the
             opposite boundary. When the network of discontinuities is horizontal (parallel
             to the x-axis), the sonic wave minimally interacts with the discontinuities.
             However, when the network of discontinuities is vertical (parallel to the
             y-axis), the sonic wave significantly interacts with the discontinuities.
             Consequently, changes in orientation of the discontinuities will alter the