Page 72 - Machine Learning for Subsurface Characterization
terms of 180 STFT-derived features were reduced to 67 and 88 PCA-derived
components, which account for 98% of variance, for visualizing the altered
zones in the axial and frontal planes, respectively. In Case 2, data expressed
in terms of 180 STFT-derived features were reduced to 12 and 15 PCA-
derived components, which account for 75% of variance, for visualizing the
altered zones in the axial and frontal planes, respectively. In Case 3, feature
selection is performed prior to PCA to dimensionally reduce the data
expressed in terms of 180 STFT-derived features. The feature selection
consists of two steps. First, a variance threshold is
applied to eliminate STFT-derived features that have variance less than 1.6;
then, correlated STFT-derived features having a correlation coefficient
higher than 0.9 are removed. The two steps reduce the 180 STFT-derived
features to 22 features for axial visualization and 19 features for frontal
visualization. After these two steps, PCA is performed to reduce the feature-
selected dataset to 18 and 8 PCA-derived components, which account for
98% of variance, for visualizing the altered zones in the axial and frontal
planes, respectively. The geomechanical alteration indices (Fig. 2.9) for the three
cases were generated using K-means clustering. Only postfracture shear
wave measurements are considered for this comparison.
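The two-step feature selection followed by PCA described for Case 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the workflow actually used to produce Fig. 2.9: it uses only NumPy, it runs on synthetic data standing in for the 180 STFT-derived features, and the helper names (variance_filter, correlation_filter, pca_retain) are hypothetical. The text does not say which member of a correlated pair is removed or whether the sign of the correlation matters, so the greedy rule and the use of the absolute correlation are assumptions. The K-means clustering step is omitted.

```python
import numpy as np

def variance_filter(X, threshold):
    """Keep the columns of X whose variance is at least `threshold`."""
    keep = np.flatnonzero(X.var(axis=0) >= threshold)
    return X[:, keep], keep

def correlation_filter(X, max_corr):
    """Greedy rule (an assumption): drop a column when its absolute
    correlation with any already-kept column exceeds `max_corr`."""
    corr = np.abs(np.atleast_2d(np.corrcoef(X, rowvar=False)))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= max_corr for k in kept):
            kept.append(j)
    kept = np.array(kept, dtype=int)
    return X[:, kept], kept

def pca_retain(X, variance_fraction):
    """Project mean-centered X onto the fewest principal components whose
    cumulative explained-variance ratio reaches `variance_fraction`."""
    Xc = X - X.mean(axis=0)
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    ratio = s**2 / np.sum(s**2)
    n = int(np.searchsorted(np.cumsum(ratio), variance_fraction)) + 1
    n = min(n, s.size)  # guard against floating-point round-off
    return Xc @ Vt[:n].T, ratio[:n]

# Synthetic stand-in for the STFT-derived features (not real waveform data):
# 90 columns with differing variances plus 90 near-duplicates of them.
rng = np.random.default_rng(0)
n_samples = 300
base = rng.normal(size=(n_samples, 12)) @ rng.normal(size=(12, 90))
base = base * rng.uniform(0.2, 1.0, size=90)
base += 0.3 * rng.normal(size=base.shape)
X = np.hstack([base, base + 0.05 * rng.normal(size=base.shape)])

# Thresholds taken from the text: variance 1.6, correlation 0.9, 98% variance.
X_var, _ = variance_filter(X, threshold=1.6)
X_sel, _ = correlation_filter(X_var, max_corr=0.9)
scores, ratio = pca_retain(X_sel, variance_fraction=0.98)

print("features:", X.shape[1], "->", X_var.shape[1], "->", X_sel.shape[1])
print("PCA components:", scores.shape[1], "retained variance:", ratio.sum())
```

The feature counts printed for the synthetic data will not match the 22/19 features or 18/8 components reported above, since those depend on the measured waveforms. Setting variance_fraction to 0.75 in place of 0.98 mimics the more aggressive reduction of Case 2.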
In the axial plane, all three cases show a region of maximum
geomechanical alteration around the center of the sample, and the degree of
alteration decreases toward the sample boundaries. Case 2 shows an unusually
large region of low geomechanical alteration, most likely due to the loss of
information associated with discarding 25% of the variance. In the frontal
orientation, all three cases show a highly altered zone around the height of
100 mm. At the height of 40 mm, Case 1 and Case 2 indicate a slightly
altered zone [blue (dark gray in print version)], whereas Case 3 indicates a
highly altered zone [yellow (light gray in print version)], which is
inconsistent with the other two cases. Case 2 seems to be the most inconsistent, while Case 1 is the
most consistent, with gradual variation in alteration in the vertical and radial
directions. In conclusion, shear waveforms should be transformed using STFT
followed by PCA that retains 98% of the variance for the best visualization.
6.4 Effect of using features derived from both prefracture
and postfracture waveforms
Fig. 2.10 shows the effect of combining features derived from prefracture shear
waveforms with the features derived from postfracture shear waveforms on the
noninvasive visualization. Our hypothesis is that the information from
prefracture waveforms may improve the identification of alteration zones. A
feature set containing STFT-derived features obtained by transforming
prefracture and postfracture measurements has twice as many features
as the feature set containing STFT-derived features obtained by
transforming only the postfracture waveform. Hence, the dataset containing