Page 67 - Machine Learning for Subsurface Characterization
Characterization of fracture-induced geomechanical alterations Chapter 2 53
features. Dimensionality reduction reduces undesired characteristics in high-
dimensional data, namely, noise (variance), redundancy (highly correlated
variables), and data inadequacy (features ≫ samples). This benefit comes at
the cost of some loss of information. Dimensionality reduction
methods can be broadly categorized into feature selection and feature
extraction methods. Feature selection methods select the most relevant
features from the original set of features based on an objective function.
Feature extraction methods construct a smaller set of new features, each of
which is a combination of the original features. Features obtained using
feature selection retain their original characteristics and meaning,
whereas those obtained using feature extraction are transformations of the
original features that generally lack a direct physical interpretation.
Popular feature selection methods are variance threshold, recursive feature
elimination, the ANOVA F-value test, and the mutual information test.
Popular feature extraction methods are principal component analysis, factor
analysis, ISOMAP, and independent component analysis. In our study, we use
principal component analysis (PCA), a feature extraction method, as the
dimensionality reduction technique.
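The difference between feature selection and feature extraction can be illustrated with a minimal NumPy sketch. It is not the implementation used in the study: the data are synthetic, and the helper names `variance_threshold_select` and `pca_extract`, the threshold value, and the number of components are assumptions chosen only for illustration.

```python
import numpy as np

def variance_threshold_select(X, threshold):
    """Feature selection: keep the original columns whose variance exceeds threshold."""
    variances = X.var(axis=0)
    keep = np.flatnonzero(variances > threshold)
    return X[:, keep], keep

def pca_extract(X, n_components):
    """Feature extraction: project centered data onto the top principal directions."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered data; the rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

# Synthetic data (assumption): 200 samples, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 3] = 0.01 * rng.normal(size=200)            # nearly constant feature
X[:, 4] = X[:, 0] + 0.05 * rng.normal(size=200)  # redundant copy of feature 0

# Selection returns a subset of the original columns, unchanged.
X_sel, kept = variance_threshold_select(X, threshold=0.1)

# Extraction returns new columns, each a linear combination of all originals.
X_pca = pca_extract(X, n_components=3)

print("columns kept by selection:", kept)
print("shape after extraction:", X_pca.shape)
```

Note that the variance threshold removes the nearly constant feature but keeps the redundant one, whereas the PCA scores are mutually uncorrelated by construction, so the redundancy is absorbed into the leading components.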
6 Results and discussions
6.1 Effect of feature engineering
Fig. 2.6 compares the visualizations obtained by K-means clustering of the
shear-waveform dataset transformed using STFT followed by PCA with those
obtained when the dataset is transformed using stationary statistical
methods followed by PCA. The visualizations in Fig. 2.6 show the
geomechanical alterations in the postfracture rock, quantified in terms of
the geomechanical alteration (GA) index. PCA was applied to the
feature-engineered dataset to reduce its dimensionality, which improves the
performance of the clustering methods by mitigating the curse of
dimensionality. The 180 STFT-derived features were reduced to 67 and 88
PCA-derived components, which account for 98% of the variance, for
visualizing the geomechanically altered zones in the axial and frontal
planes, respectively. Similarly, the 18 stationary statistical features were
reduced to 12 PCA-derived components, which account for 98% of the variance,
for visualizing the altered zones in both the axial and frontal planes.
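The workflow described above, retaining the PCA components that account for 98% of the variance and then clustering with K-means, can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's code: the data are synthetic (two well-separated groups embedded in 180 features), the helper names `pca_by_variance` and `kmeans` are hypothetical, and the number of clusters is chosen arbitrarily. The mapping from clusters to the GA index is not shown.

```python
import numpy as np

def pca_by_variance(X, target=0.98):
    """Keep the fewest principal components whose cumulative explained
    variance ratio reaches `target`."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)        # variance ratio per component
    cumulative = np.cumsum(explained)
    # First index where the cumulative ratio reaches the target, as a count.
    k = min(int(np.searchsorted(cumulative, target)) + 1, len(S))
    return Xc @ Vt[:k].T, k, cumulative

def kmeans(X, n_clusters, n_iter=100, seed=0):
    """Plain Lloyd's algorithm with random initial centers drawn from the data."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=n_clusters, replace=False)]
    for _ in range(n_iter):
        # Assign each sample to its nearest center (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in (assumption) for a 180-feature dataset: two groups in a
# 10-dimensional latent space, linearly mixed into 180 features plus noise.
rng = np.random.default_rng(1)
latent = np.vstack([rng.normal(0.0, 1.0, size=(150, 10)),
                    rng.normal(10.0, 1.0, size=(150, 10))])
mixing = rng.normal(size=(10, 180))
X = latent @ mixing + 0.1 * rng.normal(size=(300, 180))

Z, k, cumulative = pca_by_variance(X, target=0.98)
labels, _ = kmeans(Z, n_clusters=2)

print("components retained:", k)
print("cluster sizes:", np.bincount(labels))
```

The number of retained components depends on the data, which is consistent with the different component counts reported for the axial and frontal planes at the same 98% variance level.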
In the frontal plane, both sets of features (obtained by feature engineering
followed by dimensionality reduction) indicate very high geomechanical
alteration [red (dark gray in print version)] at a height of around 100 mm,
which is most likely due to hydraulic fracturing. Unlike the stationary
statistical features, the STFT-derived GA index indicates symmetrical regions
of high alteration [red (dark gray in print version)]. Also, unlike the stationary
statistical features, the STFT-derived GA index clearly shows a large
uniform low-alteration region [pink (light gray in print version) and blue