Page 64 - Machine Learning for Subsurface Characterization
data for purposes of improving a specific learning task. The choice of
transformations for feature engineering depends on factors like the data type,
data structure, data size, learning model, and the desired outcome of the
learning. A few examples of feature engineering methods are time series
aggregations, image filters, and natural language processing. Popular feature
engineering methods tend to be unsupervised and easy to interpret. We
implement two distinct feature engineering methods (Step 4 in the workflow
shown in Fig. 2.2) to generate two distinct sets of features: The first set
comprises statistical measures of a stationary oscillatory signal, and the
second set comprises features derived using the short-time Fourier
transform of a nonstationary signal.
Stationary signals are constant in their statistical parameters over time [13].
To extract the first set of statistical features, each waveform is divided into three
equal segments, each having a length of 20 μs, and six features are derived for
each segment. This generates a total of 18 features per waveform; therefore, the
1375-dimensional raw waveform data are transformed into an 18-dimensional
feature set. The following statistical features were derived for each of the
three segments:
• Energy is defined as the sum of the squared amplitudes of the signal. The
energy transmitted through a fracture depends on various factors such as the
fracture stiffness, contact area, and type of fracture filling.
• Kurtosis is a descriptor of the distribution of the amplitudes relative to the
center of the distribution. It measures whether the data are heavy-tailed or
light-tailed compared with a normal distribution. Data sets with high
kurtosis tend to have heavy tails, indicating many outliers.
• Shape factor is the ratio of the root mean square (RMS) value to the average
(arithmetic mean) of absolute amplitudes. Shape factor is representative of a
signal type; for example, a sine wave, a square wave, and Gaussian white noise
have shape factors of 1.11, 1, and approximately 1.25, respectively.
• Crest factor is the ratio of the peak value to the root mean square (RMS) of a
waveform. It indicates how extreme the peaks are in a waveform. Crest
factor is representative of a signal type; for example, a sine wave, a square
wave, and Gaussian white noise have crest factors of 1.414, 1, and
infinity, respectively.
• Dominant frequency is the frequency of maximum amplitude (the one that
carries the most energy) in the frequency spectrum obtained by applying
the fast Fourier transform (FFT) to the signal. The FFT converts a signal from
the time domain to the frequency domain by decomposing the sequence of
values into components of different frequencies. Fractures act as low-pass
filters [5]; that is, transmission of sonic waves through fractures results in
attenuation of the high frequencies in the signal. When dominant
frequencies are compared between fractured and intact rock, a reduction in
the dominant frequency indicates wave propagation through a fractured zone.
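
The five features listed above can be sketched in a few lines of NumPy. This is
a minimal illustration under stated assumptions, not the authors' code: the
function names, the sampling rate (inferred here as 1375 samples over
3 x 20 μs), the use of Pearson (non-excess) kurtosis, and the choice to skip the
zero-frequency bin when picking the dominant frequency are all assumptions.
Only five of the six per-segment features are listed in this section, so the
sketch returns 15 values per waveform rather than 18.

```python
import numpy as np

# Hypothetical names for the five features described in the list above.
FEATURE_NAMES = ("energy", "kurtosis", "shape_factor",
                 "crest_factor", "dominant_frequency")


def segment_features(segment, sampling_rate_hz):
    """Compute the five listed features for one non-constant segment."""
    x = np.asarray(segment, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))
    centered = x - np.mean(x)
    variance = np.mean(centered ** 2)
    # Amplitude spectrum of the real-valued signal. The zero-frequency bin
    # is skipped (assumption) so a constant offset is not reported as the
    # dominant frequency.
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sampling_rate_hz)
    peak_bin = 1 + np.argmax(spectrum[1:])
    return np.array([
        np.sum(x ** 2),                          # energy
        np.mean(centered ** 4) / variance ** 2,  # kurtosis (normal = 3)
        rms / np.mean(np.abs(x)),                # shape factor
        np.max(np.abs(x)) / rms,                 # crest factor
        freqs[peak_bin],                         # dominant frequency, Hz
    ])


def waveform_features(waveform, sampling_rate_hz, n_segments=3):
    """Split a waveform into near-equal segments and stack their features."""
    segments = np.array_split(np.asarray(waveform, dtype=float), n_segments)
    return np.concatenate(
        [segment_features(s, sampling_rate_hz) for s in segments]
    )


# Usage on a synthetic 500 kHz sine wave (illustrative input, not real data).
n_samples = 1375
sampling_rate_hz = n_samples / 60e-6  # assumed: 1375 samples span 60 μs
t = np.arange(n_samples) / sampling_rate_hz
sine = np.sin(2 * np.pi * 500e3 * t)
features = waveform_features(sine, sampling_rate_hz)
print(dict(zip(FEATURE_NAMES, features[:5].round(3))))
```

For the sine input, the shape and crest factors of each segment should land
close to the values of 1.11 and 1.414 quoted above, which gives a quick sanity
check on the definitions. Because 1375 is not divisible by 3, `np.array_split`
produces segments of 459, 458, and 458 samples.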