Page 64 - Machine Learning for Subsurface Characterization


            data for purposes of improving a specific learning task. The choice of
            transformations for feature engineering depends on factors like the data type,
            data structure, data size, learning model, and the desired outcome of the
            learning. A few examples of feature engineering methods are time series
            aggregations, image filters, and natural language processing. Popular feature
            engineering methods tend to be unsupervised and easy to interpret. We
            implement two distinct feature engineering methods (Step 4 in the workflow
            shown in Fig. 2.2) to generate two distinct sets of features: The first set
            comprises statistical measures of a stationary oscillatory signal, and the
            second set comprises features derived using the short-time Fourier
            transform of a nonstationary signal.
               Stationary signals are constant in their statistical parameters over time [13].
            To extract the first set of statistical features, each waveform is divided into three
            equal segments, each having a length of 20 μs, and six features are derived for
            each segment. This generates a total of 18 features per waveform; therefore, the
            1375-dimensional raw waveform data are now transformed into an
            18-dimensional feature set. The following statistical features were
            derived for each of the three segments:

            • Energy is defined as the sum of the squared amplitudes of the signal.
               The energy transmitted through a fracture depends on various factors,
               such as the fracture stiffness, contact area, and type of fracture filling.
            • Kurtosis is a descriptor of the distribution of the amplitudes relative to the
               center of the distribution. It measures whether the data are heavy-tailed or
               light-tailed compared with a normal distribution. Data sets with high
               kurtosis tend to have heavy tails, indicating many outliers.
            • Shape factor is the ratio of the root mean square (RMS) value to the average
               (arithmetic mean) of absolute amplitudes. Shape factor is representative of a
               signal type; for example, sine wave, square wave, and Gaussian white noise
               have shape factors of 1.11, 1, and approximately 1.25, respectively.
            • Crest factor is the ratio of the peak value to the root mean square (RMS) of a
               waveform. It indicates how extreme the peaks are in a waveform. Crest
               factor is representative of a signal type; for example, sine wave, square
               wave, and Gaussian white noise have crest factors of 1.414, 1, and
               infinity, respectively.
            • Dominant frequency is the frequency of maximum amplitude (the one that
               carries the most energy) in the frequency spectrum obtained by applying
               the fast Fourier transform (FFT) to the signal. The FFT converts a signal
               from the time domain to the frequency domain by decomposing the sequence
               of values into components of different frequencies. Fractures act as
               low-pass filters [5]; that is, transmission of sonic waves through fractures
               attenuates the high frequencies in the signal. When the dominant
               frequencies of fractured and intact rock are compared, a reduction in the
               dominant frequency indicates wave propagation through a fractured zone.
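
            The segment-wise extraction described above can be sketched in a few lines
            of NumPy. This is a minimal illustration under stated assumptions, not the
            implementation used in the study: the function names segment_features and
            waveform_features are hypothetical, the sampling rate is inferred from 1375
            samples spanning 60 μs, kurtosis is taken in the Pearson convention (a normal
            distribution gives 3), and the mean is removed before the FFT so that the
            zero-frequency bin does not dominate. Only the five features listed here are
            computed, so the sketch returns 15 values per waveform rather than 18.

```python
import numpy as np


def segment_features(x, fs):
    """Compute five statistical features for one waveform segment.

    x  : 1-D array of amplitudes
    fs : sampling rate in Hz (an assumption; not stated in the text)
    """
    x = np.asarray(x, dtype=float)
    rms = np.sqrt(np.mean(x ** 2))           # root mean square
    mean_abs = np.mean(np.abs(x))            # mean of absolute amplitudes
    centered = x - x.mean()
    variance = np.mean(centered ** 2)

    # Amplitude spectrum of the mean-removed segment (assumed preprocessing).
    spectrum = np.abs(np.fft.rfft(centered))
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)

    return {
        "energy": np.sum(x ** 2),
        "kurtosis": np.mean(centered ** 4) / variance ** 2,
        "shape_factor": rms / mean_abs,
        "crest_factor": np.max(np.abs(x)) / rms,
        "dominant_frequency": freqs[np.argmax(spectrum)],
    }


def waveform_features(waveform, fs, n_segments=3):
    """Split a waveform into near-equal segments and stack their features."""
    segments = np.array_split(np.asarray(waveform, dtype=float), n_segments)
    features = []
    for segment in segments:
        features.extend(segment_features(segment, fs).values())
    return np.array(features)


if __name__ == "__main__":
    n_samples = 1375
    fs = n_samples / 60e-6                   # assumed: 1375 samples over 60 microseconds
    t = np.arange(n_samples) / fs
    # Synthetic decaying 1 MHz tone standing in for a measured waveform.
    demo = np.exp(-t / 20e-6) * np.sin(2 * np.pi * 1e6 * t)
    print(waveform_features(demo, fs).shape)
```

            Because 1375 is not divisible by three, np.array_split yields segments of
            459, 458, and 458 samples; a real workflow might instead trim or resample
            the waveform so that the three segments are exactly equal.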