Page 137 - Machine Learning for Subsurface Characterization


Stacked neural network architecture Chapter 4 113

Before training the ANN models, data preprocessing is necessary to facilitate
reliable and robust model predictions. Around 2% of depths in the training
and testing datasets exhibited outlier behavior; more specifically, these
outliers are mostly point outliers. A few examples of such outliers are depths where
gamma ray responses are close to 1000 API units or shear-wave travel time
responses are higher than 800 μs/ft. Depths exhibiting outlier responses were
removed using a simple variance-based method. Outlier removal is essential
for obtaining robust NRMSE values. Following outlier detection and removal,
the features/inputs and targets/outputs were normalized using the MinMax scaler to
values between -1 and 1. It is uncommon to scale targets, but features are always
scaled. MinMax scaling of the features guarantees stable convergence of the
ANN model parameters (i.e., weights and biases) during the neural network
optimization [11]. The MinMax scaler should not be applied when there are outliers
or when a feature has a large variance. In the absence of these limiting conditions,
the MinMax scaler is well suited for developing robust neural network models.
MinMax scaling is performed based on the following equation:

$$D_{sc,i,j} = 2\,\frac{D_{i,j} - D_{j,\min}}{D_{j,\max} - D_{j,\min}} - 1 \qquad (4.3)$$
where $D_{i,j}$ is the original value of log j measured at a specific depth i, $D_{j,\max}$ and
$D_{j,\min}$ are the maximum and minimum values of log j in the entire training
dataset, and $D_{sc,i,j}$ is the scaled value of log j computed for the specific
depth i using Eq. (4.3). $D_{i,j}$ can come from the training dataset, the testing dataset, or
a new deployment dataset, but $D_{j,\max}$ and $D_{j,\min}$ should always come from the
training dataset for log j. Note that every data preprocessing method should
first be applied to the training dataset alone, to learn the statistical parameters
or mathematical transformations that the preprocessing requires (e.g., for the
MinMax scaler, the parameters are $D_{j,\max}$ and $D_{j,\min}$). Those same parameters
or transformations should then be used to preprocess the testing dataset or a
new deployment dataset.
Data preprocessing should not be done on the entire dataset prior to the split.
After the split, data preprocessing should not be done separately on each split
by separately learning the statistical parameters or mathematical transformations
for each specific split.
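
The preprocessing sequence described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the text does not specify the variance-based outlier rule, so a mean plus or minus three standard deviations bound is assumed here, and the function names, the synthetic two-log dataset, and the `n_std` parameter are all hypothetical. What the sketch does follow from the text is that every parameter (outlier bounds, minimum, maximum) is learned from the training split only and then reused on the testing split, with the scaling itself following Eq. (4.3).

```python
import numpy as np


def fit_outlier_bounds(train, n_std=3.0):
    """Learn per-log bounds (mean +/- n_std * std) from the training data only.

    n_std=3.0 is an assumed threshold; the text does not state the one used.
    """
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    return mean - n_std * std, mean + n_std * std


def drop_outlier_depths(data, lower, upper):
    """Keep only depths (rows) where every log lies inside the learned bounds."""
    keep = np.all((data >= lower) & (data <= upper), axis=1)
    return data[keep]


def fit_minmax(train):
    """Learn the per-log minimum and maximum from the training data only."""
    return train.min(axis=0), train.max(axis=0)


def apply_minmax(data, d_min, d_max):
    """Eq. (4.3): scale each log using the training minimum and maximum."""
    return 2.0 * (data - d_min) / (d_max - d_min) - 1.0


# Synthetic stand-in for two logs (columns) measured at many depths (rows).
rng = np.random.default_rng(0)
train = rng.normal(loc=[80.0, 200.0], scale=[20.0, 40.0], size=(500, 2))
train[10, 0] = 1000.0  # injected point outlier, e.g. a gamma ray spike
test = rng.normal(loc=[80.0, 200.0], scale=[20.0, 40.0], size=(100, 2))

# Step 1: learn outlier bounds on the training split, apply them to both splits.
lower, upper = fit_outlier_bounds(train)
train_clean = drop_outlier_depths(train, lower, upper)
test_clean = drop_outlier_depths(test, lower, upper)

# Step 2: learn min and max on the cleaned training split, apply to both splits.
d_min, d_max = fit_minmax(train_clean)
train_scaled = apply_minmax(train_clean, d_min, d_max)
test_scaled = apply_minmax(test_clean, d_min, d_max)
```

By construction the scaled training logs span exactly -1 to 1, whereas scaled testing or deployment values can fall slightly outside that range because the minimum and maximum come from the training data. With scikit-learn, `MinMaxScaler(feature_range=(-1, 1))` fitted on the training split and then used to transform the other splits would play the same role as `fit_minmax` and `apply_minmax`.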

2.5 ANN models for dielectric dispersion log generation
Nine ANN models are implemented in the proposed predictive method that
trains and tests the stacked neural network model. The first ANN model, with
15 inputs and 8 outputs, learns to simultaneously synthesize the 8 DD logs;
following that, each DD log is ranked according to the accuracy with which it
was simultaneously synthesized. The accuracy used for ranking corresponds
