Page 138 - Machine Learning for Subsurface Characterization
to the performance of the ANN model on the testing dataset. The second step of
training learns to improve the accuracy of synthesizing each DD log by sequentially
synthesizing the DD logs in accordance with the rank assigned in the first step,
such that higher-ranked DD logs along with the conventional logs are used
to synthesize the lower-ranked DD logs. Sequential synthesis of each DD log
is done using one ANN; consequently, eight ANN models are implemented
to sequentially generate the eight DD logs one at a time. During the training
stage, the original higher-ranked DD logs are fed as inputs to learn to generate
the lower-ranked DD log, whereas, during the testing and deployment stages,
the predicted higher-ranked DD logs are fed as inputs to generate the lower-
ranked DD log. For these 8 ANN models used in the second step of prediction,
the ith ANN model processes 14 + i inputs (comprising 15 conventional logs and
i − 1 DD logs) to generate only 1 output DD log, where i varies from 1 to 8. In
other words, 1 DD log of a specific rank is synthesized by a corresponding ANN
model that processes all the previously predicted or measured higher-ranked
DD logs and the conventional input logs. For example, in this second step,
the eighth ANN model processes 22 logs to generate the lowest-ranked DD
log. All ANN models have two hidden layers and a varying number of neurons
corresponding to the number of inputs and outputs.
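The second-step cascade described above can be sketched as follows. This is a hedged illustration, not the authors' implementation: the data are synthetic placeholders, and scikit-learn's `MLPRegressor` (with two hidden layers, as in the text) stands in for the book's ANN models.

```python
# Sketch of the second-step cascade: eight ANNs, the ith fed the 15
# conventional logs plus the i-1 higher-ranked DD logs. All names, shapes,
# and layer sizes are illustrative assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n_samples, n_conventional, n_dd = 200, 15, 8
conventional = rng.normal(size=(n_samples, n_conventional))
dd_logs = rng.normal(size=(n_samples, n_dd))  # columns ordered by rank (col 0 = highest)

# Training stage: the MEASURED higher-ranked DD logs are appended to the inputs.
models = []
for i in range(n_dd):
    X = np.hstack([conventional, dd_logs[:, :i]])  # 15 + i columns for the (i+1)th model
    net = MLPRegressor(hidden_layer_sizes=(20, 10), max_iter=300, random_state=0)
    net.fit(X, dd_logs[:, i])
    models.append(net)

# Testing/deployment stage: the PREDICTED higher-ranked DD logs feed each next model.
def synthesize_dd(conv_row):
    preds = []
    for net in models:
        x = np.concatenate([conv_row, preds])[None, :]
        preds.append(net.predict(x)[0])
    return np.array(preds)
```

The eighth model in this sketch sees 15 + 7 = 22 inputs, matching the count stated in the text for the lowest-ranked DD log.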
Several algorithms can be utilized as the training function to adjust the weights
and biases of the neurons to minimize the target function of the ANN model. The
target function quantifies the prediction error. We use the
sum of squared errors (SSE) as the target function, expressed as

\[
\mathrm{SSE} = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 + \lambda \sum_{j=1}^{P} \sigma_j^2 \tag{4.4}
\]

where n is the number of samples, P is the number of outputs/targets per sample,
λ is the penalty parameter, y_i is the original output/target vector, ŷ_i is the
estimated output/target vector, and σ_j^2 is the variance of predicted output/target j.
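Eq. (4.4) can be evaluated directly in NumPy. The sketch below is ours, not the authors': it assumes the outputs are arranged as an (n samples × P targets) array, and the function name `sse_target` is a hypothetical label.

```python
import numpy as np

def sse_target(y_true, y_pred, lam=0.05):
    """Eq. (4.4): sum of squared errors plus a penalty on the variance of
    each predicted output, weighted by the penalty parameter lambda.
    Arrays are assumed to have shape (n_samples, P)."""
    squared_error = np.sum((y_true - y_pred) ** 2)           # first term
    variance_penalty = lam * np.sum(np.var(y_pred, axis=0))  # lambda * sum_j sigma_j^2
    return squared_error + variance_penalty

y_true = np.array([[1.0, 2.0], [3.0, 4.0]])
y_pred = np.array([[1.0, 2.0], [3.0, 4.0]])
# Perfect predictions: the error term vanishes and only the variance penalty remains.
print(sse_target(y_true, y_pred, lam=0.1))  # prints 0.2
```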
A regularization method is utilized to introduce the penalty parameter λ into the
target function to avoid overfitting and ensure a balanced bias-variance tradeoff
for the data-driven model. Overfitting is a critical issue when minimizing the
target function: an overfit ANN model cannot generalize the relationship
between inputs/features and outputs/targets and becomes sensitive to noise.
Overfitting is evident when the testing performance (referred to as the generali-
zation performance) is significantly worse than the training performance (referred
to as the memorization performance). λ ranges from 0.01 to 0.2 in our models
and is set by trial and error.
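The trial-and-error selection of λ can be sketched as a validation sweep over the stated 0.01–0.2 range. Note the assumptions: the data are synthetic placeholders, and scikit-learn's `alpha` is a standard L2 weight penalty used here as a stand-in for the variance penalty of Eq. (4.4), which `MLPRegressor` does not implement directly.

```python
# Hypothetical lambda sweep: keep the penalty value with the best validation
# (generalization) error. Model, data, and candidate values are illustrative.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 15))
y = X @ rng.normal(size=15) + 0.1 * rng.normal(size=300)
X_train, X_val = X[:200], X[200:]
y_train, y_val = y[:200], y[200:]

best_lam, best_err = None, np.inf
for lam in [0.01, 0.05, 0.1, 0.2]:
    net = MLPRegressor(hidden_layer_sizes=(16, 8), alpha=lam,
                       max_iter=1000, random_state=0)
    net.fit(X_train, y_train)
    err = np.mean((net.predict(X_val) - y_val) ** 2)  # validation MSE
    if err < best_err:
        best_lam, best_err = lam, err

print(best_lam)
```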
Scaled conjugate gradient (CG) backpropagation [12] is selected as the
training function because of its short training time. For example, the
training time of the ANN model implemented in the first step with
Levenberg-Marquardt (LM) backpropagation as the training function is
twice that with scaled CG backpropagation. Each ANN model is trained