synthesize the targets by processing the features. The regularization/penalty parameter in the ElasticNet and LASSO models and the number of neurons, the activation function, and the number of hidden layers in the ANN model are examples of hyperparameters. On the other hand, the weights of the neurons in the ANN model and the coefficients multiplying the features in the OLS and LASSO models are examples of parameters. Hyperparameters control the learning process, whereas the model parameters are the outcome of the learning process. Parameters are computed during training; hyperparameters are set before training and modified after comparing the generalization error with the memorization (training) error. During machine learning, one goal is to reach the sweet spot where the memorization error (training error), the generalization error (testing error), and the difference between these two errors are all below certain thresholds.
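
To make the distinction concrete, the following minimal sketch (an illustration, not the chapter's code; the synthetic data and the penalty value are assumptions) shows a hyperparameter being set before training and the parameters being read off after training, using scikit-learn's LASSO implementation:

```python
# Minimal sketch of hyperparameters vs. parameters (illustrative assumptions:
# synthetic data, alpha value); uses scikit-learn's Lasso.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 13))                 # 500 depths, 13 feature logs
y = X @ rng.normal(size=13) + 0.1 * rng.normal(size=500)

# Hyperparameter: the regularization/penalty strength, set BEFORE training;
# it controls the learning process.
model = Lasso(alpha=0.01)

# Parameters: the coefficients multiplying the features, computed DURING
# training; they are the outcome of the learning process.
model.fit(X, y)
print(model.coef_, model.intercept_)
```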


             2.4.1 Ordinary least squares (OLS) model
The OLS model assumes that the target $y_i$ is a linear combination of the features $x_{ip}$ and a residual error $\varepsilon_i$ at any given depth $i$. The target can then be formulated as
\[ y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i \tag{5.5} \]
where $i$ represents a specific depth and $p$ represents the number of “easy-to-acquire” logs available as features for training the model to synthesize the target log $y$. In our case, $p = 13$. In the training phase, the OLS model learns/computes the parameters $\beta$ that minimize the sum of squared errors (SSE) between the modeled and measured targets. SSE is expressed as
\[ \mathrm{SSE} = \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2 \tag{5.6} \]
where $\hat{y}_i$ is the target synthesized by the model for a specific depth $i$ and $n$ is the number of samples in the training dataset. In this study, the features $x_{ip}$ are the 13 raw logs at depth $i$, and the targets $y_i$ are the DTC and DTS sonic logs. OLS models tend to be adversely affected by outliers, noise in the data, and correlations among the features. Like other linear models, OLS is suited for small-sized, high-dimensional datasets, where the dimensionality of the dataset is the number of available features for each sample and the size of the dataset is the number of samples available for training the model.
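
As a concrete illustration of this training step, the sketch below (synthetic stand-in data, not the chapter's logs) fits an OLS model with scikit-learn's LinearRegression, which computes the $\beta$ parameters of Eq. (5.5) by minimizing the SSE of Eq. (5.6):

```python
# Minimal OLS sketch (synthetic stand-ins for the 13 raw logs and one sonic
# target); LinearRegression computes the betas that minimize the SSE.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n, p = 1000, 13                                # n depth samples, p feature logs
X = rng.normal(size=(n, p))                    # "easy-to-acquire" logs
y = X @ rng.normal(size=p) + 0.2 * rng.normal(size=n)  # stand-in for DTC (or DTS)

ols = LinearRegression().fit(X, y)             # learns the betas of Eq. (5.5)
y_hat = ols.predict(X)
sse = np.sum((y_hat - y) ** 2)                 # Eq. (5.6)
print(ols.coef_.shape, sse)
```

Passing a two-column target array instead of a single target would fit DTC and DTS jointly, since LinearRegression supports multioutput regression.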


             2.4.2 Partial least squares (PLS) model
Partial least squares regression is an extension of multiple linear regression for situations where the features are highly collinear and where there are fewer samples than features, that is, where the size of the dataset is much smaller than the dimensionality of the dataset. In such cases, the OLS model will tend to overfit, whereas the PLS model performs much better in terms of building a generalizable model. Scaling of the features and targets is crucial for
developing a robust PLS model. The PLS model learns to find the correlations between the features and the targets by projecting them onto a smaller set of latent components.
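
A minimal sketch of such a PLS workflow (assumptions: scikit-learn's PLSRegression, synthetic collinear data, and an illustrative component count), including the scaling the text calls crucial:

```python
# Minimal PLS sketch (assumed data and component count, not the chapter's
# code): predict two targets (stand-ins for DTC and DTS) from 13 collinear
# feature logs, with explicit standardization of the features.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 13))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=200)   # make two features collinear
Y = np.column_stack([X @ rng.normal(size=13), X @ rng.normal(size=13)])

# PLSRegression also centers and scales X and Y internally by default
# (scale=True); the explicit StandardScaler emphasizes the role of scaling.
pls = make_pipeline(StandardScaler(), PLSRegression(n_components=3))
pls.fit(X, Y)
print(pls.predict(X).shape)                       # (200, 2): one column per sonic log
```

The number of latent components retained (n_components) is the key PLS hyperparameter and would normally be tuned by comparing the training and testing errors, as discussed above.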