synthesize the targets by processing the features. The regularization/penalty parameter in the ElasticNet and LASSO models and the number of neurons, the activation function, and the number of hidden layers in the ANN model are examples of hyperparameters. On the other hand, the weights of the neurons in the ANN model and the coefficients multiplying the features in the OLS and LASSO models are examples of parameters. Hyperparameters control the learning process, whereas the model parameters are the outcome of the learning process. Parameters are computed during training; hyperparameters are set before training and modified after comparing the generalization error with the memorization (training) error. During machine learning, one goal is to reach the sweet spot where the memorization error (training error), the generalization error (testing error), and the difference between these two errors are all below certain thresholds.
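
To make the distinction concrete, the following minimal sketch (an illustration, not the chapter's code; the synthetic data and the penalty value are assumptions) shows a hyperparameter being set before training and the parameters being read off after training, using scikit-learn's LASSO implementation:

```python
# Minimal sketch of hyperparameters vs. parameters (illustrative assumptions:
# synthetic data, alpha value); uses scikit-learn's Lasso.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 13))                 # 500 depths, 13 feature logs
y = X @ rng.normal(size=13) + 0.1 * rng.normal(size=500)

# Hyperparameter: the regularization/penalty strength, set BEFORE training;
# it controls the learning process.
model = Lasso(alpha=0.01)

# Parameters: the coefficients multiplying the features, computed DURING
# training; they are the outcome of the learning process.
model.fit(X, y)
print(model.coef_, model.intercept_)
```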


             2.4.1 Ordinary least squares (OLS) model
The OLS model assumes that the target $y_i$ is a linear combination of the features $x_{ip}$ and a residual error $\varepsilon_i$ at any given depth $i$. The target can then be formulated as
\[ y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \varepsilon_i \tag{5.5} \]
where $i$ represents a specific depth and $p$ represents the number of “easy-to-acquire” logs available as features for training the model to synthesize the target log $y$. In our case, $p = 13$. In the training phase, the OLS model learns/computes the parameters $\beta$ that minimize the sum of squared errors (SSE) between the modeled and measured targets. SSE is expressed as
\[ \mathrm{SSE} = \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2 \tag{5.6} \]
where $\hat{y}_i$ is the target synthesized by the model for a specific depth $i$ and $n$ is the number of samples in the training dataset. In this study, the features $x_{ip}$ are the 13 raw logs at depth $i$, and the targets $y_i$ are the DTC and DTS sonic logs. OLS models tend to be adversely affected by outliers, noise in the data, and correlations among the features. Like other linear models, OLS is suited for small-sized, high-dimensional datasets, where the dimensionality of the dataset is the number of available features for each sample and the size of the dataset is the number of samples available for training the model.
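
As a concrete illustration of this training step, the sketch below (synthetic stand-in data, not the chapter's logs) fits an OLS model with scikit-learn's LinearRegression, which computes the $\beta$ parameters of Eq. (5.5) by minimizing the SSE of Eq. (5.6):

```python
# Minimal OLS sketch (synthetic stand-ins for the 13 raw logs and one sonic
# target); LinearRegression computes the betas that minimize the SSE.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n, p = 1000, 13                                # n depth samples, p feature logs
X = rng.normal(size=(n, p))                    # "easy-to-acquire" logs
y = X @ rng.normal(size=p) + 0.2 * rng.normal(size=n)  # stand-in for DTC (or DTS)

ols = LinearRegression().fit(X, y)             # learns the betas of Eq. (5.5)
y_hat = ols.predict(X)
sse = np.sum((y_hat - y) ** 2)                 # Eq. (5.6)
print(ols.coef_.shape, sse)
```

Passing a two-column target array instead of a single target would fit DTC and DTS jointly, since LinearRegression supports multioutput regression.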


             2.4.2 Partial least squares (PLS) model
Partial least squares regression is an extension of multiple linear regression for situations where the features are highly collinear and where there are fewer samples than features, that is, where the size of the dataset is much smaller than the dimensionality of the dataset. In such cases, the OLS model will tend to overfit, whereas the PLS model performs much better in terms of building a generalizable model. Scaling of the features and targets is crucial for
developing a robust PLS model. The PLS model learns to find the correlations between the features and the targets by projecting them onto a smaller set of latent components.
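
A minimal sketch of such a PLS workflow (assumptions: scikit-learn's PLSRegression, synthetic collinear data, and an illustrative component count), including the scaling the text calls crucial:

```python
# Minimal PLS sketch (assumed data and component count, not the chapter's
# code): predict two targets (stand-ins for DTC and DTS) from 13 collinear
# feature logs, with explicit standardization of the features.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 13))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=200)   # make two features collinear
Y = np.column_stack([X @ rng.normal(size=13), X @ rng.normal(size=13)])

# PLSRegression also centers and scales X and Y internally by default
# (scale=True); the explicit StandardScaler emphasizes the role of scaling.
pls = make_pipeline(StandardScaler(), PLSRegression(n_components=3))
pls.fit(X, Y)
print(pls.predict(X).shape)                       # (200, 2): one column per sonic log
```

The number of latent components retained (n_components) is the key PLS hyperparameter and would normally be tuned by comparing the training and testing errors, as discussed above.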