
balances the importance of the SSE term against the regularization term,
which is the L1 norm of the coefficient vector.
   As α increases, the regularization term forces the coefficient vector to
become sparser. α is the hyperparameter of the LASSO model and is
optimized by testing a range of values; we select α = 4.83, for which the
LASSO model achieves the best generalization performance. The
corresponding R² values for the DTC and DTS predictions are 0.79 and
0.75, respectively.
For α = 4.83, the LASSO model learns the value of the parameter
(coefficient) for each feature (input log), as listed in Table 5.1. The
LASSO-derived coefficients for 6 of the 13 features are 0. Logs with
coefficient values close to zero (Table 5.1) are either well correlated with
other logs or less important for the desired log synthesis than the logs
with nonzero coefficients. A similar redundancy in features was noticed
with the PLS model. Table 5.1 indicates that the shallow-resistivity logs,
deep-resistivity logs, and density porosity log are not essential for
synthesizing the desired DTC and DTS logs (Fig. 5.2). The shallow-resistivity
(RLA0 and RLA1) and deep-resistivity (RLA4 and RLA5) logs are correlated
with the medium-sensing RLA2 and RLA3 logs. Further, RLA2 and RLA3,
being medium sensing, have a depth of investigation similar to that of the
sonic DTC and DTS logs; consequently, the LASSO model uses RLA2 and
RLA3, while the other resistivity logs are not used for the synthesis of the
DTC and DTS logs.
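
   As an illustration, a minimal sketch of this α search with scikit-learn's
Lasso follows; scikit-learn uses the same 1/(2n) scaling of the SSE term as
the objective above, so its alpha corresponds directly to α. The synthetic
data, train/test split, and α grid are placeholders standing in for the study's
13 input logs and two targets (DTC and DTS), not the authors' actual setup.

import numpy as np
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Stand-in data: 13 input logs and 2 targets (DTC, DTS); the study uses
# measured well logs, not random numbers.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 13))
true_w = np.zeros((13, 2))
true_w[:7] = rng.normal(size=(7, 2))        # only 7 features truly matter
y = X @ true_w + 0.1 * rng.normal(size=(1000, 2))
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Test a range of alpha values and keep the one with the best
# generalization (R^2) on held-out data, as described in the text.
best_alpha, best_r2 = None, -np.inf
for a in np.logspace(-2, 2, 50):
    m = Lasso(alpha=a, max_iter=10000).fit(X_train, y_train)
    r2 = r2_score(y_test, m.predict(X_test))
    if r2 > best_r2:
        best_alpha, best_r2 = a, r2

# Refit at the selected alpha (the chapter reports 4.83) and inspect which
# coefficients the L1 penalty drives exactly to zero (redundant logs).
m = Lasso(alpha=best_alpha, max_iter=10000).fit(X_train, y_train)
print(best_alpha, np.round(m.coef_, 3))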


2.4.4 ElasticNet model
Similar to LASSO, the ElasticNet algorithm uses a regularization term to
penalize the coefficients of correlated and nonessential features. The
ElasticNet model learns a linear relationship between the features and
targets using a regularization term that is a weighted sum of the L1 and
L2 norms of the coefficients. Unlike the LASSO model, the ElasticNet
model preserves certain groups of correlated features, which improves the
precision and repeatability of the predictions, because it does not penalize
correlated features as severely as the LASSO model. For high-dimensional
data with highly correlated variables, the ElasticNet algorithm generates a
more unique model than the LASSO model. The objective function of the
ElasticNet model is formulated as
\min_{w} \; \frac{1}{2n} \lVert Xw - y \rVert_2^2 + \alpha_1 \lVert w \rVert_1 + \alpha_2 \lVert w \rVert_2^2 \qquad (5.8)
where the penalty parameters α_1 and α_2 are the hyperparameters of the
ElasticNet model, determined through optimization to be 4.8 and 0.1,
respectively. This is consistent with the findings of the LASSO model,
because α_2 is small and α_1 is almost equal to the α of the LASSO
model. The dataset used in our study is not high-dimensional, and the
benefits of the ElasticNet model over the LASSO model are observed only
for high-dimensional datasets.
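
   For reference, a hedged sketch of fitting this model with scikit-learn
follows. scikit-learn's ElasticNet parameterizes its penalty as
alpha*l1_ratio*||w||_1 + 0.5*alpha*(1 - l1_ratio)*||w||_2^2, so, assuming
Eq. (5.8) as written (no 1/2 factor on the L2 term), the reported
(α_1, α_2) = (4.8, 0.1) map to alpha = α_1 + 2α_2 = 5.0 and
l1_ratio = α_1/alpha = 0.96. The data below are again synthetic stand-ins,
not the study's well logs.

import numpy as np
from sklearn.linear_model import ElasticNet

alpha_1, alpha_2 = 4.8, 0.1          # hyperparameters reported above
sk_alpha = alpha_1 + 2.0 * alpha_2   # scikit-learn's alpha -> 5.0
l1_ratio = alpha_1 / sk_alpha        # scikit-learn's l1_ratio -> 0.96

# Synthetic stand-in for the 13 input logs and 2 targets (DTC, DTS).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 13))
y = X[:, :4] @ rng.normal(size=(4, 2)) + 0.1 * rng.normal(size=(1000, 2))

model = ElasticNet(alpha=sk_alpha, l1_ratio=l1_ratio, max_iter=10000)
model.fit(X, y)
# Unlike pure LASSO, correlated features tend to keep grouped, shrunken
# (rather than zeroed) weights because of the L2 component.
print(np.round(model.coef_, 3))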