Page 260 - Machine Learning for Subsurface Characterization

            of errors between the predicted values Xθ and the measured values Y, which is
            expressed as

                \min_{\theta} \; \lVert X\theta - Y \rVert_2^2                    (8.1)
            where the feature X is a 2D array of raw/inverted logs for 475 depth points, where
            each depth is considered as a sample having either 10 inverted logs or 12 raw logs
            as features. The target Y is a 2D array of T2 amplitudes measured across 64 T2
            bins, which constitutes the NMR T2 distribution, for the 475 depth points.
            Coefficient vector θ is a 1D array of coefficients/parameters of the model that
            are computed by minimizing Eq. (8.1) during the model training. The OLS model
            learns to predict by minimizing the cost function shown in Eq. (8.1), which
            requires the minimization of the square of the L2 norm of the errors in the
            model prediction (Xθ − Y). θ is the outcome of the supervised learning of the
            OLS model. The OLS model does not have any hyperparameters.


            3.2 Least absolute shrinkage and selection operator
            LASSO model is an extension of the OLS model, where a regularization/penalty
            term α‖θ‖₁ is added to the cost/loss function of the OLS model:

                \min_{\theta} \; \lVert X\theta - Y \rVert_2^2 + \alpha \lVert \theta \rVert_1                    (8.2)
            where α is a hyperparameter referred to as the penalty parameter and ‖θ‖₁ is
            the L1 norm of the coefficient/parameter vector. The penalty term prevents
            overfitting and ensures that the LASSO model neglects correlated features.
            According to Eq. (8.2), when minimizing the cost function for building the
            LASSO model, the L1 norm of the coefficient vector θ needs to be minimized
            along with the square of the L2 norm of the errors in the model prediction
            (Xθ − Y). Minimization of the L1 norm drives the majority of the coefficients
            in the coefficient vector θ to zero.
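            The sparsity induced by the L1 penalty can be illustrated with scikit-learn's Lasso on fabricated data (not the chapter's logs): of 12 candidate features, only 3 actually influence the target, and the fit drives the remaining coefficients to exactly zero. Note that scikit-learn's implementation scales the squared-error term by 1/(2n) in its cost function, so its alpha is not numerically identical to α in Eq. (8.2).

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)

# 475 samples, 12 candidate features; only features 0, 4, and 9 matter.
X = rng.normal(size=(475, 12))
true_theta = np.zeros(12)
true_theta[[0, 4, 9]] = [2.0, -1.5, 3.0]
y = X @ true_theta + 0.1 * rng.normal(size=475)

# LASSO: minimize the squared-error term plus alpha * ||theta||_1.
# alpha is the penalty hyperparameter; larger alpha -> sparser theta.
lasso = Lasso(alpha=0.1).fit(X, y)

n_zero = int(np.sum(np.abs(lasso.coef_) < 1e-8))
print(n_zero)  # the irrelevant coefficients are driven to exactly zero
```

            This exact-zeroing of coefficients is what makes LASSO act as a feature selector as well as a regression model.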

            3.3 ElasticNet

            ElasticNet is an extension of the OLS and LASSO formulations. The cost function
            of the ElasticNet includes the L1 and L2 norms of the coefficient vector θ,
            which comprises the coefficients/parameters learnt by the ElasticNet model.
            The cost function of the ElasticNet model is formulated as


                \min_{\theta} \; \lVert X\theta - Y \rVert_2^2 + \alpha\rho \lVert \theta \rVert_1 + \alpha(1-\rho) \lVert \theta \rVert_2^2                    (8.3)

            where ρ balances the overall penalty due to the L1 and L2 norms of the model
            parameters and ‖θ‖₂ is the L2 norm of the coefficient/parameter vector.
            Compared with LASSO, the ElasticNet model has a more unique solution and
            does not severely penalize correlated features. Unlike ElasticNet, the
            LASSO model drastically reduces the parameters
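            The contrast between the mixed penalty and a pure L1 penalty can be sketched with scikit-learn's ElasticNet on fabricated data with two strongly correlated features. scikit-learn's parameterization differs slightly from Eq. (8.3) (it scales the squared error by 1/(2n) and halves the L2 term), with l1_ratio playing the role of ρ.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(2)

# Two nearly identical (strongly correlated) features plus three independent ones.
x1 = rng.normal(size=475)
X = np.column_stack([x1,
                     x1 + 0.01 * rng.normal(size=475),
                     rng.normal(size=(475, 3))])
y = 2.0 * x1 + 0.1 * rng.normal(size=475)

# l1_ratio (the role of rho) balances the L1 and L2 penalties.
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
lasso_like = ElasticNet(alpha=0.1, l1_ratio=1.0).fit(X, y)  # pure L1 penalty

# The L2 component spreads weight across the correlated columns, whereas a
# pure L1 penalty tends to keep only one of them.
print(np.round(enet.coef_, 2))
print(np.round(lasso_like.coef_, 2))
```

            In this sketch the mixed penalty assigns comparable nonzero weights to both correlated columns, which is the "does not severely penalize correlated features" behavior described above.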