Page 260 - Machine Learning for Subsurface Characterization
of errors between predicted values Xθ and measured values Y, which is
expressed as
$$\min_{\theta} \lVert X\theta - Y \rVert_2^2 \qquad (8.1)$$
where the feature matrix X is a 2D array of raw or inverted logs for 475 depth
points. Each depth is treated as a sample whose features are either 10 inverted
logs or 12 raw logs. The target Y is a 2D array of T2 amplitudes measured across
64 T2 bins, which together constitute the NMR T2 distribution, for the 475 depth
points. The coefficient vector θ is a 1D array of coefficients/parameters of the
model that are computed by minimizing Eq. (8.1) during model training. The OLS
model learns to predict by minimizing the cost function shown in Eq. (8.1), that
is, the square of the L2 norm of the errors in the model prediction (‖Xθ − Y‖).
θ is the outcome of the supervised learning of the OLS model. The OLS model does
not have any hyperparameters.
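As a rough illustration of Eq. (8.1), the sketch below fits an OLS model with NumPy on synthetic data whose shapes mirror the description above (475 depths, 10 logs, 64 T2 bins). The random data, the noise level, and all variable names are assumptions made only for this example, and no intercept term is included because Eq. (8.1) has none. With a 64-bin target, the fit produces one coefficient vector per T2 bin, stored here as the columns of a 10 × 64 array.

```python
import numpy as np

# Synthetic stand-in for the logs and T2 amplitudes (assumed data, not the
# measurements discussed in the text).
rng = np.random.default_rng(0)
n_depths, n_logs, n_bins = 475, 10, 64

X = rng.normal(size=(n_depths, n_logs))          # features: one row per depth
theta_true = rng.normal(size=(n_logs, n_bins))   # hypothetical true coefficients
Y = X @ theta_true + 0.01 * rng.normal(size=(n_depths, n_bins))

# Least-squares solution: minimizes the sum of squared errors of X @ theta - Y.
theta_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)

# Value of the cost function in Eq. (8.1) at the fitted coefficients.
cost = np.sum((X @ theta_ols - Y) ** 2)

# At the minimum, the gradient of the cost with respect to theta vanishes.
gradient = 2.0 * X.T @ (X @ theta_ols - Y)
```

Because there is no penalty term and no hyperparameter, the fit is determined entirely by X and Y.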
3.2 Least absolute shrinkage and selection operator
The LASSO model is an extension of the OLS model in which a regularization/penalty
term α‖θ‖₁ is added to the cost/loss function of the OLS model:
$$\min_{\theta} \lVert X\theta - Y \rVert_2^2 + \alpha \lVert \theta \rVert_1 \qquad (8.2)$$
where α is a hyperparameter referred to as the penalty parameter and ‖θ‖₁ is the
L1 norm of the coefficient/parameter vector. The penalty term helps prevent
overfitting and ensures that the LASSO model neglects correlated features.
According to Eq. (8.2), building the LASSO model requires minimizing the L1 norm
of the coefficient vector θ together with the square of the L2 norm of the errors
in the model prediction (‖Xθ − Y‖). Minimizing the L1 norm tends to drive the
majority of the coefficients in the coefficient vector θ to zero.
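To make the zeroing effect of the L1 penalty concrete, the following sketch minimizes the cost in Eq. (8.2) for a single target column using cyclic coordinate descent with soft-thresholding. Coordinate descent is one common LASSO solver and is used here only as an assumption; the text does not state which solver was used. The synthetic data, the value α = 100, and the function names are likewise hypothetical. Some libraries scale the squared-error term (for example by 1/(2n)), so α values are not directly comparable across implementations.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 if |value| <= threshold."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimize ||X @ theta - y||_2^2 + alpha * ||theta||_1 one coefficient at a time."""
    n_features = X.shape[1]
    theta = np.zeros(n_features)
    col_sq_norm = np.sum(X ** 2, axis=0)
    for _ in range(n_iter):
        for j in range(n_features):
            # Residual with the contribution of feature j removed.
            partial_residual = y - X @ theta + X[:, j] * theta[j]
            rho = X[:, j] @ partial_residual
            # Exact minimizer of the cost along coordinate j.
            theta[j] = soft_threshold(rho, alpha / 2.0) / col_sq_norm[j]
    return theta

# Synthetic example: only 2 of the 10 features actually influence the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(475, 10))
theta_true = np.zeros(10)
theta_true[0], theta_true[1] = 3.0, -2.0
y = X @ theta_true + 0.1 * rng.normal(size=475)

theta_unpenalized = lasso_coordinate_descent(X, y, alpha=0.0)  # reduces to OLS
theta_sparse = lasso_coordinate_descent(X, y, alpha=100.0)     # moderate penalty

# Above this penalty level every coefficient is thresholded to zero.
alpha_max = 2.0 * np.max(np.abs(X.T @ y))
theta_all_zero = lasso_coordinate_descent(X, y, alpha=1.01 * alpha_max)
```

Comparing `theta_unpenalized` with `theta_sparse` shows the trade-off controlled by α: a larger penalty shrinks the L1 norm of θ and sets the coefficients of uninformative features to exactly zero, at the price of a slightly larger squared error.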
3.3 ElasticNet
ElasticNet is an extension of the OLS and LASSO formulations. The cost function
of ElasticNet includes both the L1 and L2 norms of the coefficient vector θ,
which comprises the coefficients/parameters learned by the ElasticNet model. The
cost function of the ElasticNet model is formulated as
$$\min_{\theta} \lVert X\theta - Y \rVert_2^2 + \alpha\rho \lVert \theta \rVert_1 + \alpha(1-\rho) \lVert \theta \rVert_2^2 \qquad (8.3)$$
where ρ balances the overall penalty between the L1 and L2 norms of the model
parameters and ‖θ‖₂ is the L2 norm of the coefficient/parameter vector. Compared
with LASSO, the ElasticNet model tends to have a more uniquely determined solution
and does not severely penalize correlated features. Unlike ElasticNet, the LASSO
model drastically reduces the parameters