Page 165 - Machine Learning for Subsurface Characterization

Robust geomechanical characterization Chapter  5 139

             FIG. 5.2 Bar plot of the estimated coefficients β for the LASSO model.



             2.4.5 Multivariate adaptive regression splines (MARS) model
             Linear models have two main advantages: they are easy and fast to compute,
             and their coefficients/parameters are intuitive to interpret.
             However, the strong assumption of linearity limits the predictive
             accuracy of linear models. MARS models the nonlinear relationship
             between features and targets by splitting the feature space into subspaces
             and then learning the linear relationship between features and targets within
             each subspace. MARS uses a divide-and-conquer strategy in which the
             training datasets are partitioned into separate piecewise linear segments
             (splines) of differing gradients (slope). These piecewise linear segments (or
             curves), also known as basis functions B_q(x), result in a flexible model that
             can handle both linear and nonlinear behavior. The points of connection C_q
             between the piecewise segments are called knots. By relating the features
             and targets using multiple independent linear regressions, the model can
             capture the nonlinear trends in the dataset. MARS evaluates each data point
             of each feature as a candidate knot that partitions the original feature
             space into two new subspaces. Then, for each subspace, MARS identifies the
             linear model with the candidate feature(s) that results in the smallest error.
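             The knot search described above can be sketched for the simplest case of
             one feature and one knot. This is only an illustration under stated
             assumptions, not the implementation used in this chapter: the function
             names hinge_pair and best_single_knot and the synthetic data are
             hypothetical, and the two linear pieces are expressed with the mirrored
             hinge functions max(0, x - c) and max(0, c - x) of the standard MARS
             formulation. A full MARS fit repeats this search over all features and
             many knots before pruning.

```python
import numpy as np


def hinge_pair(x, knot):
    """Return the mirrored hinge features max(0, x - knot) and max(0, knot - x)."""
    return np.maximum(0.0, x - knot), np.maximum(0.0, knot - x)


def best_single_knot(x, y):
    """Try every observed x value as a candidate knot; keep the smallest-error one."""
    best_knot, best_sse, best_coef = None, np.inf, None
    for knot in np.unique(x):
        right, left = hinge_pair(x, knot)
        # Design matrix: intercept plus one linear piece on each side of the knot.
        design = np.column_stack([np.ones_like(x), right, left])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((y - design @ coef) ** 2))
        if sse < best_sse:
            best_knot, best_sse, best_coef = knot, sse, coef
    return best_knot, best_sse, best_coef


# Synthetic data (illustrative only): the slope changes from +1 to -2 at x = 4.
x = np.linspace(0.0, 10.0, 101)
y = np.where(x < 4.0, x, 4.0 - 2.0 * (x - 4.0))

knot, sse, coef = best_single_knot(x, y)
print(f"selected knot: {knot:.2f}, sum of squared errors: {sse:.3e}")
```

             On this noise-free example the search should recover the knot where the
             slope changes. With noisy data, adding more knots keeps lowering the
             training error, which is why a pruning step is needed.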
             This partitioning continues until many knots are found, producing a
             highly nonlinear pattern that is a collection of linear models for
             individual subspaces. Increasing the number of knots allows a better fit to
             the training dataset; however, the learned relationship may not generalize
             well to new, unseen data. Knots that do not contribute significantly to
             predictive accuracy can be removed using a process known as “pruning.” MARS