
corresponding to most of the correlated features to zero, resulting in a nonunique, less generalizable model that depends on very few features.
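A minimal sketch of this behavior is shown below (assuming scikit-learn; the data, feature count, and regularization strength are synthetic and purely illustrative). Lasso fitted to a group of nearly identical features typically retains only one or two of them:

```python
# Illustrative sketch: lasso zeroes out most coefficients among
# strongly correlated features (synthetic data, assumed setup).
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
base = rng.normal(size=(200, 1))
X = base + 0.01 * rng.normal(size=(200, 5))   # five nearly identical features
y = X.sum(axis=1) + 0.1 * rng.normal(size=200)

coef = Lasso(alpha=0.1).fit(X, y).coef_
print(coef)  # typically only one or two nonzero entries
```

Which of the correlated features survives depends on the noise realization, which is what makes the selected model nonunique.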



             3.4 Support vector regressor
Support vector regression (SVR) is based on the support vector classifier (SVC). SVC classifies the dataset by finding a hyperplane and decision boundaries that maximize the margin of separation between data points belonging to different classes/groups. Unlike SVC, SVR is used for regression tasks. SVR processes data to learn the coefficients that define a hyperplane such that the cost associated with data points lying a certain distance away from the hyperplane is minimized. The regression model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training data within a certain margin around the hyperplane. Only the points outside the margin contribute to the final cost associated with the model. OLS minimizes the error in model prediction, whereas SVR fits the error in model prediction within a certain threshold. The SVR model is built by minimizing the mismatch between the predicted and true target values while keeping the mapping function as smooth as possible, which is formulated as

$$\text{minimize} \quad \frac{1}{2}\lVert w \rVert^{2} + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^{*}\right) \qquad (8.4)$$

$$\text{subject to} \quad \begin{cases} y_i - w^{T}\phi(x_i) - b \le \varepsilon + \xi_i \\ w^{T}\phi(x_i) + b - y_i \le \varepsilon + \xi_i^{*} \\ \xi_i,\ \xi_i^{*} \ge 0 \end{cases} \qquad (8.5)$$
where $\varepsilon$ is the error we can tolerate in the high-dimensional space that defines the margin around the hyperplane, $\xi_i$ and $\xi_i^{*}$ are slack variables introduced for cases when the optimization in the high-dimensional space with an error limit of $\varepsilon$ is not feasible, and $\phi$ is a kernel function that maps the input dataset from the current space to a higher-dimensional space, like the kernel function in SVC. We use the radial basis function (RBF) as the kernel function. The SVR algorithm can predict only one target. We therefore trained 64 different SVR models to generate the entire NMR T2 distribution comprising T2 amplitudes for 64 bins, such that each SVR model predicts the T2 amplitude for one of the 64 NMR T2 bins.
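A minimal sketch of this one-model-per-bin workflow follows (assuming scikit-learn; the feature count, hyperparameter values, and synthetic data are illustrative assumptions, not the exact setup used in this study):

```python
# Sketch: train one RBF-kernel epsilon-SVR per NMR T2 bin, since SVR
# predicts a single target. Data are synthetic stand-ins for well logs.
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))    # e.g., 10 log-derived features per depth
Y = rng.normal(size=(500, 64))    # 64 T2-bin amplitudes per depth

scaler = StandardScaler().fit(X)  # SVR is sensitive to feature scale
X_scaled = scaler.transform(X)

# C weights the slack penalty in Eq. (8.4); epsilon sets the zero-cost
# margin around the hyperplane in Eq. (8.5). Values here are assumed.
models = [
    SVR(kernel="rbf", C=1.0, epsilon=0.1).fit(X_scaled, Y[:, j])
    for j in range(Y.shape[1])
]

# Assemble the full T2 distribution by stacking the 64 per-bin predictions.
Y_pred = np.column_stack([m.predict(X_scaled) for m in models])
print(Y_pred.shape)  # (500, 64)
```

scikit-learn's MultiOutputRegressor wraps the same one-model-per-target loop behind a single estimator interface, which is a common alternative to managing the 64 models by hand.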
             3.5  k-Nearest neighbor regressor
k-Nearest neighbor regressor (kNNR) is based on the nearest neighbor classifier. The kNN classifier first calculates the distances of all the training data points from an unclassified data point, then selects the k nearest points to the unclassified data point, and finally assigns a class to the unclassified data