we are solving a classification problem where the observation data fall into
two classes, colored pink and blue in Fig. 4.9A. The support vector classifier
seeks a linear boundary between the two classes of observed data points;
because these classes are not linearly separable, it performs quite poorly
(Fig. 4.9B).
When applying the SVM method, which is an extension of the support
vector classifier, the classification feature space is transformed in a specific
way using nonlinear functions, that is, kernels. If an SVM with a polynomial
kernel of the third degree is applied to the nonlinear distribution of data
points shown in Fig. 4.9A, the result is the significantly better-fitting
classification presented in Fig. 4.9C, which yields better decisions.
Furthermore, if the SVM is instead applied with the radial basis kernel,
the classification/decision boundary is captured even more accurately
(Fig. 4.9D).
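As a minimal sketch of this kernel comparison (not from the text; the synthetic data set and all parameter values are illustrative assumptions), the three decision boundaries can be fit with scikit-learn:

```python
# Sketch: contrast a linear support vector classifier with polynomial- and
# RBF-kernel SVMs on a nonlinearly separable two-class data set
# (analogous to Fig. 4.9A-D). Data and parameters are illustrative only.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.1, random_state=0)

classifiers = {
    "linear (cf. Fig. 4.9B)": SVC(kernel="linear"),
    "poly, degree 3 (cf. Fig. 4.9C)": SVC(kernel="poly", degree=3),
    "radial basis (cf. Fig. 4.9D)": SVC(kernel="rbf"),
}

for name, clf in classifiers.items():
    clf.fit(X, y)
    # Training accuracy: the linear boundary fits poorly, the kernels well.
    print(f"{name}: accuracy = {clf.score(X, y):.2f}")
```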
When applied to statistical regression problems, the SVM method is
referred to as support vector regression (SVR). The two techniques are
closely related and differ only in the class of problems to which they are
applied. In the case of SVR,
the regression function usually has the form (Zhong et al., 2015)
$$ f(\mathbf{x}) = y = \sum_{i=1}^{N} \left( \alpha_i - \alpha_i^{*} \right) \left( \mathbf{v}_i^{t}\, \mathbf{x} + 1 \right)^{p} + b \qquad (4.4) $$
where v_1, …, v_N are the N support vectors and b, p, α_i, and α_i^* are the
parameters of the model, which are optimized with respect to the
ε-insensitive loss (Zhong et al., 2015). During the parameter estimation,
the N support vectors are selected from the training data set. As in the
classification case, the nonlinear regression is solved by applying a kernel
function. For reference, the radial basis kernel function (mentioned
previously for classification) takes the form:
$$ K(\mathbf{v}_i, \mathbf{x}) = \exp\left( -\gamma \, \left| \mathbf{v}_i - \mathbf{x} \right|^{2} \right) \qquad (4.5) $$
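As a minimal sketch of SVR with the radial basis kernel of Eq. (4.5) (not from the text; the data set and the values of C, γ, and ε are illustrative assumptions):

```python
# Sketch: support vector regression with the radial basis kernel of
# Eq. (4.5). gamma corresponds to the kernel parameter; epsilon sets the
# width of the epsilon-insensitive loss band. Values are assumed examples.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
x = np.sort(rng.uniform(0.0, 5.0, 80)).reshape(-1, 1)
y = np.sin(x).ravel() + 0.1 * rng.standard_normal(80)

model = SVR(kernel="rbf", gamma=1.0, C=10.0, epsilon=0.1).fit(x, y)

# The N support vectors of Eq. (4.4) are the training points retained by
# the fit; the remaining parameters are estimated internally.
print("number of support vectors:", len(model.support_))
print("prediction at x = 2.5:", model.predict([[2.5]]))
```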
4.2.2.3 Random Forest
Random forest (RF) is an ensemble ML method that constructs a large
number of decorrelated decision trees, each built on a random selection of
predictor variables, and averages their predictions. For an in-depth
introduction to the concept of decision trees, see James et al. (2014). In their
fundamental formulation, decision trees have proven very successful in
solving classification problems of statistical learning; however, they are less
efficient for nonlinear regression.
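As a minimal sketch of the idea (not from the text; the data set and hyperparameter values are illustrative assumptions), a random forest regressor can be built as follows, with max_features restricting the predictors considered at each split so that the trees are decorrelated:

```python
# Sketch: a random forest as an ensemble of decorrelated trees whose
# predictions are averaged. Data and hyperparameters are assumed examples.
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=500, n_features=10, noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each of the 200 trees is grown on a bootstrap sample; at every split only
# a random subset of predictors (here sqrt of the total) is considered,
# which decorrelates the trees before their outputs are averaged.
forest = RandomForestRegressor(
    n_estimators=200, max_features="sqrt", random_state=0
).fit(X_train, y_train)

print("test R^2:", round(forest.score(X_test, y_test), 3))
```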