Machine Learning for Subsurface Characterization, Chapter 5: Robust geomechanical characterization
classification tasks, whereas a 1:1 linear function is used as the activation function
for regression tasks. Activation functions add nonlinearity to the computation. The
input layer contains neurons without any activation function. Each feature value of
a sample is fed to a corresponding neuron in the input layer. Each neuron in the
input layer is connected to each neuron in the subsequent hidden layer, where
the weights of the connections are among the several parameters that need to be
computed during the training of the neural network and are essential for nonlinear,
complex functional mappings between the features and targets. A neural
network without activation functions acts as a linear regression model, because a
composition of linear transformations is itself linear. An important property of an
activation function is that it must be differentiable, so that the back-propagation
algorithm can propagate the errors backward through the network to update the
weights/parameters of the connections. In our case, there are 13 features and 2 targets to
be synthesized. The number of neurons in the input and output layers of the
neural network are 13 and 2, respectively. We use two fully connected hidden
layers in the ANN model having nine and five neurons in the first and second
hidden layers, respectively. Such a connection results in total of 188
parameters/weights that need to be computed. Out of the 188 parameters, 126
parameters define the connection between the input layer and first hidden layer,
50 parameters define the connection between the first and second hidden layers,
and 12 parameters define the connection between second hidden layer and the
output layer. The neural network implemented in our study uses conjugate-gradient
back-propagation to update the weights/parameters of the network. The number of
neurons in each layer, the number of hidden layers, and the choice of activation
function serve as the hyperparameters of the ANN model.
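The 188-parameter count quoted above can be verified from the layer sizes. The NumPy sketch below (an illustration, not the authors' code) tallies the weights and biases of each fully connected layer and runs one forward pass; the tanh hidden activation is an assumption for illustration, while the linear output matches the 1:1 activation described for regression.

```python
import numpy as np

# Layer sizes from the text: 13 input features, hidden layers of 9 and 5
# neurons, and 2 output targets (the two sonic logs to be synthesized).
layer_sizes = [13, 9, 5, 2]

# A fully connected layer with n_in inputs and n_out neurons contributes
# n_in * n_out weights plus n_out biases.
params = [n_in * n_out + n_out
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
print(params, sum(params))  # [126, 50, 12] 188

# One forward pass: tanh hidden activations (an assumption; the text does
# not name the hidden activation) and a 1:1 linear output for regression.
rng = np.random.default_rng(0)
weights = [rng.normal(size=(n_in, n_out))
           for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

x = rng.normal(size=13)          # one sample with 13 feature values
for i, (W, b) in enumerate(zip(weights, biases)):
    x = x @ W + b
    if i < len(weights) - 1:     # apply nonlinearity on hidden layers only
        x = np.tanh(x)
print(x.shape)                   # (2,) -> two synthesized targets
```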
2.5 Clustering techniques
The goal of clustering is to group data into a certain number of clusters such that
the samples belonging to one cluster share the most statistical similarity and are
dissimilar to samples in other clusters. In this study, we implement five clustering
techniques: centroid-based K-means, distribution-based Gaussian mixture,
hierarchical clustering, density-based spatial clustering of applications with
noise (DBSCAN), and self-organizing map (SOM) clustering. The clustering
methods process the “easy-to-acquire” features to differentiate the various
depths into distinct groups/clusters based on certain similarities/dissimilarities
of the “easy-to-acquire” features. The goal is to generate clusters/groups
using an unsupervised learning technique (i.e., without the target sonic logs)
and to assess how the clusters correlate with the accuracies of the regression
models for log synthesis. In doing so, the cluster labels can be used to
evaluate the reliability of the log synthesis during the deployment phase, when
it is impossible to quantify the accuracy/performance of the log synthesis,
unlike what is done for the training and testing datasets.
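Four of the five techniques listed above have standard scikit-learn implementations. The sketch below (an illustration on synthetic data, not the study's workflow) runs them side by side, assuming three clusters for the methods that require a preset cluster count; SOM has no scikit-learn implementation and is omitted here (third-party packages such as MiniSom provide it).

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans, AgglomerativeClustering, DBSCAN
from sklearn.mixture import GaussianMixture

# Synthetic stand-in for the "easy-to-acquire" feature logs: 300 samples
# drawn from 3 well-separated groups (the cluster count is an assumption).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

labels = {
    # centroid-based
    "kmeans": KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X),
    # distribution-based
    "gmm": GaussianMixture(n_components=3, random_state=0).fit(X).predict(X),
    # hierarchical (agglomerative)
    "hierarchical": AgglomerativeClustering(n_clusters=3).fit_predict(X),
    # density-based; label -1 marks noise samples, so no preset cluster count
    "dbscan": DBSCAN(eps=0.5, min_samples=5).fit_predict(X),
}
for name, lab in labels.items():
    print(name, sorted(set(lab)))
```

Note the design difference the text highlights: K-means, Gaussian mixture, and hierarchical clustering need the number of clusters up front, whereas DBSCAN infers it from density and flags low-density samples as noise.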
The K-means clustering technique takes the number of clusters as an input from the
user and then randomly initializes the cluster centers in the feature space. The cluster