There are no specific equations or procedures to calculate the number of
neurons (computational units) in each hidden layer. Different combinations
of neurons in each layer were tried for the NMR T2 prediction. Setting the
numbers of neurons so that the layer sizes form an approximate arithmetic
sequence from the input layer to the output layer is suggested to generate
high prediction accuracy [12]. Consequently, for the first predictive model,
which takes 27 inputs/features and generates 64 outputs/targets, 39 and 51
neurons were set in the first and second hidden layers of the ANN model,
respectively, because the numbers 27, 39, 51, and 64 approximately form an
arithmetic sequence. This architecture requires 6460 parameters, including
the weights and biases, to be computed/updated during each training step.
Following the same logic, the second predictive model, which has 27
inputs/features and 6 outputs/targets, requires 20 and 13 neurons in the
first and second hidden layers of the ANN model, respectively, such that
the sequence 6, 13, 20, and 27 is nearly an arithmetic sequence. This
architecture requires 917 parameters, including the weights and biases, to
be computed/updated during each training step.
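
The parameter counts quoted above are easy to verify in code. Below is a minimal sketch of the first model; the choice of Keras and the tanh activations are our assumptions, while the layer widths (27, 39, 51, 64) come from the text:

```python
import tensorflow as tf

# Hypothetical sketch of the first predictive model: 27 features -> 39 -> 51 -> 64 targets.
# Layer widths are from the text; activation functions are assumed for illustration.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(27,)),
    tf.keras.layers.Dense(39, activation='tanh'),  # first hidden layer
    tf.keras.layers.Dense(51, activation='tanh'),  # second hidden layer
    tf.keras.layers.Dense(64),                     # 64 discretized NMR T2 outputs
])

# Weights plus biases: 27*39+39 + 39*51+51 + 51*64+64 = 6460
print(model.count_params())
```

The same arithmetic applied to the 27-20-13-6 layout of the second model gives the 917 parameters quoted above.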
   Training functions are the methods used to adjust the weights and biases so
that the target functions of ANNs converge. Target functions quantify the error
in the learning process; the ANN learns by minimizing the target function
according to the training function. Levenberg-Marquardt (LM) backpropagation
[13] and conjugate gradient (CG) backpropagation [14] are the two most widely
used algorithms for approximating the training functions, which relate the
features and targets to the weights and biases of every neuron in the ANN
model. LM backpropagation is suitable for networks with a small number of
weights and biases, whereas CG backpropagation can be applied to large neural
networks implementing a large number of weights and biases. We use CG
backpropagation as the training function for both ANN-based prediction models;
the training time of the ANN model with LM backpropagation was 10 times more
than that with CG backpropagation. To be specific, the scaled conjugate
gradient algorithm [14] was used in our study.
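
As an illustration of a conjugate-gradient-based training function, the sketch below fits a toy network with the 27-20-13-6 layout of the second model using SciPy's nonlinear conjugate gradient optimizer. This is only a stand-in for the scaled conjugate gradient routine used in the study; the random data, tanh activations, and plain SSE loss are assumptions for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical stand-in data for the 27 input features and 6 output targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 27))
Y = rng.normal(size=(200, 6))

sizes = [27, 20, 13, 6]  # layer widths of the second predictive model

def unpack(theta):
    """Slice the flat parameter vector into per-layer (weights, biases)."""
    layers, pos = [], 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        W = theta[pos:pos + n_in * n_out].reshape(n_in, n_out)
        pos += n_in * n_out
        b = theta[pos:pos + n_out]
        pos += n_out
        layers.append((W, b))
    return layers

def forward(theta, X):
    """Feedforward pass: tanh on the hidden layers, linear output layer."""
    a = X
    for i, (W, b) in enumerate(unpack(theta)):
        a = a @ W + b
        if i < len(sizes) - 2:
            a = np.tanh(a)
    return a

def sse(theta):
    """Sum of squared errors, the quantity the training function minimizes."""
    return np.sum((Y - forward(theta, X)) ** 2)

# 27*20+20 + 20*13+13 + 13*6+6 = 917 parameters, matching the count in the text.
n_params = sum(n_in * n_out + n_out for n_in, n_out in zip(sizes[:-1], sizes[1:]))

theta0 = rng.normal(scale=0.1, size=n_params)
result = minimize(sse, theta0, method='CG', options={'maxiter': 50})
print(result.fun)  # SSE after 50 conjugate gradient iterations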
   The target function of the ANN model quantifies the prediction error;
during model training, the weights and biases of all neurons are adjusted to
minimize this error and ensure the best prediction performance. Overfitting is
a challenging problem when minimizing the target function [15]: an overfit ANN
model cannot learn a generalizable relationship between targets and features.
A simple target function is the regularized sum of squared errors (SSE)
expressed as

$$\mathrm{SSE} = \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2 + \lambda \sum_{j=1}^{P} \sigma_j^2 \qquad (3.10)$$
where $n$ is the number of samples/depths in the training/testing dataset, $P$
is the number of outputs (64 for the first model and 6 for the second model),
$\lambda$ is the penalty parameter, $y_i$ is the original target at depth $i$,
$\hat{y}_i$ is the estimated target at depth $i$, and $\sigma_j^2$ is the
variance of predicted target $j$. Regularization is a popular method to
prevent overfitting.
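
Eq. (3.10) can be evaluated directly; the sketch below is a minimal NumPy implementation, where the function name and the (n, P) array shapes are our assumptions:

```python
import numpy as np

def regularized_sse(y_true, y_pred, lam):
    """Evaluate Eq. (3.10) for predictions at n depths and P targets.

    y_true, y_pred : arrays of shape (n, P)
    lam            : penalty parameter lambda
    """
    residual = np.sum((y_true - y_pred) ** 2)        # first term: sum of squared errors
    penalty = lam * np.sum(np.var(y_pred, axis=0))   # second term: variance of each predicted target j
    return residual + penalty
```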