Page 94 - Machine Learning for Subsurface Characterization


            data-driven model. Min-max scaler is well suited when the feature distribution
            is non-Gaussian in nature and the feature follows a strict bound (e.g., image
            pixels). Depths exhibiting outlier log responses need to be removed prior to
            min-max scaling, which is drastically influenced by outliers. Moreover, the
            existence of outlier log responses adversely affects the weights and biases of
            the neurons learnt during the training of an ANN model. In this study, a depth
            is considered an outlier when one of the logs has an abnormally large or small
            value compared with the general trend of the log. At some depths of the shale
            system under investigation, the log responses are unrealistic and abnormally
            high; for example, a gamma ray larger than 1000 API units or DTSM larger
            than 800 μs/ft. Such outliers are referred to as global outliers, which require a
            simple thresholding technique for detection, followed by the removal of the
            depths exhibiting the outlier log responses.
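The global-outlier removal described above can be sketched as a simple threshold mask. The array names and sample values below are hypothetical; only the two thresholds (gamma ray above 1000 API units, DTSM above 800 μs/ft) come from the text.

```python
import numpy as np

# Hypothetical well-log arrays sampled along depth; the thresholds
# (GR > 1000 API, DTSM > 800 us/ft) are the global-outlier bounds
# stated in the text.
gr = np.array([85.0, 120.0, 1500.0, 95.0, 110.0])     # gamma ray, API units
dtsm = np.array([210.0, 950.0, 230.0, 225.0, 240.0])  # shear slowness, us/ft

# Keep only depths where every log lies within its physical bound;
# a depth is dropped if any one of its logs is an outlier.
mask = (gr <= 1000.0) & (dtsm <= 800.0)
gr_clean, dtsm_clean = gr[mask], dtsm[mask]
```

Because the mask is applied to all logs at once, the entire depth is removed even when only one log at that depth is abnormal, which matches the depth-wise removal described above.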
               After the removal of depths exhibiting abnormal log responses, each log
            (feature or target) is transformed to a value between -1 and 1 using the min-
            max scaler. Min-max scaling forces features and targets to lie in the same range,
            which guarantees stable convergence of weights and biases in the ANN model
            [11]. Min-max scaling was performed using the following equation:

                                     y = 2 (x - x_min) / (x_max - x_min) - 1                     (3.9)
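Eq. (3.9) can be implemented directly. The function name below is a hypothetical helper; the arithmetic follows the equation term by term, mapping x_min to -1 and x_max to 1.

```python
import numpy as np

def minmax_scale(x):
    """Scale a log to [-1, 1] per Eq. (3.9): y = 2*(x - x_min)/(x_max - x_min) - 1."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

# The minimum maps to -1, the midpoint to 0, and the maximum to 1.
y = minmax_scale([10.0, 20.0, 30.0])
```

Note that x_min and x_max are taken over the training data; because they are extreme order statistics, a single outlier shifts them directly, which is why outlier depths must be removed before scaling.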
            where x is the original (unscaled) value of the feature or target and y is the scaled
            value of x. Scaling is performed for all the features so that each feature has
            the same influence when training the model. Scaling is essential when using
            distance-, density-, and gradient-based learning methods. ANNs rely on gradients
            for updating the weights. For ANN models, unscaled features can result in a
            slow or unstable learning process, whereas unscaled targets can result in
            exploding gradients that cause the learning process to fail. Unscaled features are
            spread out over orders of magnitude, resulting in a model that may learn
            large-valued weights. When using traditional backpropagation with a sigmoid
            activation function, unscaled features can saturate the sigmoid derivative during
            training. Such a model is unstable and exhibits poor generalization performance.
            However, when certain variations of backpropagation, such as resilient
            backpropagation, are used to estimate the weights of the neural network, the
            network is more robust to unscaled features because the algorithm uses the sign
            of the gradient, and not its magnitude, when updating the weights.
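The sign-only property of resilient backpropagation (Rprop) mentioned above can be illustrated with a minimal single-step sketch. This is not the book's implementation; the function and its default factors (eta_plus = 1.2, eta_minus = 0.5) follow the standard Rprop formulation, in which the per-weight step size adapts based on whether the gradient changed sign.

```python
import numpy as np

def rprop_step(w, grad, prev_grad, step, eta_plus=1.2, eta_minus=0.5,
               step_min=1e-6, step_max=50.0):
    """One Rprop-style update: only the SIGN of the gradient is used,
    so the weight change is independent of the gradient's magnitude."""
    same_sign = grad * prev_grad
    # Gradient kept its sign: grow the step; sign flipped: shrink it.
    step = np.where(same_sign > 0, np.minimum(step * eta_plus, step_max), step)
    step = np.where(same_sign < 0, np.maximum(step * eta_minus, step_min), step)
    return w - np.sign(grad) * step, step
```

A gradient of 100 and a gradient of 0.001 (with the same sign history) produce the same weight change, which is why Rprop is less sensitive to unscaled features than magnitude-based backpropagation.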


            2.7 Training and testing methodology for the ANN models
            After the feature scaling, the dataset (comprising features and targets) is split
            into two parts: training data and testing data. Usually, 80% of the data are selected
            as the training data, and the remaining 20% of the original data constitute the
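The 80/20 split can be sketched with a random permutation of depth indices. The array shapes and seed below are illustrative placeholders, not values from the study.

```python
import numpy as np

# Synthetic scaled dataset: 100 depths, 4 feature logs, 1 target log
# (shapes are illustrative only).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(100, 4))
y = rng.uniform(-1, 1, size=(100, 1))

# Shuffle depth indices, then take the first 80% for training and
# the remaining 20% for testing.
idx = rng.permutation(len(X))
n_train = int(0.8 * len(X))
X_train, X_test = X[idx[:n_train]], X[idx[n_train:]]
y_train, y_test = y[idx[:n_train]], y[idx[n_train:]]
```

Shuffling before splitting avoids assigning one contiguous depth interval entirely to the test set, which would bias the evaluation if log responses vary systematically with depth.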
            testing data. When the size of the dataset available for building a data-driven
            model increases, we can choose a larger percentage of data to constitute the