Page 180 - Machine Learning for Subsurface Characterization


            clustering. Compared with the other clustering techniques, both K-means and
            Gaussian mixture clustering exhibit good ability to differentiate formation
            depths. The Gaussian mixture clustering results are closely related to the
            t-SNE dimensionality reduction results. Different clusters in the Gaussian
            mixture model coincide with different blocks on those plots. However, the
            Gaussian mixture clusters do not follow the patterns seen in the lithology
            and the relative error. The K-means clustering algorithm correlates best
            with the relative error of prediction, which is confirmed by comparing
            Figs. 5.5 and 5.9. The data points that have a high relative error in log
            synthesis (marked by the red circle) are labeled as cluster number 2 by
            K-means clustering. Moreover, the K-means clustering results closely
            follow the lithology plot in Fig. 5.10B.


            4 Conclusions
            Six shallow-learning regression models were used for synthesizing the
            compressional and shear travel-time logs in a shale reservoir. The regression
            models were trained using supervised learning to process 13 conventional
            easy-to-acquire logs. Artificial neural network (ANN) and multivariate
            adaptive regression spline (MARS) models achieve the best log-synthesis
            performance. The ANN-based log synthesis is the best among the six
            regression models, with a coefficient of determination (R-squared) of 0.85
            on the test data from a single well and an R-squared of 0.84 when deployed
            in the second well for blind testing, in which no information from the
            second well was used to train the model. Over the entire first well, the
            six models show similar log-synthesis performance in terms of the relative
            errors in synthesizing the compressional and shear travel-time logs.
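The two performance measures quoted above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the usual definition of R-squared (one minus the ratio of residual to total sum of squares) and takes the relative error at each depth to be |predicted - measured| / |measured|, which this section does not spell out. The function names and the travel-time values are hypothetical.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def relative_errors(measured, predicted):
    """Per-depth relative error, assumed here to be |p - m| / |m|."""
    return [abs(p - m) / abs(m) for m, p in zip(measured, predicted)]


# Hypothetical travel-time values at four depths, for illustration only
measured = [80.0, 90.0, 100.0, 110.0]
predicted = [82.0, 88.0, 101.0, 109.0]

r2 = r_squared(measured, predicted)
rel_err = relative_errors(measured, predicted)
print(round(r2, 3), [round(e, 3) for e in rel_err])
```

In a blind test of the kind described above, the same two functions would be applied to the second well's measured and synthesized logs without refitting the model.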
               To assess the reliability of log synthesis when deploying the
            shallow-learning regression models in new wells/formations, where the
            compressional and shear wave travel-time logs are absent, we applied five
            clustering methods to group the formation depths by processing three
            specific easy-to-acquire conventional logs. Among the five clustering
            methods tested for their ability to assess the reliability of log synthesis, the
            centroid-based K-means clustering significantly outperforms the other
            clustering methods. K-means-derived clusters exhibit strong correlation
            with the relative errors in log synthesis. Most formation depths that have
            a prediction relative error higher than 0.3 are assigned to cluster number
            2 by K-means clustering. Formation depths assigned to cluster number 2 by
            K-means clustering are therefore expected to have low reliability in
            ANN-based synthesis of compressional and shear travel-time logs. By
            processing the 13 easy-to-acquire logs using the ANN model, we can
            synthesize the shear and compressional travel-time logs to facilitate
            geomechanical characterization under data constraints. At the
            same time, K-means clustering can be applied to evaluate the reliability of
            the log synthesis performed using the ANN model.
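The reliability check described above can be sketched as follows: cluster the formation depths on three logs with K-means, then measure what share of each cluster exceeds the 0.3 relative-error threshold. This is a minimal sketch under stated assumptions, not the authors' implementation. It uses a plain Lloyd's-algorithm K-means with the first k points as initial centroids (a simplification chosen for reproducibility), and the log values, relative errors, and function names are hypothetical. Cluster indices are arbitrary, so the low-reliability cluster need not be numbered 2.

```python
def kmeans(points, k, n_iter=50):
    """Plain Lloyd's algorithm; returns (labels, centroids)."""
    # Assumption: deterministic start from the first k points
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def high_error_share(labels, rel_err, threshold=0.3):
    """Fraction of depths in each cluster whose relative error exceeds threshold."""
    share = {}
    for c in sorted(set(labels)):
        errs = [e for lab, e in zip(labels, rel_err) if lab == c]
        share[c] = sum(e > threshold for e in errs) / len(errs)
    return share


# Hypothetical (already scaled) values of three logs at six depths
points = [(0.0, 0.1, 0.0), (5.0, 5.1, 4.9), (0.2, 0.0, 0.1),
          (5.1, 4.9, 5.0), (0.1, 0.2, 0.1), (4.9, 5.0, 5.2)]
# Hypothetical relative errors of log synthesis at the same depths
rel_err = [0.05, 0.40, 0.10, 0.35, 0.08, 0.20]

labels, centroids = kmeans(points, k=2)
share = high_error_share(labels, rel_err)
print(labels, share)
```

In a new well without measured travel-time logs, only the clustering step can be run. Depths that fall in a cluster with a high share of large errors in the training well would then be flagged as low-reliability predictions.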