Page 180 - Machine Learning for Subsurface Characterization
clustering. Compared with the other clustering techniques, both K-means and
Gaussian mixture clustering exhibit good ability to differentiate formation
depths. The Gaussian mixture clustering results are closely related to the
t-SNE dimensionality reduction results. Different clusters in the Gaussian
mixture model coincide with different blocks on the plots. However, the
results from the Gaussian mixture model do not show a pattern similar to
that of the lithology or the relative error. The K-means clustering algorithm
has the best correlation with the prediction relative error. This is also
confirmed by analysis of Figs. 5.5 and 5.9. The data points that have a high
relative error in log synthesis (denoted by the red circle) are labeled as
cluster number 2 by the K-means clustering method. Moreover, the K-means
clustering results closely match the pattern of the lithology plot in Fig. 5.10B.
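
One way to put a number on how well a set of cluster labels tracks the prediction relative error is to compute, for each cluster, the share of its depths whose relative error exceeds a threshold. The following is a minimal sketch under stated assumptions, not the procedure used in this chapter: the function name, the relative errors, and both label sets are hypothetical, and the 0.3 threshold is borrowed from the value quoted in the conclusions.

```python
from collections import Counter


def high_error_share_by_cluster(labels, rel_errors, threshold=0.3):
    """Per cluster, the fraction of depths whose relative error exceeds threshold."""
    totals = Counter(labels)
    high = Counter(lab for lab, err in zip(labels, rel_errors) if err > threshold)
    return {lab: high[lab] / totals[lab] for lab in sorted(totals)}


# Hypothetical relative errors at eight depths (illustration only).
rel_errors = [0.05, 0.10, 0.40, 0.35, 0.08, 0.50, 0.12, 0.07]

# Hypothetical labels from two clustering methods applied to the same depths.
labels_a = [0, 0, 2, 2, 1, 2, 1, 0]  # isolates the high-error depths in one cluster
labels_b = [0, 1, 0, 1, 2, 2, 0, 1]  # spreads the high-error depths across clusters

share_a = high_error_share_by_cluster(labels_a, rel_errors)
share_b = high_error_share_by_cluster(labels_b, rel_errors)

for name, share in (("method A", share_a), ("method B", share_b)):
    print(name, share)
```

A clustering that is useful as a reliability indicator concentrates the high-error depths in one cluster, as the hypothetical method A does here. A clustering that mixes them across all clusters, like method B, carries little information about the error.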
4 Conclusions
Six shallow-learning regression models were used for synthesizing the
compressional and shear travel-time logs in a shale reservoir. The regression
models were trained using supervised learning to process 13 conventional
easy-to-acquire logs. Artificial neural network (ANN) and multivariate
adaptive regression spline (MARS) models achieve the best log-synthesis
performance. ANN-based log synthesis performs best among the six regression
models, with a coefficient of determination (R-squared) of 0.85 on the test
data from a single well. When deployed in a second well for blind testing,
such that no information from the second well was used to train the model,
ANN-based log synthesis exhibits an R-squared of 0.84. In the entire first
well, the six models show similar log
synthesis performance, in terms of relative errors in synthesizing the
compressional and shear travel-time logs.
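
The two metrics quoted above can be illustrated with a short computation. This is a sketch under stated assumptions: R-squared is taken in its usual form, 1 - SS_res/SS_tot, and the relative error is assumed to be |predicted - measured| / |measured|, which this section does not define explicitly. The function names and the travel-time values are hypothetical.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def relative_errors(measured, predicted):
    """Per-depth relative error, assumed here to be |predicted - measured| / |measured|."""
    return [abs(p - m) / abs(m) for m, p in zip(measured, predicted)]


# Hypothetical compressional travel-time values in microseconds per foot.
measured_dtc = [60.0, 70.0, 80.0, 90.0, 100.0]
predicted_dtc = [62.0, 69.0, 83.0, 88.0, 101.0]

r2 = r_squared(measured_dtc, predicted_dtc)
rel_err = relative_errors(measured_dtc, predicted_dtc)

print(f"R-squared: {r2:.3f}")
print("Relative errors:", [round(e, 3) for e in rel_err])
```

R-squared summarizes the fit over a whole well, while the per-depth relative error shows where along the well the synthesis degrades.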
For assessing the reliability of log synthesis when deploying the shallow-
learning regression models in new wells/formations, where the
compressional and shear wave travel-time logs are absent, we applied five
clustering methods to group the formation depths by processing three
specific easy-to-acquire conventional logs. Among the five clustering
methods tested for their ability to assess the reliability of log synthesis, the
centroid-based K-means clustering significantly outperforms the other
clustering methods. K-means-derived clusters exhibit strong correlation with
the relative errors in log synthesis. Most formations that have a prediction
relative error higher than 0.3 are assigned to cluster number 2 by the K-means
clustering. Formation depths assigned cluster number 2 by the K-means
clustering are therefore expected to have low reliability in ANN-based
synthesis of compressional and shear travel-time logs. By processing the 13
easy-to-acquire logs using the ANN model, we can synthesize the shear and
compressional travel-time logs to facilitate geomechanical characterization
under data constraints. At the
same time, K-means clustering can be applied to evaluate the reliability of
the log synthesis performed using the ANN model.
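
The reliability check described above can be sketched as follows. Cluster the depths of a well where the relative error is known, mark the clusters whose mean relative error exceeds 0.3, and then assign depths from a new well to the nearest centroid. This is a minimal illustration, not the authors' implementation: the three log values per depth, the relative errors, the fixed initial centroids, and all function names are hypothetical, and the K-means loop is a plain Lloyd iteration written with the standard library only.

```python
import math


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))


def kmeans(points, initial_centroids, n_iter=50):
    """Plain Lloyd iterations starting from the given centroids."""
    centroids = [list(c) for c in initial_centroids]
    for _ in range(n_iter):
        labels = [nearest(p, centroids) for p in points]
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the previous centroid if a cluster becomes empty
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical standardized values of three conventional logs at nine depths
# of a training well, with the relative error of the log synthesis at each depth.
train_logs = [
    [0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [0.0, 0.0, 0.1],
    [3.0, 3.1, 2.9], [2.9, 3.0, 3.1], [3.1, 2.9, 3.0],
    [0.0, 3.0, 0.1], [0.1, 2.9, 0.0], [0.0, 3.1, 0.0],
]
train_rel_err = [0.05, 0.08, 0.06, 0.10, 0.12, 0.09, 0.35, 0.42, 0.31]

# Fixed starting centroids keep this illustration deterministic.
initial = [train_logs[0], train_logs[3], train_logs[6]]
centroids, labels = kmeans(train_logs, initial)

# Mean relative error per cluster; clusters above 0.3 are marked low reliability.
mean_err = {}
for k in range(len(centroids)):
    errs = [e for e, lab in zip(train_rel_err, labels) if lab == k]
    mean_err[k] = sum(errs) / len(errs)
low_reliability_clusters = {k for k, e in mean_err.items() if e > 0.3}


def is_low_reliability(new_depth_logs):
    """True if a new depth falls in a cluster marked as low reliability."""
    return nearest(new_depth_logs, centroids) in low_reliability_clusters


print("Low-reliability clusters:", low_reliability_clusters)
print("New depth flagged:", is_low_reliability([0.05, 3.05, 0.05]))
```

In this hypothetical data set the high-error depths happen to fall in cluster 2. In practice the index of the low-reliability cluster depends on the initialization and must be read from the per-cluster error summary rather than assumed.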