defined by the filters. Convolution is beneficial when the data (such as audio and images) have local structure, such that spatially proximal features exhibit strong correlations. For images and audio signals, multiple convolutional layers are used in series to extract features at multiple scales, such that with the addition of each convolutional layer, the architecture adapts to progressively higher-level features.
After the convolution, the max-pooling operation processes the 64 × 16 convolutional layer with a filter size of 4 × 1 to obtain a 16 × 16 layer.
Filters for max pooling select maximum values among the four spatially
neighboring features subsampled from the convolutional layer.
Consequently, max pooling discards a large portion of the data in the convolution layer and retains only one-fourth of the data, namely the largest-valued features. Max pooling reduces overfitting, reduces computational time, extracts rotation- and position-invariant features, and improves the generalizability of the lower-dimensional output features. While the convolution operation helps in obtaining the feature maps, the pooling operations play an important role in reducing the dimensionality of the convolution-derived features. Sometimes, average pooling is used as an alternative to max pooling.
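The convolution and max-pooling stages described above can be sketched with Keras layers, purely for illustration; the text specifies only the 64 × 16 convolutional output and the 4 × 1 pooling filter, so the kernel length, padding, activation, and variable names below are assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the book's code) of the VAEc encoder front end.
# Assumed: kernel_size=3, 'same' padding, ReLU activation.
import tensorflow as tf
from tensorflow.keras import layers

nmr_t2_input = layers.Input(shape=(64, 1))            # 64-dimensional NMR T2 distribution
conv = layers.Conv1D(filters=16, kernel_size=3,       # 16 filters -> 64 x 16 feature maps
                     padding="same",
                     activation="relu")(nmr_t2_input)
pooled = layers.MaxPooling1D(pool_size=4)(conv)       # 4 x 1 max pooling -> 16 x 16 layer
```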
The output of the max-pooling layer is then flattened and fed to a fully
connected 16-dimensional layer. The fifth layer of the VAEc is a 3-dimensional latent layer, which maps the output of the previous 16-dimensional layer to mean and variance vectors. The
64-dimensional NMR T2 data are compressed to 3 dimensions (3 means and
3 variances) through these 5 layers. Rather than directly outputting single
discrete values for the latent attributes, the encoder model of a VAEc will
output mean and variance, which serve as parameters describing a
distribution for each dimension in the latent space. The three-dimensional
latent layer contains the encoded Gaussian representations of the input, such
that each dimension represents a learned attribute of the data. As shown in Fig. 7.5, the latent layer is not merely a collection of single discrete values for each dimension (latent attribute). Instead, each latent attribute is represented as a range of possible values (a probability distribution) by learning its mean and variance. Following that, a
random sample is selected from the probability distribution encoded into the
latent space and is then fed into the decoder network for the desired NMR
T2 synthesis. The mean and variance vectors in the latent layer and the
random sampling of the latent vector from the latent space enforce a
continuous, smooth latent space representation and target reconstruction.
A similar latent layer is used for the VAE architecture discussed in Section 4.2.
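Continuing the sketch above, the flattening, fully connected, and latent stages can be written as follows; this is a minimal illustration assuming Keras, and the activations, layer names, and the common log-variance parameterization of the variance vector are assumptions rather than details given in the text.

```python
# Illustrative continuation: flatten, 16-dimensional fully connected layer,
# and the 3-dimensional latent layer producing mean and (log-)variance vectors.
flat = layers.Flatten()(pooled)                         # 16 x 16 -> 256 features
hidden = layers.Dense(16, activation="relu")(flat)      # fully connected 16-dimensional layer

z_mean = layers.Dense(3, name="z_mean")(hidden)         # 3 means
z_log_var = layers.Dense(3, name="z_log_var")(hidden)   # 3 (log-)variances

def sample_z(args):
    # Reparameterization trick: draw eps ~ N(0, I), then shift and scale it by the
    # learned mean and variance so gradients can flow through z_mean and z_log_var.
    z_mean, z_log_var = args
    eps = tf.random.normal(shape=tf.shape(z_mean))
    return z_mean + tf.exp(0.5 * z_log_var) * eps

z = layers.Lambda(sample_z, name="z")([z_mean, z_log_var])  # latent vector fed to the decoder
```

The sampled vector `z` corresponds to the random sample drawn from the encoded probability distribution that is passed to the decoder network for the NMR T2 synthesis described below.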
After the encoding, the first three-dimensional layer of the decoder
generates a latent vector by randomly sampling from the probability
distribution encoded into the latent space. The random sampling process leverages the “reparameterization trick,” which samples from a unit Gaussian and