impractical for deep neural networks, even when the HMC acceptance probability remains high throughout the experiment.
Exploring fixes for poor mode exploration remains an open avenue of research. Traditional MCMC remedies can be applied here, such as simulated annealing and annealed importance sampling; less traditional methods, such as stochastic initialization and model perturbation, are also worth exploring.
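To make the annealed approach concrete, the following is a minimal numpy sketch of annealed importance sampling on a deliberately bimodal toy posterior. The prior, likelihood, temperature schedule, and proposal step are illustrative assumptions, not a recipe from this chapter; a real neural-network posterior would replace the toy log-densities.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_prior(theta):
    # Standard normal prior over a scalar parameter (toy assumption).
    return -0.5 * theta**2

def log_like(theta):
    # Toy bimodal likelihood: a two-component Gaussian mixture standing in
    # for a multimodal neural-network posterior.
    return np.logaddexp(-0.5 * ((theta - 3.0) / 0.5)**2,
                        -0.5 * ((theta + 3.0) / 0.5)**2)

def ais_sample(n_steps=200, step=0.5):
    """Annealed importance sampling: anneal from the prior to the posterior
    through tempered densities pi_b ∝ prior * likelihood**b, b in [0, 1]."""
    betas = np.linspace(0.0, 1.0, n_steps + 1)
    theta = rng.normal()            # exact draw from the prior (beta = 0)
    log_w = 0.0                     # accumulated log importance weight
    for b0, b1 in zip(betas[:-1], betas[1:]):
        log_w += (b1 - b0) * log_like(theta)   # AIS weight update
        # One Metropolis step targeting the tempered density at b1.
        prop = theta + step * rng.normal()
        log_acc = (log_prior(prop) + b1 * log_like(prop)
                   - log_prior(theta) - b1 * log_like(theta))
        if np.log(rng.uniform()) < log_acc:
            theta = prop
    return theta, log_w

samples, log_weights = zip(*(ais_sample() for _ in range(100)))
```

Because each chain starts from an independent prior draw, the weighted samples can land in either mode, which is exactly the behavior that vanilla gradient-based samplers lack.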
Regarding the important challenge of sampling accurate models from a given posterior mode, mini-batch stochastic gradient Langevin dynamics (SGLD) (Welling & Teh, 2011) is increasingly credited as a practical Bayesian method for training neural networks to find good generalization regions (Chaudhari et al., 2017), and it may help improve parallel SGD algorithms (Chaudhari et al., 2017). The connection between SGLD and SGD has been explored by Mandt, Hoffman, and Blei (2017) for posterior sampling in small regions around a locally optimal solution. To make this procedure a legitimate posterior sampling approach, we explore the use of Chaudhari et al.'s (2017) methods to smooth out local minima and significantly extend the reach of Mandt et al.'s (2017) posterior sampling approach.
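The SGLD update itself is simply an SGD step on the log-posterior plus injected Gaussian noise whose variance matches the step size. Below is a minimal numpy sketch on a toy Bayesian linear-regression posterior standing in for a deep network; the data, prior, noise precision, and step-size schedule are illustrative assumptions rather than settings from Welling and Teh (2011).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data for Bayesian linear regression (stands in for a deep network).
N, d = 1000, 5
X = rng.normal(size=(N, d))
true_w = rng.normal(size=d)
y = X @ true_w + 0.1 * rng.normal(size=N)

def grad_log_prior(w):
    return -w                                  # N(0, I) prior

def grad_log_like(w, Xb, yb):
    # Gaussian likelihood with noise std 0.1, i.e., precision 100.
    return 100.0 * Xb.T @ (yb - Xb @ w)

def sgld(n_iter=5000, batch=50, eps0=1e-5):
    """Mini-batch SGLD (Welling & Teh, 2011): a stochastic-gradient step on
    the log-posterior plus Gaussian noise with variance eps."""
    w = np.zeros(d)
    samples = []
    for t in range(n_iter):
        eps = eps0 / (1 + t)**0.33             # decaying step-size schedule
        idx = rng.choice(N, batch, replace=False)
        grad = grad_log_prior(w) + (N / batch) * grad_log_like(w, X[idx], y[idx])
        w = w + 0.5 * eps * grad + np.sqrt(eps) * rng.normal(size=d)
        if t > n_iter // 2:                    # discard burn-in
            samples.append(w.copy())
    return np.array(samples)

posterior_draws = sgld()
```

The key design choice is the noise term: without it the update is plain SGD converging to a point estimate; with it, and a suitably decaying step size, the iterates approximately sample the posterior around the mode.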
This smoothing out has connections to Riemannian curvature methods for exploring the energy function in the parameter (weight) space (Poole, Lahiri, Raghu, Sohl-Dickstein, & Ganguli, 2016). The Hessian is used as a measure of curvature by Fawzi, Moosavi-Dezfooli, Frossard, and Soatto (2017) to empirically explore the energy function of a learned model with regard to examples (i.e., the curvature with respect to the input space rather than the parameter space). This approach is also related to the implicit regularization arguments of Neyshabur, Tomioka, Salakhutdinov, and Srebro (2017).
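To illustrate how input-space curvature can be probed empirically, the sketch below estimates the dominant eigenvalue of the Hessian of a toy loss with respect to the input, using finite-difference Hessian-vector products and power iteration. The logistic "network" is a hypothetical stand-in for a trained classifier; this is not the procedure of Fawzi et al. (2017), only a minimal version of the idea.

```python
import numpy as np

rng = np.random.default_rng(0)

def loss(x, w):
    # Toy differentiable "network": logistic loss of a linear model,
    # standing in for a trained classifier's loss surface.
    return np.log1p(np.exp(-w @ x))

def grad_x(x, w, h=1e-5):
    # Finite-difference gradient of the loss w.r.t. the *input* x.
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x); e[i] = h
        g[i] = (loss(x + e, w) - loss(x - e, w)) / (2 * h)
    return g

def hvp_x(x, w, v, h=1e-4):
    # Hessian-vector product via central differences of the gradient.
    return (grad_x(x + h * v, w) - grad_x(x - h * v, w)) / (2 * h)

def top_input_curvature(x, w, n_iter=50):
    """Power iteration on the input-space Hessian: its dominant eigenvalue
    is a simple proxy for the curvature of the loss around an example."""
    v = rng.normal(size=x.size)
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        hv = hvp_x(x, w, v)
        v = hv / (np.linalg.norm(hv) + 1e-12)
    return v @ hvp_x(x, w, v)      # Rayleigh quotient ≈ top eigenvalue

w, x = rng.normal(size=10), rng.normal(size=10)
print(top_input_curvature(x, w))
```

In an actual deep network the finite-difference gradients would be replaced by automatic differentiation, but the power-iteration structure carries over unchanged.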
There is a need to develop an alternative SGD-type method for accurate posterior sampling of deep neural network models that is capable of providing the all-important UQ in the decision-making problem in C3I systems. Not surprisingly, a system that correctly quantifies the probability that a suggested decision is incorrect inspires more confidence than a system that incorrectly believes itself to always be correct; the latter is a common ailment of deep neural networks. Moreover, a general, practical Bayesian neural network method would help provide robustness against adversarial attacks (as the attacker needs to attack a family of models rather than a single model), reduce generalization error via posterior-sampled ensembles, and provide better quantification of classification accuracy and root mean square error (RMSE).
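As a sketch of how posterior-sampled ensembles would supply this UQ, the snippet below averages class probabilities over hypothetical posterior draws and reports predictive entropy and mutual information as uncertainty measures. Here predict_fn and the draws are illustrative placeholders; in practice the draws would come from a sampler such as the SGLD procedure above.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_fn(w, x):
    # Hypothetical model: softmax over linear logits (3 classes).
    z = w.reshape(3, -1) @ x
    e = np.exp(z - z.max())
    return e / e.sum()

def ensemble_uq(param_draws, x):
    """Average predictions over posterior draws; the ensemble spread, not a
    single model's softmax, quantifies how likely the decision is wrong."""
    probs = np.stack([predict_fn(w, x) for w in param_draws])   # (S, K)
    mean_p = probs.mean(axis=0)
    # Predictive entropy: total uncertainty of the ensemble decision.
    total = -np.sum(mean_p * np.log(mean_p + 1e-12))
    # Expected per-draw entropy: aleatoric (data) uncertainty.
    aleatoric = -np.sum(probs * np.log(probs + 1e-12), axis=1).mean()
    # Their gap (mutual information): epistemic uncertainty, i.e., the
    # disagreement among posterior draws about this particular input.
    return mean_p, total, total - aleatoric

draws = [rng.normal(size=30) for _ in range(100)]   # stand-in posterior draws
x = rng.normal(size=10)
mean_p, total_unc, epistemic_unc = ensemble_uq(draws, x)
```

A large epistemic term flags inputs on which the model family genuinely disagrees, which is precisely the signal a C3I decision-maker needs before trusting a suggested decision.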