We explore the above techniques for UQ in intelligent battlefield systems. We believe that the development of UQ methods could be of great use in areas that use ML or artificial intelligence to make risk-informed decisions, particularly when poor predictions come with a high cost.
2.4.1 Gaussian Process Regression
Gaussian process regression (GPR), also known as kriging (Krige, 1951), is a framework for nonlinear, nonparametric, Bayesian inference (Rasmussen, 2004). GPR is widely used in chemical processing (Kocijan, 2016), robotics (Deisenroth, Fox, & Rasmussen, 2015), and ML (Rasmussen, 2004), among other applications. One of the main drawbacks of GPR is its complexity, which scales cubically, $\mathcal{O}(N^3)$, with the training sample size $N$ in the batch setting, owing to the factorization of an $N \times N$ covariance matrix.
GPR models the relationship between random variables $x \in \mathcal{X} \subset \mathbb{R}^p$ and $y \in \mathcal{Y}$, that is, $\hat{y} = f(x)$, by a function $f(x)$ which should be estimated on the basis of $N$ training examples $S = \{x_n, y_n\}_{n=1}^{N}$. Unlike in ERM, GPR does not learn this estimator by solving an optimization problem that assesses the quality of its fit, but instead assumes that this function $f(x)$ follows some particular parameterized family of distributions, in which the parameters need to be estimated (Krige, 1951; Rasmussen, 2004).
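To make the notation concrete, a toy training set $S$ of $N$ input-output pairs might be generated as in the following sketch; the one-dimensional sinusoidal target, noise level, and sample size are purely illustrative assumptions, not part of the formulation above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training set S = {(x_n, y_n)}_{n=1}^N with feature dimension p = 1
# (synthetic data chosen only for illustration).
N, p = 50, 1
X = rng.uniform(-3.0, 3.0, size=(N, p))          # inputs x_n in R^p
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=N)   # noisy scalar targets y_n
```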
In particular, for GPs, a prior on the distribution of $f_S = [f(x_1), \ldots, f(x_N)]$ is placed as a Gaussian distribution, namely, $f_S \sim \mathcal{N}(0, K_N)$. Here $\mathcal{N}(\mu, \Sigma)$ denotes the multivariate Gaussian distribution in $N$ dimensions with mean vector $\mu \in \mathbb{R}^N$ and covariance $\Sigma \in \mathbb{R}^{N \times N}$.
In GPR, the covariance $K_N = [\kappa(x_m, x_n)]_{m,n=1}^{N,N}$ is constructed from a distance-like kernel function $\kappa : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ defined over the product set of the feature space. The kernel expresses some prior about how to measure distance between points, a common example of which is itself the Gaussian, $[K_N]_{mn} = \kappa(x_m, x_n) = \exp\{-\| x_m - x_n \|^2 / c^2\}$, with bandwidth hyperparameter $c$.
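A minimal sketch of how this covariance could be assembled with NumPy is given below; the function name and the default bandwidth $c = 1.0$ are assumptions made for illustration.

```python
import numpy as np

def gaussian_kernel_matrix(X, c=1.0):
    """Return K_N with [K_N]_{mn} = exp(-||x_m - x_n||^2 / c^2).

    X is an (N, p) array of inputs; c is the bandwidth hyperparameter.
    """
    # Pairwise squared Euclidean distances via broadcasting.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / c**2)
```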
In standard GPR, a Gaussian prior is placed on the noise that corrupts $f_S$ to form the observation vector $y = [y_1, \ldots, y_N]$, that is, $\mathbb{P}(y \mid f_S) = \mathcal{N}(f_S, \sigma^2 I)$, where $\sigma^2$ is some variance parameter. The prior can be integrated over $f_S$ to obtain the marginal likelihood for $y$ as

$$\mathbb{P}(y \mid S) = \mathcal{N}(0, K_N + \sigma^2 I) \qquad (2.7)$$
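Eq. (2.7) is typically evaluated in log form via a Cholesky factorization of $K_N + \sigma^2 I$, which is also the $\mathcal{O}(N^3)$ step behind the cubic scaling mentioned earlier. The sketch below continues the toy example above; the noise level $\sigma = 0.1$ is an assumed illustrative value.

```python
def log_marginal_likelihood(y, K_N, sigma=0.1):
    """Log of the marginal likelihood N(y; 0, K_N + sigma^2 I) of Eq. (2.7)."""
    N = y.shape[0]
    C = K_N + sigma**2 * np.eye(N)       # covariance of the marginal over y
    L = np.linalg.cholesky(C)            # O(N^3) factorization
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # alpha = C^{-1} y
    # log N(y; 0, C) = -1/2 y^T C^{-1} y - 1/2 log|C| - (N/2) log(2 pi)
    return (-0.5 * y @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * N * np.log(2.0 * np.pi))

# Example usage with the toy data and kernel defined above:
# K_N = gaussian_kernel_matrix(X, c=1.0)
# print(log_marginal_likelihood(y, K_N, sigma=0.1))
```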