Conditioning a Gaussian Process - Adaptive Sampling for Marine Robotics

Once the mean and covariance functions are defined, GPs follow basic probability theory applied to multivariate Gaussian distributions. GPs can therefore be used in a Bayesian setting, where Bayes' rule updates the prior probability as more evidence or data becomes available. Bayesian updating from data relies on finding the conditional probability p(x|y), where x is a distinction of interest (e.g. temperature) and y is data (e.g. measurements from an AUV). From the rules of conditional probability, the posterior is obtained according to Bayes' rule

$$p(x \mid y) = \frac{p(x, y)}{p(y)} = \frac{p(x)\, p(y \mid x)}{p(y)} \tag{3.6}$$

where p(x) is the prior model for x, and p(y|x) is the likelihood function. The denominator p(y) is the marginal likelihood, a normalizing constant that can be found by marginalizing over x as p(y) = ∫ p(x) p(y|x) dx, or by using sums in discrete situations. The assessment of the posterior probability density function (pdf) p(x|y) in Eq. (3.6) depends on the choice of prior and likelihood model. For GPs this has a practical implication, as the posterior will be Gaussian if both the prior and likelihood are Gaussian. In this setting, a tractable expression for the posterior can be found, which is presented in Eqs. (3.10) and (3.11).
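As a minimal sketch of Eq. (3.6) in the discrete case, the marginal likelihood p(y) becomes a sum over candidate values of x. The temperature grid, prior weights, observation, and noise level below are made-up illustration values, not taken from the survey, and the function name `discrete_posterior` is hypothetical.

```python
import numpy as np

def discrete_posterior(prior, likelihood):
    """Bayes' rule on a discrete grid: p(x|y) = p(x) p(y|x) / p(y)."""
    prior = np.asarray(prior, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)
    joint = prior * likelihood      # p(x) p(y|x) for each candidate x
    marginal = joint.sum()          # p(y): sum over x replaces the integral
    return joint / marginal, marginal

# Assumed example: candidate temperatures with a prior centred on 6.0 degrees.
x_grid = np.array([5.0, 5.5, 6.0, 6.5, 7.0])
prior = np.array([0.1, 0.2, 0.4, 0.2, 0.1])

# Gaussian measurement likelihood p(y|x) for a single observation (assumed values).
y_obs, noise_std = 6.4, 0.3
likelihood = np.exp(-0.5 * ((y_obs - x_grid) / noise_std) ** 2) / (
    noise_std * np.sqrt(2.0 * np.pi)
)

posterior, p_y = discrete_posterior(prior, likelihood)
```

The posterior weights sum to one by construction, and the observation shifts probability mass from the prior mode towards the grid value closest to the measurement.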

3.2.1 Example: Using Gaussian Processes

To illustrate how GPs can be applied to sampling in practice, this section demonstrates a short example of model fitting and prediction using ocean model data. The model data is simulated surface temperature from a coastal area in Norway (Froan, Trøndelag). An imagined data collection survey with an AUV is the basis for illustrating conditioning (assimilation of data) within the GP framework.

We start by modeling the prior mean function using model output of surface temperature (at time 12:00 PM), presented in Fig. 3.2b. The mean function μ(s_i) is found using 2D linear regression on the temperature data (see Fig. 3.2a), yielding the regression vector β = [5.42, 0.0026, 0.0057]. This gives a prior estimate of the temperature for any east-north location, shown as predicted values for μ in Fig. 3.3a. The covariance function cov(x(s_i), x(s_j)) is set to the squared exponential kernel. Consider then the GP given by

$$\mu(s_i) = 5.40 + 0.0026\, e_i + 0.0058\, n_i, \tag{3.7}$$

$$\operatorname{cov}(s_i, s_j) = \sigma^2 e^{-\delta \lVert s_i - s_j \rVert}, \tag{3.8}$$

where s_i = (e_i, n_i) indicates location i = 1, ..., n. In the covariance function, σ² and δ denote design parameters (hyperparameters) for variance and correlation distance, while ||s_i − s_j|| is the Euclidean distance between two sites s_i and s_j. To obtain the correct correlation range δ, a variogram analysis is conducted using multiple realizations of the surface temperature data from the ocean model. A variogram is a plot constructed to relate the spatial distance between points to the variance between their values. The formal definition follows the relation

$$\gamma(h) = \tfrac{1}{2}\operatorname{Var}\big(x(s_i) - x(s_j)\big) = \operatorname{Var}\big(x(s)\big) - \operatorname{Cov}\big(x(s_i), x(s_j)\big),$$

where h is the lag vector (distance). Typically, as the lag distance h increases, the variance increases until a limit is reached and the variogram flattens out. At this limit, the points no longer yield any relation based on the data value and the variance can no longer grow, indicating the correlation range δ. The variogram for the ocean model data (many realizations covering one month) is displayed in Fig. 3.3b.

Figure 3.2: (3.2a) 2D regression of the simulated surface temperature; note the fitted 2D plane. (3.2b) The ocean model data showing the surface temperature used as ground-truth, and the simulated AUV survey (dashed line).
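Returning to the prior in Eqs. (3.7) and (3.8), the plane fit and the covariance matrix can be sketched in NumPy as follows. This is an illustration under stated assumptions: the grid, the synthetic temperature field, and the value of δ are placeholders rather than the ocean model output, the function names are hypothetical, and the kernel follows the form of Eq. (3.8) as written.

```python
import numpy as np

def fit_plane(east, north, temp):
    """Least-squares fit of temp ~ b0 + b1*east + b2*north; returns [b0, b1, b2]."""
    X = np.column_stack([np.ones(len(east)), east, north])
    beta, *_ = np.linalg.lstsq(X, temp, rcond=None)
    return beta

def exp_kernel(sites, sigma2, delta):
    """Matrix with entries sigma2 * exp(-delta * ||s_i - s_j||), cf. Eq. (3.8)."""
    diff = sites[:, None, :] - sites[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)   # pairwise Euclidean distances
    return sigma2 * np.exp(-delta * dist)

# Synthetic stand-in for the ocean model field (NOT the real data):
# a plane with the coefficients of Eq. (3.7) plus small noise.
rng = np.random.default_rng(0)
e, n = np.meshgrid(np.arange(0.0, 100.0, 10.0), np.arange(0.0, 100.0, 10.0))
east, north = e.ravel(), n.ravel()
temp = 5.40 + 0.0026 * east + 0.0058 * north + 0.01 * rng.standard_normal(east.size)

beta = fit_plane(east, north, temp)             # regression vector
mu = beta[0] + beta[1] * east + beta[2] * north  # prior mean at every site

sites = np.column_stack([east, north])
Sigma = exp_kernel(sites, sigma2=0.035, delta=0.05)  # delta is an assumed value
```

With real model output, `temp` would be replaced by the gridded surface temperature, and `delta` by a value chosen from the variogram analysis.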

Figure 3.3: (3.3a) The prior predicted temperature values in μ, before any observations are made. (3.3b) The one-month variogram for the ocean surface temperature data; the horizontal axis is the distance between cells in the ocean model (h).
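An empirical variogram of the kind shown in Fig. 3.3b can be estimated by binning site pairs by distance and averaging half the squared differences in each bin. The sketch below is one common estimator, not necessarily the procedure used for the figure; it assumes the mean has already been removed from the values (so that half the mean squared difference estimates γ(h)), and the synthetic field and bin edges are illustration values.

```python
import numpy as np

def empirical_variogram(sites, values, bin_edges):
    """Semivariance 0.5 * mean((x_i - x_j)^2) over site pairs in each lag bin."""
    i, j = np.triu_indices(len(sites), k=1)              # every pair once
    lag = np.linalg.norm(sites[i] - sites[j], axis=1)    # pair distances
    half_sq = 0.5 * (values[i] - values[j]) ** 2
    which = np.digitize(lag, bin_edges) - 1              # bin index per pair
    gamma = np.full(len(bin_edges) - 1, np.nan)          # NaN marks empty bins
    for b in range(len(bin_edges) - 1):
        in_bin = which == b
        if in_bin.any():
            gamma[b] = half_sq[in_bin].mean()
    return gamma

# Synthetic zero-mean field drawn from an exponential covariance (assumed values).
rng = np.random.default_rng(1)
sites = rng.uniform(0.0, 100.0, size=(200, 2))
dist = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
cov = 0.035 * np.exp(-dist / 20.0) + 1e-8 * np.eye(len(sites))  # jitter for stability
field = np.linalg.cholesky(cov) @ rng.standard_normal(len(sites))

edges = np.arange(0.0, 101.0, 10.0)
gamma = empirical_variogram(sites, field, edges)
```

Plotting `gamma` against the bin centres gives a curve whose plateau (sill) suggests a value for σ² and whose flattening point suggests the correlation range. Averaging over many realizations, as done for the one-month data, reduces the noise of a single-field estimate.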

The variogram curve in Fig. 3.3b indicates a correlation distance of approx. 5-7 km (50-60 h). The variance σ² can be set using Eq. (3.4), or to an average value. Here we use an average value of ∼0.035 °C² based on the variogram. The prediction in Fig. 3.3a, found from Eq. (3.7), now constitutes a prior estimate of the environment (which we will refer to as the world model in Section 4.3). We now proceed to simulate an AUV survey, using observations from the ocean model output (at time 12:00 PM), shown in Fig. 3.2b; observations are made as location-value pairs. By assimilating these observations into the GP prior we can produce an updated GP posterior that can render updated values at all locations (including unobserved ones). The interpolated values are obtained from predictions delivered by the covariance function, which predicts the functional value at a given point as a weighted average of the values in the neighborhood of the point. Let prior be the prior function values at s_i based on our prior temperature function μ(s_i) = 5.40 + 0.0026 e_i + 0.0058 n_i, and data be the measurements from the AUV that we want to assimilate. Using a matrix representation, there are four important matrices to consider in this regard, namely

prior: μ = μ(s_i), for all locations i = 1, ..., n.

observation matrix: F = m × n matrix with fixed entries (0s and 1s) indicative of the survey design; m is the number of observations/measurements.

data: y = Fx + ε, where x is a process (ocean model), with Gaussian measurement noise ε ∼ N(0, T), and T = τ²I, where τ can be set manually.

covariance: Σ = cov(s_i, s_j), for all location pairs i = 1, ..., n and j = 1, ..., n.
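The four quantities above can be assembled for a small grid as in the following sketch. The 5 × 5 grid, the diagonal transect, τ, and δ are assumed illustration values, the helper `observation_matrix` is a hypothetical name, and the "true" field is drawn from the prior only to generate example data.

```python
import numpy as np

def observation_matrix(sampled_idx, n):
    """m x n matrix F with a single 1 per row, selecting the sampled sites."""
    m = len(sampled_idx)
    F = np.zeros((m, n))
    F[np.arange(m), sampled_idx] = 1.0
    return F

# Small east-north grid (assumed), flattened to n sites.
side = 5
e, nn = np.meshgrid(np.arange(side, dtype=float), np.arange(side, dtype=float))
sites = np.column_stack([e.ravel(), nn.ravel()])
n = len(sites)

# prior: mean from Eq. (3.7)
mu = 5.40 + 0.0026 * sites[:, 0] + 0.0058 * sites[:, 1]

# covariance: kernel of Eq. (3.8) with sigma^2 = 0.035 and an assumed delta
dist = np.linalg.norm(sites[:, None, :] - sites[None, :, :], axis=-1)
Sigma = 0.035 * np.exp(-0.05 * dist)

# observation matrix: a diagonal transect across the grid (assumed survey design)
sampled_idx = np.array([0, 6, 12, 18, 24])
F = observation_matrix(sampled_idx, n)
m = len(sampled_idx)

# data: y = F x + eps, with eps ~ N(0, T) and T = tau^2 I
tau = 0.05                                  # assumed measurement noise std
T = tau**2 * np.eye(m)
rng = np.random.default_rng(0)
chol = np.linalg.cholesky(Sigma + 1e-10 * np.eye(n))
x_true = mu + chol @ rng.standard_normal(n)  # stand-in for the ocean model
y = F @ x_true + tau * rng.standard_normal(m)
```

Because each row of F contains a single 1, the product F μ simply picks out the prior mean at the sampled sites.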

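Using these matrices we can set up a joint Gaussian model as

$$p(\text{prior}, \text{data}) = N\left( \begin{bmatrix} \text{prior} \\ \text{data} \end{bmatrix};\ \begin{bmatrix} \mu \\ F\mu \end{bmatrix},\ \begin{bmatrix} \Sigma & \Sigma F^T \\ F\Sigma & F\Sigma F^T + T \end{bmatrix} \right). \tag{3.9}$$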

The Gaussian posterior solution (ref. Eq. (3.6)) is defined by the conditional mean and covariance (posterior) as

$$\mu_{\text{posterior}} = \mu + \Sigma F^T (F\Sigma F^T + T)^{-1} (y - F\mu), \tag{3.10}$$

$$\Sigma_{\text{posterior}} = \Sigma - \Sigma F^T (F\Sigma F^T + T)^{-1} F\Sigma, \tag{3.11}$$

where Fμ is the prior temperature prediction at the sampled locations μ(s_k). Note that the posterior covariance Σ_posterior is reduced in comparison to the prior covariance, since the update equation subtracts a positive semidefinite term representing the information gained from the new observations. Another important point is that the GP update requires inversion of the m × m covariance matrix (FΣF^T + T), which can be computationally expensive. This can be a drawback for GP models; for high-dimensional problems (i.e. many observations or points), sparse approximations need to be used, see e.g. Vanhatalo et al. (2010).
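Equations (3.10) and (3.11) translate almost directly into NumPy. The sketch below is a minimal version that solves a linear system instead of forming the inverse explicitly; the function name `gp_condition` is hypothetical, and the six-site transect, noise level, and measurements are made-up illustration values.

```python
import numpy as np

def gp_condition(mu, Sigma, F, T, y):
    """Posterior mean and covariance of Eqs. (3.10)-(3.11)."""
    S = F @ Sigma @ F.T + T                   # m x m data covariance
    # K = Sigma F^T S^{-1}. S is symmetric, so K^T = S^{-1} F Sigma,
    # which is obtained from a linear solve rather than an explicit inverse.
    K = np.linalg.solve(S, F @ Sigma).T
    mu_post = mu + K @ (y - F @ mu)
    Sigma_post = Sigma - K @ F @ Sigma
    return mu_post, Sigma_post

# Assumed example: six sites on a line, two of them measured.
coords = np.arange(6, dtype=float)
dist = np.abs(coords[:, None] - coords[None, :])
Sigma = 0.035 * np.exp(-0.5 * dist)           # kernel form of Eq. (3.8)
mu = np.full(6, 5.4)                          # flat prior mean for simplicity

F = np.zeros((2, 6))
F[0, 1] = 1.0                                 # first measurement at site 1
F[1, 4] = 1.0                                 # second measurement at site 4
T = 0.01**2 * np.eye(2)                       # tau = 0.01 (assumed)
y = np.array([5.6, 5.3])                      # made-up measurements

mu_post, Sigma_post = gp_condition(mu, Sigma, F, T, y)
```

With small measurement noise the posterior mean at the measured sites is pulled almost all the way to the data, and the posterior variance there drops to roughly τ². Unmeasured sites are updated in proportion to their covariance with the measured ones. The cost of the solve grows cubically with the number of measurements m, which is the expense the text refers to.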

Using Eqs. (3.10) and (3.11), the updated results (conditional mean and variance) can be seen in Fig. 3.4. The sampled locations are shown as a dashed line in Fig. 3.2b, illustrating the AUV path. The GP is conditioned on the data along this line to obtain Fig. 3.4a. Comparing Fig. 3.4a to Fig. 3.2b, the GP has updated the prior mean field in Fig. 3.3a, into
