Spatio-temporal modelling of infectious diseases using the ensemble Kalman model

(1)

Mina SpremicSpatio-temporal modelling of infectious diseases using the ensemble Kalman model NTNU Norwegian University of Science and Technology Faculty of Information Technology and Electrical Engineering Department of Mathematical Sciences

Master ’s thesis

Spatio-temporal modelling of infectious diseases using the ensemble Kalman model

Master’s thesis in Applied Physics and Mathematics Supervisor: Karl Henning Omre

July 2020

(2)

(3)

Spatio-temporal modelling of

infectious diseases using the ensemble Kalman model

Master’s thesis in Applied Physics and Mathematics Supervisor: Karl Henning Omre

July 2020

Norwegian University of Science and Technology

Faculty of Information Technology and Electrical Engineering Department of Mathematical Sciences

(4)

(5)

This thesis is submitted as a requirement in TMA4900 Industrial Mathematics Master Thesis at the Department of Mathematical Sciences, and completes my Master of Science degree in Applied Physics and Mathematics.

Following my ever present interest in medical applications of statistics, I decided to write a Master Thesis on epidemic modelling during the autumn of 2019, unaware of the com- ing pandemic events. The Norwegian Public Health Institute(FHI) provided data from the 2016-2019 influenza seasons and promised support during my thesis work. The sit- uation changed dramatically during the winter of 2020. NTNU was closed down with online thesis supervision and more importantly, my contact persons at FHI got heavily occupied. NTNU thesis supervision was less active than usual for some weeks, and my supervisor and myself decided to avoid further contact with our resource persons at FHI.

Instead, inspired by the circumstances, the focus of the study was shifted from seasonal influenza to a more general case of an infectious disease, and we decided to use synthetic data in this study.

Many hours went into implementing the code lying behind the results presented in the thesis. Despite the challenging circumstances, rather limited prior knowledge of the topic, and having felt rather frustrated and helpless at times, working on this study has been motivational and fun.

I wish to thank my supervisor, Henning Omre, for all the help and guidance throughout the whole year, and for being understanding and encouraging. It was inspiring and motivational working with him as my supervisor.

I wish to extend my thanks to Gry Marysol Grøneng, for being helpful and welcoming during my visits to Oslo in the autumn and providing me with the influenza dataset, and to Solveig Engebretsen at NR for pointing me in the right direction.

I am forever grateful to my family for their endless support and love, and for having worked hard and taking the biggest chance of their life so that I could have the oppor- tunity to be where I am, doing what I like.

Finally, thanks to Martin for believing in me when I did not.

Mina Spremić,

Trondheim, July 2020

i

(6)

(7)

In this study, we model the spatio-temporal spread of an infectious disease, using Bayesian inversion, unifying a data assimilation method, the spatio-temporal ensemble Kalman model with the infectious disease model. The variables of interest are the infected event counts and intensity. We present a necessary background on the Bayesian inversion and spatio-temporal models. The infectious disease model is presented. We connect the infectious disease model and the ensemble Kalman model in order to use available observations in the modelling of the spread of the infectious disease. The unified model is demonstrated on three different synthetic examples; a temporal, a regular spatio- temporal and a spatio-temporal with train station example, testing the model limits.

Both ensemble Kalman filter and smoother are demonstrated for all three examples.

The true scenario, prior model, and the posterior model, made up of the filtered and smoothed posterior distributions are presented for each of the examples. The ensemble Kalman model performs satisfactory for both the temporal and spatio-temporal example, showing promising results. The train station example has proven to be somewhat of a challenge, and a revision of the model would be necessary for improving the results.

(8)

(9)

I denne oppgaven modellerer vi spredningen av en smittsom sykdom ved å bruke Bayesiansk inversjon og forener rom-tid modellen med modellen for spredningen av smittsomme sykdommer. Variablene av interesse er hendelser og intensiteten av smittede tilfeller. Vi presenterer den nødvendige bakgrunnen for Bayesiansk inversjon og rom-tid modeller, i tillegg til modellen for spredningen av smittsomme sykdommer. Vi forener modellen for spredningen av smittsomme sykdommer med ensemble Kalman modellen for å kunne bruke de tilgjengelige observasjonene i modelleringen av spredningen i rom-tid. Den forente modellen er demonstrert på tre syntetiske eksempler. Disse er: et eksempel i tid, rom-tid og rom-tid med togstasjon. Det sanne scenarioet, a priori modellen og a posteriori modellen er presentert for hvert av eksemplene. Ensemble Kalman modellen viser lovende resultater for både eksemplet i tid og rom-tid. Eksemplet i rom-tid med togstasjon har vist seg å være mer utfordrende enn de to andre, og en revidering av modellen kan være nødvendig for å kunne forvente forbedring av resultatene.

(10)

(11)

Foreword i

Abstract iii

Sammendrag v

1 Introduction 1

1.1 Literature review . . . 2

1.2 Notation . . . 3

2 Influenza dataset 4 3 Methodology 11 3.1 Bayesian inversion . . . 11

3.2 Bayesian spatio-temporal models . . . 12

3.2.1 Discrete hidden Markov model . . . 14

3.2.2 Traditional Kalman model . . . 14

3.2.3 Ensemble Kalman model . . . 17

4 Infectious disease model 20 4.1 Prior model . . . 22

4.2 Likelihood model . . . 25

5 Synthetic examples 27 5.1 Unified model . . . 27

5.2 Temporal example . . . 32

5.2.1 True scenario . . . 32

5.2.2 Likelihood model . . . 33

5.2.3 Prior model . . . 33

5.2.4 Posterior model . . . 34

5.2.5 Summary . . . 36

5.3 Spatio-temporal example . . . 41

5.3.2 Likelihood model . . . 45 vii

(12)

viii CONTENTS

5.3.3 Prior model . . . 45

5.3.5 Summary . . . 63

5.4 Train station example . . . 64

5.4.2 Likelihood model . . . 65

5.4.3 Prior model . . . 68

5.4.5 Summary . . . 84

5.5 Computational demands . . . 84

6 Conclusions and further work 86 A Additional results 87 A.1 Temporal example . . . 87

(13)

Introduction

Infectious diseases are caused by various types of microorganisms called pathogens.

Pathogens can either infect humans through direct or indirect transmission. Direct transmission happens through contact between humans via various means, for example if the disease is airborne, the contact is aerial, and the disease is spread through small particles in the air, see Rothman (2012). Indirect transmission of the disease happens through an intermediate host or a proxy.

The focus of the study is uniting the infectious disease model with a data assimilation method using the available observations to improve the predictions on the spread of the infectious disease. Using the ensemble Kalman model as the data assimilation method of choice, we demonstrate the combined model on three different examples with syntheti- cally generated observations. Starting off, we unite our infectious disease model with the ensemble Kalman model. This unified model is then demonstrated on a simple temporal example. We evaluate the performance of the ensemble Kalman filter and smoother and discuss the results. We follow up with a spatio temporal case, where we look at a populated grid. We generate synthetic observations and test our models on this, more complicated scenario. The filter and smoother are then evaluated and the results are discussed. Lastly we investigate whether the model can represent a more demanding scenario, with a train station placed in one of the grid nodes, providing a place for the infection to dramatically increase. We give en evaluation and discussion of the results.

In the Chapter 2 we motivate with an example of a real dataset of an infectious disease, namely seasonal influenza which also illustrates the spatio-temporal spread of the disease. Chapter 3 includes necessary background on Bayesian inversion and relevant spatio-temporal models. The infectious disease model is presented in Chapter 4. The unified model, alongside the three synthetic examples mentioned above are presented in Chapter 5. Lastly, Chapter 6 concludes this study.

In the rest of the introduction we present a brief literature review and introduce some necessary notation used throughout the study.

1

(14)

2 CHAPTER 1. INTRODUCTION

1.1 Literature review

In infectious disease epidemiology, there are many interesting questions to be answered;

can we forecast the peak of an upcoming influenza season? What factors have influenced the spread of the H1N1 pandemic in 2009? Can different preventive measures change the course of an epidemic or a pandemic, and in what way? How does a particular infectious disease spread? The questions are many, and they are all important both for understanding the behaviour of diseases and preparing for the epidemics to come.

According to WHO (2019), unknown pathogens that can lead to a pandemic are currently one of the biggest threats to global health. Therefore it is important to have good answers to questions related to dynamics of infectious diseases, their origin and the impact they can have on our society, and how we can be better prepared.

Focusing on the modelling of infectious diseases, more specifically epidemic modelling, we see that approaches and methods are many. The most famous epidemic model is the SIR model, introduced by Kermack et al. (1927). Here, SIR stands for susceptible, infected and recovered, representing three different groups or compartments an individual can belong to, depending on whether the person can be infected, contracts the disease, or has recovered from the disease. The SIR model can be deterministic, where changes in each compartment are represented with a differential equation, or it can be formulated in a stochastic manner, using probability distributions. Since then, different variations and expansions of the SIR model have been introduced in order to better represent different infectious diseases with different mechanisms. One of the examples is the SIS model, where one assumes that immunity is not possible, and once an individual has recovered from a diseases, she is once again susceptible to being infected. Discrete-time stochastic and deterministic SIS and SIR models are compared in Allen et al. (2000). Another example is the SEIR model, where an additional compartment is added, namely the exposed compartment, representing the incubation period, see for example Rothman (2012), when the individual has contracted a disease but has not yet started showing the symptoms and is not yet infectious. Finkenstädt et al. (2000) use a combination of discrete-time SEIR model and time-series for a case of measles, termed tSIR model, where "t" stands for time-series. Höhle (2016) presents examples of continuous time, discrete time and spatio-temporal models, including the above-mentioned tSIR model.

As mentioned, there are different versions of the stochastic SIR model, depending on how the temporal aspect of the model is treated, meaning whether it is discrete or continuous.

Originally, the SIR model considers continuous time dimension, but in cases where real data is available, it could be more appropriate to consider a discrete time model. In such cases, in addition to the SIR model, epidemic models like Reed-Frost and chain binomial are frequently used. One of the main differences lies in the manner the duration of the infectious period is treated. In general stochastic epidemic it is modelled as exponential, while in Reed-Frost and chain-binomial models it is considered constant and the same for all individuals, see more in Britton (2019). Setting aside the studies using the fre- quentist approach, the studies using the Bayesian approach, with filtering methods used to obtain the posterior distribution are not many. Chretien et al. (2014) compare differ-

(15)

ent filtering approaches in retrospective forecasting of influenza. Ong et al. (2010) and Saito et al. (2013) both focus on modelling of H1N1 pandemic in Singapore and Japan, although with different approaches. In Ong et al. (2010) the SEIR model and particle filter are used for real time epidemic monitoring of the H1N1 in 2009 in Singapore, while in Saito et al. (2013) a stochastic SEIR is used to model infection transmission between populations of two regions in Japan during the H1N1 epidemic.

We would like to remind the reader that these are only some of the existing studies, and that there are many aspects related to modelling of the infectious diseases that are important and insightful, and that the studies chosen here are the ones considered to be relevant with the respect to our study.

1.2 Notation

In this section, we introduce the necessary base notation that is used throughout the study. Vectors are represented with bold lower-case letters, v, and matrices are denoted with bold capital letters,A. Lower-case Latin letters are used to denote variables, while Greek letters are used for model parameters.

We define {(x, t);x ∈ D ⊂ R², t ∈ T ⊂ R^L} as the spatio-temporal reference of the variable of study. The probability density function of a continuous random variable r is referred to as the pdf, and denoted with p(r). The same notation is used for the probability mass function, referred to as pmf, of a discrete random variable r. Writing r ∼ p(r) implies that r is distributed according to the pdf/pmf. The distribution of a multivariate Gaussian random n-vector r is denoted with r ∼ φn(r;µ,Σ), where µ is an n-vector of expectation values and Σ is an (n×n) covariance matrix. With in, we denote ann-vector of ones and with I_n, an (n×n) identity matrix.

(16)

Chapter 2

Influenza dataset

In Norway, Norwegian Institute for Public Health (NIPH) monitors influenza. The influenza dataset presented in this chapter is provided by NIPH, based on observations from the Norwegian Syndromic Surveillance System (NorSSS). The observations are weekly, accumulated number of influenza like illness (ILI) diagnosis made by general practitioners (GPs) on a municipality level. Keep in mind that the number of diagnosed individuals does not equal the actual number of people who contracted influenza during the period.

This is due to influenza having very general symptoms that are common for different diseases. Most common diseases diagnosed under ILI are common cold and influenza.

To be able to confirm an influenza case, one needs to run a blood test and analyse it in a laboratory. In the study, we consider the seasonal influenza as one entity, and do not differentiate between subtypes A and B.

The dataset includes the number of consultations resulting in a (ILI) diagnosis for a municipality for each age group and gender, in addition to the total weekly number of consultations per GPs office. The age group division is as follows: 0-4 years, 5-14 years, 15-64 years and 65+ years. The dataset includes information on whether the consultation made is a direct consultation, meaning the patient came for an appointment, or made by phone. The eight regions of eastern Norway, are represented in the dataset for the period 2016-2019. Children of school age and young adults are less likely to report to the GPs office in case of ILI symptoms, hence we expect under-reporting for the corresponding age groups. It is important to be aware of the possibility of mis-diagnosing of influenza, since, as mentioned above, ILI is not a diagnosis of influenza, and can in reality be another disease.

In the visualization of the data, we have excluded Oslo for illustration purpose. Due to much denser population compared to the other municipalities and counties, considerably more cases get registered, hence we consider it a special case. Note that the population size in the regions and municipalities is not taken in account when presenting data.

Figure 2.1 shows the spatial spread of ILI cases in municipalities of eastern Norway, for the influenza season 2016-2017, lasting from November through March. We notice that the number of ILI cases peaks in January, and by March the number of registered cases sinks considerably.

4

(17)

November December January

February March

0 500 1000 1500

ILI Cases Monthly number of ILI cases for eastern Norway,

Nov−Mar 2016−2017

Figure 2.1: Monthly number of registered ILI cases for municipalities in eastern Norway for influenza season 2016-2017.

In Figure 2.2 the monthly number of ILI cases for regions of eastern Norway is presented. We see that the number of influenza cases reaches its peak in January, which corresponds to what we have observed in Figure 2.1. The highest number of cases registered, over 6000 individuals with ILI, is in Akershus region.

Comparing the 2017-2018 influenza season for municipalities of eastern Norway, shown in Figure 2.3, to the 2016-2017 season, we observe that the number of registered ILI cases is much lower in November and December, with peak in both January and February. The number starts decreasing in March, but is still significantly higher than the number of registered cases in the previous season.

The difference between the subsequent seasons is even more visible when we look at

(18)

6 CHAPTER 2. INFLUENZA DATASET

February March

0 2000 4000 6000 8000

Nov−Mar 2016−2017

Figure 2.2: Monthly number of registered ILI cases for regions of eastern Norway for influenza season 2016-2017.

the accumulated number of registered ILI cases for regions, where season 2017-2018 is shown in Figure 2.4. The peak of the season is spread out over two months, with visible decrease in numbers of registered cases in March, as observed in Figure 2.3.

Weekly ILI cases for the municipality of Oslo, for the influenza season 2017-2018, are shown in Figure 2.5. There are few cases in week 45 through 52, after which the number of registered cases abruptly increases. The peak is at around week 5, which is the transition between January and February. Finally the number decreases quite sharply from week 9 through 12. In Figure 2.6, percentage of ILI cases out of total number of consultations at GPs office is shown. We observe at its peak, ILI cases made up 4% of total consultations in the municipality of Oslo.

(19)

February March

0 500 1000 1500

Nov−Mar 2017−2018

Figure 2.3: Monthly number of registered ILI cases for municipalities in eastern Norway for influenza season 2017-2018.

As the municipality of Oslo is considered an outlier, we compare it to the less populated municipality of Hamar, located in Hedmark region. From Figure 2.7, we imme- diately notice the difference in the highest weekly number of registered cases, which for Oslo is over 2000 while in Hamar it is slightly under 80. At its peak, ILI cases made up approximately 2.75% out of total number of consultations at GPs office, as seen in Figure 2.8.

(20)

February March

0 2000 4000 6000 8000

Nov−Mar 2016−2017

Figure 2.4: Monthly number of registered ILI cases for regions of eastern Norway for influenza season 2017-2018.

(21)

500 1000 1500 2000

44 45 46 47 48 49 50 51 52 00 01 02 03 04 05 06 07 08 09 10 11 12 13 Weeks

Number of ILI cases

Weekly ILI cases for municipality of Oslo, Nov−Mar 2017−2018

Figure 2.5: Weekly number of ILI cases for the municipality of Oslo, for period between November 2017 and March 2018.

1 2 3 4

44 45 46 47 48 49 50 51 52 00 01 02 03 04 05 06 07 08 09 10 11 12 13 Weeks

Percentage of ILI cases

Percentage of patients diagnosed ILI for municipality of Oslo, Nov−Mar 2017−2018

Figure 2.6: Percentage of ILI cases out of total weekly consultations, for the municipality of Oslo, for period between November 2017 and March 2018.

(22)

20 40 60 80

44 45 46 47 48 49 50 51 52 00 01 02 03 04 05 06 07 08 09 10 11 12 13 Weeks

Number of ILI cases

Weekly ILI cases for municipality of Hamar, Nov−Mar 2017−2018

Figure 2.7: Weekly number of ILI cases for the municipality of Hamar, for period between November 2017 and March 2018.

1 2

44 45 46 47 48 49 50 51 52 00 01 02 03 04 05 06 07 08 09 10 11 12 13 Weeks

Percentage of ILI cases

Percentage of patients diagnosed with ILI for municipality of Hamar, Nov−Mar 2017−2018

Figure 2.8: Percentage of ILI cases out of total weekly consultations, for the municipality of Hamar, for period between November 2017 and March 2018.

(23)

Methodology

We introduce the Bayesian inversion framework. We define prior, likelihood and posterior models. In the proceeding section, Bayesian spatio-temporal models of choice are introduced, namely, the hidden Markov model, the traditional Kalman model and the ensemble Kalman model.

3.1 Bayesian inversion

An inversion problem focuses on finding the real unavailable state, using observations related to the state with some noise. In Bayesian inversion, the inversion problems are formulated in a Bayesian framework. Bayesian inversion is used in fields of geophysics and image reconstruction, to name a few. The main focus in our study isr, the unknown variable of interest. It is known through the collected observations, d, which are related to the unknown state with some error. The objective of Bayesian inversion is on assessing [r|d], the unobservable state given the available observations,

[r|d]∼p(r|d) = [p(d)]⁻¹p(d|r)p(r) (3.1)

=const×p(d|r)p(r),

wherep(r|d) is the posterior model,p(d|r) is the likelihood model and p(r) is the prior model. The normalizing constant, const, is usually computer demanding to calculate.

We defineras a spatio-temporal variable, denoted byr= [r₁, . . . ,r_T], where each of the elements is a spatial vector rt, represented on a grid at timet.

We proceed to define the prior, likelihood and posterior models.

Usually, one has some prior knowledge of the problem, as well as other valuable information that would be useful to include in the model. This can, for example, be information about the hidden states that originates from expert knowledge of the field about the behaviour of the state. This prior information can be included in the prior model, which is denoted with

r∼p(r), (3.2)

11

(24)

12 CHAPTER 3. METHODOLOGY

whererrepresents a spatio-temporal variable{r(x, t);x∈ D, t∈ T }.

The likelihood model represents the link between the observationsdand the variable of interestr,

[d|r]∼p(d|r). (3.3)

As mentioned previously, the aim of the Bayesian inversion is assessing the posterior distribution, p(r|d). The posterior model includes the previously discussed prior and likelihood models, and is then given as,

p(r|d) =const×p(d|r)p(r), (3.4)

where const= [p(d)]⁻¹ = [R

p(d|r)p(r)dr]⁻¹.

3.2 Bayesian spatio-temporal models

In this section, we introduce some Bayesian spatio-temporal models, discretized on {(x, t);x ∈ LD, t ∈ LT}, represented by r = (r1, . . . ,rT), where rt is the discretized spatial variable at timet∈ L_T. The observations appear at d= (d1, . . . ,dT).

We assumer= (r₁, ...,r_T)to be following a Markov chain. The main characteristic of a Markov chain is its memoryless property, also called the Markov property. This means that the current state only relies on previous k states. We call such a Markov chain a k-th order Markov chain. When k = 1, we have a first order Markov chain, which is defined asp(r_t|r_t−1, ...,r₁) =p(r_t|r_t−1).

Consequently, the prior model is defined as, p(r) =p(r1)

T

Y

t=2

p(rt|rt−1) (3.5)

[r_t|r_t−1] =ω_t(rt−1,^r|r_t )∼p(r_t|r_t−1),

with ωt(·) :R²ⁿ → Rⁿ and an arbitrary random n-vector, ^r|r_t . Here, p(r1) is a known pdf of the initial state.

Assuming observationsd are conditionally independent givenr, as well as single-site response, the likelihood model is defined as,

p(d|r) =

T

Y

t=1

p(d_t|r) =

T

Y

t=1

p(d_t|r_t) (3.6)

[d_t|r_t] =ν_t(r_t,^d|r_t )∼p(d_t|r_t),

(25)

r₁ r₂ r_T rT+1

d₁ d₂ d_T

Figure 3.1: DAG of a first order hidden Markov model. Here,r_t,witht= 1, . . . , T+ 1, is an unobserved state, while d_t, t= 1, . . . , T represents the observed state.

withνt(·) :R²ⁿ→Rⁿ and an arbitrary random vector m-vector, ^d|r_t . The likelihood of an observation at a time tis independent of hidden variables and observations from the previous time steps, and only a function of the current state rt.

Finally, the goal is to find the posterior model to be able to predict the hidden variables of interest. The hidden Markov model, illustrated through a direct acyclic graph in Figure 3.1, is used for parameter inference and prediction of such hidden states.

The posterior model is defined as, p(r|d) = p(d|r)×p(r)

p(d) (3.7)

=const×p(r1)p(d1|r₁)

T

Y

t=2

p(dt|r_t)p(rt|r_t−1)

=p(r₁|d)

T

Y

t=2

p(r_t|r_t−1,d_t:T),

where the constant is the inverse of the marginal likelihood,p(d), which is necessary for finding the posterior distribution. However, determining this constant can be quite computationally demanding. Note that in this formulation of the posterior model, the posterior temporal transition matrix is only dependent on the observations from the current step and later, see Moja et al. (2019) for more details.

The marginal distribution, p(d;θ_l,θ_p), is also accessible through Bayes’ rule,

p(r|d;θl,θp) = [p(d;θl,θp)]⁻¹p(d|r;θl)p(r;θp), (3.8) where [p(d;θl,θp)]⁻¹ = R

p(d|r;θl)p(r;θp)−1

is the constant from 3.4, while θl and θ_p are unknown likelihood and prior parameters. The marginal distribution can then be maximized to estimate the unknown parameters,

[ˆθl,θˆp] =argmax_θ[p(d;θl,θp)], (3.9) whereθˆl,θˆp are estimates of the unknown likelihood and prior parameters.

We proceed to introduce different models that can be used to obtain the posterior distribution.

(26)

14 CHAPTER 3. METHODOLOGY 3.2.1 Discrete hidden Markov model

Assume that the variable of interest r = (r1, . . . ,rT) takes on a finite set of discrete outcomes, hence rt ∈ Ωr : Nⁿ_[1,n_c_]. The prior model is specified by p(r1);r1 ∈ Ωr and p(r_t|r_t−1);r_t,rt−1 ∈ Ω_r, the latter expression being the temporal transition matrices.

For this case, the constant can be calculated by summation for arbitrary likelihood models,p(dt|r_t). Hidden Markov model can be implemented, and posterior distribution assessed, using recursive algorithms, most notable being the forward-backward algorithm, see Baum et al. (1970), Scott (2002). The posterior model can also be efficiently assessed by the Reverse algorithm, see Moja et al. (2018), given in Algorithm 1.

Algorithm 1 Reverse algorithm

• Last timeT: const=

"

P

r⁰_T∈Ωr

p(d_T|r⁰_T)p(r⁰_T|r_T₋₁)

#−1

, r_T−1∈Ωr

p(r_T|r_T₋₁,d_T) =const×p(d_T|r_T)p(r_T|r_T₋₁), r_T,r_T−1 ∈Ω_r

• Iterationt=T−1, . . . ,2:

const= P

r⁰_t∈Ωr

h_p(d

t|r⁰_t)p(rt+1|r⁰_t)p(r⁰_t|rt−1) p(rt+1|r⁰_t;d(t+1):T)

i−1

, rt−1 ∈Ωr

p(r_t|r_t−1,d_t:T) =const× ^p(d^t^|r_p(r^t^)p(r^t+1^|r^t^)p(r^t^|r^t−1⁾

t+1|r_t;d_(t+1):T) , r_t,rt−1 ∈Ω_r

• End iteration

• Initial time1:

const=

"

P

r⁰₁∈Ωr

p(d1|r⁰₁)p(r2|r1) p(r2|r⁰₁;d2:T)

#−1

, r₁ ∈Ω_r p(r₁|d_1:T) =const×^p(d_p(r¹^|r¹^)p(r¹^|r¹⁾

2|r₁;d2:T) , r₁ ∈Ω_r Then,

p(r|d) =p(r1|d)Q

p(rt|r_t−1,dt:T)

The algorithm provides all the factors in the last expression of Equation 3.7, which define the posterior model. Realizations from the posterior model can be generated by Algorithm 2.

Using an approach, similar to the Reverse algorithm, one can obtain p(d;θ), which can then be used to find the estimates of unknown model parameters.

3.2.2 Traditional Kalman model

Assume that the variable of interest r= (r1, . . . ,rT) takes continuous outcomes, hence rt ∈Ωr =Rⁿ, and so do the observations d = (d1, . . . ,d_T), hence dt ∈Ω_d=R^m. The traditional Kalman model, Kalman (1960), is defined for this case.

Kalman filter has its origin in engineering branches and has been developed and

(27)

Algorithm 2 Simulation algorithm

• Initialization:

r^s₁∼p(r1|d)

• Begin iteration t= 2, . . . , T: r^s_t ∼p(rt|r^s_t−1,d)

• End iteration Then,

r^s= (r^s₁, . . . ,r^s_T)∼p(r|d)

introduced by Kalman (1960). Since then it has been applied in many different fields, including prognostic health monitoring and signal processing; for more see Auger et al. (2013). Kalman filters are used to predict unobserved variable of interest, based on available observations.

The main assumptions in the traditional Kalman model are that the initial state is Gaussian, the relationship between rt and rt+1 is Gauss-linear, dt and rt have a linear relationship and error terms are additive and Gaussian. We define the prior distribution, p(r₁)∼φ_n(r₁;µ₁,Σ₁). (3.10) Thereafter, we define the observation and process models as a Gauss-linear model,

[r_t+1|r_t] =A_tr_t+η_t (3.11) [d_t|r_t] =H_tr_t+_t, (3.12) where the(n×n)-matrixA_tdefines the model propagation in time, andη_t∼φ_n(η_t;0,Σ^r|r_t ) is the Gaussian distributed process error with process error covariance matrixΣ^r|r_t , independent of r. The(m×n)-matrix, Ht is the observation operator that maps the hidden variable to the observations, and _t∼φ_m(_t;0,Σ^d|r_t ) is the Gaussian distributed observation error, with Σ^d|r_t being the observation error covariance matrix, independent of r.

Matrices Ht and At are assumed known, and may vary with time.

We write the forward model and the likelihood model as,

p(r_t+1|r_t) =φ_n(r;A_tr_t,Σ^r|r_t ) (3.13) p(dt|r_t) =φm(d;Htrt,Σ^d|r_t ). (3.14) Then the expression for joint distribution, at a time tcan be calculated,

(



 rt

d_t r_t+1



)∼p(r_t,d_t,r_t+1) =φ_n+m+n(



 µ^r_t µ^d_t µ^r_(t+1)



,







Σ^r_t Γ^rd_t Γ^rr_t(t+1) Γ^dr_tt Σ^d_t Γ^dr_t(t+1) Γ^rr_(t+1)t Γ^rd_(t+1)t Σ^r_(t+1)





). (3.15)

(28)

16 CHAPTER 3. METHODOLOGY From Equation 3.15, we can by recursion obtain the joint distribution, p(r,d).

The algorithm for assessing the joint Gaussian model under the traditional Kalman model is given in Algorithm 3.

Algorithm 3 Joint traditional Kalman model

• Initialization:

r1 ∼p(r1) =φ3n(µ₁,Σ1) µ^r₁ =µ₁

Σ^r₁ =Σ₁

• Begin iteration t= 1, . . . , T −1:

Likelihood model:

µ^d_t =H_tµ^r_t

Σ^r|d_t+1 =A_tΣ^r|d_t A^T_t +Σ^r|r_t Iterate s= 1, . . . , t−1:

Γ^rd_ts =Σ^r_tsH^T_t End iterate

Ift >1, for s= 1, . . . , t−1:

Σ^d_ts =HtΓ^rd_ts End iterate Forwarding model:

µ^r_t+1 =Atµ^r_t

Σ^r(t+1),(t+1) =AtΣ^r_ttA^T_t +Σ^r|r_t Iterate s= 1, . . . , t:

Σ^r_(t+1)s=A_tΣ^r_ts Γ^rd_(t+1)s=AtΓ^rd_ts End iterate

End iteration Then,

[r,d]∼p(r,d) =φ_n+m( r

d

; µ^r

µ^d

,

Σ^r Γ^rd Γ^dr Σ^d

)

As the full joint Gaussian model, p(r,d), is available through Algorithm 3, we can use it to obtain the filtered posterior distribution p(r_t|d_1:t) or the smoother posterior distribution p(r|d). The smoothed distribution can then be used to obtain the most probable sequence of states, as it utilizes all of the available observations.

The model is also dependent on model parameters,θ, which can be estimated through maximum marginal likelihood,

ˆθ=argmax_θ[p(d;θ].

(29)

3.2.3 Ensemble Kalman model

Assume that the variable and observations are rt∈Ωr=Rⁿ and dt∈Ωd=R^m, similar to the case for the traditional Kalman model. Let, however, the Gaussian and Gauss- linear assumptions of the traditional Kalman model be violated. Assume non-linear forward and likelihood models, as it is often more representative of a real world problem. The ensemble Kalman model, introduced by Evensen (1994), is suitable for solving high-dimensional problems with potential non-linear forward and likelihood models, see Myrseth et al. (2010).

We define the prior model in a general form,

r1 ∼p(r1). (3.16)

As the forward model can be assumed non-linear, we proceed with the general notation,

[rt+1|r_t] =ωt(rt,η_t)∼p(rt+1|r_t) (3.17) [dt|r_t] =νt(rt,t)∼p(dt|r_t),

whereωt(·) and νt(·)are known functions with Gaussian components,

η_t∼φn(η_t;0,In) (3.18)

t∼φm(t;0,Im).

The corresponding pdfs may not be available in functional form. Through ensemble Kalman filter, one represents ensembles containing approximate realizations of the distribution of the hidden state, along with the corresponding realizations of observations.

Hence, an ensemble, at timet, is formulated as,

e_t:{[r_t,d_t]^j;j= 1, . . . , n_e}, t= 1, . . . , T. (3.19) The corresponding covariance matrix is defined as,

Σ^rd_t =

Σ^r_t Γ^rd_t Γ^dr_t Σ^d_t

, (3.20)

which may be estimated from the ensemble at each time step t. The algorithm for the ensemble Kalman filter, used for obtaining p(rT|d), is given in Algorithm 4. Using the approximate sample from p(rT+1|d), one can empirically estimate forecast expectation

ˆ

µ_T₊₁= ˆE(r_T₊₁|d) and forecast varianceΣˆ_T₊₁ = ˆVar(r_T₊₁|d).

The ensemble Kalman smoother can be used for obtaining the posterior distribution p(r|d) for the full r, conditional on all the available observations. There are different approaches depending on whether the likelihood or forward model are Gauss-linear or not. Similarly to the Algorithm 3, one can compute necessary covariance matrices in each iteration to gain the joint distribution and have access to smoothed, filtered and forecast distributions, as shown in Algorithm 5. We define the accumulated ensemble, containing

(30)

18 CHAPTER 3. METHODOLOGY

Algorithm 4 Ensemble Kalman filter

• Initialization:

ne - ensemble size

r^j₁ ∼p(r₁), j= 1, . . . , n_e ^j₁∼φ_m(;0,I_m), j = 1, . . . , n_e d^j₁ =ν₁(r^j₁,^j₁), j= 1, . . . , n_e e1 :{[r₁,d1]^j;j= 1, . . . , ne}

• Begin iteration t= 1, . . . , T −1:

Conditioning step:

EstimateΣ^rd_t fromn_e →Σˆ^rd_t

r^r|d,j_t =r^j_t+ ˆΓ^rd_t [ ˆΣ^d_t]⁻¹(dt−d^j_t), j= 1, . . . , ne

e_t:{[r^r|d_t ,d_t]^j;j= 1, . . . , n_e} Forwarding step:

η^j_t ∼φn(η;0,In), j= 1, . . . , ne

r^j_t+1 =ω_t(r^r|d,j_t ,η^j_t), j = 1, . . . , n_e ^j_t+1∼φ_m(;0,I_m), j = 1, . . . , n_e d^j_t+1=νt+1(r^j_t+1,^j_t+1), j= 1, . . . , ne

• End iteration

EstimateΣ^rd_T fromn_e →Σˆ^rd_T

r^r|d,j_T =r^j_T + ˆΓ^rd_T [ ˆΣ^d_T]⁻¹(d_T −d^j_T), j= 1, . . . , n_e η^j_T ∼φn(η;0,In), j = 1, . . . , ne

r^j_T₊₁ =ω_t(r^r|d,j_T ,η^j_T), j= 1, . . . , n_e Then,

{r^j_T₊₁;j= 1, . . . , ne}is approximate sample fromp(rT+1|d)

conditioned variable from the current time step in addition to all of the previous time steps,

E^r_0:t:{r^r|d,j_0:t ;j= 1, . . . , n_e}, t= 1, . . . , T. (3.21) The corresponding covariance matrix is defined as,

Σ^rd_0:t=

Σ^r_0:t Γ^rd_0:t Γ^dr_0:t Σ^d_t

, (3.22)

and is estimated from accumulated and observation ensembles.

To make parameter inference, the model parametersθ can be included in the ensemble, r^∗_t =

r_t θ

. (3.23)

It is necessary to provide priors for the parameters,p(θ). Through an algorithm, similar to Algorithm 4, one can obtain the approximate samples of model parameters from p(θ|d).

(31)

Algorithm 5 Ensemble Kalman Smoother

• Initialization:

ne - ensemble size

r^j₁∼p(r1), j= 1, . . . , ne

E^r₁ :{r^j₁;j = 1, . . . , ne}

^j₁ ∼φ_m(;0,I_m), j= 1, . . . , n_e d^j₁=ν₁(r^j₁,^j₁), j= 1, . . . , n_e e^d₁ :{d^j₁;j= 1, . . . , n_e}

• Begin iteration t= 1, . . . , T: Conditioning step:

EstimateΣ^rd_1:t from[E^r_1:t,e^d_t]→Σˆ^rd_1:t

r^r|d,j_1:t =r^j_1:t+ ˆΓ^rd_1:t[ ˆΣ^d_t]⁻¹(d_t−d^j_t), j= 1, . . . , n_e Forwarding step:

η^j_t ∼φn(η;0,In), j = 1, . . . , ne

r^j_t+1=ω_t(r^r|d,j_t ,η^j_t), j= 1, . . . , n_e E^r_1:t+1 :{E^r_1:t,r^j_t+1;j= 1, . . . , ne} ^j_t+1 ∼φm(0,Im), j= 1, . . . , ne

d^j_t+1 =ν_t+1(r^j_t+1,^j_t+1), j= 1, . . . , n_e e^d_t+1:{d^j_t+1;j= 1, . . . , n_e}

• End iteration

(32)

Chapter 4

Infectious disease model

We study the domain D ⊂ R² and time T ⊂ R[0,T], and define the following spatial variables,

{b^N(x);x∈ D}; b^N(x)∈ B:{0,1}

{b^S(x, t);x∈ D, t∈ T }; b^S(x, t)∈ B:{0,1}

{b^I(x, t);x∈ D, t∈ T }; b^I(x, t)∈ B:{0,1} (4.1) {b^R(x, t);x∈ D, t∈ T }; b^R(x, t)∈ B:{0,1},

whereb^N(·)is an event spatial variable representing presence of an individual at a location x. Correspondingly,b^S(·), b^I(·)andb^R(·)are associated individual-state spatial variables, representing whether an individual is susceptible (S), infected (I) or recovered (R). We assume that b^N(x) is known for all x ∈ D. In reality, we condition on this knowledge, (·|b^N(x)), but for practical purposes we will leave out this notation from the rest of the study. Note that an individual cannot belong to multiple groups at the time t,

b^N(·) =b^S(·, t)∪bÎ(·, t)∪b^R(·, t); ∀t∈ T (4.2) b^S(·, t)∩bÎ(·, t) =∅, b^S(·, t)∩b^R(·, t) =∅, bÎ(·, t)∩b^R(·, t) =∅. (4.3) We let the vector of states for an individual be

{b(x, t) = (b^S(x, t), b^I(x, t), b^R(x, t));x∈ D, t∈ T }. (4.4) An individuals affiliation with the groups, or the health status, is described through the vector b(x, t), where each of the binary elements in the vector indicate which group an individual belongs to at timet. The infinidesimal temporal transition rates between the states for an individual is defined as,

S I R

S - β_x^SI(t) 0

I 0 - γ^IR

R 0 0 -

20

(33)

with β_x^SI(t) > 0 being the infection rate, and γ^IR > 0 being the recovery rate.

The infection rate is a function of both x and t, while the recovery rate is constant, independent of location and time. Hence, over time, the only possible transition for a susceptible individual is to infected, and then from infected a transition to the recovered state. Once the individual has reached the recovered state, it stays in the recovered state, which is the absorbing state of the system.

After having defined spatial variables representing individuals, it is natural to define the counting variables for the different states. We define the number of susceptible, infected, recovered, and total number of individuals in an area ∆⊂ D, at a time t∈ T with,

k_∆^N = #{b^N events in ∆⊂ D}

k^S_∆(t) = #{b^S(t) events in ∆⊂ D}

k^I_∆(t) = #{b^I(t) events in ∆⊂ D} (4.5) k^R_∆(t) = #{b^R(t) events in ∆⊂ D}.

The total number of individuals is assumed known. The vector of counting variables is then defined as,

k_∆(t) = [k^S_∆(t), k^I_∆(t), k_∆^R(t)]. (4.6) Ask_∆^N represents the number of individuals in∆⊂ D, assumed temporally constant, for all t’s, the following holds:

k_∆^N =k_∆^S(t) +k^I_∆(t) +k_∆^R(t), t∈ T. (4.7) Individuals cannot belong to multiple groups at the same point in time, and each individual must belong to one of the groups, meaning the whole population in an area

∆is partitioned into these three groups. The sum of the individuals in all three groups, in an area ∆, needs to, at all times, be equal to the total population for the given area.

Assume that an observation of the number of people diagnosed with the infectious disease of interest, in a number of subregions a_i ⊂ D, with i= 1, . . . , m, is denoted with dait=k^Id_a_i_t, at a timet. The observations at a time tcan then be represented by,

d_t= (d_a₁_t, . . . , d_a_m_t). (4.8) Elements of the vector d are collected observations for each of the subregionsai, i= 1, . . . , mat a timet, associated with the counting variable for the infected group.

The infection will, among other thing, depend on interaction between the different subregions, and this can be modelled by explanatory variables that provide information on infrastructure, weather, mobility, population and degree of vaccination. We proceed to define our model in a Bayesian setting.

(34)

22 CHAPTER 4. INFECTIOUS DISEASE MODEL

4.1 Prior model

For our problem, we assume that an individuals presence at x is time independent and can be modelled by,

{[b^N(x)|λ^N(x)];x∈ D},

defined to be a spatially non-stationary Poisson RF, with an intensity field,{λ^N(x);x∈ D}, where λ^N(x) ∈ R^L. Furthermore, as each of the individuals needs to belong to either the group of susceptible, infected or recovered, see Equation 4.2, we define

{[b(x, t)|λ(x, t)];x∈ D, t∈ T }, (4.9) to be a non-stationary, tri-variate Poisson RF, with intensity field[λ(x, t);x∈ D, t∈ T], whereλ(x, t) = (λ^S(x, t), λ^I(x, t), λ^R(x, t)).

Then it is natural to assume

{λ^N(x) =λ^S(x, t) +λ^I(x, t) +λ^R(x, t);x∈ D}; t∈ T, (4.10) since the sum of the individuals across the groups is equal to the total number of inhabi- tants in an area, for all timest, as given in Equation 4.7. Assuming that{λ^N(x);x∈ D}

is known, we define {λ^I(x, t), λ^R(x, t);x ∈ D};t ∈ T, with λ^I(x, t), λ^R(x, t) ∈ R^L, to be the free variables. This lets us express the intensity of the susceptibles through the intensities of the total population and the remaining two groups,

{λ^S(x, t) =λ^N(x)−λ^I(x, t)−λ^R(x, t);x∈ D}; t∈ T. (4.11) We discretize the spatial domainDto a latticex∈ L_D, with grid nodesn=n_y×n_z and{∆_x;x∈ L_D} being the partition ofD.

The intensity over all grid units are, {λ_x(t) =

Z

∆x

λ(x, t)dx;x∈ L_D}; t∈ T, (4.12)

whereλ_x(t) = (λ^S_x(t), λ^I_x(t), λ^R_x(t)),λ_x(t)∈R³^L.

Due to the Poisson RF assumptions, the corresponding conditional number of events over the lattice{[k_x(t)|λ_x(t)];x∈ L_D}, t∈ T, is,

[k_x(t)|λ_x(t)]∼p(k_x(t)|λ_x(t)) =Pois(k_x(t);λ_x(t)), (4.13) where kx(t) = (k^S_x(t), k^I_x(t), k^R_x(t)), for each x ∈ L_D. The vector notation entails that each component is Poisson distributed with the corresponding intensity.

The discretized random field {k_x(t)|λ_x(t);x ∈ LD} is a discretized Poisson random field given the intensity field, and it can be represented by a3n-vector, k(t). Similarly, the discretized intensity field{λ_x(t);x∈ L_D}can be represented by the3n-vector, λ(t).

(35)

As a consequence of the Poisson RF asumption, the joint spatial model for conditional event counts for all the nodes on the lattice L_D is,

[k(t)|λ(t)]∼p(k(t)|λ(t)) = Y

x∈LD

p(kx(t)|λ_x(t)). (4.14)

We consider the discretized intensity field represented by the3n-vectorλ(t), for allt∈ T, to be a discretized Gaussian RF,

λ(t)∼p(λ) =φ_3n(λ;µ_λ_t,Σ_λ_t), (4.15) with the model parameters the expectation vector µ_λ_t, variance vector σ²_λ

t and spatial correlation functionρ_λ(τ);τ =|x⁰−x⁰⁰|,x⁰,x⁰⁰ ∈ D. The Gaussian RF is defined onΩ_x= R, while the intensity λ belongs to Ωλ =R^L, and the probability for having negative realizations of λ in the model depends on the model parameters. This model mismatch may be avoided by using log-Gaussian RF model, further specified in the examples.

Note further that the intensity RF must ensure that λ^N =λ^S(t) +λ^I(t) +λ^R(t), for all t∈ T. This random interpretation of the intensity λ(t) makes the model for the counts hierarchical.

We proceed to define the transition process between the different states as a birth and death process, fort∈ T. We consider the event random process for every individual present. If an individual is susceptible it can either go to being infected or stay susceptible. If the individual gets infected, it recovers after a certain period of time, and stays in the absorbing, recovered state. We assume that becoming infected is a non-stationary Poisson process with infection rate β_x^SI(t), and that recovering is a stationary Poisson process with a constant recovery rate γ^IR, which is space-time independent.

We discretize time in a similar manner as the space, discretizing the temporal domain T to a grid t ∈ LT of size T, where {∆_t;t ∈ LT} is a partition of T. We denote the discretized number of individuals and the associated intensities, for a grid node x∈ L_D, with,

{k_xt;t∈ L_T} (4.16)

{λ_xt;t∈ L_T}.

Assuming that an individual is present at a locationx∈ LD, the transitional probabilities for different states between two times(t, t+ ∆t), are,

t+ ∆_t

S I R

S p^SS_xt p^SI_xt p^SR_xt t I 0 p^II_xt p^IR_xt

R 0 0 1

(36)

24 CHAPTER 4. INFECTIOUS DISEASE MODEL where the transition probability expressions are as defined below,

p^SS_xt =exp(−β_x∆^SI) p^SR_xt =

∆t

Z

0

β_xτ^SIexp(−β_xτ^SI)exp(−γ^IR(∆t−τ))dτ

p^SI_xt = 1−p^SS_x∆_t−p^SR_x∆_t (4.17)

pÎI_xt =exp(−γÎR∆t) pÎR_xt = 1−pÎI_xt, whereβ_x∆^SI =

t+∆

R

t

β_x^SI(t)dt.

We use these transitional probabilities in the definition of the distributions describing the changes in event counts in grid unit∆_x, during the time interval(t, t+ 1), assuming

∆t to be unity for notational convenience,

k^SS_xt ∼Binom(k;k^S_xt, p^SS_xt) k^SR_xt ∼Binom(k;k^S_xt, p^SR_xt )

k^SI_xt =k_xt^S −k^SS_xt −k^SR_xt (4.18) kÎI_xt ∼Binom(k;kÎ_xt, pÎI_xt)

kÎR_xt =k_xtÎ −kÎI_xt k_xt^RR =k_xt^R,

where the superscript indicates the start and end group of the transition happening during the time interval(t, t+ 1). Then we can write,





 k_x(t+1)^S k_x(t+1)^I k_x(t+1)^R

kxt







=k^SS_xt

=k^II_xt+k_xt^SI

=k^RR_xt +k^SR_xt +k^IR_xt,

(4.19)

which ensures thatP

l∈Hk^l_xt=P

l∈Hk^l_x(t+1) =k_x^N holds, forH={S, I, R}. The changes in event counts during one time step are dependent on the movement of individuals between the groups. The number of infected is the sum of all the individuals from the previous time step that stay infected, and all the individuals that have transitioned from being suscpetible to infected during the current time step. The change in the count of recovered individuals includes only the influx of the new individuals. Once an individual has recovered, it stays in that group. Then the number of recovered individuals are all the individuals that were in the recovered group from the previous time steps, in addition to all the infected individuals that recovered in the current time step and those that were susceptible, became infected, and recovered in one time step. Notice that the event