
4 ENSEMBLE UPDATING OF BINARY STATE VECTORS

4.3 Parametric, piecewise linear programming

In this section, we look further into the backward recursion of the DP algorithm described in Section 4.2. As we shall see, each step of the recursion involves setting up an optimization problem that we refer to as a parametric, piecewise linear program: an optimization problem with a piecewise linear objective function subject to a set of linear constraints, which we solve as a function of the parameter $t_k$. To simplify the writing, we introduce the following notation:

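$$q_k^{ij} = q(\tilde{x}^k = 0 \mid \tilde{x}^{k-1} = i, x^k = j, y), \tag{24a}$$

$$q_1^{i} = q(\tilde{x}^1 = 0 \mid x^1 = i, y), \tag{24b}$$

$$f_k^{ij} = f(x^{k-1} = i, x^k = j \mid y), \tag{24c}$$

$$\pi_k^{ij}(t_k) = \pi_{t_k}(\tilde{x}^{k-1} = i, x^k = j \mid y), \tag{24d}$$

$$q_k^{*ij}(t_k) = q^*_{t_k}(\tilde{x}^k = 0 \mid \tilde{x}^{k-1} = i, x^k = j, y), \tag{24e}$$

$$\rho_{k-1}^{i|j} = f(x^k = i \mid x^{k-1} = j), \tag{24f}$$

for $i, j \in \{0,1\}$ and $k \geq 2$.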

Reconsider the initial step of the backward recursion. The goal of this step is to compute $E^*_n(t_n)$ in (20) and $q^*_n(t_n)$ in (21). The objective function at this step, $E_n(t_n, q_n)$, can be computed as

$$E_n(t_n, q_n) = \pi_n^{00}(t_n) q_n^{00} + \pi_n^{01}(t_n)(1 - q_n^{01}) + \pi_n^{10}(t_n) q_n^{10} + \pi_n^{11}(t_n)(1 - q_n^{11}). \tag{25}$$

Since $\pi_n^{01}(t_n) + \pi_n^{11}(t_n) = f(x^n = 1)$, we can, after rearranging the terms, rewrite (25) as

$$E_n(t_n, q_n) = \pi_n^{00}(t_n) q_n^{00} - \pi_n^{01}(t_n) q_n^{01} + \pi_n^{10}(t_n) q_n^{10} - \pi_n^{11}(t_n) q_n^{11} + f(x^n = 1). \tag{26}$$

As a function of the parameter $t_n \in [t_n^{\min}, t_n^{\max}]$, we are interested in computing the value of $q_n$ that maximizes (26). In doing so, the constraint in (9) must be taken into account.
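The rearrangement leading to (26) can be spelled out as follows. This is only a sketch of the intermediate algebra, writing $\pi_n^{ij}$ for the quantity in (24d) and suppressing the argument $t_n$:

```latex
\begin{align*}
E_n(t_n, q_n)
  &= \pi_n^{00} q_n^{00} + \pi_n^{01}\bigl(1 - q_n^{01}\bigr)
   + \pi_n^{10} q_n^{10} + \pi_n^{11}\bigl(1 - q_n^{11}\bigr) \\
  &= \pi_n^{00} q_n^{00} - \pi_n^{01} q_n^{01}
   + \pi_n^{10} q_n^{10} - \pi_n^{11} q_n^{11}
   + \bigl(\pi_n^{01} + \pi_n^{11}\bigr) \\
  &= \pi_n^{00} q_n^{00} - \pi_n^{01} q_n^{01}
   + \pi_n^{10} q_n^{10} - \pi_n^{11} q_n^{11}
   + f(x^n = 1).
\end{align*}
```

The last term does not depend on $q_n$, so it shifts the maximum value but not the maximizer.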

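Specifically, the constraint entails at this step that

$$\pi(\tilde{x}^{n-1}, \tilde{x}^n \mid y) = f(\tilde{x}^{n-1}, \tilde{x}^n \mid y)$$

for all $\tilde{x}^{n-1}, \tilde{x}^n \in \{0,1\}$. Hence, using that $\pi(\tilde{x}^{n-1}, x^n, \tilde{x}^n \mid y) = \pi(\tilde{x}^{n-1}, x^n \mid y)\, q(\tilde{x}^n \mid \tilde{x}^{n-1}, x^n, y)$, and that $\pi(\tilde{x}^{n-1}, \tilde{x}^n \mid y)$ follows by summing out $x^n$ from $\pi(\tilde{x}^{n-1}, x^n, \tilde{x}^n \mid y)$, we see that $q_n$ must fulfill

$$f(\tilde{x}^{n-1}, \tilde{x}^n \mid y) = \sum_{x^n} \pi(\tilde{x}^{n-1}, x^n \mid y)\, q(\tilde{x}^n \mid \tilde{x}^{n-1}, x^n, y).$$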

This requirement leads to four linear equations of which two are linearly independent, one where we set $\tilde{x}^{n-1} = 0$ and one where we set $\tilde{x}^{n-1} = 1$. Using the notation in (24a)–(24d), the two linearly independent equations can be written as

$$f_n^{00} = \pi_n^{00}(t_n) q_n^{00} + \pi_n^{01}(t_n) q_n^{01}, \tag{27a}$$

$$f_n^{10} = \pi_n^{10}(t_n) q_n^{10} + \pi_n^{11}(t_n) q_n^{11}. \tag{27b}$$

Additionally, we know that $q_n^{00}$, $q_n^{01}$, $q_n^{10}$, and $q_n^{11}$ can only take values within the interval $[0,1]$,

$$0 \leq q_n^{ij} \leq 1, \quad \text{for all } i, j \in \{0,1\}. \tag{28}$$

To summarize, we want, as a function of the parameter $t_n \in [t_n^{\min}, t_n^{\max}]$, to compute the values of $q_n^{00}$, $q_n^{01}$, $q_n^{10}$, and $q_n^{11}$ that maximize the function (26) subject to the constraints in (27) and (28). For any fixed $t_n$, this is a maximization problem where both the objective function and all the constraints are linear in $q_n^{00}$, $q_n^{01}$, $q_n^{10}$, and $q_n^{11}$. As such, the maximization problem can, for a given value of $t_n$, be formulated as a linear program and solved accordingly. In Appendix A, we show that the optimal solutions $q_n^{*00}(t_n)$, $q_n^{*01}(t_n)$, $q_n^{*10}(t_n)$, and $q_n^{*11}(t_n)$ are piecewise-defined functions of $t_n$ and easy to compute analytically. Furthermore, we show that the corresponding function $E^*_n(t_n)$, obtained by inserting $q_n^{*00}(t_n)$, $q_n^{*01}(t_n)$, $q_n^{*10}(t_n)$, and $q_n^{*11}(t_n)$ into (26), is a continuous piecewise linear (CPL) function of $t_n$.
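For a single fixed value of $t_n$, the linear program above separates into two independent two-variable problems, one for each equality constraint. The following Python sketch illustrates this under stated assumptions: the function names `solve_pair` and `solve_first_step` and all numerical values are hypothetical, and the closed-form rule used here (substitute the equality into the objective and push the first variable as high as the bounds allow) is our own elementary reduction, not necessarily the derivation given in Appendix A.

```python
def solve_pair(pi_zero, pi_one, f_target, tol=1e-12):
    """Maximise pi_zero*q_zero - pi_one*q_one subject to
    pi_zero*q_zero + pi_one*q_one = f_target and 0 <= q_zero, q_one <= 1.

    Substituting the equality gives the objective 2*pi_zero*q_zero - f_target,
    so the optimum takes q_zero as large as the constraints allow.
    """
    if f_target < -tol or f_target > pi_zero + pi_one + tol:
        raise ValueError("infeasible: need 0 <= f_target <= pi_zero + pi_one")
    # q_one >= 0 caps q_zero at f_target / pi_zero; the box caps it at 1.
    q_zero = min(1.0, f_target / pi_zero) if pi_zero > 0 else 0.0
    # The equality constraint then determines q_one.
    q_one = (f_target - pi_zero * q_zero) / pi_one if pi_one > 0 else 0.0
    q_one = min(1.0, max(0.0, q_one))  # guard against round-off
    return q_zero, q_one


def solve_first_step(pi, f00, f10):
    """Solve the step-n program for one fixed t_n.

    pi maps (i, j) to the value of pi_n^{ij}(t_n); f00 and f10 are the
    left-hand sides of (27a) and (27b).
    """
    q00, q01 = solve_pair(pi[0, 0], pi[0, 1], f00)  # constraint (27a)
    q10, q11 = solve_pair(pi[1, 0], pi[1, 1], f10)  # constraint (27b)
    q = {(0, 0): q00, (0, 1): q01, (1, 0): q10, (1, 1): q11}
    # Objective value computed from (25).
    value = (pi[0, 0] * q00 + pi[0, 1] * (1 - q01)
             + pi[1, 0] * q10 + pi[1, 1] * (1 - q11))
    return q, value


# Hypothetical numbers, chosen only so that the constraints are feasible.
pi = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}
q_opt, e_opt = solve_first_step(pi, f00=0.25, f10=0.3)
print(q_opt, e_opt)
```

Repeating this for a grid of $t_n$ values would trace out the piecewise structure of the optimal solution numerically; the analytical treatment avoids such a grid.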

Next, consider the intermediate steps of the backward recursion, that is, $k = n-1, n-2, \ldots, 2$.

At each such step, the aim is to compute $E^*_{k:n}(t_k)$ in (22) and $q^*_k(t_k)$ in (23). The objective function at each step reads

$$E_{k:n}(t_k, q_k) = E_k(t_k, q_k) + E^*_{(k+1):n}(t_{k+1}(t_k, q_k)), \tag{29}$$

and this function is to be maximized with respect to $q_k$. The first term, $E_k(t_k, q_k)$, in (29) can be computed as

$$E_k(t_k, q_k) = \pi_k^{00}(t_k) q_k^{00} - \pi_k^{01}(t_k) q_k^{01} + \pi_k^{10}(t_k) q_k^{10} - \pi_k^{11}(t_k) q_k^{11} + f(x^k = 1). \tag{30}$$

The second term, $E^*_{(k+1):n}(t_{k+1}(t_k, q_k))$, is a CPL function of $t_{k+1}$. For $k = n-1$, this result is immediate, since we know from the first iteration that $E^*_n(t_n)$ is CPL. For $k < n-1$, the result is explained in Appendix A. Since $t_{k+1}(t_k, q_k)$ is linear in $q_k$, it follows that $E^*_{(k+1):n}(t_{k+1}(t_k, q_k))$ is CPL in $q_k$ for any given $t_k \in [t_k^{\min}, t_k^{\max}]$. Hence, the objective function in (29) is also CPL in $q_k$ for any $t_k \in [t_k^{\min}, t_k^{\max}]$. As in the first backward step, we have the following equality and inequality constraints for $q_k$:

$$f_k^{00} = \pi_k^{00}(t_k) q_k^{00} + \pi_k^{01}(t_k) q_k^{01}, \tag{31a}$$

$$f_k^{10} = \pi_k^{10}(t_k) q_k^{10} + \pi_k^{11}(t_k) q_k^{11}, \tag{31b}$$

and

$$0 \leq q_k^{00}, q_k^{01}, q_k^{10}, q_k^{11} \leq 1. \tag{32}$$

Additionally, we now need to incorporate constraints ensuring that $q_k$ and $t_k$ return a value $t_{k+1}$ within the interval $[t_{k+1}^{\min}, t_{k+1}^{\max}]$, where $t_{k+1}^{\min}$ and $t_{k+1}^{\max}$ are given by (15) and (16), respectively. That is, we require

$$t_{k+1}^{\min} \leq t_{k+1}(t_k, q_k) \leq t_{k+1}^{\max}, \tag{33}$$

where $t_{k+1}(t_k, q_k)$ follows from (17) as

$$t_{k+1}(t_k, q_k) = \pi_k^{00}(t_k) q_k^{00} \rho_k^{0|0} + \pi_k^{01}(t_k) q_k^{01} \rho_k^{0|1} + \pi_k^{10}(t_k) q_k^{10} \rho_k^{0|0} + \pi_k^{11}(t_k) q_k^{11} \rho_k^{0|1}. \tag{34}$$

Clearly, for any fixed $t_k \in [t_k^{\min}, t_k^{\max}]$, all the constraints (31)–(33) are linear in $q_k$. However, the objective function in (29) is only piecewise linear. As such, we are not faced with a standard linear program, but a piecewise linear program. Piecewise linear programs are a well-studied field of linear optimization and several techniques for solving such problems have been proposed and studied, see for instance Fourer (1985, 1988, 1992). The most straightforward approach is to solve the standard linear program corresponding to each line segment of the objective function separately, and afterward compare the solutions and store the overall optimum. This technique can be inefficient and is not recommended if the number of pieces of the objective function is relatively large. However, in our case, the objective functions normally consist of only a few pieces.

For example, in the simulation experiment of Section 5.2, where a model $q(\tilde{x} \mid x, y)$ is constructed as many as 1,000 times, the largest number of intervals observed is 10 and the average number of intervals is 4.35. We therefore consider the straightforward approach a convenient method for solving the piecewise linear programs in our case, but we note that more elegant strategies exist and may have their advantages. Further details of our solution are presented below.

First, some new notation needs to be introduced. For each $2 \leq k \leq n$, we let $M_k$ denote the number of pieces, or intervals, of $E^*_{k:n}(t_k)$, and we let $t_k^{B(j)}$, $j = 1, \ldots, M_k + 1$, denote the corresponding breakpoints. Note that for the first and last breakpoints, we have $t_k^{B(1)} = t_k^{\min}$ and $t_k^{B(M_k+1)} = t_k^{\max}$. Furthermore, we let $I_k^{(j)} = [t_k^{B(j)}, t_k^{B(j+1)}] \subseteq [t_k^{\min}, t_k^{\max}]$ denote interval number $j$, and $\mathcal{J}_k = \{1, 2, \ldots, M_k\}$ the set of interval indices. For each $j \in \mathcal{J}_k$, $E^*_{k:n}(t_k)$ is defined by a linear function, which we denote by $E_k^{(j)}(t_k)$, whose intercept and slope we denote by $a_k^{(j)}$ and $b_k^{(j)}$, respectively.
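As an illustration of this representation, a piecewise linear function can be stored as a list of breakpoints together with one intercept and one slope per interval. The sketch below is a minimal Python example under that assumption; the function name `cpl_eval` and the numbers are hypothetical, and the indices are 0-based, so piece `j` in the code corresponds to interval number $j+1$ in the text.

```python
from bisect import bisect_right


def cpl_eval(breakpoints, intercepts, slopes, t):
    """Evaluate a piecewise linear function at t.

    breakpoints holds M + 1 increasing values; intercepts[j] and slopes[j]
    describe the line a + b * t on [breakpoints[j], breakpoints[j + 1]].
    """
    if not breakpoints[0] <= t <= breakpoints[-1]:
        raise ValueError("t lies outside [t_min, t_max]")
    # Locate the interval containing t; the right end point is assigned
    # to the last piece.
    j = min(bisect_right(breakpoints, t) - 1, len(slopes) - 1)
    return intercepts[j] + slopes[j] * t


# Hypothetical two-piece function that is continuous at the breakpoint 0.3.
breakpoints = [0.1, 0.3, 0.5]
intercepts = [1.0, 1.9]
slopes = [2.0, -1.0]

value_inside = cpl_eval(breakpoints, intercepts, slopes, 0.2)
value_at_break = cpl_eval(breakpoints, intercepts, slopes, 0.3)
value_at_end = cpl_eval(breakpoints, intercepts, slopes, 0.5)
```

Continuity means that adjacent pieces agree at the shared breakpoint, so it does not matter which of the two pieces is used there.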

Each linear piece, $E_{k+1}^{(j)}(t_{k+1})$, of the piecewise linear function $E^*_{(k+1):n}(t_{k+1})$ leads to a standard parametric linear program. Specifically, if $E^*_{(k+1):n}(t_{k+1}(t_k, q_k))$ in (29) is replaced with $E_{k+1}^{(j)}(t_{k+1}(t_k, q_k))$, we obtain an objective function

$$E_{k:n}^{(j)}(t_k, q_k) = E_k(t_k, q_k) + E_{k+1}^{(j)}(t_{k+1}(t_k, q_k)), \tag{35}$$

which is linear, not piecewise linear, as a function of $q_k$. The corresponding constraints for $q_k$ are given in (31) and (32), but instead of (33), we require that $t_k$ and $q_k$ return a value $t_{k+1}(t_k, q_k)$ within the interval $I_{k+1}^{(j)}$,

$$t_{k+1}^{B(j)} \leq t_{k+1}(t_k, q_k) \leq t_{k+1}^{B(j+1)}. \tag{36}$$

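Using (30), (34), and that $E_{k+1}^{(j)}(t_{k+1}) = a_{k+1}^{(j)} + b_{k+1}^{(j)} t_{k+1}$, we can for each $j \in \mathcal{J}_{k+1}$ rewrite (35) as

$$E_{k:n}^{(j)}(t_k, q_k) = \beta_k^{00(j)}(t_k) q_k^{00} + \beta_k^{01(j)}(t_k) q_k^{01} + \beta_k^{10(j)}(t_k) q_k^{10} + \beta_k^{11(j)}(t_k) q_k^{11} + \alpha_k^{(j)}, \tag{37}$$

where

$$\beta_k^{00(j)}(t_k) = \bigl(b_{k+1}^{(j)} \rho_k^{0|0} + 1\bigr) \pi_k^{00}(t_k),$$

$$\beta_k^{01(j)}(t_k) = \bigl(b_{k+1}^{(j)} \rho_k^{0|1} - 1\bigr) \pi_k^{01}(t_k),$$

$$\beta_k^{10(j)}(t_k) = \bigl(b_{k+1}^{(j)} \rho_k^{0|0} + 1\bigr) \pi_k^{10}(t_k),$$

$$\beta_k^{11(j)}(t_k) = \bigl(b_{k+1}^{(j)} \rho_k^{0|1} - 1\bigr) \pi_k^{11}(t_k),$$

and

$$\alpha_k^{(j)} = a_{k+1}^{(j)} + f(x^k = 1).$$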
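The collected coefficients can be checked numerically against the direct expression (30) plus the linear piece evaluated at (34). The Python sketch below does this for one arbitrary point; all function names and numbers are hypothetical, the names `beta` and `alpha` stand for the coefficient functions and constant term in (37), and `rho0[s]` stands for the transition probability in (24f) of moving to state 0 from state `s`.

```python
def t_next(pi, q, rho0):
    """t_{k+1}(t_k, q_k) as in (34); rho0[s] is the probability of x^{k+1} = 0 given x^k = s."""
    return sum(pi[prev, cur] * q[prev, cur] * rho0[cur]
               for prev in (0, 1) for cur in (0, 1))


def objective_direct(pi, q, rho0, a, b, f_xk1):
    """E_k(t_k, q_k) from (30) plus the linear piece a + b * t_{k+1}."""
    e_k = (pi[0, 0] * q[0, 0] - pi[0, 1] * q[0, 1]
           + pi[1, 0] * q[1, 0] - pi[1, 1] * q[1, 1] + f_xk1)
    return e_k + a + b * t_next(pi, q, rho0)


def coefficients(pi, rho0, a, b, f_xk1):
    """Coefficients multiplying q_k^{ij} in (37) and the constant term."""
    sign = {0: 1.0, 1: -1.0}  # +1 when x^k = 0, -1 when x^k = 1
    beta = {(prev, cur): (b * rho0[cur] + sign[cur]) * pi[prev, cur]
            for prev in (0, 1) for cur in (0, 1)}
    alpha = a + f_xk1
    return beta, alpha


def objective_from_coefficients(beta, alpha, q):
    """Right-hand side of (37)."""
    return sum(beta[key] * q[key] for key in beta) + alpha


# Hypothetical inputs for one fixed t_k and one linear piece (a, b).
pi = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}
q = {(0, 0): 0.9, (0, 1): 0.2, (1, 0): 0.5, (1, 1): 0.4}
rho0 = [0.8, 0.3]
a, b = 1.0, 2.0
f_xk1 = pi[0, 1] + pi[1, 1]  # f(x^k = 1)

beta, alpha = coefficients(pi, rho0, a, b, f_xk1)
direct = objective_direct(pi, q, rho0, a, b, f_xk1)
collected = objective_from_coefficients(beta, alpha, q)
```

Agreement of `direct` and `collected` at an arbitrary point is a quick sanity check on the signs of the four coefficients.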

To summarize, we obtain for each $j \in \mathcal{J}_{k+1}$ a standard parametric linear program, with the objective function given in (37) and the constraints given in (31), (32), and (36). Solving the parametric linear program corresponding to each $j \in \mathcal{J}_{k+1}$ yields the following quantities:

$$E_{k:n}^{(j)}(t_k) = \max_{q_k} E_{k:n}^{(j)}(t_k, q_k), \tag{38}$$

$$q_k^{(j)}(t_k) = \operatorname*{argmax}_{q_k} E_{k:n}^{(j)}(t_k, q_k). \tag{39}$$

The overall maximum value $E^*_{k:n}(t_k)$ and the corresponding optimal solution $q^*_k(t_k)$ are then available as

$$E^*_{k:n}(t_k) = E_{k:n}^{(j^*_{k+1}(t_k))}(t_k)$$

and

$$q^*_k(t_k) = q_k^{(j^*_{k+1}(t_k))}(t_k),$$

where

$$j^*_{k+1}(t_k) = \operatorname*{argmax}_{j \in \mathcal{J}_{k+1}} E_{k:n}^{(j)}(t_k).$$

As previously mentioned, and as shown in Appendix A, $E^*_{k:n}(t_k)$ is a CPL function of $t_k$. As such, $E^*_{k:n}(t_k)$ is fully specified by its breakpoints and the function values at those points. The breakpoints of $E^*_{k:n}(t_k)$ can be computed prior to the maximization. Thereby, we can obtain $E^*_{k:n}(t_k)$ for all values of $t_k$ quite efficiently, since we only need to solve the parametric, piecewise linear program at the breakpoints of $E^*_{k:n}(t_k)$.
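To make the last point concrete: once the function values at the breakpoints are known, the intercept and slope of every linear piece follow by elementary interpolation. A minimal Python sketch, with a hypothetical function name `pieces_from_values` and made-up numbers:

```python
def pieces_from_values(breakpoints, values):
    """Return one (intercept, slope) pair per interval of a CPL function,
    given its values at the breakpoints."""
    pieces = []
    for t0, t1, v0, v1 in zip(breakpoints, breakpoints[1:], values, values[1:]):
        slope = (v1 - v0) / (t1 - t0)
        intercept = v0 - slope * t0  # the line passes through (t0, v0)
        pieces.append((intercept, slope))
    return pieces


# Hypothetical breakpoints and the function values obtained there.
pieces = pieces_from_values([0.1, 0.3, 0.5], [1.2, 1.6, 1.4])
```

The resulting pairs play the role of the intercepts and slopes that enter the subproblems of the next backward step.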

Finally, consider the last step of the backward recursion, $k = 1$. Here, the goal is to compute $q^*_{t_1}(\tilde{x}^1 \mid x^1, y)$ and $E^*_{1:n}(t_1)$. Essentially, this step proceeds in the same fashion as the intermediate steps, but some technicalities differ since there are only two variables involved in $q_1$, namely, $q_1^0 = q(\tilde{x}^1 = 0 \mid x^1 = 0, y)$ and $q_1^1 = q(\tilde{x}^1 = 0 \mid x^1 = 1, y)$. Also, $t_1$ is not a parameter free to vary within a certain range, but a fixed number, namely $t_1 = f(x^1 = 0)$, meaning that we obtain specific values for $q^*_{t_1}(\tilde{x}^1 \mid x^1, y)$ and $E^*_{1:n}(t_1)$. The function we want to maximize at this final backward step, with respect to $q_1$, is

$$E_{1:n}(t_1, q_1) = E_1(t_1, q_1) + E^*_{2:n}(t_2(t_1, q_1)), \tag{40}$$

where now, recalling that $\pi(x^1 \mid y) = f(x^1)$, the first term, $E_1(t_1, q_1)$, can be written as

$$E_1(t_1, q_1) = t_1 q_1^0 + (1 - t_1)(1 - q_1^1). \tag{41}$$

Again, as in the intermediate steps, we have a piecewise linear, not a linear, objective function.

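To determine the constraints for $q_1$, we note that the requirement (9) for $q(\tilde{x} \mid x, y)$ entails that

$$f(\tilde{x}^1 \mid y) = \pi(\tilde{x}^1 \mid y).$$

Thereby, since $t_1 = f(x^1 = 0)$ and using that $f(\tilde{x}^1 \mid y) = \sum_{x^1} \pi(x^1, \tilde{x}^1 \mid y)$ and $\pi(x^1, \tilde{x}^1 \mid y) = f(x^1)\, q(\tilde{x}^1 \mid x^1, y)$, we see that the following requirement must be met by $q(\tilde{x}^1 \mid x^1, y)$:

$$f(\tilde{x}^1 \mid y) = t_1\, q(\tilde{x}^1 \mid x^1 = 0, y) + (1 - t_1)\, q(\tilde{x}^1 \mid x^1 = 1, y). \tag{42}$$

Additionally, we have the inequality constraints

$$0 \leq q_1^0, q_1^1 \leq 1. \tag{43}$$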

So, we are faced with a piecewise linear program, with the piecewise linear objective function (40) and the linear constraints (42) and (43). Again, we proceed by iterating through each linear piece of $E^*_{2:n}(t_2(t_1, q_1))$, solving the standard linear program corresponding to each piece separately. That is, for each $j \in \mathcal{J}_2$, we replace $E^*_{2:n}(t_2(t_1, q_1))$ in (40) by $E_2^{(j)}(t_2(t_1, q_1))$ and consider instead the objective function

$$E_{1:n}^{(j)}(t_1, q_1) = E_1(t_1, q_1) + E_2^{(j)}(t_2(t_1, q_1)), \tag{44}$$

which is linear, not piecewise linear, as a function of $q_1$. As we did for each subproblem $j \in \mathcal{J}_{k+1}$ in every intermediate backward iteration, we must for each subproblem $j \in \mathcal{J}_2$ incorporate the inequality constraints

$$t_2^{B(j)} \leq t_2(t_1, q_1) \leq t_2^{B(j+1)}, \tag{45}$$

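where now $t_2(t_1, q_1)$ follows from (18) and (19) as

$$t_2(t_1, q_1) = t_1 q_1^0 \rho_1^{0|0} + (1 - t_1) q_1^1 \rho_1^{0|1}. \tag{46}$$

Using (41), (46), and that $E_2^{(j)}(t_2) = a_2^{(j)} + b_2^{(j)} t_2$, we can rewrite the function in (44) as

$$E_{1:n}^{(j)}(t_1, q_1) = \beta_1^{0(j)}(t_1) q_1^0 + \beta_1^{1(j)}(t_1) q_1^1 + \alpha_1^{(j)}(t_1), \tag{47}$$

where

$$\beta_1^{0(j)}(t_1) = t_1 \bigl(1 + b_2^{(j)} \rho_1^{0|0}\bigr),$$

$$\beta_1^{1(j)}(t_1) = (1 - t_1) \bigl(-1 + b_2^{(j)} \rho_1^{0|1}\bigr),$$

$$\alpha_1^{(j)}(t_1) = 1 - t_1 + a_2^{(j)}.$$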
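The coefficients in (47) follow from direct substitution. As a sketch of the algebra, writing $\rho_1^{0|0}$ and $\rho_1^{0|1}$ for the transition probabilities in (24f) and inserting (41), (46), and the linear piece $a_2^{(j)} + b_2^{(j)} t_2$ into (44):

```latex
\begin{align*}
E_1(t_1, q_1) + a_2^{(j)} + b_2^{(j)}\, t_2(t_1, q_1)
  &= t_1 q_1^0 + (1 - t_1)\bigl(1 - q_1^1\bigr) + a_2^{(j)}
     + b_2^{(j)} \bigl[\, t_1 q_1^0 \rho_1^{0|0} + (1 - t_1) q_1^1 \rho_1^{0|1} \bigr] \\
  &= t_1 \bigl(1 + b_2^{(j)} \rho_1^{0|0}\bigr) q_1^0
     + (1 - t_1)\bigl(-1 + b_2^{(j)} \rho_1^{0|1}\bigr) q_1^1
     + 1 - t_1 + a_2^{(j)}.
\end{align*}
```

In particular, the coefficient of $q_1^1$ carries the term $-1$, inherited from the factor $(1 - q_1^1)$ in (41).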

To summarize, we obtain for each $j \in \mathcal{J}_2$ a standard linear program, where the aim is to maximize the objective function (47) with respect to $q_1$ subject to the constraints (42), (43), and (45). This program is solved for $t_1 = f(x^1 = 0)$. Analogously to (38) and (39), let

$$E_{1:n}^{(j)}(t_1) = \max_{q_1} E_{1:n}^{(j)}(t_1, q_1), \qquad q_1^{(j)}(t_1) = \operatorname*{argmax}_{q_1} E_{1:n}^{(j)}(t_1, q_1).$$

Ultimately, we obtain

$$E^*_{1:n}(t_1) = E_{1:n}^{(j^*_2)}(t_1)$$

and

$$q^*_1(t_1) = q_1^{(j^*_2)}(t_1),$$

where

$$j^*_2 = \operatorname*{argmax}_{j \in \mathcal{J}_2} \bigl[E_{1:n}^{(j)}(t_1)\bigr].$$
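The piece-by-piece strategy is easiest to illustrate at this last step, because the equality constraint (42) leaves a single free variable. The Python sketch below is one possible implementation under stated assumptions, not the authors' code: the function name `solve_last_step`, the argument names, and all numbers are hypothetical, `g0` stands for $f(\tilde{x}^1 = 0 \mid y)$, `rho00` and `rho01` stand for the two transition probabilities appearing in (46), pieces are indexed from 0, and $0 < t_1 < 1$ is assumed.

```python
def solve_last_step(t1, g0, rho00, rho01, breakpoints, pieces, tol=1e-12):
    """Maximise (40) over q_1 = (q0, q1) by looping over the linear pieces.

    pieces[j] = (a, b) is the line a + b * t2 on
    [breakpoints[j], breakpoints[j + 1]]. Returns (value, j, q0, q1),
    or None if every subproblem is infeasible.
    """
    # (42) with x~^1 = 0 fixes q1 = (g0 - t1*q0) / (1 - t1); (43) then bounds q0.
    box_lo = max(0.0, (g0 - (1.0 - t1)) / t1)
    box_hi = min(1.0, g0 / t1)
    # With q1 eliminated, (46) becomes t2 = c0 + c1 * q0.
    c0 = g0 * rho01
    c1 = t1 * (rho00 - rho01)
    best = None
    for j, (a, b) in enumerate(pieces):
        lo, hi = box_lo, box_hi
        t_lo, t_hi = breakpoints[j], breakpoints[j + 1]
        if abs(c1) > tol:
            # Constraint (45) restricts q0 to an interval.
            r1, r2 = (t_lo - c0) / c1, (t_hi - c0) / c1
            lo, hi = max(lo, min(r1, r2)), min(hi, max(r1, r2))
        elif not t_lo - tol <= c0 <= t_hi + tol:
            continue  # t2 is constant and lies outside this piece
        if lo > hi + tol:
            continue  # subproblem j is infeasible
        hi = max(lo, hi)
        # A linear objective on an interval attains its maximum at an end point.
        for q0 in (lo, hi):
            q1 = (g0 - t1 * q0) / (1.0 - t1)
            t2 = c0 + c1 * q0
            value = t1 * q0 + (1.0 - t1) * (1.0 - q1) + a + b * t2  # (41) plus piece
            if best is None or value > best[0]:
                best = (value, j, q0, q1)
    return best


# Hypothetical inputs: a two-piece CPL function that is continuous at 0.3.
result = solve_last_step(t1=0.6, g0=0.5, rho00=0.8, rho01=0.3,
                         breakpoints=[0.1, 0.3, 0.5],
                         pieces=[(1.0, 2.0), (1.9, -1.0)])
print(result)
```

At the intermediate steps the same loop over pieces applies, but each subproblem then has two free variables after eliminating the equality constraints, so a general linear programming routine or the analytical solution of Appendix A would be needed in place of the end-point comparison.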

In document Ensemble updating for a state-space model with categorical variables (pages 53-58)