OPTIMAL CONTROL FOR STOCHASTIC DELAY SYSTEM UNDER MODEL UNCERTAINTY: A STOCHASTIC DIFFERENTIAL GAME APPROACH

OLIVIER MENOUKEU-PAMEN

CMA, Department of Mathematics, University of Oslo, Moltke Moes vei 35, P.O. Box 1053 Blindern, 0316 Oslo, Norway.
Abstract. In this paper, we study a robust recursive utility maximization problem for a time-delayed stochastic differential equation with jumps. This problem can be written as a stochastic differential game with delay. We derive a maximum principle for this problem and obtain necessary and sufficient conditions of optimality. We apply the results to study a problem of consumption choice optimization under model uncertainty.
1. Introduction
A common problem in mathematical finance concerns an agent who invests and wants to maximize the expected utility of her instantaneous consumption and/or terminal wealth.
Recently, there has been an increased interest in problems of utility maximization under model uncertainty (see e.g. [11, 15, 24, 26] and references therein). In fact, unlike in the standard expected utility maximization, where it is assumed that the investor knows the "original" probability measure P that describes the dynamics of the wealth process, in these papers it is supposed that the investor does not know this probability. In order to take this uncertainty into account, the authors introduce a family Q of probability measures Q which are equivalent (or absolutely continuous with respect) to the original measure P, and then choose the worst-case criterion in the optimization problem. The problem is solved by dynamic programming, a stochastic maximum principle, or duality arguments. There is already a vast literature on dynamic programming and the stochastic maximum principle; the reader is referred, e.g., to [1, 12, 22, 35, 36] and the references therein.
The problem of optimal control for delayed systems has also received a lot of attention recently (see e.g. [9, 14, 19, 21] and references therein). One of the reasons for looking at this problem is that many phenomena have memory dependence, i.e., their dynamics at a present time t depend not only on the situation at time t but also on a finite part of their past history. Such models may be regarded as stochastic differential delay equations (SDDEs).

E-mail address: o.m.pamen@cma.uio.no.
2010 Mathematics Subject Classification. 34K50, 60H20, 60H10, 60J75, 91G80, 93E20.
Key words and phrases. Model uncertainty, optimal control, stochastic differential utility, stochastic delay equations, time-advanced BSDEs, Lévy processes.
The research leading to these results has received funding from the European Research Council under the European Community's Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement no. [228087].
As a generalization of classical expected utility, Duffie and Epstein [5] introduced the notion of stochastic differential utility (SDU), or recursive utility. The cost functional of such a utility is given in terms of an intermediate consumption rate and a future utility; therefore it can be represented as the solution of a backward stochastic differential equation (BSDE). The notion of backward stochastic differential equations was introduced by Bismut [2] in the linear case, to study the adjoint equation associated with the stochastic maximum principle in stochastic optimal control problems. Pardoux and Peng [28] further developed BSDEs in the nonlinear setting, and since then the theory has become a useful tool for formulating many problems in mathematical finance and control theory (see [7]). There are many papers dealing with SDU maximization (see e.g. [8, 10, 23, 30] and references therein).
For stochastic delayed systems, let us mention that the appearance of time delays in the coefficients of the controlled process leads to time advances in the drift of the associated adjoint equations. Note that time-advanced (or anticipated) BSDEs were studied by Peng and Yang [29] in the continuous case; the results were then applied to study a linear stochastic delay system with no delay in the noise coefficient. Øksendal et al. [27] generalized the latter results to the jump case. Their application also extends that of Peng and Yang [29] to a nonlinear control problem for stochastic delayed systems, with possible delays in the noise and jump coefficients. In the delayed case, the problem of optimal control of recursive utility can be seen as an optimal control problem for a forward-backward stochastic differential delayed system. In the jump case this problem was studied in [32, 33].
The problem of optimal control of recursive utility under model uncertainty was studied by Bordigoni et al. [3] in the continuous case and by Jeanblanc et al. [18] in the discontinuous case, via a robust utility maximization technique. In these papers, the penalization function is given by the entropy. On the other hand, assuming that the probability measure Q ∈ Q is a market scenario controlled by the market, this problem can be interpreted as a zero-sum stochastic differential game between the agent, who optimizes her instantaneous consumption and/or portfolio, and the market, which chooses the scenario Q. In the general non-Markovian case, this problem was solved by Øksendal and Sulem [25], using a stochastic maximum principle.
In the present paper, we consider a problem of optimal control for a stochastic delay system under model uncertainty, in a general non-Markovian setting. In this setting, the problem cannot be solved by a dynamic programming argument; we shall therefore study it using a stochastic maximum principle approach. Our problem can be regarded as a stochastic differential game for a system of forward-backward stochastic differential delay equations. We derive sufficient and necessary conditions of optimality.
This paper can be seen as a generalization of [32] to model uncertainty, and with moving-average delay in the coefficients (but without delay in the control). We also extend the work in [3, 18] by considering delay in the coefficients of the state process, and more general SDU and penalization functions. Moreover, our paper can be considered as a dynamic, time-delayed version of [26].
We apply the results to find the optimal consumption rate from a cash flow with delay, under model uncertainty and general recursive utility. This is a generalization of [4] to stochastic differential utility under model uncertainty.
The paper is organized as follows. In Section 2, we motivate and formulate our control problem. In Section 3, we obtain a stochastic maximum principle for delayed stochastic differential games for this general non-Markovian stochastic control problem under model uncertainty, and we apply our result to study a problem of consumption choice optimization under model uncertainty and delay.
2. Problem formulation
In this section, we briefly present the model in [10] and then formulate the optimization problem.
2.1. A motivating example.
Let (Ω, F, F = (F_t)_{0≤t≤T}, P) be a complete filtered probability space satisfying the usual conditions, with T a finite horizon. For any probability measure Q ≪ P on F_T, the density process of Q with respect to P is the RCLL P-martingale Z^Q = (Z^Q(t))_{0≤t≤T} with
\[
Z^Q(t) = \frac{dQ}{dP}\Big|_{\mathcal{F}_t} = E\Big[\frac{dQ}{dP}\,\Big|\,\mathcal{F}_t\Big].
\]
The following model by Faidi et al. [10] illustrates the situation. Suppose the financial market has two investment opportunities: a bond S_0 and a risky asset S. Without loss of generality, we assume that the price of the bond is constant; otherwise we consider the bond as a numeraire. We assume that S is a continuous semimartingale with canonical decomposition
\[
S(t) = S(0) + N(t) + D(t), \quad t\in[0,T].
\]
Here ⟨N⟩ denotes the quadratic variation of the continuous martingale N. We shall assume that ⟨N⟩ is absolutely continuous with respect to the Lebesgue measure on [0,T], and we define the positive predictable process σ = (σ(t))_{0≤t≤T} by
\[
\langle N\rangle_t = \int_0^t \sigma(s)\,ds, \quad t\in[0,T].
\]
Assume that there exists a predictable process λ = (λ(t))_{0≤t≤T} such that
\[
D(t) = \int_0^t \sigma(s)\lambda(s)\,ds, \quad t\in[0,T],
\]
and that
\[
K(T) = \Big\langle \int \lambda\,dN \Big\rangle_T = \int_0^T \sigma(s)\lambda^2(s)\,ds
\]
is bounded a.s.
Let us consider an investor who can consume between time 0 and time T, and denote by c = (c(t))_{0≤t≤T} her consumption rate. If she chooses a portfolio H = (H(t))_{0≤t≤T}, representing the number of risky assets held in the portfolio and assumed S-integrable, the corresponding wealth process A(t), t ∈ [0,T], has the dynamics
\[
dA(t) = H(t)\,dS(t) - c(t)\,dt, \qquad A(0) = a_0 > 0. \tag{2.1}
\]
In the case of a continuous filtration, Bordigoni et al. [3] study a stochastic control problem arising in the context of utility maximization under model uncertainty. Their goal is to find Q ∈ Q_f that minimizes the functional
\[
E_Q\Big[\int_0^T \alpha S_\kappa(s) U_1(s)\,ds + \bar\alpha S_\kappa(T) U_2(T)\Big] + \beta\, E_Q\big[R_\kappa(0,T)\big],
\]
where
\[
\mathcal{Q}_f = \Big\{\, Q \;:\; Q \ll P,\; Q = P \text{ on } \mathcal{F}_0 \text{ and } H(Q|P) := E_Q\Big[\log\frac{dQ}{dP}\Big] < \infty \,\Big\},
\]
α and ᾱ are nonnegative constants, β ∈ (0,∞), κ = (κ(t))_{0≤t≤T} and U_1 = (U_1(t))_{0≤t≤T} are progressively measurable processes, U_2(T) is an F_T-measurable random variable, S_κ(t) = e^{-∫_0^t κ(s)ds} is the discounting factor, and R_κ(t,T) is the penalization term, given by the sum of the entropy rate and the terminal entropy:
\[
R_\kappa(t,T) = \frac{1}{S_\kappa(t)}\int_t^T \kappa(s) S_\kappa(s)\log\frac{Z^Q(s)}{Z^Q(t)}\,ds + \frac{S_\kappa(T)}{S_\kappa(t)}\log\frac{Z^Q(T)}{Z^Q(t)}. \tag{2.2}
\]
The authors prove that in general there exists a unique optimal measure Q^* and show that Q^* is equivalent to P. They also consider the dynamic value process
\[
V(t) = \operatorname*{ess\,inf}_{Q\in\mathcal{Q}_f}\, Y^Q(t), \tag{2.3}
\]
where
\[
Y^Q(t) = \frac{1}{S_\kappa(t)}\, E_Q\Big[\int_t^T \alpha S_\kappa(s) U_1(s)\,ds + \bar\alpha S_\kappa(T) U_2(T)\,\Big|\,\mathcal{F}_t\Big] + \beta\, E_Q\big[R_\kappa(t,T)\,\big|\,\mathcal{F}_t\big], \tag{2.4}
\]
and show that, if F is a continuous filtration, then the dynamics of (Y(t))_{0≤t≤T} is given by the following BSDE:
\[
\begin{cases}
dY(t) = \big(\kappa(t) Y(t) - \alpha U_1(t)\big)\,dt + \dfrac{1}{\beta}\, d\langle M^Y\rangle_t + dM^Y(t); & t\in[0,T],\\[1mm]
Y(T) = \bar\alpha\, U_2(T).
\end{cases} \tag{2.5}
\]
Faidi et al. [10] study the problem of utility maximization over terminal wealth and consumption in a complete market when the value process is given by (2.5); the existence and uniqueness of an optimal strategy is proved.
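To make the entropy penalization concrete: for the simple Girsanov tilt dQ/dP = exp(θB(T) − θ²T/2) one has H(Q|P) = θ²T/2, which can be verified by Monte Carlo. The sketch below (with an illustrative θ of our own choosing, not taken from the papers cited) computes E_P[Z log Z] = E_Q[log Z].

```python
import numpy as np

# Monte Carlo check of the relative entropy H(Q|P) = E_Q[log dQ/dP]
# for the Girsanov density Z = dQ/dP = exp(theta*B_T - 0.5*theta^2*T).
# Rewriting H(Q|P) = E_P[Z log Z], a closed-form computation gives
# theta^2 * T / 2.  theta, T below are illustrative assumptions.
rng = np.random.default_rng(2)

theta, T, paths = 0.4, 1.0, 1_000_000
B_T = np.sqrt(T) * rng.standard_normal(paths)

log_Z = theta * B_T - 0.5 * theta**2 * T
Z = np.exp(log_Z)

entropy_mc = np.mean(Z * log_Z)   # E_P[Z log Z] = E_Q[log Z]
print(entropy_mc, 0.5 * theta**2 * T)
```

The Monte Carlo estimate should agree with the closed form θ²T/2 = 0.08 up to sampling error.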
Jeanblanc et al. [18] generalize these results to a model with jumps, in the case of a discontinuous filtration. They prove that the robust optimization problem is solved by a quadratic BSDE. Note that their work also extends the results of Duffie and Skiadas [6] and El Karoui et al. [8] to the robust case, including jumps.
In this paper, we generalize the latter situation, for κ = 0, in several directions:
• We study more general utility and convex penalty functions.
• We include delay in our wealth process.
2.2. Problem formulation.
Let {B(t)}_{0≤t≤T} be a Brownian motion and Ñ(dζ,ds) = N(dζ,ds) − ν(dζ)ds a compensated Poisson random measure associated with a Lévy process with Lévy measure ν, on the (complete) filtered probability space (Ω, F, F = {F_t}_{0≤t≤T}, P), with T > 0 a fixed time horizon. In the sequel, we assume that the Lévy measure ν satisfies
\[
\int_{\mathbb{R}_0} \zeta^2\, \nu(d\zeta) < \infty, \quad \text{where } \mathbb{R}_0 := \mathbb{R}\setminus\{0\}.
\]
We also point out that the filtration F = {F_t}_{0≤t≤T} is generated by the Brownian motion and the Poisson random measure.
Suppose that the state process (or wealth process) A(t) = A^{(v)}(t,ω); 0 ≤ t ≤ T, ω ∈ Ω, is a controlled stochastic delay equation of the form
\[
\begin{cases}
dA(t) = b(t, A(t), A_1(t), A_2(t), v(t), \omega)\,dt + \sigma(t, A(t), A_1(t), A_2(t), v(t), \omega)\,dB(t)\\
\qquad\qquad + \displaystyle\int_{\mathbb{R}_0} \gamma(t, A(t), A_1(t), A_2(t), v(t), \zeta, \omega)\,\tilde N(d\zeta,dt); \quad t\in[0,T],\\
A(t) = a_0(t); \quad t\in[-\delta,0],
\end{cases} \tag{2.6}
\]
where
\[
A_1(t) = A(t-\delta), \qquad A_2(t) = \int_{t-\delta}^t e^{-\rho(t-r)} A(r)\,dr, \tag{2.7}
\]
and δ > 0, ρ ≥ 0 and T > 0 are given constants; v(·) is the control process.
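It is sometimes convenient to note that the moving-average term A_2(t) in (2.7) admits a one-dimensional dynamics of its own; differentiating under the integral sign (a direct computation, not stated in the text) gives

\[
dA_2(t) = \big(A(t) - e^{-\rho\delta}A(t-\delta) - \rho A_2(t)\big)\,dt
        = \big(A(t) - e^{-\rho\delta}A_1(t) - \rho A_2(t)\big)\,dt,
\]

so the triple (A, A_1, A_2) carries all the delay information entering the coefficients.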
The functions b : [0,T] × ℝ × ℝ × ℝ × V × Ω → ℝ, σ : [0,T] × ℝ × ℝ × ℝ × V × Ω → ℝ and γ : [0,T] × ℝ × ℝ × ℝ × V × ℝ_0 × Ω → ℝ are given such that, for all t, b(t, a, a_1, a_2, v, ·), σ(t, a, a_1, a_2, v, ·) and γ(t, a, a_1, a_2, v, ζ, ·) are F_t-measurable for all a ∈ ℝ, a_1 ∈ ℝ, a_2 ∈ ℝ, v ∈ V and ζ ∈ ℝ_0. We assume that the function a_0(t) is continuous and deterministic.
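As an illustration of how the two delay terms enter the dynamics, (2.6) can be discretized by an Euler–Maruyama scheme. The sketch below omits the jump term and uses simple linear coefficients; all concrete choices (b, σ, a_0 and the constants) are our own assumptions for the demo, not taken from the text.

```python
import numpy as np

# Euler-Maruyama sketch for the controlled delay equation (2.6) with the
# jump term dropped and illustrative linear coefficients (assumptions):
#   b(t, a, a1, a2) = b0*a + b1*a1 + b2*a2,   sigma constant.
rng = np.random.default_rng(0)

T, delta, rho = 1.0, 0.1, 0.5
n = 1000                       # time steps on [0, T]
dt = T / n
lag = int(round(delta / dt))   # grid steps spanned by the delay delta

b0, b1, b2, sig = -0.2, 0.1, 0.05, 0.3

# grid covering [-delta, T]; constant initial path a0 = 1 on [-delta, 0]
A = np.empty(lag + n + 1)
A[:lag + 1] = 1.0

for k in range(n):
    i = lag + k                            # index of time t = k*dt
    t = k * dt
    A1 = A[i - lag]                        # discrete delay A(t - delta)
    # moving-average delay A2(t) = int_{t-delta}^t e^{-rho(t-r)} A(r) dr
    idx = np.arange(i - lag, i)
    r = (idx - lag) * dt
    A2 = np.sum(np.exp(-rho * (t - r)) * A[idx]) * dt
    drift = b0 * A[i] + b1 * A1 + b2 * A2  # b(t, A, A1, A2)
    A[i + 1] = A[i] + drift * dt + sig * np.sqrt(dt) * rng.standard_normal()

print(A[-1])
```

Only the update rule matters here: each step reads the current state, the lagged state A_1, and a Riemann sum approximating the exponentially weighted average A_2.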
Let us consider the preceding model uncertainty setup and assume that the law of the controlled process belongs to a family of equivalent measures whose densities satisfy
\[
\begin{cases}
dG^\theta(t) = G^\theta(t^-)\Big(\theta_0(t)\,dB(t) + \displaystyle\int_{\mathbb{R}_0}\theta_1(t,\zeta)\,\tilde N(d\zeta,dt)\Big); \quad t\in[0,T+\delta],\\
G^\theta(0) = 1,\\
G^\theta(t) = 0, \quad t\in[-\delta,0).
\end{cases} \tag{2.8}
\]
Here:
• θ = (θ_0, θ_1) may be regarded as a scenario control;
• V is the set of admissible controls v;
• A is the set of admissible scenario controls θ, assumed to be F_t-predictable and such that
\[
E\Big[\int_0^T \Big(\theta_0^2(t) + \int_{\mathbb{R}_0}\theta_1^2(t,\zeta)\,\nu(d\zeta)\Big)\,dt\Big] < \infty
\quad\text{and}\quad \theta_1(t,\zeta) \ge -1+\varepsilon \ \text{ for some } \varepsilon > 0.
\]
Assume the following in Equation (2.4):
\[
\alpha = \bar\alpha = \beta = 1, \quad \kappa = 0, \quad U_1(t) = f(t, A(t), A_1(t), A_2(t), v(t)), \quad U_2(T) = g(A(T)),
\]
\[
R_\kappa(t,T) = R(t,T) = \int_t^T h(\theta(s))\,ds, \tag{2.9}
\]
where f, g are given concave functions, increasing with strictly decreasing derivatives, and h is a convex function.
The robust optimization problem we consider is therefore:
Problem P1. Find (v̂, θ̂) ∈ V × A such that
\[
\operatorname*{ess\,sup}_{v\in\mathcal{V}}\; \operatorname*{ess\,inf}_{\theta\in\mathcal{A}}\; E_{Q^\theta}\big[W_t(v,\theta)\,\big|\,\mathcal{F}_t\big]
= E_{Q^{\hat\theta}}\big[W_t(\hat v,\hat\theta)\,\big|\,\mathcal{F}_t\big]
= \operatorname*{ess\,inf}_{\theta\in\mathcal{A}}\; \operatorname*{ess\,sup}_{v\in\mathcal{V}}\; E_{Q^\theta}\big[W_t(v,\theta)\,\big|\,\mathcal{F}_t\big], \tag{2.10}
\]
where
\[
W_t(v,\theta) = \int_t^T f(s, A(s), A_1(s), A_2(s), v(s), \omega)\,ds + g(A(T),\omega) + \int_t^T h(\theta(s))\,ds.
\]
This problem can be seen as a stochastic differential game problem.
Let {E_t^1}_{0≤t≤T} and {E_t^2}_{0≤t≤T} be given subfiltrations of {F_t}_{0≤t≤T}, representing the amount of information available to the controllers at time t. We assume that v ∈ V is E_t^1-predictable and θ ∈ A is E_t^2-predictable. We define
\[
f_1(t, a, a_1, a_2, u) := f(t, a, a_1, a_2, v) + h(\theta); \qquad u = (v,\theta).
\]
Then
\[
E_{Q^\theta}[W(v,\theta)] = E\Big[G^\theta(T)\, g(A^v(T)) + \int_0^T G^\theta(s)\, f_1(s, A^v(s), A_1^v(s), A_2^v(s), u(s))\,ds\Big].
\]
Put
\[
Y(t) = E\Big[\frac{G^\theta(T)}{G^\theta(t)}\, g(A^v(T)) + \int_t^T \frac{G^\theta(s)}{G^\theta(t)}\, f_1(s, A^v(s), A_1^v(s), A_2^v(s), u(s))\,ds \,\Big|\, \mathcal{F}_t\Big]. \tag{2.11}
\]
It follows from Lemma A1 that Y(t) is the solution of the following linear BSDE:
\[
\begin{cases}
dY(t) = -\Big(f_1(t, A^v(t), A_1^v(t), A_2^v(t), u(t)) + \theta_0(t) Z(t) + \displaystyle\int_{\mathbb{R}_0}\theta_1(t,\zeta) K(t,\zeta)\,\nu(d\zeta)\Big)\,dt\\
\qquad\qquad + Z(t)\,dB(t) + \displaystyle\int_{\mathbb{R}_0} K(t,\zeta)\,\tilde N(d\zeta,dt); \quad t\in[0,T],\\
Y(T) = g(A^v(T)).
\end{cases} \tag{2.12}
\]
Note that
\[
Y(0) = Y^{v,\theta}(0) = E_{Q^\theta}[W(v,\theta)].
\]
Thus Problem P1 becomes:

Problem P2. Find (v̂, θ̂) ∈ V × A such that
\[
\operatorname*{ess\,sup}_{v\in\mathcal{V}}\; \operatorname*{ess\,inf}_{\theta\in\mathcal{A}}\; Y^{v,\theta}(t) = Y^{\hat v,\hat\theta}(t) = \operatorname*{ess\,inf}_{\theta\in\mathcal{A}}\; \operatorname*{ess\,sup}_{v\in\mathcal{V}}\; Y^{v,\theta}(t), \tag{2.13}
\]
where Y^{v,θ}(t) is given by the forward-backward delayed system (2.6) & (2.12).

In the next section, we shall solve Problem P2, with more general coefficients, using a stochastic maximum principle for delayed differential games.
3. A stochastic maximum principle for delayed stochastic differential games
In this section, we study Problem P2 with a more general driver in the BSDE (2.12). We prove necessary and sufficient stochastic maximum principles for stochastic differential games of forward-backward SDEs with delay.
Suppose that the state process A(t) = A^{(u)}(t,ω); 0 ≤ t ≤ T, ω ∈ Ω, is a controlled stochastic delay equation of the form
\[
\begin{cases}
dA(t) = b(t, A(t), A_1(t), A_2(t), u(t), \omega)\,dt + \sigma(t, A(t), A_1(t), A_2(t), u(t), \omega)\,dB(t)\\
\qquad\qquad + \displaystyle\int_{\mathbb{R}_0} \gamma(t, A(t), A_1(t), A_2(t), u(t), \zeta, \omega)\,\tilde N(d\zeta,dt); \quad t\in[0,T],\\
A(t) = a_0(t); \quad t\in[-\delta,0],
\end{cases} \tag{3.1}
\]
where
\[
A_1(t) = A(t-\delta), \qquad A_2(t) = \int_{t-\delta}^t e^{-\rho(t-r)} A(r)\,dr, \tag{3.2}
\]
and δ > 0, ρ ≥ 0 and T > 0 are given constants; u(·) is the control process.
The functions b : [0,T] × ℝ × ℝ × ℝ × U × Ω → ℝ, σ : [0,T] × ℝ × ℝ × ℝ × U × Ω → ℝ and γ : [0,T] × ℝ × ℝ × ℝ × U × ℝ_0 × Ω → ℝ are given such that, for all t, b(t, a, a_1, a_2, u, ·), σ(t, a, a_1, a_2, u, ·) and γ(t, a, a_1, a_2, u, ζ, ·) are F_t-measurable for all a ∈ ℝ, a_1 ∈ ℝ, a_2 ∈ ℝ, u ∈ U and ζ ∈ ℝ_0. We assume that the function a_0(t) is continuous and deterministic.
Here u = (u_1, u_2), where u_i(t) is the control of player i, i = 1,2. We suppose that we are given two subfiltrations
\[
\mathcal{E}_t^{(i)} \subset \mathcal{F}_t; \quad t\in[0,T], \tag{3.3}
\]
representing the information available to player i at time t, i = 1,2. We let A_i denote the set of admissible control processes of player i, contained in the set of E_t^{(i)}-predictable processes, i = 1,2.
We consider the associated BSDEs in the unknowns Y_i(t), Z_i(t), K_i(t,ζ), of the form
\[
\begin{cases}
dY_i(t) = g_i(t, A(t), A_1(t), A_2(t), Y_i(t), Z_i(t), K_i(t,\zeta), u(t))\,dt + Z_i(t)\,dB(t)\\
\qquad\qquad + \displaystyle\int_{\mathbb{R}_0} K_i(t,\zeta)\,\tilde N(d\zeta,dt); \quad t\in[0,T],\\
Y_i(T) = h_i(A(T)); \quad i=1,2,
\end{cases} \tag{3.4}
\]
where g_i(t, a, a_1, a_2, y, z, k, u) : [0,T]×ℝ×ℝ×ℝ×ℝ×ℝ×ℝ_0×U×Ω → ℝ and h_i(a) : ℝ → ℝ, i = 1,2, are such that the BSDE (3.4) has a unique solution.
Let f_i(t, a, a_1, a_2, u) : [0,T]×ℝ×ℝ×ℝ×U×Ω → ℝ, ϕ_i(a) : ℝ → ℝ and ψ_i(a) : ℝ → ℝ, i = 1,2, be given C^1 functions with respect to (t, a, a_1, a_2, u) such that
\[
E\Big[\int_0^T \Big\{\big|f_i(t, A(t), A_1(t), A_2(t), u(t))\big| + \Big|\frac{\partial f_i}{\partial a_i}(t, A(t), A_1(t), A_2(t), u(t))\Big|^2\Big\}\,dt
+ |\varphi_i(A(T))| + |\varphi_i'(A(T))|^2 + |\psi_i(Y_i(0))| + |\psi_i'(Y_i(0))|^2\Big] < \infty
\]
for a_i = a, a_1, a_2 and u.
Assume that the performance functional of each player i has the following form:
\[
J_i(t,u) = E\Big[\int_t^T f_i(s, A(s), A_1(s), A_2(s), u(s))\,ds + \varphi_i(A(T)) + \psi_i(Y_i(t)) \,\Big|\, \mathcal{F}_t\Big]; \quad i=1,2. \tag{3.5}
\]
Here, f_i, ϕ_i and ψ_i can be seen as the profit rates, bequest functions and "risk evaluations", respectively, of player i, i = 1,2.
We shall first consider the non-zero-sum stochastic differential game problem; that is, we analyze the following:
Problem P3. Find (u_1^*, u_2^*) ∈ A_1 × A_2 (if it exists) such that
(1) J_1(t, u_1, u_2^*) ≤ J_1(t, u_1^*, u_2^*) for all u_1 ∈ A_1;
(2) J_2(t, u_1^*, u_2) ≤ J_2(t, u_1^*, u_2^*) for all u_2 ∈ A_2.

The pair (u_1^*, u_2^*) is called a Nash equilibrium (if it exists). The intuitive idea is that there are two players, Player I and Player II. Player I controls u_1 and Player II controls u_2. Each player is assumed to know the equilibrium strategies of the other player, and no player has anything to gain by changing only his or her own strategy (i.e., by changing unilaterally). Player I and Player II are in Nash equilibrium if each player is making the best decision she can, taking into account the other player's decision.
Let us mention once more that in this case, u_2 is not known to the trader, but subject to uncertainty. We may regard u_2 as a market scenario, or a stochastic control of the market, which is playing against the trader.
We shall first solve Problem P3 for t = 0 and then obtain the result for each t ∈ [0,T] as a corollary. For t = 0 we put
\[
J_i(u) = J_i(0,u) = E\Big[\int_0^T f_i(s, A(s), A_1(s), A_2(s), u(s))\,ds + \varphi_i(A(T)) + \psi_i(Y_i(0))\Big], \quad i=1,2. \tag{3.6}
\]
Define the Hamiltonians
H_i : [0,T]×ℝ×ℝ×ℝ×ℝ×ℝ×ℛ×U_1×U_2×ℝ×ℝ×ℝ×ℛ → ℝ, i = 1,2, by
\[
\begin{aligned}
H_i(t, a, a_1, a_2, y, z, k, u_1, u_2, \lambda, p, q, r)
&:= f_i(t, a, a_1, a_2, u_1, u_2) + \lambda\, g_i(t, a, a_1, a_2, y, z, k, u_1, u_2)\\
&\quad + p\, b(t, a, a_1, a_2, u_1, u_2) + q\, \sigma(t, a, a_1, a_2, u_1, u_2)\\
&\quad + \int_{\mathbb{R}_0} r(\zeta)\, \gamma(t, a, a_1, a_2, u_1, u_2, \zeta)\,\nu(d\zeta), \tag{3.7}
\end{aligned}
\]
where ℛ is the set of functions r : ℝ_0 → ℝ such that the last integral in (3.7) converges.
Suppose that H_i is Fréchet differentiable in the variables a, a_1, a_2, y, z, k, u_i, and that ∇_k H_i(t,ζ) is a random measure which is absolutely continuous with respect to ν, i = 1,2. Define the adjoint processes λ_i(t), p_i(t), q_i(t) and r_i(t,ζ); t ∈ [0,T], ζ ∈ ℝ_0, associated to these Hamiltonians by the following system of advanced forward-backward stochastic differential equations (AFBSDEs):

(1) Forward SDE in λ_i(t):
\[
\begin{cases}
d\lambda_i(t) = \dfrac{\partial H_i}{\partial y}(t)\,dt + \dfrac{\partial H_i}{\partial z}(t)\,dB(t) + \displaystyle\int_{\mathbb{R}_0} \dfrac{d\nabla_k H_i}{d\nu}(t,\zeta)\,\tilde N(d\zeta,dt), \quad t\in[0,T],\\
\lambda_i(0) = \psi_i'(Y_i(0)); \quad i=1,2.
\end{cases} \tag{3.8}
\]
Here and in what follows, we use the notation
\[
\frac{\partial H_i}{\partial y}(t) = \frac{\partial H_i}{\partial y}\big(t, A(t), A_1(t), A_2(t), u_1(t), u_2(t), Y_i(t), Z_i(t), K_i(t,\cdot), \lambda_i(t), p_i(t), q_i(t), r_i(t,\cdot)\big),
\]
etc., and \frac{d\nabla_k H_i}{d\nu}(t,\zeta) is the Radon–Nikodym derivative of ∇_k H_i(t,ζ) with respect to ν.
(2) Anticipative BSDE in p_i(t), q_i(t), r_i(t,ζ):
\[
\begin{cases}
dp_i(t) = E[\mu_i(t)\,|\,\mathcal{F}_t]\,dt + q_i(t)\,dB(t) + \displaystyle\int_{\mathbb{R}_0} r_i(t,\zeta)\,\tilde N(d\zeta,dt), \quad t\in[0,T],\\
p_i(T) = \varphi_i'(A(T)) + \lambda_i(T)\, h_i'(A(T)), \qquad q_i(T) = r_i(T,\cdot) = 0,\\
p_i(t) = q_i(t) = r_i(t,\cdot) = 0; \quad t\in(T, T+\delta], \quad i=1,2,
\end{cases} \tag{3.9}
\]
where
\[
\mu_i(t) = -\frac{\partial H_i}{\partial a}(t) - \frac{\partial H_i}{\partial a_1}(t+\delta)\,\chi_{[0,T-\delta]}(t) - e^{\rho t}\int_t^{t+\delta} \frac{\partial H_i}{\partial a_2}(s)\, e^{-\rho s}\,\chi_{[0,T]}(s)\,ds, \tag{3.10}
\]
and
\[
\frac{\partial H_i}{\partial a}(t) = \frac{\partial H_i}{\partial a}\big(t, A(t), A_1(t), A_2(t), u_1(t), u_2(t), Y_i(t), Z_i(t), K_i(t,\cdot), \lambda_i(t), p_i(t), q_i(t), r_i(t,\cdot)\big),
\]
\[
\frac{\partial H_i}{\partial a_1}(t+\delta) = \frac{\partial H_i}{\partial a_1}\big(t+\delta, A^\delta(t), A_1^\delta(t), A_2^\delta(t), u_1^\delta(t), u_2^\delta(t), Y_i^\delta(t), Z_i^\delta(t), K_i^\delta(t,\cdot), \lambda_i^\delta(t), p_i^\delta(t), q_i^\delta(t), r_i^\delta(t,\cdot)\big),
\]
with x^δ(t) = x(t+δ) for x = A, A_1, A_2, u_1, u_2, Y_i, Z_i, K_i, λ_i, p_i, q_i, r_i.

Note that µ_i(t) contains future values of A(s), A_1(s), A_2(s), u_1(s), u_2(s), Y_i(s), Z_i(s), K_i(s,·), λ_i(s), p_i(s), q_i(s), r_i(s,·); s ≤ t+δ.
Remark 3.1. Let V be an open subset of a Banach space X and let F : V → ℝ.
• We say that F has a directional derivative (or Gateaux derivative) at x ∈ V in the direction y ∈ X if
\[
D_y F(x) := \lim_{\varepsilon\to 0}\frac{1}{\varepsilon}\big(F(x+\varepsilon y) - F(x)\big)
\]
exists.
• We say that F is Fréchet differentiable at x ∈ V if there exists a linear map L : X → ℝ such that
\[
\lim_{\substack{h\to 0\\ h\in X}}\ \frac{1}{\|h\|}\,\big|F(x+h) - F(x) - L(h)\big| = 0.
\]
In this case we call L the Fréchet derivative of F at x, and we write L = ∇_x F.
• If F is Fréchet differentiable at x, then F has a directional derivative in every direction y ∈ X and
\[
D_y F(x) = \nabla_x F(y).
\]
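The relation D_y F(x) = ∇_x F(y) in Remark 3.1 is easy to check numerically in finite dimensions; the sketch below uses an arbitrary smooth F on ℝ³ (our own example, not from the text) and compares a central finite difference with the gradient pairing.

```python
import numpy as np

# Numerical illustration of Remark 3.1 on X = R^3: for a Frechet
# differentiable F, the Gateaux derivative D_y F(x) equals the derivative
# map applied to y, here the inner product <grad F(x), y>.
F = lambda x: np.sum(x**2) + np.prod(x)

def gateaux(F, x, y, eps=1e-6):
    # D_y F(x) ~ (F(x + eps*y) - F(x - eps*y)) / (2*eps), central difference
    return (F(x + eps * y) - F(x - eps * y)) / (2 * eps)

x = np.array([1.0, -0.5, 2.0])
y = np.array([0.3, 0.7, -1.0])

# gradient of F: 2x + prod(x)/x_i (valid since no coordinate is zero)
grad = 2 * x + np.prod(x) / x

print(gateaux(F, x, y), grad @ y)   # agree up to finite-difference error
```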
3.1. A sufficient maximum principle for FBSDDE games.
In the following result, we give a sufficient maximum principle for FBSDDE games. In fact, we prove that, under some assumptions, maximizing the Hamiltonians leads to an optimal control.
Theorem 3.2 (Sufficient maximum principle for FBSDDE games). Let (û_1, û_2) ∈ A_1 × A_2 with corresponding solutions Â(t), Ŷ_i(t), Ẑ_i(t), K̂_i(t,ζ), λ̂_i(t), p̂_i(t), q̂_i(t), r̂_i(t,ζ) of equations (3.1), (3.4), (3.8) and (3.9) for i = 1,2. Suppose that the following hold:

• The functions
\[
a \mapsto h_i(a), \qquad a \mapsto \varphi_i(a), \qquad y \mapsto \psi_i(y), \tag{3.11}
\]
\[
(a, a_1, a_2, y, z, k, v_1) \mapsto H_1\big(t, a, a_1, a_2, y, z, k, v_1, \hat u_2(t), \hat\lambda_1(t), \hat p_1(t), \hat q_1(t), \hat r_1(t,\cdot)\big) \tag{3.12}
\]
and
\[
(a, a_1, a_2, y, z, k, v_2) \mapsto H_2\big(t, a, a_1, a_2, y, z, k, \hat u_1(t), v_2, \hat\lambda_2(t), \hat p_2(t), \hat q_2(t), \hat r_2(t,\cdot)\big) \tag{3.13}
\]
are concave.

• For i = 1,2 and all t ∈ [0,T], a.s.,
\[
\max_{v\in U_i}\, E\Big[H_i\big(t, \hat A(t), \hat A_1(t), \hat A_2(t), \hat Y_i(t), \hat Z_i(t), \hat K_i(t,\cdot), u_1(t), u_2(t), \hat\lambda_i(t), \hat p_i(t), \hat q_i(t), \hat r_i(t,\cdot)\big)\,\Big|\,\mathcal{E}_t^{(i)}\Big]\Big|_{u_i(t)=v,\ u_{3-i}(t)=\hat u_{3-i}(t)}
\]
\[
= E\Big[H_i\big(t, \hat A(t), \hat A_1(t), \hat A_2(t), \hat Y_i(t), \hat Z_i(t), \hat K_i(t,\cdot), \hat u_1(t), \hat u_2(t), \hat\lambda_i(t), \hat p_i(t), \hat q_i(t), \hat r_i(t,\cdot)\big)\,\Big|\,\mathcal{E}_t^{(i)}\Big]. \tag{3.14}
\]

• In addition, assume the following growth conditions:
\[
\begin{aligned}
E\Big[\int_0^T \Big\{&\hat p_i^2(t)\Big((\sigma(t)-\hat\sigma(t))^2 + \int_{\mathbb{R}_0}(\gamma(t,\zeta)-\hat\gamma(t,\zeta))^2\,\nu(d\zeta)\Big)\\
&+ (A(t)-\hat A(t))^2\Big(\hat q_i^2(t) + \int_{\mathbb{R}_0}\hat r_i^2(t,\zeta)\,\nu(d\zeta)\Big)\\
&+ (Y_i(t)-\hat Y_i(t))^2\Big(\Big(\frac{\partial \hat H_i}{\partial z}\Big)^2(t) + \int_{\mathbb{R}_0}\big|\nabla_k \hat H_i(t,\zeta)\big|^2\,\nu(d\zeta)\Big)\\
&+ \hat\lambda_i^2(t)\Big((Z_i(t)-\hat Z_i(t))^2 + \int_{\mathbb{R}_0}(K_i(t,\zeta)-\hat K_i(t,\zeta))^2\,\nu(d\zeta)\Big)\Big\}\,dt\Big] < \infty, \quad i=1,2. \tag{3.15}
\end{aligned}
\]
Then û(t) = (û_1(t), û_2(t)) is a Nash equilibrium for (3.1)–(3.4) and (3.6).
Remark 3.3. In the theorem and in what follows, we use the following notation: if i = 1, then A(t) = A^{(u_1,û_2)}(t) and Y_1(t) = Y_1^{(u_1,û_2)}(t) are the processes associated to the control u(t) = (u_1(t), û_2(t)), while Â(t) = A^{(û)}(t) and Ŷ_1(t) = Y_1^{(û)}(t) are those associated to the control û(t) = (û_1(t), û_2(t)).

Furthermore, we put
\[
\frac{\partial \hat H_i}{\partial a}(t) = \frac{\partial H_i}{\partial a}\big(t, \hat A(t), \hat A_1(t), \hat A_2(t), \hat Y_i(t), \hat Z_i(t), \hat K_i(t,\cdot), \hat u(t), \hat\lambda_i(t), \hat p_i(t), \hat q_i(t), \hat r_i(t,\cdot)\big),
\]
and similarly for \frac{\partial \hat H_i}{\partial a_1}(t), \frac{\partial \hat H_i}{\partial a_2}(t), \frac{\partial \hat H_i}{\partial y}(t), \frac{\partial \hat H_i}{\partial z}(t), \frac{\partial \hat H_i}{\partial u_1}(t), \frac{\partial \hat H_i}{\partial u_2}(t) and ∇_k Ĥ_i(t,ζ), i = 1,2.
Proof. We shall first prove that J_1(u_1, û_2) ≤ J_1(û_1, û_2) for all u_1 ∈ A_1. Choose u_1 ∈ A_1 and consider
\[
J_1(u_1,\hat u_2) - J_1(\hat u_1,\hat u_2) = I_1 + I_2 + I_3, \tag{3.16}
\]
where
\[
I_1 = E\Big[\int_0^T \big\{f_1(t, A(t), A_1(t), A_2(t), u(t)) - f_1(t, \hat A(t), \hat A_1(t), \hat A_2(t), \hat u(t))\big\}\,dt\Big], \tag{3.17}
\]
\[
I_2 = E\big[\varphi_1(A(T)) - \varphi_1(\hat A(T))\big], \tag{3.18}
\]
\[
I_3 = E\big[\psi_1(Y_1(0)) - \psi_1(\hat Y_1(0))\big]. \tag{3.19}
\]
By the definition of H_1 and concavity, we get
\[
\begin{aligned}
I_1 &= E\Big[\int_0^T \Big( H_1(t) - \hat H_1(t) - \hat\lambda_1(t)(g_1(t)-\hat g_1(t)) - \hat p_1(t)(b(t)-\hat b(t))\\
&\qquad - \hat q_1(t)(\sigma(t)-\hat\sigma(t)) - \int_{\mathbb{R}_0}\hat r_1(t,\zeta)(\gamma(t,\zeta)-\hat\gamma(t,\zeta))\,\nu(d\zeta)\Big)\,dt\Big]\\
&\le E\Big[\int_0^T \Big( \frac{\partial \hat H_1}{\partial a}(t)(A(t)-\hat A(t)) + \frac{\partial \hat H_1}{\partial a_1}(t)(A_1(t)-\hat A_1(t)) + \frac{\partial \hat H_1}{\partial a_2}(t)(A_2(t)-\hat A_2(t))\\
&\qquad + \frac{\partial \hat H_1}{\partial y}(t)(Y_1(t)-\hat Y_1(t)) + \frac{\partial \hat H_1}{\partial z}(t)(Z_1(t)-\hat Z_1(t)) + \int_{\mathbb{R}_0}\nabla_k \hat H_1(t,\zeta)(K_1(t,\zeta)-\hat K_1(t,\zeta))\,\nu(d\zeta)\\
&\qquad + \frac{\partial \hat H_1}{\partial u_1}(t)(u_1(t)-\hat u_1(t)) - \hat\lambda_1(t)(g_1(t)-\hat g_1(t)) - \hat p_1(t)(b(t)-\hat b(t))\\
&\qquad - \hat q_1(t)(\sigma(t)-\hat\sigma(t)) - \int_{\mathbb{R}_0}\hat r_1(t,\zeta)(\gamma(t,\zeta)-\hat\gamma(t,\zeta))\,\nu(d\zeta)\Big)\,dt\Big]. \tag{3.20}
\end{aligned}
\]
By concavity of ϕ_1, the Itô formula, (3.1) and (3.9), we get
\[
\begin{aligned}
I_2 &\le E\big[\varphi_1'(\hat A(T))(A(T)-\hat A(T))\big]
= E\big[\hat p_1(T)(A(T)-\hat A(T))\big] - E\big[\hat\lambda_1(T)\,h_1'(\hat A(T))(A(T)-\hat A(T))\big]\\
&= E\Big[\int_0^T \hat p_1(t)(dA(t)-d\hat A(t)) + \int_0^T (A(t^-)-\hat A(t^-))\,d\hat p_1(t)\\
&\qquad + \int_0^T (\sigma(t)-\hat\sigma(t))\hat q_1(t)\,dt + \int_0^T\int_{\mathbb{R}_0}(\gamma(t,\zeta)-\hat\gamma(t,\zeta))\hat r_1(t,\zeta)\,\nu(d\zeta)\,dt\Big]\\
&\qquad - E\big[\hat\lambda_1(T)\,h_1'(\hat A(T))(A(T)-\hat A(T))\big]\\
&= E\Big[\int_0^T \hat p_1(t)(b(t)-\hat b(t))\,dt + \int_0^T (A(t^-)-\hat A(t^-))\,E[\mu_1(t)\,|\,\mathcal{F}_t]\,dt\\
&\qquad + \int_0^T (\sigma(t)-\hat\sigma(t))\hat q_1(t)\,dt + \int_0^T\int_{\mathbb{R}_0}(\gamma(t,\zeta)-\hat\gamma(t,\zeta))\hat r_1(t,\zeta)\,\nu(d\zeta)\,dt\Big]\\
&\qquad - E\big[\hat\lambda_1(T)\,h_1'(\hat A(T))(A(T)-\hat A(T))\big]. \tag{3.21}
\end{aligned}
\]
By concavity of ψ_1 and h_1, the Itô formula, (3.4) and (3.8), we get
\[
\begin{aligned}
I_3 &= E\big[\psi_1(Y_1(0)) - \psi_1(\hat Y_1(0))\big] \le E\big[\psi_1'(\hat Y_1(0))(Y_1(0)-\hat Y_1(0))\big]
= E\big[\hat\lambda_1(0)(Y_1(0)-\hat Y_1(0))\big]\\
&= E\big[\hat\lambda_1(T)(Y_1(T)-\hat Y_1(T))\big] - E\Big[\int_0^T \hat\lambda_1(t)(dY_1(t)-d\hat Y_1(t)) + \int_0^T (Y_1(t^-)-\hat Y_1(t^-))\,d\hat\lambda_1(t)\\
&\qquad + \int_0^T (Z_1(t)-\hat Z_1(t))\frac{\partial \hat H_1}{\partial z}(t)\,dt + \int_0^T\int_{\mathbb{R}_0}(K_1(t,\zeta)-\hat K_1(t,\zeta))\nabla_k \hat H_1(t,\zeta)\,\nu(d\zeta)\,dt\Big]\\
&= E\big[\hat\lambda_1(T)(h_1(A(T))-h_1(\hat A(T)))\big] - E\Big[\int_0^T \frac{\partial \hat H_1}{\partial y}(t)(Y_1(t)-\hat Y_1(t))\,dt + \int_0^T \hat\lambda_1(t)(-g_1(t)+\hat g_1(t))\,dt\\
&\qquad + \int_0^T (Z_1(t)-\hat Z_1(t))\frac{\partial \hat H_1}{\partial z}(t)\,dt + \int_0^T\int_{\mathbb{R}_0}(K_1(t,\zeta)-\hat K_1(t,\zeta))\nabla_k \hat H_1(t,\zeta)\,\nu(d\zeta)\,dt\Big]\\
&\le E\big[\hat\lambda_1(T)\,h_1'(\hat A(T))(A(T)-\hat A(T))\big] - E\Big[\int_0^T \frac{\partial \hat H_1}{\partial y}(t)(Y_1(t)-\hat Y_1(t))\,dt + \int_0^T \hat\lambda_1(t)(-g_1(t)+\hat g_1(t))\,dt\\
&\qquad + \int_0^T (Z_1(t)-\hat Z_1(t))\frac{\partial \hat H_1}{\partial z}(t)\,dt + \int_0^T\int_{\mathbb{R}_0}(K_1(t,\zeta)-\hat K_1(t,\zeta))\nabla_k \hat H_1(t,\zeta)\,\nu(d\zeta)\,dt\Big]. \tag{3.22}
\end{aligned}
\]
Summing (3.20), (3.21) and (3.22), we have
\[
\begin{aligned}
I_1+I_2+I_3 &\le E\Big[\int_0^T \Big(\frac{\partial \hat H_1}{\partial a}(t)(A(t)-\hat A(t)) + \frac{\partial \hat H_1}{\partial a_1}(t)(A_1(t)-\hat A_1(t)) + \frac{\partial \hat H_1}{\partial a_2}(t)(A_2(t)-\hat A_2(t))\\
&\qquad + \frac{\partial \hat H_1}{\partial u_1}(t)(u_1(t)-\hat u_1(t)) + \mu_1(t)(A(t)-\hat A(t))\Big)\,dt\Big]\\
&= E\Big[\int_\delta^{T+\delta} \Big(\frac{\partial \hat H_1}{\partial a}(t-\delta) + \frac{\partial \hat H_1}{\partial a_1}(t)\,\chi_{[0,T]}(t) + \mu_1(t-\delta)\Big)(A_1(t)-\hat A_1(t))\,dt\\
&\qquad + \int_0^T \frac{\partial \hat H_1}{\partial a_2}(t)(A_2(t)-\hat A_2(t))\,dt + \int_0^T \frac{\partial \hat H_1}{\partial u_1}(t)(u_1(t)-\hat u_1(t))\,dt\Big]. \tag{3.23}
\end{aligned}
\]
Using integration by parts and substituting r = t − δ, we get
\[
\begin{aligned}
\int_0^T \frac{\partial \hat H_1}{\partial a_2}(s)(A_2(s)-\hat A_2(s))\,ds
&= \int_0^T \frac{\partial \hat H_1}{\partial a_2}(s)\int_{s-\delta}^s e^{-\rho(s-r)}(A(r)-\hat A(r))\,dr\,ds\\
&= \int_0^T \Big(\int_r^{r+\delta} \frac{\partial \hat H_1}{\partial a_2}(s)\,e^{-\rho s}\,\chi_{[0,T]}(s)\,ds\Big)\, e^{\rho r}(A(r)-\hat A(r))\,dr\\
&= \int_\delta^{T+\delta} \Big(\int_{t-\delta}^t \frac{\partial \hat H_1}{\partial a_2}(s)\,e^{-\rho s}\,\chi_{[0,T]}(s)\,ds\Big)\, e^{\rho(t-\delta)}(A(t-\delta)-\hat A(t-\delta))\,dt. \tag{3.24}
\end{aligned}
\]
Combining this with (3.10) and using (3.23), we obtain
\[
\begin{aligned}
&J_1(u_1,\hat u_2) - J_1(\hat u_1,\hat u_2)\\
&\le E\Big[\int_\delta^{T+\delta} \Big(\frac{\partial \hat H_1}{\partial a}(t-\delta) + \frac{\partial \hat H_1}{\partial a_1}(t)\,\chi_{[0,T]}(t) + \Big(\int_{t-\delta}^t \frac{\partial \hat H_1}{\partial a_2}(s)\,e^{-\rho s}\,\chi_{[0,T]}(s)\,ds\Big)\, e^{\rho(t-\delta)} + \mu_1(t-\delta)\Big)(A_1(t)-\hat A_1(t))\,dt\\
&\qquad + \int_0^T \frac{\partial \hat H_1}{\partial u_1}(t)(u_1(t)-\hat u_1(t))\,dt\Big]\\
&= E\Big[\int_0^T \frac{\partial \hat H_1}{\partial u_1}(t)(u_1(t)-\hat u_1(t))\,dt\Big]
= E\Big[\int_0^T E\Big[\frac{\partial \hat H_1}{\partial u_1}(t)(u_1(t)-\hat u_1(t))\,\Big|\,\mathcal{E}_t^{(1)}\Big]\,dt\Big] \le 0. \tag{3.25}
\end{aligned}
\]
The last inequality follows from condition (3.14) for i = 1. Hence J_1(u_1, û_2) ≤ J_1(û_1, û_2) for all u_1 ∈ A_1. The inequality
\[
J_2(\hat u_1, u_2) \le J_2(\hat u_1,\hat u_2) \quad \text{for all } u_2\in\mathcal{A}_2
\]
is proved in the same way. This completes the proof. □
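The interchange of the order of integration behind (3.24) can be checked numerically with deterministic test functions; in the sketch below, φ stands in for ∂Ĥ_1/∂a_2 and x for A − Â, both replaced by arbitrary smooth functions (our own assumptions), with x vanishing on [−δ, 0].

```python
import numpy as np

# Numerical check of the interchange of integration in (3.24):
#   int_0^T phi(s) int_{s-delta}^s e^{-rho(s-r)} x(r) dr ds
# = int_delta^{T+delta} ( int_{t-delta}^t phi(s) e^{-rho s} 1_[0,T](s) ds )
#                         * e^{rho (t - delta)} x(t - delta) dt,
# for a test function x with x = 0 on [-delta, 0].
T, delta, rho = 1.0, 0.3, 0.7
n = 2000
dt = (T + 2 * delta) / n
grid = -delta + dt * np.arange(n + 1)          # covers [-delta, T+delta]

phi = lambda s: np.where((s >= 0) & (s <= T), np.cos(3 * s), 0.0)
x = lambda r: np.where(r > 0, np.sin(2 * r) * r, 0.0)  # zero on [-delta, 0]

def inner(f, lo, hi):
    s = grid[(grid >= lo) & (grid <= hi)]
    return np.sum(f(s)) * dt

# left-hand side
lhs = sum(phi(s) * inner(lambda r: np.exp(-rho * (s - r)) * x(r),
                         s - delta, s) * dt
          for s in grid if 0 <= s <= T)

# right-hand side after the substitution t = r + delta
rhs = sum(inner(lambda s: phi(s) * np.exp(-rho * s), t - delta, t)
          * np.exp(rho * (t - delta)) * x(t - delta) * dt
          for t in grid if delta <= t <= T + delta)

print(lhs, rhs)   # agree up to discretization error
```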
If we now start from t ∈ [0,T], then the following result can easily be derived.
Corollary 3.4. Let (û_1, û_2) ∈ A_1 × A_2 with corresponding solutions Â(t), Ŷ_i(t), Ẑ_i(t), K̂_i(t,ζ), λ̂_i(t), p̂_i(t), q̂_i(t), r̂_i(t,ζ) of equations (3.1), (3.4), (3.8) and (3.9) for i = 1,2, and suppose that the other conditions of Theorem 3.2 hold. Then û(t) = (û_1(t), û_2(t)) is a Nash equilibrium for (3.1)–(3.5).
Proof. It follows easily from the proof of Theorem 3.2, with the starting value being t instead of 0, and using the fact that E_t^{(i)} ⊂ F_t, i = 1,2. □
3.2. A necessary maximum principle for FBSDDE games.
One of the drawbacks of the sufficient maximum principle is the concavity condition (3.12), which may fail in some interesting applications. In particular, for a zero-sum game, the concavity condition implies that ϕ_1, ψ_1 and h_1 are affine functions, which is too strong.
In what follows, we shall prove a version of the maximum principle which does not require concavity. In fact, we shall show the equivalence between a critical point of the conditional Hamiltonian and a point where the directional derivative of the performance functional vanishes. To this end, we make the following assumptions:
Assumption A1. For all t_0 ∈ [0,T] and all bounded E_{t_0}^{(i)}-measurable random variables α_i(ω), the control process β_i(t) defined by
\[
\beta_i(t) := \chi_{(t_0,T)}(t)\,\alpha_i(\omega); \quad t\in[0,T], \tag{3.26}
\]
belongs to A_i, i = 1,2.
Assumption A2. For all u_i ∈ A_i and all bounded β_i ∈ A_i, there exists ε > 0 such that
\[
\tilde u_i(t) := u_i(t) + s\,\beta_i(t) \in \mathcal{A}_i \quad \text{for all } s\in(-\varepsilon,\varepsilon).
\]
Assumption A3. For all bounded β_i ∈ A_i, the derivative processes
\[
\begin{aligned}
X_1(t) &= \frac{d}{ds}A^{(u_1+s\beta_1,\hat u_2)}(t)\Big|_{s=0}; & X_2(t) &= \frac{d}{ds}A^{(\hat u_1,u_2+s\beta_2)}(t)\Big|_{s=0};\\
y_1(t) &= \frac{d}{ds}Y_1^{(u_1+s\beta_1,\hat u_2)}(t)\Big|_{s=0}; & y_2(t) &= \frac{d}{ds}Y_2^{(\hat u_1,u_2+s\beta_2)}(t)\Big|_{s=0};\\
z_1(t) &= \frac{d}{ds}Z_1^{(u_1+s\beta_1,\hat u_2)}(t)\Big|_{s=0}; & z_2(t) &= \frac{d}{ds}Z_2^{(\hat u_1,u_2+s\beta_2)}(t)\Big|_{s=0};\\
k_1(t,\zeta) &= \frac{d}{ds}K_1^{(u_1+s\beta_1,\hat u_2)}(t,\zeta)\Big|_{s=0}; & k_2(t,\zeta) &= \frac{d}{ds}K_2^{(\hat u_1,u_2+s\beta_2)}(t,\zeta)\Big|_{s=0}
\end{aligned}
\]
exist and belong to L²(λ×P).
It follows from (3.1) that
\[
\begin{aligned}
dX_1(t) &= \Big(\frac{\partial b}{\partial a}(t)X_1(t) + \frac{\partial b}{\partial a_1}(t)X_1(t-\delta) + \frac{\partial b}{\partial a_2}(t)\int_{t-\delta}^t e^{-\rho(t-r)}X_1(r)\,dr + \frac{\partial b}{\partial u_1}(t)\beta_1(t)\Big)\,dt\\
&\quad + \Big(\frac{\partial \sigma}{\partial a}(t)X_1(t) + \frac{\partial \sigma}{\partial a_1}(t)X_1(t-\delta) + \frac{\partial \sigma}{\partial a_2}(t)\int_{t-\delta}^t e^{-\rho(t-r)}X_1(r)\,dr + \frac{\partial \sigma}{\partial u_1}(t)\beta_1(t)\Big)\,dB(t)\\
&\quad + \int_{\mathbb{R}_0}\Big(\frac{\partial \gamma}{\partial a}(t,\zeta)X_1(t) + \frac{\partial \gamma}{\partial a_1}(t,\zeta)X_1(t-\delta) + \frac{\partial \gamma}{\partial a_2}(t,\zeta)\int_{t-\delta}^t e^{-\rho(t-r)}X_1(r)\,dr\\
&\qquad\qquad + \frac{\partial \gamma}{\partial u_1}(t,\zeta)\beta_1(t)\Big)\,\tilde N(dt,d\zeta), \quad t\in[0,T], \tag{3.27}\\
X_1(t) &= 0, \quad t\in[-\delta,0].
\end{aligned}
\]
Here we have used, for notational simplicity,
\[
\frac{d}{ds}A_1^{(u_1+s\beta_1,\hat u_2)}(t)\Big|_{s=0} = \frac{d}{ds}A^{(u_1+s\beta_1,\hat u_2)}(t-\delta)\Big|_{s=0} = X_1(t-\delta),
\]
\[
\frac{d}{ds}A_2^{(u_1+s\beta_1,\hat u_2)}(t)\Big|_{s=0} = \frac{d}{ds}\int_{t-\delta}^t e^{-\rho(t-r)}A^{(u_1+s\beta_1,\hat u_2)}(r)\,dr\Big|_{s=0} = \int_{t-\delta}^t e^{-\rho(t-r)}X_1(r)\,dr.
\]
It follows from (3.4) that
\[
\begin{aligned}
dy_1(t) &= \Big(\frac{\partial g_1}{\partial a}(t)X_1(t) + \frac{\partial g_1}{\partial a_1}(t)X_1(t-\delta) + \frac{\partial g_1}{\partial a_2}(t)\int_{t-\delta}^t e^{-\rho(t-r)}X_1(r)\,dr + \frac{\partial g_1}{\partial u_1}(t)\beta_1(t)\\
&\qquad + \frac{\partial g_1}{\partial y}(t)y_1(t) + \frac{\partial g_1}{\partial z}(t)z_1(t) + \int_{\mathbb{R}_0}\nabla_k g_1(t,\zeta)\,k_1(t,\zeta)\,\nu(d\zeta)\Big)\,dt\\
&\quad + z_1(t)\,dB(t) + \int_{\mathbb{R}_0}k_1(t,\zeta)\,\tilde N(dt,d\zeta), \quad t\in[0,T], \tag{3.28}\\
y_1(T) &= h_1'\big(A^{(u_1,\hat u_2)}(T)\big)X_1(T),
\end{aligned}
\]
and similarly we obtain dX_2(t), dy_2(t).
Theorem 3.5 (Necessary maximum principle for FBSDDE games). Let u ∈ A_1 × A_2 with corresponding solutions A(t) of (3.1), (Y_i(t), Z_i(t), K_i(t,ζ)) of (3.4), λ_i(t) of (3.8), (p_i(t), q_i(t), r_i(t,ζ)) of (3.9), and corresponding derivative processes X_i(t) and (y_i(t), z_i(t), k_i(t,ζ)) given by (3.27) and (3.28), respectively. Assume that Assumptions A1, A2 and A3 hold. Moreover, assume the following conditions:
\[
\begin{aligned}
E\Big[\int_0^T p_i^2(t)\Big\{&\Big(\frac{\partial\sigma}{\partial a}\Big)^2(t)X_i^2(t) + \Big(\frac{\partial\sigma}{\partial a_1}\Big)^2(t)X_i^2(t-\delta) + \Big(\frac{\partial\sigma}{\partial a_2}\Big)^2(t)\Big(\int_{t-\delta}^t e^{-\rho(t-r)}X_i(r)\,dr\Big)^2 + \Big(\frac{\partial\sigma}{\partial u_i}\Big)^2(t)\beta_i^2(t)\\
&+ \int_{\mathbb{R}_0}\Big(\Big(\frac{\partial\gamma}{\partial a}\Big)^2(t,\zeta)X_i^2(t) + \Big(\frac{\partial\gamma}{\partial a_1}\Big)^2(t,\zeta)X_i^2(t-\delta) + \Big(\frac{\partial\gamma}{\partial a_2}\Big)^2(t,\zeta)\Big(\int_{t-\delta}^t e^{-\rho(t-r)}X_i(r)\,dr\Big)^2\\
&\qquad\quad + \Big(\frac{\partial\gamma}{\partial u_i}\Big)^2(t,\zeta)\beta_i^2(t)\Big)\,\nu(d\zeta)\Big\}\,dt
+ \int_0^T X_i^2(t)\Big(q_i^2(t) + \int_{\mathbb{R}_0}r_i^2(t,\zeta)\,\nu(d\zeta)\Big)\,dt\Big] < \infty, \quad i=1,2, \tag{3.29}
\end{aligned}
\]
and
\[
E\Big[\int_0^T y_i^2(t)\Big(\Big(\frac{\partial H_i}{\partial z}\Big)^2(t) + \int_{\mathbb{R}_0}(\nabla_k H_i)^2(t,\zeta)\,\nu(d\zeta)\Big)\,dt
+ \int_0^T \lambda_i^2(t)\Big(z_i^2(t) + \int_{\mathbb{R}_0}k_i^2(t,\zeta)\,\nu(d\zeta)\Big)\,dt\Big] < \infty, \quad i=1,2. \tag{3.30}
\]
Then the following are equivalent: