Linear-quadratic mean-field game of backward stochastic differential systems

This paper is concerned with a dynamic game of N weakly-coupled linear backward stochastic differential equation (BSDE) systems involving mean-field interactions. The backward mean-field game (MFG) is introduced to establish the backward decentralized strategies. To this end, we introduce the notions of the Hamiltonian-type consistency condition (HCC) and the Riccati-type consistency condition (RCC) in the BSDE setup. The backward MFG strategies are then derived from the HCC and RCC respectively. Under mild conditions, these two MFG solutions are shown to be equivalent. Next, the derived MFG strategies are shown to constitute an approximate Nash equilibrium. In addition, the scalar-valued case of the backward MFG is solved explicitly. As an illustration, an example from quadratic hedging with relative performance is further studied.


1.
Introduction. In recent years, the study of mean-field games (MFG) has attracted consistent and extensive research attention because of its significant theoretical value and broad practical applications. The MFG methodology provides an effective and tractable framework to establish approximate Nash equilibria for weakly-coupled controlled stochastic systems with mean-field interactions. Such a stochastic system consists of a large number of agents that are individually negligible, while their collective behavior imposes a significant impact on all agents. In the linear-quadratic (LQ) case, the individual linear dynamics and (or) quadratic cost functionals are coupled via the state-average across the whole weakly-coupled system. The streamline of MFG can be roughly sketched as follows. First, we study what limit the realized optimal state-average should take as the number of agents tends to infinity. Next, such a limit is frozen and an auxiliary stochastic control problem is constructed correspondingly. A set of strategies for the finite-agent game is then derived by the so-called consistency condition (CC) system. We call such strategies MFG strategies. The analysis of the CC system involves a fixed-point argument: the realized state-average generated by the MFG strategies should replicate the state limit frozen when constructing the auxiliary control problem. The MFG strategies should form an approximate Nash equilibrium, in the sense that no agent can improve his cost or performance functional by more than ε(N) through any unilateral strategy perturbation, where ε(N) −→ 0 as N −→ +∞. The interested reader may refer to [3,4,5,8,19,20,22,23,36,37] for more details of MFG with statistically symmetric or heterogeneous agents. Furthermore, some recent literature can be found in [18,29,31] for the study of MFG allowing minor-major agents, and in [2,7,39] for stochastic mean-field-type control problems.
The aim of this paper is to present a new type of weakly-coupled stochastic system and to introduce the related MFG theory for such a coupled system. We point out that the states in all the MFG literature mentioned above follow forward (linear or nonlinear) stochastic differential equations (SDEs). By "forward", we mean that the initial condition of the system is pre-specified a priori, and the system propagates from the initial to the terminal time in a forward manner. By contrast, this paper introduces a large-population system whose individual states follow linear backward stochastic differential equations (BSDEs). Unlike forward SDEs, the terminal condition of a BSDE is specified a priori and, as a consequence, the solution of a BSDE consists of an adapted pair (x(t), z(t)). Here, the second component z(t) (called the diffusion solution component) is naturally introduced to ensure the adaptedness of x(t) when propagating backward from the terminal to the initial time. Linear BSDEs were initially introduced in [6] and extended to the nonlinear case in [32]. BSDEs are well-formulated stochastic systems and have found various applications (e.g., [13,15,27,33,34,40]).
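For readers less familiar with BSDEs, a generic (possibly nonlinear) BSDE and its adapted solution pair can be sketched as follows; the notation is illustrative and does not reproduce any numbered equation of the paper:

```latex
\begin{equation*}
x(t) = \xi + \int_t^T g\big(s, x(s), z(s)\big)\,ds - \int_t^T z(s)\,dW(s),
\qquad t \in [0, T],
\end{equation*}
```

where the terminal condition $\xi$ is prescribed, $g$ is the driver, and the pair $(x(\cdot), z(\cdot))$ is required to be adapted to the underlying filtration; the component $z(\cdot)$ is exactly what allows an adapted solution despite the data being given at the terminal time.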
As BSDEs are well-defined stochastic systems with a broad range of applications, it is natural to study the dynamic optimization of BSDEs in the large-population setup. Indeed, the weakly-coupled BSDE system with mean-field interaction is motivated by a variety of scenarios. One example arises from quadratic hedging with relative performance. Note that relative performance is well-documented in economics and finance (see [1,9,10]), and quadratic hedging is also a well-studied topic (see [11,12,24]). Another example concerns the hedging of pension funds in the presence of systemic risk. The hedging of pension funds with portfolio selection has been studied in the literature (see e.g., [16,30]). On the other hand, when there are many competitive pension funds, it is necessary to consider the possible systemic risk across the whole industry. Hence, the problem can be formulated as a dynamic optimization of a large-population BSDE system. A third example arises from dynamic economic models for production planning with tracking terminal objectives, affected by the market price via the production average. More details can be found in [14,13]. In addition, controlled forward large-population systems subject to terminal constraints can be reformulated as backward large-population systems, as motivated by [21,25]. In particular, [21] studied the relationship between a BSDE and a forward LQ optimal control problem, by regarding the diffusion component of the BSDE and the initial state as controls and minimizing the second moment of the difference between the terminal state and the terminal target specified by the BSDE. Based on [21], Lim and Zhou [25] discussed a backward LQ optimal control problem, but in the single-agent case. In 2016, Huang, Wang and Wu [17] first studied backward mean-field LQG games, but in their paper the state and cost functional are too specialized to cover the problems noted above.
Thus, we need to study a more general case which also has broad potential applications in practice. Here, we extend the study to the large-population case with a large number of small agents.
Inspired by the above motivations, this paper studies the dynamic optimization problem of large-population backward stochastic differential systems. Specifically, we consider large-population dynamic games involving N weakly-coupled agents whose states satisfy backward stochastic differential equation (BSDE) systems. The agents interact through their individual finite-horizon cost functionals via their state-average and cost-average, which can be interpreted as an index related to the market price or systemic risk.
The main contributions of this paper are as follows:
• (i) The LQ backward MFG is introduced for a general class of weakly-coupled backward stochastic systems. The asymptotic state-average limit in the BSDE setup is introduced, and an auxiliary stochastic LQ control problem for BSDEs is formulated. By virtue of the maximum principle, the optimal auxiliary control can be represented via the Hamiltonian system and the adjoint process. The decentralized strategies can then be derived through the consistency condition (CC) system in the BSDE case. We establish the CC system based on the Hamiltonian system (here, we term it the Hamiltonian-type consistency condition (HCC)).
• (ii) The existence and uniqueness of the HCC system is obtained by virtue of the monotonicity condition method for backward-forward stochastic differential equations (BFSDEs).
• (iii) On the other hand, applying the completion-of-squares method, we obtain linear feedback optimal strategies based on the introduction of some Riccati equations. The related CC system is also derived, which is termed the Riccati-type CC (RCC). Under some mild conditions, we prove that the HCC and RCC are actually equivalent.
• (iv) Given the MFG solutions based on the HCC and RCC, we verify that they form an approximate Nash equilibrium. Our result implies that the ε-Nash equilibrium property holds for the finite-N population system with ε = O(1/√N).
• (v) Last but not least, we give the explicit solutions of the backward MFG problem in the one-dimensional case and obtain the effects of the mean-field coupling parameters on the optimal cost, which is useful for concrete problems.
Some remarks on the above points are in order. As to (i), unlike the standard Hamiltonian system for forward SDE control, which is a forward-backward SDE (FBSDE), the Hamiltonian system in the BSDE control setup becomes a backward-forward SDE (BFSDE) which is coupled in its initial condition. As to (iii), the HCC qualifies as an open-loop system, but the RCC, which may be regarded as a decoupled control system, does not qualify as closed-loop because it involves a quantity determined by the terminal condition of the BSDE. This distinguishes our study of large-population BSDEs from its counterpart for forward SDEs. One condition under which the HCC and RCC are equivalent is the standard assumption that all weight matrices are non-negative definite. However, in case the weight matrices are indefinite or negative definite, the RCC and HCC should differ. This will be carried out in our future studies.
The structure of this paper is as follows. Section 1 gives the introduction and specifies some standard notations and terminology. Section 2 formulates the dynamic game of the weakly-coupled backward stochastic differential system. Section 3 studies the backward LQ MFG to derive the MFG strategies, which are decentralized. Both the Hamiltonian-type consistency condition (HCC) and Riccati-type consistency condition (RCC) systems are introduced, and their equivalence is established under mild conditions. Section 4 is devoted to the asymptotic analysis of the related ε-Nash equilibrium. Section 5 studies the scalar-valued case, and one special but illustrative example is also presented therein. Its interpretation in quadratic hedging with relative performance criteria is also discussed.
Section 6 concludes our work and presents some future research direction.
1.1. Notation and terminology. The following notations will be used throughout the paper. Let R^m denote the m-dimensional Euclidean space with standard Euclidean norm | · | and standard Euclidean inner product ⟨·, ·⟩. The transpose of a vector (or matrix) x is denoted by x^⊤. Tr(A) denotes the trace of a square matrix A. Let R^{m×n} be the Hilbert space consisting of all (m × n)-matrices with the inner product ⟨A, B⟩ := Tr(AB^⊤) and the norm |A| := ⟨A, A⟩^{1/2}. We denote the set of symmetric n × n matrices with real entries by S^n and the n × n identity matrix by I_n. If M ∈ S^n is positive (semi-)definite, we write M > (≥) 0.
Consider a finite time horizon [0, T] for a fixed T > 0. Let X be a given Hilbert space. The set of X-valued continuous functions is denoted by C([0, T]; X). If N(·) ∈ C([0, T]; S^n) and N(t) > (≥) 0 for every t ∈ [0, T], we say that N(·) is positive (semi-)definite. These definitions generalize in the obvious way to the case when f(·) is R^{n×m}- (or S^n-)valued. Finally, when we restrict ourselves to deterministic Borel measurable functions f : [0, T] → R^n, we drop the subscript F in the notation; for example, L^∞(0, T; R^n). We use an integer-valued subscript to denote an individual agent {A_i : 1 ≤ i ≤ N}.

2.
Weakly-coupled linear backward stochastic system. Consider a large-population backward system with N weakly-coupled agents {A_i : 1 ≤ i ≤ N}. The state of the i-th agent A_i is given by the controlled linear BSDE (1). Note that z_i, z_{ij}, 1 ≤ j ≤ d, are also part of the solution of (1); they are introduced to ensure the adaptedness of x_i. The terminal conditions of the individual agents, ξ_i ∈ F^i_T, i = 1, 2, · · · , N, represent random future objectives to be achieved or tracking targets to be hedged (see e.g. [15]). The admissible control u_i ∈ U_i, where the admissible control set U_i is defined below. Later, we shall state some assumptions on the coefficients A(·), B(·) and the terminal conditions ξ_i, 1 ≤ i ≤ N, to guarantee existence and uniqueness. We refer to such a (d+3)-tuple (x_i(·), z_i(·), z_{i1}(·), · · · , z_{id}(·), u_i(·)) as an admissible (d+3)-tuple. Let u_{−i} = (u_1, · · · , u_{i−1}, u_{i+1}, · · · , u_N) denote the strategies of all agents except A_i, and u = (u_1, · · · , u_N) the set of strategies of all N agents. The individual cost of A_i associated with an admissible multi-tuple (x_i(·), z_i(·), z_{i1}(·), · · · , z_{id}(·), u_i(·)) is given by (2), where x^{(N)}(·) denotes the state-average and u^{(N)}(·) = (1/N) Σ_{i=1}^N u_i(·) the control-average. For the sake of notational simplicity, the time argument is suppressed in the cost functional above and in the sequel of this paper wherever necessary. Throughout this paper, we shall assume the following: (A1) the terminal conditions ξ_i ∈ L²_{F^i_T}(Ω; R^n), i = 1, 2, · · · , N, are conditionally independent and identically distributed with respect to F^B_T (see [32] or Section 7 of [40]). Now, let us formulate the dynamic game problem of the backward large-population system.
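Equations (1)-(2) are not reproduced in the text above; a representative controlled linear BSDE consistent with the (d+3)-tuple description (the coefficients and noise structure here are assumptions for illustration only) is

```latex
\begin{equation*}
\begin{cases}
dx_i(t) = \big[A(t)\,x_i(t) + B(t)\,u_i(t)\big]\,dt
          + z_i(t)\,dW_i(t) + \displaystyle\sum_{j=1}^{d} z_{ij}(t)\,dB_j(t),\\[4pt]
x_i(T) = \xi_i,
\end{cases}
\end{equation*}
```

so that the adapted solution components $(x_i, z_i, z_{i1}, \dots, z_{id})$ together with the control $u_i$ form the admissible $(d+3)$-tuple.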
3. Backward linear-quadratic (LQ) mean-field game (MFG). We use MFG theory to study Problem (I). To start, we need to introduce some auxiliary stochastic control problems parameterized by the state-average limit as N tends to infinity. By (A1) and the conditional strong law of large numbers (see [28]), this limit exists and ξ_0 is determined accordingly. Such an adaptedness property will be verified by our later analysis. We now introduce the limiting cost functional and formulate the auxiliary LQ backward control problem as follows. We call ū_i an optimal control for Problem (II) if (5) holds. Moreover, the corresponding state (x̄_i(·), z̄_i(·), z̄_{i1}(·), · · · , z̄_{id}(·)) is called the optimal trajectory. To study Problem (II), we first present the following result.
is strictly convex and coercive in u_i(·).

KAI DU, JIANHUI HUANG AND ZHEN WU
Proof. For any i = 1, 2, · · · , N, we introduce the variational equation. Then, for any variation δu_i of ū_i, the associated first-order variation of J_i(ū_i) satisfies an identity obtained by applying Itô's formula. Combining this identity with ς_i(T) = 0, and using the same method, we obtain (12)-(14), from which the optimality of ū_i follows. For any 1 ≤ i ≤ N, the Hamiltonian system (11) is a backward-forward SDE (BFSDE) system (see the generalized FBSDE form in [38]), and we have the following well-posedness result.
Proof. For convenience, introduce the notations, where β_1 > 0. Thus, the Hamiltonian system (11) satisfies the monotonicity assumptions in [41]. Similarly to Theorem 2.1 in [41], it follows that (11) admits a unique solution.
3.1. Hamiltonian-type consistency condition. Problem (II) can be solved by Proposition 2, which is still parameterized by the undetermined state-control average limit processes (x_0, u_0). Keeping this in mind, a crucial step of the MFG is to determine (x_0, u_0) via the so-called consistency condition (CC) (also named the Nash certainty equivalence (NCE) principle, see [19], etc.). Based on the coupled BFSDE (11), we aim to derive the CC system for the limiting processes (x_0, u_0), and then to prove that this CC system is well-posed. Since our analysis is mainly based on the coupled BFSDE (11), which arises as a Hamiltonian system, the resulting CC system is called the Hamiltonian-type CC system. Firstly, we sum the N equations of (11) and divide by N; the resulting limit processes are stochastic processes to be determined.
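The fixed-point character of the consistency condition can be illustrated on a deliberately simplified static proxy: each agent best-responds to a frozen mean m, and the CC requires that the average of the best responses reproduce m. The quadratic objective, the coefficients k, c, r and the Picard iteration below are all illustrative assumptions; the paper's actual CC is a coupled BFSDE system, not this scalar map.

```python
# Toy static proxy for the MFG consistency condition (CC): each agent
# minimizes (x - k*m - c)**2 + r*x**2 against a frozen mean m, and the CC
# requires the average best response to reproduce m. Coefficients are
# illustrative only.

def best_response(m, k=0.5, c=1.0, r=1.0):
    """Minimizer of (x - k*m - c)**2 + r*x**2 over x (first-order condition)."""
    return (k * m + c) / (1.0 + r)

def solve_cc(tol=1e-12, max_iter=1000):
    """Picard (fixed-point) iteration m <- best_response(m)."""
    m = 0.0
    for _ in range(max_iter):
        m_next = best_response(m)
        if abs(m_next - m) < tol:
            return m_next
        m = m_next
    return m

# Closed-form fixed point of this toy map: m* = c / (1 + r - k) = 2/3.
m_star = solve_cc()
```

Because the best-response map is a contraction (slope k/(1+r) < 1), the Picard iteration converges geometrically; the same contraction idea underlies fixed-point arguments for well-posedness of CC systems.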
Proof. Firstly, for any 1 ≤ i ≤ N, consider the SDE (23). Since (23) is a linear SDE with bounded coefficients and square-integrable non-homogeneous terms, it has a unique solution y_i(·). Secondly, we can define (24). Applying Itô's formula to (24), we derive the dynamics (25). Now we define (26). Substituting (26) into (25) and noting (from (24)) that x̄_i(T) = ξ, the desired relation follows. On the other hand, (28) follows from (26). Thirdly, substituting (24) and (28) into (23) and noting the initial value x̄_i(0), we obtain that y_i(·) is a solution of the resulting SDE. Finally, we show (30). Notice that y_i(·), P(·), x̄_i(·) and q_i(·) are solutions of (23), (19), (25) and (21), respectively. By virtue of (24), y_i is also the unique solution of the SDE just derived. Using Itô's formula, it is straightforward to verify the required identity; hence (30) follows from the uniqueness of solutions of linear SDEs.

Remark 1.
We have shown that (y_i, x̄_i, z̄_i, z̄_{i1}, · · · , z̄_{id}) is the solution of the Hamiltonian system (11) if and only if y_i(t) is the solution of (23), x̄_i(t) is the solution of (25), and z̄_i(t), z̄_{ij}(t) satisfy (26). Thus, we may use the Hamiltonian system (11) or (23), (25) and (26) interchangeably to represent the processes (y_i, x̄_i, z̄_i, z̄_{i1}, · · · , z̄_{id}). This is an important observation which simplifies our subsequent analysis considerably. Theorem 3.1. Under (A1)-(A2), the optimal control of Problem (II) has the decoupled form (31), where the optimal trajectory satisfies (32) and q_i(·) is given by (21).
Proof. It follows immediately from Proposition 2 and Proposition 5.

Remark 2.
We remark that the decoupled control (31) does not qualify as a closed-loop control, since q_i(·) depends on η_i(·) from (21), which in turn depends on ξ_i (the terminal condition of the state x_i, which is given a priori). For this reason, the resulting decoupled CC system is termed the Riccati CC system instead of a closed-loop CC system. Firstly, summing the N equations of (20) and dividing by N, we obtain the averaged equation. Similarly, by (21) and (25), we obtain the corresponding limiting equations, where x_i(·) is approximated by x_0(·).
In addition, RCC and HCC are equivalent.
4. Asymptotic analysis: ε-Nash equilibrium. In the previous sections, we obtained the mean-field game (MFG) strategies ū_i(·), 1 ≤ i ≤ N, via the auxiliary Problem (II) and the consistency condition system. In this section, we analyze the asymptotic property of the MFG strategies and verify that they possess the ε-Nash equilibrium property. To start, we first recall the definition of an ε-Nash equilibrium (see [5], [8], [19], etc.).
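The elided definition is standard and can be restated as follows: the strategy set $(\bar u_1, \dots, \bar u_N)$ is an $\varepsilon$-Nash equilibrium if, for every agent $i$ and every admissible unilateral deviation,

```latex
\begin{equation*}
J_i(\bar u_i, \bar u_{-i}) \;\le\; \inf_{u_i \in \mathcal{U}_i} J_i(u_i, \bar u_{-i}) + \varepsilon,
\qquad 1 \le i \le N,
\end{equation*}
```

with $\varepsilon \ge 0$; a $0$-Nash equilibrium is an exact Nash equilibrium.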
We state the main result of this section and its proof will be given later.
where q_i(·) and u_0(·) are given by (21) and (36) respectively, and x̃_i(·) denotes the state process corresponding to ũ_i(·) in Problem (I), given by the BSDE (42). Let x̄_i(·) denote the state process corresponding to ū_i(·) in Problem (II); then we have (43). Note that the state-average is coupled in the cost only, and (42)-(43) have the same coefficients. Thus, (x̃_i, z̃_i, z̃_{i1}, · · · , z̃_{id}, ũ_i) coincides with (x̄_i, z̄_i, z̄_{i1}, · · · , z̄_{id}, ū_i) for any i = 1, 2, · · · , N. To prove Theorem 4.2, under (A1)-(A3), we first present the following estimates.
Proof. Under (A1)-(A2), by (37) and (43), we easily get the first estimate, where the last equality is obtained by Lemma 4.3. Similarly, we get the next two estimates. Furthermore, for any i = 1, 2, · · · , N, by z̃_i(·) = z̄_i(·) P-a.s. and z̃_{ij}(·) = z̄_{ij}(·) P-a.s., j = 1, 2, · · · , d, we obtain the corresponding bound. Then, from (2) and (4), the conclusion follows. We have addressed some estimates of the states and costs corresponding to the controls ũ_i, 1 ≤ i ≤ N. Our remaining analysis is to prove that the strategy set (ũ_1, ũ_2, · · · , ũ_N) is an ε-Nash equilibrium for Problem (I). For any fixed i, 1 ≤ i ≤ N, consider an admissible alternative control u_i ∈ U_i for A_i and denote the corresponding state by (49), while all other agents keep the controls ũ_k, 1 ≤ k ≤ N, k ≠ i, with state equations (50). Furthermore, the cost functional of A_i is defined accordingly. Then, when making the perturbation, we only need to consider u_i ∈ U_i such that the perturbed cost does not exceed the original one. Thus, we obtain a bound with C_5 a positive constant independent of N. Besides, by (49), (50) and the standard estimates for BSDEs, we obtain the L² bound of l_j, j ≠ i, and the following inequality, where C_6 is a positive constant. For the i-th agent A_i, consider the perturbation in Problem (II) and introduce an auxiliary system. Its state trajectories are the solutions of the BSDEs (49) and (50), and its cost functional is defined accordingly. Then we have the following estimates.
Proof. By (41), (42) and (50), comparing their coefficients, we can obtain that (x_k, z_k, z_{k1}, · · · , z_{kd}) is the same as (l_k, n_k, n_{k1}, · · · , n_{kd}), k ≠ i. Thus, we have the stated identity, where the last equality is due to the uniform bound established above. By (47) and (51), we obtain the next estimate. Using a similar method, we can get (53) and (54).
In addition, based on Lemma 4.5, we obtain a more direct estimate to prove Theorem 4.2.
Lemma 4.6. For any 1 ≤ i ≤ N, the corresponding estimate holds. Proof. The proof is similar to that of Lemma 4.4; we omit it.
Proof of Theorem 4.2. Now we consider the ε-Nash equilibrium for A_i. Combining Lemmas 4.4 and 4.6, we obtain the required inequality. Thus, Theorem 4.2 follows by taking ε = O(1/√N).
Remark 3. In this section, we obtained the ε-Nash equilibrium of the MFG strategies based on the RCC system. Note that the RCC and HCC are equivalent; thus the MFG strategies from the HCC system also satisfy the ε-Nash equilibrium property. We omit the proof here.
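The order ε = O(1/√N) mirrors the rate at which a sample average of N independent quantities concentrates around its mean. A small Monte Carlo check of this rate is sketched below; standard normal draws stand in for the agents' idiosyncratic randomness, and all parameter values are illustrative only.

```python
import random, math

def avg_deviation(N, trials=2000, seed=0):
    """RMS deviation of the sample average of N iid N(0,1) draws from its mean 0."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        s = sum(rng.gauss(0.0, 1.0) for _ in range(N)) / N
        total += s * s
    return math.sqrt(total / trials)

e10 = avg_deviation(10)      # theoretical RMS: 1/sqrt(10)  ~ 0.316
e1000 = avg_deviation(1000)  # theoretical RMS: 1/sqrt(1000) ~ 0.0316
```

The RMS deviation for N draws is exactly 1/√N in this toy model, so increasing N by a factor of 100 shrinks the deviation by a factor of about 10, matching the ε = O(1/√N) rate.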

5.
One-dimensional case. In this section, we carry out a more detailed analysis of the scalar-valued backward LQ MFG problem. Under (A1)-(A3), let all coefficients of the system be scalar-valued and d = 1; then the dynamics of the i-th agent A_i is given by the controlled linear BSDE (55). Here we consider a terminal condition driven by a geometric Brownian motion (GBM), which stands for a given market benchmark index (e.g., [11,35]). The cost functional takes the form (56). A variety of practical problems in mathematical finance and management can be formulated within the above model. One instance is the relative performance criterion in risk management, where the average performance of other peers across the whole sector is taken into account. This is the case when a given pension fund evaluates its own performance by referring to the average performance (namely, average hedging cost, initial deposit, or surplus) as its benchmark (see [16,30]). The relative performance criterion is also termed the Joneses preference, habit formation utility, or relative wealth concern. It has been well-documented in the economics and finance literature (see e.g., [1,2,9,10]). Another example arises from mean-variance and quadratic-objective portfolio selection: the objective of a single investor is to minimize the variance (Var(x) = E[x − Ex]²) of the system (see [12]). When considering the relative performance of all other investors (e.g., [14]), the problem can be formulated via the state (55) and functional (56) above.
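A benchmark index following a GBM, as assumed for the terminal condition above, can be simulated with the exact log-Euler scheme; the parameter values below (drift, volatility, horizon) are illustrative only.

```python
import math, random

def simulate_gbm(s0=1.0, mu=0.05, sigma=0.2, T=1.0, n_steps=250, seed=1):
    """Simulate one GBM path dS = mu*S dt + sigma*S dW on a uniform grid,
    using the log-Euler scheme (exact in distribution at the grid points)."""
    rng = random.Random(seed)
    dt = T / n_steps
    s = [s0]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))
        s.append(s[-1] * math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * dw))
    return s

path = simulate_gbm()
```

Each step multiplies the previous value by exp((μ − σ²/2)Δt + σΔW), which matches the exact law of GBM at the grid points, so the simulated benchmark stays strictly positive.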
Secondly, for any 1 ≤ i ≤ N, the solutions of the Riccati equations (18)-(19) take explicit forms involving the Malliavin derivative of h_i(t) (see Section 5 of [15]). The solution of the SDE (21) becomes explicit as well, where B_2(t) = exp{−∫_0^t (A + (B²/R) P(r)) dr}. It follows by (32) that the optimal trajectory is obtained explicitly. Then the optimal control of Problem (II) and an ε-Nash equilibrium of Problem (I) can be obtained from (31) and (41), respectively, and the problem is solved completely.
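Since the explicit forms of the Riccati equations (18)-(19) are not reproduced in the text, a generic scalar LQ Riccati ODE, solved backward from its terminal condition, illustrates the kind of computation involved; the specific form dP/dt = (B²/R)P² − 2AP − Q with P(T) = G is an assumption for illustration, not the paper's equation.

```python
def riccati_backward(A, B, Q, R, G, T=1.0, n=10000):
    """Solve the scalar LQ Riccati ODE dP/dt = (B**2/R)*P**2 - 2*A*P - Q
    backward from P(T) = G by classical RK4, returning P(0).
    This generic form is illustrative; it is not equation (18) or (19)."""
    def f(p):
        return (B * B / R) * p * p - 2.0 * A * p - Q
    h = T / n
    p = G
    for _ in range(n):  # step from t = T down to t = 0 (negative step RK4)
        k1 = f(p)
        k2 = f(p - 0.5 * h * k1)
        k3 = f(p - 0.5 * h * k2)
        k4 = f(p - h * k3)
        p -= (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return p
```

For A = Q = 0 and B = R = G = 1, this toy equation has the closed form P(t) = 1/(2 − t), so the solver can be checked against P(0) = 1/2.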
5.1. Special case: Q = S = S_1 = 0. In this subsection, we consider the special case where Q = S = S_1 = 0 to demonstrate the practical meaning of the mean-field terms and their effects on the market behavior. In this case, the dynamics of A_i is given by the linear BSDE (55) with d = 1, and the cost functional simplifies accordingly. The backward HCC system (17) then becomes (65). Using properties of SDEs and linear BSDEs, we can obtain the solution of (65). Therefore, for any 1 ≤ i ≤ N, using the same method, we can get the solutions of (10)-(11), with x_0(0) and u_0(t) given before. By (4), the optimal cost is obtained, where the involved coefficient is a positive constant.
In this special case, we have the following result on the impact of K_2 and K_3. Proposition 6. Under (A1)-(A3), the optimal cost J̄_i(ū_i) is a decreasing function of K_2 and K_3.
Proof. Regarding the optimal cost J_i(ū_i) as a function of K_2 ∈ [0, 1) and K_3 ∈ [0, 1], we compute its partial derivatives, whose coefficients are positive constants. Thus the optimal cost J_i(ū_i) is a monotonically decreasing function of the parameters K_2 and K_3.
Remark 4. Proposition 6 admits the following economic interpretation: in an economy involving a large number of agents aiming to minimize their hedging costs, the introduction of higher relative-performance indices K_2 and (or) K_3 ∈ [0, 1] in their cost functionals always leads to a lower realized optimal cost. Consequently, the average market cost of the agents decreases as K_2, K_3 −→ 1.
6. Conclusion. We study the dynamic optimization of a large-population system whose states follow linear backward SDEs (BSDEs). The backward mean-field game (BMFG) is posed, and both the Hamiltonian-type and Riccati-type consistency condition systems (HCC, RCC) are discussed. The relation between the HCC and RCC of the backward MFG is also addressed. Our present work suggests various future research directions.
For example: (i) to study the backward MFG with indefinite control weights (this will formulate the mean-variance analysis with relative performance in our setting); (ii) to study the backward MFG with an integral-quadratic constraint (this will involve the penalty procedure with Lagrange multipliers, and a new notion of ε-Nash equilibrium will be proposed); (iii) based on (i), it is also interesting to further study the connection between the HCC and RCC in the setting of indefinite control weights. It is expected that the HCC and RCC are no longer equivalent in such a setting. We plan to study these issues in future works.