A Stackelberg Game of Backward Stochastic Differential Equations with Partial Information

This paper is concerned with a Stackelberg game of backward stochastic differential equations (BSDEs) with partial information, where the information of the follower is a sub-$\sigma$-algebra of that of the leader. Necessary and sufficient conditions for the optimality of the follower and the leader are first given for the general problem, by the partial information stochastic maximum principles of BSDEs and forward-backward stochastic differential equations (FBSDEs), respectively. Then a linear-quadratic (LQ) Stackelberg game of BSDEs with partial information is investigated. The state estimate feedback representation for the optimal control of the follower is first given via two Riccati equations. Then the leader's problem is formulated as an optimal control problem of an FBSDE. Four high-dimensional Riccati equations are introduced to represent the state estimate feedback for the optimal control of the leader. The theoretical results are applied to a pension fund management problem of two players in the financial market.


Introduction
The Stackelberg game, also known as the leader-follower game, can be traced back to the early work of Stackelberg [36], who defined a hierarchical solution concept for markets in which some firms have power of domination over others. In the context of differential games, the solutions are called Stackelberg equilibrium points, and there are two players with asymmetric roles, one leader and one follower. To obtain a Stackelberg solution, it is usual to divide the game into two parts. In the first part, known as the follower's problem, the leader first announces his strategy; the follower then responds instantaneously, choosing an optimal strategy corresponding to the given leader's strategy to minimize (or maximize) his cost functional. In the second part, knowing that the follower will take such an optimal strategy, the leader chooses an optimal strategy to minimize (or maximize) his own cost functional. Overall, the decisions are made by two players, one of whom is subordinated to the other because of the asymmetric roles; therefore one player must make his decision after the other player's decision is made.
The Stackelberg game has broad practical backgrounds in finance and economics, and has attracted increasing research attention. Simaan and Cruz [35] made an early study of the properties of the Stackelberg solution in static and dynamic non-zero sum two-player games. Bagchi and Başar [2] investigated an LQ stochastic Stackelberg differential game, where the state and control variables do not enter the diffusion coefficient of the state equation. Yong [45] studied an LQ leader-follower differential game in a more general framework where the coefficients of the system and the cost functionals are random, the diffusion of the state equation contains the control variables, and the weight matrices for the controls in the cost functionals are not necessarily positive definite. Øksendal et al. [25] proved a maximum principle for a Stackelberg differential game with jump-diffusion, and applied the result to a continuous-time newsvendor problem. Bensoussan et al. [3] introduced several solution concepts in terms of the players' information sets, and studied LQ Stackelberg games under both adapted open-loop and closed-loop memoryless information structures, where the control variables do not enter the diffusion coefficient of the state equation. Meanwhile, Stackelberg games have been investigated in the mean-field, time-delay, partial information and other settings. Recently, Xu and Zhang [44] studied the discrete-time leader-follower game with time delay; new co-states, which capture the future information of the control, and a new state, which contains the past effects, are introduced to overcome the noncausality of the strategy design caused by the delay, and the same technique is then used to deal with the continuous-time system. Xu et al. [43] studied the leader-follower differential game with time delay appearing in the leader's control; the open-loop solution is given in the form of a conditional expectation with respect to several symmetric Riccati equations, mainly by establishing the nonhomogeneous relationship between the forward and the backward variables. Moon and Başar [24] investigated an LQ mean-field Stackelberg differential game with the adapted open-loop information structure of the leader, where there is only one leader but an arbitrarily large number of followers. Lin et al. [22] studied the open-loop LQ Stackelberg game of mean-field stochastic systems in finite horizon; by introducing new state and costate variables, a sufficient condition for the existence and uniqueness of the Stackelberg strategy was given in terms of the solvability of some Riccati equations and a convexity condition. Shi et al. [32] introduced a new explanation for the asymmetric information feature of the Stackelberg differential game, namely that the information available to the follower is based on some sub-$\sigma$-algebra of that available to the leader. An LQ stochastic Stackelberg differential game with noisy observation was then solved via measure transformation, filtering techniques, and linear FBSDE and mean-field FBSDE decoupling techniques, where not all the diffusion coefficients contain the control variables. Shi et al. [34] studied an LQ stochastic Stackelberg differential game with asymmetric information, where the control variables enter both diffusion coefficients of the state equation. Shi et al. [33] investigated a kind of stochastic LQ Stackelberg differential game with overlapping information, which means that the follower's and the leader's information have some common part but no inclusion relation. Li and Yu [19] proved the unique solvability of a kind of coupled forward-backward stochastic differential equations (FBSDEs) with a multilevel self-similar domination-monotonicity structure; this kind of FBSDEs is then used to characterize the unique equilibrium of an LQ generalized Stackelberg game with hierarchy in closed form.
Unlike a forward stochastic differential equation (SDE), where an initial condition $x(0) = x_0$ is prescribed, a BSDE is a backward SDE with a given terminal condition $y(T) = \xi$. Under suitable conditions, a BSDE admits an adapted solution pair $(y(\cdot), z(\cdot))$, where the additional term $z(\cdot)$ may be interpreted as a risk-adjustment factor and is required for the equation to have an adapted solution. The linear version of this type of equation was first introduced by Bismut [4] as the adjoint equation in the stochastic maximum principle. General nonlinear BSDEs, introduced independently by Pardoux and Peng [26] and Duffie and Epstein [11], have received considerable research attention in recent years due to their nice structure and wide applicability in a number of different areas, especially in mathematical finance, optimal control and differential games. El Karoui et al. [12] discussed different properties of BSDEs and their applications to finance. Two recent monographs on BSDEs are Pardoux and Rȃşcanu [27] and Zhang [48].
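To illustrate why the second component $z(\cdot)$ is needed, the following sketch writes a generic BSDE driven by a single Brownian motion $W$ in integral form. The generator name $g$ is a hypothetical placeholder introduced only for this illustration, $\mathcal{F}_t$ is taken to be the filtration generated by $W$, and the sign convention is one common choice, not necessarily the one used later in the paper.

```latex
% Integral form of a generic BSDE with terminal condition \xi
% (g is a hypothetical generator, used only for illustration).
\begin{equation*}
y(t) = \xi + \int_t^T g\big(s, y(s), z(s)\big)\,ds - \int_t^T z(s)\,dW(s),
\qquad t \in [0,T].
\end{equation*}
% Special case g \equiv 0: take conditional expectations given \mathcal{F}_t.
\begin{equation*}
y(t) = \mathbb{E}\big[\xi \,\big|\, \mathcal{F}_t\big],
\qquad
\xi = \mathbb{E}[\xi] + \int_0^T z(s)\,dW(s).
\end{equation*}
```

In the special case $g \equiv 0$, $y(\cdot)$ is the martingale $\mathbb{E}[\xi \mid \mathcal{F}_t]$ and $z(\cdot)$ is the integrand supplied by the martingale representation theorem. This is what allows $y(\cdot)$ to be adapted even though the data are prescribed at time $T$.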
The optimal control problem of BSDEs was first studied by Peng [28,29] and El Karoui et al. [12] in solving recursive utility maximization problems. Dokuchaev and Zhou [8] studied a stochastic control problem where the system dynamics is a controlled nonlinear BSDE. Kohlmann and Zhou [18] explored the relationship between BSDEs and stochastic controls by interpreting BSDEs as some stochastic optimal control problems. Chen and Zhou [6] investigated an optimization model of stochastic LQ regulators with indefinite control cost weighting matrices, involving a backward LQ problem. Lim and Zhou [21] studied the optimal control of linear BSDEs with a quadratic cost criterion, and obtained the solution by the completion-of-squares technique. Huang et al. [14] studied a partial information control problem of backward stochastic systems, and obtained a new stochastic maximum principle. Shi [30] investigated an optimal control problem for systems described by BSDEs with time-delayed generators, and proved a sufficient maximum principle. The mean-field BSDE was first introduced by Buckdahn et al. [5]. Ma and Liu [23] investigated the optimal control of an infinite horizon system governed by a mean-field BSDE with delay and partial information, and established existence and uniqueness results for a mean-field BSDE with average delay. Li et al. [20] studied the LQ optimal control problem for mean-field BSDEs.
As for differential game problems of BSDEs, Hamadene and Lepeltier [13] discussed stochastic zero-sum differential games by means of results on BSDEs, and obtained the existence of a saddle point in the bounded case under Isaacs' condition. Yu and Ji [47] established an existence and uniqueness result for an initial-coupled FBSDE under some monotonicity conditions, which was applied to a backward LQ non-zero sum stochastic differential game problem. Wang and Yu [39] established a necessary condition and a sufficient condition, in the form of a maximum principle, for an open-loop equilibrium point of game systems described by BSDEs. Wang and Yu [40] went on to establish a necessary condition, in the form of a maximum principle, for an open-loop Nash equilibrium point of this type of game with partial information, and then gave a verification theorem which is a sufficient condition for a Nash equilibrium point. Shi and Wang [31] investigated a non-zero sum differential game where the state dynamics follows a BSDE with time-delayed generator, and proved an Arrow's sufficient condition for an open-loop equilibrium point. Huang et al. [15] studied backward mean-field linear-quadratic-Gaussian games of weakly coupled stochastic large-population systems; two classes of such games are discussed and their decentralized strategies are derived through the consistency condition. Huang and Wang [16] discussed a kind of non-zero sum differential game of mean-field BSDEs. Wang et al. [38] studied a kind of LQ non-zero sum differential game driven by a BSDE with asymmetric information. Aurell [1] studied mean-field type games between two players with backward stochastic dynamics, which make up a class of non-zero sum, non-cooperative differential games where the players' state dynamics solve a BSDE that depends on the marginal distributions of the player states. Du et al. [9] studied a mean-field game of $N$ weakly coupled linear BSDE systems. Du and Wu [10] investigated a new kind of Stackelberg differential game of mean-field BSDEs. Huang et al. [17] focused on a kind of non-zero sum differential game driven by a mean-field BSDE with asymmetric information.
Inspired by the above literature, in this paper we study the Stackelberg game of BSDEs with partial information, where the coefficients of the backward game system and the cost functionals are deterministic, and the control domain is convex. In our framework, the information filtration available to the leader is the complete information filtration naturally generated by the random noise source, and the information filtration available to the follower is based on a sub-$\sigma$-algebra of that available to the leader. The novelty of the formulation and the contributions of this paper are the following. (1) A new kind of general Stackelberg game of BSDEs with partial information is introduced and studied by the maximum principle approach, where a terminal condition $\xi$ is given in advance. For the follower's problem, the partial information maximum principle and verification theorem are given, which follow directly from Theorem 2.1 and Theorem 2.3 of Wang and Yu [40]. For the leader's problem, the partial information maximum principle can be derived via a technique similar to that of Zuo and Min [49], who, however, did not give the corresponding sufficient condition. Therefore, in this paper, the partial information verification theorem is derived by means of the Clarke generalized gradient. (2) The LQ case consists of a stochastic optimal control problem of a BSDE with partial information for the follower, followed by a stochastic optimal control problem of coupled conditional mean-field forward-backward stochastic differential equations (FBSDEs) with complete information for the leader, which is different from the (forward) Stackelberg differential game studied in Shi et al. [34]. (3) To give the state estimate feedback representation for the optimal control of the follower, two Riccati equations, a linear backward stochastic differential filtering equation (BSDFE), and a linear stochastic differential filtering equation (SDFE) are introduced; see Theorem 4.1. Then, four high-dimensional Riccati equations, a linear BSDFE, and a linear SDFE are introduced to represent the optimal control of the leader in state estimate feedback form; see Theorem 4.2. (4) A pension fund problem of two players with asymmetric information in the financial market is studied; the Stackelberg equilibrium point is represented and the optimal initial wealth reserve is obtained explicitly.
The rest of this paper is organized as follows. In Section 2, the general Stackelberg game of BSDEs with partial information is formulated. This general problem is then studied in Section 3. The follower's problem of the BSDE with partial information is considered first in Subsection 3.1, while the leader's problem of the conditional mean-field FBSDE is studied in Subsection 3.2. By the partial information maximum principle approach, necessary and sufficient conditions for the optimal controls of the follower and the leader are given, respectively. The LQ Stackelberg game problem with partial information is then investigated in Section 4. Specifically, Subsection 4.1 is devoted to the solution of an LQ stochastic optimal control problem of a BSDE with partial information for the follower; via two Riccati equations, a BSDFE and an SDFE, the optimal control of the follower is given in state feedback form. Subsection 4.2 is devoted to the solution of an LQ stochastic optimal control problem of a coupled conditional mean-field FBSDE with complete information for the leader; the optimal control of the leader is represented in state feedback form by the solutions to four new high-dimensional Riccati equations, a BSDFE and an SDFE. In Section 5, the theoretical results of the previous sections are applied to a pension fund management problem of two players with asymmetric information in the financial market. Finally, Section 6 gives some concluding remarks.

Problem Formulation
In this paper, we use $\mathbb{R}^n$ to denote the Euclidean space of $n$-dimensional vectors, $\mathbb{R}^{n \times d}$ to denote the space of $n \times d$ matrices, and $\mathbb{S}^n$ to denote the space of $n \times n$ symmetric matrices. $\langle \cdot, \cdot \rangle$ and $|\cdot|$ denote the scalar product and norm in the Euclidean space, respectively. A superscript $\top$ on a matrix denotes its transpose. $f_x$ and $f_{xx}$ denote the first- and second-order partial derivatives with respect to $x$ of a differentiable function $f$, respectively.
Let $T > 0$ be fixed. Consider a complete probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and two standard Brownian motions $W(t)$ and $\tilde W(t)$, of dimensions $m$ and $\tilde m$ respectively, with $W(0) = \tilde W(0) = 0$, which generate the filtration $\mathcal{F}_t = \sigma\{W(r), \tilde W(r) : 0 \le r \le t\}$ augmented by all the $\mathbb{P}$-null sets in $\mathcal{F}$. $L^2_{\mathcal{F}_T}(\Omega, \mathbb{R}^n)$ denotes the set of $\mathbb{R}^n$-valued, $\mathcal{F}_T$-measurable, square-integrable random vectors, $L^2_{\mathcal{F}}(0, T; \mathbb{R}^n)$ denotes the set of $\mathbb{R}^n$-valued, $\mathcal{F}_t$-adapted, square-integrable processes on $[0, T]$, $L^2_{\mathcal{F}}(0, T; \mathbb{R}^{n \times d})$ denotes the set of $n \times d$-matrix-valued, $\mathcal{F}_t$-adapted, square-integrable processes on $[0, T]$, and $L^\infty(0, T; \mathbb{R}^{n \times d})$ denotes the set of $n \times d$-matrix-valued, bounded functions on $[0, T]$.
Let us consider the controlled BSDE (2.1), with generator $f$ and terminal condition $\xi$, in which $v_1(\cdot) \in U_1$ is the control process of the follower and $v_2(\cdot) \in U_2$ is the control process of the leader, where $U_i$ is a nonempty convex subset of $\mathbb{R}^{k_i}$, $i = 1, 2$. In the backward game system (2.1), the two players work together to achieve a common goal $\xi$ at the terminal time $T$.
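For orientation, a controlled BSDE driven by two Brownian motions is commonly written in the form sketched below. The sign convention and the exact arguments of the generator are assumptions made here for illustration, chosen to agree with the variables $(t, y, z, \tilde z, v_1, v_2)$ on which $f$ depends; the system (2.1) itself may differ in such details.

```latex
% A common form of a controlled BSDE with two driving Brownian motions
% (assumed sign convention; not quoted from (2.1)).
\begin{equation*}
\left\{
\begin{aligned}
-dy(t) &= f\big(t, y(t), z(t), \tilde z(t), v_1(t), v_2(t)\big)\,dt
          - z(t)\,dW(t) - \tilde z(t)\,d\tilde W(t), \quad t \in [0,T], \\
y(T) &= \xi .
\end{aligned}
\right.
\end{equation*}
```

In this form the unknown is the triple $(y(\cdot), z(\cdot), \tilde z(\cdot))$: the terminal value is prescribed, and the two martingale integrands are part of the solution rather than given coefficients.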
Let $\mathcal{G}^i_t \subseteq \mathcal{F}_t$, $i = 1, 2$, be given sub-filtrations which represent the information available to the follower and the leader at time $t \in [0, T]$, respectively, with $\mathcal{G}^1_t \subseteq \mathcal{G}^2_t \subseteq \mathcal{F}_t$. The admissible control sets of the follower and the leader are denoted by $\mathcal{U}_1[0, T]$ and $\mathcal{U}_2[0, T]$, respectively. The cost functionals of the follower and the leader are given by (2.3), in which the running cost functions are given continuous functions in $(t, y, z, \tilde z, v_1, v_2)$ and $h_i : \mathbb{R}^n \to \mathbb{R}$ are given continuous functions, for $i = 1, 2$. We remark that the cost functionals (2.3) reflect that the players have their own interests apart from the common terminal goal $\xi$. Now, we give some assumptions that will be in force throughout this paper.
(A1) The function $f$ is continuously differentiable in $(y, z, \tilde z, v_1, v_2)$. Moreover, the partial derivatives $f_y$, $f_z$, $f_{\tilde z}$, $f_{v_1}$ and $f_{v_2}$ with respect to $y$, $z$, $\tilde z$, $v_1$ and $v_2$ are uniformly bounded.
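To make the admissible control sets and cost functionals concrete, a typical specification for this kind of backward problem is sketched below. The symbols $l_i$ for the running costs, the square-integrability requirement, and the evaluation of $h_i$ at the initial value $y(0)$ are assumptions made for illustration; they are consistent with the continuity requirements stated above but are not quoted from (2.3).

```latex
% Illustrative (assumed) admissible sets and cost functionals, i = 1, 2.
\begin{align*}
\mathcal{U}_i[0,T] &= \Big\{ v_i : [0,T] \times \Omega \to U_i \ \Big|\
   v_i(\cdot) \text{ is } \mathcal{G}^i_t\text{-adapted and }
   \mathbb{E}\int_0^T |v_i(t)|^2\,dt < \infty \Big\}, \\
J_i\big(v_1(\cdot), v_2(\cdot); \xi\big) &= \mathbb{E}\Big[ \int_0^T
   l_i\big(t, y(t), z(t), \tilde z(t), v_1(t), v_2(t)\big)\,dt
   + h_i\big(y(0)\big) \Big].
\end{align*}
```

Since the terminal value $y(T) = \xi$ is fixed in advance, a cost term on the state is naturally attached to the initial value $y(0)$ rather than to the terminal one.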
The problem studied in this paper is proposed in the following definition.
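A pair $(\bar v_1(\cdot), \bar v_2(\cdot)) \in \mathcal{U}_1[0, T] \times \mathcal{U}_2[0, T]$ is called an optimal solution to the Stackelberg game of BSDEs with partial information if it satisfies the following conditions: (i) for given $\xi \in L^2_{\mathcal{F}_T}(\Omega, \mathbb{R}^n)$ and any $v_2(\cdot) \in \mathcal{U}_2[0, T]$, there exists a map $\Gamma$ which gives the follower's optimal response $\Gamma(v_2(\cdot), \xi)$; (ii) the leader's control $\bar v_2(\cdot)$ is optimal given this response. The optimal strategy of the follower is $\bar v_1(\cdot) = \Gamma(\bar v_2(\cdot), \xi)$.
We call the above problem a Stackelberg game of BSDEs with partial information.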
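The two-level structure of the problem can be summarized in formulas as follows. This is a sketch of the usual Stackelberg formulation for minimization problems, written with the notation $J_i(v_1(\cdot), v_2(\cdot); \xi)$ for the cost functionals and $\mathcal{U}_i[0,T]$ for the admissible control sets; the precise statement of the definition in the paper may include further requirements.

```latex
% Usual two-level (leader-follower) optimality conditions, assumed form.
\begin{align*}
\text{(i)}\quad  & J_1\big(\Gamma(v_2(\cdot), \xi), v_2(\cdot); \xi\big)
   = \min_{v_1(\cdot) \in \mathcal{U}_1[0,T]} J_1\big(v_1(\cdot), v_2(\cdot); \xi\big)
   \quad \text{for every } v_2(\cdot) \in \mathcal{U}_2[0,T], \\
\text{(ii)}\quad & J_2\big(\Gamma(\bar v_2(\cdot), \xi), \bar v_2(\cdot); \xi\big)
   = \min_{v_2(\cdot) \in \mathcal{U}_2[0,T]} J_2\big(\Gamma(v_2(\cdot), \xi), v_2(\cdot); \xi\big), \\
& \bar v_1(\cdot) = \Gamma\big(\bar v_2(\cdot), \xi\big).
\end{align*}
```

The follower's problem (i) is solved first for an arbitrary announced $v_2(\cdot)$, and its solution map $\Gamma$ is then substituted into the leader's problem (ii).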

Optimization for the Follower
In this subsection, we seek necessary and sufficient conditions for the partial information optimal control of the follower. Let $\xi \in L^2_{\mathcal{F}_T}(\Omega, \mathbb{R}^n)$ and the leader's strategy $v_2(\cdot) \in \mathcal{U}_2[0, T]$ be given. Let $\bar v_1(\cdot)$ be an optimal control of the follower, and $(y^{\bar v_1, v_2}(\cdot), z^{\bar v_1, v_2}(\cdot), \tilde z^{\bar v_1, v_2}(\cdot))$ be the corresponding state trajectory. Let the process $x(\cdot) \in L^2_{\mathcal{F}}(0, T; \mathbb{R}^n)$ satisfy the adjoint equation (3.1), where $H_{1y}$, $H_{1z}$ and $H_{1\tilde z}$ denote the partial derivatives of $H_1$ with respect to $y$, $z$ and $\tilde z$, respectively, and the Hamiltonian function $H_1$ is defined by (3.2). The following two results follow directly from Theorems 2.1 and 2.3 of Wang and Yu [40].

Optimization for the Leader
In this subsection, we first restate the partial information stochastic optimal control problem of the leader in detail. For simplicity of notation, abbreviated notation is used for $\Phi = f, L_1$, respectively. The leader's cost functional is then redefined accordingly, and the target of the leader is to find an optimal control $\bar v_2(\cdot) \in \mathcal{U}_2[0, T]$.
Before we prove this theorem, let us review some preliminaries of the Clarke generalized gradient, which was used to derive sufficient conditions in Yong and Zhou [46].
Let $M : X \to \mathbb{R}$ be a locally Lipschitz continuous function, where $X$ is a convex set in $\mathbb{R}^n$.
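Definition 3.1. The Clarke generalized gradient of $M$ at $\bar x \in X$, denoted by $\partial M(\bar x)$, is the set defined by $\partial M(\bar x) := \big\{ p \in \mathbb{R}^n : \langle p, \eta \rangle \le \limsup_{x \to \bar x,\, x \in X,\, h \downarrow 0} \frac{M(x + h\eta) - M(x)}{h} \ \text{for all } \eta \in \mathbb{R}^n \big\}$.
Lemma 3.1.
(2) For any set $N \subset X$ of measure zero, $\partial M(\bar x) = \mathrm{co}\big\{ \lim_{i \to \infty} M_x(x_i) : M \text{ is differentiable at } x_i \notin N,\ x_i \to \bar x \big\}$, where "co" denotes the convex hull of a set.
(3) If $\bar x$ attains the maximum or minimum of $M$ over $X$, then $0 \in \partial M(\bar x)$.
(4) If $M$ is a convex (respectively, concave) function, then $p \in \partial M(\bar x)$ if and only if $M(x) - M(\bar x) \ge \langle p, x - \bar x \rangle$ (respectively, $\le$) for all $x \in X$.
Proof of Theorem 3.4. By condition (3.11) and Lemma 3.1-(3), we have (3.12). By Lemma 3.2, we further obtain (3.14). For any $v_2(\cdot) \in \mathcal{U}_2[0, T]$, we then compare the leader's cost under $v_2(\cdot)$ with that under $\bar v_2(\cdot)$. Since $h_2$ is convex in $y$, the difference of the $h_2$ terms is bounded below by its linearization. Applying Itô's formula to $-\langle Q(\cdot), y^{v_2}(\cdot) - y^{\bar v_2}(\cdot) \rangle$ and taking expectations on both sides gives one identity, and a second identity is obtained similarly. With the help of the initial value and terminal value of $Q(\cdot)$ and $p(\cdot)$, respectively, in equation (3.9), and the condition that $h_{1yy}(y)$ is constant, we obtain the desired inequality.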

The Linear Quadratic Problem
In this section, we study an LQ case with $m = \tilde m = 1$ and give explicit forms of the previous results. Moreover, we only consider the special case where the follower's information filtration is $\mathcal{G}^1_t = \sigma\{W(r) : 0 \le r \le t\}$ and the leader's information filtration is the complete filtration $\mathcal{G}^2_t = \mathcal{F}_t$.
Likewise, from (4.29), the optimal control $\bar v_1(\cdot)$ of the follower can also be represented in a nonanticipating way.

Application to Pension Fund Management Problem
In this section, we study a defined benefit (DB) pension fund management problem arising from financial markets, which naturally motivates the above theoretical research on the LQ Stackelberg game of BSDEs with partial information. It is well known that pension funds can be classified into two main categories: the defined benefit (DB) pension scheme and the defined contribution (DC) pension scheme. In a DB scheme, the benefits are fixed in advance by the sponsor, and the contributions are designed to assure the future payments to claim holders in their retirement period. There are two representative members who make contributions continuously over time to the pension fund in $[0, T]$. One of the members is the leader, with the regular premium proportion $v_2$ as his contribution, who is usually regarded as the supervisor, government or company. The other one is the follower, with the regular premium proportion $v_1$ as his contribution, who is usually regarded as the individual producer or retail investor. Premiums are a proportion of salary or income, and are continuously deposited into the pension fund plan member's account as contributions.
We consider a continuous-time setup. In the dynamics of the pension fund plan member's account, $F(t)$ is the value of the account at time $t$, $d\Delta(t)$ is the instantaneous return during the time interval $(t, t + dt)$, and $v_1(\cdot)$ and $v_2(\cdot)$ are the premium proportions of the follower and the leader, respectively, which act as our control variables. $DB$ is the pension scheme benefit outgo, which is assumed to be a constant for the sake of simplicity. Suppose that the pension fund is invested in a risk-free asset (bond) and two risky assets (stocks). The price of the bond at time $t$ is denoted by $S_0(t)$, and $r(t) > 0$ is its instantaneous rate of return at time $t$. The prices of the two stocks at time $t$ are denoted by $S_1(t)$ and $S_2(t)$; they are governed by stochastic differential equations driven by $W(\cdot)$ and $\tilde W(\cdot)$, respectively, which are two independent one-dimensional Brownian motions. Here $\mu_i(t) > r(t)$, $i = 1, 2$, are the instantaneous rates of expected return and $\sigma(t), \tilde\sigma(t) > 0$ are the instantaneous rates of volatility at time $t$. We assume that $\mu_1(\cdot)$, $\mu_2(\cdot)$, $r(\cdot)$, $\sigma(\cdot)$ and $\tilde\sigma(\cdot)$ are deterministic bounded functions, and that $\sigma^{-1}(\cdot)$ and $\tilde\sigma^{-1}(\cdot)$ are also bounded.
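A standard specification consistent with this description is a bond with deterministic growth and two stocks following geometric Brownian motions. The sketch below writes out such a specification under that assumption, with $\tilde W$ and $\tilde\sigma$ denoting the second Brownian motion and the second volatility; it is meant as an illustration and is not quoted from the paper's price equations.

```latex
% Assumed market model: one bond, two geometric-Brownian-motion stocks.
\begin{align*}
dS_0(t) &= r(t)\, S_0(t)\,dt, \\
dS_1(t) &= \mu_1(t)\, S_1(t)\,dt + \sigma(t)\, S_1(t)\,dW(t), \\
dS_2(t) &= \mu_2(t)\, S_2(t)\,dt + \tilde\sigma(t)\, S_2(t)\,d\tilde W(t).
\end{align*}
```

Under this assumption each stock price is driven by exactly one of the two Brownian motions, so observing the first stock alone reveals $W(\cdot)$ but not $\tilde W(\cdot)$.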
In a real financial market, it is reasonable for investors to make decisions based on the historical prices of the risky assets $S_1(\cdot)$ and $S_2(\cdot)$. Therefore, the observable filtration at time $t$ can be set as $\mathcal{F}_t = \sigma\{S_1(s), S_2(s) \mid 0 \le s \le t\}$, and it is clear that $\mathcal{F}_t = \sigma\{W(s), \tilde W(s) \mid 0 \le s \le t\}$. However, in our Stackelberg game setting, the two players have asymmetric information, to some degree, because of practical phenomena such as insider trading. So we assume that the player in the leader's role knows the full information from the financial market, including the prices of both risky assets $S_1(\cdot)$ and $S_2(\cdot)$, which is represented by $\mathcal{F}_t$, while the player in the follower's role only knows the partial information about the price $S_1(\cdot)$, which is represented by $\mathcal{G}^1_t = \sigma\{W(s) \mid 0 \le s \le t\}$. Obviously, $\mathcal{G}^1_t \subset \mathcal{F}_t$. Suppose that the proportions $\pi_1(t)$ and $\pi_2(t)$ of the pension fund are allocated to the two stocks, respectively, while the proportion $1 - \pi_1(t) - \pi_2(t)$ is allocated to the bond, at time $t$. The instantaneous return then takes the form (5.5), and the pension fund dynamics can be rewritten accordingly. On the one hand, if the pension fund manager wants to achieve the wealth level $\xi$ at the terminal time $T$ to fulfill his/her obligations, then the dynamics of the pension fund plan member's account is subject to the terminal condition $F(T) = \xi$. On the other hand, if we set $\sigma(\cdot)\pi_1(\cdot)F(\cdot) = Z(\cdot)$ and $\tilde\sigma(\cdot)\pi_2(\cdot)F(\cdot) = \tilde Z(\cdot)$, then the resulting equation is equivalent to the BSDE (5.8), where the control processes $v_1(\cdot)$ and $v_2(\cdot)$ are adapted to the information filtrations $\mathcal{G}^1_t$ and $\mathcal{F}_t$, respectively.
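The instantaneous return (5.5) can be motivated by a short computation. Assume, for illustration only, that the bond grows at rate $r(t)$ and that the two stocks are geometric Brownian motions with drifts $\mu_1(t), \mu_2(t)$ and volatilities $\sigma(t), \tilde\sigma(t)$, driven by $W(\cdot)$ and $\tilde W(\cdot)$ respectively. The return of a portfolio with weights $1 - \pi_1 - \pi_2$, $\pi_1$, $\pi_2$ is then the weighted sum of the asset returns:

```latex
% Portfolio return as the weighted sum of asset returns (assumed market model).
\begin{align*}
d\Delta(t)
&= \big(1 - \pi_1(t) - \pi_2(t)\big)\frac{dS_0(t)}{S_0(t)}
   + \pi_1(t)\frac{dS_1(t)}{S_1(t)} + \pi_2(t)\frac{dS_2(t)}{S_2(t)} \\
&= \Big[ r(t) + \big(\mu_1(t) - r(t)\big)\pi_1(t)
        + \big(\mu_2(t) - r(t)\big)\pi_2(t) \Big]\,dt
   + \sigma(t)\pi_1(t)\,dW(t) + \tilde\sigma(t)\pi_2(t)\,d\tilde W(t).
\end{align*}
```

If the fund value $F(t)$ earns this return, its diffusion coefficients are $\sigma(t)\pi_1(t)F(t)$ and $\tilde\sigma(t)\pi_2(t)F(t)$, which are exactly the quantities renamed $Z(\cdot)$ and $\tilde Z(\cdot)$ in the change of variables described in the text.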
Let $\mathcal{U}_1$ and $\mathcal{U}_2$ denote the admissible control sets for the follower and the leader, respectively. For any $(v_1(\cdot), v_2(\cdot)) \in \mathcal{U}_1 \times \mathcal{U}_2$, the BSDE (5.8) admits a unique solution triple $(F(\cdot), Z(\cdot), \tilde Z(\cdot))$ in $L^2_{\mathcal{F}}(0, T; \mathbb{R}) \times L^2_{\mathcal{F}}(0, T; \mathbb{R}) \times L^2_{\mathcal{F}}(0, T; \mathbb{R})$. Let us introduce the cost functionals $J_i(v_1(\cdot), v_2(\cdot); \xi)$, $i = 1, 2$, in which $\beta$ is a discount factor and $NC$ is a preset target, say, the normal cost. The aim of each member is to minimize the cost functional $J_i(v_1(\cdot), v_2(\cdot); \xi)$ over $\mathcal{U}_i$, $i = 1, 2$. Note that the first term of $J_i(v_1(\cdot), v_2(\cdot); \xi)$ is the running cost due to the deviation of the contribution from the preset target level. This term is introduced here to measure the stability of the DB pension scheme. The second term $F(0)$ is just the initial reserve to operate the scheme.
Let us now explain the leader-follower feature of the game. At time $t$, the big company (leader) first announces his/her contribution (premium proportion) $v_2(t)$. Second, based on the partial information that he/she knows, the retail investor (follower) sets his/her contribution (premium proportion) $\bar v_1(t)$ as the optimal response to the company's announced decision, so that $J_1(\bar v_1(\cdot), v_2(\cdot); \xi)$ is the minimum of $J_1(v_1(\cdot), v_2(\cdot); \xi)$ over $v_1(\cdot) \in \mathcal{U}_1$. Knowing that the follower will take such an optimal control $\bar v_1(\cdot)$ (supposing it exists; in general it depends on the choice $v_2(\cdot)$ of the leader and on the terminal condition $\xi$), and having the advantage over the follower of possessing more information, the big company (leader) chooses some $\bar v_2(\cdot) \in \mathcal{U}_2$ to minimize $J_2(\bar v_1(\cdot), v_2(\cdot); \xi)$ over $v_2(\cdot) \in \mathcal{U}_2$.
We aim to find the Stackelberg equilibrium point $(\bar v_1(\cdot), \bar v_2(\cdot)) \in \mathcal{U}_1 \times \mathcal{U}_2$, which is the optimal control pair of the Stackelberg game of BSDEs with partial information.
There is a large literature on pension fund management problems via the stochastic control approach, such as Huang et al. [14], Di Giacinto et al. [7], etc. However, our problem is essentially different in that we study the pension fund problem in the framework of a Stackelberg game of BSDEs with partial information. For more details about financial applications of partial information differential games, please refer to Wang and Yu [40], Shi and Wang [31], Huang et al. [17], Xiong et al. [42], etc.

Concluding Remarks
In this paper, we have discussed the Stackelberg game of BSDEs with partial information. The general problem is studied first, and then, for the LQ special case, state estimate feedback representations of the Stackelberg equilibrium point are obtained for the follower and the leader, respectively. The theoretical results are applied to a pension fund management problem.
Possible extensions to Stackelberg games with noisy observations are worth investigating, and the solvability of the related Riccati equations is a challenging research topic. We will consider these problems in our future research.