A second-order stochastic maximum principle for generalized mean-field control problems

In this paper, we study the generalized mean-field stochastic control problem in the case where the usual stochastic maximum principle (SMP) is not applicable because the Hamiltonian is singular, i.e., constant in the control variable. For this case, we derive a second-order SMP, introducing the adjoint processes through generalized mean-field backward stochastic differential equations. The key ingredients of the proofs are the expansion of the cost functional in terms of a perturbation parameter and the range theorem for vector-valued measures.

1. Introduction. We consider the following optimal stochastic control problem of mean-field type, with state equation
$$dX_t = b(t, X_t, P_{X_t}, v_t)\,dt + \sigma(t, X_t, P_{X_t})\,dB_t, \qquad X_0 = x, \tag{1}$$
and cost functional (2), where $P_\xi$ denotes the law of the random variable $\xi$.
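For orientation, the cost functional referred to above presumably has the standard integral-plus-terminal form; the following is a sketch consistent with the running cost $h$ and terminal cost $\Phi$ appearing in Hypothesis 3.1, not necessarily the paper's exact display (2):

```latex
J(v) := E\Big[ \int_0^T h\big(t, X_t, P_{X_t}, v_t\big)\,dt
              + \Phi\big(X_T, P_{X_T}\big) \Big],
\qquad v \in \mathcal{U}.
```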
The agent wishes to minimize the cost functional; namely, an admissible control $u \in \mathcal{U}$ is said to be optimal if $J(u) = \inf_{v \in \mathcal{U}} J(v)$, where $\mathcal{U}$ is the set of all admissible controls, to be defined later in Section 3.
Concerning the stochastic maximum principle (SMP), pioneering work was done by Pontryagin et al. [22], who obtained Pontryagin's maximum principle by using "spike variations". Kushner ([14], [15]) studied the SMP in the framework where the diffusion coefficient does not depend on the control variable and the cost functional consists of a terminal cost only. Haussmann [10] gave a version of the SMP when the diffusion of the state does not depend on the control variable. Arkin and Saksonov [1], Bensoussan [3] and Bismut [4] proved different versions of the SMP under various setups. A general SMP was obtained by Peng [21] in 1990; in that paper, first- and second-order variational inequalities are introduced, the control domain need not be convex, and the diffusion coefficient may contain the control variable.
Pardoux and Peng [20] introduced nonlinear backward stochastic differential equations (BSDEs) in 1990. They showed that, under appropriate assumptions, a BSDE admits a unique adapted solution and the associated comparison theorem holds. Buckdahn et al. [6] obtained mean-field BSDEs in a natural way as the limit of a high-dimensional system of forward and backward stochastic differential equations. Li [16] studied the SMP for mean-field controls when the control domain is assumed to be convex; under some additional assumptions, both necessary and sufficient conditions for the optimality of a control were proved. Buckdahn et al. [7] studied generalized mean-field stochastic differential equations and the associated partial differential equations (PDEs); "generalized" means that the coefficients depend on both the state process and its law. They proved that, under appropriate regularity conditions on the coefficients, the associated PDE has a unique classical solution. Buckdahn et al. [5] obtained an SMP for generalized mean-field systems in 2016.
Sometimes, the Hamiltonian function is constant in the control variable, as we will see in the next example; this makes the aforementioned SMP inapplicable.
Example. Consider the state equation (3) and its cost functional. For the control $u_t \equiv 0$, $X^u_t \equiv 1$ is the unique solution of (3). It is clear that $J(u) = 0$, and hence $u$ is an optimal control. On the other hand, the first-order adjoint processes satisfy a linear BSDE whose solution is clearly $(p_t, q_t) \equiv (0, 0)$. Therefore the Hamiltonian is independent of the control variable, which makes the SMP useless for characterizing the optimal control $u_t \equiv 0$. We now discuss singular optimal stochastic controls, defined as follows.
As we have seen in the last example, the SMP is not very useful for singular controls. Our goal is to derive a further necessary condition for optimality. We shall refer to the original SMP as the first-order SMP, and to the one we derive as the second-order SMP.

For second-order SMPs for singular control problems, Bell [2], Gabasov [9], Kazemi-Dehkordi [12], Krener [13], and Mizukami and Wu [19] studied the deterministic case. Lu [17] studied second-order necessary conditions for stochastic evolution systems. Tang [23] studied the singular optimal control problem for a stochastic system; by applying spike variations and the theory of vector-valued measures, a second-order maximum principle was obtained which involves a second-order adjoint process.
In this paper, we study the case where the state equation and the cost functional are in generalized mean-field form. The rest of this paper is organized as follows. In Section 2, we present preliminaries on generalized mean-field BSDEs. In Section 3, we formulate the singular optimal stochastic control problem and state the main result of the paper. Section 4 is devoted to the study of the impact of the control actions on the state by means of Taylor expansions; in that section, we also present some estimates for the state. In Section 5, the method of Section 4 is used again for the expansion of the cost functional with respect to the control variable. Section 6 is devoted to the proof of the second-order stochastic maximum principle.

2. Preliminaries. In this section, for the convenience of the reader, we state some results of Buckdahn et al. [7] without proofs.
Let $\mathcal{P}_2(\mathbb{R}^n)$ be the collection of all square-integrable probability measures over $(\mathbb{R}^n, \mathcal{B}(\mathbb{R}^n))$, endowed with the 2-Wasserstein metric $W_2$, which is defined by
$$W_2(\mu, \nu) := \inf\Big\{ \big(E[|\mu' - \nu'|^2]\big)^{1/2} : \mu', \nu' \in L^2(\mathcal{F}_0; \mathbb{R}^d),\ P_{\mu'} = \mu,\ P_{\nu'} = \nu \Big\}.$$
Denote by $L^2(\mathcal{F}; \mathbb{R}^n)$ the collection of all $\mathbb{R}^n$-valued square-integrable random variables. The following definition is taken from Cardaliaguet [8].
We call this function $\partial_\mu f$ the derivative of $f$ with respect to the measure $\mu$.
Given $f \in C^{1,1}_b(\mathcal{P}_2(\mathbb{R}^d))$ and $y \in \mathbb{R}^d$, the question of the differentiability of the map $\mu \mapsto \partial_\mu f(\mu)(y)$ arises. This can be discussed in the same way as the first-order derivative $\partial_\mu f$ above.
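For orientation, we recall the construction behind $\partial_\mu f$ in the Lions/Cardaliaguet convention of [8]; this is a sketch of the standard definition, which the definition above is assumed to follow:

```latex
% Lift f to the Hilbert space L^2(\mathcal{F}_0; \mathbb{R}^d):
\tilde f(\xi) := f(P_\xi), \qquad \xi \in L^2(\mathcal{F}_0; \mathbb{R}^d).
% f is called differentiable at \mu \in \mathcal{P}_2(\mathbb{R}^d) if \tilde f is
% Frechet differentiable at some \xi_0 with P_{\xi_0} = \mu. In that case there
% exists a Borel function \partial_\mu f(\mu)(\cdot) : \mathbb{R}^d \to \mathbb{R}^d
% such that
D\tilde f(\xi_0)(\eta) = E\big[\partial_\mu f(\mu)(\xi_0) \cdot \eta\big],
\qquad \eta \in L^2(\mathcal{F}_0; \mathbb{R}^d).
```

The function $\partial_\mu f(\mu)(\cdot)$ is uniquely determined $\mu$-almost everywhere and does not depend on the choice of the representative $\xi_0$.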
Similarly, higher-order derivatives can be treated. Let us now consider a complete probability space $(\Omega, \mathcal{F}, P)$ on which we define a $d$-dimensional Brownian motion $B = (B_t)_{t \in [0,T]}$, where $T \ge 0$ denotes an arbitrarily fixed time horizon. We make the following assumptions: there is a sub-$\sigma$-field $\mathcal{F}_0 \subset \mathcal{F}$ such that i) the Brownian motion $B$ is independent of $\mathcal{F}_0$, and ii) $\mathcal{F}_0$ is "rich enough", i.e., every measure in $\mathcal{P}_2(\mathbb{R}^d)$ is the law of some random variable in $L^2(\mathcal{F}_0; \mathbb{R}^d)$. We denote by $\mathbb{F}$ the filtration generated by $B$, completed and augmented by $\mathcal{F}_0$. Given deterministic Lipschitz coefficients $\sigma$ and $b$, it is well known that under the assumptions above both SDEs have unique solutions. We further assume: iii) all the derivatives of $\sigma_{i,j}$, $b_j$ up to order 2 are bounded and Lipschitz continuous, for all $\mu \in \mathcal{P}_2(\mathbb{R}^d)$.
The following theorem is taken from [7]. It gives the Itô formula related to a probability measure.
For simplicity, we will use the following matrix notation. We denote by $\mathbb{R}^{n \times d}$ the space of real matrices of type $n \times d$, and by $(\mathbb{R}^{n \times n})^d$ the space of $d$-tuples of $n \times n$ matrices. Given any $\alpha, \beta \in \mathbb{R}^n$, $L, S \in \mathbb{R}^{n \times d}$, $\gamma \in \mathbb{R}^d$ and $M, N \in (\mathbb{R}^{n \times n})^d$, we introduce the notation used below. For mean-field type SDEs and BSDEs, we still have to introduce some notation. Let $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde P)$ and $(\bar\Omega, \bar{\mathcal{F}}, \bar P)$ be two copies of the probability space $(\Omega, \mathcal{F}, P)$. For any random variable $\xi$ over $(\Omega, \mathcal{F}, P)$, we denote by $\tilde\xi$ and $\bar\xi$ its copies on $\tilde\Omega$ and $\bar\Omega$, respectively; they have the same law as $\xi$, but are defined over $(\tilde\Omega, \tilde{\mathcal{F}}, \tilde P)$ and $(\bar\Omega, \bar{\mathcal{F}}, \bar P)$. The expectations $\tilde E[\cdot] = \int_{\tilde\Omega}(\cdot)\,d\tilde P$ and $\bar E[\cdot] = \int_{\bar\Omega}(\cdot)\,d\bar P$ act only on the variables $\tilde\omega$ and $\bar\omega$, respectively.
3. Formulation of the singular optimal stochastic control problem and the main result. In this section, we formulate our generalized mean-field optimal control problem and state the main result of this article. Let $(\Omega, \mathcal{F}, P)$ be a probability space with filtration $(\mathcal{F}_t)$, and suppose that $B_t$ is a Brownian motion on $(\Omega, \mathcal{F}, P)$, where $(\mathcal{F}_t)$ is the filtration generated by $B_t$, augmented by all $P$-null sets. Let $\mathcal{U}$ denote the admissible control set, consisting of the $\mathcal{F}_t$-adapted processes $u_t$ taking values in $U$ and satisfying a uniform moment bound on $[0, T]$. The state equation and the cost functional are defined by (1) and (2). Throughout this paper, we make the following assumptions on the coefficients.
Hypothesis 3.1. (1) The functions $b, \sigma, h, \Phi$ are differentiable with respect to $(x, \mu, v)$, and $b, \sigma$ satisfy a Lipschitz condition in $(x, \mu, v)$.
(2) The first-order derivatives with respect to (x, µ) of b, σ are Lipschitz continuous and bounded.
(4) The second-order derivatives with respect to (x, µ) of b, σ, h, Φ are continuous and bounded. All the second-order derivatives are Borel measurable with respect to (t, x, µ, v).
Suppose that $u$ is an optimal control and $X^u$ is the associated trajectory. We want to find necessary conditions satisfied by $u$. First, we introduce the abbreviation $b(t) := b(t, X^u_t, P_{X^u_t}, u_t)$; the analogous abbreviations for the other coefficients are introduced in the same way. Consider the first-order adjoint process defined by the BSDE (12). According to Theorem 3.1 of [6], this BSDE admits a unique adapted solution, which we denote by $(p^u_t, q^u_t)$. Define the Hamiltonian $H$ accordingly. The following first-order SMP is obtained as a special case of [5].
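For problems of the form (1)-(2), the Hamiltonian is usually defined as follows; this is a sketch under the assumption that the convention of [5] is followed, and the exact display may differ in the choice of inner products:

```latex
H(t, x, \mu, p, q, v)
  := p \cdot b(t, x, \mu, v)
   + \operatorname{tr}\!\big( q^{\top} \sigma(t, x, \mu) \big)
   + h(t, x, \mu, v),
```

where $(p, q)$ ranges over the state space of the first-order adjoint process. Note that since $\sigma$ in (1) does not contain the control variable, the $q$-term of $H$ is constant in $v$; this is what makes the singular (constant-in-$v$) case possible.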
Theorem 3.1 (The first-order SMP). Let Hypothesis 3.1 hold. Suppose that $X^u_t$ is the trajectory associated with the optimal control $u$, and that $(p, q)$ is the solution of the mean-field backward stochastic differential equation (MFBSDE) (12). Then there is a subset $I_0 \subset [0, T]$ of full measure such that for all $t \in I_0$,
$$H(t, X^u_t, P_{X^u_t}, p_t, q_t, u_t) = \inf_{v \in U} H(t, X^u_t, P_{X^u_t}, p_t, q_t, v), \quad \text{a.s.}$$
As we pointed out in the introduction, the aim of this article is to derive a further SMP when the Hamiltonian above becomes singular and the first-order SMP therefore fails to characterize the optimal control $u_t$. To this end, we define the second-order adjoint process by equation (14). Remark 1. By changing the terminal condition $p_T$, we can always eliminate the terminal cost when deducing the variational inequality. In fact, the terminal condition $P_T = 0$ is due to the assumption that $\Phi \equiv 0$; without this assumption, we only need to modify the terminal conditions accordingly. Hence, without loss of generality, we assume that the terminal cost $\Phi \equiv 0$ in the following sections.

HANCHENG GUO AND JIE XIONG
Finally, we present our main result in this article.
Theorem 3.2. Assume that Hypothesis 3.1 holds. Let $(X^u_\cdot, u_\cdot)$ be an optimal pair and let $u_\cdot$ be singular on the control region $V$. Suppose that $(P, Q)$ is the unique adapted solution of equation (14). Then there is a full-measure subset $I_0 \subset [0, T]$ such that at each $t \in I_0$, $(X^u_\cdot, u_\cdot)$ satisfies not only the first-order stochastic maximum principle, but also the second-order variational inequality.

4. Quantitative analysis of the impact of control actions on the state.
In this section, we expand the state process according to different orders of the perturbation parameter d(u, v), a distance between the optimal control u and its perturbation v.
Proof. By the state equation (1), for $\tau \in [0, T]$ we obtain a standard integral estimate, and Gronwall's inequality then yields the desired result.
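The Gronwall argument used here, and repeatedly below, is the standard integral form, recalled for convenience:

```latex
\text{If } f : [0, T] \to [0, \infty) \text{ satisfies }
f(\tau) \le a + C \int_0^\tau f(s)\, ds \quad \text{for all } \tau \in [0, T],
\text{ then } \quad f(\tau) \le a\, e^{C \tau} \quad \text{for all } \tau \in [0, T].
```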
For $v_i \in \mathcal{U}$, $i = 1, 2$, we define the distance $d(v_1, v_2)$ between controls. Given the optimal pair $(X^u_\cdot, u_\cdot)$, we now turn to the perturbation $X^v$ of $X^u$ and introduce the associated variation processes. The following lemmas estimate their orders in terms of the parameter $d(v, u)$.
Lemma 4.2. Assume that Hypothesis 3.1 holds. Then there exists a constant $K > 0$ such that the stated estimates hold for any $v(\cdot), u(\cdot) \in \mathcal{U}$. Proof. For any $\tau \in [0, T]$, by Hypothesis 3.1 and the Burkholder-Davis-Gundy inequality we obtain the corresponding bounds, and an application of Gronwall's inequality yields the claim.
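The Burkholder-Davis-Gundy inequality invoked in the proof is used in its standard one-sided form, recalled here; $C_p$ denotes a universal constant depending only on $p$:

```latex
E\Big[ \sup_{0 \le t \le \tau} \Big| \int_0^t \phi_s\, dB_s \Big|^{p} \Big]
\le C_p\, E\Big[ \Big( \int_0^\tau |\phi_s|^2\, ds \Big)^{p/2} \Big],
\qquad p \ge 1,
```

valid for any adapted integrand $\phi$ for which the right-hand side is finite.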

Notice that the first-order derivative $b_x$ is bounded. Then (26) implies the corresponding estimate. According to the assumptions on $v$ and $u$, and by Minkowski's inequality, the claimed bounds follow. The following lemma gives the order of $X^{v,*}_t$. Lemma 4.3. Assume that Hypothesis 3.1 holds, let $v(\cdot) \in \mathcal{U}$, and let $I_\rho \subset [0, T]$ be a Borel subset. Then the stated estimate holds as $|I_\rho| \to 0$.
Proof. We first introduce some notation, where $X^{v,12}$ denotes the corresponding variation process; similar notations can be introduced with $b$ replaced by $\sigma$.
We now proceed to estimating $X^*(v)$ defined by (20). By (1), (18) and (19), we obtain an equation for $X^*(v)$ whose remainder we denote by $\alpha(t)$, and $\alpha(t)$ can be represented as follows. It is easy to show the corresponding identity; similarly, we proceed for the terms involving $\sigma$.
By the Burkholder-Davis-Gundy inequality, for $\tau \in [0, T]$ we obtain the following estimate, and Gronwall's inequality then gives the bound. Concerning $A(s; \tilde v)$, note that similar estimates hold with $b$ replaced by $b_x$ and $b_\mu$. Since $X^{\tilde v,12}_t \to 0$ as $|I_\rho| \to 0$, we also have $\triangle b_{xx}(s; \lambda\eta; \tilde v) \to 0$; replacing $b_{xx}$ by $b_{\mu\mu}$, $b_{\mu y}$ and $b_{x\mu}$, we obtain the analogous convergences as $|I_\rho| \to 0$.
According to the estimates of $X^{\tilde v,1}_\cdot$ and $X^{\tilde v,2}_\cdot$ in Lemma 4.2, we obtain the corresponding bounds, and similarly for the remaining terms. Finally, by (37), we have the desired result.

5. Expansion of the cost functional with respect to the control variable. In this section, we use the method of Lemma 4.3 again to study the expansion of the cost functional in terms of different orders of the perturbation.
Recalling that $p_t$ is given by (12) and applying Itô's formula to $p_t X^{v,12}_t$, we obtain the corresponding duality relation, and hence the expansion of the cost functional. Now we apply the range theorem for vector-valued measures due to [18] to deduce the variational inequality.
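The application of Itô's formula to the product $p_t X^{v,12}_t$ rests on the Itô product rule; for two Itô processes $p$ and $X$ driven by the same Brownian motion, it reads (a standard identity, stated here for reference):

```latex
d(p_t X_t) = p_t\, dX_t + X_t\, dp_t + d\langle p, X \rangle_t,
```

so that, after taking expectations, the stochastic-integral terms vanish and only the drift and quadratic-covariation contributions remain; this is what produces the duality between the adjoint process and the variation of the state.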
Recall that $\tilde v_t$ is defined by (30). According to Lemma 4.1 of [18], for any $0 < \rho < 1$ we can choose a suitable $I_\rho \subset [0, T]$ such that $|I_\rho| = \rho T$ and the stated properties hold. Lemma 5.2. For the $I_\rho$ above and $t \in [0, T]$, we also have the following. The proof of this lemma is essentially the same as that of Lemma 4.1 of [18]; for the convenience of the reader, we present it here.
For any $s \in [0, T]$, we can always find an $m \ge 0$ such that $s \in [t_m, t_{m+1})$. Then we obtain the stated identity, where $I_G(t)$ is the indicator function of $G$. Thus the claim holds for each $1 \le i \le k$. Setting $\varphi_1(t) = \triangle b(t, v)$ and letting $\rho \to 0$, we finish the proof.
Furthermore, inspired by [24], we also have the following lemma.