A stochastic maximum principle for linear quadratic problem with nonconvex control domain

This paper considers the stochastic linear quadratic optimal control problem in which the control domain is nonconvex. By functional analysis and convex perturbation methods, we establish a novel maximum principle. The application of the proposed maximum principle is illustrated through a worked-out example.

1. Introduction. Stochastic linear quadratic optimal control problems play an important role in optimal control theory. On the one hand, many nonlinear control problems can be approximated by linear quadratic control problems; on the other hand, solutions to linear quadratic control problems exhibit elegant properties owing to their simple and explicit structures.
A classical form of the stochastic linear quadratic optimal control problem is to minimize the quadratic cost functional
\[
J(u(\cdot))=\mathbb{E}\Big\{\frac{1}{2}\int_0^T\big[\langle Q(t)X(t),X(t)\rangle+2\langle S(t)X(t),u(t)\rangle+\langle R(t)u(t),u(t)\rangle\big]\,dt+\frac{1}{2}\langle GX(T),X(T)\rangle\Big\}\tag{1}
\]
with the control $u(\cdot)$ being a square integrable adapted process and the state being the solution to the linear stochastic differential equation
\[
\begin{cases}
dX(t)=[A(t)X(t)+B(t)u(t)+b(t)]\,dt+[C(t)X(t)+D(t)u(t)+\sigma(t)]\,dW(t),\\
X(0)=x,
\end{cases}\tag{2}
\]
where $A$, $B$, $C$, $D$, $b$, $\sigma$ are deterministic matrix-valued functions of suitable sizes, $T$ is a fixed terminal time and $W$ is a Wiener process. Under suitable conditions, the cost functional (1) and the control system (2) are well defined. This kind of stochastic linear quadratic problem was first studied by Wonham [22]. Then Bismut [4] proved an existence result for the optimal control.
As one of the foundations of modern finance theory, the mean-variance portfolio issue can be translated to the stochastic linear quadratic problems. As indicated in [14], the mean-variance problem is essentially an indefinite stochastic linear quadratic problem because its running cost is identically zero. The mean-variance model was formulated as a quadratic programming problem by Markowitz in the case of a single-period investment (see [16,17]), and the multi-period counterpart was soon considered after the pioneering work (see [9,11,21,18]). By resorting to the method of dynamic programming, the mean-variance model in a continuous time setting was developed a bit later (see [10,6,7]). By employing the embedding technique, Zhou and Li [26] turned the continuous-time mean-variance problem into a stochastic linear quadratic problem, which could be solved by using the results in Chen, Li and Zhou [5].
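The essential structure behind this observation is the following (a simplified sketch with illustrative notation $z$, $\lambda$; the rigorous treatment in [26] uses the embedding technique). The mean-variance investor solves
\[
\min_{u(\cdot)}\ \operatorname{Var}\big(X(T)\big)=\mathbb{E}\,|X(T)|^2-\big(\mathbb{E}\,X(T)\big)^2\qquad\text{subject to}\qquad \mathbb{E}\,X(T)=z,
\]
and the candidate optimal controls can be sought among the solutions of the auxiliary family of problems
\[
\min_{u(\cdot)}\ \mathbb{E}\big[|X(T)|^2-2\lambda X(T)\big]=\min_{u(\cdot)}\ \mathbb{E}\,|X(T)-\lambda|^2-\lambda^2,\qquad\lambda\in\mathbb{R}.
\]
Each auxiliary problem has zero running cost and a quadratic terminal cost, i.e. it is a stochastic linear quadratic problem of the form (1)-(2) with $Q\equiv 0$, $S\equiv 0$, $R\equiv 0$, which is why the mean-variance problem is an (indefinite) linear quadratic problem.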
Recently, Li, Zhou and Lim [14] discussed a continuous-time mean-variance problem with no-shorting constraints. By using a martingale approach, Bielecki et al. [3] investigated the continuous-time mean-variance problem with bankruptcy prohibition. Heunis [12] considered minimizing the expected value of a general quadratic loss function of the wealth in a more general setting, where there is a specified convex constraint on the portfolio over the trading interval together with a specified almost sure lower bound on the wealth at the close of trade. Li and Xu [15] studied the continuous-time mean-variance problem with the mixed restriction of both bankruptcy prohibition and convex cone portfolio constraints. The control domains in the aforementioned literature are assumed to be convex sets or convex cones. Inspired by the above discussion, we intend to further study the stochastic linear quadratic problem with a nonconvex control domain, which may be helpful for mean-variance problems with nonconvex control constraints.
In this paper, we develop a novel stochastic maximum principle to tackle the stochastic linear quadratic problem with a nonconvex control domain. To begin with, based on a functional analysis approach, we transform the original linear quadratic problem into a quadratic optimization problem in a Hilbert space, namely the space of all square integrable adapted control processes. Next, by introducing a parameter in the quadratic optimization problem, we equivalently turn the original problem into a concave control problem with a convex control domain, which can be solved by the classical stochastic maximum principle. In this way we derive a new stochastic maximum principle. It is well known that the stochastic maximum principle is of great importance in solving stochastic optimal control problems (see [8], [13], [19,20], [23,24]). A local form of the stochastic maximum principle for the classical stochastic recursive optimal control problem was established in Peng [20]. By utilizing the spike variational method and second-order adjoint equations, Peng [19] obtained a general stochastic maximum principle in which the admissible control domain need not be convex and the diffusion coefficient contains the control variable. The stochastic maximum principle in [19] relies heavily on the second-order adjoint equation. Compared with the classical methods for dealing with the nonconvex stochastic linear quadratic problem, the approach provided in this paper has two main advantages. Firstly, we need not introduce the second-order adjoint equation, and the presented stochastic maximum principle has a concise form. Secondly, we do not have to impose any positive definiteness requirement on the coefficients.
The paper is organized as follows. In Section 2, we present some notation and formulate the stochastic linear quadratic problem. In Section 3 we reformulate the problem by a functional analysis approach. We derive the maximum principle in Section 4 and give a worked-out example in Section 5. Section 6 concludes the paper.
2. Problem formulation. Throughout this paper, we denote by $\mathbb{R}^n$ the $n$-dimensional Euclidean space and by $\mathbb{R}^{k\times n}$ the set of $k\times n$ matrices. In particular, we denote by $\mathbb{S}^n$ the set of symmetric $n\times n$ matrices. For any given Euclidean space $H$, we denote by $\langle\cdot,\cdot\rangle$ (resp. $|\cdot|$) the scalar product (resp. norm) of $H$. For $M=(m_{ij}),\,N=(n_{ij})\in\mathbb{R}^{k\times n}$, we define $\langle M,N\rangle=\operatorname{tr}\{MN^{\top}\}$ and $|M|=\sqrt{\operatorname{tr}\{MM^{\top}\}}$, where the superscript $\top$ denotes the transpose of vectors or matrices. We write $M>0$ (resp. $\geq 0$) if $M\in\mathbb{S}^n$ is positive (resp. nonnegative) definite.
Let $W(\cdot)=(W_1(\cdot),\dots,W_d(\cdot))^{\top}$ be a standard $d$-dimensional Brownian motion defined on a complete probability space $(\Omega,\mathcal{F},P)$. The information structure is given by the filtration $\mathbb{F}=\{\mathcal{F}_t\}_{0\le t\le T}$, which is generated by $W(\cdot)$ and augmented by all the $P$-null sets. Let $L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)$ denote the set of $\mathbb{R}^n$-valued $\mathcal{F}_t$-adapted processes $f(\cdot)$ with $\mathbb{E}\int_0^T|f(t)|^2\,dt<\infty$, and let $L^2_{\mathcal{F}}(\Omega;\mathbb{R}^n)$ denote the set of $\mathbb{R}^n$-valued $\mathcal{F}_T$-measurable random variables $\xi$ with $\mathbb{E}|\xi|^2<\infty$. To simplify the presentation, we assume the dimension of the Brownian motion is $d=1$.
For a given $x\in\mathbb{R}^n$, consider the following linear stochastic differential equation:
\[
\begin{cases}
dX(t)=[A(t)X(t)+B(t)u(t)+b(t)]\,dt+[C(t)X(t)+D(t)u(t)+\sigma(t)]\,dW(t),\quad t\in[0,T],\\
X(0)=x,
\end{cases}\tag{3}
\]
where $A$, $B$, $C$, $D$, $b$, $\sigma$ are deterministic matrix-valued functions of suitable sizes. In the above equation, $u(\cdot)$ is a control process and $X(\cdot)$ is the corresponding state process. In addition, the quadratic cost functional is given by
\[
J(u(\cdot))=\mathbb{E}\Big\{\frac{1}{2}\int_0^T\big[\langle Q(t)X(t),X(t)\rangle+2\langle S(t)X(t),u(t)\rangle+\langle R(t)u(t),u(t)\rangle\big]\,dt+\frac{1}{2}\langle GX(T),X(T)\rangle\Big\},\tag{4}
\]
where $G\in\mathbb{S}^n$, and $Q$, $S$ and $R$ are $\mathbb{S}^n$-, $\mathbb{R}^{k\times n}$- and $\mathbb{S}^k$-valued functions, respectively.
Our stochastic linear quadratic control problem is to find an admissible control $\bar{u}(\cdot)$ such that
\[
J(\bar{u}(\cdot))=\min_{u(\cdot)\in\mathcal{U}_{ad}}J(u(\cdot)).\tag{5}
\]

3. Results of the linear quadratic problem by functional analysis. Since the state equation in a stochastic linear quadratic problem is linear, by the variation of constants formula the state process can be expressed explicitly in terms of the initial state and the control. Substituting this relation into the cost functional, we obtain a functional which is quadratic in the state and control terms. To describe the method in detail, we first introduce the matrix-valued process $\Phi(\cdot)$ defined by
\[
\begin{cases}
d\Phi(t)=A(t)\Phi(t)\,dt+C(t)\Phi(t)\,dW(t),\quad t\in[0,T],\\
\Phi(0)=I_n.
\end{cases}
\]
From stochastic calculus, we know that $\Phi^{-1}(t)$ exists for all $t\ge 0$ and satisfies
\[
d\Phi^{-1}(t)=\Phi^{-1}(t)\big[\big(C(t)C(t)-A(t)\big)\,dt-C(t)\,dW(t)\big],\qquad \Phi^{-1}(0)=I_n.
\]
The solution of (3) can then be written as
\[
X(t)=\Phi(t)\Big[x+\int_0^t\Phi^{-1}(s)\big(b(s)+B(s)u(s)-C(s)\big(\sigma(s)+D(s)u(s)\big)\big)\,ds+\int_0^t\Phi^{-1}(s)\big(\sigma(s)+D(s)u(s)\big)\,dW(s)\Big].
\]
By the Burkholder-Davis-Gundy inequality, we obtain the estimate
\[
\mathbb{E}\Big[\sup_{0\le t\le T}|X(t)|^2\Big]\le K\Big(1+|x|^2+\mathbb{E}\int_0^T|u(t)|^2\,dt\Big)
\]
for some constant $K>0$. Next, for every $x\in\mathbb{R}^n$ and $u(\cdot)\in\mathcal{U}_{ad}$, we define several bounded linear operators (in particular $L$ acting on the control and $\Gamma$ acting on the initial state), through which the state equation (3) and its terminal value can be written in operator form. Our next goal is to find a representation of the cost functional (4) in terms of the control.
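For illustration (a standard one-dimensional example, not part of the paper's argument), when $n=1$ the process $\Phi(\cdot)$ is an explicit stochastic exponential:
\[
\Phi(t)=\exp\Big\{\int_0^t\Big(A(s)-\tfrac{1}{2}C(s)^2\Big)ds+\int_0^tC(s)\,dW(s)\Big\},\qquad
\Phi^{-1}(t)=\exp\Big\{-\int_0^t\Big(A(s)-\tfrac{1}{2}C(s)^2\Big)ds-\int_0^tC(s)\,dW(s)\Big\},
\]
so that $\Phi(t)>0$ for all $t$, and an application of Itô's formula to $\Phi^{-1}(t)=1/\Phi(t)$ recovers the stochastic differential equation for $\Phi^{-1}(\cdot)$ displayed above.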
To this end, we note that the operators introduced above are bounded linear operators, and we need to identify their adjoint operators. In fact, the adjoint operators can be defined through the associated backward stochastic differential equation (8). We have the following results. (i) $L^{*}:L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)\to\mathcal{U}_{ad}$ and $\Gamma^{*}:L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)\to\mathbb{R}^n$ are bounded operators satisfying (6).
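The mechanism behind such a definition is the standard integration-by-parts (duality) argument; the following generic computation (a sketch with illustrative notation $\varphi$, $\psi$, $\xi$, $\eta$, not the paper's exact equation (8)) shows how a backward equation produces an adjoint. Let $Y(\cdot)$ solve $dY(t)=[A(t)Y(t)+\varphi(t)]\,dt+[C(t)Y(t)+\psi(t)]\,dW(t)$ with $Y(0)=0$, and let $(p(\cdot),q(\cdot))$ solve the backward equation $dp(t)=-[A(t)^{\top}p(t)+C(t)^{\top}q(t)+\xi(t)]\,dt+q(t)\,dW(t)$ with $p(T)=\eta$. Applying Itô's formula to $\langle Y(t),p(t)\rangle$ and taking expectations, the terms containing $Y(\cdot)$ cancel and one obtains
\[
\mathbb{E}\,\langle Y(T),\eta\rangle+\mathbb{E}\int_0^T\langle Y(t),\xi(t)\rangle\,dt
=\mathbb{E}\int_0^T\big[\langle \varphi(t),p(t)\rangle+\langle \psi(t),q(t)\rangle\big]\,dt,
\]
which is precisely the type of duality relation from which the adjoint operators can be read off.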
(ii) For any $\eta\in L^2_{\mathcal{F}}(\Omega;\mathbb{R}^n)$, let $(p_1(\cdot),q_1(\cdot))\in L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)\times L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)$ be the adapted solution of (8) with $\xi(\cdot)=0$, and define the corresponding adjoint operators accordingly. Once we have the above results, we can obtain another form (10) of the cost functional, namely a quadratic functional of the control, in which $N:\mathcal{U}_{ad}\to\mathcal{U}_{ad}$ is a bounded operator.

4. Stochastic maximum principle. Generally speaking, since the control domain of the stochastic linear quadratic problem is nonconvex, one should take the second-order adjoint process into consideration. To conveniently state this classical maximum principle, we recall the following propositions, which come from [25, Chapter 3]. Consider the controlled stochastic differential equation
\[
\begin{cases}
dX(t)=b(t,X(t),u(t))\,dt+\sigma(t,X(t),u(t))\,dW(t),\quad t\in[0,T],\\
X(0)=x_0,
\end{cases}\tag{11}
\]
with the cost functional
\[
J(u(\cdot))=\mathbb{E}\Big[\int_0^T f(t,X(t),u(t))\,dt+h(X(T))\Big].\tag{12}
\]
The controller wants to find the infimum of $J$ over the admissible control set. Moreover, we introduce the following assumption as in [25].
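To see how such a quadratic representation arises (a generic sketch with illustrative notation, not the paper's exact formula (10)): since the solution map of (3) is affine in the control for fixed $x$, one can write
\[
X(\cdot)=\mathcal{L}u(\cdot)+\psi(\cdot),\qquad X(T)=\hat{\mathcal{L}}u(\cdot)+\hat{\psi},
\]
where $\mathcal{L}$, $\hat{\mathcal{L}}$ are bounded linear operators and $\psi$, $\hat{\psi}$ collect the contributions of the initial state and the nonhomogeneous terms $b$, $\sigma$. Substituting this into (4) and using the adjoint operators yields a representation of the form
\[
J(u(\cdot))=\langle Nu,u\rangle+2\langle h,u\rangle+c,
\]
where $\langle\cdot,\cdot\rangle$ denotes the inner product of the Hilbert space of square integrable controls, $N$ is a bounded self-adjoint operator, $h$ is an element of the control space and $c$ is a constant, all determined by $Q$, $S$, $R$, $G$ and the data $x$, $b$, $\sigma$.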
Let $(\bar{X}(\cdot),\bar{u}(\cdot))$ be an optimal pair of the stochastic system (11)-(12). We introduce the associated first-order and second-order adjoint equations (13) and (14), in which the Hamiltonian $H$ is defined in terms of the coefficients of (11)-(12).

Proposition 4.2. (General stochastic maximum principle [19,25]) Let Assumption 4.1 hold and let $(\bar{X}(\cdot),\bar{u}(\cdot))$ be an optimal pair of the stochastic system (11)-(12). Then there exist pairs of processes $(p(\cdot),q(\cdot))\in L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)\times L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)$ and $(P(\cdot),Q(\cdot))$ satisfying the first-order and second-order adjoint equations (13) and (14), respectively, such that the corresponding maximum condition holds.

It should be noted that the second-order adjoint equation has to be introduced in the above stochastic maximum principle. Different from [19,25], the primal problem (3)-(5) will be turned into a concave problem with a convex control domain in this section, and a novel stochastic maximum principle without the second-order adjoint equation will be proposed. To proceed further, the following useful proposition is needed; the interested reader can find it in [20] and [25].

Proposition 4.3. (Stochastic maximum principle with convex control domain [20,25]) Let $\bar{u}(\cdot)$ be an optimal control and $\bar{X}(\cdot)$ be the corresponding trajectory. Suppose the control domain is convex and all the coefficients are $C^1$ in $u$. Then there exists a pair of processes $(p(\cdot),q(\cdot))\in L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)\times L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)$ satisfying the first-order adjoint equation (13) such that the corresponding variational inequality holds.

The next theorem is one of the main results in this paper. We first transform the original linear quadratic problem into a quadratic optimization problem in a Hilbert space by the functional analysis approach of Section 3. Then, through the introduced parameter, we turn the original problem (3)-(5) into a concave control problem with a convex control domain. Applying the stochastic maximum principle of Proposition 4.3 to the transformed concave problem yields the new stochastic maximum principle.
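For orientation, we recall the typical shape of these conditions in one common convention (e.g. that of [25], with $H(t,x,u,p,q)=\langle p,b(t,x,u)\rangle+\langle q,\sigma(t,x,u)\rangle-f(t,x,u)$ for the minimization problem); the exact form depends on the definition of $H$ and of (13)-(14). In the general (nonconvex) case, the maximum condition reads
\[
H(t,\bar{X}(t),\bar{u}(t),p(t),q(t))-H(t,\bar{X}(t),v,p(t),q(t))
-\tfrac{1}{2}\big\langle P(t)\big[\sigma(t,\bar{X}(t),v)-\sigma(t,\bar{X}(t),\bar{u}(t))\big],\,\sigma(t,\bar{X}(t),v)-\sigma(t,\bar{X}(t),\bar{u}(t))\big\rangle\ge 0,\quad\forall v\in U,\ \text{a.e., a.s.},
\]
while for a convex control domain the condition reduces to the local form
\[
\big\langle H_u(t,\bar{X}(t),\bar{u}(t),p(t),q(t)),\,v-\bar{u}(t)\big\rangle\le 0,\quad\forall v\in U,\ \text{a.e., a.s.}
\]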
Theorem 4.4. Suppose Assumption 2.1 holds and let $(\bar{X}(\cdot),\bar{u}(\cdot))$ be an optimal pair of the stochastic linear quadratic control problem (3)-(5). Then there exists an adapted solution $(p(\cdot),q(\cdot))\in L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)\times L^2_{\mathcal{F}}(0,T;\mathbb{R}^n)$ to an associated backward stochastic differential equation such that the corresponding maximum condition holds a.e., $P$-a.s., where the Hamiltonian function $H_{\mu}$ is defined by (15), the parameter $\mu$ is chosen so that $-\mu$ is the largest eigenvalue of the operator $N$ in (10), and $I$ denotes the identity operator.
Proof. The proof is technical. We divide it into two steps to make the idea clear.
Step 1. Equivalent formulation. As shown in Section 3, the cost functional is quadratic in the triple of the initial state, the control and the nonhomogeneous term. Given any real number $\mu\in\mathbb{R}$, let $\bar{N}=N+\mu I$ and $\bar{H}(x)=H(x)-\frac{1}{2}\mu I$; then we can define the corresponding stochastic linear quadratic control problem (16) with cost functional $J_{\mu}$. When we select $-\mu$ as the largest eigenvalue of $N$, the operator $\bar{N}$ is negative semi-definite, and hence $J_{\mu}$ is concave with respect to $u(\cdot)$. We claim that this problem is equivalent (in the sense of coinciding optimal control and optimal value) to the corresponding problem over $\bar{\mathcal{U}}_{ad}$, where $\bar{\mathcal{U}}_{ad}$ is defined by
\[
\bar{\mathcal{U}}_{ad}=\big\{u(\cdot)\in L^2_{\mathcal{F}}(0,T;\mathbb{R}^k):\ 0\le u(t)\le e,\ \text{a.e., a.s.}\big\}.
\]
Note that we have defined $E=\{u\in\mathbb{R}^k:0\le u\le e\}$ in Section 2.
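The role of concavity here can be seen already in one dimension (an illustrative remark, not part of the proof): if
\[
g(u)=\bar{n}u^2+2hu+c \quad\text{with } \bar{n}\le 0, \qquad\text{then}\qquad \min_{0\le u\le e}g(u)=\min\{g(0),g(e)\},
\]
i.e. the minimum of a concave quadratic over a convex interval is attained at an extreme point. This is what allows the minimization over a convex set to reproduce the value of the original problem and, conversely, allows the classical maximum principle for convex control domains (Proposition 4.3) to be applied to the transformed problem.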
Thus $\bar{u}(\cdot)$ is the optimal control of $J(u(\cdot))$. Therefore, the problem (16) is equivalent to the minimization of a concave quadratic functional over a convex set, so we can apply the classical stochastic maximum principle of Proposition 4.3, which deals with a convex control domain, to the equivalent problem.
Step 2. Applying the stochastic maximum principle. In order to derive the maximum principle for the classical stochastic optimal control problem, one needs the variational equation of the state equation. Based on the variational inequality and the introduced adjoint equation, one obtains the stochastic maximum principle (see [1,2,20] and [25]); a sketch of the convex perturbation step is given below.
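For the linear system (3), the convex perturbation argument takes the following form (a standard sketch, with $v(\cdot)$ an arbitrary element of the convex admissible set). For $\varepsilon\in[0,1]$ set $u^{\varepsilon}(\cdot)=\bar{u}(\cdot)+\varepsilon(v(\cdot)-\bar{u}(\cdot))$; since (3) is linear, the corresponding state satisfies $X^{\varepsilon}(\cdot)=\bar{X}(\cdot)+\varepsilon Y(\cdot)$ exactly, where $Y(\cdot)$ solves the variational equation
\[
\begin{cases}
dY(t)=\big[A(t)Y(t)+B(t)\big(v(t)-\bar{u}(t)\big)\big]\,dt+\big[C(t)Y(t)+D(t)\big(v(t)-\bar{u}(t)\big)\big]\,dW(t),\\
Y(0)=0.
\end{cases}
\]
Expanding $J_{\mu}(u^{\varepsilon}(\cdot))-J_{\mu}(\bar{u}(\cdot))\ge 0$ to first order in $\varepsilon$, and eliminating $Y(\cdot)$ by the duality (integration-by-parts) relation with the adjoint pair $(p(\cdot),q(\cdot))$, yields the variational inequality behind Proposition 4.3.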
In our case, we introduce the adjoint equation as the backward stochastic differential equation for the pair $(p(\cdot),q(\cdot))$ stated in Theorem 4.4. Moreover, the Hamiltonian function $H_{\mu}$ is defined by
\[
H_{\mu}(t,x,u,p,q)=\langle p,\,A(t)x+B(t)u+b(t)\rangle+\langle q,\,C(t)x+D(t)u+\sigma(t)\rangle-\frac{1}{2}\Big[\langle Q(t)x,x\rangle+2\Big\langle S(t)x-\frac{\mu}{2}I,\,u\Big\rangle+\big\langle (R(t)+\mu I)u,\,u\big\rangle\Big].
\]
Then, applying Proposition 4.3 to the transformed problem, we obtain the stated stochastic maximum principle. This completes the proof.

Here $H_0$ is defined by (15) with $\mu=0$, and for any $v\in U$ the corresponding conditions hold. We can see that the nonconvex maximum principle is complicated, whereas the approach provided in this paper is comparatively concise. In the next section we give an example to show how the new maximum principle works.