Weak Closed-Loop Solvability of Stochastic Linear-Quadratic Optimal Control Problems

Recently it has been found that, for a stochastic linear-quadratic optimal control problem (LQ problem, for short) in a finite horizon, open-loop solvability is strictly weaker than closed-loop solvability, which in turn is equivalent to the regular solvability of the corresponding Riccati equation. Therefore, when an LQ problem is merely open-loop solvable but not closed-loop solvable, which is possible, the usual Riccati equation approach fails to produce a state feedback representation of open-loop optimal controls. The objective of this paper is to introduce and investigate the notion of weak closed-loop optimal strategy for LQ problems, whose existence is equivalent to the open-loop solvability of the LQ problem. Moreover, at least one open-loop optimal control then admits a state feedback representation. Finally, we present an example to illustrate the procedure for finding weak closed-loop optimal strategies.


Introduction
Let (Ω, F, P) be a complete probability space on which a standard one-dimensional Brownian motion W(·) = {W(t); 0 ≤ t < ∞} is defined, and let F = {F_t}_{t≥0} be the natural filtration of W(·) augmented by all the P-null sets in F. Let 0 ≤ t < T and consider the following controlled linear stochastic differential equation (SDE, for short) on the finite horizon [t, T]:

dX(s) = [A(s)X(s) + B(s)u(s) + b(s)] ds + [C(s)X(s) + D(s)u(s) + σ(s)] dW(s), s ∈ [t, T]; X(t) = x, (1.1)

where A, C : [0, T] → R^{n×n}, B, D : [0, T] → R^{n×m} are given deterministic functions, called the coefficients of the state equation (1.1); b, σ : [0, T] × Ω → R^n are F-progressively measurable processes, called the nonhomogeneous terms; and (t, x) ∈ [0, T) × R^n is called the initial pair. Here, R^n is the usual n-dimensional Euclidean space consisting of all n-tuples of real numbers, and R^{n×m} is the set of all n × m real matrices. The process u(·), which belongs to the space

U[t, T] ≜ { u : [t, T] × Ω → R^m | u(·) is F-progressively measurable and E ∫_t^T |u(s)|² ds < ∞ },

is called the control process, and the solution X(·) of (1.1) is called the state process corresponding to (t, x) and u(·). According to standard results on SDEs, under appropriate conditions, for any initial pair (t, x) and any control u(·) ∈ U[t, T], equation (1.1) admits a unique (strong) solution X(·) ≡ X(·; t, x, u(·)) which is continuous and square-integrable.
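For readers who wish to experiment numerically, dynamics of the form (1.1) can be simulated with the Euler-Maruyama scheme. The following Python sketch is a hypothetical scalar illustration (n = m = 1, constant coefficients, b = σ = 0, all values chosen by us for concreteness); it is not part of the paper's argument. With C = D = 0 and u ≡ 0 the dynamics are deterministic and X(T) ≈ x·e^{A(T−t)}, which serves as a sanity check for the scheme.

```python
import math, random

def euler_maruyama(x, t, T, A, B, C, D, u, n_steps=20000, seed=1):
    """Simulate dX = [A X + B u(s)] ds + [C X + D u(s)] dW on [t, T]
    for scalar (n = m = 1) constant coefficients; returns X(T)."""
    rng = random.Random(seed)
    h = (T - t) / n_steps
    X, s = x, t
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(h))  # Brownian increment ~ N(0, h)
        X += (A * X + B * u(s)) * h + (C * X + D * u(s)) * dW
        s += h
    return X

# Sanity check: with no noise (C = D = 0) and zero control,
# the equation is deterministic and X(T) = x * exp(A (T - t)).
XT = euler_maruyama(x=1.0, t=0.0, T=1.0, A=0.5, B=1.0, C=0.0, D=0.0,
                    u=lambda s: 0.0)
print(XT)  # close to e^{0.5} ≈ 1.6487
```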
To measure the performance of the control u(·), we introduce the following quadratic cost functional:

J(t, x; u(·)) = E{ ⟨GX(T), X(T)⟩ + 2⟨g, X(T)⟩ + ∫_t^T [ ⟨Q(s)X(s), X(s)⟩ + 2⟨S(s)X(s), u(s)⟩ + ⟨R(s)u(s), u(s)⟩ + 2⟨q(s), X(s)⟩ + 2⟨ρ(s), u(s)⟩ ] ds }, (1.2)

where G ∈ R^{n×n} is a symmetric constant matrix; g is an F_T-measurable random variable taking values in R^n; Q : [0, T] → R^{n×n}, S : [0, T] → R^{m×n}, and R : [0, T] → R^{m×m} are deterministic functions with Q and R being symmetric; and q : [0, T] × Ω → R^n, ρ : [0, T] × Ω → R^m are F-progressively measurable processes. In the above, ⟨·,·⟩ denotes the inner product, and M⊤ stands for the transpose of a matrix M. The problem that we are going to study is the following:

Problem (SLQ). For any given initial pair (t, x) ∈ [0, T) × R^n, find a control ū(·) ∈ U[t, T] such that

J(t, x; ū(·)) ≤ J(t, x; u(·)), ∀ u(·) ∈ U[t, T]. (1.3)

The above is called a stochastic linear-quadratic (LQ, for short) optimal control problem. Any ū(·) ∈ L²_F(t, T; R^m) satisfying (1.3) is called an open-loop optimal control of Problem (SLQ) for the initial pair (t, x); the corresponding state process X̄(·) ≡ X(·; t, x, ū(·)) is called an optimal state process; and the function V(·,·) defined by

V(t, x) ≜ inf_{u(·) ∈ U[t,T]} J(t, x; u(·)), (t, x) ∈ [0, T) × R^n,

is called the value function of Problem (SLQ). When b(·), σ(·), g, q(·), ρ(·) vanish, the state equation (1.1) and the cost functional (1.2) reduce to

dX(s) = [A(s)X(s) + B(s)u(s)] ds + [C(s)X(s) + D(s)u(s)] dW(s), s ∈ [t, T]; X(t) = x, (1.4)

and

J⁰(t, x; u(·)) = E{ ⟨GX(T), X(T)⟩ + ∫_t^T ⟨( Q(s) S(s)⊤ ; S(s) R(s) ) (X(s); u(s)), (X(s); u(s))⟩ ds }, (1.5)

respectively. We refer to the problem of minimizing (1.5) subject to (1.4) as the homogeneous LQ problem associated with Problem (SLQ), denoted by Problem (SLQ)⁰. The value function of Problem (SLQ)⁰ will be denoted by V⁰(·,·).
LQ optimal control is a classical and fundamental problem in control theory, whose history can be traced back to the works of Bellman-Glicksberg-Gross [2], Kalman [10], and Letov [11]. These works were concerned with the deterministic case, i.e., the state equation is a linear ordinary differential equation (ODE, for short) and all the involved functions are deterministic. Stochastic LQ problems were first studied by Wonham [17] in 1968. Later, Bismut [4] carried out a detailed analysis of stochastic LQ optimal control with random coefficients. See also the follow-up works of Davis [9], Bensoussan [3], and Tang [15,16]. In the classical setting, it is typically assumed that the cost functional has a positive semidefinite weighting matrix for the state and a uniformly positive definite weighting matrix for the control. Namely, the following assumption was taken for granted: for some constant δ > 0,

G ≥ 0, Q(s) ≥ 0, S(s) = 0, R(s) ≥ δI_m, a.e. s ∈ [0, T]. (1.6)

Such a condition ensures that the LQ problem admits a unique open-loop optimal control and that the associated Riccati equation (with the argument s being suppressed)

Ṗ + PA + A⊤P + C⊤PC + Q − (PB + C⊤PD)(R + D⊤PD)^{-1}(B⊤P + D⊤PC) = 0, a.e. on [0, T]; P(T) = G, (1.7)

has a unique positive definite solution on [0, T]. Further, the unique open-loop optimal control can be expressed as a linear feedback of the current state via the solution to (1.7) (see [4] or [18, Chapter 6]). It is noteworthy that (1.6) is a quite strong set of conditions for the existence of an open-loop optimal control. Later developments show that a stochastic LQ problem might still admit an open-loop optimal control even if the control weight R(·) is negative definite; see [5,12,6,8,1,7] for some relevant works on the so-called indefinite stochastic LQ control problem.
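To make the role of the Riccati equation concrete, here is a small numerical sketch (our own hypothetical illustration, not part of the paper) for the scalar case n = m = 1 with constant coefficients. It integrates the Riccati ODE backward from P(T) = G by an Euler scheme and forms the feedback gain Θ(s) = −(R + D²P)^{-1}(BP + DPC + S). With the assumed choice A = C = D = S = 0 and B = Q = R = G = 1, the exact solution of Ṗ = P² − 1, P(1) = 1, is P ≡ 1, with constant gain Θ ≡ −1, which the scheme reproduces.

```python
def riccati_feedback(A, B, C, D, Q, S, R, G, T=1.0, n=1000):
    """Backward Euler for the scalar Riccati equation
       P' + 2*A*P + C**2*P + Q - (B*P + C*D*P + S)**2 / (R + D**2*P) = 0,
       P(T) = G; returns the grids of P and of the feedback gain Theta."""
    h = T / n
    P = [0.0] * (n + 1)
    P[n] = G
    for k in range(n, 0, -1):
        # P'(s) solved from the Riccati equation above
        rhs = -(2 * A * P[k] + C**2 * P[k] + Q) \
              + (B * P[k] + C * D * P[k] + S)**2 / (R + D**2 * P[k])
        P[k - 1] = P[k] - h * rhs   # step backward: P(s-h) ≈ P(s) - h P'(s)
    Theta = [-(B * p + D * p * C + S) / (R + D**2 * p) for p in P]
    return P, Theta

P, Theta = riccati_feedback(A=0.0, B=1.0, C=0.0, D=0.0,
                            Q=1.0, S=0.0, R=1.0, G=1.0)
print(P[0], Theta[0])  # P ≡ 1 and Theta ≡ -1 for this choice of data
```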
Recently, Sun-Yong [14] and Sun-Li-Yong [13] investigated the open-loop and closed-loop solvabilities of stochastic LQ problems. It was shown that the existence of an open-loop optimal control (open-loop solvability of the LQ problem) is equivalent to the solvability of the associated optimality system (which is a constrained forward-backward SDE, abbreviated as FBSDE), and that the existence of a closed-loop optimal strategy (closed-loop solvability of the LQ problem) is equivalent to the regular solvability of the following generalized Riccati equation (GRE, for short):

Ṗ + PA + A⊤P + C⊤PC + Q − (PB + C⊤PD + S⊤)(R + D⊤PD)†(B⊤P + D⊤PC + S) = 0, a.e. on [0, T]; P(T) = G, (1.8)

where M† denotes the Moore-Penrose pseudoinverse of a matrix M. In the above, the argument s is again suppressed; we will do so in the sequel, as long as no ambiguity arises. It was found ([14,13]) that the existence of a closed-loop optimal strategy implies the existence of an open-loop optimal control, but not vice versa. Thus, there are LQ problems that are open-loop solvable but not closed-loop solvable; for such problems, one cannot expect to get a regular solution (which does not exist) to the associated GRE (1.8), so a state feedback representation of the open-loop optimal control might seem impossible. To make this point concrete, let us look at the following simple example.
In this example, the associated GRE reads Clearly, P(s) ≡ 1 is the unique solution of (1.9). From [13], we know that such a solution is not regular. The usual Riccati equation approach would specify the corresponding state feedback control as follows (noting that R(·) = 0, D(·) = 0, and 0† = 0): which is not open-loop optimal for any nonzero initial state x. In fact, let (t, x) ∈ [0, 1) × R be an arbitrary but fixed initial pair with x ≠ 0. By the variation of constants formula, the state process X*(·) corresponding to (t, x) and u*(·) is given by Hence,

J(t, x; u*(·)) = E|X*(1)|² = x² > 0.
Since the cost functional is nonnegative, we see that ū(·) is open-loop optimal for the initial pair (t, x), but u*(·) is not.
The above example suggests that the usual solvability of the generalized Riccati equation (1.8) may not be helpful in handling the open-loop solvability of certain stochastic LQ problems. It is then natural to ask: When Problem (SLQ) is merely open-loop solvable but not closed-loop solvable, is it still possible to obtain a linear state feedback representation for an open-loop optimal control? The objective of this paper is to tackle this problem. We shall provide an alternative characterization of the open-loop solvability of Problem (SLQ) using the perturbation approach introduced in [13]. We point out that our result, which avoids subsequence extraction, is a sharpened version of [13, Theorem 6.2]. In order to obtain a linear state feedback representation of an open-loop optimal control for Problem (SLQ), we introduce the notion of weak closed-loop strategies. This notion is a slight extension of the closed-loop strategy developed in [14,13]. We shall prove that as long as Problem (SLQ) is open-loop solvable, there always exists a weak closed-loop strategy whose outcome is an open-loop optimal control. Note that the open-loop optimal control might not be unique; in that case we are able to represent one of them in state feedback form.
The rest of the paper is organized as follows. In Section 2, we collect some preliminary results and introduce a few elementary notions for Problem (SLQ). Section 3 is devoted to the study of open-loop solvability by a perturbation method. In Section 4, we show how to obtain a weak closed-loop optimal strategy and establish the equivalence between open-loop and weak closed-loop solvability. An example is presented in Section 5 to illustrate the results obtained.

Preliminaries
Throughout this paper, as in the previous section, M⊤ stands for the transpose of a matrix M, tr(M) for the trace of M, and R^{n×m} for the Euclidean space of n × m real matrices, endowed with the Frobenius inner product ⟨M, N⟩ → tr[M⊤N]. We denote by I_n the identity matrix of size n and by |M| the Frobenius norm of a matrix M. Let S^n be the subspace of R^{n×n} consisting of symmetric matrices. For M, N ∈ S^n, we use the notation M ≥ N (respectively, M > N) to indicate that M − N is positive semidefinite (respectively, positive definite). Let [t, T] be a subinterval of [0, ∞) and H a Euclidean space (which could be R^n, R^{n×m}, S^n, etc.). We further introduce the following spaces of functions and processes: To guarantee the well-posedness of the state equation (1.1), we adopt the following assumption: (A1). The coefficients and the nonhomogeneous terms of (1.1) satisfy The following result, whose proof can be found in [14, Proposition 2.1], establishes the well-posedness of the state equation under assumption (A1).
Moreover, there exists a constant K > 0, independent of (t, x) and u(·), such that To ensure that the random variables in the cost functional (1.2) are integrable, we assume the following: (A2). The weighting coefficients in the cost functional satisfy Remark 2.2. Suppose that (A1) holds. Then, according to Lemma 2.1, for any initial pair (t, x) ∈ [0, T) × R^n and any control u(·) ∈ U[t, T], equation (1.1) admits a unique (strong) solution X(·) ≡ X(·; t, x, u(·)) which belongs to the space L²_F(Ω; C([t, T]; R^n)). If, in addition, (A2) holds, then the random variables on the right-hand side of (1.2) are integrable and hence Problem (SLQ) is well-posed. It is worth pointing out that we do not impose any positive-definiteness/nonnegativity conditions on Q(·), R(·), and G. We now recall some basic notions of stochastic LQ optimal control problems.
is called an open-loop optimal control for (t, x).

(ii) (uniquely) open-loop solvable if it is (uniquely) open-loop solvable at any initial pair (t, x) ∈ [0, T) × R^n.
Let Θ : [t, T] → R^{m×n} be a deterministic function and v : [t, T] × Ω → R^m be an F-progressively measurable process.
The set of all closed-loop strategies is denoted accordingly. In the above, X*(·) is the solution to the closed-loop system under (Θ*(·), v*(·)) (with the argument s suppressed in the coefficients and nonhomogeneous terms): and X(·) is the solution to the following closed-loop system under (Θ(·), v(·)): (2.4)

(iii) If, for any t ∈ [0, T), a closed-loop optimal strategy (uniquely) exists on [t, T], we say that Problem (SLQ) is (uniquely) closed-loop solvable.
From [14, Proposition 3.3], we know that (Θ*(·), v*(·)) is a closed-loop optimal strategy on [t, T] if and only if its outcome

u*(·) ≡ Θ*(·)X*(·) + v*(·) (2.5)

is an open-loop optimal control of Problem (SLQ) for the corresponding initial pair (t, X*(t)). As Example 1.1 shows, however, an open-loop optimal control need not be the outcome of some (regular) closed-loop optimal strategy. Since the closed-loop representation is very important in practice, we naturally ask: Is it possible, for some (not necessarily all) open-loop optimal controls, to find a less regular closed-loop representation? Motivated by this, we introduce the following notion.
Definition 2.5. Let Θ : [t, T) → R^{m×n} be a locally square-integrable deterministic function and v : [t, T) × Ω → R^m be a locally square-integrable F-progressively measurable process; that is, Θ(·) and v(·) are such that for any T′ ∈ [t, T),

∫_t^{T′} |Θ(s)|² ds < ∞, E ∫_t^{T′} |v(s)|² ds < ∞.

The set of all weak closed-loop strategies is denoted by Q_w[t, T]. In the above, X(·) is the solution of the weak closed-loop system (2.6), and X*(·) is the solution to the weak closed-loop system (2.6) corresponding to (t, x) and (Θ*(·), v*(·)).
Similar to the case of closed-loop solvability, we have the following equivalence: a weak closed-loop strategy (Θ*(·), v*(·)) ∈ Q_w[t, T] is weakly closed-loop optimal on [t, T) if and only if its outcome

u*(·) ≡ Θ*(·)X*(·) + v*(·) (2.8)

is an open-loop optimal control of Problem (SLQ) for the corresponding initial pair. We conclude this section with some existing results on the closed-loop and open-loop solvabilities of Problem (SLQ), which play a basic role in our subsequent analysis. For proofs and a full discussion of these results, we refer the reader to Sun-Li-Yong [13].
if and only if the following two conditions hold:
To prove Theorem 3.1, we need the following lemma.

The implication (iii) ⇒ (ii) is trivially true.
Finally, we prove the implication (ii) ⇒ (iii). The proof is divided into two steps.
To verify this, it suffices to show that every weakly convergent subsequence of {u_ε(·)}_{ε>0} has the same weak limit, which is then an open-loop optimal control of Problem (SLQ) for (t, x). Let u*_i(·), i = 1, 2, be the weak limits of two different weakly convergent subsequences {u_{i,ε_k}(·)}_{k=1}^∞ (i = 1, 2) of {u_ε(·)}_{ε>0}. The same argument as in the proof of (ii) ⇒ (i) shows that both u*_1(·) and u*_2(·) are optimal for (t, x). Thus, recalling that the mapping u(·) → J(t, x; u(·)) is convex, we have

J(t, x; [u*_1(·) + u*_2(·)]/2) ≤ (1/2) J(t, x; u*_1(·)) + (1/2) J(t, x; u*_2(·)) = V(t, x).

This means that [u*_1(·) + u*_2(·)]/2 is also optimal for Problem (SLQ) with respect to (t, x). Then we can repeat the argument employed in the proof of (i) ⇒ (ii), replacing v*(·) by [u*_1(·) + u*_2(·)]/2, to obtain (see (3.14)) Taking inferior limits then yields Adding the above two inequalities and then multiplying by 2, we get or equivalently (by shifting the integral on the right-hand side to the left-hand side), It follows that u*_1(·) = u*_2(·), which establishes the claim.

Remark 3.3.
A similar result first appeared in [13], asserting that if Problem (SLQ) is open-loop solvable at (t, x), then the limit of any weakly/strongly convergent subsequence of {u_ε(·)}_{ε>0} is an open-loop optimal control for (t, x). Our result sharpens that of [13] by showing that the family {u_ε(·)}_{ε>0} itself converges strongly when Problem (SLQ) is open-loop solvable. This improvement has at least two advantages. First, it serves as a crucial bridge to the weak closed-loop solvability presented in the next section. Second, it is much more convenient for computational purposes, because no subsequence extraction is required.
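The perturbation approach has a transparent finite-dimensional analogue, which we sketch below as our own illustration (the matrix M, vector b, and helper names are hypothetical, not from the paper). Minimizing a convex quadratic u ↦ ⟨Mu, u⟩ − 2⟨b, u⟩ with a singular M ⪰ 0 is "solvable" exactly when b lies in the range of M; adding the strongly convex perturbation ε|u|² yields the unique minimizer u_ε = (M + εI)^{-1}b, and the whole family u_ε (no subsequence extraction) converges to the minimum-norm minimizer M†b as ε → 0.

```python
# Hypothetical 2x2 example: M = diag(1, 0) is singular and positive
# semidefinite, b = (1, 0) lies in the range of M, so minimizers of
# <Mu,u> - 2<b,u> exist (namely every vector of the form (1, t)).
M_diag = (1.0, 0.0)
b = (1.0, 0.0)

def u_eps(eps):
    """Unique minimizer of <Mu,u> - 2<b,u> + eps|u|^2 (M diagonal),
    obtained from the optimality condition (M + eps I) u = b."""
    return tuple(bi / (mi + eps) for mi, bi in zip(M_diag, b))

# Minimum-norm minimizer M^dagger b (Moore-Penrose, diagonal case):
u_star = tuple(bi / mi if mi != 0.0 else 0.0 for mi, bi in zip(M_diag, b))

for eps in (1e-1, 1e-3, 1e-6):
    print(eps, u_eps(eps))   # converges to u_star = (1.0, 0.0)
```

The second coordinate of u_ε is identically zero, which illustrates why the limit is the minimum-norm minimizer rather than an arbitrary one.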

Weak Closed-Loop Solvability
In this section, we establish the equivalence between open-loop and weak closed-loop solvability of Problem (SLQ). We shall show that Θ_ε(·) and v_ε(·), defined by (3.6) and (3.7), converge locally on [0, T), and that the limit pair (Θ*(·), v*(·)) is a weak closed-loop optimal strategy. We emphasize that, in general, the limits Θ*(·) and v*(·) are merely locally square-integrable over [0, T) and cannot be obtained directly by solving the associated Riccati equation and BSDE (see Examples 1.1 and 5.1).
We are now ready to state and prove the main result of this section, which establishes the equivalence between open-loop and weak closed-loop solvability of Problem (SLQ). {u*(s); t ≤ s ≤ T} of Problem (SLQ) (for the initial pair (t, x)). Let {X*(s); t ≤ s ≤ T} be the corresponding optimal state process; that is, X* is the solution to If we can show that

u*(s) = Θ*(s)X*(s) + v*(s), t ≤ s < T, (4.4)

then (Θ*(·), v*(·)) is clearly a weak closed-loop optimal strategy of Problem (SLQ) on [t, T). To justify this, we note first that, by Lemma 2.1, where {X_ε(s); t ≤ s ≤ T} is the solution to equation (3.5). Second, by Propositions 4.2 and 4.3, It follows that for any 0 < T′ < T, Recall that u_ε(s) = Θ_ε(s)X_ε(s) + v_ε(s), t ≤ s ≤ T, converges strongly to u*(s), t ≤ s ≤ T, in L²_F(t, T; R^m) as ε → 0. Thus, (4.4) must hold. The above argument shows that open-loop solvability implies weak closed-loop solvability. The reverse implication is obvious by Definition 2.5.

An Example
In this section we present an example in which the LQ problem is open-loop solvable (and hence weakly closed-loop solvable) but not closed-loop solvable. This example illustrates the procedure for finding weak closed-loop optimal strategies. We point out that neither Θ * (·) nor v * (·) is square-integrable on [0, 1). Indeed,
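The phenomenon of local-but-not-global square integrability can be checked on a model function. As a purely illustrative stand-in (θ below is our own choice, not the actual Θ*(·) or v*(·) of this example), take θ(s) = 1/(1 − s) on [0, 1): then ∫_0^{T′} θ(s)² ds = 1/(1 − T′) − 1, which is finite for every T′ < 1 but diverges as T′ → 1, so θ(·) is locally square-integrable on [0, 1) without being square-integrable on the whole interval. A quick numerical check:

```python
def sq_int(T_prime, n=200000):
    """Midpoint-rule approximation of the integral of (1/(1-s))^2
    over [0, T_prime], for 0 < T_prime < 1."""
    h = T_prime / n
    return sum(h / (1.0 - (k + 0.5) * h) ** 2 for k in range(n))

for Tp in (0.5, 0.9, 0.99):
    exact = 1.0 / (1.0 - Tp) - 1.0   # closed form of the integral
    print(Tp, sq_int(Tp), exact)     # finite for every T' < 1 ...
# ... but 1/(1 - T') - 1 grows without bound as T' -> 1, so the
# function fails to be square-integrable on all of [0, 1).
```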