Forward Backward SDEs in Weak Formulation

Although it has been under development for more than two decades, the theory of forward backward stochastic differential equations (FBSDEs) is still far from complete. In this paper, we take one step back and investigate the formulation of FBSDEs. Motivated by several considerations, both in theory and in applications, we propose to study FBSDEs in weak formulation, rather than the strong formulation used in the standard literature. That is, the backward SDE is driven by the forward component, instead of by the Brownian motion noise. We establish the Feynman-Kac formula for FBSDEs in weak formulation, both in the classical and in the viscosity sense. Our new framework is especially efficient when the diffusion part of the forward equation involves the $Z$-component of the backward equation.


Introduction
In the standard literature, a coupled FBSDE takes the following form:
$$
\begin{cases}
X_t = x + \int_0^t b(s,\Theta_s)\,ds + \int_0^t \sigma(s,\Theta_s)\,dB_s,\\
Y_t = g(X_T) + \int_t^T f(s,\Theta_s)\,ds - \int_t^T Z_s\,dB_s,
\end{cases}
\quad t\in[0,T],\ \mathbb{P}_0\text{-a.s.} \tag{1.1}
$$
where $\Theta := (X, Y, Z)$ is the solution triplet, $B$ is a Brownian motion under the probability measure $\mathbb{P}_0$, and the coefficients $b$, $\sigma$, $f$, and $g$ are $\mathbb{F}^B$-progressively measurable in all variables. There have been many publications on the subject, see e.g. Antonelli [1], Ma, Protter & Yong [17], Hu & Peng [14], Yong [30], Peng & Wu [23], Pardoux & Tang [22], Delarue [7], Zhang [32], Ma, Wu, Zhang & Zhang [18], as well as the monograph Ma & Yong [19]. However, the theory is still far from complete. The existing methods in the literature provide quite different sets of sufficient conditions, and the unified approach proposed in [18] works only in the one dimensional case, under rather technical conditions. Even worse, many FBSDEs arising from applications do not fit in any existing works.
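To understand the problem better, we take one step back and examine its formulation. Is (1.1) indeed the "right" formulation of the problem? As we will justify below, we feel that the following alternative form, which we call FBSDEs in weak formulation, or simply weak FBSDEs, is more appropriate in many situations:
$$
\begin{cases}
X_t = x + \int_0^t b(s,\Theta_s)\,ds + \int_0^t \sigma(s,\Theta_s)\,dB_s,\\
Y_t = g(X_T) + \int_t^T f(s,\Theta_s)\,ds - \int_t^T Z_s\,dX_s,
\end{cases}
\quad t\in[0,T],\ \mathbb{P}_0\text{-a.s.} \tag{1.2}
$$
To indicate the difference, we denote by $\Theta^S := (X^S, Y^S, Z^S)$ the solution to (1.1) and $\Theta^W := (X^W, Y^W, Z^W)$ the solution to (1.2), where the superscripts $S$ and $W$ stand for strong and weak, respectively. We note that in (1.2) the stochastic integration in the backward equation is against $dX_t$, not against the Brownian motion $dB_t$. In the case that
$$
\text{the mapping } z \mapsto z\sigma(t,x,y,z) \text{ has an inverse function } \psi(t,x,y,z), \tag{1.3}
$$
by denoting $\tilde Z := Z^W\sigma(t,\Theta^W_t)$ and thus $Z^W_t = \psi(t, X^W_t, Y^W_t, \tilde Z_t)$, one can easily check that $(X^W, Y^W, \tilde Z)$ is a solution to the following FBSDE in strong formulation:
$$
\begin{cases}
X^W_t = x + \int_0^t \tilde b(s,X^W_s,Y^W_s,\tilde Z_s)\,ds + \int_0^t \tilde\sigma(s,X^W_s,Y^W_s,\tilde Z_s)\,dB_s,\\
Y^W_t = g(X^W_T) + \int_t^T \tilde f(s,X^W_s,Y^W_s,\tilde Z_s)\,ds - \int_t^T \tilde Z_s\,dB_s,
\end{cases} \tag{1.4}
$$
where, for $\theta := (t, x, y, z)$, $\tilde b(\theta) := b(t, x, y, \psi(\theta))$, $\tilde\sigma(\theta) := \sigma(t, x, y, \psi(\theta))$, $\tilde f(\theta) := f(t, x, y, \psi(\theta)) - \psi(\theta)\tilde b(\theta)$.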
When $\sigma = \sigma(t, x, y)$ is independent of $z$ and $\sigma > 0$, it is clear that $\psi(\theta) = \frac{z}{\sigma(t,x,y)}$. However, when $\sigma$ depends on $z$, typically we do not have the inverse function $\psi$.
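The passage from the weak form (1.2) to the strong form (1.4) can be made explicit by substituting the forward dynamics into the $dX$-integral. The following is a sketch in the one dimensional case, assuming (1.2) has backward equation $Y_t = g(X_T) + \int_t^T f(s,\Theta_s)\,ds - \int_t^T Z_s\,dX_s$, suppressing the superscript $W$, and writing $\theta_s := (s, X_s, Y_s, \tilde Z_s)$ (a shorthand introduced here only for illustration):

```latex
\begin{align*}
\int_t^T Z_s\,dX_s
  &= \int_t^T Z_s\, b(s,\Theta_s)\,ds + \int_t^T Z_s\,\sigma(s,\Theta_s)\,dB_s
   && \text{(since } dX_s = b\,ds + \sigma\,dB_s\text{)}\\
  &= \int_t^T \psi(\theta_s)\,\tilde b(\theta_s)\,ds + \int_t^T \tilde Z_s\,dB_s
   && \text{(since } \tilde Z = Z\sigma,\ Z_s = \psi(\theta_s)\text{)},\\
Y_t &= g(X_T) + \int_t^T \Big[f\big(s,X_s,Y_s,\psi(\theta_s)\big) - \psi(\theta_s)\,\tilde b(\theta_s)\Big]ds
      - \int_t^T \tilde Z_s\,dB_s .
\end{align*}
```

The bracket in the last line is exactly $\tilde f(\theta_s)$, which explains the extra term $-\psi\,\tilde b$ in the transformed generator.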
We justify the weak formulation (1.2) in four aspects. Firstly, in the option pricing and hedging theory, which is one of the main applications of BSDEs and FBSDEs, let $S$ denote the stock price driven by a Brownian motion $B$. For a hedging portfolio $h$ with wealth value $V$, the self-financing condition gives $dV_t = [\cdots]dt + h_t\,dS_t$. Note that $(S, V)$ here correspond to $(X, Y)$ in the FBSDE, and the stochastic integration in $dV$ is against $dS_t$, not $dB_t$. In simple models like the Black-Scholes model, $S$ and $B$ generate the same filtration, so this difference is not crucial and there is no problem in using the strong formulation.
However, for the superhedging problem in incomplete markets, for example, one has to use $dS_t$ to superhedge, and then the weak formulation is indeed more appropriate. In fact, in many practical applications, $X$ is the state process we observe and $B$ is the noise used to model the distribution of $X$. Note that one rationale for using Brownian motion is the central limit theorem, where the convergence is in distribution; in this case $B$ may not even exist physically. So in these applications the weak formulation is more appropriate.
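As a concrete instance of the remark on the Black-Scholes model, here is a sketch under assumptions that are ours rather than the text's: zero interest rate, dynamics $dS_t = \sigma_0 S_t\,dB_t$ with a constant $\sigma_0 > 0$, payoff $\xi = g(S_T)$, and a smooth price function $u$.

```latex
\begin{align*}
& V_t = u(t,S_t), \qquad \partial_t u + \tfrac12 \sigma_0^2 x^2\,\partial^2_{xx} u = 0,\quad u(T,x)=g(x),\\
& dV_t = \Big[\partial_t u + \tfrac12\sigma_0^2 S_t^2\,\partial^2_{xx}u\Big](t,S_t)\,dt + \partial_x u(t,S_t)\,dS_t
       = \partial_x u(t,S_t)\,dS_t,\\
& \text{weak form:}\quad dV_t = h_t\,dS_t,\qquad h_t = \partial_x u(t,S_t)\ \text{(the Delta)},\\
& \text{strong form:}\quad dV_t = Z_t\,dB_t,\qquad Z_t = \sigma_0 S_t\,\partial_x u(t,S_t) = \sigma_0 S_t\,h_t .
\end{align*}
```

Since $\sigma_0 S_t > 0$ in this model, one can pass freely between $h$ and $Z$, which is why the two formulations carry the same information here.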
Secondly, in the Markovian setting and in the case $\sigma = \sigma(t, x, y)$, the FBSDE (1.1) is associated with the following quasilinear PDE with terminal condition $u(T, x) = g(x)$:
$$
\partial_t u + \tfrac12 \sigma^2(t, x, u)\,\partial^2_{xx} u + b\big(t, x, u, \partial_x u\,\sigma(t, x, u)\big)\partial_x u + f\big(t, x, u, \partial_x u\,\sigma(t, x, u)\big) = 0, \tag{1.5}
$$
through the so called nonlinear Feynman-Kac formula:
$$
Y^S_t = u(t, X^S_t),\quad Z^S_t = \partial_x u(t, X^S_t)\,\sigma\big(t, X^S_t, u(t, X^S_t)\big). \tag{1.6}
$$
However, when $\sigma$ depends on $Z$, the PDE will involve the inverse function $\psi$ in (1.3), which typically does not exist. The weak formulation (1.2), instead, corresponds to the following more natural PDE even in the case $\sigma = \sigma(t, x, y, z)$:
$$
\partial_t u + \tfrac12 \sigma^2(t, x, u, \partial_x u)\,\partial^2_{xx} u + f(t, x, u, \partial_x u) = 0, \tag{1.7}
$$
and the nonlinear Feynman-Kac formula is also simpler:
$$
Y^W_t = u(t, X^W_t),\quad Z^W_t = \partial_x u(t, X^W_t). \tag{1.8}
$$
In particular, in the option pricing and hedging theory, the representation (1.8) means exactly that $Z^W$ is the Delta-hedging portfolio $h$. The case that $\sigma$ depends on $Z$ is precisely where the strong and weak formulations differ. For example, the following well known counterexample in strong formulation:
$$
X_t = x + \int_0^t Z_s\,dB_s,\quad Y_t = X_T - \int_t^T Z_s\,dB_s, \tag{1.9}
$$
has infinitely many solutions. However, the corresponding weak FBSDE is wellposed in the sense of Example 2.1 and Remark 2.2 below:
$$
X_t = x + \int_0^t Z_s\,dB_s,\quad Y_t = X_T - \int_t^T Z_s\,dX_s. \tag{1.10}
$$
Thirdly, as another major application, many FBSDEs arise from stochastic control problems through the stochastic maximum principle. However, a stochastic control problem typically does not have an optimal control in strong formulation. Indeed, even a simple control problem may fail to have an optimal control in strong formulation, while the corresponding control problem in weak formulation has an optimal control under mild and natural conditions. Consequently, the associated FBSDE will have a weak solution but no strong solution. It is more natural and convenient to write the FBSDE in weak formulation when one studies weak solutions.
Fourthly, again for stochastic control problems, there are typically two approaches in the literature: the dynamic programming principle and the stochastic maximum principle.
Both approaches lead to certain Hamiltonians, but the two Hamiltonians for the same control problem look quite different. As we observe, if one uses the weak FBSDE as the adjoint equation involved in the stochastic maximum principle, then the Hamiltonian will coincide with the one derived from the dynamic programming principle. In this sense, the weak formulation provides an intrinsic connection between the two approaches.
After carrying out the above motivations in detail, we define weak solutions for weak FBSDEs and the equivalent forward backward martingale problems. By utilizing the recently developed theory of path dependent PDEs, we establish the nonlinear Feynman-Kac formula for path dependent weak FBSDEs. That is, if the associated path dependent PDE has a classical solution, then the weak FBSDE has a (strong) solution.
The main goal of this paper is to apply the viscosity solution method to establish the uniqueness of weak solutions of the weak FBSDE. We shall follow the arguments in Ma, Zhang, & Zheng [21] and Ma & Zhang [20], which study weak solutions for FBSDEs in strong formulation in the case that $\sigma$ is independent of $z$. Our arguments rely heavily on the regularity results for the PDE. Since such regularity results for path dependent PDEs are not available in the literature, in this part we shall restrict ourselves to the Markovian case.
The main idea is to study the so called nodal sets of the weak FBSDEs, whose upper and lower bounds provide a viscosity subsolution and supersolution of the PDE. Then, provided the comparison principle for viscosity solutions of the PDE holds, we obtain the uniqueness of weak solutions to the weak FBSDE. We remark that, as in [21,20], the problem is equivalent to the so called martingale problem; see also Costantini & Kurtz [5] for the application of viscosity solution methods to martingale problems in an abstract framework.
The rest of the paper is organized as follows. In Section 2 we motivate weak FBSDEs.
In Section 3 we define weak solutions and establish the nonlinear Feynman-Kac formula, provided the path dependent PDE has a classical solution. In Section 4 we prove the existence and uniqueness of weak solutions for Markovian weak FBSDEs. Finally, in the Appendix we provide some counterexamples in control theory, which help to motivate the weak formulation, and provide detailed arguments for the required regularity of the PDE.

Some motivations for weak FBSDEs
In this section we provide some heuristic motivations for the weak FBSDE (1.2). To simplify the presentation, we restrict ourselves to the Markovian case in the one dimensional setting.

Applications in option pricing and hedging theory
Consider a financial market with a risky asset $S$ and a risk free asset with interest rate $r = 0$ (for simplicity). Assume $S$ satisfies the SDE (2.1) driven by $B$, where $B$ is a $\mathbb{P}_0$-Brownian motion (so we assume $\mathbb{P}_0$ is a risk neutral measure). Given a portfolio $(\lambda, h)$ with value process $V_t = \lambda_t + h_t S_t$, the self-financing condition states that
$$
dV_t = h_t\,dS_t. \tag{2.2}
$$
Now given a European type option with payoff $\xi$ at terminal time $T$, we say a self-financing portfolio $(\lambda, h)$ is a hedging portfolio if $V_T = \xi$, $\mathbb{P}_0$-a.s. This, combined with (2.2), leads to a backward SDE against $dS_t$:
$$
V_t = \xi - \int_t^T h_s\,dS_s. \tag{2.3}
$$
Then (2.1)-(2.3) become a decoupled weak FBSDE with solution $(X, Y, Z) = (S, V, h)$. We remark that BSDE (2.3) can be rewritten in strong formulation. For the superhedging problem in incomplete markets, one uses $dS_t$ to superhedge:
$$
V_0 := \inf\Big\{y : \exists h \text{ such that } y + \int_0^T h_s\,dS_s \ge \xi,\ \mathbb{P}_0\text{-a.s.}\Big\}. \tag{2.5}
$$
This is in the spirit of the weak FBSDE. Indeed, one can formulate it as a reflected BSDE in weak formulation, which is beyond the scope of this paper and is left for future research.
An alternative explanation for the nonexistence of a solution to the weak FBSDE in the above situation is that $X$ does not have the martingale representation property for $\mathbb{F}^B$-martingales.
In this case, for theoretical interest we may relax BSDE (2.3) to (2.6) by applying the extended martingale representation theorem, see e.g. Protter [25], which introduces an orthogonal martingale $N$ such that $N_0 = 0$ and $d\langle S, N\rangle_t = 0$. Then (2.6) will have a unique solution $(V, h, N)$.

Nonlinear Feynman-Kac formula
As is well known, in the case $\sigma = \sigma(t, x, y)$, the strong FBSDE (1.1) is associated with the quasilinear PDE (1.5) via the nonlinear Feynman-Kac formula (1.6). The problem becomes tricky when $\sigma = \sigma(t, x, y, z)$ because the PDE will involve the inverse function $\psi$ in (1.3), which typically does not exist. The weak FBSDE (1.2) is associated with the quasilinear PDE (1.7), which is more natural at least in the following aspects:
• $\sigma$ may depend on $z$ and the PDE does not involve the inverse function $\psi$ in (1.3).
• The component $Z$ of the solution corresponds to $\partial_x u$ directly, rather than $\partial_x u\,\sigma$. In particular, in the application to the option pricing and hedging theory, the $Z$ in weak formulation corresponds directly to the Delta-hedging portfolio.
• The PDE is more natural in the sense that the coefficients $\sigma$, $f$ depend directly on $\partial_x u$.
• It is more convenient to study weak solutions of the weak FBSDE, which is closely related to the viscosity solution of the PDE (1.7), than that of the strong FBSDE.
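The link between the weak FBSDE (1.2) and the PDE (1.7) can be checked formally with Itô's formula. The following is a sketch in the one dimensional Markovian case, assuming $u$ is smooth, $\sigma \neq 0$, and that (1.2) reads $dX_t = b\,dt + \sigma\,dB_t$, $dY_t = -f\,dt + Z_t\,dX_t$ in differential form (coefficients evaluated at $(t, X_t, Y_t, Z_t)$); we look for a solution of the form $Y_t = u(t, X_t)$.

```latex
\begin{align*}
du(t,X_t) &= \Big[\partial_t u + \tfrac12\sigma^2\,\partial^2_{xx}u\Big](t,X_t)\,dt + \partial_x u(t,X_t)\,dX_t
  && \text{(It\^o's formula, } d\langle X\rangle_t=\sigma^2dt\text{)}\\
dY_t &= -f\,dt + Z_t\,dX_t
  && \text{(backward equation of the weak FBSDE)}\\
\text{martingale parts:}\quad & \partial_x u\,\sigma = Z_t\,\sigma
  \ \Longrightarrow\ Z_t = \partial_x u(t,X_t),\\
\text{remaining } dt \text{ terms:}\quad & \partial_t u + \tfrac12\sigma^2(t,x,u,\partial_x u)\,\partial^2_{xx}u + f(t,x,u,\partial_x u) = 0 .
\end{align*}
```

The drift $b$ never appears in the PDE because it is absorbed into $dX_t$ on both sides, and $Z$ is identified with $\partial_x u$ without inverting $z \mapsto z\sigma$.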
To see the advantage of the weak formulation more directly in the case that $\sigma$ depends on $z$, let us consider the counterexample (1.9). It is well known that (1.9) has infinitely many solutions. Indeed, for any $Z \in L^2(\mathbb{F}^B, \mathbb{P}_0)$, $X_t := Y_t := x + \int_0^t Z_s\,dB_s$ is a solution to (1.9). However, the weak FBSDE (1.10) is wellposed in the following sense. Note that $Z \in L^4(\mathbb{P}_0)$ provides the integrability under which $X$, $Y$ are $\mathbb{P}_0$-martingales. We shall comment on the requirement $Z \neq 0$ in Remark 2.2 below.
Proof It is clear that
$$
X_t = Y_t = x + B_t,\quad Z_t = 1 \tag{2.7}
$$
is a solution to (1.10). We next show that it is the unique solution such that $Z \in \mathcal{Z}$.
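For any $(t, x, y)$ and $Z \in L^4(\mathbb{P}_0)$, denote
$$
X^{t,x,Z}_s := x + \int_t^s Z_r\,dB_r,\quad Y^{t,x,y,Z}_s := y + \int_t^s Z_r\,dX^{t,x,Z}_r,\quad t \le s \le T,
$$
and define
$$
\overline u(t,x) := \inf\big\{y : \exists Z \text{ such that } Y^{t,x,y,Z}_T \ge X^{t,x,Z}_T,\ \mathbb{P}_0\text{-a.s.}\big\},\quad
\underline u(t,x) := \sup\big\{y : \exists Z \text{ such that } Y^{t,x,y,Z}_T \le X^{t,x,Z}_T,\ \mathbb{P}_0\text{-a.s.}\big\}. \tag{2.8}
$$
Note that both $X^{t,x,Z}$ and $Y^{t,x,y,Z}$ are $\mathbb{P}_0$-martingales; then $Y^{t,x,y,Z}_T \ge X^{t,x,Z}_T$, $\mathbb{P}_0$-a.s. implies $y = \mathbb{E}^{\mathbb{P}_0}[Y^{t,x,y,Z}_T] \ge \mathbb{E}^{\mathbb{P}_0}[X^{t,x,Z}_T] = x$, and thus $\overline u(t, x) \ge x$. Similarly, $\underline u(t, x) \le x$ and thus $\underline u(t, x) \le x \le \overline u(t, x)$. On the other hand, for any solution $(X, Y, Z)$ to (1.10), by the definition of $\overline u(t, x)$ and $\underline u(t, x)$ we see that $\overline u(t, X_t) \le Y_t \le \underline u(t, X_t)$. Thus $\overline u(t, x) = \underline u(t, x) = u(t, x) := x$ and $Y_t = u(t, X_t) = X_t$. This implies further that $Z_t = |Z_t|^2$. Since $Z \neq 0$, we see that $Z = 1$ and hence (2.7) is the unique solution.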
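The last step of the proof can be spelled out as follows. This is a sketch assuming that (1.10) has forward equation $dX_t = Z_t\,dB_t$ and backward equation $dY_t = Z_t\,dX_t$, which is our reading of the example with $b = 0$, $\sigma = z$, $f = 0$:

```latex
\begin{align*}
Y_t = X_t \ \Longrightarrow\quad & dY_t = dX_t = Z_t\,dB_t,\\
\text{backward equation:}\quad & dY_t = Z_t\,dX_t = |Z_t|^2\,dB_t,\\
\text{hence}\quad & \int_0^T \big(Z_t - |Z_t|^2\big)\,dB_t = 0
  \ \Longrightarrow\ \mathbb{E}^{\mathbb{P}_0}\Big[\int_0^T \big|Z_t - |Z_t|^2\big|^2 dt\Big] = 0
  \ \Longrightarrow\ Z_t\,(1 - Z_t) = 0,\ dt\times d\mathbb{P}_0\text{-a.e.}
\end{align*}
```

So $Z$ takes values in $\{0, 1\}$, and excluding the value $0$ leaves $Z = 1$.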
Remark 2.2. (i) If we allow $Z = 0$, then the solution is not unique. Indeed, for any $Z$ satisfying $Z = |Z|^2$ (namely $Z$ takes values 0 and 1), it is clear that $X_t = Y_t = x + \int_0^t Z_s\,dB_s$ is a solution to the weak FBSDE (1.10). However, we note that even in this case, the relationship $Y_t = X_t$ still holds, and the decoupling function $u(t, x) = x$ is still unique. Moreover, not surprisingly, $u(t, x) = x$ is a solution to the PDE (1.7) corresponding to $b = 0$, $\sigma = z$, $f = 0$:
$$
\partial_t u + \tfrac12 |\partial_x u|^2\,\partial^2_{xx} u = 0,\quad u(T,x) = x.
$$
(ii) When $Z = 0$, this is exactly the case that $X$ has degenerate diffusion coefficient $\sigma$.
As we will see in the paper, the nondegeneracy of σ is crucial.
(iii) As we mentioned in (i), even if we allow Z = 0, the decoupling function u(t, x) = x is still unique. However, when X can be degenerate, That's why the uniqueness fails in this degenerate case.
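Both claims in Remark 2.2 (i) can be checked by hand. The sketch below assumes the PDE (1.7) with $b = 0$, $\sigma = z$, $f = 0$ reads $\partial_t u + \frac12|\partial_x u|^2\partial^2_{xx}u = 0$ with $u(T, x) = x$, and uses the particular choice $Z_t = \mathbf{1}_{\{t \le T/2\}}$, which is our own illustrative example of a $\{0,1\}$-valued $Z$:

```latex
\begin{align*}
u(t,x) = x:\quad & \partial_t u = 0,\ \ \partial_x u = 1,\ \ \partial^2_{xx}u = 0
  \ \Longrightarrow\ \partial_t u + \tfrac12|\partial_x u|^2\,\partial^2_{xx}u = 0,\qquad u(T,x) = x;\\
Z_t = \mathbf{1}_{\{t\le T/2\}}:\quad & Z_t = |Z_t|^2,\qquad
  X_t = x + \int_0^t Z_s\,dB_s = x + B_{t\wedge T/2},\\
 & Y_t = X_T - \int_t^T Z_s\,dX_s = X_T - \int_t^T |Z_s|^2\,dB_s = X_T - (X_T - X_t) = X_t .
\end{align*}
```

This solution differs from $Z = 1$, yet it still satisfies $Y_t = u(t, X_t) = X_t$; it is only on $(T/2, T]$, where the diffusion of $X$ vanishes, that $Z$ fails to equal $\partial_x u = 1$.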
To avoid the degeneracy issue, we may modify the example as follows.
is a solution to the following strong FBSDE: has a unique solution Here the uniqueness holds for Z ∈ L 2 (F B , P 0 ).
Proof (i) is obvious, and (ii) follows the same arguments as in Example 2.1. In particular, the weak BSDE can be rewritten as: and then we see that Z = 1 is the unique fixed point of: σ(z) = zσ(z), thanks to the nondegeneracy of σ. Moreover, since σ is bounded, then , so the uniqueness holds for Z ∈ L 2 (F B , P 0 ).

Stochastic control in strong formulation
Consider a simple stochastic control problem in strong formulation: Here the admissible controls α are F B -progressively measurable. Note that We first use the stochastic maximum principle to derive an associated FBSDE. Let ∆α be given such that α + ε∆α ∈ A for any ε ∈ [0, 1]. Denote One can easily see that where b ′ , σ ′ , f ′ refer to the derivatives with respect to α. Introduce an adjoint BSDE: By applying Itô formula onỸ α t ∇X α,∆α t we obtain Now assume α * ∈ A is an interior point of A and is an optimal control. Then ∇V α * ,∆α 0 ≤ 0 for arbitrary ∆α. This implies Assume further that (2.12) determines an α * : α * t = I(t,Ỹ α * t ,Z α * t ) for a function I. Then combining (2.9)-(2.11), we obtain the following coupled FBSDE in strong formulation: (2.13) However, the above FBSDE is typically not covered by the existing methods in the literature, especially since σ depends onZ. We remark that all the existing works on weak solutions of (strong) FBSDEs do not allow σ depending on Z, see e.g. Antonelli [21], and Ma & Zhang [20].
We thus turn to weak FBSDE for which we can study weak solutions more conveniently.
Rewrite the adjoint BSDE (2.11) in the spirit of weak formulation: (2.14) One can easily see that its solution is: again assuming σ > 0, (2.15) and the optimality condition (2.12) becomeŝ Assume the above determines an optimal α * : for a functionÎ. Then (2.13) becomes a (multidimensional) FBSDE in weak formulation: (2.17) Remark 2.4. When the weak FBSDE (2.17) has no strong solution, but only weak solution, the stochastic optimization problem (2.9) in strong formulation still does not have optimal control. To obtain the existence of optimal control, it is more appropriate to study the optimization problem in weak formulation, see Subsection 2.3.3 below.

Consistency with dynamic programming principle
As is well known, another standard approach for stochastic control problem is the dynamic programming principle, which focuses more on the value function. Assume the control α takes values in A. Then V 0 = u(0, 0), where u satisfies the following HJB equation: where H(t, z, γ) := sup where α * t = I(t,Ỹ t ,Z t ) is the optimal control. On the other hand, notice that the optimality condition (2.12) can be viewed as the first order condition of However, we have the following discrepancy which has already been noticed in [31]: This discrepancy is due to the fact thatZ involves σ(t, α) and thus twisted the optimization in the Hamiltonian. It will disappear if we consider the weak FBSDE (2.17). Indeed, in this case the optimality condition (2.16) can be viewed as the first order condition of Then we have the desired identity: This is reflected in (2.22). In particular, we haveŶ t = Z t in this model.
(ii) The derivation of (2.16) requires the differentiation of the coefficients b, σ, f in α.
However, such differentiation is not needed for the optimization of the Hamiltonian in (2.21). In fact, one may determineÎ by the optimal arguments in (2.21), and then formally derive the same FBSDE (2.17). These arguments are in the line of dynamic programming principle, rather than stochastic maximum principle.

Stochastic drift control under weak formulation
To understand the weak FBSDE (2.17) better, we consider a special case that σ = 1.
The general case with diffusion control will involve the second order BSDE introduced in Soner, Touzi, & Zhang [27]. In this case (2.16) becomeŝ Then the optimal control takes the form α * t =Î(t,Ŷ α * t ) and thus FBSDE (2.17) becomes Moreover, note that (2.24) is the first order condition of the following optimization problem: Then, together with (2.25) and under appropriate technical conditions, (2.26) leads to (2.28) The FBSDE (2.28) can be understood a lot easier if we use weak formulation for the control problem:V By comparison of BSDE we see immediately thatV 0 =Ȳ 0 and α * t :=Î(t,Z t ) is an optimal control of (2.29), for the sameÎ in (2.28). Now together with the definition of B α and X = B, we may rewrite (2.31) as In the spirit of weak solution as we will introduce in the next section, this is equivalent to (2.28). So in this sense, the weak FBSDE (2.28), or the more general one (2.17), is more in the spirit of weak formulation. (ii) There are many situations that the optimal control in weak formulation exists but that in strong formulation does not. See some examples in Appendix.
(iii) The difference between strong formulation and weak formulation becomes more crucial when one considers zero sum stochastic differential games, see Hamadene & Lepeltier [13] and Pham & Zhang [24].

Weak solutions of FBSDEs and Feynman-Kac formula
Our objective is the following weak FBSDE: and all other processes and functions have appropriate dimensions. The coefficients b, σ, f, g may depend on the paths of X, among them b, σ, f are F X -progressively measurable in all variables, and g is F X T -measurable.
Given a probability space (Ω, F, P), let L 0 (F) denote the set of F-progressively measurable processes with appropriate dimensions. For p, q ≥ 1, denote S p (F, P) := X ∈ L 0 (F) : X is continuous, P-a.s. and E P sup Throughout this paper, we shall assume (ii) f (t, x, 0, 0), g(x) have polynomial growth in x := sup 0≤t≤T |x t |, and f is uniformly Lipschitz continuous in (y, z). FBSDEs, both for practical considerations and for theoretical reasons, it is more natural that the coefficients depend on X. However, in a more general setting, for example in the incomplete market with observable noise as in Subsection 2.1, we may allow the coefficients to depend on both X and B. The problem will become harder in this case. In this paper we restrict to the case that the coefficients do not depend on B.
(ii) As explained in Section 2.1, the presence of N is due to the fact that X may not satisfy the martingale representation property.
(ii) We say a weak solution is semi-strong if (Y, Z) are F X -progressively measurable.
(iii) We say a weak solution is strong if N = 0 and Θ is F B -progressively measurable.
Given our conditions, all weak solutions actually have stronger integrability.
Proof By the boundedness of $b$, $\sigma$, the estimate for $X$ is obvious. Since $f(t, x, 0, 0)$ and $g(x)$ have polynomial growth, we have $\mathbb{E}^{\mathbb{P}}\big[|g(X_\cdot)|^p + \int_0^T |f(t, X_\cdot, 0, 0)|^p\,dt\big] < \infty$. Now by the uniform Lipschitz continuity of $f$ in $(y, z)$, the remaining estimates follow from standard BSDE arguments, see e.g. El Karoui & Huang [11].
(ii) The following two processes are P-martingales: Proof Let (Ω, F, P, B, Θ, N ) be a weak solution to FBSDE (3.1). Note that d Y, X t = Z t d X t = Z t σσ ⊤ (t, X · , Y t , Z t )dt and X , Y are all F X,Y -progressively measurable. Since σσ ⊤ > 0, then Z is also F X,Y -progressively measurable. Now by recasting everything into the canonical space of (X, Y ), it is straightforward to verify that (P, Z) is a solution to the forward-backward martingale problem of (3.1).
To see the other direction, let (Ω, F, X, Y ) be the canonical setting in Definition 3.5 and (P, Z) a solution to the forward-backward martingale problem of (3.1). Note that Assumption 3.1 (iii) implies d 0 ≥ d 1 , and there exist orthogonal matrices U ∈ R d 1 ×d 1 and V ∈ R d 0 ×d 0 as well as k 1 , · · · , k d 1 = 0 such that and 0 refers to the d 1 × (d 0 − d 1 )-zero matrix. It is clear that U, V, K are F-progressively measurable processes. DenoteB ThenB is a continuous local martingale under P and By Levy's characterization theorem we see thatB is a P-Brownian motion. Now letB be an where dB t := V ⊤ t dB t is also a d 0 -dimensional P-Brownian motion, since V is orthogonal. Now define In the rest of the paper this will be enforced.

Path dependent PDEs
In this subsection we introduce the PPDE in the setting of Ekren, Touzi, & Zhang [9,10].
Let Ω := C([0, T ], R d ) be the canonical space equipped with ω := sup 0≤t≤T |ω t |, X the canonical process, F := F X the natural filtration, and Λ : For some generic dimension m, let C 0 (Λ; R m ) be the space of continuous functions Λ → R m .

Nonlinear Feynman-Kac formula
The following result is an extension of the four step scheme of Ma, Protter, & Yong [17]. (3.6) Moreover, the solution is unique (in law) among all weak solutions.
Uniqueness. For notational simplicity let's assume d 2 = 1. The multidimensional case can be proved similarly without any significant difficulty. Let (B, Θ, N, P) be an arbitrary weak solution of (3.1). We first claim that (3.6) holds. Indeed, denotẽ Applying functional Itô formula (3.4) on u(t, X · ) and recalling (3.5), we have: where α, β are bounded. Note that ∆Y T = 0. Applying Itô formula on |∆Y t | 2 and recalling Assumption 3.1 (iii) we have Then by the standard BSDE arguments we have |∆Y | = |∆Z| = 0. This proves (3.6). Now plug (3.6) into the forward SDE of (3.1), we see that X has to satisfy the SDE (3.7). By the uniqueness of (3.7) we see that X is unique, which, together with (3.6), implies further the uniqueness of Θ, hence that of N .

Wellposedness for Markovian weak FBSDEs
We now turn to weak solutions. We shall follow the approach in Ma, Zhang, & Zheng [21] and Ma & Zhang [20]. Our approach will rely heavily on viscosity solutions as well as the a priori estimates for the related PDE. We remark that all the results can be easily extended to the path dependent case provided that the corresponding estimates can be established for PPDEs; such estimates, however, are not available in the literature and are in general challenging. We thus restrict ourselves to the Markovian case, and for the purpose of viscosity theory, we assume $d_2 = 1$.
We emphasize again that, by Propositions 3.6 and 3.8, we may allow d 0 = d 1 and (4.1) may depend on b(t, X · , Y t , Z t ) as well. Throughout this section, we use a generic constant C > 0 which depends only on T and C 0 , c 0 , L, d in Assumption 4.1.
Under the above assumption, we have the following regularity results for the PDE (4.2).
The arguments are mainly from Ladyzenskaja, Solonnikov & Uralceva [16], and we sketch a proof in Appendix.
where C δ may depend on δ as well.
(iii) There exists a constant $C_g$, which depends on the same parameters $T, C_0, c_0, L, d$, as well as $\|\partial_{xx} g\|_\infty$, such that $|\partial_{xx} u| \le C_g$.
Proof Let $(\sigma_n, f_n, g_n)$ be smooth mollifications of $(\sigma, f, g)$ that satisfy Assumption 4.1 uniformly. Applying Theorem 4.2, let $u_n$ be the classical solution to PDE (4.2) with coefficients $(\sigma_n, f_n, g_n)$; then $\{u_n\}_{n\ge 1}$ satisfy (4.4) uniformly in $n$. Applying the Arzelà-Ascoli theorem, possibly along a subsequence, $u_n$ converges to a function $u$ such that $u$ satisfies (4.4) and the convergence of $(u_n, \partial_x u_n)$ to $(u, \partial_x u)$ is uniform. In particular, by the stability of viscosity solutions we see that $u$ is a viscosity solution of PDE (4.2).

Existence
Next, by Proposition 3.6 and Theorem 3.9 the martingale problem (4.1) with coefficients $(\sigma_n, f_n, g_n)$ has a solution $(\mathbb{P}^n, Z^n)$ such that $Y_t = u_n(t, X_t)$, $Z^n_t = \partial_x u_n(t, X_t)$, $\mathbb{P}^n$-a.s. By Zheng [34], possibly along a subsequence, we see that $\mathbb{P}^n$ converges to some $\mathbb{P}$ weakly. By the uniform convergence of $(u_n, \partial_x u_n)$, we have $Z^n_t \to Z_t$ uniformly, and (4.3) holds. Moreover, it follows from (4.4) that $(Y, Z)$ are bounded. Finally, by the uniform convergence, it is straightforward to verify that $(\mathbb{P}, Z)$ solves the martingale problem (4.1) with coefficients $(\sigma, f, g)$.
We next define the nodal sets.
(ii) $\overline u$ is upper semi-continuous and $\underline u$ is lower semi-continuous;

Nodal sets
Proof (i) For any (t, x, y) ∈ O with corresponding weak solution (Θ, N, P), we have Since g and f (t, x, , 0, 0) are bounded by C 0 and f is uniformly Lipschitz continuous in (y, z), it follows from standard BSDE arguments that In particular, |y| = |Y t | ≤ C. This implies |u|, |u| ≤ C.
Since O is closed, (ii) is a direct consequence of the definitions of u, u. To see (iii), let (T, x, y) ∈ O. By definition there exist (t n , x n , y n ) ∈ O such that t n ↑ T and (x n , y n ) → (x, y). Let (B n , Θ n , P n ) be a weak solution at (t n , x n , y n ). Then thanks to (4.6). Send n → ∞, we see that y = g(x). This proves (iii).
We have the following result improving Theorem 4.3, which is not used in this paper but is nevertheless interesting in its own right. such that |Z| ≤ C.
Proof It is clear that $(t, x, y) \in O$ implies $y \in [\underline u(t, x), \overline u(t, x)]$. Then it suffices to show that, for any $y \in [\underline u(t, x), \overline u(t, x)]$, there exists a weak solution at $(t, x, y)$ such that $Z$ is bounded. We proceed in two steps.
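Step 1. For any $n \ge 1$, let $\sigma_n, f_n, g_n$ be smooth mollifications of $\sigma, f, g$ such that
$$
\|\sigma_n - \sigma\|_\infty \le \varepsilon_n,\quad \|f_n - f\|_\infty \le \tfrac1n,\quad \|g_n - g\|_\infty \le \tfrac1n, \tag{4.7}
$$
for some small $\varepsilon_n > 0$ which will be specified later. Denote
$$
\overline f_n := f_n + \tfrac2n,\quad \underline f_n := f_n - \tfrac2n,\quad \overline g_n := g_n + \tfrac1n,\quad \underline g_n := g_n - \tfrac1n.
$$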
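By Theorem 4.2, the PDE (4.2) with coefficients $(\sigma_n, \overline f_n, \overline g_n)$ (resp. $(\sigma_n, \underline f_n, \underline g_n)$) has a classical solution $\overline u_n$ (resp. $\underline u_n$). We claim that, for any $(t, x, y) \in O$ and any $n$,
$$
\underline u_n(t, x) \le y \le \overline u_n(t, x). \tag{4.8}
$$
Without loss of generality we will prove only the right inequality at $t = 0$. We shall follow similar arguments as in Theorem 3.9. Let $(B, \Theta, N, \mathbb{P})$ be a weak solution to FBSDE (4.1) at $(0, x, y)$ with coefficients $(\sigma, f, g)$. Fix $n$, denote $\tilde Y_t := \overline u_n(t, X_t)$ and $\Delta Y := \tilde Y - Y$, and apply Itô formula to $\tilde Y$. By Theorem 4.2 (iii), there exists a constant $C_n$, which is independent of $\varepsilon_n$, such that $|\partial^2_{xx} \overline u_n| \le C_n$. Note that $\overline f_n - f = f_n + \frac2n - f \ge \frac1n$ and $|\sigma| \le C_0$. Then, for $\varepsilon_n \le \frac{1}{nC_0C_n}$, we obtain a linear inequality for $\Delta Y$ with coefficients $\alpha^n, \beta^n$ satisfying $|\alpha^n|, |\beta^n| \le C_n$. Note further that $\Delta Y_T = \overline g_n(X_T) - g(X_T) = g_n(X_T) + \frac1n - g(X_T) \ge 0$. It is clear that $\Delta Y_0 \ge 0$. This implies $0 \le \tilde Y_0 - Y_0 = \overline u_n(0, x) - y$, proving (4.8).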
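The role of the threshold $\varepsilon_n \le \frac{1}{nC_0C_n}$ can be seen from the following estimate. This is a sketch in the scalar case, under the assumptions (ours, read off from the inequalities used in Step 1) that $|\sigma_n - \sigma| \le \varepsilon_n$ and $|f_n - f| \le \frac1n$ with both functions evaluated at the same argument, that $|\sigma|, |\sigma_n| \le C_0$, and that $\overline u_n$ denotes the classical solution for the upper coefficients $\overline f_n = f_n + \frac2n$, $\overline g_n = g_n + \frac1n$:

```latex
\begin{align*}
\Big|\tfrac12\big(\sigma_n^2-\sigma^2\big)\,\partial^2_{xx}\overline u_n\Big|
 &= \tfrac12\,|\sigma_n-\sigma|\,|\sigma_n+\sigma|\,\big|\partial^2_{xx}\overline u_n\big|
 \le \tfrac12\cdot\varepsilon_n\cdot 2C_0\cdot C_n
 = \varepsilon_n C_0 C_n \le \tfrac1n,\\
\overline f_n - f &= f_n + \tfrac2n - f \ \ge\ \tfrac2n - \tfrac1n = \tfrac1n .
\end{align*}
```

In other words, the margin $\frac1n$ built into $\overline f_n$ is just large enough to absorb the error made by replacing $\sigma$ with $\sigma_n$ in the second order term, and what is left over can be handled by a comparison argument for linear BSDEs.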
Step 2. Let $y \in [\underline u(t, x), \overline u(t, x)]$. By the definitions of $\underline u$, $\overline u$ and (4.8), we obtain
$$
\underline u_n(t, x) \le \underline u(t, x) \le y \le \overline u(t, x) \le \overline u_n(t, x),\quad \text{for all } n. \tag{4.9}
$$
For any $n \ge 1$ and $\alpha \in [0, 1]$, denote $\varphi^\alpha_n := \alpha\overline\varphi_n + [1 - \alpha]\underline\varphi_n$ for $\varphi = f, g$, and let $u^\alpha_n$ be the classical solution of PDE (4.2) with coefficients $(\sigma_n, f^\alpha_n, g^\alpha_n)$. By the arguments in Theorem 4.2, it is clear that the mapping $\alpha \mapsto u^\alpha_n(0, x)$ is continuous. Since $u^0_n(t, x) = \underline u_n(t, x) \le y \le \overline u_n(t, x) = u^1_n(t, x)$, there exists $\alpha_n \in [0, 1]$ such that $u^{\alpha_n}_n(t, x) = y$. For each $n \ge 1$, by Proposition 3.6 and Theorem 3.9 the martingale problem (4.1) at $(t, x, y)$ with coefficients $(\sigma_n, f^{\alpha_n}_n, g^{\alpha_n}_n)$ has a solution $(\mathbb{P}^n, Z^n)$ such that $Y_s = u^{\alpha_n}_n(s, X_s)$, $Z^n_s = \partial_x u^{\alpha_n}_n(s, X_s)$, $t \le s \le T$, $\mathbb{P}^n$-a.s. Now following the arguments in Theorem 4.3 we see that, possibly along a subsequence, $\mathbb{P}^n \to \mathbb{P}$, $Z^n \to Z$, $u^{\alpha_n}_n \to u$, where $(\mathbb{P}, Z)$ is a solution to the martingale problem (4.1) at $(t, x, y)$ with coefficients $(\sigma, f, g)$ and $u$ is a viscosity solution to PDE (4.2) with coefficients $(\sigma, f, g)$. It is clear that $|Z_s| = |\partial_x u(s, X_s)| \le C$, $\mathbb{P}$-a.s.
Proof We shall prove the result only for $\overline u$. The result for $\underline u$ can be proved similarly.

Uniqueness
Let (t n , x n , y n ) ∈ O such that (t n , x n , y n ) → (t 0 , x 0 , y 0 ), and (P n , Z n ) a weak solution to the martingale problem (4.1) at (t n , x n , y n ). Define N n as in (3.2). By using regular conditional then Y t = u(t, X t ), P-a.s.
Next, for any δ > 0, 0 < t ≤ T − δ, and any partition 0 = t 0 < · · · < t n = t with Since X t − X s = t s σ(r, Θ r )dB r and σ is bounded, one can easily show that Moreover, by the martingale property of X, we have and, applying Itô formula, Then we have Send n → ∞ and thus h → 0, note that Since σ is nondegenerate and t and δ are arbitrary, we obtain That is, (4.3) holds.
Then $\tilde\sigma$ is Hölder continuous and $(B, X, \mathbb{P})$ satisfies the SDE: $dX_s = \tilde\sigma(s, X_s)\,dB_s$, $\mathbb{P}$-a.s. By Stroock & Varadhan [28], the above SDE has a unique (in law) weak solution. This, together with (4.3), implies the uniqueness (in law) of $(B, \Theta, \mathbb{P})$. Finally, by (3.2), the joint law with $N$ is also unique.
Remark 4.10. An alternative approach to prove the uniqueness is to consider the stochastic target problem, as in Soner and Touzi [26]. That is, in the spirit of (2.8), define and define u similarly. The idea is to prove that u and u are viscosity solutions of the PDE.
However, there are technical difficulties in establishing the regularity and the dynamic programming principle for these functions. We shall leave this possible approach to future research.

Some counterexamples
In this subsection we provide two counterexamples related to the control problems in Section 2.3. In particular, they will show that the stochastic control problems in weak formulation have optimal controls, while the corresponding problems in strong formulation do not have optimal control. In the first example, we also show that the associated weak FBSDE has a weak solution, but no strong solution.

The case with drift control
In this case we shall consider an example with path dependence. We note that all the heuristic analysis in Section 2.3 can be easily extended to the path dependent case. We first recall a result due to Tsirel'son [29].
Lemma 5.1. Let $t_n > 0$, $n \ge 1$, be strictly decreasing with $t_0 = T$ and $t_n \downarrow 0$, and $\theta(x) := x - [x]$, where $[x]$ is the largest integer in $(-\infty, x]$. Define the non-curtailing functional $K$ by (5.10). Then the path dependent SDE (5.11) has no strong solution.
We remark that $K$ is bounded and thus SDE (5.11) has a unique (in law) weak solution, by the standard Girsanov theorem. We also note that the above $K$ is discontinuous.
When K is state dependent, namely K = K(t, X t ), the SDE could have a strong solution even when K is discontinuous, see Cherny & Engelbert [4] and Halidias & Kloeden [12] for some positive results.
Our example considers the following setting, with f depending on the paths of X: Example 5.2. Let K be defined in (5.10), and A := L 2 (F B , P 0 ).
(ii) For each n, denote t i := iT n , i = 0, · · · , n, and α n t : . Recall (5.13) and note that It is clear that Since α n is piecewise constant, then F B = F B α n , and thus there exists a piecewise constant processα n such that α n t (B · ) =α n t (B α n · ). That is, n s (B α n · )ds + B α n t , P α n -a.s.
However, it does not have a strong solution such that $\int_0^t Z_s\,dB_s$ is a BMO martingale. We refer to Zhang [33] Chapter 7 for BMO martingales. Indeed, if there is such a solution, then (5.16) immediately implies that $Y = Z = 0$. Then $X$ has to be a strong solution of SDE (5.11), contradicting Lemma 5.1.

The case with diffusion control
We first recall a result due to Barlow [3]. Recall the function $\theta(x)$ in Lemma 5.1. Then the following SDE has a unique weak solution but no strong solution:
$$
X_t = \int_0^t \sigma_0(X_s)\,dB_s,\quad \mathbb{P}_0\text{-a.s.} \tag{5.18}
$$
(ii) By standard literature we also have $V_0 = u(0, 0) = g(0) = 0$. Assume by contradiction that (5.21) has an optimal control $\alpha^*(B_\cdot)$. Note that the optimal control for the Hamiltonian in (5.22) is $\partial^2_{xx} u(t, x) = \sigma_0(x)$, then we must have $\alpha^*_t(B_\cdot) = \sigma_0(X^{\alpha^*}_t)$, $\mathbb{P}_0$-a.s. Thus $X^* := X^{\alpha^*}$ satisfies SDE (5.18). Since by definition $\alpha^*$ is $\mathbb{F}^B$-progressively measurable, we see that $X^*$ is also $\mathbb{F}^B$-progressively measurable, and hence $X^*$ is a strong solution of SDE (5.18), contradicting Lemma 5.4.
Remark 5.6. In this example, since σ 0 is not differentiable in x, then neither is f . Consequently, the stochastic maximum principle in Section 2.3.1 does not work.
We finally prove the Hölder continuity of u in terms of t. Let k be large enough and omit the subscripts k and superscripts k in (5.26). Then we have the representation u(t, x) =Ỹ t,x t , andỸ t,x