The stochastic value function on metric measure spaces

Let $(S,d)$ be a compact metric space and let $m$ be a Borel probability measure on $(S,d)$. We shall prove that, if $(S,d,m)$ is an $RCD(K,\infty)$ space, then the stochastic value function satisfies the viscous Hamilton-Jacobi equation, exactly as in Fleming's theorem on ${\bf R}^d$.


Introduction
Let ${\bf T}^d := {\bf R}^d/{\bf Z}^d$ denote the $d$-dimensional torus, let $w_0 \in C^2({\bf T}^d)$ and let $F\colon (-\infty,0] \times {\bf T}^d \to {\bf R}$ be a continuous potential. It is well-known ([11]) that the solution of the Schrödinger equation, backward in time, is given by the Feynman-Kac formula. On the other side ([13]), the function $u$ of (1) is a value function, i. e. an infimum of expected costs,
and the inf is over all smooth vector fields $Y$.
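A minimal sketch of the classical computation behind these statements, assuming that the backward Schrödinger equation has the form $\partial_t w + \frac{1}{2}\Delta w + Fw = 0$ with $w(0,\cdot) = w_0 > 0$ and that $u = -\log w$. The sign in front of $F$ and the sign of the logarithm are assumptions made for illustration and may differ from the conventions of the text.

```latex
% Assumed form (not taken from the text): backward Schrodinger equation on the torus
%   \partial_t w + \tfrac12 \Delta w + F w = 0,  t \le 0,   w(0,\cdot) = w_0 > 0.
% Assumed transform: u = -\log w, i.e. w = e^{-u}.
\begin{align*}
\partial_t w &= -e^{-u}\,\partial_t u, \\
\Delta w &= e^{-u}\left(|\nabla u|^2 - \Delta u\right), \\
0 = \partial_t w + \tfrac12\Delta w + F w
  &= -e^{-u}\left(\partial_t u + \tfrac12\Delta u - \tfrac12|\nabla u|^2 - F\right).
\end{align*}
% Hence, under these assumptions, u solves the viscous Hamilton-Jacobi equation
%   \partial_t u + \tfrac12\Delta u - \tfrac12|\nabla u|^2 - F = 0.
```

Under the same assumptions, the quadratic term $-\frac{1}{2}|\nabla u|^2$ is the minimum over $Y$ of $\langle Y, \nabla u\rangle + \frac{1}{2}|Y|^2$, which is the link with the inf over vector fields.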
It is natural to ask whether some of these facts remain true in a more general setting. If we look at the various ingredients of Fleming's proof, the Brownian motion is the one with the longest history. Brownian motions on fractals have been studied since the Eighties (see [7], [8] and references therein); the crucial connection with Dirichlet forms was proposed in [17]. The "minimal" requirement to have a Brownian motion is the following: $(S,d)$ is a compact metric space, $m$ is a Borel probability measure on $S$, positive on open sets, and $\mathcal{E}$ is a strongly local Dirichlet form on $L^2(S,m)$ (see section 1 below for the precise definitions).
By [14], this implies the existence of a Brownian motion starting from $m$ a. e. $x \in S$.
Next, we have to make sense of (3) or, equivalently, of the Fokker-Planck equation. We recall that, on ${\bf T}^d$, this equation is understood in the weak sense, i. e. as an identity which holds for all test functions $\phi \in C^\infty_0((t,0)\times{\bf T}^d)$. All of this translates to our setting: $\Delta$ becomes $\Delta_{\mathcal{E}}$, the "Laplacian" associated with the Dirichlet form $\mathcal{E}$. As for the internal product $\langle X, \nabla\phi\rangle$, the theory of Dirichlet forms provides an object which behaves similarly: it is called the carré de champs, and we shall suppose that the carré de champs is defined on $D(\mathcal{E})$, the domain of $\mathcal{E}$. As we shall see in section 3 below, one can also define a class of test functions $\mathcal{T}$, namely the functions $\phi$ such that $\phi \in C^1([t,0], L^2(S,m)) \cap L^\infty([t,0], D(\Delta_{\mathcal{E}}))$.
This setting is sufficient to prove points 1) and 2) of theorem 1 below; if we want to go further, we need to prove that the function $u$ defined in (1) above satisfies the Hamilton-Jacobi equation (2). In other words, we need information on the Laplacian $\Delta_{\mathcal{E}}(\log w)$. It turns out that $\Delta_{\mathcal{E}}(\log w(t,\cdot)) \in L^2(S,m)$ if the carré de champs of $w(t,\cdot)$ belongs to $L^2$. That is why we need our last ingredient, i. e. that $(S,d,m)$ is an $RCD(K,\infty)$ space and that $\mathcal{E}$ is the double of Cheeger's energy: in this setting, it is standard that $w(t,\cdot)$ is Lipschitz and that the carré de champs of $w(t,\cdot)$ is bounded by its Lipschitz constant. Using these facts and the method of [13], we shall be able to prove one inequality of formula (7) below. For the opposite inequality, we need to solve the Fokker-Planck equation with drift $\nabla u$; again, the fact that $u$ is Lipschitz will be essential.
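A sketch of why the carré de champs of $w$ controls $\Delta_{\mathcal{E}}(\log w)$, assuming the chain rule $\Delta_{\mathcal{E}}\eta(f) = \eta'(f)\Delta_{\mathcal{E}}f + \eta''(f)\Gamma(f,f)$ and a lower bound $w \ge c > 0$ (the constant $c$ is a hypothetical name). Since $\log$ can be modified outside the range of $w$, only its derivatives on $[c,+\infty)$ matter.

```latex
% Assumptions: chain rule for \Delta_{\mathcal E}, and w \ge c > 0.
\begin{align*}
\Delta_{\mathcal E}(\log w)
  &= \frac{\Delta_{\mathcal E} w}{w} - \frac{\Gamma(w,w)}{w^2}, \\
\|\Delta_{\mathcal E}(\log w)\|_{L^2}
  &\le \frac{1}{c}\,\|\Delta_{\mathcal E} w\|_{L^2}
     + \frac{1}{c^2}\,\|\Gamma(w,w)\|_{L^2}.
\end{align*}
```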
We shall use the strategy just outlined to prove the following theorem; we refer the reader to the next sections for the definitions of the various terms appearing in it.

1) There is a unique $w \in C^1((-\infty,0], L^2(S,m)) \cap C((-\infty,0], D(\Delta_{\mathcal{E}}))$ which solves the Schrödinger equation (4), where equalities are in the $L^2(S,m)$ sense, i. e. $m$ a. e.

2) The function $w$ is given by the Feynman-Kac formula for $m$ a. e. $x \in S$.

3) If (2.13) below holds, we shall see that the maximum principle implies that $w(t,\cdot)$ is bounded away from $0$ and $+\infty$ for all $t \le 0$; we can thus consider the function $u$ defined in terms of $\log w$; it satisfies the Hamilton-Jacobi equation with time reversed (6), where $\Gamma$ is the carré de champs associated with $\mathcal{E}$.

4) Lastly, for every probability density $\rho_t \in L^\infty(S,m)$ and all $t \le 0$, formula (7) holds, where $\mu$ is a solution of the Fokker-Planck equation with drift $V$ starting at $\mu_t = \rho_t m$; the min is over the drifts $V$.

The paper is organised as follows: in section 1 we recall from [14] and [5] some definitions and theorems about Dirichlet forms; we shall also recall from [2], [3], [4], [5] and [20] the results we need about $RCD(K,\infty)$ spaces. In section 2, we tackle equations (4) and (6) and the Feynman-Kac formula. In section 3, we introduce the notion of weak solutions of the Fokker-Planck equation and prove one inequality of (7); the opposite inequality is proven in section 4.
Acknowledgement. The author would like to thank the referee for the careful reading and the helpful comments.

§1 Preliminaries and notation
To prove points 1) and 2) of theorem 1, it suffices to have a measured metric space with a symmetric, strongly local Dirichlet form on it. We shall call this situation the Dirichlet form setting; let us state the precise hypotheses.
Following [14], we shall assume that (S, d) is a metric space (which we shall suppose compact for simplicity) and that m is a probability measure on S, positive on open sets.
It is standard ([14], [17]) that $\mathcal{E}$ is closed if and only if the quadratic form $u \mapsto \mathcal{E}(u,u)$ is lower semicontinuous in $L^2(S,m)$.
We shall assume two further properties on $\mathcal{E}$; the first one is that $\mathcal{E}$ is regular. This means that $\mathcal{E}$ has a core, i. e. a subset $\mathcal{C} \subset D(\mathcal{E}) \cap C(S)$ such that $\mathcal{C}$ is dense in $D(\mathcal{E})$ for $\|\cdot\|_{D(\mathcal{E})}$, and is dense in $C(S)$ for the sup norm.
The second one is that $\mathcal{E}$ is strongly local, i. e. that $\mathcal{E}(f,g) = 0$ if $f, g \in D(\mathcal{E})$ and $f$ is constant on a neighbourhood of the support of $g$.
By theorem 1.3.1 of [14], there is a non-positive self-adjoint operator $\Delta_{\mathcal{E}}$ whose domain $D(\Delta_{\mathcal{E}})$ is dense in $L^2(S,m)$ and for which (1.1) holds. Now $-\Delta_{\mathcal{E}}$, being self-adjoint and non-negative, is monotone maximal; thus we can apply the theory of [10], getting that $-\frac{1}{2}\Delta_{\mathcal{E}}$ generates a semigroup of contractions, backward in time, on $L^2(S,m)$. Namely, for $t \le s$ there is a contraction $P_{t,s}\colon L^2(S,m) \to L^2(S,m)$ with the semigroup property. The semigroup is autonomous, i. e. $P_{t+h,s+h} = P_{t,s}$, and we could have called it $P_{t-s}$ as well. The reason for this clumsier notation is that in the next sections we shall need to keep track also of the starting time of the trajectory.
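As a sanity check on the sign conventions, here is a sketch on the flat torus ${\bf T}^d$ of the introduction. It assumes that (1.1) is the integration by parts formula $\mathcal{E}(f,g) = -\langle \Delta_{\mathcal{E}} f, g\rangle_{L^2}$ and that $P_{t,s}$ solves $\partial_t P_{t,s}f = -\frac{1}{2}\Delta_{\mathcal{E}}P_{t,s}f$ with $P_{s,s}f = f$; both normalisations are assumptions, not statements of the text.

```latex
% Illustration on the flat torus; the normalisations are assumptions.
\begin{align*}
\mathcal{E}(f,g) &= \int_{{\bf T}^d} \langle\nabla f,\nabla g\rangle\,dx
   = -\int_{{\bf T}^d} \Delta f\cdot g\,dx, \\
e_k(x) &= e^{2\pi i\langle k,x\rangle},\qquad
\Delta e_k = -4\pi^2|k|^2 e_k,\qquad k\in{\bf Z}^d, \\
P_{t,s}e_k &= e^{-2\pi^2|k|^2(s-t)}\,e_k \qquad (t\le s).
\end{align*}
% Consequences: \|P_{t,s}\| \le 1 (contraction), P_{t+h,s+h} = P_{t,s} (autonomy),
% and P_{t,r}P_{r,s} = P_{t,s} for t \le r \le s (semigroup property).
```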
For each fixed $f \in L^2$ and $t \le s$, the map $t \mapsto P_{t,s}f$ is continuous and (1.2) holds. Since $-\frac{1}{2}\Delta_{\mathcal{E}}$ is the generator of $P_{t,s}$, we have the usual limit formulas for the difference quotients of $P_{t,s}f$, where the limits are in $L^2(S,m)$.

The Brownian motion is the stochastic process behind the semigroup $P_{t,s}$; namely, by theorem 4.5.3 of [14], for $m$ a. e. $x \in S$ it is possible to define a probability measure $P_{(t,x)}$ on $C([t,+\infty),S)$ (and a related expectation $E_{(t,x)}$) such that $P_{(t,x)}$ concentrates on the paths which start from $x$ at time $t$ and, for $t \le s$, formula (1.4) expresses $P_{t,s}$ through $E_{(t,x)}$. For $\tau \in [t,+\infty)$ we define $e_\tau$ as the evaluation map $\gamma \mapsto \gamma_\tau$. We denote as usual by $F_\sharp\mu$ the push-forward of a measure $\mu$ by a map $F$; for $h > 0$ we shall set $p_h(x,dy) = (e_h)_\sharp P_{(0,x)}$.
By (1.4) we easily get that $P_{t,s}$ is positivity preserving.

The last hypothesis we shall make on $\mathcal{E}$ is that its carré de champs is defined on $D(\mathcal{E})$; in other words, we ask that there is a symmetric bilinear form $\Gamma$ such that (1.5) holds. We recall from [5] and [9] some properties of the carré de champs $\Gamma$.
•) If $f, g \in D(\mathcal{E})$ and $\eta \in Lip({\bf R})$, then $\eta(f) \in D(\mathcal{E})$; moreover, the chain rule (1.7) holds.
•) By formula (2.18) of [5] we have a chain rule for $\Delta_{\mathcal{E}}$ if $\eta \in C^2({\bf R})$ has bounded first and second derivatives.

There is an important example of a Dirichlet form which satisfies all these hypotheses: the double of Cheeger's energy on $RCD(K,\infty)$ spaces. When we are under this more restrictive hypothesis, we shall say that we are in the $RCD(K,\infty)$ setting. We refer the reader to [2], [3], [4], [5] and [20] for their definition and the study of their properties; here, we only recall the few facts we need.
•) There is a "natural" Dirichlet form $\mathcal{E}$; $\mathcal{E}$ is the double of Cheeger's energy, which we don't define (see for instance [2]). The form $\mathcal{E}$ is regular, strongly local and its carré de champs is defined on $D(\mathcal{E})$.
•) Each $x \in S$ is the starting point of a Brownian motion.
•) We recall that we called $p_h(x,dy)$ the transition probability of the Brownian motion. We define $\mathcal{P}(S)$ as the space of all the Borel probability measures on $S$; if $\mu, \nu \in \mathcal{P}(S)$, we define their Wasserstein distance by $W_2^2(\mu,\nu) = \min \int_{S \times S} d^2(x,y)\, d\Sigma(x,y)$, where the minimum is over all Borel probability measures $\Sigma$ on $S \times S$ whose first and second marginals are, respectively, $\mu$ and $\nu$.
An important property of $RCD(K,\infty)$ spaces is that the map $x \mapsto p_h(x,dy)$ is Lipschitz from $(S,d)$ to $(\mathcal{P}(S), W_2)$; the Lipschitz constant is bounded by $e^{-Kh}$.
•) The probability $p_h(x,dy)$ has a density; this is formula (1.10).
•) By section 4.1 of [5], Lipschitz functions are dense in $D(\Delta_{\mathcal{E}})$ for the $L^2$ topology. As a consequence, in theorem 1 the hypotheses on $w_0$ are not empty.

§2 The Feynman-Kac formula

In this section, we are going to prove that the Schrödinger and Hamilton-Jacobi equations have a unique solution and that the Feynman-Kac formula holds; as usual, Feynman-Kac will imply a maximum principle.
We shall also use the Feynman-Kac formula to prove that the solution of Schrödinger's equation (4) is Lipschitz; this will help us to deduce equation (6) at the end of this section.
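The Lipschitz property rests on the Wasserstein contraction recalled in section 1. A sketch for $F = 0$, assuming that the semigroup acts as $P_{0,h}f(x) = \int_S f(y)\,p_h(x,dy)$ (our reading of (1.4)) and using the elementary inequality $W_1 \le W_2$, where $W_1$ denotes the 1-Wasserstein distance (a symbol which is not used in the text):

```latex
% Assumption: P_{0,h} f(x) = \int_S f(y)\, p_h(x,dy); W_1 is introduced for illustration.
\begin{align*}
|P_{0,h}f(x) - P_{0,h}f(x')|
 &= \left|\int_S f\,dp_h(x,\cdot) - \int_S f\,dp_h(x',\cdot)\right| \\
 &\le \mathrm{Lip}(f)\, W_1\big(p_h(x,\cdot),p_h(x',\cdot)\big) \\
 &\le \mathrm{Lip}(f)\, W_2\big(p_h(x,\cdot),p_h(x',\cdot)\big) \\
 &\le \mathrm{Lip}(f)\, e^{-Kh}\, d(x,x').
\end{align*}
```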
We start in the Dirichlet form setting. We saw above that $-\frac{1}{2}\Delta_{\mathcal{E}}$ is a monotone maximal operator; in particular, its graph is closed. Said differently, $D(\Delta_{\mathcal{E}})$ with the internal product $\langle\cdot,\cdot\rangle_{\Delta_{\mathcal{E}}}$ is a Hilbert space. We shall denote its norm by

Let $g \in C((-\infty,0], L^2(S,m))$; we shall say that $u$ is a strong solution of the inhomogeneous heat equation,

Let us set

From now on we shall make the following assumption (F) on the potential $F$ of the Schrödinger equation:

The following lemma is well-known (see for instance theorems 6.1.4 and 6.1.5 of [18]); we give the proof for completeness.
has a unique strong solution; by the theorem of [16] we mentioned before formula (2.2), we see that it suffices to find a solution $w \in C^1((-\infty,0], L^2)$ of the integral equation

Setting $s = t - r$, this becomes (note that $r \le 0$)

For $T > 0$ let us set

We begin by showing that, for $T > 0$ small, the operator $\Phi$ is a contraction from $A_T$ to itself and thus it has a unique fixed point.
Formula (2.5) and an easy calculation imply that

If $w, \tilde w \in A_T$, this yields the equality below; the first inequality comes from Hölder and (1.2); the last inequality follows since $t \in [-T,0]$.
Thus, if $T$ is so small that

we get that

Analogously,

By the last two formulas, we see that $\Phi$ is a contraction.
Deducing from this existence for all times is standard. It suffices to show that there is a decreasing

We find the function $\epsilon$. We choose $w(-T)$ as a final condition for the operator $\Phi$; in other words, we set

and define $\tilde\Phi$

Arguing as above, we see that the Lipschitz constant of $\tilde\Phi$ is smaller than $\frac{1}{2}$ as long as

Now it suffices to take $\epsilon(T)$ as the largest $\epsilon$ for which the formula above holds.
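The contraction estimate can be sketched as follows in the simpler space $C([-T,0], L^2(S,m))$ with the sup norm (the text works in $A_T$, whose definition we do not reproduce). We assume that the Schrödinger equation reads $\partial_t w + \frac{1}{2}\Delta_{\mathcal{E}}w + Fw = 0$ with $w(0) = w_0$, and that (F) includes the bound $\|F\|_\infty < +\infty$; the sign in front of $F$ is an assumption.

```latex
% Simplified setting: sup norm on C([-T,0], L^2(S,m)); the form of \Phi is an assumption.
\begin{align*}
\Phi(w)(t) &= P_{t,0}w_0 + \int_t^0 P_{t,s}\big[F(s,\cdot)\,w(s,\cdot)\big]\,ds,
   \qquad t\in[-T,0], \\
\|\Phi(w)(t)-\Phi(\tilde w)(t)\|_{L^2}
 &\le \int_t^0 \big\|P_{t,s}\big[F(s,\cdot)\,(w-\tilde w)(s,\cdot)\big]\big\|_{L^2}\,ds \\
 &\le \|F\|_\infty \int_t^0 \|(w-\tilde w)(s,\cdot)\|_{L^2}\,ds \\
 &\le T\,\|F\|_\infty \sup_{s\in[-T,0]}\|(w-\tilde w)(s,\cdot)\|_{L^2}.
\end{align*}
% If T\,\|F\|_\infty \le 1/2, then \Phi is a contraction with constant 1/2 in the sup norm.
```

The second inequality uses only that $P_{t,s}$ is a contraction on $L^2(S,m)$ and that multiplication by $F(s,\cdot)$ has norm at most $\|F\|_\infty$.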

Lemma 2.2.
Let $(S,d,m)$ and $\mathcal{E}$ satisfy the hypotheses of the Dirichlet form setting. Let $F$ satisfy (F), let $w_0 \in D(\Delta_{\mathcal{E}})$ and let $w$ be the unique solution of (2.3). Then, for $m$ a. e. $x \in S$, the Feynman-Kac formula (2.7) holds.

Proof. We recall the argument of [11]. We fix $t < 0$; for $s \in [t,0]$ we set

We are going to prove that, for all $\psi \in L^2$, the function $l_\psi\colon s \mapsto \langle a(s,\cdot),\psi\rangle_{L^2}$ has the right hand derivative identically equal to zero at every $s \in [t,0)$. It is a standard fact (see for instance [19], theorem 8.21) that this implies that $l_\psi$ is absolutely continuous on $(t,0)$. Using (2.8) and dominated convergence, we easily see that $l_\psi$ is continuous at $s = 0$. Thus, integrating $l'_\psi$, we get that $l_\psi(0) = l_\psi(t)$ for all $\psi \in L^2$. Thus, $a(0,\cdot) = a(t,\cdot)$ in $L^2$, which implies (2.7).
We calculate the right hand derivative for $l_\psi$. The first equality below comes from the definition of

We begin to tackle (2.9)$_a$. We consider the measure $m \otimes P_{(t,x)}$ on $S \times C([t,0],S)$; since $F$ is continuous and the Brownian motion has continuous trajectories, we have that

$m \otimes P_{(t,x)}$ a. e.; adding in the fact that $F$ is bounded, we get that

$m \otimes P_{(t,x)}$ a. e. By dominated convergence, this implies that, for all $h \in (0,1]$,

in $L^2(S,m)$.
We assert that

We prove this; the first equality below is the definition of $P_{t,s+h}$; the inequality is (1.2) and the limit follows from the regularity of $w$.

The first equality below is the definition of $P_{t,s}$, while the limit comes from the fact that $P_{t,s}$ is strongly continuous.

By (2.10) and (2.12) we easily get that

Together with (2.11), this implies the convergence below, which is in the $L^2$ topology.

The first equality below comes from (2.9) and the last two formulas, the second one comes from the fact that $w$ solves (2.3). We get that $\frac{d^+}{ds}\langle a(s,\cdot),\psi\rangle_{L^2} = 0$, as we wanted.

\\\
Note that the integral in (2.7) converges also if the initial condition $w(0,\cdot)$ is only bounded; actually, we shall show in lemma 2.4 below that, if $w_0$ is Lipschitz, also $w(t,\cdot)$ is such. Naturally, if $w_0 \notin D(\Delta_{\mathcal{E}})$, we lose the fact that $w$ solves (2.3).
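To see how (2.7) controls the size of $w$, assume that it has the form $w(t,x) = E_{(t,x)}\big[\exp\big(\int_t^0 F(\tau,\gamma_\tau)\,d\tau\big)\,w_0(\gamma_0)\big]$ (our reading; the sign of the exponent does not affect the bound), that $|F| \le \|F\|_\infty$ and that $0 < a \le w_0 \le b$ for two constants $a$, $b$ (hypothetical names). A minimal sketch:

```latex
% Assumptions: Feynman-Kac form of w as above, |F| \le \|F\|_\infty, 0 < a \le w_0 \le b.
\begin{align*}
e^{-|t|\,\|F\|_\infty}
  &\le \exp\left(\int_t^0 F(\tau,\gamma_\tau)\,d\tau\right)
   \le e^{|t|\,\|F\|_\infty}, \\
a\,e^{-|t|\,\|F\|_\infty} &\le w(t,x) \le b\,e^{|t|\,\|F\|_\infty}
   \qquad\text{for $m$ a. e. } x\in S,\ t\le 0.
\end{align*}
```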
An immediate consequence of (2.7) and hypothesis (F) is the maximum principle. (2.14)

Up to this point, we have only used the properties of a strongly local Dirichlet form in a metric space; if we want to prove formula (6) of the introduction, we need much more. Thus, from now on we strengthen our hypotheses to the $RCD(K,\infty)$ setting. We shall need a stronger condition on $F$ too.

Proof. Since $w$ is defined by (2.7), hypothesis (F) and dominated convergence give the first equality below; the second one is the definition of Wiener's measure.
Thus, it suffices to prove that the multiple integral on the right is

the last one comes from the chain rule (1.6).

\\\

§3 The weak version of Fokker-Planck and the value function

In this section, we define the weak version of the Fokker-Planck equation and prove one inequality of (7). In this section and in the next one, we shall suppose that $(S,d,m)$ is an $RCD(K,\infty)$ space and that $\mathcal{E}$ is the double of Cheeger's energy.
The space of test functions. We recall from section 2 that $D(\Delta_{\mathcal{E}})$ with the internal product $\langle\cdot,\cdot\rangle_{\Delta_{\mathcal{E}}}$ is a Hilbert space.
Let $u \in D(\Delta_{\mathcal{E}})$; since $D(\Delta_{\mathcal{E}}) \subset D(\mathcal{E})$, we can take $g = u$ in (1.8) getting that

By the definition of $\|\cdot\|_{D(\mathcal{E})}$ and Young's inequality, this implies that

In particular, we have that, if $t < 0$,

We say that $\phi$ is a test function, or that $\phi \in \mathcal{T}$ for short, if $\phi \in C^1([t,0], L^2(S,m)) \cap L^\infty([t,0], D(\Delta_{\mathcal{E}}))$. By (3.1), we have that

The space of drifts. If $\mu\colon [t,0] \to \mathcal{P}(S)$ is a curve of measures, we shall need to compare the "tangent spaces" to $\mathcal{P}(S)$ at $\mu_\tau$ and $\mu_{\tau'}$; as we shall see below, the following definition radically simplifies the problem.
We say that a Borel function $\mu\colon [t,0] \to \mathcal{P}(S)$ is admissible if $\mu_\tau = \rho_\tau m$ for all $\tau \in [t,0]$ and if there is

We check that the integral on the left makes sense. Note that, since $\phi \in \mathcal{T}$, the second inequality below follows, while the first one is Hölder.

Since $\phi \in \mathcal{T}$, the map $\tau \mapsto \Delta_{\mathcal{E}}\phi_\tau$ is bounded from $[t,0]$ to $L^2(S,m)$; together with (3.3), this implies the integrability of $\frac{1}{2}\Delta_{\mathcal{E}}\phi_\tau$. As a last remark, note that we don't address the question whether a Borel curve of measures satisfying (3.4) is continuous or absolutely continuous; we only note that the solution of Fokker-Planck we build in section 4 is continuous. We refer the reader to [12] for a study of this problem on ${\bf R}^d$.
Beginning of the proof of theorem 1. The solution of (4) exists and is unique by lemma 2.1; the solution of (6), by lemma 2.5. The Feynman-Kac formula of point 2) follows from lemma 2.2. Thus, we are left with proving (7); the following lemma gives one of the inequalities.
Equality holds in the formula above when V = −u.
Proof. We note that u ∈ T : indeed, it belongs to C 1 ( If V = −u, then the only inequality in the formula above becomes an equality, implying the last assertion of the lemma.

\\\
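The source of both the inequality and the equality case is a completion of the square. A sketch, assuming that the running cost of a drift $V$ is $\frac{1}{2}\Gamma(V,V)$ and that $V$ acts on $u$ through $\Gamma(V,u)$ (our reading of the setting), and using only that $\Gamma$ is symmetric, bilinear and non-negative:

```latex
% Assumptions: cost \tfrac12\Gamma(V,V), coupling \Gamma(V,u); \Gamma symmetric, bilinear, \Gamma(f,f) \ge 0.
\begin{align*}
\tfrac12\Gamma(V,V) + \Gamma(V,u)
 &= \tfrac12\Gamma(V+u,V+u) - \tfrac12\Gamma(u,u) \\
 &\ge -\tfrac12\Gamma(u,u),
\end{align*}
% with equality when V = -u, since then \Gamma(V+u,V+u) = \Gamma(0,0) = 0.
```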
For the opposite inequality, we need to solve the Fokker-Planck equation; that's what we do in the next section.

§4 Solving Fokker-Planck
From the last assertion of lemma 3.1 we deduce that the following proposition implies the inequality opposite to (3.5). We shall need the stronger hypothesis that $(S,d,m)$ is an $RCD(K,\infty)$ space because we want to apply the standard way (see for instance [6]) to solve Fokker-Planck's conjugate. This requires solving the Schrödinger equation of formula (4.4) below; we saw in section 2 above that its final condition, which is $\phi(s,\cdot)w(s,\cdot)$ for some $\phi \in \mathcal{T}$, must be in the domain of $\Delta_{\mathcal{E}}$. We shall see that this follows if $\Gamma(w(s,\cdot),w(s,\cdot))$ is bounded; in turn, on $RCD(K,\infty)$ spaces this follows from the fact that $w(s,\cdot)$ is Lipschitz, which we know from section 2.

and

Using (1.8), we get that

This formula holds for $g \in D(\mathcal{E}) \cap L^\infty$, which is dense in $D(\mathcal{E})$ for the graph norm of $D(\mathcal{E})$ (with some overkill, this follows by proposition 4.10 of [4]). Thus, by (1.8), $ab \in D(\Delta_{\mathcal{E}})$ and (4.1) holds if we show that $\Delta_{\mathcal{E}}a \cdot b + \Delta_{\mathcal{E}}b \cdot a + 2\Gamma(a,b) \in L^2(S,m)$.
Now $\Delta_{\mathcal{E}}a \cdot b$ and $\Delta_{\mathcal{E}}b \cdot a$ belong to $L^2$ since $\Delta_{\mathcal{E}}a, \Delta_{\mathcal{E}}b \in L^2$ and $a, b \in L^\infty$; moreover, $|\Gamma(a,b)|^2 \le \Gamma(a,a)\,\Gamma(b,b)$ by Cauchy-Schwarz. Since $\Gamma(a,a) \in L^1$ and $\Gamma(b,b) \in L^\infty$ by our hypotheses on $a$ and $b$, we are done.

\\\
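The integrability argument can be written out as follows; a sketch assuming that (4.1) is the Leibniz formula $\Delta_{\mathcal{E}}(ab) = \Delta_{\mathcal{E}}a \cdot b + \Delta_{\mathcal{E}}b \cdot a + 2\Gamma(a,b)$, as suggested by the expression appearing in the proof, and using the pointwise Cauchy-Schwarz inequality for $\Gamma$:

```latex
% Assumptions: \Delta_{\mathcal E} a, \Delta_{\mathcal E} b \in L^2; a, b \in L^\infty;
% \Gamma(a,a) \in L^1; \Gamma(b,b) \in L^\infty.
\begin{align*}
\|\Gamma(a,b)\|_{L^2}^2 = \int_S \Gamma(a,b)^2\,dm
 &\le \int_S \Gamma(a,a)\,\Gamma(b,b)\,dm
 \le \|\Gamma(b,b)\|_{L^\infty}\,\|\Gamma(a,a)\|_{L^1}, \\
\|\Delta_{\mathcal E}a\cdot b\|_{L^2}
 &\le \|b\|_{L^\infty}\,\|\Delta_{\mathcal E}a\|_{L^2},
\qquad
\|\Delta_{\mathcal E}b\cdot a\|_{L^2}
 \le \|a\|_{L^\infty}\,\|\Delta_{\mathcal E}b\|_{L^2}.
\end{align*}
```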
Let $w$ solve equation (4)

Note that (4.5) defines a function $f_s$ also when $G$ is only continuous: we shall use this fact to define the curve of probability measures.
We list below the properties of f s (G, ·). Let (S, d, m) satisfy the hypotheses of the RCD(K, ∞) setting and let (F) and (FF) hold.
Then the following two points hold.
1) Let $G \in D(\Delta_{\mathcal{E}}) \cap L^\infty(S,m)$; then, for all $t \le s$, we have that

and

, log w(t, ·))(x) = 0 m a. e. in S.

and

Proof. First of all, we recall that $\psi_s \in D(\Delta_{\mathcal{E}})$ by lemma 2.1 while $\frac{1}{w} \in D(\Delta_{\mathcal{E}})$ by (1.9), lemma 2.1, corollary 2.3 and lemma 2.4; both functions are bounded by corollary 2.3, and $\frac{1}{w}$ is Lipschitz. Thus, we can apply (4.1) to $a = \psi_s$ and $b = \frac{1}{w}$ and get that

Now (4.6) follows from the formula above, (4) and (4.4).
Next, to the bounds on the density of $f_s(\psi,t,\cdot)$.
Since $F$ is bounded, (4.5) and (2.14) imply that there is an increasing function

Together with Fubini, this yields (4.7) of point 2).
We only sketch the standard proof of point 3). We start from the right hand side of (4.8); the first and second equalities are (4.5) for $f_s(\cdot,t,\cdot)$ and $f_r(\cdot,s,\cdot)$ respectively; the third equality comes from the fact that the Brownian motion on $[s,r]$ is independent of $[t,s]$; the last equality is (4.5) for $f_r$.

\\\
End of the proof of proposition 4.1.
Step 1. We define the curve of measures.
Let $G\colon S \to {\bf R}$ be continuous; for $t \le s \le 0$, let $f_s$ be defined as in (4.5); we saw above that $f_s(G,t,x)$ is defined for all $s > t$ and all $x \in S$. We can define

It is immediate from (4.5) that Λ

Given an initial probability density $\rho_t$, we define $\mu^t_s$ as $\mu^t_s = \int_S \mu^{(t,x)}_s \rho_t(x)\,dm(x)$. (4.11)

Step 2. We assert that µ

We briefly prove this fact. Let $G \in C(S,{\bf R})$; the first equality below is (4.10); the second one comes from the fact that the solution of (4.6) is a semigroup in the past, i. e. (4.8). The third equality is (4.10) applied to $f_r$ and $f_s$.

Step 3. We prove point 1) of proposition 4.1, i. e. that $s \mapsto \mu^t_s$ satisfies (3.3).