Quantitative Approximation Properties for the Fractional Heat Equation

In this note we analyse \emph{quantitative} approximation properties of a certain class of \emph{nonlocal} equations: Viewing the fractional heat equation as a model problem, which involves both \emph{local} and \emph{nonlocal} pseudodifferential operators, we study quantitative approximation properties of solutions to it. First, relying on Runge type arguments, we give an alternative proof of certain \emph{qualitative} approximation results from \cite{DSV16}. Using propagation of smallness arguments, we then provide bounds on the \emph{cost} of approximate controllability and thus quantify the approximation properties of solutions to the fractional heat equation. Finally, we discuss generalizations of these results to a larger class of operators involving both local and nonlocal contributions.


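1. Introduction. This article is dedicated to qualitative and quantitative approximation properties of solutions to certain mixed local-nonlocal equations. As a model problem, we consider the heat equation for the fractional Laplacian with $s \in (0,1)$,
\[
\begin{aligned}
(\partial_t + (-\Delta)^s) u &= 0 && \text{in } B_1 \times (-1,1),\\
u &= f && \text{in } (\mathbb{R}^n \setminus B_1) \times (-1,1),\\
u &= 0 && \text{in } B_1 \times \{-1\},
\end{aligned}
\tag{1}
\]
and study the quantitative approximation properties of the mapping
\[
P_s \colon f \mapsto u|_{B_1 \times (-1,1)}
\tag{2}
\]
for exterior data $f$ supported in $W \times (-1,1)$. Here $W \subset \mathbb{R}^n$ is an open, bounded Lipschitz set, such that $\overline{W} \cap \overline{B_1} = \emptyset$. The precise mapping properties of the solution map $P_s$ for the problem (1) are discussed in Section 2.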
Let us recall that this approximation property crucially relies on the nonlocality of the operator under consideration. Indeed, solutions to the corresponding local equation (for which $s = 1$) are very rigid; for instance, they satisfy the strong maximum principle and Harnack's inequality locally. In contrast, solutions of the nonlocal problem are much more flexible. The maximum principle, for example, in general only holds on global scales, which does not rule out local oscillations [20,34]. Moreover, a central and at first sight surprising finding in [9] shows that the approximation properties of equations like (1) are solely determined by their nonlocal part. In particular, in the framework of [9] parabolicity is not needed; it would, for instance, also be possible to consider wave type operators (cf. Section 5 below).
While proving the density of the image of the mapping (2), the argument in [9] left open the question of more quantitative approximation properties: In the argument from [9], the underlying domain geometry and the structure of the explicit fundamental solution for the stationary problem play an important role, and the size of the support of $u$ is not explicitly controlled. In the present article, we address these more quantitative properties. Here our contributions are two-fold: First, by extending the qualitative Runge approximation arguments from [17] to the case of more general nonlocal equations, we obtain precise control on the support of $u$. Our argument, which relies on duality, is very flexible and can also deal with more general operators, e.g. with variable coefficients and lower order contributions (cf. Section 5.3). Secondly, as the main contribution of this article, we quantify the cost of approximating a given function $h \in L^2(B_1 \times (-1,1))$. More precisely, we address the following question:

Q: Given an error threshold $\epsilon > 0$ and a function $h \in L^2(B_1 \times (-1,1))$, how large is the value of a suitable norm of a possible control function $f$ for which $P_s f$ approximates $h$ up to the error threshold $\epsilon$?

In the context of the model problem (1) our main result on this can be formulated as follows:

Theorem 1 (Cost of approximation). Let $h \in H^1_0(B_1 \times (-1,1))$ and $\epsilon > 0$. Let $W \subset \mathbb{R}^n \setminus \overline{B_1}$ be a Lipschitz domain with $\overline{W} \cap \overline{B_1} = \emptyset$. Then there exists a control function $f \in L^2((-1,1), C_c^\infty(W))$ such that the estimate (3) holds, i.e. $\|h - P_s f\|_{L^2(B_1 \times (-1,1))} \leq \epsilon$ together with an upper bound on $\|f\|_{L^2(W \times (-1,1))}$ in terms of $\epsilon$ and of norms of $h$, where the constants $C > 1$ and $\sigma > 0$ only depend on $n$, $s$, and $W$. Moreover, we note that $f$ can be expressed in terms of the minimizer of a suitable "energy" (more precisely of the functional (21)).
The question on quantitative properties for operators like the heat operator with fractional diffusion was partly motivated by stability results in inverse problems for nonlocal operators. Indeed, in [37] similar quantitative approximation properties allowed us to derive quantitative stability properties in the fractional Calderón problem, which had been introduced in [17]. Relying on an elliptic analogue of Theorem 1, we could show that in spite of the nonlocality of the operator, the associated stability estimates are only of logarithmic type. By virtue of [38] this is optimal. Based on the main results of the present article, it is thus probable that also for related inverse problems with more general nonlocal operators only logarithmic stability properties are available.
The quantitative study of these approximation properties can further be considered as a continuation of the investigation started in [36], which was motivated by problems from medical imaging. Here stability and invertibility properties of the truncated Hilbert transform were studied.
We remark that similar quantitative questions are also of relevance in control theory: Indeed, the result of [9] can be read as an approximate controllability result, showing that by applying suitable controls from the exterior any function (in a suitable function space) can be approximated by solutions to the fractional heat equation. Theorem 1 then quantifies this and estimates the cost of the approximation by measuring the size of the exterior data that are needed for the given solution to be sufficiently close to the desired function (in terms of the error threshold $\epsilon$). Authors in the control theory community are often interested in approximating a given datum by steering an equation within a given time to a desired final state (cf. for instance [11,13,27,25,31,32,33,41] and in particular [2,30,29] for the setting of nonlocal equations and the references therein). In contrast, in the present article, motivated by applications to inverse problems, we seek to approximate a given function at each time slice by choosing suitable exterior data. In spite of this difference, our problem shares many features with the described problem from the control theory community. In particular, in addressing our main question, we borrow tools and ideas from control theory, especially from [15].
Let us comment on the result of Theorem 1: In the model setting of the heat equation for the fractional Laplacian it quantifies an $L^2$ version of the result from [9]. The condition that $h$ vanishes on the boundary does not pose serious restrictions compared to the result of [9], as this can always be achieved after a suitable extension. Indeed, it is always possible to reduce to the situation where $h \in H^1_0$ by considering the control problem in a slightly larger Lipschitz domain $\Omega \times (-2,2) \subset B_2 \times (-2,2)$ (where $\Omega$ is adapted to the geometry of $B_1$ and $W$) and by extending the given function $h \in H^1(B_1 \times (-1,1))$ to a function $\tilde{h} \in H^1_0(\Omega \times (-2,2))$ with the properties that $\tilde{h}|_{B_1 \times (-1,1)} = h$ and $\|\tilde{h}\|_{H^1(\Omega \times (-2,2))} \leq C \|h\|_{H^1(B_1 \times (-1,1))}$. Considering an analogue of (1) and Theorem 1 in $\Omega \times (-2,2)$ then implies the $L^2$ version of the approximation result from [9] for the fractional heat equation.
Regarding the dependences on $\epsilon$ and $h$ in the estimate (3) in Theorem 1, we expect that the exponential dependence on $\epsilon > 0$ is indeed necessary. In the elliptic counterpart this was recently established by the authors [38, Theorem 2]. We expect that similar arguments persist in the parabolic case. Although it is natural that higher order norms of $h$ appear in the estimate, we do not believe that the norms which are used in (3) are optimal (for instance, by the scaling of the equation, one might expect that one can work with parabolic Sobolev spaces). Yet we hope that the ideas introduced here are robust enough to be extended to a number of other problems in which both local and nonlocal operators are involved. A number of further operators for which these ideas are applicable are discussed in Section 5.
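To make the exponential dependence on the error threshold more tangible, the following short computation inverts a schematic cost bound. It is only a sketch: it assumes, purely for illustration, a bound of the form $C e^{C \epsilon^{-\sigma}}$ with the dependence on $h$ suppressed, and the budget $M$ is a hypothetical symbol that does not appear elsewhere in the text.

```latex
% Assumed schematic form of the cost (dependence on h suppressed):
%   cost(\epsilon) = C e^{C \epsilon^{-\sigma}},  C > 1, \sigma > 0.
% M denotes a hypothetical budget for the size of the exterior data.
\begin{align*}
C e^{C \epsilon^{-\sigma}} = M
\quad\Longleftrightarrow\quad
\epsilon^{-\sigma} = \frac{1}{C} \log\frac{M}{C}
\quad\Longleftrightarrow\quad
\epsilon = \Big( \frac{C}{\log(M/C)} \Big)^{1/\sigma},
\qquad M > C .
\end{align*}
```

Under this assumed form, the accuracy that a budget $M$ can buy improves only logarithmically in $M$, which is in line with the logarithmic-type stability discussed above.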
Similarly as in [36], our approach to the question on the cost of control relies on (i) a propagation of smallness result, (ii) quantitative unique continuation properties of the adjoint equation (9), (iii) the variational technique from [15], and (iv) a global estimate for solutions to (9) (cf. equation (18)). As in the qualitative density result, it is the underlying nonlocal operator whose properties we mainly exploit (cf. ingredients (i)-(iii)). The parabolic character of the problem only enters by invoking global estimates. It is therefore possible to extend this result to a much richer class of local-nonlocal operators (cf. Section 5).

The remainder of the article is structured as follows: In Section 2 we first discuss the qualitative approximation properties of the fractional heat equation. This is based on Runge type approximation arguments. Next, in Section 3, we address the quantitative uniqueness properties for the fractional heat equation with $s \in (0,1)$. Here we rely on propagation of smallness estimates. In Section 4 we introduce a variational approach to the approximation problem and prove Theorem 1. Finally, in Section 5, we explain how to extend the presented arguments to more general (variable coefficient) local-nonlocal operators.
2. Qualitative approximation and weak unique continuation. In this section, we discuss the qualitative approximation properties of the mapping (2). As the main result, we recover an L 2 version of certain approximation properties identified in [9]. Instead of relying on boundary asymptotics of the problem, we however use Runge type approximations as introduced in [17] (c.f. also [24,4,5] for similar ideas in the setting of different local equations). In principle this could be upgraded to (stronger) approximation properties in Hölder spaces (c.f. Section 6 in [17] and [9]). As we are however mainly interested in the quantitative approximation properties outlined in the next section, we do not pursue this here.
2.1. Notation and well-posedness. In the following we will mainly rely on two definitions of the fractional Laplacian: On the one hand, we regard it as a Fourier multiplier in $\mathbb{R}^n$, i.e. for $u \in \bigcup_{\alpha \in \mathbb{R}} H^{\alpha}(\mathbb{R}^n)$ we have $(-\Delta)^s u = \mathcal{F}^{-1}(|\xi|^{2s} \mathcal{F} u)$, where $\mathcal{F} u(k) = \int_{\mathbb{R}^n} e^{-i x \cdot k} u(x)\,dx$ denotes the Fourier transform. We will mainly use this definition of the fractional Laplacian in the discussion of the mapping properties of the fractional heat equation.
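As a brief illustration of the Fourier multiplier definition, the quadratic form of $(-\Delta)^s$ can be written out explicitly. The sketch below assumes that $u$ is a Schwartz function, so that all integrals converge, and uses the normalization of the Fourier transform stated above.

```latex
% Assumption: u is a Schwartz function.
% With \mathcal{F}u(k) = \int e^{-ix\cdot k} u(x)\,dx, Plancherel reads
%   \|u\|_{L^2}^2 = (2\pi)^{-n} \|\mathcal{F}u\|_{L^2}^2 .
\begin{align*}
\big( (-\Delta)^s u, u \big)_{L^2(\mathbb{R}^n)}
 &= (2\pi)^{-n} \int_{\mathbb{R}^n} |\xi|^{2s} \, |\mathcal{F}u(\xi)|^2 \, d\xi \\
 &= \big\| (-\Delta)^{s/2} u \big\|_{L^2(\mathbb{R}^n)}^2 \;\geq\; 0 .
\end{align*}
```

For $s = 1$ the multiplier $|\xi|^2$ is that of $-\Delta$, and the identity reduces to the Dirichlet energy $\int_{\mathbb{R}^n} |\nabla u|^2 \, dx$.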
In our argument leading to an estimate of the quantitative cost of approximation, it will however be convenient to work with a local operator. To this end, we recall that by virtue of [7] it is possible to realize the nonlocal operator $(-\Delta)^s$ with $s \in (0,1)$ as a local operator by adding an additional dimension: Given a function $v \in L^2(\mathbb{R}^n)$, and writing $x = (x', x_{n+1}) \in \mathbb{R}^{n+1}$, we have that for some constant $c_s \neq 0$
\[
(-\Delta)^s v(x') = c_s \lim_{x_{n+1} \to 0} x_{n+1}^{1-2s} \partial_{n+1} \tilde{v}(x', x_{n+1}),
\tag{4}
\]
where the function $\tilde{v}$ is a solution to
\[
\begin{aligned}
\nabla \cdot x_{n+1}^{1-2s} \nabla \tilde{v} &= 0 && \text{in } \mathbb{R}^{n+1}_+,\\
\tilde{v} &= v && \text{on } \mathbb{R}^n \times \{0\}.
\end{aligned}
\]
Here $\nabla = (\partial_1, \dots, \partial_{n+1})^t$ denotes the full gradient in $n+1$ dimensions (i.e. in the tangential and normal directions). If convenient, we also abbreviate the tangential part of it by $\nabla'$. In the sequel, we will use the convention that for a function $v \in L^2(\mathbb{R}^n)$ we denote its Caffarelli-Silvestre extension into $\mathbb{R}^{n+1}_+$ by $\tilde{v}$. We refer to [23] for further equivalent definitions of the fractional Laplacian.
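As in [17] and [37] we will mainly use energy spaces. To that end, for $s \in \mathbb{R}$ we work with the $L^2$-based Sobolev spaces $H^s(\mathbb{R}^n)$ and with their localized versions such as $\tilde{H}^s(B_1)$ and $H^{-s}(B_1)$, following the conventions of [17]. We denote the corresponding homogeneous spaces by adding a dot to these spaces, e.g. $\dot{H}^s(\mathbb{R}^n)$. As we are working with a time dependent problem, we will also use the corresponding Bochner spaces, which are associated with the energy spaces of our equations.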
Having introduced the previous notation, we discuss the well-posedness of equations as in (1). Here we restrict our attention to standard regularity assertions in the energy space as this suffices for our purposes. For more refined results we refer to for instance [14,21,26,3,18]. We remark that the operator (−∆) s is always understood to act in the variable x ∈ R n . Lemma 2.1. Let n ≥ 1 and s ∈ (0, 1). Then for any F ∈ L 2 ((−1, 1), H −s (B 1 )) and any f ∈ L 2 ((−1, 1), H s (R n )) with f | B1×(−1,1) = 0, there exists a unique function Moreover, Remark 1. We refer to the function u as a weak solution of (5). Note also that changing t to −t, we obtain an analogous solvability result for the problem Proof. We first note that writing v = u − f and invoking the support assumption for f , the problem reduces to finding v solving  (7), then multiplying the equation by v, integrating over (−1, t) × R n , and using that v(−1) = 0 gives the initial estimate The Hardy-Littlewood-Sobolev inequality gives w L 2 (B1) ≤ C w L 2n n−2s ≤ C n,s w Ḣs for w ∈H s (B 1 ) (if n = 1 and s ≥ 1/2, one can interpolate the easy L 2 → L 2 andḢ 1 → L ∞ bounds). Using this and Young's inequality yields that and using the equation once more implies the energy estimate for solutions of (7). Now (8) implies uniqueness as well as norm estimates for a solution u = f + v of (5), using the triangle inequality and the support assumption for f . Hence, it remains to discuss existence of solutions. This follows from a Galerkin approximation. To that end, we consider an eigenbasis {ϕ k } ∞ k=1 associated with the Dirichlet fractional Laplacian in B 1 , i.e.
We normalize these eigenfunctions so that they form an orthonormal basis ofH s (B 1 ) and an orthogonal basis of L 2 (B 1 ). Thus, writing α k (t) = (v(t), ϕ k ) L 2 (B1) , testing the equation (7) with ϕ k , and requiring α k (−1) = 0 results in the ODE This function solves (7) withF replaced byF N : This yields enough compactness to extract a weak limit v as N → ∞. Testing the equation for v with functions of the form w(t, x) = M k=1 w k (t)ϕ k (x), which form a dense set, we obtain a solution v to (7) satisfying the desired a priori bounds.
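The Galerkin step can be made explicit mode by mode. The following is a minimal sketch under stated assumptions: (7) is taken to be of the form $\partial_t v + (-\Delta)^s v = \tilde{F}$ in $B_1 \times (-1,1)$ with $v(-1) = 0$, the eigenvalues $\lambda_k > 0$ of the Dirichlet fractional Laplacian and the coefficients $F_k(t) := (\tilde{F}(t), \varphi_k)$ are notation introduced here for illustration.

```latex
% Assumptions (illustrative): (-\Delta)^s \varphi_k = \lambda_k \varphi_k in B_1,
% \lambda_k > 0, and F_k(t) := (\tilde F(t), \varphi_k).
% Testing the assumed form of (7) with \varphi_k gives a scalar linear ODE:
\begin{align*}
\alpha_k'(t) + \lambda_k \alpha_k(t) &= F_k(t), \qquad \alpha_k(-1) = 0, \\
\alpha_k(t) &= \int_{-1}^{t} e^{-\lambda_k (t-\tau)} F_k(\tau) \, d\tau .
\end{align*}
% Multiplying the ODE by \alpha_k and using Young's inequality
% F_k \alpha_k \le \tfrac{\lambda_k}{2}\alpha_k^2 + \tfrac{1}{2\lambda_k} F_k^2
% yields a mode-wise energy estimate for every t \in (-1,1):
\begin{align*}
\alpha_k(t)^2 + \lambda_k \int_{-1}^{t} \alpha_k(\tau)^2 \, d\tau
 \;\leq\; \frac{1}{\lambda_k} \int_{-1}^{t} F_k(\tau)^2 \, d\tau .
\end{align*}
```

Summing such mode-wise bounds over $k \leq N$ is what provides bounds uniform in $N$ and hence the compactness used to pass to the limit $N \to \infty$.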
For later reference, we also note the following spatial higher regularity result: The claimed estimate follows from the L 2 ((−1, 1), H s (R n )) bound in Lemma 2.1.

2.2. Qualitative approximation. We next approach the qualitative density properties of the fractional heat equation. By means of a duality argument as in [17] this is reduced to unique continuation properties of the fractional Laplacian.

Remark 2.
We emphasize that the choice of the spatial domain B 1 is not essential in our argument. It is for instance possible to consider more general, bounded Lipschitz domains.
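Proof. By the Hahn-Banach theorem, it is enough to show that if $v \in L^2(B_1 \times (-1,1))$ satisfies $(P_s f, v)_{L^2(B_1 \times (-1,1))} = 0$ for all $f \in C_c^\infty(W \times (-1,1))$, then $v \equiv 0$. Now let $v$ be such a function. We consider the dual problem to (1). It is given by
\[
\begin{aligned}
(-\partial_t + (-\Delta)^s) \varphi &= v && \text{in } B_1 \times (-1,1),\\
\varphi &= 0 && \text{in } (\mathbb{R}^n \setminus B_1) \times (-1,1),\\
\varphi &= 0 && \text{in } B_1 \times \{1\}.
\end{aligned}
\tag{9}
\]
We note that by virtue of Lemma 2.1 and Remark 1, both (1) and (9) are well-posed. Let $u$ solve (1), and let $\varphi$ solve (9).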
Remark 3. The adjoint property (10) can also be inferred using the Caffarelli-Silvestre extension, see (4) and Section 3. Denoting the Caffarelli-Silvestre extension associated with ϕ(x, t) by ϕ(x, x n+1 , t) and using the notation from (4), the equation (9) can be formulated as With this notation, we then have Here we first integrated by parts in time, then used that ϕ and u are solutions to the Caffarelli-Silvestre extension for each fixed time slice and finally exploited that u obeys (1).
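The duality computation behind the adjoint property (10) can also be sketched directly on the level of the nonlocal operator. The sketch below is formal and rests on assumptions: (1) is taken as the forward problem $(\partial_t + (-\Delta)^s) u = 0$ in $B_1 \times (-1,1)$ with $u = f$ outside $B_1$ and $u = 0$ on $B_1 \times \{-1\}$, (9) as the backward problem $(-\partial_t + (-\Delta)^s) \varphi = v$ with $\varphi = 0$ outside $B_1$ and on $B_1 \times \{1\}$, and $f$ is assumed smooth with support in $W \times (-1,1)$.

```latex
% Formal computation under the assumed forms of (1) and (9).
% Used facts: f = 0 in B_1, u - f = 0 outside B_1, \varphi = 0 outside B_1,
% u(-1) = 0 in B_1, \varphi(1) = 0, and (-\Delta)^s is symmetric.
\begin{align*}
(P_s f, v)_{L^2(B_1 \times (-1,1))}
 &= \int_{-1}^{1} \int_{\mathbb{R}^n} (u - f) \, (-\partial_t + (-\Delta)^s) \varphi \, dx \, dt \\
 &= \int_{-1}^{1} \int_{\mathbb{R}^n} \big( \partial_t (u - f) + (-\Delta)^s (u - f) \big) \, \varphi \, dx \, dt \\
 &= \int_{-1}^{1} \int_{B_1} \big( \partial_t u + (-\Delta)^s u \big) \, \varphi \, dx \, dt
    - \int_{-1}^{1} \int_{\mathbb{R}^n} (-\Delta)^s f \, \varphi \, dx \, dt \\
 &= - \int_{-1}^{1} \int_{W} f \, (-\Delta)^s \varphi \, dx \, dt .
\end{align*}
```

In particular, if the left hand side vanishes for all such $f$, then $(-\Delta)^s \varphi = 0$ in $W \times (-1,1)$, where also $\varphi = 0$; this is the point at which weak unique continuation for the fractional Laplacian enters.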
Remark 4. The argument of Theorem 2.3 shows that also in the case, in which a local operator is combined with a nonlocal operator, the density properties of R are purely determined by the nonlocal component of the operator: The local part of the operator does not play a role in the reduction to the weak unique continuation principle and only the weak unique continuation properties of the underlying nonlocal operator are of relevance.
In particular, this implies that, as in [9], the parabolic character of the problem at hand was not essential in the qualitative density argument. The same strategy can be pursued for more general operators, e.g. the fractional wave equation (cf. Section 5).

In analogy to the notation from control theory we use the following convention in the sequel:

Definition 2.4. Let $P_s$ for $s \in (0,1)$ be as in (2). Given a function $h \in L^2(B_1 \times (-1,1))$ and an error threshold $\epsilon > 0$, we refer to a function $f_{\epsilon,h}$ which satisfies $\|h - P_s f_{\epsilon,h}\|_{L^2(B_1 \times (-1,1))} \leq \epsilon$ as a control function for $h$ with error threshold $\epsilon > 0$. If there is no danger of confusion, we also simply refer to it as a control.

3. Propagation of smallness.
With the qualitative behaviour from the previous section at hand, we now proceed to quantitative aspects of these approximation results. We begin our analysis by deducing a central propagation of smallness property, which quantifies the weak unique continuation result used in Section 2 and provides the basis for the proof of Theorem 1. This result is stated in terms of the Caffarelli-Silvestre extension (c.f. [7]), which we have recalled in the beginning of Section 2.
We recall the following generalization of the three spheres inequality (c.f. [1] for a survey of these bounds in the case s = 1/2) and of the Lebeau-Robbiano boundary-bulk interpolation estimate (c.f. [25] for the case s = 1/2) to solutions of the degenerate elliptic equation: We have the following fractional bulk-boundary interpolation estimate due to Proposition 5.6 together with Remark 5.2 in [37], where ∈ (0, 1], µ ∈ (0, 1) and C > 1 are constants depending on n, s and W , and W/2 := {x ∈ W : dist(x, ∂W ) > (max z∈W dist(z, ∂W ))/2}. We have also written Thus, using that ϕ is a Caffarelli-Silvestre extension of ϕ, for each fixed time t ∈ (−1, 1) and each radius r with 0 < r ≤ (x 0 ) n+1 /5, we can apply the three spheres inequality from (i) in the spatial variables x = (x , x n+1 ) in the form We consider a chain of N balls, K : [37, proof of Theorem 5.5] for more details on this argument). Due to the constraint (x i ) n+1 ≥ 5r i , we note that the constant N can be chosen to be of the order where C > 1 is a constant that only depends on n, s, W and may change from line to line. Applying (13) iteratively along this chain, we infer that where ∈ (0, 1] is as in (ii), and [37,Lemma 4.5], (15) can be upgraded to read . Combining this with a simple trace estimate (using the fundamental theorem of calculus) also yields . Combining this with (ii), i.e. the analogue of the bulk-boundary interpolation estimate of Lebeau and Robbiano [25], further yields Here we have used that ϕ = 0 on W × {0} × {t}.
Integrating the square of (16) in time for t ∈ (−1, 1) and applying Hölder's inequality then gives By energy estimates for solutions to (9) (c.f. Lemma 2.1) we further have Combining this with a boundary estimate for the Caffarelli-Silvestre extension, i.e., and with equation (17), then allows us to conclude that Recalling the bound from (14) for N therefore yields the claimed inequality for ϕ.
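The effect of iterating a three spheres type inequality along a chain of $N$ balls can be summarized by an elementary recursion. The sketch below is generic: the quantities $a_i$ (local norms on the $i$-th ball), $M$ (a global norm), and the exponent $\alpha \in (0,1)$ are placeholder symbols chosen here for illustration and are not the precise quantities of the text.

```latex
% Generic recursion (placeholder notation): assume 0 \le a_i \le M and
%   a_{i+1} \le C a_i^{\alpha} M^{1-\alpha},  C \ge 1, \ \alpha \in (0,1).
\begin{align*}
a_1 &\leq C \, a_0^{\alpha} M^{1-\alpha}, \\
a_2 &\leq C \big( C a_0^{\alpha} M^{1-\alpha} \big)^{\alpha} M^{1-\alpha}
      = C^{1+\alpha} a_0^{\alpha^2} M^{1-\alpha^2}, \\
a_N &\leq C^{1 + \alpha + \dots + \alpha^{N-1}} a_0^{\alpha^N} M^{1-\alpha^N}
      \leq C^{\frac{1}{1-\alpha}} \, a_0^{\alpha^N} M^{1-\alpha^N} .
\end{align*}
```

The interpolation exponent thus degenerates geometrically, as $\alpha^N$, in the number $N$ of balls in the chain, which is why a bound on $N$ such as (14) matters for the final estimate.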
Step 2. Estimate for x 1−2s n+1 ∂ n+1 ϕ. With the strategy from Step 1 at hand, we explain the necessary modifications for the estimate for ψ(x) := x 1−2s n+1 ∂ n+1 ϕ(x). To this end we use duality, which gives that if ϕ is a solution to [7] and [6]). Thus, in the interior of the upper half-plane we can argue analogously as in Step 1 and infer that with the notation of Step 1 Spelling out the definition of ψ then yields Invoking Caccioppoli's inequality thus entails This, however, is in a form which allows us to apply the bulk-boundary interpolation estimate from (ii), whence Combining this with the energy estimate from (18) therefore leads to the desired estimate for x 1−2s n+1 ∂ n+1 ϕ.
Remark 5. The argument for Proposition 1 can be regarded as consisting of two main ingredients: On the one hand, we exploit (interior and boundary) three balls arguments and propagation of smallness properties for solutions to (11). This leads to the bound in (17) and only depends on the underlying nonlocal operator (and its localization by means of the harmonic extension). On the other hand, we combine these propagation of smallness results with a global energy estimate, cf. (18). It is only at this point that we make use of the full equation with its local and nonlocal contributions, i.e. only here is the parabolic nature of the problem exploited.

4. Proof of Theorem 1.
With the quantitative uniqueness result from Proposition 1 at hand, we now proceed to quantitative approximation results. Here we are interested in estimating the cost of approximation: More precisely, for a given function $h \in L^2(B_1 \times (-1,1))$ and an error threshold $\epsilon > 0$, we seek to derive bounds on the size of suitable norms of a possible control function $f_{\epsilon,h}$ (in dependence of suitable norms of $h$ and of $\epsilon > 0$). This will prove the main approximation result of Theorem 1. We follow the variational approach presented in [15]. We thus characterise $f_{\epsilon,h}$ in terms of the minimizer of the functional
\[
J_{\epsilon,h,s}(v) := \frac{1}{2} \int_{W \times (-1,1)} \eta^2 |(-\Delta)^s \varphi|^2 \, dx \, dt + \epsilon \|v\|_{L^2(B_1 \times (-1,1))} - \int_{B_1 \times (-1,1)} h v \, dx \, dt .
\tag{21}
\]
Here $\varphi$ and $v$ are related through (9), and $\eta \in C_c^\infty(W)$ is a cutoff function satisfying $0 \leq \eta \leq 1$ and $\eta = 1$ on $W/2 := \{x \in W : \operatorname{dist}(x, \partial W) > (\max_{z \in W} \operatorname{dist}(z, \partial W))/2\}$. If $0 < s < 1/2$ we could replace $\eta$ by the characteristic function $\chi_W$, but if $s \geq 1/2$ then $\chi_W$ is not a pointwise multiplier on $H^s(\mathbb{R}^n)$ and we need to use a smooth cutoff.
In order to prove the result of Theorem 1, we argue in three steps, which we split into three lemmata: We first show that, for a given function $h$ and an error threshold $\epsilon > 0$, a unique minimizer $\hat{v}$ of the functional (21) exists (Lemma 4.1). This is a consequence of the weak unique continuation properties of the fractional Laplacian. Secondly, if $\hat{\varphi}$ is the solution of (9) corresponding to $\hat{v}$, we argue that $f := -\eta^2 (-\Delta)^s \hat{\varphi}$ is a control for $h$ corresponding to the error threshold $\epsilon > 0$ (i.e., that it satisfies the first estimate in (3)). This follows from minimality (Lemma 4.2). Finally, in the last step (Lemma 4.3), we provide the bound on the cost of approximation (i.e., the second estimate in (3)). This relies on the estimates from Proposition 1.
Proof. It is enough to prove that $J_{\epsilon,h,s}$ is strictly convex, continuous, and coercive, since then it will have a unique minimizer (see e.g. [12, Section II.1]). The functional $J_{\epsilon,h,s}$ is convex since it is the sum of three convex functionals, and it is strictly convex since $v \mapsto \|\eta (-\Delta)^s \varphi\|^2_{L^2(W \times (-1,1))}$ is strictly convex (this uses again weak unique continuation for the fractional Laplacian). In addition, $J_{\epsilon,h,s}$ is continuous since it is the sum of three continuous functionals: The fact that $v \mapsto \|\eta (-\Delta)^s \varphi\|^2$ is continuous follows since $(-\Delta)^s \varphi$ is evaluated on $W \times (-1,1)$, where $\varphi = 0$ and where according to Lemma 2.2 strong elliptic regularization is present.
With existence of a minimizer at hand, we address the approximation property: (−1, 1)). Let J ,h,s be the functional from (21) and letv be its unique minimizer. Denote byφ the solution to (9) with inhomogeneityv, and let f := −η 2 (−∆) sφ . Then the solution u of (1) satisfies Moreover, f ∈ L 2 ((−1, 1), C ∞ c (W )) and Proof. Letv be the minimizer of the problem (21) and letφ be the corresponding solution of (9). The approximation property in (22) then follows from spelling out the minimality condition for all µ ∈ R, combined with the triangle inequality to estimate the difference of the L 2 norms and by passing to the limit µ → 0 ± . Indeed, Dividing by µ = 0 and passing to the limits µ → 0 ± , we obtain Here ϕ denotes the solution to (9) corresponding to v ∈ L 2 (B 1 × (−1, 1)). Defining f := −η 2 (−∆) sφ and denoting the associated solution to (1) by u, an analogous computation as in (12) implies that (24) turns into By duality this yields (22). One also has f ∈ L 2 ((−1, 1), C ∞ c (W )) by Lemma 2.2. We note that choosing v =v and repeating the argument leading to (23) (where one now avoids the triangle inequality) gives for |µ| small hv dx dt. Dividing by µ = 0 and letting µ → 0 ± implies that which directly leads to −1,1)) .
Finally, since 0 ≤ η ≤ 1 we have Last but not least, we estimate the cost of control. (−1, 1)). Let J ,h,s be the functional from (21) and letv be its unique minimizer. Denote byφ the solution to (9) with inhomogeneityv. Then we have that f := −η 2 (−∆) sφ satisfies, for some C and σ only depending on n, s, W , Proof. In order to finally provide the estimate on the cost of control, we consider a second functional in addition to J ,h,s (v): where, with slight abuse of notation, we write ∂ s n+1 ϕ(x, δ, t) := c s δ 1−2s ∂ n+1 ϕ| (x,δ,t) and ∂ s n+1 ϕ(x, 0, t) = c s ∂ s n+1 ϕ(x, t) (in the sense of Section 2). As in [15] we rewrite our original functional from (21) as Here we used that ϕ(x, 0, t) = ϕ(x, t) and that ϕ solves (9). If we can ensure that (27) we then obtain that Since by Lemma 4.2, f 2 L 2 (W ×(−1,1)) ≤ −2 min J ,h,s (v) = −2I 1 , this translates into It thus remains to estimate I 2 and to ensure (27). We split the argument for this into two steps.
Step 2. Ensuring (27). In order to conclude the proof of Theorem 1, it suffices to ensure that (27) is satisfied and to deduce from this the resulting requirements on and δ. To this end, we observe that where we integrated by parts. We discuss these contributions separately in the sequel.
On the one hand, the fundamental theorem of calculus yields 1,1)) .
In the last line we here used the energy estimate (18) to infer the bound On the other hand, (B1×(−1,1)) . (27), we obtain the following condition on δ, :

Thus, inserting this into
Defining δ as saturating the upper bound in this estimate and plugging it into (28) then finally results in where C > 1 and σ > 0 depend on n, s, and W .
As an immediate consequence of Lemmas 4.1-4.3 we infer the result of Theorem 1.
Remark 6. We remark that the above proof shows that the dependences on the domain W are explicit and could in principle be tracked.
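The identity relating the size of the control to the minimal value of the functional can be sketched in a few lines. The computation assumes that (21) has the form $J_{\epsilon,h,s}(v) = \frac{1}{2} \|\eta (-\Delta)^s \varphi\|^2_{L^2(W \times (-1,1))} + \epsilon \|v\|_{L^2(B_1 \times (-1,1))} - (h, v)_{L^2(B_1 \times (-1,1))}$, which is consistent with the statements above but is an assumption of this sketch; $\hat{v}$ denotes the minimizer, $\hat{\varphi}$ the associated solution of (9), and the abbreviation $A$ is introduced here only for convenience.

```latex
% Assumed form of (21); A := \|\eta(-\Delta)^s \hat\varphi\|^2_{L^2(W\times(-1,1))}.
% Since \varphi depends linearly on v, for |\mu| < 1:
\begin{align*}
J_{\epsilon,h,s}\big( (1+\mu) \hat v \big)
 &= \tfrac{1}{2} (1+\mu)^2 A
    + (1+\mu) \big( \epsilon \|\hat v\|_{L^2} - (h, \hat v)_{L^2} \big) .
\end{align*}
% Minimality of \hat v forces the derivative at \mu = 0 to vanish:
\begin{align*}
0 = A + \epsilon \|\hat v\|_{L^2} - (h, \hat v)_{L^2}
\quad\Longrightarrow\quad
\min J_{\epsilon,h,s} = J_{\epsilon,h,s}(\hat v) = \tfrac{1}{2} A - A = - \tfrac{1}{2} A .
\end{align*}
% With f := -\eta^2 (-\Delta)^s \hat\varphi and 0 \le \eta \le 1:
\begin{align*}
\|f\|^2_{L^2(W \times (-1,1))}
 \leq \|\eta (-\Delta)^s \hat\varphi\|^2_{L^2(W \times (-1,1))}
 = -2 \min J_{\epsilon,h,s} .
\end{align*}
```

In this way, an upper bound on the cost of control is equivalent to a lower bound on the minimal value of the functional, which is the quantity estimated in the proof of Lemma 4.3.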

5. Extensions to more general operators. The arguments presented in Sections 2-3 extend to a much more general class of operators. In the sequel, we briefly comment on some of these.

5.1. Qualitative approximation. As already pointed out in Remark 4 the qualitative approximation argument does not use any regularizing properties of the underlying (nonlocal) equation. It only exploits the weak unique continuation properties of the fractional Laplacian and is hence a purely nonlocal phenomenon (in the sense that the unique continuation properties of the nonlocal operator determine the approximation properties independently of which additional local contributions are involved in the equation). Provided that the associated problem is well-posed (i.e. that the boundary data are prescribed correctly), it is therefore possible to prove these qualitative approximation properties for general operators of the form $L + (-\Delta)^s$, where $L$ is an arbitrary local differential operator. This recovers (a part of) the result of [9].
In general, qualitative approximation results which are obtained by means of the Runge approximation, require two ingredients: (a) well-posedness of the underlying equation and its adjoint, (b) weak unique continuation for the associated nonlocal operator.
We again emphasize that in (b) only the weak unique continuation properties of the nonlocal operator are of relevance. As the weak unique continuation property is such a crucial ingredient, it is an interesting question to ask for which nonlocal operators it is valid. A large class of operators for which this holds is identified by Isakov:

Lemma 5.1 ([19, Lemma 3.5.4]). Let $\mu_j$, $j \in \{1, 2\}$, be measures with $\operatorname{supp}(\mu_j) \subset B_r$. Let $E \in \mathcal{S}'(\mathbb{R}^n)$. Assume that $\mathcal{F}(E)$ cannot be written as the sum of a meromorphic function (in $\mathbb{C}^n$) and a distribution supported on the zero set of some nontrivial entire function. Then if $E * (\mu_1 - \mu_2) = 0$ in $\mathbb{R}^n \setminus B_r$, we have that $\mu_1 = \mu_2$ globally.
For convenience, we recall the proof of Isakov.
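Proof. As $\mu_j$, $j \in \{1, 2\}$, and $E * (\mu_1 - \mu_2)$ are compactly supported, the Paley-Wiener theorem asserts that $\mathcal{F}(\mu_1 - \mu_2)$ and $\mathcal{F}(E * (\mu_1 - \mu_2))$ are analytic functions. But we have that
\[
\mathcal{F}(E * (\mu_1 - \mu_2)) = \mathcal{F}(E) \, \mathcal{F}(\mu_1 - \mu_2) .
\]
Thus, on the set in which the entire function $\mathcal{F}(\mu_1 - \mu_2)$ does not vanish, we have that
\[
\mathcal{F}(E) = \frac{\mathcal{F}(E * (\mu_1 - \mu_2))}{\mathcal{F}(\mu_1 - \mu_2)} .
\]
The right hand side is by definition a meromorphic function in $\mathbb{C}^n$ (and thus by [28] defines an element of $\mathcal{D}'(\mathbb{R}^n)$). As a consequence, $\mathcal{F}(E)$ can be written as
\[
\mathcal{F}(E) = \frac{\mathcal{F}(E * (\mu_1 - \mu_2))}{\mathcal{F}(\mu_1 - \mu_2)} + h,
\]
where the first term on the right hand side is a meromorphic function, while the second term $h$ is a distribution supported on the zero set of the entire function $\mathcal{F}(\mu_1 - \mu_2)$. This is a contradiction to the assumption of the lemma unless $\mathcal{F}(\mu_1 - \mu_2) = 0$ globally.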
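Due to the presence of a branch-cut, Isakov's lemma for instance applies to operators of the form $L := \sum_j a_j (-\Delta_{X_j})^{s_j}$ for $X_j \in \mathbb{R}^{m_j}$, $a_j \in \mathbb{R}$ and $s_j \in (0,1)$. In particular, these operators need not be elliptic. We will give the proof for more general operators of the form
\[
\tilde{L} := (-\Delta_{X_1})^{s_1} + m(D_{X_2}),
\tag{30}
\]
where $m(D_{X_2})$ is a Fourier multiplier in the $X_2$ variable with at most polynomial growth in Fourier space, i.e., there exists $N \in \mathbb{N}$ such that $|m(\eta)| \leq C (1 + |\eta|)^N$ for all $\eta \in \mathbb{R}^{n_2}$.

Corollary 1. Let $s_1 \in (0,1)$, $n_1, n_2 \in \mathbb{N} \cup \{0\}$, $n_1 \geq 1$ and let $m(D_{X_2})$ be a Fourier multiplier. Let $n = n_1 + n_2$ and $\varphi : \mathbb{R}^n = \mathbb{R}^{n_1} \times \mathbb{R}^{n_2} \to \mathbb{R}$, $\varphi \in H^{-s}(\mathbb{R}^n)$ for some $s > 0$, be such that for some $r > 0$
\[
\varphi = 0 \quad \text{and} \quad \tilde{L} \varphi = 0 \quad \text{in } \mathbb{R}^n \setminus B_r,
\]
where $X = (X_1, X_2) \in \mathbb{R}^{n_1} \times \mathbb{R}^{n_2}$. Then we have that $\varphi = 0$.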
Proof. Instead of reducing the corollary to the statement of Lemma 5.1, we prove it directly by a similar argument. By virtue of our assumptions and by the Paley-Wiener theorem, we first infer that the functionsφ(k, η) and (|k| 2s1 + m(η))φ(k, η) are real analytic and have entire analytic extensions into C n . With slight abuse of notation, we do not change the notation for the analytic extensions, i.e., for instance the functionφ(k, η) denotes both the original function defined on R n and its analytic extension onto C n (which of course is consistent by restriction).
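Let us next assume that the statement of the corollary were wrong, i.e. that $\varphi \not\equiv 0$ as a function on $\mathbb{R}^n$ and hence also $\hat{\varphi} \not\equiv 0$ as a function on $\mathbb{C}^n$. This implies that there exists a vector $\xi' = (k_0, \eta_0) \in \mathbb{R}^{n_1 - 1} \times \mathbb{R}^{n_2} = \mathbb{R}^{n-1}$ such that $\hat{\varphi}(k_1, \xi') \not\equiv 0$ as a function of $k_1 \in \mathbb{R}$ (and hence also as a function of $k_1 \in \mathbb{C}$). As $\hat{\varphi}$ is analytic as a function in each of its variables, this entails that $\hat{\varphi}(k_1, \xi')$, as a function on $\mathbb{C}$, only has a countable discrete set $Z \subset \mathbb{C}$ of zeroes. In particular, for each $\bar{k}_1 \in \mathbb{C}$ there exist radii $R_1(\bar{k}_1) > R_2(\bar{k}_1) > |\bar{k}_1|$ such that on the open annulus $A_{R_1(\bar{k}_1), R_2(\bar{k}_1)}(\bar{k}_1) := B_{R_1(\bar{k}_1)}(\bar{k}_1) \setminus \overline{B_{R_2(\bar{k}_1)}(\bar{k}_1)}$ centered at $\bar{k}_1 \in \mathbb{C}$ the function $\hat{\varphi}(k_1, \xi')$ does not have any zeroes and such that $A_{R_1(\bar{k}_1), R_2(\bar{k}_1)}(\bar{k}_1) \cap \mathbb{R}$ is a relatively open, nonempty set (else it would be possible to construct an accumulation point of zeroes by considering a decreasing sequence $(R_2^{(j)})$ of radii with $R_2^{(j)} \to R_1$ and by invoking the theorem of Bolzano-Weierstraß). But for some analytic function $g(k_1, \xi')$ with $\xi' = (k_0, \eta_0)$ we have that
\[
\big( (k_1^2 + |k_0|^2)^{s_1} + m(\eta_0) \big) \hat{\varphi}(k_1, \xi') = g(k_1, \xi') \quad \text{for } k_1 \in \mathbb{R} .
\]
Therefore, on the one hand, for each $\bar{k}_1 \in \mathbb{C}$ we can define an analytic continuation of the function $f(k_1) := \frac{g(k_1, \xi')}{\hat{\varphi}(k_1, \xi')}$. This defines a holomorphic function on $A_{R_1(\bar{k}_1), R_2(\bar{k}_1)}(\bar{k}_1)$. On the other hand, for the standard choice of the logarithm (where the branch cut is located on the negative real axis), the function $k_1 \mapsto e^{s_1 \log(k_1^2 + |k_0|^2)} + m(\eta_0)$ is analytic in $\mathbb{C} \setminus \{\pm i\alpha ;\ \alpha \geq |k_0|\}$ and hence this function is also obtained by analytic continuation from the restriction of $f(k_1)$ onto $A_{R_1(\bar{k}_1), R_2(\bar{k}_1)}(\bar{k}_1) \cap \mathbb{R}$. By uniqueness of the analytic extension we thus deduce that
\[
f(k_1) = e^{s_1 \log(k_1^2 + |k_0|^2)} + m(\eta_0) \quad \text{on } A_{R_1(\bar{k}_1), R_2(\bar{k}_1)}(\bar{k}_1) \setminus \{\pm i\alpha ;\ \alpha \geq |k_0|\} .
\]
But as the logarithm is discontinuous across its branch cut $\mathbb{R}_- \times \{0\} \subset \mathbb{C}$ and as $s_1 \in (0,1)$, the function $e^{s_1 \log(k_1^2 + |k_0|^2)}$ is discontinuous along the half-line $\{i\alpha ;\ \alpha \geq |k_0|\} \subset \mathbb{C}$ (cf. Figure 1). If we choose $\bar{k}_1 = i|k_0|$, this yields a contradiction to the analyticity of $f(k_1)$ on $A_{R_1(\bar{k}_1), R_2(\bar{k}_1)}(\bar{k}_1)$.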
Thus, the contradiction assumption must have been wrong and hence ϕ = 0, proving the desired result.
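The discontinuity across the branch cut that drives the contradiction can be computed explicitly. In the sketch below, $\beta > |k_0|$, $\delta \in \mathbb{R}$ and $\rho := \beta^2 - |k_0|^2 > 0$ are auxiliary symbols introduced only for this illustration, and $\log$ denotes the principal branch with cut along the negative real axis.

```latex
% Approach the point i\beta, \beta > |k_0|, horizontally: k_1 = \delta + i\beta.
\begin{align*}
k_1^2 + |k_0|^2 &= (\delta^2 - \rho) + 2 i \beta \delta,
 \qquad \rho := \beta^2 - |k_0|^2 > 0 .
\end{align*}
% For small |\delta| the real part is negative and the imaginary part has
% the sign of \delta, so the argument tends to \pm\pi as \delta \to 0^{\pm}:
\begin{align*}
\lim_{\delta \to 0^{\pm}} e^{s_1 \log(k_1^2 + |k_0|^2)}
 &= \rho^{s_1} e^{\pm i \pi s_1}, \\
\text{jump across the cut}
 &= \rho^{s_1} \big( e^{i \pi s_1} - e^{-i \pi s_1} \big)
  = 2 i \sin(\pi s_1) \, \rho^{s_1} \neq 0
 \quad \text{for } s_1 \in (0,1) .
\end{align*}
```

The jump vanishes precisely when $s_1$ is an integer, which illustrates why the argument is tied to genuinely fractional powers.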
Remark 7. We remark that technically an important ingredient in our argument was the reduction to the one-dimensional situation, which allowed us to invoke properties of holomorphic functions in a single complex variable instead of working with several complex variables.
Remark 8. The requirement $s_1 \in (0, 1)$ can be relaxed; all powers $s \in \mathbb{R}$ which ensure the presence of a branch cut for the continuation of $|\xi|^{2s}$ can be used in the argument from above.
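To make the role of the exponent explicit, the jump of the continued symbol across the cut can be computed directly. The following short computation is a sketch under the conventions of the proof of Corollary 1 (principal logarithm, a point $it$ on the imaginary axis with $t > |k_0|$); the symbols $r$ and $\pm 0$ (the one-sided limits $\operatorname{Re} k_1 \to 0^{\pm}$) are notation introduced here for illustration only.

```latex
\begin{align*}
k_1 = it \pm 0, \ t > |k_0|
  \quad &\Longrightarrow \quad
  k_1^2 + |k_0|^2 = -r \pm i0, \qquad r := t^2 - |k_0|^2 > 0, \\
e^{s \log(-r \pm i0)} &= r^{s} e^{\pm i \pi s}, \\
e^{s \log(-r + i0)} - e^{s \log(-r - i0)} &= 2 i \, r^{s} \sin(\pi s).
\end{align*}
```

Under these conventions the jump vanishes exactly when $s \in \mathbb{Z}$, which suggests that every non-integer real power produces the discontinuity used in the argument.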
As discussed in [19], Lemma 5.1 does not only apply to the specific class of nonlocal operators from (30), but also to other interesting operators.
If the underlying equations are well-posed, the Runge-type arguments from above yield for instance the following corollary:
Corollary 2. Let $s_1 \in (0, 1)$, $n, n_1, n_2 \in \mathbb{N}$ with $n = n_1 + n_2$ and let $B_1 \subset \mathbb{R}^{n_1} \times \mathbb{R}^{n_2}$. Let $\tilde{L}$ be as in (30) where the Fourier multiplier $m$ is real, i.e. $m(\eta) \in \mathbb{R}$ for all $\eta \in \mathbb{R}^{n_2}$. Assume that for some $s > 0$, $\tilde{L}$ is bounded $H^s(\mathbb{R}^n) \to H^{-s}(\mathbb{R}^n)$, and that the problem $\tilde{L}$
Then we have that for any $R > 1$ the set
Proof. Arguing similarly as in Theorem 2.3, by a Hahn-Banach argument, the density result reduces to the weak unique continuation property of the nonlocal operator $\tilde{L}$ in $\mathbb{R}^n \setminus B_R$. This however follows from Corollary 1.
More precisely, we show that if $v \in L^2(B_1)$ is such that $(P_{\tilde{L}} f, v)_{L^2(B_1)} = 0$ for all $f \in C_c^\infty(\mathbb{R}^n \setminus B_R)$, then necessarily $v = 0$. Indeed, using the assumed well-posedness, we define $h \in \tilde{H}^s(\mathbb{R}^n)$ by the requirement
Then,
Here we used that $P_{\tilde{L}} f - f \in \tilde{H}^s(B_1)$ and $h \in H^{-s}(\mathbb{R}^n)$, that $\tilde{L}$ is bounded $H^s \to H^{-s}$, and that $m(D_{X_2})$ is self-adjoint. As a consequence, we infer that
Remark 9. Assuming the validity of the corresponding well-posedness theory and further supposing that the local and nonlocal contributions act in different variables, it is straightforward to extend the statement of Corollary 2 to a combination of local and nonlocal operators. This follows by observing that as in the case of the fractional heat equation, the local terms "disappear" on the right hand side of the analogue of the duality argument outlined in (31). As in the setting of the heat equation, the variables on which the local operators act are then simply treated as parameters in the unique continuation properties of the nonlocal operators.

5.2. Further constant coefficient operators.
In contrast to the discussion on qualitative approximation in Section 2, the arguments on the quantitative approximation in Section 3 also relied on properties of the underlying operator (including the local terms). Here we made use of two main ingredients. We combined
• quantitative weak unique continuation properties (where the main thrust originated from the nonlocal part of the operator),
• with specific (regularity) properties of the full underlying operator, in the form of (global) energy estimates, cf. (18) and Remark 5. These properties are for instance reflected in the respective norms of $h$, which arise in the estimate on the cost of approximation (cf. the bounds in Step 3 in the proof of Theorem 1).
While this entails that, in contrast to the qualitative approximation properties, their quantitative counterparts depend more delicately on the structure of the underlying operator (also on the elliptic/parabolic/hyperbolic nature of the local part of the operator), the overall strategy of proof is very robust. It can be applied to a large class of equations, including elliptic/parabolic/hyperbolic ones. To illustrate this, we remark that analogous arguments as outlined above with the same energy functional (21) (but where $\phi$ now solves the dual problem for the fractional wave equation) lead to quantitative approximation properties for the fractional wave equation
$$(\partial_t^2 + (-\Delta)^s) u = 0 \ \text{ in } B_1 \times (-1, 1), \qquad u = f \ \text{ in } (\mathbb{R}^n \setminus B_1) \times (-1, 1), \tag{32}$$
Here $W \subset \mathbb{R}^n \setminus B_1$ is a bounded Lipschitz set. Arguing by a Galerkin approximation, this problem is well-posed. We consider the Poisson operator for (32),
$$P_s^w : L^2((-1, 1), C_c^\infty(W)) \to L^2(B_1 \times (-1, 1)), \qquad f \mapsto P_s^w f = u|_{B_1 \times (-1,1)}.$$
In the setting of the wave equation the energy estimates replacing (18) read
$$\|\partial_t \phi\|_{L^2(B_1)} + \|\phi\|_{H^s(B_1)} \leq C \|v\|_{L^2((-1,1), L^2(B_1))}.$$
We remark that the control $f$ can be obtained by a similar minimization problem as the one in (21).

5.3. Variable coefficient operators.
Last but not least, we emphasize that the described techniques permit us to deal with variable coefficient perturbations of the local and nonlocal parts of the operator (cf. also the recent article [16] for qualitative statements). Here the variable coefficient nonlocal operators can for instance be understood as in [39], [8]. As in [35], Section 6, and [36], Section 4, the corresponding estimates carry over to this regime, if the coefficients are suitably regular (cf. Section 6 in [35] or also [40] for weak and strong unique continuation properties of the variable coefficient fractional Laplacian and the associated necessary regularity assumptions on the coefficients).
Proof. We only give a sketch of the argument, as there are no major changes with respect to the proof of Theorem 1. For the qualitative approximation property, it suffices to note that the crucial identity
$$(v, P_{s,a^{ij}} f)_{L^2(B_1 \times (-1,1))} = -((-\Delta_{a^{ij}})^s \phi, f)_{L^2(\mathbb{R}^n)}$$
remains valid. This can for instance be inferred from the extension definition of the operator.

Next we note that the quantitative propagation of smallness result, which is based on three balls and boundary-bulk interpolation arguments, is also true in this setup. This then allows us to argue variationally as previously. Here we consider the functional
where $\phi$ and $v$ are related through
$$(-\partial_t + (-\Delta_{a^{ij}})^s) \phi = v \ \text{ in } B_1 \times (-1, 1),$$
This then concludes the argument.