FINITE ELEMENT APPROXIMATION OF SPARSE PARABOLIC CONTROL PROBLEMS

Abstract. We study the finite element approximation of an optimal control problem governed by a semilinear partial differential equation whose objective functional includes a term promoting spatial sparsity of the solutions. We prove existence of a solution in the absence of control bound constraints and provide adequate second order sufficient conditions to obtain error estimates. A full discretization of the problem is carried out, and the sparsity properties of the discrete solutions, as well as error estimates, are obtained.

1. Introduction. Throughout this paper, Ω denotes an open, bounded subset of R^n, 1 ≤ n ≤ 3, with boundary Γ, and 0 < T < +∞ is fixed. We set Q = Ω × (0, T) and Σ = Γ × (0, T). The control problem is defined as (P): min_{u ∈ L^∞(Q)} J(u), where J(u) = F(u) + µ j(u) with µ > 0. For every u ∈ L^∞(Q), we denote by y_u the solution of

∂_t y + Ay + a(x, t, y) = u in Q,   y = 0 on Σ,   y(0) = y_0 in Ω.   (1)

Here, A is a linear elliptic operator. Our objective in this work is to study the finite element discretization of the problem: we describe the sparsity pattern of the discrete solutions, prove convergence and provide error estimates. The first application of sparsity-promoting L^1 terms to optimal control problems was made in [17] for control problems governed by linear elliptic equations. Finite element discretization and error estimates for such a problem were obtained in [18], also for linear elliptic equations. The semilinear case was treated in [6] for piecewise constant approximations of the control and in [5] for continuous piecewise linear approximations. In [2, 3, 15] the case of measure controls for problems governed by linear elliptic equations is studied.
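The displays defining F and j did not survive extraction. A form consistent with the rest of the text (a quadratic tracking functional with Tikhonov parameter ν > 0, which appears in the proof of Theorem 4.2, and a directional sparsity term with constant-in-time pattern, cf. [7, 11]) would be the following sketch; the concrete y_d, ν and A are as defined in the original article:

```latex
F(u) = \frac12 \int_Q (y_u - y_d)^2 \,\mathrm{d}x\,\mathrm{d}t
     + \frac{\nu}{2} \int_Q u^2 \,\mathrm{d}x\,\mathrm{d}t,
\qquad
j(u) = \int_\Omega \Big( \int_0^T u(x,t)^2 \,\mathrm{d}t \Big)^{1/2} \mathrm{d}x
     = \| u \|_{L^1(\Omega; L^2(0,T))}.
```

The L^1(Ω)-type norm in j is what forces the support of u(x, ·) to be small as a subset of Ω, uniformly in time.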
In [11] directional sparsity is introduced and an application to problems governed by linear parabolic equations is considered. In a similar framework, measure-valued controls are considered in [4, 9, 10, 12] for a problem governed by a linear parabolic equation. The measures used in [12] promote, as in the work at hand, a constant-in-time sparsity pattern; a finite element approximation is studied and error estimates for the approximation of the states are provided.
The control of semilinear parabolic equations with measures is quite delicate due to the possible non-existence of solutions of the partial differential equation; see [8] for a discussion of this topic for semilinear elliptic equations. To avoid this difficulty, we use function-valued controls for the nonlinear equation.
The plan of the paper is as follows. At the end of this section the main assumptions are introduced. In Section 2 we recall results about the existence and uniqueness of solution of the state equation and the differentiability properties of the control-to-state mapping and cost functional. Next, in Section 3, we prove existence of solution of the control problem, write the first order necessary optimality conditions and show the regularity and sparsity properties of the optimal controls. Since we are not imposing any bound constraints on the control, existence of solution of problem (P) cannot be deduced by the direct method of calculus of variations as usual, so we employ a truncation method; see Theorem 3.2.
In Section 4 we investigate second order optimality conditions. First and second order necessary and sufficient optimality conditions for control problems governed by semilinear parabolic equations with a sparsity-promoting term in the objective functional have recently been studied in [7]. Three different cases are described in that work, each promoting a particular type of sparsity: global sparsity, spatial sparsity whose pattern changes with time, and spatial sparsity whose pattern is constant in time. We are interested in this last case. In [7, Theorem 4.12] the authors prove that, under adequate second order conditions, the critical point is a strict local minimum in the L^∞(Ω; L²(0, T)) sense. This result is not enough to derive error estimates for the numerical approximation of the control problem. The argument we use in Lemma 5.5 to show the existence of a sequence of local minima of the discretized problems converging strongly in L²(Q) to a strict local minimum of the continuous problem would be incorrect in L^∞(Ω; L²(0, T)). To overcome this difficulty, we prove in Theorem 4.2 that, under the same second order sufficient conditions, the critical point is also a strict local minimum in the L²(Q) sense.
Finally, in Section 5, we fully discretize the problem using, in space, continuous piecewise linear elements for the state and piecewise constant approximations for the control and, in time, piecewise constant functions for both variables. We show that the discrete optimal controls follow a sparsity pattern similar to the one obtained for the continuous ones, and we prove convergence and an error estimate of order O(√τ + h) in the L²(Q) norm of the control variable, where τ denotes the step size in time and h is the mesh size in space. Finally, two numerical experiments are included in Section 6. In the first one we investigate the experimental order of convergence and compare it with our theoretical results, and in the second one we illustrate the directional sparsity properties of the solution of (P).
The study of approximations of the control by means of continuous piecewise linear functions in space will be done in a forthcoming paper.
We make the following assumptions.
2. Analysis of the state equation and the objective functional. Next we describe the differentiability properties of the control-to-state mapping and later we analyze the cost functional. The next results are quoted from [7].
for a.a. x ∈ Ω_ū and t ∈ (0, T), where

3. Existence of a solution for (P), first order optimality conditions and regularity of the optimal controls. The absence of control bounds leads to some difficulties regarding the existence of optimal controls for (P). We cannot apply the usual direct approach to prove existence of a solution of (P), because we cannot conclude the boundedness in L^∞(Q) of a minimizing sequence. Alternatively, we could have set the problem in L²(Q), but in this case Theorems 2.1 and 2.2 would not apply. Instead, we introduce an auxiliary problem with control bound constraints to prove existence of a solution of (P). Associated to this set, we have the problem (P_M). Existence of a solution ū_M for problem (P_M) is standard, see [7, Theorem 1.4], and the following first order optimality conditions are satisfied.
The proof is standard and can be found in [7, Theorem 2.1]. The projection formula for ū follows in a standard way from (15). Next, we prove existence of a solution for (P).
Consequently, for every M ≥ C_∞, any solution ū_M of (P_M) is also a solution of (P).
Proof. Using the optimality of ū_M we have J(ū_M) ≤ J(0), where ỹ denotes the state associated to the control u ≡ 0. Subtracting a(x, t, 0) on both sides of the PDE in (13), multiplying by ȳ_M, integrating from 0 to t, and using the monotonicity of a(x, t, ·), we obtain by means of the Cauchy–Schwarz and Friedrichs inequalities that there exists C_Ω > 0 such that the stated estimate holds, where Λ is the coercivity constant of the operator described in (2). Rearranging and using (17), we obtain the first bound. Using the variational inequality (15) and the equality in (11), it can be easily checked that the second bound follows. Hence, using (19), the first claim holds, and the proof is complete.
To end this section, we describe the sparsity properties of optimal controls, as well as their regularity.
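The statement of the sparsity result is elided in this excerpt (it is the property referred to as (21) in Section 5). Writing φ̄ for the adjoint state and λ̄ ∈ ∂j(ū) for the subgradient in the first order conditions, a characterization consistent with the constant-in-time setting and with [7, Remark 2.10] would read (this is our reconstruction, not a quotation):

```latex
\bar u(x,\cdot) \equiv 0
\;\Longleftrightarrow\;
\|\bar\varphi(x,\cdot)\|_{L^2(0,T)} \le \mu ,
\qquad
\bar u = -\frac{1}{\nu}\,(\bar\varphi + \mu \bar\lambda) \ \text{ in } Q .
```

The first equivalence is the directional sparsity property: the spatial support of ū is determined by where the L²(0, T) norm of the adjoint state exceeds the threshold µ.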

4. Second order conditions. In this section, we provide necessary and sufficient second order optimality conditions. First let us introduce the cone of critical directions C_ū.

Proposition 2. The set C_ū is a closed, convex cone in L²(Q).
The proof of this proposition can be found in [7, Proposition 3.1] and is based on the observation that We define The expression j''(u; v²) is just notation; it does not mean that there exists a second derivative in the direction v. In fact, the integral above could be +∞ in some cases. Observe that the integral is well defined because the integrand on Ω_u is nonnegative, which can be proved easily with the Schwarz inequality. In the sequel we denote J''(u; v²) = F''(u)v² + µ j''(u; v²). Necessary conditions are a consequence of [7, Theorem 3.3, Case III].
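The display defining j''(u; v²) is missing here. For the directional sparsity functional j(u) = ∥u∥_{L¹(Ω;L²(0,T))}, a formula consistent with the remark above (the integrand on Ω_u is nonnegative by the Schwarz inequality) would be, as a sketch:

```latex
j''(u; v^2) = \int_{\Omega_u} \frac{1}{\|u(x,\cdot)\|_{L^2(0,T)}}
\left[ \|v(x,\cdot)\|_{L^2(0,T)}^2
- \left( \frac{\big( u(x,\cdot), v(x,\cdot) \big)_{L^2(0,T)}}{\|u(x,\cdot)\|_{L^2(0,T)}} \right)^{\!2}\, \right] \mathrm{d}x .
```

Indeed, the Cauchy–Schwarz inequality (u, v)² ≤ ∥u∥² ∥v∥² shows that the bracket is nonnegative, as claimed.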
Theorem 4.1. Let ū be a local minimum of (P).
Sufficient conditions are nevertheless different from [7, Theorem 4.12], since in that reference local optimality is proved in L^∞(Ω; L²(0, T)), whereas we are able to prove local optimality in L²(Q). This is essential to prove error estimates for finite dimensional approximations of (P); see Lemma 5.6 below.
Theorem 4.2. Let ū satisfy the first order optimality conditions given by Theorem 3.1 and be such that J''(ū; v²) > 0 for all v ∈ C_ū \ {0}. Then, there exist ε > 0 and δ > 0 such that (30) holds.

Proof. If (30) does not hold, then for any integer k ≥ 1 there exists an element w_k ∈ L^∞(Q) violating it. Since ∥w_k(x, ·) − ū(x, ·)∥_{L²(0,T)} → 0 in L²(Ω), we can extract a subsequence, denoted in the same way, such that ∥w_k(x, ·) − ū(x, ·)∥_{L²(0,T)} → 0 for almost all x ∈ Ω. Then, from Egorov's theorem we deduce the existence of a subsequence {w_{j_k}}_{k=1}^∞ and a sequence {Ω_k}_{k=1}^∞ of measurable subsets of Ω such that (31) holds. Moreover, j_k can be chosen so that j_k > 2k. Then, setting u_k = w_{j_k}, we get (34), which proves (32). We can extract a subsequence, denoted in the same way, so that v_k ⇀ v weakly in L²(Q). The proof is split into three steps.
Step I. v ∈ C_ū. Using that v ↦ j'(ū; v) is convex and continuous, we obtain an inequality whose last equality is an immediate consequence of the definition of v_k. From this inequality, (32) and (33), together with (28), we deduce that v ∈ C_ū.

Step II. v = 0. For β > 0 small we define the sets Ω_{β,k} and the functionals j_{β,k}, and invoke Lemma 4.3. Since ∥ū(x, ·)∥_{L²(0,T)} ≥ β > 0 for every x ∈ Ω_{β,k}, the functional j_{β,k} is infinitely differentiable. Making a Taylor expansion we get an identity in which, by relation (32) and the definition of v_k, the integrals are finite for every k ≥ 2/β. Now, using the convexity of the mapping f ↦ ∥f∥_{L²(0,T)} and (33), we get an estimate in which u_{θ_k} = ū + θ_k ρ_k (u_k − ū) with 0 ≤ θ_k ≤ 1. We deduce from (28) a further inequality; dividing it by ρ_k²/2 we obtain (35). From [7, Lemma 4.2] and the identity ∥v_k∥_{L²(Q)} = 1 we can handle the first term of (35). Let us estimate the second term of (35). Using Hölder's inequality, the expression of j'''_{β,k}(u_{θ_k}; v_k³), the fact that ∥u_{θ_k}(x, ·)∥_{L²(0,T)} ≥ β/2 for every k large enough, (32), and ∥v_k∥_{L²(Q)} = 1, we obtain the required bound, and we conclude that v = 0.
Step III. Contradiction. Since v = 0, z_{v_k} → 0 strongly in L²(Q). Hence, from the expression of F given by (9) and the identity ∥v_k∥_{L²(Q)} = 1, we obtain in the limit a contradiction with the assumption ν > 0.
5. Numerical approximation. Next, we study the approximation of (P) using finite elements. The goal of this section is to show not only convergence of the solutions of the discrete problems to solutions of (P), but also how the sparsity structure of an optimal control (cf. (21)) is inherited by the discrete optimal controls. Both the state and the control will be discretized. In both cases, we will use piecewise constant functions in time, but in space we will use continuous piecewise linear functions for the state and piecewise constant functions for the control. Finally, error estimates are derived. The study of approximations of the control by means of continuous piecewise linear functions will be done in a forthcoming paper. Throughout this section we will assume that Ω is a convex set. We consider, cf. [1, definition (4.4.13)], a quasi-uniform family of triangulations {K_h}_{h>0} of Ω̄ and a quasi-uniform family of partitions of size τ of [0, T], 0 = t_0 < t_1 < · · · < t_{N_τ} = T. We will denote Ω_h = int ∪_{K∈K_h} K, by N_h and N_{I,h} the number of nodes and interior nodes of K_h, I_j = (t_{j−1}, t_j), τ_j = t_j − t_{j−1}, τ = max{τ_j} and σ = (h, τ). We assume that every boundary node of Ω_h is a point of Γ. Additionally, we suppose that dist(x, Γ) ≤ C_Γ h² for every x ∈ Γ_h = ∂Ω_h, which is always satisfied if n = 2 and Γ is of class C²; see, for instance, [16, Section 5.2]. Under this assumption we have that |Ω \ Ω_h| ≤ C h², where | · | denotes the Lebesgue measure. In the sequel we denote Q_h = Ω_h × (0, T). Now we consider the finite dimensional spaces Y_h and Y_σ. The elements of Y_σ can be written in nodal form, where y_{h,j} ∈ Y_h for j = 1, . . . , N_τ, y_{i,j} ∈ R for i = 1, . . . , N_{I,h} and j = 1, . . . , N_τ, {e_i} is the nodal basis associated to the interior nodes {x_i}_{i=1}^{N_{I,h}} of the triangulation and χ_j denotes the characteristic function of the interval I_j = (t_{j−1}, t_j).
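The displays defining the discrete spaces are lost in this extraction; from the surrounding description (continuous piecewise linear in space, piecewise constant in time, interior nodal basis {e_i}) they presumably take the standard form:

```latex
Y_h = \{\, y_h \in C(\bar\Omega_h) : y_h|_K \in \mathcal{P}_1 \ \forall K \in \mathcal{K}_h,\ y_h = 0 \text{ on } \Gamma_h \,\},
\qquad
Y_\sigma = \Big\{\, y_\sigma : y_\sigma(x,t) = \sum_{j=1}^{N_\tau} y_{h,j}(x)\,\chi_j(t),\ y_{h,j} \in Y_h \,\Big\},
```

so that every y_σ ∈ Y_σ can be written as y_σ(x, t) = Σ_{i=1}^{N_{I,h}} Σ_{j=1}^{N_τ} y_{i,j} e_i(x) χ_j(t).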
For every u ∈ L^∞(Q_h), we define its associated discrete state as the unique element of Y_σ satisfying (40), where, for all y, z ∈ H¹(Ω_h), the discrete bilinear form is as defined there. From a computational point of view, this scheme can be interpreted as an implicit Euler discretization of the system of ordinary differential equations obtained after spatial finite element discretization. Using the monotonicity of the nonlinear term a(x, t, y), the proof of the existence and uniqueness of a solution of (40) is standard.
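Scheme (40) itself is not reproduced in this excerpt, but its structure (implicit Euler applied to the spatially semidiscrete system) can be illustrated. The following sketch is for the one-dimensional linear case a(x, t, y) ≡ 0 with A = −d²/dx² on Ω = (0, 1); both choices are our illustrative assumptions, not the paper's general setting. Each time step solves (M + τK) yʲ = M yʲ⁻¹ + τ M uʲ with the P1 mass matrix M and stiffness matrix K:

```python
import numpy as np

def heat_fem_backward_euler(n_cells=32, n_steps=64, T=1.0, u=None, y0=None):
    """P1 finite elements in space, implicit (backward) Euler in time,
    for d_t y - y_xx = u on (0,1) x (0,T) with y = 0 on the boundary."""
    h, tau = 1.0 / n_cells, T / n_steps
    x = np.linspace(0.0, 1.0, n_cells + 1)[1:-1]   # interior nodes
    m = n_cells - 1
    e = np.ones(m - 1)
    # P1 mass matrix M and stiffness matrix K on a uniform mesh
    M = h / 6.0 * (4.0 * np.eye(m) + np.diag(e, 1) + np.diag(e, -1))
    K = 1.0 / h * (2.0 * np.eye(m) - np.diag(e, 1) - np.diag(e, -1))
    y = np.zeros(m) if y0 is None else y0(x)
    A = M + tau * K                                # implicit Euler system matrix
    for j in range(1, n_steps + 1):
        f = np.zeros(m) if u is None else u(x, j * tau)
        y = np.linalg.solve(A, M @ y + tau * (M @ f))
    return x, y
```

Since the scheme is implicit, no CFL-type restriction on τ is needed; with u ≡ 0 the discrete solution decays, mirroring the parabolic smoothing of (1).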
Assuming that Ω ⊂ R², it is proved in the work by I. Neitzel and B. Vexler [14] that there exist h_0 > 0 and τ_0 > 0 such that the error estimate (41) holds.

Remark 3. In the aforementioned reference, the estimate is obtained for n = 2, a polygonal domain and quadrilateral elements. The adaptation of the proofs to convex domains and triangular elements, or to n = 1, is straightforward. An extension to n = 3 is also possible and is currently being written by D. Meidner and B. Vexler.
To discretize the controls, we will use piecewise constant functions; consider the corresponding space U_σ. The elements of U_σ can be written in the same nodal form. We formulate the discrete problem (P_σ) and define j_σ : U_σ → R accordingly.

EDUARDO CASAS, MARIANO MATEOS AND ARND RÖSCH
The existence of a solution of problem (P_σ) is an obvious consequence of the continuity and the coercivity of J_σ on the finite dimensional space U_σ. Under Assumptions 1–2, F_σ : L^p(0, T; L^q(Ω_h)) → R is of class C². Moreover, for every u, v ∈ L^p(0, T; L^q(Ω_h)), the derivative formulas hold, where, for every u ∈ L^p(0, T; L^q(Ω_h)), φ_σ(u) ∈ Y_σ is its associated discrete adjoint state, which can be written in nodal form and satisfies the discrete adjoint equations. For every u_σ ∈ U_σ, the sets K_σ and K⁰_σ are defined accordingly. Notice that if we define Ω_{h,u_σ} and Ω⁰_{h,u_σ} as we did in Proposition 1, using the set Ω_h instead of the set Ω, we have that Ω_{h,u_σ} = int ∪_{K∈K_σ(u_σ)} K and Ω⁰_{h,u_σ} = ∪_{K∈K⁰_σ(u_σ)} K. We have that λ_σ ∈ ∂j_σ(u_σ) ⊂ U_σ if and only if the discrete subdifferential conditions hold. The directional derivative of j_σ at a point u_σ ∈ U_σ in the direction v_σ ∈ U_σ can be written explicitly. In the sequel we denote J'_σ(u_σ; v_σ) = F'_σ(u_σ)v_σ + µ j'_σ(u_σ; v_σ). We also define π_h : L¹(Ω) → U_h. With P_τ we denote the space of piecewise constant functions associated with the temporal grid {t_0, t_1, . . . , t_{N_τ}}, and the projection operator π_τ : L²(0, T) → P_τ is defined analogously. Then we have π_τ π_h u = π_h π_τ u ∈ U_σ for all u ∈ L¹(Ω; L²(0, T)). We also have that π_τ ∘ π_h : L²(Q) → U_σ is the L²(Q) projection operator.
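On piecewise constant functions, the L² projections π_h and π_τ reduce to cell averaging, and the stated commutation π_τ π_h = π_h π_τ is then averaging over disjoint index blocks. The following numpy sketch is a discrete stand-in, not the paper's operators: the control is represented by its values on a fine tensor grid, rows indexing space and columns indexing time:

```python
import numpy as np

def pi_h(U, r):
    """Average the rows of U (space axis) in blocks of r fine cells per
    coarse cell: the discrete analogue of the L^2 projection pi_h onto
    spatially piecewise constant functions."""
    m, n = U.shape
    return U.reshape(m // r, r, n).mean(axis=1)

def pi_tau(U, s):
    """Average the columns of U (time axis) in blocks of s fine steps per
    coarse step: the discrete analogue of pi_tau."""
    m, n = U.shape
    return U.reshape(m, n // s, s).mean(axis=2)
```

Because the two averages act on disjoint axes, pi_tau(pi_h(U, r), s) and pi_h(pi_tau(U, s), r) coincide, mirroring π_τ π_h u = π_h π_τ u.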
Theorem 5.1. If ū_σ is a local solution of (P_σ), then there exist ȳ_σ = y_σ(ū_σ), the associated adjoint state φ̄_σ, and λ̄_σ ∈ ∂j_σ(ū_σ) satisfying the first order optimality system.

Proof. First order optimality conditions follow in a standard way from the convexity of j_σ, the definition of the subdifferential and the expression for the derivative of F_σ.

5.1. Sparsity properties. Before proving error estimates, we will show that the discrete optimal controls exhibit a sparsity pattern similar to that of the solutions of problem (P). Let us introduce the following notation.

Theorem 5.2. If ū_σ is a local solution of (P_σ), then the discrete sparsity characterization holds, and λ̄_σ is unique for given ū_σ.

5.2. Convergence and error estimates. We will show that the solutions of the discretized problems converge strongly to solutions of problem (P) in L²(Q). Next, we show a kind of converse of this result: strict local solutions of (P) can be approximated by solutions of the discretized problems. Finally, we are able to establish an order of convergence for these approximations. Throughout this section we will assume n ≤ 2, since we use several results from [14]. Nevertheless, B. Vexler has recently proved that the stability results and the error estimates also hold for Ω ⊂ R³. A paper with the details of the proof is in preparation. Using his results we can extend the analysis of this section to the three-dimensional case.
First of all, we need to show boundedness of the discrete optimal controls in the adequate norm.

Lemma 5.3. Let ū_σ be a local solution of (P_σ). Then there exists C_∞ > 0 independent of σ such that ∥ū_σ∥_{L^∞(0,T;L²(Ω_h))} ≤ C_∞.

Proof. The result follows from a bootstrapping argument using the stability results in [14]. First, we have a bound in terms of y_σ(0), the discrete state related to the control u_σ ≡ 0. Now, from the classical stability estimate (see, for instance, the second part of [14, Theorem 4.1]), we have that there exists C_2 > 0 independent of σ such that ∥ȳ_σ∥_{L^∞(0,T;L²(Ω_h))} ≤ C_2.
Analogously, from the discrete adjoint state equation we deduce the existence of a constant C 3 > 0 independent of σ such that and hence, taking into account that π h is a projection in L 2 (Ω h ) and (46), we get and the result follows for C ∞ = C 3 /ν.

Remark 4.
If we further suppose that y_d ∈ L^p(Q) for some p > n, a slight modification of the proof of the previous lemma allows us to conclude, using [14, Theorems 3.1 and 4.1], that there exists some µ_c > 0 independent of h such that ∥φ̄_σ∥_{L^∞(Q_h)} ≤ µ_c. Using this, (45), and the fact that ∥π_h φ̄_σ∥_{L^∞(Q_h)} ≤ ∥φ̄_σ∥_{L^∞(Q_h)}, we can deduce the existence of a critical value µ_c such that ū_σ ≡ 0 for all µ > µ_c. For the analogous property of the continuous solution, see [7, Remark 2.10].
Lemma 5.4. Let {ū_σ}_σ be a sequence of solutions of (P_σ) with σ → (0, 0). Then there exist subsequences of {ū_σ}_σ, still denoted in the same way, converging weakly* in L^∞(0, T; L²(Ω)). If ū_σ ⇀* ū in L^∞(0, T; L²(Ω)), then ū is a solution of (P), lim_{σ→(0,0)} J_σ(ū_σ) = J(ū) = inf (P) and lim_{σ→(0,0)} ∥ū_σ − ū∥_{L²(Q)} = 0. Since u_σ is not defined on all of Q, we have to specify what we mean when we say that u_σ converges weakly* to u in L^∞(0, T; L²(Ω)). Notice that, since we suppose that |Ω \ Ω_h| → 0, this is the same as saying that the extension of u_σ to Q \ Q_h by a function in L^∞(Q) converges weakly* to u. In the following proof, we will consider that the elements of U_σ are extended, for instance, by zero to (0, T) × (Ω \ Ω_h).
Proof. From Lemma 5.3 we know that {ū_σ}_σ is bounded in L^∞(0, T; L²(Ω_h)). We can extract a subsequence, still denoted in the same way, such that ū_σ ⇀* ū in L^∞(0, T; L²(Ω)). We are going to prove that ū is a solution of (P). Let ũ be a solution of (P) and let u_σ be its projection onto U_σ in the L²(Q) sense. Denoting ȳ = y_ū, we have that ū_σ ⇀* ū in L^∞(0, T; L²(Ω)) implies ū_σ ⇀ ū weakly in L²(Q) and y_{ū_σ} → ȳ in L²(Q); see Theorem 2.1. On the other hand, (41) implies that y_σ(ū_σ) − y_{ū_σ} → 0 in L²(Q), so y_σ(ū_σ) → y_ū in L²(Q). This leads to J(ū) ≤ lim inf_{σ→(0,0)} J_σ(ū_σ), where we have used the weak lower semicontinuity of the control cost terms in J_σ. Let us now prove the strong convergence of the optimal controls in L²(Q). We have just proved that J_σ(ū_σ) → J(ū). This, together with the strong convergence ȳ_σ → ȳ, implies the convergence of the control cost terms. On the other hand, using the convexity of j(u) and the weak convergence ū_σ ⇀ ū, we have that j(ū) ≤ lim inf_{σ→(0,0)} j(ū_σ). Combining (50) and (51), we readily deduce the strong convergence in L²(Q).
In the following we will extend the elements of U_σ by ū in Q \ Q_h, where ū is a fixed local solution of (P). Notice that, using the sparsity property (21) of the control and the zero boundary condition of the adjoint state equation, we have that, for h > 0 small enough, ū = 0 in Q \ Q_h.
Proof. Suppose now that ū is a strict local minimum of (P). This means that there exists ε_0 > 0 such that ū is the unique solution of the problem (P^{ε_0}). Associated to this problem, we consider (P^{ε_0}_σ). Let u_σ = π_τ π_h ū be the projection of ū onto U_σ in the L²(Q_h) sense. We extend u_σ to Q by taking u_σ(x, t) = ū(x, t) in Q \ Q_h. Since u_σ → ū in L²(Q), there exist h_1 > 0 and τ_1 > 0 such that u_σ ∈ U_σ ∩ B_{ε_0}(ū); hence this set is not empty for every h < h_1, τ < τ_1, and therefore (P^{ε_0}_σ) has a solution ū_σ. Moreover, from the definition of the projection we infer that ∥u_σ∥_{L^∞(Q)} ≤ ∥ū∥_{L^∞(Q)}. Now let us consider a subsequence, still denoted in the same way, converging weakly in L²(Q) to ũ. Arguing as in the proof of Lemma 5.4, we have that ũ is a solution of (P^{ε_0}) and the convergence is strong. Since ū is the unique solution of this problem, we have that ũ = ū. Since all convergent subsequences converge to the same limit, the whole sequence converges to ū. Finally, this strong convergence implies that there exist h_0 > 0 and τ_0 > 0 such that ū_σ ∈ B_{ε_0}(ū) for every h < h_0, τ < τ_0, and therefore ū_σ is also a local solution of (P_σ).
Theorem 5.7. Let ū be a solution of (P) such that J''(ū; v²) > 0 for all v ∈ C_ū \ {0}, let ū_σ be the solution of (P_σ), and let τ_0 and h_0 be as described in Lemma 5.5. Let us assume that there exists h_1 > 0 such that y_d ∈ L^∞(Q \ Q_h) for all h ≤ h_1. Then, for every h ≤ min{h_1, h_0} and every τ < τ_0, the error estimate holds.

Proof. Using Lemma 5.6, we have to estimate J(ū_σ) − J(ū). We split this difference into the parts (52)–(55). We choose u_σ = π_τ π_h ū, the L²(Q_h) projection of ū onto the space of piecewise constant functions. We extend u_σ to Q by taking u_σ(x, t) = ū(x, t) in Q \ Q_h. We also recall that ∥u_σ∥_{L^∞(Q)} ≤ ∥ū∥_{L^∞(Q)}. By optimality we have the bound for (53). To obtain the estimates for the terms in (52) and (54), we use the assumption y_d ∈ L^∞(Q \ Q_h), the existence of C > 0 independent of σ such that ∥y_{ū_σ}∥_{L^∞(Q\Q_h)} + ∥y_{u_σ}∥_{L^∞(Q\Q_h)} ≤ C, and assumption (39), together with estimate (41). It remains to estimate the term (55).
Hence, we finally find with Lemma 5.6

Remark 5.
It remains an open question whether our error estimate O(√τ + h) is sharp. Several facts suggest that the order of convergence should be O(τ + h): the finite element error for the state equation is O(τ + h²); the H¹(Q) regularity of the optimal controls implies that they can be approximated by elements of U_σ with an approximation error O(τ + h) (using L²(Q) projections, for instance); the experimental order of convergence found in our numerical experiments also supports this idea; finally, the available error estimate in [14] for a problem governed by a semilinear parabolic equation with a quadratic differentiable functional is also O(τ + h).
Nevertheless, we have not been able to prove such an estimate for our problem. Sharp estimates for problems involving differentiable functionals make use of the second derivative and the mean value theorem, which are not applicable in our setting, since we deal with a non-differentiable functional.

6. Numerical experiments. We report on two numerical experiments. In the first one, we describe an example with known solution and verify the error estimates (cf. Theorem 5.7). In the second one, we show how the sparsity properties of the solution change as µ changes; cf. Remark 4 and [7, Remark 2.10].
6.1. Experiment 1. Error estimates for an example with known solution.
Let Ω = (0, 1) ⊂ R and let T = 1. We are going to describe all the parameters, data and solution of a model example for (P) when a(x, t, y) ≡ 0 and y_0 ≡ 0.
Consider two real numbers 0 < a_1 < a_2 < 1 and a continuous function U(x) supported in [a_1, a_2]. Consider also a continuous function V(t) such that V(T) = 0. For simplicity, we will choose one such that ∥V∥_{L²(0,T)} = 1; in our example, V(t) = √2 sin(2πt). The optimal control is ū(x, t) = U(x)V(t). With an expression for ū, we can compute (an approximation of) ȳ.
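The two required properties of V, namely V(T) = 0 and unit L²(0, T) norm, can be checked numerically. A small sketch (the quadrature helper `l2_norm_01` is ours, not from the paper):

```python
import numpy as np

def V(t):
    # The paper's choice: continuous, V(T) = 0 for T = 1, unit L^2(0,1) norm.
    return np.sqrt(2.0) * np.sin(2.0 * np.pi * t)

def l2_norm_01(f, n=20001):
    """Composite trapezoidal approximation of the L^2(0,1) norm of f."""
    t = np.linspace(0.0, 1.0, n)
    y = f(t) ** 2
    dt = t[1] - t[0]
    return np.sqrt(dt * (y.sum() - 0.5 * (y[0] + y[-1])))
```

Since sin² integrates a whole number of periods over (0, 1), the trapezoidal rule is accurate here to essentially machine precision.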
We have that Ω_ū = (a_1, a_2) and also, since U(x) ≥ 0, the sign condition holds. Therefore, we can define the element of the subdifferential and the adjoint state in Ω_ū according to Theorem 3.3. It only remains to define φ̄(x, t) for x ∈ Ω⁰_ū; this φ̄ has to satisfy some conditions. An easy way to fulfil all these requirements is to look for an adjoint state that is also in C¹(Q). We will build an adjoint state in piecewise polynomial form. The parameters A_i, B_i, C_i, i = 1, 2, are uniquely determined by the boundary conditions and the condition φ̄ ∈ C¹(Q).
Once these numbers are obtained, the condition φ̄(x) ≤ µ for x ∈ Ω⁰_ū will give us a lower bound for the values of µ that we can select.
Now that we have the adjoint state and (an approximation of) the state, we can define (an approximation of) the desired target y_d using the adjoint state equation. The expression we obtain is not continuous in x, and neither is y_d. We fix the parameters accordingly. The resulting desired state and the optimal control are represented in Figure 1. A similar superconvergence in τ is observed in the experiments performed in [12, §5.1]. In that reference, the authors obtain an experimental order of convergence slightly better than the predicted one, namely O(τ^0.8). This observation is based on an experiment with 512 time steps. Motivated by this, we have performed our experiments using 8192 time steps. We take two families of uniform partitions in space and time, with h = 2^{−i}, i = i_0 : I, and τ = 2^{−j}, j = j_0 : J, for values of I and J big enough. We have been able to reach I = J = 13 on a PC with Matlab. To solve the discrete problems, we use a semismooth Newton method as described in [11].
Let us denote σ_{i,j} = (h_i, τ_j). We perform three tests:
1. σ_{i,i}, i = i_0 : I; that is, h = τ.
2. σ_{i,J}, i = i_0 : I*; that is, fix a small τ and refine only in space.
3. σ_{I,j}, j = j_0 : J*; that is, fix a small h and refine only in time.
To measure the error, we compute e_σ = ∥ū_σ − π̂_σ ū∥_{L²(Q)}, where π̂_σ ū = π̂_τ π̂_h ū. The operator π̂_τ is the numerical approximation of the L²(0, T) projection onto the set of piecewise constant functions given by the midpoint rule: π̂_τ f = Σ_{j=1}^{N_τ} f((t_{j−1} + t_j)/2) χ_{(t_{j−1},t_j)}. The operator π̂_h is the analogous numerical approximation of the L²(Ω) projection onto the set of piecewise constant functions given by the midpoint rule. The experimental order of convergence is measured in the standard way in the first case and analogously in the other cases. For the first test (h = τ), we obtain the results shown in Table 1. Table 1. Results for h_i = τ_i = 2^{−i}.
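The experimental order of convergence between consecutive refinement levels is computed in the usual way, EOC_i = log(e_i / e_{i+1}) / log(s_i / s_{i+1}); a minimal helper (our own sketch, with hypothetical names):

```python
import math

def eoc(errors, steps):
    """EOC_i = log(e_i / e_{i+1}) / log(s_i / s_{i+1}) for consecutive
    refinement levels; with s_i = 2^{-i} this is log2(e_i / e_{i+1})."""
    return [math.log(e0 / e1) / math.log(s0 / s1)
            for e0, e1, s0, s1 in zip(errors, errors[1:], steps, steps[1:])]
```

Feeding in the measured errors e_σ from the tables against h (or τ) yields the orders reported below; an error behaving like C·s produces EOC values close to 1.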
It looks very much like ∥ū_σ − ū∥_{L²(Q)} ≤ C(τ + h) for τ = h. For the second test (τ fixed and small, refinements only in the space step), we get the results summarized in Table 2. The error due to τ = 2^{−13} is small, but not zero, so the values obtained for the error due to the discretization in space are not of the form C h_i, but of the form C h_i ± E_{τ_J}. It therefore seems reasonable to discard the results for which the error in time starts to be big enough. Table 2. Results for fixed τ = 2^{−13} and decreasing h_i = 2^{−i}. For i ≥ 10 it may be more than 10% of the error, so we stop at I* = 9. We obtain an order of convergence of O(h), as expected.
In Table 3 we show the results for the third test (h fixed and small, refinements in the time step). Since the spatial error is not zero, we discard the results for which it is at least 10% of the global error and stop at J* = 8. We obtain an order of convergence close to O(τ). Table 3. Results for fixed h = 2^{−13} and τ_j = 2^{−j}.

We solve the problem on a coarse mesh with h = τ = 2^{−4}. In Figure 2, we show the support of the optimal control for the values µ = M µ_0, M = 0 : 8. For µ = 0, there is no sparsity pattern in the control. Then we see how the control is directionally sparse for µ > 0 and how the support of the control shrinks as µ increases. After a few trials, we find that ū ≡ 0 for µ ≥ 7.4540 µ_0. As expected, the value of the objective functional increases as µ increases. The obtained numerical values of J_σ(ū_σ) are collected in Table 4. Table 4. Experiment 2: value of the objective functional as the parameter µ increases.