On the switching behavior of sparse optimal controls for the one-dimensional heat equation

An optimal boundary control problem for the one-dimensional heat equation is considered. The objective functional includes a standard quadratic terminal observation, a Tikhonov regularization term with regularization parameter $\nu$, and the $L^1$-norm of the control that accounts for sparsity. The switching structure of the optimal control is discussed for $\nu \ge 0$. Under natural assumptions, it is shown that the set of switching points of the optimal control is countable, with the final time as the only possible accumulation point. The convergence of the switching points is investigated for $\nu \searrow 0$.


Introduction
In this paper, we investigate the switching behavior of optimal controls for the following sparse optimal control problem with terminal observation. Bang-bang and switching properties of solutions to optimal boundary control problems were extensively discussed in the 1970s. If ν = µ = 0, then it is well known that the optimal control is of bang-bang type, provided that y_Ω is not attained by the optimal state. This result was discussed in several papers for linear parabolic equations, see [9,11,19], cf. also [24]. For the case of the maximum norm as objective functional, the finite bang-bang principle was proved in [10]. Bang-bang principles for nonlinear parabolic equations were discussed in [17,20].
For ν > 0 but µ = 0, the switching behavior of optimal controls was investigated in [7,23]. In particular, the convergence of switching points for ν → 0 was addressed. Numerical examples and numerical methods exploiting the switching structure were presented in [6,14,18]. Bang-bang properties for time-optimal parabolic boundary control problems were studied, e.g., in [8,15,28,12,29]. This list of references on bang-bang principles and switching properties is by no means exhaustive; we also refer to the references cited in these papers.
The main novelty of our paper is the discussion of the switching structure for sparse optimal controls of parabolic boundary control problems (i.e., for the case µ > 0). To the best of our knowledge, the switching properties of sparse optimal boundary controls for parabolic problems have not yet been discussed in the literature. In particular, this applies to the convergence of switching points in the limit ν → 0. In addition to proving convergence of switching points, we also obtain convergence rates with respect to ν for the approximation of switching points as ν ↘ 0.
However, the general bang-bang structure of sparse optimal controls has already been investigated in a sequence of papers on semilinear elliptic control problems; we refer to [2]. Our paper was inspired by these general results.
Assumption 1 (Data). In this setting, real numbers T > 0, ν ≥ 0, µ > 0, α ≥ 0, and a < 0 < b are fixed. The sign restrictions on a and b are needed only for some structural properties of the optimal control, not for the existence of optimal controls or for the necessary optimality conditions. The parameter µ is the so-called sparsity parameter.
Moreover, we fix a desired final state function y_Ω ∈ C[0, 1]. We require y_Ω ∈ C[0, 1] in order to have continuity of the adjoint state up to the boundary. (For the optimality conditions, y_Ω ∈ L²(0, 1) would suffice.) Let us denote the state y associated with u by y_u. The control-to-state mapping u ↦ y_u is linear and continuous. Let us introduce the functionals j : L¹(0, T) → ℝ and f_ν : L²(0, T) → ℝ,
j(u) = ∫₀ᵀ |u(t)| dt,   f_ν(u) = ½ ‖y_u(·, T) − y_Ω‖²_{L²(0,1)} + (ν/2) ‖u‖²_{L²(0,T)}.
Then the reduced objective functional F_ν is given by F_ν(u) = f_ν(u) + µ j(u). Introducing the set of admissible controls
U_ad = {u ∈ L²(0, T) : a ≤ u(t) ≤ b for a.a. t ∈ (0, T)},
we can re-write the optimal control problem (1.1)-(1.3) in the short form
(P_ν)   min_{u ∈ U_ad} F_ν(u).
The functional F_ν is continuous and convex, hence weakly lower semicontinuous. Moreover, the set U_ad is weakly compact and non-empty. Therefore, there exists at least one optimal control of the problem (P_ν), which will be denoted by u_ν to indicate the correspondence to the Tikhonov parameter ν. By y_ν := y_{u_ν} we denote the optimal state associated with u_ν. In the case ν = 0 we drop the index 0 and write ū := u_0 and ȳ := y_0. If ν > 0, then F_ν is strictly convex, and hence in this case the optimal control is unique. Under a natural assumption, we show later that this uniqueness also holds for ν = 0. In addition, in the case ν = 0 the optimal state is uniquely determined due to the strict convexity of f_ν with respect to y_u.

Fourier expansion for (1.2), Green's function
For the convenience of the reader, we recall some known facts on the representation of the weak solution by a Green's function G. We consider the inhomogeneous initial-boundary value problem (2.1) posed in (0, 1), where f ∈ L²(Q), y_0 ∈ L²(0, 1), and u ∈ L²(0, T) are given. We also mention the case α = 0, but later we will concentrate on positive α. The weak solution of (2.1) can be represented by means of a Green's function G = G(x, ξ, t), see (2.2), where G is given by the Fourier expansion (2.3). Here, (ρ_n) is the monotone increasing sequence of non-negative solutions of the associated eigenvalue equation, and the N_n are normalizing constants. The numbers nπ and ρ_n are the eigenvalues of the differential operator ∂²/∂x² subject to the homogeneous boundary conditions in (2.1) for α = 0 and α > 0, respectively. For the eigenvalues, we know that ρ_n ∼ (n − 1)π as n → ∞. The functions x ↦ cos(nπx) and x ↦ cos(ρ_n x) are associated eigenfunctions, respectively. After normalization by the factors N_n, they form a complete orthonormal system in L²(0, 1); cf. [25]. Notice that, in our case f = 0 and y_0 = 0, the term y(x, T) in the objective functional has a series representation obtained from (2.2) and (2.3).

Necessary optimality conditions

The variational inequality
It is well known that the derivative of the differentiable functional f_ν can be represented in the form
f_ν′(u)h = ∫₀ᵀ (φ_u(1, t) + ν u(t)) h(t) dt,
where φ_u is the adjoint state associated with u. It is the unique weak solution of the adjoint equation. Notice that y_Ω is assumed to be continuous. Moreover, the function x ↦ y_u(x, T) is also continuous, because u ∈ L∞(0, T). Therefore, we have φ_u ∈ C(Q) and, in particular, the continuity of the function t ↦ φ_u(1, t) on [0, T].
Theorem 1 (Necessary optimality condition). Let ν ≥ 0 be given, and let u_ν ∈ U_ad be optimal for the problem (P_ν). Then there exists a function λ_ν ∈ ∂j(u_ν) such that the variational inequality
∫₀ᵀ (φ_ν(1, t) + ν u_ν(t) + µ λ_ν(t)) (u(t) − u_ν(t)) dt ≥ 0  for all u ∈ U_ad   (3.2)
is satisfied with the adjoint state φ_ν := φ_{u_ν}.
This result is completely standard for the case µ = 0, where F_ν is smooth, see [13]. If µ > 0, the associated tools from subdifferential calculus can be found, for instance, in [3, Thm. 3.1]. It is fairly obvious how these methods can be transferred to our problem; therefore, we omit the proof. An equivalent formulation is obtained by replacing the subdifferential in (3.2) by directional derivatives. By a standard argument, this variational inequality of integral type implies a corresponding pointwise inequality for a.a. t ∈ (0, T). Here, the directional derivative of the absolute value function u ↦ |u| at u ∈ ℝ in the direction h ∈ ℝ is used.
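For the absolute value function, the directional derivative used above has a simple closed form: it equals sign(u)·h for u ≠ 0 and |h| at u = 0. A minimal sketch of this rule (plain Python, nothing assumed beyond the definition of the one-sided limit):

```python
def abs_dir_deriv(u: float, h: float) -> float:
    """Directional derivative of u -> |u| at u in direction h,
    i.e. lim_{s -> 0+} (|u + s*h| - |u|) / s."""
    if u > 0:
        return h
    if u < 0:
        return -h
    return abs(h)  # at u = 0 the derivative is |h|
```

At u = 0 the map h ↦ |h| is not linear, which reflects the fact that the subdifferential of |·| at 0 is the whole interval [−1, 1].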

The case ν = 0, bang-bang-bang properties
In the case ν = 0, a detailed discussion of the variational inequality (3.2) leads to nice structural properties of the optimal control ū. We recall that for ν = 0 the optimal control and the optimal state are denoted by ū and ȳ, respectively.
Associated with ū, we introduce the measurable sets E⁺ := {t ∈ (0, T) : ū(t) > 0} and E⁻ := {t ∈ (0, T) : ū(t) < 0}. The pointwise discussion of the variational inequality (3.2) yields the following result.
This discussion showed how the function t ↦ φ̄(1, t) + µ λ̄(t) depends on the sign of ū. A further investigation will reveal the switching structure of ū related to the function t ↦ φ̄(1, t) + µ λ̄(t). To this end, let us define the open sets
Φ⁺ := {t ∈ (0, T) : φ̄(1, t) > µ},  Φ⁻ := {t ∈ (0, T) : φ̄(1, t) < −µ},  Φ⁰ := {t ∈ (0, T) : |φ̄(1, t)| < µ}.
For almost all t ∈ [0, T], the following implications hold true:
t ∈ Φ⁺ ⇒ ū(t) = a,  t ∈ Φ⁻ ⇒ ū(t) = b,  t ∈ Φ⁰ ⇒ ū(t) = 0.
Proof. We first prove the claim for Φ⁺. The equation ū(t) = a is obtained as follows: We have |λ̄(t)| ≤ 1, hence
φ̄(1, t) + µ λ̄(t) ≥ φ̄(1, t) − µ > 0
is satisfied a.e. in Φ⁺. Now the variational inequality (3.2) implies ū(t) = a a.e. in Φ⁺. The continuity of the function t ↦ φ̄(1, t) yields that Φ⁺ is an open set. The proof for Φ⁻ is analogous. The statement of Lemma 1 shows that Φ⁰ ∩ (E⁺ ∪ E⁻) has measure zero. Hence ū = 0 holds almost everywhere on Φ⁰.
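The implications above amount to a simple threshold rule for the value of ū(t), given the adjoint trace φ := φ̄(1, t). The following sketch encodes the rule; points with |φ| = µ are returned as None, since the sign rules give no information there:

```python
from typing import Optional

def bang_bang_bang(phi: float, mu: float, a: float, b: float) -> Optional[float]:
    """Value of the optimal control u_bar(t) implied by the sign rules,
    with phi = phi_bar(1, t); None where |phi| == mu (no information)."""
    if phi > mu:        # t in Phi^+  =>  u_bar(t) = a
        return a
    if phi < -mu:       # t in Phi^-  =>  u_bar(t) = b
        return b
    if abs(phi) < mu:   # t in Phi^0  =>  u_bar(t) = 0
        return 0.0
    return None
```

Note the sign convention: a large positive adjoint trace pushes the control to its lower bound a, a large negative one to the upper bound b, and small traces enforce sparsity, ū(t) = 0.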

Switching points of ū
The switching behavior of the optimal control depends on the solutions of the two equations φ̄(1, t) = µ and φ̄(1, t) = −µ. To estimate their number, we need the following result: the function t ↦ φ̄(1, t) can be extended to a holomorphic function in the complex half-plane {z ∈ ℂ : ℜ(z) < T}.
The fact that the function t ↦ φ̄(1, t) can be extended to a holomorphic function follows from its Fourier expansion. By the transformation of time τ := T − t, we obtain from (2.2) and (2.3) the expansion (3.7). The eigenvalues ρ_n behave asymptotically like (n − 1)π as n → ∞. Therefore, the factor e^{−ρ_n²(T−t)} converges very fast to zero as n → ∞, provided that t < T. For t ≤ T − ε with fixed ε > 0, the series (3.7) and all of its derivatives with respect to t converge uniformly on −∞ < t ≤ T − ε. The same holds true for the complex extension. Therefore, the series defines a holomorphic function in the half-plane {z ∈ ℂ : ℜ(z) < T}, and φ̄(1, t) is obtained as its real part.
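The decay and asymptotics claims are easy to check numerically. The sketch below assumes the eigenvalue equation ρ tan ρ = α for the ρ_n — an assumption, since the equation itself is not restated here, but one that is consistent with ρ_n ∼ (n − 1)π and cos ρ_n ≠ 0 — and evaluates the tail of the exponential factors for T − t ≥ ε:

```python
import math

def robin_eigenvalue(n: int, alpha: float) -> float:
    """n-th positive root of r*tan(r) = alpha (assumed eigenvalue equation),
    bracketed in ((n-1)*pi, (n-1)*pi + pi/2) and located by bisection."""
    f = lambda r: r * math.tan(r) - alpha
    lo = (n - 1) * math.pi + 1e-9
    hi = (n - 1) * math.pi + math.pi / 2 - 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def tail(N: int, eps: float, alpha: float, terms: int = 50) -> float:
    """Tail sum_{n=N+1}^{N+terms} exp(-rho_n**2 * eps) of the series factors."""
    return sum(math.exp(-robin_eigenvalue(n, alpha) ** 2 * eps)
               for n in range(N + 1, N + 1 + terms))

rho1 = robin_eigenvalue(1, 1.0)  # classical first root of r*tan(r) = 1
```

Already for ε = 0.1 and N = 5 the tail is of the order 1e-11, which illustrates why truncated expansions are accurate away from t = T.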
Let us re-write the expansion of φ̄(1, t) in the shorter form
φ̄(1, t) = Σ_{n=1}^∞ d_n cos(ρ_n) e^{−ρ_n²(T−t)},
where the numbers d_n correspond to the Fourier coefficients of d := ȳ(·, T) − y_Ω (the exact Fourier coefficients are given by d_n/√N_n). The decisive result for the switching behavior of ū is the following:
Lemma 4. Let 0 < ε < T be given and assume that ‖ȳ(·, T) − y_Ω‖_{L²(0,1)} > 0. Then the equations
φ̄(1, t) = µ and φ̄(1, t) = −µ   (3.9)
have at most finitely many solutions in [0, T − ε]. Therefore, in [0, T] these equations have at most countably many solutions, which may accumulate only at t = T.
Proof. Let us consider only the first equation, φ̄(1, t) = µ. Assume, to the contrary, that it has infinitely many solutions in [0, T − ε]. Then these solutions have an accumulation point t̄ ∈ [0, T − ε]. By the identity theorem for holomorphic functions, we deduce
Σ_{n=1}^∞ d_n cos(ρ_n) e^{−ρ_n²(T−t)} = µ for all t < T.

Differentiating this equation, we obtain
Σ_{n=1}^∞ d_n cos(ρ_n) ρ_n² e^{−ρ_n²(T−t)} = 0 for all t < T.   (3.10)

We multiply (3.10) by e^{ρ_1²(T−t)} and get
d_1 cos(ρ_1) ρ_1² + Σ_{n=2}^∞ d_n cos(ρ_n) ρ_n² e^{−(ρ_n² − ρ_1²)(T−t)} = 0.
Now we pass to the limit t → −∞. Since ρ_n² > ρ_1² holds for n ≥ 2 and the series converges uniformly, it follows that e^{−(ρ_n² − ρ_1²)(T−t)} → 0 for every n ≥ 2, and hence d_1 = 0. Notice that cos(ρ_n) ≠ 0 holds for all n ∈ ℕ. Therefore, the first term in (3.10) vanishes. Multiplying (3.10) by e^{ρ_2²(T−t)} and passing to the limit t → −∞, we find d_2 = 0. Repeating this argument, it follows that d_n = 0 for all n ∈ ℕ.
The system of functions {cos(ρ_n ·) : n ∈ ℕ} is complete in L²(0, 1); hence d = 0 must hold in the sense of L²(0, 1). This contradicts the assumption that d = ȳ(·, T) − y_Ω ≠ 0. In addition, it follows that the complement of Φ⁺ ∪ Φ⁰ ∪ Φ⁻, which is the set of solutions of (3.9), is countable. Hence, the switching conditions of Lemma 2 uniquely define ū almost everywhere on (0, T). This implies that ū almost everywhere attains values from the discrete set {a, 0, b}. Moreover, ū is piecewise constant on [0, T − ε) for all ε > 0, with discontinuities located only at solutions of (3.9). These points will be called switching points in the sequel. Definition 1. All points t ∈ (0, T) where one of the two functions t ↦ φ̄(1, t) − µ and t ↦ φ̄(1, t) + µ changes sign are said to be switching points of ū.
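Numerically, the switching points of Definition 1 are located where t ↦ φ̄(1, t) ∓ µ changes sign, which a simple grid scan can detect. In the sketch below, sin is only a synthetic stand-in for the boundary trace of the adjoint state:

```python
import math

def sign_changes(f, mu, t0, t1, n=10_000):
    """Grid points where f - mu and f + mu change sign, i.e. candidate
    switching points in the sense of Definition 1 (mu > 0 assumed)."""
    crossings = {mu: [], -mu: []}
    ts = [t0 + (t1 - t0) * k / n for k in range(n + 1)]
    for level in (mu, -mu):
        prev = f(ts[0]) - level
        for t in ts[1:]:
            cur = f(t) - level
            if prev * cur < 0:          # sign change within one grid step
                crossings[level].append(t)
            prev = cur
    return crossings

hits = sign_changes(math.sin, 0.5, 0.0, 2.0 * math.pi)
```

For the stand-in sin on [0, 2π] with µ = 0.5, the scan finds the two crossings of the level µ near π/6 and 5π/6, and the two crossings of −µ near 7π/6 and 11π/6.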
Then the following switching properties hold true: (i) For each 0 < ε < T, the number of switching points of ū in [0, T − ε] is finite. Therefore, the number of switching points of ū in [0, T] is at most countable, and switching points can accumulate only at t = T.
Between two subsequent switching points, the optimal control ū is constant, equal to one of the values b, a, or 0.
is fulfilled.
(iii) If φ̄(1, T) = µ, then, in a certain neighborhood (T − δ, T], the optimal control can switch at most countably many times between a and 0. In the case φ̄(1, T) = −µ, it can switch at most countably many times between b and 0 in (T − δ, T]. (iv) A direct switch of ū between a and b cannot occur.
Proof. (i) Switching points can only be boundary points of the sets Φ⁺, Φ⁻, and Φ⁰. Therefore, they must solve one of the two equations (3.9). By Lemma 4, the set of their solutions is at most countable, and its elements can accumulate only at t = T. Between switching points, ū can only attain the values b, a, and 0, cf. Lemma 2. This proves (i).
Proof. Let two optimal controls ū and v̄ be given. Due to the strict convexity of f_ν with respect to the state, the optimal state is unique, which gives ȳ = y_ū = y_v̄. Thanks to Theorem 3, both controls must be of bang-bang-bang type: almost everywhere and on open intervals, they attain only the values a, b, or 0. Since the control problem is convex, every convex combination θū + (1 − θ)v̄ is an optimal control and hence also of bang-bang-bang type. This is only possible if ū = v̄ holds almost everywhere.

The case ν > 0
Now we assume ν > 0 and consider the problem (P_ν), i.e., the problem (3.13) of minimizing F_ν over U_ad. We recall that y_u is defined as the solution of equation (1.2) associated with u. Again, this problem has an optimal control u_ν with associated optimal state y_ν := y_{u_ν}. By the strict convexity of the functional in (3.13), the optimal control is unique. The associated adjoint state is φ_ν := φ_{u_ν}. The necessary optimality condition is stated in Theorem 1.
By a detailed pointwise discussion of the variational inequality (3.2), the following result is deduced, completely analogously to a result of [3,22] for a class of elliptic equations. We also refer to a later result for a parabolic problem in [4]. In the theorem, the projection function P_[s₁,s₂] : ℝ → [s₁, s₂] is defined by P_[s₁,s₂](s) = max{s₁, min{s, s₂}}.
Theorem 5. For almost all t ∈ [0, T], the following equations are fulfilled:
u_ν(t) = P_[a,b]( −(1/ν)(φ_ν(1, t) + µ λ_ν(t)) ),   (3.14)
u_ν(t) = 0 ⇔ |φ_ν(1, t)| ≤ µ,   (3.15)
λ_ν(t) = P_[−1,1]( −(1/µ) φ_ν(1, t) ).   (3.16)
The relation (3.15) expresses the sparsity of the optimal control, while (3.16) extracts a single element of the subdifferential of j(u_ν). We skip the proof, because it is completely analogous to the one in [3].
As a simple conclusion of Theorem 5, we obtain that, for ν > 0, the functions λ_ν and u_ν are continuous on [0, T]: Indeed, the function t ↦ φ_ν(1, t) is continuous; hence (3.16) yields the continuity of λ_ν. Inserting this into (3.14), we see the continuity of u_ν.
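The structure of u_ν can be sketched numerically from the projection relations. The concrete formulas used below, u_ν(t) = P_[a,b](−(φ + µλ)/ν) with λ = P_[−1,1](−φ/µ) and φ := φ_ν(1, t), are the standard ones from the sparse-control literature and should be read as an assumption here, since the displayed equations of Theorem 5 are not reproduced above:

```python
def proj(s: float, s1: float, s2: float) -> float:
    """Projection P_[s1,s2](s) = max{s1, min{s, s2}}."""
    return max(s1, min(s, s2))

def control_from_adjoint(phi: float, nu: float, mu: float,
                         a: float, b: float) -> float:
    """Sparse control value via the assumed projection formulas."""
    lam = proj(-phi / mu, -1.0, 1.0)          # element of the subdifferential of |.|
    return proj(-(phi + mu * lam) / nu, a, b)  # projection onto [a, b]
```

One can read off the thresholds: u_ν(t) = 0 exactly for |φ| ≤ µ, the bound a becomes active once φ ≥ µ − νa, and the bound b once φ ≤ −µ − νb.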
Let us determine the structure of u_ν. We might follow the presentation in [5], but for the convenience of the reader we prove the results again in our framework. Inserting (3.16) into (3.14), we obtain a representation of u_ν in terms of φ_ν(1, ·) alone. Discussing this representation, we find the following result:
Theorem 6. Assume ν > 0. Then the following implications hold true.
Proof. (a) The implication (3.20) follows immediately from (3.15).
This theorem reveals that the solutions of the four equations (3.25)-(3.28) determine the switching behavior of u_ν. In other words, u_ν can only switch at the zeros of the four functions on the left-hand sides of (3.25)-(3.28).
Definition 3. Any t ∈ (0, T) where one of the functions on the left-hand sides of (3.25)-(3.28) changes its sign is said to be a switching point of u_ν.
Lemma 5. If ‖y_ν(·, T) − y_Ω‖_{L²(0,1)} > 0, then each of the equations (3.25)-(3.28) has at most countably many solutions, which can accumulate only at t = T. Therefore, u_ν has at most countably many switching points, which can accumulate only at t = T.
Proof. The proof is almost identical to that of Lemma 3, since the adjoint state φ_ν solves the same adjoint equation as φ̄, but with the terminal datum y_ν(·, T) − y_Ω. Therefore, we have
φ_ν(1, t) = Σ_{n=1}^∞ d_{ν,n} cos(ρ_n) e^{−ρ_n²(T−t)},
where the coefficients d_{ν,n} correspond to the Fourier coefficients of d_ν = y_ν(·, T) − y_Ω. Now we proceed as in the proof of Lemma 3.

Passing to the limit ν → 0
In this section, we discuss the convergence of the controls u_ν for ν ↘ 0. In addition, the convergence of switching points can be shown. First, we discuss the convergence of the sequences (u_ν), (y_ν), and (φ_ν) of optimal quantities for (P_ν).
Proof. (i) Since U_ad is weakly compact, the existence of a weakly convergent subsequence of (u_ν) with weak limit û ∈ U_ad is obvious. Let ū be optimal for (P_0). Then we have
F_ν(u_ν) ≤ F_ν(ū) = F_0(ū) + (ν/2) ‖ū‖²_{L²(0,T)}.   (4.1)
Passing to the limit, we obtain from (4.1)
F_0(û) ≤ lim inf_{ν↘0} F_0(u_ν) ≤ lim inf_{ν↘0} F_ν(u_ν) ≤ F_0(ū),
where we used the weak lower semicontinuity of F_0. Therefore, û must also be optimal for (P_0). From the optimality of u_ν and û for (P_ν) and (P_0), respectively, we get
F_0(u_ν) + (ν/2) ‖u_ν‖² ≤ F_0(û) + (ν/2) ‖û‖² and F_0(û) ≤ F_0(u_ν),
and hence, dividing by ν/2, we obtain ‖u_ν‖ ≤ ‖û‖. This implies convergence of norms and strong convergence u_ν → û in L²(0, T) as ν ↘ 0. The strong convergence of the subsequence (u_ν) in L²(0, T) yields a uniformly convergent sequence of states (y_{u_ν}), i.e., y_{u_ν} → y_û in C(Q). Therefore, we also have φ_{u_ν} → φ_û in C(Q), because the mapping that associates the solution of the adjoint equation with its terminal datum is continuous from C[0, 1] to C(Q), and we have φ_{u_ν}(T) = y_{u_ν}(T) − y_Ω.
(ii) If the optimal control of (P_0) is unique, say ū, then every subsequence of (u_ν) contains a subsequence converging weakly to the same limit ū. Hence the whole sequence (u_ν) converges to ū. This carries over to the sequences (y_{u_ν}) and (φ_{u_ν}).
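The limit behavior of the Tikhonov path can be observed on a one-dimensional toy analogue of (P_ν): minimize ½(u − y)² + (ν/2)u² + µ|u| over [a, b]. Here the scalar u and the identity in place of the control-to-state map are illustrative assumptions, and the grid search is deliberately naive:

```python
def toy_optimal_control(y: float, nu: float, mu: float,
                        a: float, b: float, n: int = 40_000) -> float:
    """Grid minimizer of 0.5*(u - y)**2 + (nu/2)*u**2 + mu*|u| over [a, b]."""
    best_u, best_f = a, float("inf")
    for k in range(n + 1):
        u = a + (b - a) * k / n
        f = 0.5 * (u - y) ** 2 + 0.5 * nu * u ** 2 + mu * abs(u)
        if f < best_f:
            best_u, best_f = u, f
    return best_u
```

For y = 2 and µ = 0.5 on [−1, 1], the minimizer sits at the bound b = 1 for all small ν (a "bang" value surviving the limit ν ↘ 0), moves into the interior for large ν, and the sparsity term forces u = 0 whenever |y| ≤ µ.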
Lemma 6. For each k ∈ ℕ and ε ∈ (0, T), there is a constant c > 0 such that
|dᵏ/dtᵏ φ_û(1, t)| ≤ c ‖y_û(·, T) − y_Ω‖_{L²(0,1)} for all t ∈ [0, T − ε]
holds for all û ∈ U_ad.
Proof. On [0, T − ε], the formally differentiated Fourier series is obtained by termwise differentiation of the expansion of φ_û(1, ·). It is uniformly convergent, since the series Σ_n ρ_n² e^{−ρ_n² ε} M, with M := sup_{u∈U_ad} ‖y_u‖_{C(Q)}, provides a convergent majorant. Let û ∈ U_ad be given. Then (d/dt) φ_û(1, t) has an analogous series representation; hence we can estimate it by this majorant, which proves the claim for k = 1. The proof is completed by an induction argument with respect to k.
The norm ‖y_{u_ν}(·, T) − y_û(·, T)‖_{L²(0,1)} can be estimated with the help of the following result.
Combining these two results, we obtain a convergence rate for ν ↘ 0 for the adjoint states.
Lemma 8. Let û be optimal for (P_0). Then for each k ∈ ℕ and ε ∈ (0, T), there is a constant c > 0 such that
|dᵏ/dtᵏ φ_ν(1, t) − dᵏ/dtᵏ φ_û(1, t)| ≤ c √ν for all t ∈ [0, T − ε].
Proof. This is a consequence of the previous two lemmas and the boundedness of U_ad. Now we are able to prove the convergence of switching points of optimal controls for ν ↘ 0.
With obvious modifications, the result can be proved if Λ < 0 or t = t_i^{−µ} holds.
(ii) For n = 1, we immediately obtain the following particular case of Theorem 8: Assume that t_j^µ is a switching point of ū such that (4.2) is satisfied with n = 1. Then, for all 0 < ν < ν_1, exactly two switching points t_j^{µ,ν} and t_j^{µ−νa} of u_ν exist; they solve the associated equations. The assumed inequality holds for all u ∈ U_ad, where n and τ are as in Theorem 8. Here we again used the absolute value function u ↦ |u|.
Then the conclusion of Corollary 1 is valid for the interval (T − τ, T ), and the proof of Theorem 8 remains valid with minor modifications.

Extensions
The results of this paper can easily be extended to the following slightly more general situations: (i) We considered problems with the homogeneous initial condition y(·, 0) = 0. All results remain true for the non-homogeneous initial condition y(·, 0) = y_0(·) with y_0 ∈ L²(0, 1). To see this, we solve the heat equation with homogeneous boundary data and initial condition y(·, 0) = y_0 and denote the solution by ŷ. Then y(x, T) − y_Ω = y_u(x, T) − (y_Ω − ŷ(x, T)), so that the results can be proved with ŷ_Ω := y_Ω − ŷ(·, T) in place of y_Ω.
(ii) For distributed controls of the form f(x, t) = e(x)u(t) acting in the right-hand side of the heat equation with homogeneous boundary conditions, the solution y admits an analogous series representation of y(x, T); in it, the Fourier coefficients e_n of e replace the numbers cos(ρ_n) in (2.4). For proving the switching properties, in (3.11) we used the fact that cos(ρ_n) ≠ 0 holds for all n ∈ ℕ. Therefore, an easy inspection of the proofs shows that all results of the paper remain true for distributed controls of the form f(x, t) = e(x)u(t) with fixed e ∈ L²(0, 1), if the condition
∫₀¹ cos(ρ_n ξ) e(ξ) dξ ≠ 0 for all n ∈ ℕ
is fulfilled. In other words, the theory remains true for functions e whose Fourier coefficients with respect to the system cos(ρ_n x) are all non-vanishing.
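The non-vanishing condition can be checked numerically for a concrete shape function. The sketch below again assumes the eigenvalue equation ρ tan ρ = α for the ρ_n (an assumption, as the equation is not restated in the text) and takes e ≡ 1, for which ∫₀¹ cos(ρ_n ξ) dξ = sin(ρ_n)/ρ_n in closed form:

```python
import math

def rho(n: int, alpha: float) -> float:
    """n-th root of r*tan(r) = alpha in ((n-1)pi, (n-1)pi + pi/2),
    by bisection; the eigenvalue equation is an assumption here."""
    f = lambda r: r * math.tan(r) - alpha
    lo = (n - 1) * math.pi + 1e-9
    hi = (n - 1) * math.pi + math.pi / 2 - 1e-9
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def coeff_e1(n: int, alpha: float) -> float:
    """Coefficient int_0^1 cos(rho_n * x) dx = sin(rho_n)/rho_n for e = 1."""
    r = rho(n, alpha)
    return math.sin(r) / r

coeffs = [coeff_e1(n, 1.0) for n in range(1, 8)]
```

For α > 0, the assumed relation ρ_n tan ρ_n = α forces sin ρ_n ≠ 0, so e ≡ 1 satisfies the condition; the computed values confirm this for the first few modes.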