Error estimates for Dirichlet control problems in polygonal domains

The paper deals with finite element approximations of elliptic Dirichlet boundary control problems posed on two-dimensional polygonal domains. Error estimates are derived for the approximation of the control and the state variables. Special features of unconstrained and control constrained problems as well as general quasi-uniform meshes and superconvergence meshes are carefully elaborated. Compared to existing results, the convergence rates for the control variable are not only improved but also fully explain the observed orders of convergence in the literature. Moreover, for the first time, results in non-convex domains are provided.


Introduction
In this paper we study the finite element approximation of the control problem (P) subject to $(Su, u) \in H^{1/2}(\Omega) \times L^2(\Gamma)$, where $Su$ is the very weak solution $y$ of the state equation

$$-\Delta y = 0 \ \text{in } \Omega, \qquad y = u \ \text{on } \Gamma, \tag{1.1}$$

the domain $\Omega \subset \mathbb{R}^2$ is bounded and polygonal, $\Gamma$ is its boundary, $a < b$ and $\nu > 0$ are real constants, and $y_\Omega$ is a function whose precise regularity will be stated when necessary. We assume that $0 \in [a, b]$ and comment on the opposite case in Remark 5.4. Abusing notation, we allow the cases $a = -\infty$ and $b = +\infty$ to denote the absence of one or both of the control constraints.
First order optimality conditions read as follows (see [1, Lemma 3.1]).

Lemma 1.1. Suppose $y_\Omega \in L^2(\Omega)$. Then problem (P) has a unique solution $\bar u \in L^2(\Gamma)$ with related state $\bar y \in H^{1/2}(\Omega)$ and adjoint state $\bar\varphi \in H^1_0(\Omega)$. The following optimality system is satisfied:

$$-\Delta \bar y = 0 \ \text{in } \Omega, \quad \bar y = \bar u \ \text{on } \Gamma, \quad \text{in the very weak sense}, \tag{1.2b}$$

$$-\Delta \bar\varphi = \bar y - y_\Omega \ \text{in } \Omega, \quad \bar\varphi = 0 \ \text{on } \Gamma, \quad \text{in the weak sense}. \tag{1.2c}$$

The variational inequality (1.2a) is equivalent to the projection formula (1.3), where $\operatorname{Proj}_{[a,b]}$ denotes the pointwise projection onto the interval $[a, b]$.

The aim of this paper is to investigate a finite element solution of the system (1.2a)-(1.2c), in particular to derive discretization error estimates. A precise description of the regularity of the solution of the first order optimality system is an important ingredient of such estimates. These regularity results were proven in our previous paper [1]; we recall them in Section 2. There were two interesting observations, which we illustrate in the following example.

Example 1.2. Consider the L-shaped domain. The $270^\circ$ angle leads in general to a singularity of type $r^{2/3}$ in the solution of the adjoint equation; the regularity can be characterized by $\bar\varphi \in H^s(\Omega)$ with $s < \frac{2}{3}$. Hence, the control has an $r^{-1/3}$-singularity in the unconstrained case, $\bar u \in H^s(\Gamma)$ for all $s < \frac{1}{6}$. In the constrained case, however, the control is in general constant in the vicinity of the singular corner since the normal derivative

by $\Gamma_j$ the side of $\Gamma$ connecting $x_j$ and $x_{j+1}$, and by $\omega_j \in (0, 2\pi)$ the angle interior to $\Omega$ at $x_j$, i.e., the angle defined by $\Gamma_j$ and $\Gamma_{j-1}$, measured counterclockwise. Notice that $\Gamma_0 = \Gamma_M$. We use $(r_j, \theta_j)$ as local polar coordinates at $x_j$, with $r_j = |x - x_j|$ and $\theta_j$ the angle defined by $\Gamma_j$ and the segment $[x_j, x]$. In order to describe the regularity of functions near the corners, we introduce for every $j = 1, \ldots, M$ a positive number $R_j$ and an infinitely differentiable cut-off function $\xi_j : \mathbb{R}^2 \to [0, 1]$ such that the sets

$$N_j = \{x \in \mathbb{R}^2 : 0 < r_j < 2R_j,\ 0 < \theta_j < \omega_j\}$$

satisfy $N_j \subset \Omega$ for all $j$ and $N_i \cap N_j = \emptyset$ if $i \neq j$, and such that $\xi_j \equiv 1$ in the set $\{x \in \mathbb{R}^2 : r_j < R_j\}$ and $\xi_j \equiv 0$ in the set $\{x \in \mathbb{R}^2 : r_j > 2R_j\}$.
For every $j = 1, \ldots, M$ we denote by $\lambda_j$ the (in general) leading singular exponent associated with the operator at the corner $x_j$. For the Laplace operator it is well known that $\lambda_j = \pi/\omega_j$. Since in general the regularity of the solution of a boundary value problem depends on the smallest singular exponent, it is customary to denote

$$\lambda = \min\{\lambda_j : j = 1, \ldots, M\} \quad \text{and} \quad p_D = \frac{2}{1 - \min\{1, \lambda\}}. \tag{2.1}$$

Our main estimates are for data $y_\Omega \in W^{1,p^*}(\Omega)$ for some $p^* > 2$. To obtain these estimates it is essential to use the sharp regularity results for the optimal control, state and adjoint state provided in [1]. For both the control and the state it is enough to know the Hilbert Sobolev-Slobodetskiĭ space they belong to, but for the adjoint state we need to know in more detail the expansion in terms of powers of the singular exponents. To write this expansion, we must proceed in two steps in order to be able to define the effectively leading singularity in each corner.
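As a quick illustration of (2.1), the following minimal sketch computes the singular exponents $\lambda_j = \pi/\omega_j$, their minimum $\lambda$, and $p_D$ from a list of interior angles. The function names and the two test geometries are hypothetical and chosen only for illustration; the convention $p_D = +\infty$ when $\lambda \ge 1$ follows the remark in the text that $p_D = +\infty$ in convex domains.

```python
import math

def singular_exponents(angles):
    """Leading singular exponents lambda_j = pi / omega_j for the Laplace operator."""
    return [math.pi / omega for omega in angles]

def regularity_indices(angles):
    """Return (lambda, p_D) as in (2.1); angles are interior angles in radians."""
    lam = min(singular_exponents(angles))
    gap = 1.0 - min(1.0, lam)
    # Assumption: a vanishing denominator is read as p_D = +infinity.
    p_d = math.inf if gap == 0.0 else 2.0 / gap
    return lam, p_d

# Hypothetical example geometries (not taken from the paper's experiments).
square = [math.pi / 2] * 4
l_shape = [math.pi / 2] * 5 + [3 * math.pi / 2]
```

For the L-shaped domain this gives $\lambda = 2/3$ and $p_D = 6$, while any convex polygon yields $p_D = +\infty$.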
Our first result concerns the regularity of the adjoint state and is a consequence of [1, Theorem 3.2 and Theorem 5.1]. For $m \in \mathbb{Z}$, $t \in \mathbb{R}$ and $1 < p \le +\infty$ we define

$$J^m_{t,p} = \Big\{ j \in \{1, \ldots, M\} \ \text{such that}\ 0 < m\lambda_j < 2 + t - \frac{2}{p} \ \text{and}\ m\lambda_j \notin \mathbb{Z} \Big\}. \tag{2.2}$$

Lemma 2.1. Suppose $y_\Omega \in L^\infty(\Omega)$. Let $\bar\varphi \in H^1_0(\Omega)$ be the optimal adjoint state, solution of (1.2c). Then there exist a unique function $\bar\varphi_r \in W^{2,p}(\Omega)$ and unique real numbers $(c_{j,m})_{j \in J^m_{0,p}}$, for all $p < +\infty$ for constrained problems and $p < p_D$ for unconstrained problems, such that

$$\bar\varphi = \bar\varphi_r + \sum_{m=1}^{3} \sum_{j \in J^m_{0,p}} c_{j,m}\, \xi_j\, r_j^{m\lambda_j} \sin(m\lambda_j \theta_j). \tag{2.3}$$

Note that $p_D = +\infty$ in convex domains, so that we obtain the same regularity of the optimal adjoint state for constrained and for unconstrained problems. However, in non-convex domains, the control and hence the state, as part of the right-hand side of the adjoint equation, may be unbounded in the unconstrained case, which leads to the restriction $p < p_D$ for the regularity of $\bar\varphi_r$. Moreover, it may happen that the effectively leading singularity corresponding to a corner $x_j$ is not the first one, that is, the associated coefficient $c_{j,1}$ in the asymptotic representation (2.3) is equal to zero. However, this will be of interest only for constrained problems in the case of non-convex corners $x_j$, i.e., $\lambda_j < 1$. To cover this case, we define numbers $\Lambda_j$ for each corner and, in addition, a number $\Lambda$.

In convex domains, $\lambda = \Lambda$ will determine the regularity of both the optimal control and state. This holds for unconstrained as well as for constrained problems. However, in non-convex domains, different cases may appear. If we have no control constraints, then the regularity of the optimal control and state will again be determined by $\lambda$. If the problem is constrained, then in the vicinity of any corner $x_j$ where the coefficient $c_{j,1}$ of the corresponding first singularity is nonzero, the optimal control is flattened due to the projection formula and consequently smooth. This is the usual case. If $c_{j,1} = 0$, then the optimal control in the neighborhood of such a corner is at least as regular as the normal derivative of the corresponding second singular function.
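To make the index sets $J^m_{t,p}$ from (2.2) concrete, here is a minimal sketch that evaluates the definition for a given list of interior angles. The function name, the tolerance used to detect integer values of $m\lambda_j$, and the numbering of the corners of the L-shaped example are assumptions made for illustration only.

```python
import math

def index_set(m, t, p, angles, tol=1e-12):
    """Indices j (1-based) with 0 < m*lambda_j < 2 + t - 2/p and m*lambda_j not an integer."""
    upper = 2.0 + t - (0.0 if math.isinf(p) else 2.0 / p)
    selected = []
    for j, omega in enumerate(angles, start=1):
        exponent = m * math.pi / omega  # m * lambda_j
        # Assumption: integrality is tested up to a small floating-point tolerance.
        is_integer = abs(exponent - round(exponent)) < tol
        if 0.0 < exponent < upper and not is_integer:
            selected.append(j)
    return selected

# Hypothetical L-shaped domain whose third corner is the re-entrant one.
l_shape = [math.pi / 2, math.pi / 2, 3 * math.pi / 2,
           math.pi / 2, math.pi / 2, math.pi / 2]
```

With $t = 0$ and $p = 4$, only the re-entrant corner contributes for $m = 1$ and $m = 2$, and no corner contributes for $m = 3$, because $3\lambda_3 = 2$ is an integer and exceeds the upper bound.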
In the control constrained case, Λ will determine the regularity of the optimal control, at least in a worst case sense. The regularity of the optimal state may depend on λ as well since singular terms may occur within its asymptotic representation independent of the adjoint state. For unconstrained problems the following regularity result holds, see [ Assume that the optimal control has a finite number of kink points. Thenū We also have the following result from [1, Proof of Theorem 3.4].
Finally, we can write the representation of the adjoint state for regular enough data. For m ∈ Z, t ∈ R and 1 < p ≤ +∞ we will also need The following result is a consequence of [1,Corollary 4.4].
Lemma 2.5. Suppose that Ω is convex or −∞ < a < b < +∞, and that y Ω ∈ W 1,p * (Ω) with p * > 2. Then, for p > 2 such that there exist a unique functionφ r ∈ W 3,p (Ω) and unique real numbers (c j,m ) J m 1,p and (d j,m ) L m 1,p , such that Notice that the coefficients c j,m that appear in both expansions in Lemmata 2.1 and 2.5 coincide, due to the uniqueness of the expansion. In the expansion of Lemma 2.5 new terms appear that belong to W 2,p (Ω) for all p < +∞ but not to W 3,p (Ω) for p > 2 satisfying the conditions in Lemma 2.5.

A general discretization error estimate
In this section we will present a general discretization error estimate in Theorem 3.2. The terms in this general estimate have to be estimated in particular cases. This work will be done in later sections.
For the discretization, consider a family of regular triangulations $\{\mathcal{T}_h\}$ in the sense of Ciarlet [11], depending on a mesh parameter $h$. Note that a triangulation $\mathcal{E}_h$ of the boundary is naturally induced by $\mathcal{T}_h$. We take $Y_h$ to be the space of conforming piecewise linear finite elements. The space $U_h$ is the space of piecewise linear functions generated by the traces of elements of $Y_h$ on the boundary $\Gamma$. We denote the subspace of $Y_h$ with vanishing boundary values by $Y_{0,h}$.
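The boundary values of the discrete states are obtained from an $L^2(\Gamma)$-projection onto $U_h$. The following minimal sketch shows how such a projection onto continuous piecewise linear functions can be computed. It is a simplification under stated assumptions: a single straight boundary segment parametrized by arc length stands in for the closed polygon $\Gamma$, and the function name and quadrature order are hypothetical.

```python
import numpy as np

def l2_projection_p1(nodes, u, n_quad=4):
    """Nodal values of the L2 projection of u onto continuous P1 functions on a 1D mesh."""
    n = len(nodes)
    mass = np.zeros((n, n))
    rhs = np.zeros(n)
    xq, wq = np.polynomial.legendre.leggauss(n_quad)  # Gauss rule on [-1, 1]
    for k in range(n - 1):
        x0, x1 = nodes[k], nodes[k + 1]
        h = x1 - x0
        # Exact local mass matrix of the two hat functions on [x0, x1].
        mass[k:k + 2, k:k + 2] += h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        x = 0.5 * (x0 + x1) + 0.5 * h * xq  # mapped quadrature points
        w = 0.5 * h * wq                    # mapped quadrature weights
        ux = u(x)
        rhs[k] += np.sum(w * ux * (x1 - x) / h)
        rhs[k + 1] += np.sum(w * ux * (x - x0) / h)
    return np.linalg.solve(mass, rhs)

nodes = np.linspace(0.0, 1.0, 6)
coeffs = l2_projection_p1(nodes, lambda x: 2.0 * x + 1.0)
```

Since the projected function in this usage example is itself piecewise linear, the projection reproduces its nodal values up to rounding, which mirrors the property $S_h u_h = u_h$ on $\Gamma$ for $u_h \in U_h$.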
We also introduce the discrete solution operator S h : U → Y h . For u ∈ U the function S h u ∈ Y h is defined as the unique solution of We emphasize that on the boundary S h u coincides with the L 2 -projection of u on U h . Thus we get S h u h = u h on Γ for u h ∈ U h . Notice as well that (3.1) is not a conforming discretization of the very weak formulation of the state equation. However, according to [3,8], its applicability is guaranteed. In our discretized optimal control problem we aim to minimize the objective function The first order optimality conditions of this problem were derived in [9] and are stated in the next lemma.
Lemma 3.1. Problem $(P_h)$ has a unique solution $\bar u_h \in U^h_{ad}$, with related discrete state $\bar y_h = S_h \bar u_h$ and adjoint state $\bar\varphi_h$. The discrete optimality system (3.2) is satisfied, where the discrete normal derivative $\partial^h_n \bar\varphi_h \in U_h$ is defined as the unique solution of (3.3).

An important tool in the numerical analysis is the construction of a discrete control $u_h^* \in U^h_{ad}$ which interpolates $\bar u$ in a certain sense, see Lemma 4.2 and Lemma 5.6, and satisfies (3.4). If the optimal control satisfies $\bar u \in H^s(\Gamma)$ with $s < 1$, we use a quasi-interpolant introduced by Casas and Raymond in [9]: denote the boundary nodes of the mesh by $x^j_\Gamma$, $1 \le j \le N(h)$, and let $e_j$, $1 \le j \le N(h)$, be the nodal basis of $U_h$; the function $u_h^*$ is then given by (3.5). According to [9, Lemma 7.5], the function $u_h^*$ belongs to $U^h_{ad}$. Moreover, it is constructed such that $u_h^* = \bar u$ on the active set, and it fulfills (3.4).

If $\bar u \in H^s(\Gamma)$ with $s \ge 1$, we use a modification of the standard Lagrange interpolant $I_h \bar u$ of $\bar u$, again denoted by $u_h^* \in U^h_{ad}$, which is defined by its coefficients in (3.6), cf. [10, Section 2]. Of course, if we consider control problems without control constraints, that is $-a = b = \infty$, the interpolant $u_h^*$ is just the Lagrange interpolant. In the case of control bounds $a, b \in \mathbb{R}$, in order to get a unique definition of $u_h^*$, we need to assume that on each element only one control bound is active. However, due to the Hölder continuity of $\bar u$, which we have for $\bar u \in H^s(\Gamma)$ with $s \ge 1$, there exists a mesh size $h_0 > 0$ such that for all $h < h_0$ the above definition of the interpolant is unique. Obviously, this interpolant belongs to $U^h_{ad}$. Moreover, it satisfies (3.4) by construction. Indeed, whenever $\nu \bar u(x) - \partial_n \bar\varphi(x) \neq 0$, we have $u_h^*(x) - \bar u(x) = 0$.

As already announced, we conclude this section by stating a general error estimate for the control and state errors, which will serve as a basis for the subsequent error analysis.
Theorem 3.2. For the solution of the continuous and the discrete optimal control problem we have Proof. First, let us define the intermediate error To deal with the third term, we take into account the continuity of S h :  (3.8). We begin with estimating the second one, but as we will see this also yields an estimate for the fourth term. There holds Next, we consider the second term of (3.10) in detail. By adding the continuous and discrete variational inequalities (1.2a) and (3.2a) with u =ū h ∈ U ad and u h = u * h ∈ U h ad , respectively, we deduce Rearranging terms and using (3.4) leads to (3.11) By collecting the estimates (3.10) and (3.11) we obtain From the Young inequality we can deduce Finally, the assertion is a consequence from (3.8), (3.9) and (3.13).

Problems without control constraints
In the rest of the paper, we will always assume that {T h } is a quasi-uniform family of meshes. However, if the underlying mesh has a certain structure then it is possible to improve the error estimates. These special quasi-uniform meshes are called superconvergence meshes or O(h 2 )-irregular meshes; for the precise definition we refer to Definition 4.5. The main result of this section is the following one.
Theorem 4.1. Suppose that either λ < 1 and y Ω ∈ L 2 (Ω), or y Ω ∈ W 1,p * (Ω) for some p * > 2. Then it holds where r is equal to one for λ − 1 2 ∈ (1, 3 2 ] and equal to zero else. If, further, For the proof, we are going to estimate the three terms that appear in the general estimate of Theorem 3.2. Whereas the first two terms in (3.7) can be estimated by standard techniques, the third one needs special care. Analogously to the derivation of (3.11), this term can formally be rewritten as where ∂ h n R hφ is defined as in (3.3) just by replacingȳ h withȳ andφ h with the Ritzprojection R hφ ofφ on Y 0,h . Thus, we are interested in the error between the normal derivative of the adjoint state and the corresponding discrete normal derivative of its Ritz-projection. In order to estimate the above term, we will pursue two different strategies. The first one relies on local and global W 1,∞ -discretization error estimates. In case of general quasi-uniform meshes, this will result in a convergence order of O(h s | log h| r ) for all s ∈ R such that s < λ − 1 2 and s ≤ 1, where r is equal to one for λ − 1 2 ∈ (1, 3 2 ] and equal to zero else. The second strategy will rely on special super-convergence meshes as introduced in [6]. The idea to use such meshes in the context of Dirichlet boundary control problems originally stems from [13]. In contrast to the setting in that reference, we are not concerned with smoothly bounded domains but with polygonal domains. For that reason we need to extend the corresponding estimates to that case, that is, we have to deal with less regular functions due to the appearance of corner singularities. This will yield an approximation rate of O(h s ) with s < min{ 3 2 , λ − 1 2 }, which results in an improvement for domains with interior angles less than 2π/3.
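The rate restrictions described above can be tabulated with a short helper. This is only a sketch of the bookkeeping stated in the text ($s < \lambda - \frac12$, $s \le 1$ on general quasi-uniform meshes, with a logarithmic factor when $\lambda - \frac12 \in (1, \frac32]$, and $s < \min\{\frac32, \lambda - \frac12\}$ on superconvergence meshes). The function name, the returned dictionary keys and the floating-point tolerance are hypothetical. The returned values are suprema of the admissible exponents $s$; wherever the text has a strict inequality, the supremum itself is not attained.

```python
import math

def control_rate_bounds(angles, tol=1e-12):
    """Suprema of the admissible convergence exponents s for given interior angles (radians)."""
    lam = min(math.pi / omega for omega in angles)
    x = lam - 0.5
    return {
        # General quasi-uniform meshes: s < lam - 1/2 and s <= 1.
        "quasi_uniform": min(1.0, x),
        # Exponent r of |log h|: one if lam - 1/2 lies in (1, 3/2], zero otherwise.
        "log_power": 1 if 1.0 < x <= 1.5 + tol else 0,
        # O(h^2)-irregular (superconvergence) meshes: s < min{3/2, lam - 1/2}.
        "superconvergent": min(1.5, x),
    }

# Hypothetical example geometries.
square = [math.pi / 2] * 4
l_shape = [math.pi / 2] * 5 + [3 * math.pi / 2]
```

For the L-shaped domain both bounds equal $1/6$, so a superconvergence mesh brings no gain, whereas for the square the bound improves from $1$ (with a logarithmic factor) to $3/2$, in line with the threshold angle $2\pi/3$ mentioned above.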
Proof. We know from Lemma 2.2 that the control satisfiesū ∈ H s (Γ) for all s < min{ 3 2 , λ − 1 2 }. If s < 1, we choose u * h as defined in (3.5), and the estimate for the control follows from [9, Eq. (7.10)] by setting we haveū ∈ H s (Γ) ֒→ C 0,s− 1 2 (Γ) due to the Sobolev embedding theorem. Thus, the modified Lagrange interpolant u * h from (3.6) is well-defined. Actually, in the present case, u * h is just the Lagrange interpolant. As a consequence, the error estimate for the control is given by a standard estimate for the Lagrange interpolant.
Again from Lemma 2.2, the optimal state satisfiesȳ ∈ H s+ 1 2 (Ω), for all s < min{ 3 cf. for instance [7]. Since s + 1 2 can be chosen greater than 1 2 , we have the desired result in case that λ ≥ 1. For λ < 1 we do not haveȳ ∈ H 1 (Ω) such that standard techniques for estimating finite element errors fail. However, in this case we can directly refer to Remark 5.4 of [3].
where r is equal to one for λ − 1 2 ∈ (1, 3 2 ] and equal to zero else. Proof. As above, we denote by R h the operator that maps a function of H 1 0 (Ω) to its Ritzprojection in Y 0,h . In addition, we introduce the extension operatorS h which extends a function belonging to U h to one in Y h by zero. Using the norm equivalence in finite dimensional spaces on a reference domain we easily infer for any ψ h ∈ U h and q ∈ [1, ∞] Since S h ψ h is discrete harmonic, we obtain together with the orthogonality properties of the Ritz-projection the identity where we employed that (S h −S h )ψ h belongs to Y 0,h . Now, we distinguish the three cases ω i < π/2, ω i < π and ω i < 2π for i = 1, . . . , M .
Next, we consider the case ω i < π for i = 1, . . . , M . For simplicity, we assume that the domain has only one corner with an interior angle greater or equal to π/2. However, the proof extends to the general case in a natural way. In the following, that corner is located at the origin. Furthermore, we denote its interior angle by ω 1 , the distance to that corner by r 1 , and the corresponding leading singular exponent by λ 1 = π/ω 1 . According to Lemma 2.5, the optimal adjoint state admits the splittinḡ ϕ =φ r +φ s , (4.7) whereφ r belongs to W 3,q (Ω) with some q > 2. Combining (4.5) and (4.7) yields the identity For the latter term, we can argue as in (4.6) to show first order convergence, i.e., In order to estimate the singular term, we decompose the neighborhood of the critical corner in subdomains Ω J which are defined by We set the radii d (4.10) Arguing as in (4.4), we get From [22,Corollary 3.62] (setting τ = 1 2 and γ = 2 − λ there) we know that for c I large enough there holds Notice that in that reference problems with Neumann boundary conditions are considered. However, the proof for the present problem is just a word by word repetition. Next, the Cauchy-Schwarz inequality and basic integration yield By collecting the results from (4.13)-(4.16), we obtain which yields together with (4.9), (4.8) and (4.5) the assertion in the second case. Finally, we consider the case ω i < 2π for i = 1, . . . , M . Similar to the foregoing considerations, we assume that only the angle ω 1 is greater or equal to π and hence 1/2 < λ 1 ≤ 1. According to (4.5), the Cauchy-Schwarz inequality, (4.4), and a standard finite element error estimate, we obtain for all s < λ 1 − 1/2. This ends the proof.
Remark 4.4. Related results to those of Lemma 4.3, which are established by using similar techniques, can be found in [4,15,19,22].
According to the previous lemma, the critical term in the general estimate (3.7) converges with an order close to one provided that the interior angles are less than $2\pi/3$. However, it is possible to improve the convergence rate if we assume a certain structure of the underlying mesh. The following definition of superconvergence meshes can be found in [6]. Such meshes have been used in [13] in the context of Dirichlet boundary control problems in the case of smoothly bounded domains.

(a) The set of interior edges $\mathcal{E}$ of the triangulation $\mathcal{T}_h$ is decomposed into two disjoint sets $\mathcal{E}_1$ and $\mathcal{E}_2$ which fulfill the following properties:
• For each $e \in \mathcal{E}_1$, let $T$ and $T'$ denote the two elements of the triangulation $\mathcal{T}_h$ that share this edge $e$. Then the lengths of any two opposite edges of the quadrilateral $T \cup T'$ differ only by $O(h^2)$.

(b) The set of boundary vertices $\mathcal{P}$ is decomposed into two disjoint sets $\mathcal{P}_1$ and $\mathcal{P}_2$ which satisfy the following properties:
• For each vertex $x \in \mathcal{P}_1$, let $e$ and $e'$ be the two boundary edges sharing this vertex as an endpoint. Denote by $T$ and $T'$ the elements having $e$ and $e'$, respectively, as edges and let $t$ and $t'$ be the corresponding unit tangents. Furthermore, take $e$ and $e'$ as one pair of corresponding edges, and make a clockwise traversal of $\partial T$ and $\partial T'$ to define two additional corresponding edge pairs. Then $|t - t'| = O(h)$ and the lengths of any two corresponding edges differ only by $O(h^2)$.
• $|\mathcal{P}_2| = c$ with a constant $c$ independent of $h$.
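The edge condition in (a) can be checked numerically for a given pair of triangles. The sketch below measures how far the quadrilateral $T \cup T'$ is from a parallelogram by comparing the lengths of opposite edges. The function names and the constant `C` are hypothetical: an $O(h^2)$ condition is asymptotic, so any fixed threshold is only a heuristic stand-in.

```python
import math

def opposite_edge_mismatch(p, q, r, s):
    """Largest length difference between opposite edges of the quadrilateral p-r-q-s.

    p, q: endpoints of the edge shared by T and T'
    r:    third vertex of T
    s:    third vertex of T'
    """
    d1 = abs(math.dist(p, r) - math.dist(q, s))
    d2 = abs(math.dist(r, q) - math.dist(s, p))
    return max(d1, d2)

def is_nearly_parallelogram(p, q, r, s, h, C=1.0):
    """Heuristic test of the O(h^2) condition with an assumed constant C."""
    return opposite_edge_mismatch(p, q, r, s) <= C * h ** 2

h = 0.1
# Two triangles obtained by cutting a square of side h along its diagonal.
regular = ((0.0, 0.0), (h, h), (h, 0.0), (0.0, h))
# Same configuration with one vertex moved by a distance of order h.
distorted = ((0.0, 0.0), (h, h), (h, 0.0), (0.0, 2 * h))
```

For the uniformly cut square the mismatch vanishes, while the distorted pair violates the condition because its mismatch is of order $h$ rather than $h^2$.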
where I h f ∈ Y h denotes the piecewise linear Lagrange interpolant.
Lemma 4.7. Suppose that either λ < 1 and y Ω ∈ L 2 (Ω), or y Ω ∈ W 1,p * (Ω) for some Proof. First we observe that since S h represents the discrete harmonic extension operator andφ has zero boundary conditions. If at least one interior angle ω i is greater or equal to 2π/3, we have λ ≤ 3/2 and therefore λ − 1/2 ≤ 1. Consequently, there is no advantage in taking a superconvergence mesh and we can apply the result for quasi-uniform meshes. If ω i < π/2 for i = 1, . . . , M , and hence λ > 2, we can directly apply the results of Lemma 4.6 sincē ϕ ∈ W 3,q (Ω) for some q > 2 according to Lemma 2.5. For these reasons, we focus in the following only on the case 3/2 < λ ≤ 2. We are in this case if the largest interior angle, denoted by ω 1 in the following, fulfills π/2 ≤ ω 1 < 2π/3. For simplicity, we assume as in the proof of Lemma 4.3 that the remaining angles are less than π/2. However, the proof again extends to the general case in a natural way.
Since the number of elements in Ω h,1 is bounded independently of h and ∂Ω h,1 ∩ Ω = ∂Ω h,2 ∩ Ω, we have that ∂Ω h,2 ∩ Ω ∼ h. Using this fact, the Hölder inequality, and a discrete Sobolev inequality, we obtain DefineS h as the zero extension operator as in the proof of Lemma 4.3. Since S h ψ h denotes the discrete harmonic extension of ψ h , we infer Using this in combination with the Poincaré inequality yields where we used (4.4) in the last step. Next, we observe that the third derivatives ofφ s behave like r λ−3 1 such that we can conclude for some arbitrary ε > 0 (depending on q) which represents the desired result for the subdomain Ω h,2 . Finally, for the subdomain Ω h,1 , we conclude by inserting a standard interpolation error estimate and the a priori estimate for the operator S h as before that After observing that the second derivatives ofφ s behave like r λ−2 1 or log r 1 , respectively, if λ = 2, and that max x∈Ω h,1 r 1 (x) ∼ h, we get the desired result for the subdomain Ω h,1 . Remark 4.8. As we commented in the introduction, it is possible, though unlikely, that the coefficient c j,1 of the leading singular exponent vanishes. In this case, we can replace the parameter λ in Theorem 4.1 by min{Λ j }.

The control constrained case
This section is devoted to the numerical analysis of control constrained Dirichlet control problems. As we will see, the convergence rates in convex domains coincide with those for the unconstrained problems. More precisely, we will prove the following theorem.
Theorem 5.1. Suppose that either $\lambda < 1$ and $y_\Omega \in L^2(\Omega)$, or $y_\Omega \in W^{1,p^*}(\Omega)$ for some $p^* > 2$. Moreover, assume that the optimal control has a finite number of kink points. Then it holds $\forall s \in \mathbb{R}$ such that $s < \lambda - \frac{1}{2}$ and $s \le 1$, where $r$ is equal to one for $\lambda - \frac{1}{2} \in (1, \frac{3}{2}]$ and equal to zero otherwise. If, further, $\{\mathcal{T}_h\}$ is $O(h^2)$-irregular according to Definition 4.5, then

The proof of this theorem is postponed to Section 5.1. As already observed, this is exactly the result which we have proven in the unconstrained case. However, if the underlying domain is non-convex, the approximation rates in the control constrained case can be improved. In this regard, one of our results relies on a structural assumption on the discrete optimal control, which we formulate next. Throughout this section we write $H = \{j : \lambda_j < 1 \text{ and } c_{j,1} \neq 0\}$.
Assumption 5.2. There exists some h 0 > 0 such that for every j ∈ H, there exists Let us comment on Assumption 5.2. In Lemma 2.4 it was established that in the neighbourhood of a non-convex corner, the optimal control will normally be constant and either equal to the lower or the upper bound. Assumption 5.2 says that this property is inherited by the discrete optimal control.
One of our main results in the constrained case is now given as follows.
Theorem 5.3. Suppose $y_\Omega \in W^{1,p^*}(\Omega)$ for some $p^* > 2$. Moreover, let either $\lambda > 1$ or Assumption 5.2 be satisfied, and assume that the optimal control has a finite number of kink points. Then there is the estimate where $r$ is equal to one for $\Lambda - \frac{1}{2} \in (1, \frac{3}{2}]$ and equal to zero otherwise.

Remark 5.4. We only consider the case $a < 0 < b$. This is because it is known that for those corners with $\Lambda_j > 1$ we have $\partial_n \bar\varphi(x_j) = 0$. In the case $a < 0 < b$, the projection formula (1.3) implies that in a neighbourhood of $x_j$ the optimal control satisfies $\bar u(x) = -\partial_n \bar\varphi(x)$, and hence its regularity is determined by that of the adjoint state. If $0 \notin [a, b]$, then the same projection formula implies that in a neighbourhood of $x_j$, $\bar u(x)$ is equal to one of the control bounds. If we suppose, as in Assumption 5.2, that this property is inherited by the solutions of the discrete approximations, then the conclusions of Theorem 5.3 remain valid.
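The dichotomy in Remark 5.4 can be illustrated with the pointwise projection alone. In the sketch below, `g` stands for the unprojected argument of the projection formula (1.3), which is a multiple of the normal derivative of the adjoint state and therefore vanishes at the corner; the exact scaling and sign are deliberately left out, and the function name and sample values are hypothetical.

```python
import numpy as np

def project(g, a, b):
    """Pointwise projection Proj_[a,b] applied to the values g."""
    return np.clip(g, a, b)

# Hypothetical samples of the unprojected quantity near a corner where it tends to zero.
g = np.array([0.2, 0.1, 0.05, 0.0])

inactive = project(g, -1.0, 1.0)  # a < 0 < b: the bounds are not active near the corner
active = project(g, 1.0, 2.0)     # 0 not in [a, b]: the control sits on a bound
```

In the first case the projected values coincide with `g`, so the regularity of the control is inherited from the adjoint state; in the second case the control is constant and equal to the lower bound near the corner.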
The proof of Theorem 5.3 is postponed to Section 5.1. Since $\Lambda > 1$ and $\lambda > 1/2$, we always have a convergence rate greater than $1/2$. This is a real improvement compared to the unconstrained case, since in the latter the convergence rates may tend to zero as the largest interior angle tends to $2\pi$. However, one may ask for a justification of Assumption 5.2. In Lemma 5.10 we will see that there exist constants $\rho_{1,j}$ and $\rho_{2,j}$ greater than zero for all $j \in H$, and a constant $h_0 > 0$ such that

Thus, we could relax Assumption 5.2 to an $h$-dependent neighborhood of those corners $x_j$ with $j \in H$. Moreover, due to (5.3), it is even possible to show the following improved result in non-convex domains without any structural assumption on the discrete optimal control, i.e., we can always expect a convergence rate close to $1/2$ in non-convex domains.
Theorem 5.5. Suppose y Ω ∈ W 1,p * (Ω) for some p * > 2, and assume that the optimal control has a finite number of kink points. Then it holds The proof of Theorem 5.5 is given in Section 5.2.
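A predicted rate of this kind is typically compared with computations through the experimental order of convergence between consecutive mesh sizes. The following minimal sketch shows that calculation. The function name is hypothetical, and the error values are synthetic, generated from $h^{1/2}$ purely to exercise the formula; they are not results from this paper.

```python
import math

def experimental_orders(mesh_sizes, errors):
    """EOC_i = log(e_i / e_{i+1}) / log(h_i / h_{i+1}) for consecutive refinement levels."""
    return [
        math.log(errors[i] / errors[i + 1]) / math.log(mesh_sizes[i] / mesh_sizes[i + 1])
        for i in range(len(mesh_sizes) - 1)
    ]

# Synthetic data: errors behaving exactly like h^(1/2) (illustration only).
mesh_sizes = [2.0 ** (-k) for k in range(2, 7)]
errors = [h ** 0.5 for h in mesh_sizes]
orders = experimental_orders(mesh_sizes, errors)
```

Each entry of `orders` is close to $0.5$ for this synthetic input, which is the behaviour one would look for when checking a rate close to $1/2$.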

Proof of Theorems 5.1 and 5.3
The results of Theorem 5.1, and Theorem 5.3 for λ > 1 directly follow from the general error estimate given in Theorem 3.2, the estimates for the adjoint state provided in Section 4 in Lemmata 4.3 and 4.7 and the error estimates for the control and the state established below in Lemma 5.6.
Lemma 5.6. Suppose y Ω ∈ H t (Ω)∩L 2 (Ω) for all t < min{1, λ − 1} and assume that the optimal control has a finite number of kink points. Then Proof. The proof starts exactly following the lines of the proof of Lemma 4.2, using the regularity stated in Lemma 2.3. In this way, if s < 1 we again obtain the desired estimate for u * h , as defined in (3.5), from [9, Eq. (7.10)]. If s ∈ [1, 3 2 ), u * h is given by (3.6). Since control constraints are now present, we have to derive error estimates for the modified Lagrange interpolant. To this end, let us consider two adjacent boundary elements E j−1 and E j belonging to E h which are determined by the line segments (x j−1 Γ , x j Γ ) and (x j Γ , x j+1 Γ ), respectively. Since we assume a finite number of kink points ofū due to the projection formula (1.3), we have to deal with the following situations (at least for h small enough): First, no kink is contained in E j−1 ∪ E j , second, there is exactly one kink ofū in E j−1 ∪ E j due to the projection formula. In the first case, we have that u * h coincides with the Lagrange interpolant on E j−1 ∪ E j such that the desired estimate on these elements is obtained by standard discretization error estimates for the Lagrange interpolant employing the regularity results from Lemma 2.3, i.e., with s < min{3/2, Λ−1/2}. In the second case, we can assume without loss of generality Using the regularity of the optimal controlū ∈ H s (Γ) ֒→ C 0,s−1/2 (Γ) with s < min{3/2, Λ − 1/2} from Lemma 2.3, we now estimate the interpolation error on each of the elements E j−1 and E j . For the error on E j−1 we obtain by means of the Hölder continuity ofū Next, recall that the nodal basis function associated with x j Γ is denoted by e j . Then we deduce for the error on E j where we again used the Hölder continuity ofū. Since we assume a finite number of kink points, the desired interpolation error estimate for u * h on Γ in case that s ∈ [1, 3 2 ) is just a combination of (5.5)-(5.7).
Since Λ ≥ λ, a straightforward application of Theorem 3.2, and Lemmata 5.6, 4.3 and 4.7 leads to an order of convergence identical to the one we have for unconstrained problems. Notice that Lemmata 4.3 and 4.7 can be used since bounds on the control do not play any role there. Thus, Theorem 5.1 and Theorem 5.3 for λ > 1 are proved.
For the results of Theorem 5.3, in case that λ < 1 and Assumption 5.2 is valid, we use the above error estimates for the control and the state, and we show in Lemmata 5.8 and 5.9 below how to improve the result for the adjoint state. Then an adaptation of the general error estimate, see Theorem 5.7, which we are going to prove next, can finally be used to combine these results. Let us definẽ Under the structural Assumption 5.2 it is clear that e h = u * h −ū h ∈ V h , so we have the following modification of the general error estimate (3.7).
Theorem 5.7. Suppose Assumption 5.2 holds. Then Proof. Since e h = u * h −ū h ∈ V h due to Assumption 5.2, the result can be obtained in the same way as in the proof of Theorem 3.2 just by replacing Next, we are concerned with discretization error estimates for the critical term in the general estimate of Theorem 5.7. First, we deal with estimates for general quasi-uniform meshes. Afterwards we show improved estimates if we assume O(h 2 )-irregular meshes.
Proof. To be able to localize the effects in the neighborhood of all corners x j with j ∈ H, we introduce a cut-off function η 1 which is equal to one in a fixed neighborhood of these corners and decays smoothly. In addition, we set η 0 = 1 − η 1 . Then we infer for the quantity of interest For the first term on the right hand side of this inequality, we directly apply Lemma 4.3 to conclude ∀s ∈ R such that s < Λ − 1 2 and s ≤ 1, (5.10) where r is equal to one for Λ − 1/2 ∈ (1, 3/2] and equal to zero else, having in mind the regularity results of Lemma 2.5 for the adjoint state and noting that the singular terms coming from the corners x j with j ∈ H do not have any influence due to the cut-off function η 0 . To deal with the second term in (5.8), letS h denote the extension operator which extends a piecewise linear function ψ h on the boundary by zero to a function in Y h . Thus,S h ψ h is equal to zero inΩ := {x ∈ Ω : |x − x j | <ρ j /2 if j ∈ H} for any ψ h ∈ V h . Moreover, let R h be the operator that maps a function in H 1 0 (Ω) to its Ritz-projection in Y 0,h . Due the properties of the discrete harmonic extension S h and the Ritz-projection R h , we obtain By applying the Hölder inequality, local W 1,∞ -discretization error estimates for the Ritz-projection from [14, Corollary 1], and (4.4), we obtain Having regard to the regularity results for the optimal adjoint state from Lemma 2.5 and by using standard interpolation error estimates and a standard finite element error estimate, we deduce which is valid for all s ∈ R such that s < 2λ and s ≤ 2. Combining (5.8)-(5.14) ends the proof.
Proof. As before, we introduce the circular sectors Ω := {x ∈ Ω : |x − x j | <ρ j /2 if j ∈ H}, For technical reasons we also need the circular sector Let the operatorsS h and R h be defined as in the proof of Lemma 5.8. Moreover, let η 1 be a smooth cut-off function which is equal to one inΩ ′′ with supp η 1 ⊂Ω ′ . In addition, we choose η 1 such that supp I h η 1 ⊂Ω ′ which is possible without any restriction for h small enough. We set η 0 := 1 − η 1 . Analogously to the foregoing proof, we infer Observe that η 0φ is equal to zero in a fixed neighborhood of all corners x j with j ∈ H. Consequently, Lemma 4.6, applied as in the proof of Lemma 4.7, yields for the latter term in (5.15) with s ∈ R such that s < Λ − 1 2 and s ≤ 3 2 . By applying the Hölder inequality, local W 1,∞ -discretization error estimates for the Ritz-projection from [14,Corollary 1], and (4.4), we obtain for the first term in (5.15) where we used that η 1φ and I h (η 1φ ) are equal to zero in Ω\Ω ′ . Usual error estimates for the Ritz-projection and a standard embedding yield which is valid for all s ∈ R such that s < 2λ and s ≤ 2. This ends the proof.
Finally, an application of Theorem 5.7, and Lemmata 5.6, 5.8 and 5.9, yield the results of Theorem 5.3 where λ < 1 and Assumption 5.2 is satisfied.

Proof of Theorem 5.5
In this subsection we prove Theorem 5.5. That is, we show a convergence rate close to $1/2$ for the optimal controls and states in the constrained case if the domain is non-convex, even if the structural Assumption 5.2 does not hold. For that purpose, recall that $\{x_j\}$ denotes the corners of $\Gamma$, $\{x^i_\Gamma\}$ is the set of boundary nodes of the mesh and $\{e_i\}$ is the basis of $U_h$ such that $e_i(x^k_\Gamma) = \delta_{ik}$. Thus, every function $u_h \in U_h$ can be written as $u_h = \sum_{i=1}^{N(h)} u_h(x^i_\Gamma)\, e_i$. By testing the discrete variational inequality appropriately, we deduce

Lemma 5.10. For each interior angle $\omega_j > \pi$ for which $c_{j,1}$ from (2.3) is nonzero, there are two constants $\rho_{1,j}$ and $\rho_{2,j}$ greater than zero such that

Proof. In the following we focus only on one non-convex corner $x_j$. Without loss of generality let $c_{j,1}$ be greater than zero. Hence the normal derivative of $\bar\varphi$ is negative, the lower bound of the control is active, and $\nu \bar u - \partial_n \bar\varphi > 0$ in the vicinity of this corner. We need to show that there are two constants $\rho_{1,j}$ and $\rho_{2,j}$ such that for all nodes $x_{h,i}$ with $|x_{h,i} - x_j| \in [\rho_{1,j} h |\log h|^{1/2}, \rho_{2,j}]$. According to [20, Theorem 3.4], we know that where the function $\zeta_{j,1}$ is of the form where $\xi_j$ denotes the cut-off function introduced at the beginning of Section 2, the function $z_{j,1}$ solves

$$-\Delta z_{j,1} = [\Delta, \xi_j]\, \pi^{-1/2} r_j^{-\lambda_j} \sin(\lambda_j \theta_j) \ \text{in } \Omega, \qquad z_{j,1} = 0 \ \text{on } \Gamma,$$

and $[a, b] = ab - ba$ denotes the commutator. According to Theorem 5.1, we deduce the existence of a constant $h_0 > 0$ such that for all $h < h_0$ there holds due to the assumption $c_{j,1} = (\bar y - y_\Omega, \zeta_{j,1})_{L^2(\Omega)} > 0$. Using this result, we will show that the singular part of the function $\hat\varphi$ which solves

$$-\Delta \hat\varphi = \bar y_h - y_\Omega \ \text{in } \Omega, \qquad \hat\varphi = 0 \ \text{on } \Gamma,$$

behaves like the singular part of $\bar\varphi$. Indeed, $\hat\varphi$ admits the splitting

$$\hat\varphi = \hat\varphi_r + \hat\varphi_s \tag{5.17}$$

according to Lemma 2.1. The regular part $\hat\varphi_r$ belongs to $W^{2,q}(\Omega)$, at least for some $q > 2$, since $\bar y_h - y_\Omega$ belongs to $L^q(\Omega)$ due to the convergence result of Theorem 5.1.
The singular part $\hat\varphi_s$ can be written as where the constant $\hat c_{j,1}$ is greater than zero according to (5.16). Assuming that $|x_\Gamma^i - x_j|$ is already small enough such that $\xi_j \equiv 1$ on $\operatorname{supp} e_i$, we get by basic calculations where we used that $\|\bar u_h\|_{L^\infty(\Gamma)} \le \max\{|a|, |b|\}$ and that $\partial_n\hat\varphi_r$ is uniformly bounded in $L^\infty(\Gamma)$ due to the embedding $W^{2,q}(\Omega) \hookrightarrow W^{1,\infty}(\Omega)$ for $q > 2$. As before, let us denote by $\tilde S_h$ the operator which extends any function of $U_h$ to one in $Y_h$ by zero. Also observe that $\hat\varphi_h$ is the Ritz-projection $R_h\hat\varphi$ of $\hat\varphi$. Then integration by parts, the definition of $\partial_n^h\hat\varphi_h$ in (3.3) and (5.17) yield where we employed a standard discretization error estimate for the Ritz-projection, (4.4) and an inverse inequality in the last step. Now we proceed as in the proof of Lemma 4.3 between (4.13) and (4.17). Let $\sigma$ and the subdomains $\Omega_J$ be defined as in that proof and let the index $J$ be chosen such that $\operatorname{supp}\tilde S_h e_i \subset \Omega_J$. Assume that $|x_\Gamma^i - x_j| \ge c_I h$ with a constant $c_I$ large enough such that local $W^{1,\infty}$-error estimates for the Ritz-projection are applicable. Then those estimates of [14, Corollary 1] and (4.4) yield

Now, we redefine the sets $\tilde\Gamma$ and $\tilde\Omega$ by and we set again Moreover, let $\Gamma_c := \Gamma \setminus \tilde\Gamma$. We have the following modification of the general error estimate.

Theorem 5.12. For the solution of the continuous and the discrete optimal control problem we have Note that the first term on the left hand side of (5.22) is a norm with respect to $\Gamma_c$.
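Proof. We proceed as in the proof of Theorem 3.2, but we test the optimality conditions with different functions. For that purpose, let us introduce $\tilde u \in U_{ad}$ and $\tilde u_h \in U_{ad}^h$ by
$$\tilde u = \begin{cases} \bar u & \text{a.e. in } \tilde\Gamma, \\ u_h & \text{a.e. in } \Gamma_c, \end{cases} \qquad \tilde u_h = \begin{cases} \bar u_h & \text{a.e. in } \tilde\Gamma, \\ u_h^* & \text{a.e. in } \Gamma_c. \end{cases}$$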
Note that $\bar u$ and $u_h^*$ are constant, and even coincide, on $\tilde\Gamma$, and that $\bar u_h$ is equal to $\bar u$ at least at all nodes $x_{h,i}$ with $|x_{h,i} - x_j| \in [\rho_{1,j}\, h|\log h|^{1/2}, \rho_{2,j}]$ for $j \in H$ according to Lemma 5.10. Next, we define the intermediate error $e_h := \tilde u_h - \bar u_h$, which is equal to zero in $\tilde\Gamma$. Then, we obtain To deal with the third term, we take into account the continuity of $S_h$: Accordingly, we only need estimates for the second and fourth terms in (3.8). We begin by estimating the second one; as we will see, this also yields an estimate for the fourth term. There holds Next, we consider the second term of (5.25) in detail. By adding the continuous and discrete variational inequalities with $u = \tilde u$ and $u_h = \tilde u_h$, respectively, we deduce Rearranging terms and using (3.4) leads to By collecting the estimates (5.25) and (5.26) we obtain From the Young inequality we can deduce Finally, the assertion is a consequence of (5.23), (5.24) and (5.27).
Since the aim of the experiment is to measure the order of convergence of the $L^2(\Gamma)$ error in the control variable, we have solved the problems on two quasi-uniform families of $J$ nested meshes obtained by dyadic refinement from a coarse initial mesh. One of them is built such that it does not have the superconvergence property (see Figure 1), while the other is obtained using regular refinement, which results in an $O(h^2)$-irregular family which has the superconvergence property (see Figure 2). The finest mesh has between 1 million and 3.15 million nodes, depending on the geometry of the domain. Notice that these fine meshes induce boundary meshes that have only between 4 thousand and 7 thousand nodes. To solve the optimization problem, we have used a semismooth Newton method; see [16] for the details.
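To get a feeling for the mesh sizes quoted above, the following minimal sketch counts nodes under dyadic refinement. It assumes, purely for illustration, a structured triangulation of the unit square with the same number of intervals on each side, which is not one of the polygonal domains of the experiments; the function name `mesh_counts` and the parameter `n0` are hypothetical.

```python
def mesh_counts(level, n0=1):
    """Node counts for a structured triangulation of the unit square.

    Assumption (illustration only): the initial mesh has n0 intervals per
    side and every dyadic refinement step halves the mesh size.
    """
    n = n0 * 2 ** level          # intervals per side after `level` refinements
    total_nodes = (n + 1) ** 2   # all vertices of the grid
    boundary_nodes = 4 * n       # vertices lying on the boundary
    return total_nodes, boundary_nodes


if __name__ == "__main__":
    for level in range(8, 11):
        total, boundary = mesh_counts(level)
        print(f"level {level}: {total} nodes, {boundary} boundary nodes")
```

Under this assumption the number of boundary nodes grows only like the square root of the total number of nodes, which is why a mesh with about a million nodes carries only a few thousand control unknowns.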
In the examples where the optimal control is continuous, we measure the error at the mesh at level $j = 1, \ldots, J$ as where $\bar u_{h_j}$ is the solution of $(P_{h_j})$ and $I_{h_j} : C(\Gamma) \to U_{h_j}$ is the nodal Lagrange interpolation operator. If the exact solution is singular at the point $x_0 = (0, 0)$, we simply approximate $I_{h_j}\bar u(0, 0) \approx \bar u(\varepsilon, 0)$, for some $\varepsilon > 0$ small enough.
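Given the measured errors on consecutive levels, an experimental order of convergence can be estimated as sketched below. The sketch assumes that the mesh size is halved from one level to the next (dyadic refinement), so that the order between levels $j$ and $j+1$ is $\log_2(e_j / e_{j+1})$. The function name and the synthetic error values are hypothetical and are not data from the experiments.

```python
import math


def experimental_orders(errors):
    """Return log2(e_j / e_{j+1}) for consecutive error values.

    Assumption: the mesh size is halved between consecutive levels.
    """
    return [math.log2(e_coarse / e_fine)
            for e_coarse, e_fine in zip(errors, errors[1:])]


if __name__ == "__main__":
    # Synthetic errors behaving exactly like h^(1/2) with h_j = 2^(-j);
    # these are illustrative values, not results from the paper.
    synthetic = [2.0 ** (-0.5 * j) for j in range(1, 7)]
    print(experimental_orders(synthetic))
```

For errors that decay like a fixed power of $h$, the returned values are constant and equal to that power; in practice one looks at the values on the finest levels, where the asymptotic regime is most likely to have been reached.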
Notice that for the case $5\pi/4 < \omega_1 < 3\pi/2$ we have $\omega_6 = 2\pi - \omega_1 \in (\pi/2, \pi)$, and for the case $7\pi/4 < \omega_1 < 2\pi$ we have $\omega_7 = 5\pi/2 - \omega_1 \in (\pi/2, \pi)$. So when we choose $\lambda = \lambda_1$ in the definition of $\bar\varphi$ and solve a constrained problem, the leading singular exponent to be taken into account should be, respectively, $\lambda_6$ or $\lambda_7$. Nevertheless, the exact adjoint state has been chosen in such a way that in the first case $c_{6,m} = 0$ and in the second case $c_{7,m} = 0$ for $m = 1, 2, 3$, so for this example we need not take this into account.
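The angle relations above are easy to check numerically. The sketch below assumes the standard leading singular exponent $\lambda_j = \pi/\omega_j$ for the Dirichlet problem at a corner with interior angle $\omega_j$, which is consistent with the exponent 2/3 for a 270° corner but is not restated in this section; the helper names are hypothetical.

```python
import math


def leading_exponent(omega):
    """Leading singular exponent pi/omega (assumed definition of lambda_j)."""
    return math.pi / omega


def companion_angle(omega_1):
    """Angle omega_6 or omega_7 as given in the text, depending on omega_1."""
    if 5 * math.pi / 4 < omega_1 < 3 * math.pi / 2:
        return 2 * math.pi - omega_1          # omega_6
    if 7 * math.pi / 4 < omega_1 < 2 * math.pi:
        return 5 * math.pi / 2 - omega_1      # omega_7
    raise ValueError("omega_1 lies outside the two ranges discussed")


if __name__ == "__main__":
    for omega_1 in (11 * math.pi / 8, 15 * math.pi / 8):
        omega = companion_angle(omega_1)
        assert math.pi / 2 < omega < math.pi
        print(leading_exponent(omega_1), leading_exponent(omega))
```

In both ranges the companion angle lies in $(\pi/2, 3\pi/4)$, so under the assumed formula its exponent lies between 4/3 and 2, whereas $\lambda_1 < 1$ at the non-convex corner.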
We fix $\nu = 1$. For constrained problems, we consider $a = -1/\lambda_1$ and $b = 1$. We choose $a$ such that the asymptotic behavior of the error shows up for the mesh sizes used. If $|a|$ were too big, the problem would behave like an unconstrained one on our meshes; on the other hand, if $|a|$ were too small, we would be approximating an optimal control that is nearly constant, and the experimental orders of convergence would be too high on our meshes.
Graphs with the experimental results can be found in Figures 3 and 4. The experimental results are in good agreement with the theoretical estimates.