Convexification for a 1D Hyperbolic Coefficient Inverse Problem with Single Measurement Data

A version of the convexification numerical method for a Coefficient Inverse Problem for a 1D hyperbolic PDE is presented. The data for this problem are generated by a single measurement event. This method converges globally. The most important element of the construction is the presence of the Carleman Weight Function in a weighted Tikhonov-like functional. This functional is strictly convex on a certain bounded set in a Hilbert space, and the diameter of this set is an arbitrary positive number. The global convergence of the gradient projection method is established. Computational results demonstrate good performance of the numerical method for noisy data.


INTRODUCTION
We call a numerical method for a Coefficient Inverse Problem (CIP) globally convergent if there exists a rigorous guarantee that this method delivers at least one point in a sufficiently small neighborhood of the exact solution without an assumption that the starting point of iterations is located sufficiently close to that solution. We construct here a globally convergent numerical method for a CIP for a 1D hyperbolic PDE. This CIP has a direct application in standoff imaging of dielectric constants of explosive-like targets using experimentally collected data. Our numerical method is a version of the so-called convexification concept. The convexification method for our CIP has not been constructed before. Thus, we develop some new ideas here. Just as in all previous publications about the convexification, which are cited below, we work with the data resulting from a single measurement event. Thus, our data depend on one variable.
It is the CIP (1.3), (1.4) for which we develop here the convexification method. It is well known that, given (1.2), the functions f_0(t), f_1(t) for t ∈ (0, 2) (i.e. for T = 2) uniquely determine the function a(x), and also the Lipschitz stability estimate holds; see Theorem 2.6 in Section 3 of Chapter 2 of [34] as well as Figure 1(b).
We start by applying a widely known change of variables, see e.g. [21,34].
x ↔ y, x = x(y). Equations (1.7) look exactly like equations (1.3)-(1.4). Hence, we have reduced the CIP (1.5)-(1.6) to our CIP (1.3)-(1.4). This justifies the applied aspect of our CIP. On the other hand, due to the presence of the unknown coefficient c(y) in the principal part of the hyperbolic operator of (1.5), the CIP (1.5)-(1.6) is harder to work with than the CIP (1.3)-(1.4). Therefore, it makes sense, as the first step, to develop a numerical method for the CIP (1.3)-(1.4). Next, one might adapt that technique to problem (1.5)-(1.6). This first step is done in the current paper.
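If, as in the standard references [21,34], the change of variables is the travel-time map x(y) = ∫_0^y ds/c(s) (an assumption here, since the displayed formula is not reproduced), it can be tabulated numerically. A minimal sketch with a hypothetical wave speed c(y) ≥ 1:

```python
import numpy as np

# Travel-time change of variables x(y) = ∫_0^y ds / c(s) (assumed to be the
# transform meant above), tabulated with the cumulative trapezoid rule.
# The wave speed c(y) >= 1 below is a hypothetical example.
def travel_time(y, c_vals):
    integrand = 1.0 / c_vals
    steps = 0.5 * (integrand[1:] + integrand[:-1]) * np.diff(y)
    return np.concatenate(([0.0], np.cumsum(steps)))

y = np.linspace(0.0, 1.0, 201)
c = 1.0 + 0.5 * np.exp(-50.0 * (y - 0.5) ** 2)   # hypothetical c(y) >= 1
x = travel_time(y, c)

# the map is strictly increasing, hence invertible by interpolation:
y_back = np.interp(0.5 * x[-1], x, y)            # y at half the total travel time
```

Since c ≥ 1 here, the map satisfies x(y) ≤ y and is strictly increasing, so y(x) is recovered by interpolation.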
The CIP (1.5)-(1.6) has an application in acoustics [8]. Another quite interesting application is in inverse scattering of electromagnetic waves, in which case c^{-2}(y) = ε_r(y), where ε_r(y) is the spatially distributed dielectric constant. Using the data experimentally collected by the US Army Research Laboratory, it was demonstrated in [21,23,31] that the 1D mathematical model based on equation (1.5) can be used quite effectively to image, in the standoff mode, dielectric constants of targets mimicking explosives, such as, e.g., antipersonnel land mines and improvised explosive devices. In fact, the original data in [22,23,31] were collected in the time domain. However, the mathematical apparatus of these references works only either with the Laplace transform [31] or with the Fourier transform [22,23] of equation (1.5) with respect to t. In contrast, we hope that an appropriately modified version of the technique of this paper will allow us in the future to work with those experimental data directly in the time domain.
Of course, the knowledge of the dielectric constant alone is insufficient to differentiate between explosives and non-explosives. However, we believe that this knowledge might be used in the future as an additional ingredient, complementing the currently existing features used in the classification procedures for such targets, so that it would decrease the current false alarm rate; see, e.g., page 33 of [31] for a similar conclusion.
Any CIP, including the CIP (1.3)-(1.4), is both nonlinear and ill-posed. These two factors cause the phenomenon of multiple local minima and ravines of conventional Tikhonov least squares cost functionals for CIPs; see, e.g., the work of Scales, Fischer and Smith [35] for a convincing numerical example of this phenomenon. On the other hand, any version of the gradient method for the minimization of such a functional stops at whichever local minimum it reaches. Therefore, a numerical reconstruction technique based on the minimization of that functional is unreliable.
As to other globally convergent numerical methods for 1D CIPs for wave-like equations, we refer to the Gelfand-Levitan method; see the works of Kabanikhin and coauthors [10-12] for both the 1D and 2D cases. Another globally convergent numerical method is developed in the works of Korpela, Lassas and Oksanen [29,30], where a CIP for equation (1.5) is studied without the above change of variables. The data of [29,30] depend on two variables, since those are the Neumann-to-Dirichlet data.
Being motivated by the goal of avoiding the above discussed phenomenon of multiple local minima and ravines of conventional least squares Tikhonov functionals, Klibanov with coauthors has been working on the convexification since 1995; see [4,16-18,21] for the initial works on this topic. The publication of Bakushinskii, Klibanov and Koshev [1] has addressed some questions which were important for the numerical implementation of the convexification. This opened the door for some follow-up publications about the convexification, including the current one, with a variety of computational results [13,23,24,27,28]. In addition, we refer to the work of Baudouin, de Buhan and Ervedoza [2], where a different version of the convexification is developed for a CIP for the hyperbolic equation u_tt = ∆u + a(x)u. However, there is a significant difference in the statements of the CIPs between [2,4,26,28] and the above mentioned publications on the convexification. More precisely, [2,4,26,28] work exactly within the framework of the Bukhgeim-Klibanov method. One of the requirements of this method in the hyperbolic case of [2,4] is that one of the initial conditions does not vanish in the entire domain of interest. In the parabolic case of [26,28], the requirement is that the solution of the corresponding initial boundary value problem does not vanish at a certain moment of time, which is not the initial moment of time. On the other hand, all other publications on the convexification, including the current one, do not use that "non-vanishing" condition, even though they still use the ideas of [7].
As to the Bukhgeim-Klibanov method, it originated in [7] with the sole goal at that time (1981) of proving global uniqueness theorems for multidimensional CIPs with single measurement data. This method is based on Carleman estimates. The convexification extends the idea of [7] from the initial, purely uniqueness-oriented topic to the more applied topic of numerical methods for CIPs. Many publications of many authors are devoted to the method of [7] applied to a variety of CIPs, again with the goal of proving uniqueness and stability results for those CIPs. Since the current paper is not a survey of that technique, we refer here only to a few of such publications [3,14,15,18,20].
All functions below are real valued ones. In Section 2 we derive a boundary value problem for a quasilinear integro-differential equation. In Section 3 we describe the convexification method for solving this problem. We formulate our theorems in Section 4. Their proofs are in Section 5. Numerical results are presented in Section 6.

QUASILINEAR INTEGRO-DIFFERENTIAL EQUATION
Let H(x) be the Heaviside function. Problem (1.3) is equivalent to the integral equation (2.1); see Section 3 of Chapter 2 of [34]. It follows from (2.2) and (1.2) that the first line of (2.1) can be rewritten as (2.3) [34]. In fact, (2.3) is a linear integral equation of the Volterra type with respect to the function u(x,t) [34] and can be solved accordingly. Also, u(x,t) = 0 for t < |x| and u(x,t) ∈ C^3({(x,t) : t ≥ |x|}). Furthermore, lim_{t→|x|+} u(x,t) = 1/2 and inequality (2.6) holds.
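Linear Volterra equations of the second kind, u(t) = f(t) + ∫_0^t K(t,s) u(s) ds, the class to which (2.3) belongs, can be solved by successive approximations, which converge for any Volterra-type kernel. The sketch below is generic, not the specific equation (2.3): the kernel K and right-hand side f are hypothetical stand-ins, chosen so that the exact solution u(t) = e^t is known.

```python
import numpy as np

# Successive approximations (Picard iteration) for a Volterra equation of the
# second kind on a uniform grid.  K and f are hypothetical stand-ins: with
# K == 1 and f == 1 the exact solution is u(t) = exp(t).
def solve_volterra(f_vals, K, t, n_iter=60):
    """Picard iteration with the trapezoid rule for the running integral."""
    dt = t[1] - t[0]
    u = f_vals.copy()
    for _ in range(n_iter):
        u_new = f_vals.copy()
        for i in range(1, len(t)):
            ks = K(t[i], t[:i + 1]) * u[:i + 1]
            # trapezoid rule for the integral from 0 to t_i of K(t_i, s) u(s) ds
            u_new[i] += dt * (ks.sum() - 0.5 * (ks[0] + ks[-1]))
        u = u_new
    return u

t = np.linspace(0.0, 1.0, 101)
u = solve_volterra(np.ones_like(t), lambda ti, s: np.ones_like(s), t)
err = np.max(np.abs(u - np.exp(t)))
```

The iteration converges geometrically for Volterra kernels; the remaining discrepancy is the O(dt^2) quadrature error of the trapezoid rule.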

Integro-differential equation
It follows from (2.9) that we can consider the function q(x,t) = ln v(x,t). Using (2.8)-(2.10), we obtain equation (2.11). Equation (2.11) has two unknown functions, q(x,t) and a(x), which is inconvenient. On the other hand, the function a(x) is "isolated" in (2.11) and does not depend on t. Therefore, we follow the first step of the method of [7]. More precisely, we differentiate equation (2.11) with respect to t, thus eliminating the unknown coefficient from this equation and obtaining an integro-differential equation.

Let
w(x,t) = q_t(x,t). (2.14)
Then (2.12) and (2.14) imply an equation for w(x,t). Define the quasilinear integro-differential operator L as in (2.16). As to the domain Tr in (2.17), it is clear that the change of variables (2.7) transforms the rectangle D(0,t) of Figure 1(b) into the triangle Tr; see Figure 1(c). Hence, we can uniquely determine the functions w(x,t) and q(x,t) only for (x,t) ∈ Tr.

Absorbing boundary conditions
Lemma 2.2. For every two numbers A ≥ 1 and B > 0, the function u(x,t) satisfies the absorbing boundary conditions (2.24). Remark 2.2. Engquist and Majda proposed imposing absorbing boundary conditions in numerical simulations of the propagation of waves [9]. Lemma 2.2 implies that, unlike in [9], in the case of problem (1.3) this condition need not be imposed, since it holds automatically.
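For illustration, a minimal 1D simulation of the first-order Engquist-Majda (Mur-type) absorbing condition at both endpoints of an interval: an outgoing pulse leaves the computational domain with only a small reflected residue. This is a generic sketch of the condition of [9], not the scheme of this paper; all grid parameters and the pulse shape are assumptions.

```python
import numpy as np

# Leapfrog scheme for u_tt = u_xx on (0,1) with first-order absorbing
# (Mur-type) boundary conditions at both ends.  A Gaussian pulse splits into
# two outgoing halves, which should exit with little reflection.
nx, nt = 401, 600
dx = 1.0 / (nx - 1)
dt = 0.5 * dx                              # CFL number 0.5, wave speed 1
x = np.linspace(0.0, 1.0, nx)

u_prev = np.exp(-300.0 * (x - 0.5) ** 2)   # initial pulse, zero initial velocity
u = u_prev.copy()
energy0 = np.sum(u ** 2)

coef = (dt - dx) / (dt + dx)               # Mur boundary coefficient
for _ in range(nt):
    u_next = np.empty_like(u)
    u_next[1:-1] = (2.0 * u[1:-1] - u_prev[1:-1]
                    + (dt / dx) ** 2 * (u[2:] - 2.0 * u[1:-1] + u[:-2]))
    u_next[0] = u[1] + coef * (u_next[1] - u[0])        # absorbing, left end
    u_next[-1] = u[-2] + coef * (u_next[-2] - u[-1])    # absorbing, right end
    u_prev, u = u, u_next

residual = np.sum(u ** 2) / energy0        # fraction of energy left inside
```

By the final time both halves of the pulse have crossed the boundaries, so `residual` is small; with a reflecting (Dirichlet) boundary it would stay near 1.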

Reconstruction of the unknown coefficient
It follows from (2.11), (2.12) and (2.14) that the unknown coefficient a(x) is recovered directly from the function w; see (2.25). Hence, we focus below on the numerical solution of the boundary value problem (2.17), (2.18).
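Consistent with the formula a_n(x) = 2(w_n)_x(x, 0) used later in the numerical implementation, the coefficient can be recovered from the x-derivative of the computed w at the t = 0 layer. A sketch with a synthetic grid function w (assumed, for illustration only):

```python
import numpy as np

# Recovery of the coefficient from the t = 0 layer of a computed w, assuming
# a(x) = 2 w_x(x, 0) as in the formula a_n = 2 (w_n)_x(x, 0) of Section 6.
# The grid function w below is synthetic.
nx, nt = 61, 51
x = np.linspace(0.0, 1.0, nx)
t = np.linspace(0.0, 1.0, nt)
X, T = np.meshgrid(x, t, indexing="ij")

w = 0.5 * np.sin(np.pi * X) * (1.0 + T)      # hypothetical minimizer on the grid

wx0 = np.gradient(w[:, 0], x)                # dw/dx at the t = 0 layer
a_rec = 2.0 * wx0                            # recovered coefficient

a_true = np.pi * np.cos(np.pi * x)           # exact 2 w_x(x,0) for this w
err = np.max(np.abs(a_rec - a_true))
```

`np.gradient` uses second-order differences, so the recovery error is O(h_x^2) for a smooth w.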

Convexification in brief
Given a CIP, the first step of the convexification follows the first step of [7], in which the unknown coefficient is eliminated from the PDE via differentiation with respect to a parameter on which that coefficient does not depend. In particular, in our case we have replaced equation (2.11), containing the unknown coefficient a(x), with a quasilinear integro-differential equation which does not contain that coefficient. Next, one should solve the corresponding boundary value problem, which is similar to problem (2.17), (2.18). To solve that boundary value problem, a weighted Tikhonov-like functional J_λ is constructed, where λ ≥ 1 is a parameter. The weight is the Carleman Weight Function (CWF), which is involved in the Carleman estimate for the principal part of the operator of that integro-differential equation. In our case, that principal part is the operator ∂_x^2 − 2∂_x∂_t; see (2.16) and (2.17).
The above mentioned functional is minimized on a convex bounded set with diameter 2d, where d > 0 is an arbitrary number. This set is a part of a Hilbert space H^k; in our case, k = 3. The key theorem is that one can choose a sufficiently large value λ(d) ≥ 1 of the parameter λ such that the functional J_λ is strictly convex on that set for all λ ≥ λ(d). Next, one proves that, for these values of λ, the gradient projection method for the minimization of the functional J_λ converges to the correct solution of the CIP starting from an arbitrary point of the above mentioned set, as the level of noise in the data tends to zero. Given that the diameter 2d of that set is an arbitrary number and that the starting point is also arbitrary, this is global convergence in the sense of the first sentence of the Introduction.
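The scheme above can be illustrated on a toy problem: the sketch below runs the gradient projection method on a ball B(d) for a strictly convex quadratic functional, and it converges from an arbitrary starting point of the ball, mirroring the global convergence statement. This toy functional is not the J_λ of the paper; all names and sizes are assumptions.

```python
import numpy as np

# Gradient projection on the ball B(d) = {w : ||w|| <= d} for a strictly
# convex quadratic J(w) = w.Aw/2 - b.w.  Starting from an arbitrary point of
# the ball, the iterates converge to the minimizer.  Toy example only.
rng = np.random.default_rng(0)
n, d = 20, 5.0
G = rng.standard_normal((n, n))
A = G @ G.T + n * np.eye(n)            # symmetric positive definite Hessian
b = rng.standard_normal(n)
w_star = np.linalg.solve(A, b)         # unconstrained minimizer, lies inside B(d)

def project(w):
    """Orthogonal projection onto the ball of radius d."""
    nrm = np.linalg.norm(w)
    return w if nrm <= d else w * (d / nrm)

gamma = 1.0 / np.linalg.norm(A, 2)     # step size 1/L, L = largest eigenvalue
w = project(5.0 * rng.standard_normal(n))   # arbitrary starting point in B(d)
for _ in range(500):
    w = project(w - gamma * (A @ w - b))    # gradient step followed by projection

dist = np.linalg.norm(w - w_star)
```

Since the projection onto a convex set is nonexpansive and the gradient step is a contraction for a strictly convex quadratic, `dist` decays geometrically regardless of the starting point.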
It is worth noting that even though the theory requires the parameter λ to be sufficiently large, our rich computational experience tells us that computations are far less pessimistic than the theory. More precisely, in all our numerically oriented publications on the convexification, including the current one, accurate numerical results are obtained for λ ∈ [1, 3]; see [1,13,19,23,24,26,27].

The Tikhonov-like functional with the Carleman weight in it
We construct this functional to solve problem (2.17), (2.18). Let α ∈ (0, 1/2). Our CWF is given in (3.1), where λ ≥ 1 is a parameter; see Theorem 4.1 in Section 4 for the Carleman estimate with this CWF. Even though we can find the function w(x,t) only in the triangle Tr in (2.20), it is convenient for our numerical study to integrate over the rectangle R in (3.2). The absorbing boundary condition (2.24) with A = 1, together with (2.7) and (2.14), implies boundary condition (3.3). Let d > 0 be an arbitrary number and define the set B(d, p_0, p_1) accordingly. Let β ∈ (0, 1) be the regularization parameter and L(w) the operator defined in (2.16). Our weighted Tikhonov-like functional is J_{λ,β}(w) given in (3.5).
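The substitution v = u e^{−λ(x+αt)} in the proof of Theorem 4.1 suggests that the CWF of (3.1) has the form φ_λ(x,t) = e^{−2λ(x+αt)}; we assume that form below. The sketch evaluates the weighted residual term ∫_R (Lw)^2 φ_λ dx dt for the principal part ∂_x^2 − 2∂_x∂_t on a uniform grid over a rectangle R = (0,1) × (0,T); the grid sizes and the test function are assumptions.

```python
import numpy as np

# Assumed CWF phi_lambda(x,t) = exp(-2*lam*(x + alpha*t)), read off from the
# substitution v = u e^{-lam(x+alpha t)} in the proof of Theorem 4.1, tabulated
# on a uniform grid over R = (0,1) x (0,T).  The weighted residual below uses
# the principal part L w = w_xx - 2 w_xt; the test function w is an assumption.
lam, alpha, T = 2.0, 0.3, 1.0
nx, nt = 61, 51
x = np.linspace(0.0, 1.0, nx)
t = np.linspace(0.0, T, nt)
dx, dt = x[1] - x[0], t[1] - t[0]
X, Tm = np.meshgrid(x, t, indexing="ij")

phi = np.exp(-2.0 * lam * (X + alpha * Tm))   # Carleman weight (assumed form)

w = np.sin(np.pi * X) * Tm ** 2               # smooth test function on the grid
w_x = np.gradient(w, x, axis=0)
w_xx = np.gradient(w_x, x, axis=0)
w_xt = np.gradient(w_x, t, axis=1)
residual = w_xx - 2.0 * w_xt                  # principal part of the operator

J_weighted = np.sum(residual ** 2 * phi) * dx * dt   # weighted least-squares term
```

Note how the weight equals 1 at the origin and decays exponentially in x and t, so the functional emphasizes the data near the corner where the coefficient is eventually recovered.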

Estimating an integral
We use Lemma 3.1 in the proof of Theorem 4.2 (section 4).
Lemma 3.1. For any function g ∈ L^2(R) the estimate (3.6) is valid. Proof. Using (3.1), integration by parts and the Cauchy-Schwarz inequality, we obtain (3.7); here, we have used the fact that the first term in the third line of the corresponding computation is negative. Dividing both sides of (3.7) by √I and squaring both sides of the resulting inequality, we obtain (3.6).

THEOREMS
Introduce the subspaces H^2_0(R) and H^3_0(R). Theorem 4.1. In the CWF φ_λ(x,t) given in (3.1), let α ∈ (0, 1/2). Then there exist constants C = C(α) > 0 and λ_0 = λ_0(α) ≥ 1, depending only on α, such that for all functions u ∈ H^2_0(R) and for all λ ≥ λ_0 the Carleman estimate (4.1) is valid. Note that in Carleman estimates one usually cannot ensure the signs of integrals over hypersurfaces, whereas here two such boundary terms are positive. In particular, using (2.25), it is shown below that the positivity of these two terms is quite helpful in the reconstruction of the unknown coefficient a(x), as soon as an approximation for the function w_x(x, 0) is found.

PROOFS
Below in this section (x,t) ∈ R, where R is the rectangle defined in (3.2).

Proof of Theorem 4.1
In this proof, C = C(α) > 0 denotes different constants depending only on α. We assume in this proof that the function u ∈ C^2(R̄) ∩ H^2_0(R). The more general case u ∈ H^2_0(R) can be obtained from this one via density arguments.
Introduce a new function
v(x,t) = u(x,t) e^{−λ(x+αt)} (5.1)
and express u_xx − 2u_xt via the derivatives of the function v(x,t); we obtain (5.2). We estimate from below, in two steps, the two products in the second line of (5.2) involving v_x and v_t.
Step 1. We estimate the product involving v_x; thus, we obtain at the first step (5.3). The second step gives (5.4). Summing up (5.3) with (5.4) and taking into account (5.2), we obtain (5.5). Hence, by Young's inequality, (5.6) follows. Thus, in order to ensure the positivity of both terms in the right-hand side of (5.6), we should have 1/2 < ε < 1 − α. We take ε as the average of the lower and upper bounds of these two inequalities, ε = ((1/2) + (1 − α))/2 = (3 − 2α)/4, and (5.6) becomes (5.7). Note that since u ∈ C^2(R̄) ∩ H^2_0(R), then by (5.1) v(0,t) = v_x(0,t) = 0. Hence, integrating (5.5) over R and taking into account (5.7), we obtain (5.8). We now replace in (5.8) the function v with the function u via (5.1). Hence, (5.8) implies an estimate equivalent to (4.1).
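The admissibility of the chosen ε can be checked directly: the average of the two bounds is ε = ((1/2) + (1 − α))/2 = (3 − 2α)/4, and it satisfies 1/2 < ε < 1 − α for every α ∈ (0, 1/2).

```python
import numpy as np

# The proof requires 1/2 < eps < 1 - alpha and takes eps as the average of the
# two bounds: eps = ((1/2) + (1 - alpha)) / 2 = (3 - 2*alpha) / 4.  A direct
# check that this choice is admissible for every alpha in (0, 1/2):
alphas = np.linspace(1e-6, 0.5 - 1e-6, 1000)
eps = (3.0 - 2.0 * alphas) / 4.0
ok = np.all((eps > 0.5) & (eps < 1.0 - alphas))
```

Indeed, ε > 1/2 is equivalent to 3 − 2α > 2, i.e. α < 1/2, and ε < 1 − α is equivalent to 3 − 2α < 4 − 4α, i.e. again α < 1/2.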

Proof of Theorem 4.6
The existence of the number θ ∈ (0, 1) as well as convergence rates (4.15) and (4.16) follow immediately from a combination of Theorem 4.2 with Theorem 2.1 of [1]. Convergence rate (4.17) follows immediately from the triangle inequality, (4.13) and (4.15). Similarly, convergence rate (4.18) follows from the triangle inequality, (4.13) and (4.16).

NUMERICAL IMPLEMENTATION
The finite difference approximations of the differential operators in (2.16) are used on a rectangular mesh with step sizes h = (h_x, h_t). Denote w(x_i, t_j) = w_{i,j}. We write the functional J_{λ,β}(w) in (3.5) in the finite difference form (6.1). Next, we minimize functional (6.1) with respect to the values w_{i,j} of the unknown function w(x,t) at the grid points (x_i, t_j). To speed up computations, the gradient of the functional (6.1) is written in explicit form, using Kronecker symbols, as in [19]. For brevity, we do not present these formulas here.
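The explicit gradient formulas are omitted above, but their structure is standard: for a discretized quadratic term J(w) = Σ_k φ_k ((Lw)_k)^2 with a difference matrix L and Carleman weights φ_k, the gradient is ∇J = 2 Lᵀ(φ ∘ Lw). A sketch with a one-dimensional second-difference matrix standing in for the operator of (2.16) (the matrix, weights and grid are assumptions), verified against a central difference quotient:

```python
import numpy as np

# Structure of the explicit gradient for a discretized weighted quadratic term
# J(w) = sum_k phi_k ((L w)_k)^2: grad J = 2 L^T (phi * L w).  L below is a 1D
# second-difference matrix standing in for the operator of (2.16).
n = 50
h = 1.0 / (n - 1)
L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h ** 2
phi = np.exp(-2.0 * np.linspace(0.0, 1.0, n))    # stand-in Carleman weight

def J(w):
    return np.sum(phi * (L @ w) ** 2)

def grad_J(w):
    return 2.0 * L.T @ (phi * (L @ w))           # explicit gradient

rng = np.random.default_rng(1)
w = rng.standard_normal(n)
g = grad_J(w)

# check one component against a central difference quotient of J
e = np.zeros(n)
e[7] = 1.0
delta = 1e-6
fd = (J(w + delta * e) - J(w - delta * e)) / (2.0 * delta)
rel_err = abs(fd - g[7]) / abs(g[7])
```

Since J is quadratic, the central difference quotient agrees with the explicit gradient up to floating-point rounding, which makes this a cheap correctness check for hand-derived gradient formulas.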

Remarks 6.1
1. In fact, the functional used to conduct the numerical studies is a slightly modified finite difference version of (3.5). In our computations, we took the Tikhonov regularization term in the finite difference analog of H^2(R') instead of H^3(R'). Note that since the number of grid points is not exceedingly large here (N_x = 60, N_t = 50), all discrete norms are basically equivalent. Additionally, a boundary term with a coefficient µ ≫ 1 is added in (6.1) to ensure that the minimizer satisfies boundary condition (3.3).
2. We choose the parameters λ, α, β and µ so that the numerical method provides a good reconstruction of a reference function a(x) of our choice, depicted in Figure 3(a). The values of these parameters were found by trial and error. It is important, though, that exactly the same values of those parameters were then used in the three subsequent tests. Those values were: λ = 2, α = 0.5, β = 10^{−4}, µ = 10^2.
Recall that by Theorem 4.1 one should have α ∈ (0, 1/2). However, we have computationally observed that α = 0.5 provided the best numerical reconstructions. We also note that even though the parameter λ has to be sufficiently large, λ = 2 worked quite well in our numerical experiments. This is similar to all above cited works with numerical studies of the convexification. The topic of optimal choices of these parameters is outside of the scope of this paper.
3. Even though Theorem 4.6 guarantees the global convergence of the gradient projection method, we have observed in our computations that the straightforward gradient descent method works well. This method is simpler to implement, since one does not need to use the orthogonal projection operator P_{B_0} in (4.14). Thus, we have not subtracted the function G from the function w and have minimized, therefore, the functional J_{λ,β} instead of the functional I_{λ,β}. In other words, (4.14) was replaced with
w_n = w_{n−1} − γ J′_{λ,β}(w_{n−1}), n = 1, 2, . . .

(6.2)
Note that J′_{λ,β} ∈ H^3_0(R'). This means that all functions w_n of the sequence (6.2) satisfy the same boundary conditions p_0, p_1 of (2.18). We took γ = 10^{−5} at the first step of the gradient descent method and adjusted it using a line search at every subsequent iteration. 4. We choose the starting point w_0(x,t) of the process (6.2) as w_0(x,t) = −p_1(t)x^2/2 + p_1(t)x + p_0(t). It is easy to see that the function w_0(x,t) satisfies boundary conditions (2.18) as well as the absorbing boundary condition (3.3). Hence, at the first step of the minimization procedure a_0(x) = 2(w_0)_x(x, 0) = 2p_1(0)(1 − x). In most cases p_1(0) = 0; thus, the initial approximation is a_0(x) ≡ 0.
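The stated properties of this starting point can be verified directly: w_0(0,t) = p_0(t), (w_0)_x(0,t) = p_1(t), and (w_0)_x(1,t) = 0, the last equality being the assumed form of the absorbing condition (3.3). A quick check with hypothetical boundary data p_0, p_1 chosen so that p_1(0) = 0, in which case a_0 ≡ 0 as stated:

```python
import numpy as np

# Direct check of the starting point w_0(x,t) = -p_1(t) x^2/2 + p_1(t) x + p_0(t):
#   w_0(0,t) = p_0(t),  (w_0)_x(0,t) = p_1(t),  (w_0)_x(1,t) = 0.
# p_0, p_1 are hypothetical boundary data with p_1(0) = 0.
p0 = lambda t: np.cos(t)
p1 = lambda t: np.sin(2.0 * t)

def w0(x, t):
    return -p1(t) * x ** 2 / 2.0 + p1(t) * x + p0(t)

def w0_x(x, t):
    return p1(t) * (1.0 - x)        # analytic x-derivative of w_0

t = np.linspace(0.0, 1.0, 50)
check_dirichlet = np.max(np.abs(w0(0.0, t) - p0(t)))
check_neumann = np.max(np.abs(w0_x(0.0, t) - p1(t)))
check_absorbing = np.max(np.abs(w0_x(1.0, t)))

# initial approximation a_0(x) = 2 (w_0)_x(x, 0) vanishes, since p_1(0) = 0
x_grid = np.linspace(0.0, 1.0, 61)
a0_max = np.max(np.abs(2.0 * w0_x(x_grid, 0.0)))
```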
5. The stopping criterion for the minimization process is based on the relative change ||a_{n+1} − a_n||_{L^2(0,1)} / ||a_n||_{L^2(0,1)}, where a_n = a_n(x) = 2(w_n)_x(x, 0) and the function w_n(x,t) is computed on the n-th step of the minimization procedure.
Next, we differentiate the functions smoothed this way. Our numerical experience tells us that this procedure works quite well. Similar observations took place in all above cited works on the convexification.

Numerical results
We have calculated the relative error of the reconstruction on the final iteration n = n* of the minimization procedure: Error = ||a_{n*} − a*||_{L^2(0,1)} / ||a*||_{L^2(0,1)}, where a_{n*}(x) is the computed solution and a*(x) is the true test function.
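This error can be computed with the trapezoid rule; the sketch below uses the reference function of Test 1 as a*(x) and a synthetic "computed" solution with a 5% multiplicative error (an assumption for illustration).

```python
import numpy as np

# Relative L^2(0,1) reconstruction error as defined above, computed with the
# trapezoid rule.  a_star is the reference function of Test 1; a_comp is a
# synthetic "computed" solution with a 5% multiplicative error.
def rel_l2_error(a_comp, a_star, x):
    dx = x[1] - x[0]
    def l2_norm(f):
        return np.sqrt(dx * (np.sum(f ** 2) - 0.5 * (f[0] ** 2 + f[-1] ** 2)))
    return l2_norm(a_comp - a_star) / l2_norm(a_star)

x = np.linspace(0.0, 1.0, 201)
a_star = x ** 2 * np.exp(-(2.0 * x - 1.0) ** 2)   # Test 1 reference function
a_comp = 1.05 * a_star                             # 5% multiplicative error
err = rel_l2_error(a_comp, a_star, x)
```

For a purely multiplicative perturbation the relative error equals the perturbation size, which makes this a convenient sanity check of the error routine.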
We have conducted our computations for the following four tests. Test 1: a(x) = x^2 e^{−(2x−1)^2}. This is our reference function, for which we have chosen the above listed parameters; in the remaining Tests 2-4 we have used the same parameters. Note that the functions in Figures 3(c),(d) do not attain zero values at x = 1, as required by condition (1.2). Also note that the function a(x) in Test 4 is not differentiable at x_0 = 0.876 − π^{−1} ≈ 0.558 and has infinitely many oscillations in a neighborhood of the point x_0. Nevertheless, the numerical reconstructions in Figures 3(c),(d) are rather good; see also Table 6.2. Graphs of the exact and computed functions a(x) of Tests 1-4 are presented in Figures 3(a)-(d). Table 6.2 below summarizes the results of our computations.
We have used a 12-core Intel(R) Xeon(R) CPU E5-2620 2.40GHz computer. The average computational time for Tests 1-4 was 159.4 seconds with the parallelization of our code, and 1114.3 seconds without it. Thus, the parallelization has resulted in about 7 times faster computations. We observe that the value of the functional J^h_{λ,β,µ}, as well as the L^∞ norm of the gradient of this functional, decreases at least by a factor of 100 in all tests.