FREQUENCY-SPARSE OPTIMAL QUANTUM CONTROL

. A new class of cost functionals for optimal control of quantum systems which produces controls which are sparse in frequency and smooth in time is proposed. This is achieved by penalizing a suitable time-frequency representation of the control ﬁeld, rather than the control ﬁeld itself, and by employing norms which are of L 1 or measure form with respect to frequency but smooth with respect to time. We prove existence of optimal controls for the resulting nonsmooth optimization problem, derive necessary optimality conditions, and rigorously es- tablish the frequency-sparsity of the optimizers. More precisely, we show that the time-frequency representation of the control ﬁeld, which a priori admits a continuum of frequencies, is supported on only ﬁnitely many frequencies. These results cover important systems of physical interest, including (inﬁnite- dimensional) Schr¨odinger dynamics on multiple potential energy surfaces as arising in laser control of chemical reactions. Numerical simulations conﬁrm that the optimal controls, unlike those obtained with the usual L 2 costs, concentrate on just a few frequencies, even in the inﬁnite-dimensional case of laser-controlled chemical reactions.


1.
Introduction. This paper is motivated by application problems of current interest in quantum control [11], which range from steering chemical reactions [2] over creating excited or ionized states [12] to faithfully storing and manipulating bits of quantum information [30].
To identify control fields which achieve a suitable goal, such as transfer of the quantum system from one eigenstate to another, and which are convenient to implement experimentally, one typically minimizes a cost functional which promotes goal achievement and penalizes unsuitable controls [16]. Theoretical work up to now has focused on cost functionals involving the L 2 , L 1 or H 1 norm of the control in time-space parameterization, (see e.g. [10,7] for finite-dimensional problems and [17,37] for infinite-dimensional ones). This approach has been very successful in achieving a wide spectrum of control goals. But it has been remarked that "[the] resulting optimized electric fields are often very complex, thus it is nearly impossible to understand the underlying processes involved in the observed control" [16]. We may therefore ask the question: Why time-space parameterization?
We demonstrate in this paper that using certain more sophisticated costs leads to "simpler" controls. The control fields are sparse in frequency, picking out the system's Bohr frequencies without any special ansatz, while at the same time having slowly varying amplitude envelopes, thereby sharing an important feature of laser pulses designed by experimentalists which are commonly and successfully used in the laboratory. The sparse time-frequency structure is achieved via two key ideas.
First, we do not penalize the time profile of the field amplitude, nor just the frequency spectrum as this would rule out that a relevant frequency can be switched on smoothly, but a suitable time-frequency representation of it. While such representations are a familiar tool to interpreting or analyzing a given field in control and signal analysis, they here acquire center stage already in the design of the controls.
Second, we build upon recent advances in the optimal control theory of elliptic and parabolic systems related to the basic idea [38,33,9,5,14] of sparsity-enhancing L 1 or measure-norm costs. More specifically, we build upon the idea of functionvalued measures to achieve directional sparsity in parabolic control [20]. The novelty as compared to the latter advances is two-fold: first, in quantum control, unlike in parabolic control, the target for sparsity should not be the field amplitude, but its frequency structure, and second, one is dealing with a bilinear instead of a linear control problem.
(2) Here H 0 is the Hamiltonian of the quantum system, H 1 is a coupling operator, B is a 'synthesis operator' which assembles the control field from a time-frequency representation u(ω, t), and · M is an L 1 or measure norm with respect to frequency but a smoothness-promoting norm with respect to time. A prototypical choice is the time-frequency synthesis operator (Bu)(t) = Ω u(ω, t) e iωt dω, where Ω ⊂ R is a region of admissible frequencies, and u M = Ω u(ω, ·) H 1 (0,T ) dω. (4) Note that the control, a priori, can use the whole available continuum of frequencies, with each frequency possessing its own time profile. A main result of this paper (Theorem 4.10) is that the optimizers utilize only finitely many frequencies, even when the quantum dynamics is a full infinite-dimensional Schrödinger dynamics on multiple potential energy surfaces (Ex. 2.2 and Section 5.3). Equations (3)-(4) replace the standard approach in quantum optimal control initiated by [26] to penalize just the L 2 or H 1 norm of the field amplitude (see [17,36,15] for mathematical results).
Recently, there has been independent and complementary research on time-sparse control of finite dimensional quantum systems [7,6]. There one finds control fields which are zero in subintervals of the time domain. By contrast, our approach yields sparsity in a frequency domain. This necessitates a more involved measure space setting, but results in an even more dramatic reduction of the support, from a continuum to a finite set.
Numerical simulations presented in Section 5 below illustrate that the optimizers concentrate on just a few frequencies, even when the quantum dynamics is a full infinite-dimensional Schrödinger dynamics. Thus our controls share an important feature of laser pulses designed by experimentalists.
The plan of this paper is as follows. In the following section we first introduce two important physical examples of controlled quantum dynamical systems which motivated this work and to which our sparsity results apply. Subsequently, the control theoretic setting is introduced. In Section 3 we introduce and discuss concrete cases of time-frequency control spaces and associated control operators. In particular, the choices (3)-(4) allow for frequency-sparsity with global time profiles, and appropriate modifications result in frequency-sparsity with local time profiles. Section 4 is devoted to the mathematical analysis of the non-smooth optimal control problem (1)- (2). We recall the relevant well known results on the existence of dynamics, establish existence of optimal controls, derive necessary optimality conditions, and prove that optimal controls are supported on only finitely many frequencies. Finally, in Section 5 we numerically calculate optimal controls and compare them to those obtained from the usual L 2 penalization of the field amplitude. Specifically, we present a 3-level example which arises in atomic excitation problems, and an example of Schrödinger dynamics on two potential energy surfaces as arising in laser-controlled chemical reaction dynamics. 2. Control system.

Evolution equation.
We consider the evolution of general quantum systems with controls, of the form where the state ψ(t) belongs to some Hilbert space H, H 0 is a (bounded or unbounded) self-adjoint operator on H, the operators H are bounded and self-adjoint on H, and v (t) are real-valued scalar functions. Physically, the v are amplitudes of components of applied electric or magnetic fields, the operator H 0 is the Hamiltonian of the system in the absence of fields, and the operators H describe the system-field coupling. It is well known that under very weak assumptions on the control v (which are alway satisfied in our optimal control setting introduced below) eq. (5) possesses a unique mild solution in the state space of continuous paths in the Hilbert space, C([0, T ]; H). See Section 4 for a precise statement.
We note that eq. (5) already contains two important approximations which are valid in many situations of interest. First, quantum fluctuations of the field amplitudes have been neglected, so that one is dealing with classical fields and does not need to move to the much more complicated framework of quantum field theory. Second, spatial variations of the applied fields have been assumed to occur at much larger wavelengths than the localization length of the state ψ(t), so that it is sufficient to assume that the fields depend on time only ('dipole approximation', see e.g. [32]).
We now give two examples of (5) of physical interest. These examples are simulated numerically in Section 5, and motivated the general theory in Section 4.
Example 2.1. (Electronic states of atoms in laser fields) A standard reference in the physics literature is [32]. The Hilbert space is H = C d , and the evolution equation is where N is the number of electrons with associated energy levels E j ∈ R.
Here finite-dimensionality means that one has approximated the atomic wavefunction by ψ(t) ≈ d j=1 a j (t)ψ j and has neglected the coupling with the rest of the system. The function v(t) is the scalar amplitude of a unidirectional electric field E(t) = v(t)E 0 with direction E 0 ∈ R 3 , and the µ ij are coupling matrix elements of the form where D is the dipole operator defined by D(x 1 , .., Here e is the charge of the electron and the x i ∈ R 3 are electronic position coordinates. We remark that the off-diagonal element µ mn vanishes whenever ψ m and ψ n have the same parity. This is a simple example of a "forbidden transition". A typical control goal consists in inducing such a forbidden transition, i.e. one in which the two states of interest are only coupled indirectly via other states. Example 2.2. (Laser-guided chemical reactions) This model is of central interest in photochemistry, but to our knowledge has not hitherto been considered in the mathematical literature. The Hilbert space is infinite-dimensional, H = L 2 (R 3M ; C 2 ). The evolution is given by the two-level Schrödinger system for the nuclear wave Here R = (R 1 , .., R M ) ∈ R 3M is the position vector of the atomic nuclei, the E j (R) are electronic eigenvalues (often called potential energy surfaces) of the system which depend on the positions of the nuclei, and the µ ij (R) are dipole moment reaction coordinate functions which depend on the associated electronic eigenstates ψ More precisely, the E j (R) and ψ (R) j are eigenvalues and eigenstates of the where D (R) is the dipole operator defined by D (R) (x 1 , . . . , Here, as in the previous example, e is the charge of the electron (−1 in atomic units), and −eZ α are the nuclear charges (+Z α in atomic units). Note that for a single nucleus located at the origin, D (R) reduces to the dipole operator in the previous example. The restriction to two potential energy surfaces means that one has approximated the joint wavefunction of electrons and nuclei by a truncted Born-Oppenheimer expansion is the vector of electron positions and spins (for a careful mathematical account of this expansion in the absence of control fields and for smooth interaction potentials see [34]).
The typical situation in laser control of chemical reactions is the following, see Figure 1. The system starts in a stationary state of electrons and nuclei. The laser then induces a transition to a different electronic state. As a result the nuclei now see a different potential energy surface with respect to which they are no longer in equilibrium; for instance the new surface may no longer contain a barrier to a desired target position. Once the nuclei have moved barrier-free to the target position, the laser induces a transition back to the original surface so as to also put the electrons in the target state.
2.2. Control space. We assume that the control field v in eq. (5) is the image of a control u -in practice, a time-frequency representation of v, see Section 3 -under a control operator B. The control u is assumed to belong to a measure space of form M(Ω; U), where Ω is a locally compact space (typically a closed interval of admissible frequencies), and U is a separable Hilbert space of time-dependent functions. Admitting general separable Hilbert spaces is useful because the freedom to choose U can be exploited to generate nice optimal controls. See the examples in Section 3 and in particular the discussion after Example 3.2. The space (10) is the space of Borel measures u on Ω with values in U of finite mass norm u M . The mass norm of the measure u will appear in the cost functional J in (14) below. The space (10) is the dual of the separable space C 0 (Ω; U) of continuous functions on Ω with values in U which can be uniformly approximated by functions with compact support. The duality pairing is given by (see [24]) where u is the Radon-Nikodym derivative of u with respect to the total variation measure |u| (see [21,VII Thm. 4.1]). Note that the inner product in the integral is the Hilbert space inner product in U. This duality will be very useful.

Control operator.
We assume that the control field v in eq. (5) is obtained from the control u belonging to the above measure space via a bounded linear control operator B, more precisely: Moreover we assume that B has a bounded linear predual operator B * : L q (0, T ; R L ) → C 0 (Ω; U), by which we mean a bounded linear operator such that B * f, u C0,M = f, Bu L q ,L p for all f ∈ L q (0, T ; R L ) and all u ∈ M(Ω; U) where 1 p + 1 q = 1. Existence of a bounded linear predual operator implies the weak- * -weak(- * ) continuity of B. That is, weak- * convergence of u n to u in M(Ω; U) implies weak convergence of Bu n to Bu in L p (0, T ; R L ) if 1 < p < ∞, and weak- * convergence if p = ∞. The case p = 1 has to be excluded in our framework since we will make use of the reflexivity of L p (0, T ; R L ). Note that the operator B * depends on the Hilbert space structure of U, see the examples below.

2.4.
Cost and optimal control problem. Associated to the evolution equation (5), we consider the optimal control problem min ψ, u J(ψ, u) subject to (5), (12) where The functional J consists of the term 1 2 ψ(T ), Oψ(T ) H which describes the control goal, and a cost term α u M(Ω; U ) which contains a measure norm explained above and forces the solution to be sparse in a suitable sense. The observable O specifying the control goal can be any bounded self-adjoint operator on H. Typically O is the orthogonal projection onto the subspace we want to leave. In this case, the target constraint contribution 1 2 ψ(T ), Oψ(T ) H to the cost lies in the interval [0, 1/2]. The value 0 corresponds to a 100% achievement of the control objective, and the value 0.5 to a 0% achievement.
3. Examples of measure norm costs and control operators. We now list some examples for choosing the frequency domain Ω, the Hilbert space U of timedependent functions, and the control operator B. The first example is the prototype for generating frequency-sparse controls. It motivated the general functional analytic setting proposed above, and naturally incorporates physically relevant controls containing finitely many pulses of particular frequencies [4,1,35,31,29].
The Hilbert space U, and the Hilbert space H of quantum states, typically contain complex valued functions. However, the space U is mapped by the control operator to real valued functions. For this reason, all spaces below are equipped with a real Banach or Hilbert space structure. For complex spaces this amounts to always using the real part of the natural complex valued scalar product. To avoid confusion we often explicitly write Re ·, · H for the scalar product in H. In particular, for mappings linear always means R-linear and scalar products are real-valued and R-bilinear.
Throughout this section it will be convenient to set L = 1.
Example 3.1. (Two-scale synthesis, smooth functions of time) Here u will be a two-scale time-frequency representation of the laser field amplitude and B will generate the corresponding field. The control u can a priori contain a continuum of active frequencies, each with its own smooth envelope. This can be modeled mathematically as follows. Let Ω be a closed subset of By approximation, the expression can be extended to measure-valued controls. This extension is important in practice, because it allows sharp concentration on a small number of frequencies, and has the following mathematically rigorous integral representation: where u is the Radon-Nikodym derivative of u with respect to |u|. We remark that the above space of controls naturally contains the physically motivated finite-dimensional ansatz spaces of [1,35,31,19] in which a finite number of distinct frequencies can be switched on or off by few-parameter modulation functions: the field v(t) = n j=1 v j (t) cos(ω j t + φ j ) corresponds to Bu with B as in (16) and u(ω, t) = n j=1 δ(ω − ω j )e iφj v j (t). We allow U to contain complex-valued functions. This allows phase shifts in the different frequencies without leaving the linear setting.
The predual operator B * : is the solution operator of the second-order boundary value problem i.e. B * f = u. As we shall see in Section 4.3 the operator B * is essential for the characterization of the optimal controls. Note that the definition of the predual operator is based on the dual pairings ·, · C0,M and ·, · L 1 ,L ∞ . The equations (17) are not coupled for different ω. The operator B * is continuous and has the additional regularity B * f ∈ C 0 (Ω; H 2 ∩ H 1 0 ). Here it is important that Ω is closed. Otherwise B might not be weak- * -weak(- * ) continuous.
Example 3.2. (Gabor synthesis, L 2 functions of time) In this example we design the control operator B and the Hilbert space U so that the predual operator B * has a nice form. We take Ω ⊂ R + closed, U = L 2 (0, T ; C), p = ∞, and define B by where k : [0, T ] 2 → R is a smooth symmetric kernel. Roughly, this operator corresponds to a pre-processing of the envelopes, with only smoothed envelopes entering the equation. The predual operator B * becomes the Gabor transformation which is a time-frequency representation of the control field. Example 3.1 above with U = H 1 0 leads to a time-frequency representation u = B * f which is global in time. That is, it has a global window equal to the Green's function G of the one-dimensional boundary value problem (17) On the other hand, Example 3.2 leads to a time-frequency representation which is local in time, but the definition of B is somewhat complicated. Note that Example 3.2 can be reformulated in terms of the control operator from Example 3.1 by using a different control space. To see this, let K : L 2 (0, T ) → L 2 (0, T ) be the compact operator given by convolution with a Gaussian kernel k, (Kf )(t) = T 0 k(t, s)f (s) ds. Then K is injective and selfadjoint and has an unbounded inverse A : D(A) → L 2 . Define U k := D(A) with the induced scalar product u, v U = Au, Av L 2 . Then U k is a Hilbert space and the predual operator B * of B is (B * f )(ω, t) = K 2 (f e −iω· ) (s). This construction also works for window functions other than a Gaussian. The equivalence to the choice U = L 2 (0, T ; C), with B as in (18) follows since the dual of the map X : C 0 (Ω; L 2 (0, T ; C)) → C 0 (Ω; U k ) defined by (Xϕ)(ω) = Kϕ(ω) is an isometric isomorphism between the control spaces that preserves the image under the corresponding control operators.
Our next example shows that our setting also covers interesting cases when U does not contain time-dependent functions.
The function u(ω) here can be viewed as a time-frequency representation u(ω, t) which is independent of time t. The predual operator is, up to a constant factor, the Fourier transform of functions restricted to [0, T ], An alternative approach to achieving sparsity of a time-global frequency decomposition via an L 2 cost combined with iterative 'frequency sifting' is given in [28].
for the ansatz function g ω,s (t) = e − (t−s) 2 2σ 2 e iω(t−s) for some σ > 0. This defines a suitable extension of the control operator from Example 3.2 to measures in both time and frequency. With this control operator, each Dirac measure u = δ ω,t corresponds to a Gaussian wave packet centered at time t with frequency ω. The predual of the control operator is given by These examples by no means exhaust our framework, but are meant to give an idea of its flexibility. We also remark that frequency constraints could easily be included by restricting Ω, compare [22]. 4. Theory of the optimal control problem. In this section we will study the optimal control problem (13). We first show well-posedness of the problem. In contrast to [20] the main difficulty does not lie in the low regularity of the control since we assume sufficient smoothing of the control operator. It rather lies in the bilinearity of the state equation together with the low regularity of the state. Subsequently we derive necessary optimality conditions. We shall show that the support of the optimal measure can be influenced. Theorem 4.8 is the natural analog of Theorem 2.12 in [20]. Differences arise due to the bilinearity of the equation and the non-trivial control operator. We shall also note interesting relationships between the choices for B and U. Throughout this section we will stay in the setting of mild solutions. This suggests to develop a derivation of the necessary optimality conditions which only requires integral manipulations and no further regularity discussion for the primal and dual state are necessary.

4.1.
Existence and compactness properties for the state equation. Here we recall well-known properties of the state equation (5) needed to prove the existence of solutions to the optimal control problem. Throughout this subsection, for a given control field v ∈ L 1 (0, T ; R L ) we consider mild solutions to the state equation (5) i.e. functions ψ which satisfy ψ ∈ C([0, T ]; H) and Here G is the unitary group generated by the skew-adjoint operator −iH 0 ,H = (−iH l ) l , and v(s) ·H = l v l (s)(−iH l ). Also, here and below ϕ(t),Hψ(t) H denotes the vector with components ϕ(t),H l ψ(t) H . Existence of mild solutions as well as their differentiability in the direction of the field follow from [3, Thm. 2.5], and a convenient expression for the derivative is given in [23,Chapter 2].
with the evolution operator G defined through G(t, s)ψ(s) = ψ(t) for t ≥ s.
We remark that the evolution operators G(t, s) are unitary. This can easily be shown by working in the rotating frame, i.e. by studying the function t → G * (t)ψ(t) [27], but it will not be needed in the following.
One downside of the mild solution framework is that the typical compactness arguments based on additional spatial regularity [17,15] cannot be used. Fortunately there is a powerful replacement that works in a very general setting. We will use the following compactness result from [3, Theorem 3.6].

4.2.
Existence of optimal controls. We will now prove the existence of solutions to the optimal control problem (13). Proof. The result follows from standard reasoning in the calculus of variations. Let (ψ n , u n ) be a minimizing sequence, lim n J(ψ n , u n ) = inf J(ψ, u).
Since O is bounded and α > 0, the sequence (u n ) is bounded in M(Ω; U). Together with the fact that M(Ω; U) = C(Ω; U) * is the dual of a separable Banach space, this implies that the sequence (u n ) n has a weak- * convergent subsequence still denoted by (u n ) n with limitū ∈ M(Ω; U). The weak- * -weak(- * ) continuity of B implies that Bu n converges to Bū weakly in L p (0, T ; R L ) for some p > 1 and thus also for p = 1. By Proposition 4.2 the corresponding sequence of states (ψ n ) n satisfies ψ n (T ) → ψ(T ). Thus the first summand of J converges, ψ n (T ), Oψ n (T ) → ψ (T ), Oψ . The second summand of J is weak- * lower semi-continuous as it is a norm in a dual space. Thus we obtain lim n J(ψ n , u n ) ≥ J(ψ,ū). Together with (21) this implies the claim.

4.3.
Necessary optimality conditions. For theoretical and numerical purposes we will use the reduced form of (13), Here ψ(u) denotes the solution of (5), (12) for the control u.
The reduced cost functional can be split into two parts, a nonlinear differentiable part f (v) = ψ(T ), Oψ(T ) H (23) with v = Bu, and a nondifferentiable convex part In the next lemma we will see that the derivative of the differentiable part f is given by where ϕ is the mild solution of the dual equation Using the evolution operator this can be rewritten as Under suitable assumptions, the representation (23) of f can be derived using a Lagrange functional approach, see [26,37]. Since in our setting a variational formulation is not readily available, we will give a short proof in the present setting of mild solutions.
Lemma 4.5. Let v, δv ∈ L 1 (0, T ; R L ), and let ψ, ψ and ϕ be the corresponding solutions of (19), (20) and (24), respectively. Then the mapping f : Proof. The continuous differentiability of the state and the product rule give continuous differentiability of f and Using (20) and (24)  We can now derive the following optimality condition. It is partially inspired by the optimality condition derived in [20, Thm. 2.11] for a linear parabolic control problem of form ∂ t ψ − ∆ψ = u. and Proof. Since we can split our functional into a sum of a nonconvex and a nonsmooth part, the result can be deduced from the general differential calculus of Clarke [8]. Because this calculus is somewhat intricate and due to the importance of the optimality system, we prefer to give a more elementary proof.
Letū be a local minimizer of problem (22) and letψ andφ be the corresponding solutions of (19) and (24). We first show the variational inequality Sinceū is locally optimal, we have for u ∈ M(Ω; U) and h ∈ (0, 1) sufficiently small. Using the decomposition j = f • B + g and convexity of g, this implies Since f is differentiable, taking the limit h → 0 yields (27).
Testing (27) with u = 0 and u = 2ū gives Substituting (28) into (27) gives for all u ∈ M(Ω; U). Using Lemma 4.5 on the derivative of f , equation (28) gives which proves (25). From (29)  Remark 4.7. Proposition 4.6 provides only a necessary condition for local optimality. Due to the nonlinear structure of the problem (13), we expect that there also exist non-optimal critical points of j, as well as local optima that are not global.
For specific control operators B, equation (31) implies additional regularity for u (ω). For example, in the case of the control operator from Example 3.1 we obtain the regularityū (ω) ∈ H 2 (0, T ).
The relation (30) for the support of the optimal measure gives us the following corollary.
Proof. Since B * Re φ,Hψ ∈ C 0 (Ω; U) we know that there is a compact set K ⊂ Ω such that B * Re φ,Hψ ≤ α/2 for all ω ∈ K. Using (30) this implies supp|ū| ⊂ K. Therefore supp|ū| is compact as a closed subset of the compact set K.
It says that although the frequency domain Ω might be unbounded, optimal solutions will always have bounded support. Controls from experiments, of course, always have this property, because arbitrarily fast oscillations are not realizable. But it is remarkable that such oscillations in our theoretical controls can be ruled out rigorously.
For most of the specific control operators considered in this work, a significantly stronger result holds. Proof. The results follows from (30) by an analyticity argument. In the setting of the Examples 3.1-3.3, for every f ∈ L q (0, T ), the map ω → (B * f )(ω) is real analytic. This is standard for the Fourier and Gabor transformation, and follows from analytic dependence on ω of the right hand side in (17) for the dual two-scale operator. If Ω = R then B * f can be extended in the natural way to an analytic function on R. In all cases, (B * f )(ω) → 0 if ω → ±∞.
For f = Re φ,Hψ H we obtain the analyticity of ω → (B * Re φ,Hψ H )(ω) 2 U , which implies that the value α 2 is attained either everywhere in R or only at discrete points. Since the function vanishes at infinity and the support is compact by Corollary 4.9, the support condition (30) implies the finiteness of supp|ū|.
This result is of significant physical interest. It implies that the optimal control is in fact a finite sum of Dirac measures in frequency with values in U. We thus recover controls of the form used by experimentalists without explicitly requiring any simple structure.
Comparing Theorem 4.10 to [20, Corollary 2.13], we do not assume that the support is discrete but derive it from identifying and using smoothing properties of the adjoint control operator. Example 3.4 cannot be treated with the same technique. Although the map ω → (B * f )(ω) is complex analytic for f ∈ L q (0, T ) if Ω is identified with a subset of C, the analyticity does not carry over to to the absolute values. 5. Numerical results. In this section we apply the framework for sparse timefrequency control to different quantum systems. First we will describe our numerical approach. This includes a short discussion of the discretization and the regularization of the optimal control problem. Then we will present two examples. The first example is the finite-dimensional system from Example 2.1. It serves to illustrate basic effects of sparse control of quantum systems. The second example is the two-level Schrödinger system from Example 2.2. The focus in this more challenging example will be the effect of different control spaces on the resulting optimal controls. 5.1. Numerical approach. Our numerical approach relies on the following three steps. First, we discretize the measure space by a finite sum of Dirac measures with values in a finite-dimensional Hilbert space. Then we Huber-regularize the corresponding finite-dimensional nonsmooth problem. The regularization turns regions with zero frequency amplitude into regions with amplitude below an explicit threshold, see Theorem 5.1 below. Finally, we solve the resulting smooth optimization problem with a quasi-Newton method.
The first step depends on Ω, of course. In our examples we always have Ω ⊂ R k for k ∈ {1, 2} and Ω is an interval or the product of two intervals. We fix a uniform (tensor) grid Ω h and choose measures supported at those points as our ansatz space. Those measures can always be written as finite sums of Dirac measures. In this discrete setting the measure norm reduces to an 1 norm for the coefficients from U multiplying the Dirac measures. To obtain a discrete problem we also need to discretize the Hilbert space U and the quantum system. In our examples we have U = H 1 0 (0, T ; C) or U = L 2 (0, T ; C) where we use piecewise linear finite elements on a uniform grid with the appropriate discrete norms, or U = C were we do not need to discretize. The discrete Hilbert space is denoted by U h . The control operator B h maps discrete controls to piecewise linear functions. The discretization of the quantum system depends on the system at hand. We approximate the time evolution of the discretized quantum system by a generalized Suzuki-Trotter method, see [13]. Together we obtain a finite-dimensional optimization problem that is non-smooth and non-convex.
To deal with the nondifferentiability of the norm at the origin we Huber-regularize this non-smooth problem. We replace the norm of U in the 1 (U h ) norm by the function h θ : U → R given by for some regularization parameter θ. The function h has the following two important properties: it is differentiable, and the derivatives of h θ and the derivatives of the norm of U have the same behavior outside of a small neighborhood of zero. The first property makes the cost functional differentiable. The second property preserves the sparsity of solutions, in the sense that frequency regions with zero control amplitude turn into regions with control amplitude below the explicit threshold θ. More precisely we obtain the following theorem which holds e.g. in the case Ω = [ω − , ω + ]∩ hZ for some mesh size h > 0.
Theorem 5.1 (Characterization of frequency support after frequency discretization and Huber regularization). Let Ω be a discrete set of admissable frequencies and choose any Huber regularization parameter θ > 0. Consider the optimal control problem (13) with the norm u M(Ω;U ) replaced by the Huber regularized 1 (U) norm ω∈Ω h θ (u(ω)). Then optimal controlsū satisfy Proof. In this case the cost is differentiable and the control is a function rather than a measure with respect to frequency, and the first order optimality condition 0 = j (ū)(δu) acquires the simple form 0 = (B * Re φ,Hψ )(ω) + α∇h θ (ū(ω)) for all ω ∈ Ω.
Since ∇h θ (z) equals z/ z U for z > θ and z/θ for z U ≤ θ, the assertion follows.
In the numerical examples below, only a few frequencies were contained in the superlevel set on the left which numerically replaces the frequency support of u. See e.g. Figure 3.
We solved the resulting C 1 -smooth problem with a quasi-Newton method, accepting the fact that the Dennis-Moré condition cannot be expected to hold, since the functional is not guaranteed to be C 2 at the solution. This would be required for local superlinear convergence, see e.g. [18]. Since the dimension of the control space can become quite large with our approach we chose the memory efficient L-BFGS method, see [25]. For the numerical realization gradients for the discretized problems were used. The optimization method was terminated as soon as the 2 (U h ) norm of the gradient relative to the largest gradient was below a tolerance of at least 10 −6 .
The cost parameter is always chosen such that one achieves at least 95% of the control objective, i.e. 1 2 ψ(T ), Oψ(T ) H ≤ 0.025. An automated choice of α would be helpful to obtain useful cost parameters for a variety of problems.
The result of this nonlinear optimization problem also depends on the initial guess for the control. For small α, the initial guess for the control uses a fixed element in U h together with a random complex phase for the different frequencies.
For larger α, where such a generic initial guess leads to convergence of the method to suboptimal critical points near the origin, we use optimal solutions for smaller α as initial guess. 5.2. Three level system. As our first example we chose a typical finite-dimensional model of an atom in a laser field, see Example 2.1. We use the matrices This corresponds for instance to a 1s, 2s, and 3p state. Note that the coupling matrix elements (7) between the 1s and 2s state vanishes, i.e. the transition 1s → 2s is "forbidden" since these states have equal parity. On the other hand the transitions 1s → 3p and 3p → 2s are allowed. The control objective is to reach the third eigenstate starting from the ground state. The initial condition and the observation operator are then given by ψ 0 = (1, 0, 0) and O = diag(1, 1, 0). We use a frequency band Ω = [2,5]  given by (15)-(16), and a cost parameter of α = 0.1. The control objective was achieved to more than 99.999%.
The great advantage of the measure space control is that we can decompose the field into simple components. Figure 2 shows the optimal control. We see that only two frequencies have a visible contribution. They correspond to the two Bohr frequencies ω 1 and ω 2 . The time profiles are smoothly switched on and off and remain active during the whole time. This is consistent with the choice U = H 1 0 (0, T ; C), which promotes smoothness and non-locality in time. The field for the first transition reaches its maximum before the field for the second transition. This reflects the intuitive idea that we have to induce the transition between levels one and three before that between levels three and two. In fact, this decomposition into two pulses bears considerable resemblance to the simple and intuitive fewparameter pulses which have been used by laser physicists for a long time. Compare, in particular, the two pulses in Figure 9 of [4], whose achievement of the control objective was beautifully demonstrated experimentally. Figure 3 gives a more detailed numerical illustration of the frequency sparsity of the optimal control. Due to the effect of Huber regularization with θ > 0, the numerical analog of the support of u is the superlevel set { ω | u(ω) U > θ } (see equation (34)), which is seen to consist of just two frequencies.
5.3. Schrödinger dynamics on two potential energy surfaces. In this second example we consider a Schrödinger system on two potential energy surfaces as arising in the laser control of chemical reactions, see Example 2.2. The spectral gap between the two potential energy surfaces varies depending on the position of the nuclear wave function and therefore a much larger variety of frequencies is potentially useful for successful control. We also expect an additional time structure in the controls due to the movement of the densities in space. Therefore it is much more challenging to obtain simple controls for this example compared to the previous one. We focus on a comparison between controls generated for different choices of the space U of time profiles and the control operator B.
For simplicity we limit ourselves to one reaction coordinate, i.e. space dimension d = 1. The control objective is to reach the potential well on the right starting from the potential well on the left, see Figure 1. The initial state ψ 0 is a Gaussian located  Figure 3(A) it is depicted that the numerical optimal control (dots in (A)), drops by three orders of magnitude below the threshold given by the numerical regularization parameter θ (dashed line in (A)), away from the coincidence set (B * Re ϕ,Hψ )(ω) U = α, precisely as theoretically predicted by equation (34). In Figure 3(B) we illustrate the quantifiers of Theorem 5.1, which asserts that the optimal control should vanish off the frequencies where the norm (B * Re ϕ,Hψ )(ω) U (dots in (B)) reaches the cost parameter α (solid line in (B)).
in the lower well. The observation operator O is the projection on the complement of functions with support on the lower surface on the right of the potential barrier. Instead of the physical coupling given by (9) we use a coupling operator H 1 given by the multiplication with the reaction coordinate on the diagonal and the identity on the off-diagonal.
The energy differences between the two potential energy surfaces measured at the local minima of the lower surface are approximately 0.074 and 0.048. Therefore we expect optimal controls to contain the two frequencies ω 1 ≈ 0.074 and ω 2 ≈ 0.048. We chose a time horizon of T = 3000 and a time grid with 2048 points. The time horizon was chosen large enough to allow for sufficiently many periods with the Bohr frequencies ω 1 and ω 2 , and for sufficient movement of the wave function in space. For the discretization in space we use a simple finite difference scheme on the domain [−4, 4] with 256 grid points. We validated that this discretization is sufficient for the range of parameters relevant to this application.
We compare the resulting optimal fields for different choices of the frequency domain Ω, the Hilbert space U of admissible time profiles, and the control operator B. We also compare them to the optimal field for the classical Hilbert space cost functional with L 2 (0, T ; R) norm. In particular we choose the following setups.   ψ , Oψ for different control spaces. A value of f (ū) = 2.5 · 10 −2 corresponds to a 95% achievement of the control goal. and B as in Example 3.2, with the Gaussian kernel k suitably adapted to generate homogeneous Dirichlet boundary conditions. We discretize Ω and U as before. For the evaluation of B we explicitly construct the matrix K corresponding to the kernel k. For the simulations we chose α such that probability to end up in the desired subspace is near the value of 95%, see Table 1.
To compare with the standard L 2 controlv, we also computed a sparse representation u of the latter, by a posteriori minimization of Bu −v 2 L 2 (0,T ) + α u M(Ω;H 1 (0,T ;C)) with the two-scale synthesis operator B and a value α = 10 −4 chosen to achieve a good approximation of the optimal field. Figure 4 shows the optimal controls. We see that their structure depends on the different cost functionals, and that the measure cost (14) for the two-scale and Gabor time-frequency representation leads to an extremely small number of active frequencies. In the following we will discuss the most significant differences and similarities.
In the second column of Figure 4 we show the total magnitude of the control as function of frequency. For smooth controls u = u(ω, t), and before discretization, this contribution is defined as u(ω, ·) U for the first three choices of the cost, as T 0 |u(ω, ·)| dt for the time-frequency Gabor approach, and as u(ω, ·) H 1 (0,T ;C) for the L 2 control in order to compare it to the first cost. For the frequencysparse controls, and after discretization, the contribution becomes u(ω, ·) U h , see Section 5.1, respectively -in the time-frequency Gabor case -the discrete 1 norm of u(ω, ·) on the time grid. We focus our attention to the regions around       u(ω, t)). The dashed red line in the middle column, rows 1 to 3, indicates the Huber threshold, and the nonzero contributions below it are an artifact of the Huber regularization (see Theorem 5.1). In the rightmost column, the absolute values of the optimal measures are plotted in the time-frequency plane. Note that in column 1, 2, and 4 far fewer frequencies are active compared to the standard L 2 control (column 5). the two Bohr frequencies ω 1 and ω 2 . The regions correspond to the transitions up from the first well and down into the second well, respectively. The expected and desired behavior is that only those regions contain active frequencies. This is clearly obtained in the four choices of the sparsity promoting cost functionals. The timefrequency representation of the L 2 control contains many more active frequencies compared to the other controls.
It is remarkable how simple the structure of the control becomes in the two cases where the Gabor transform is used. The two-scale synthesis and the Fourier cases lead to slightly larger number of active frequencies.
The choice of a specific cost functional may be guided by the concrete application under consideration. 6. Conclusion. Measure valued costs imposed on time-frequency representations of the electric field as introduced in this paper systematically produce frequency sparse controls, in contrast to standard L 2 costs. The resulting controls resemble physically intuitive controls designed by experimentalists. We hope that the measure-space, time-frequency approach will reduce the current gap between numerical controls on the one side and experimental implementation and physical intuition on the other. The flexibility of the approach can be further exploited to construct controls suited for concrete experimental setups.