FEEDBACK NECESSARY OPTIMALITY CONDITIONS FOR A CLASS OF TERMINALLY CONSTRAINED STATE-LINEAR VARIATIONAL PROBLEMS INSPIRED BY IMPULSIVE CONTROL

Abstract. We consider a class of rightpoint-constrained, state-linear (but nonconvex) optimal control problems, which takes its origin in the impulsive control framework. The main issue is a strengthening of the Pontryagin Maximum Principle for the addressed problem. Towards this goal, we adapt the approach based on feedback control variations, due to V.A. Dykhta [4, 5, 6, 7]. Our necessary optimality condition, named the feedback maximum principle, is expressed completely in terms of the classical Maximum Principle, but is shown to discard non-optimal extrema. As a related result, we derive a certain form of duality for the considered problem, and propose the dual version of the proved necessary optimality condition.

1. Introduction. In this paper, we address a particular class of state-linear optimal control problems with a rightpoint condition of the simplest form. This class of ordinary problems draws from an impulsive trajectory extension of a dynamical system which is affine with respect to (w.r.t.) an unbounded control input. Here, T ≐ [0, T] is a given finite control period, x_0 ∈ R^n is an initial state position, U ⊂ R^m is a given compact set, and A, B : U → R^{n×n} and a, b : U → R^n are given matrix- and vector-valued functions. System (1) is driven in two principally different ways: by an ordinary compact-valued measurable control u, and by an L_1-control v with unbounded values.
The integral relation in (2) is a constraint of the sort sometimes called a "soft" or "energetic" bound [10]; it represents an a priori estimate (expressed by a given positive real M) of the total resource of the controller over the time period T.
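The displayed relations (1) and (2) did not survive extraction. From the surrounding description (dynamics affine in the unbounded scalar input v, compact-valued control u, resource bound M), a plausible reconstruction, whose notation may differ from the original, is:

```latex
\dot x(t) = A\big(u(t)\big)x(t) + a\big(u(t)\big)
          + \Big(B\big(u(t)\big)x(t) + b\big(u(t)\big)\Big)v(t),
\qquad x(0) = x_0, \quad t \in \mathcal{T} \doteq [0,T],
\tag{1}
```
```latex
u(t) \in U \ \text{a.e.}, \qquad v \in L_1(\mathcal{T}, \mathbb{R}), \qquad
\int_{\mathcal{T}} |v(t)|\,dt \le M.
\tag{2}
```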
Systems of type (1), (2) and related control problems arise in a variety of real-life applications of mathematical control theory in mechanics, economics and management (see, e.g., the monographs [8, 14, 22] and the bibliography therein).
In fact, when regarding conditions (1), (2), one immediately notes that the model is ill-posed: the control inputs v can be arbitrarily close in L_1 to distributions of Dirac type, and therefore the respective state trajectories may tend to discontinuous functions. This suggests that related optimization problems may lack solutions, and are therefore not correctly stated. Such an incorrectness can be overcome by a compactification of the trajectory tube of (1) in a relevant weak topology, weaker than the natural topology of uniform convergence (this could be the topology of pointwise convergence [2], the weak* topology of the Banach space of functions of bounded variation [14], or the topology defined by convergence of trajectories' graphs in the sense of the Hausdorff distance [18]). The compactification leads to an impulsive dynamical system, i.e., a system with possibly discontinuous trajectories of uniformly bounded variation. Under certain natural convexity assumptions, the extended model takes the form of a dynamical system driven by distributions or measures. Here, x(0^-) is the left one-sided limit of the function x at zero (we agree that the trajectories of (3) are right-continuous), µ is a signed Borel measure, and |µ| denotes the total variation of µ. In fact, one should note that (3) may contain the product of a measurable (discontinuous) function and a point-mass distribution, which is an incorrect operation. In this case, conditions (3), (4) are rather formal, though they still give an intuition about the nature of the discussed trajectory extension. In general, the "control → trajectory" mapping of (3), (4) is set-valued, and a correct form of the desired trajectory extension is a more sophisticated measure-driven system with extra controls describing the way Dirac-type measures are approximated by ordinary controls.
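The measure-driven relations (3), (4) are also missing from the extracted text; a standard form consistent with the description just given (signed Borel measure µ, right-continuous trajectories, bounded total variation) would be:

```latex
dx(t) = \big(A(u(t))\,x(t) + a(u(t))\big)\,dt
      + \big(B(u(t))\,x(t) + b(u(t))\big)\,\mu(dt),
\qquad x(0^-) = x_0,
\tag{3}
```
```latex
|\mu|(\mathcal{T}) \le M.
\tag{4}
```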
We refer to [1, 2, 14, 15] for further details on impulsive trajectory extensions of dynamical systems, and pass to the well-known but notable fact that the extended (relaxed) model (1), (2) can be equivalently transformed into an ordinary control system acting on the extended time interval S ≐ [0, T + M].
The cornerstone of this transformation is an absolutely continuous parametrization s ↦ t(s) of the original time variable t by a new "extended" time s. The original and reduced trajectories, say x and x̃, are related by the (in general, discontinuous) time change x(t) = x̃(t^←(t)), where the symbol ← denotes the pseudo-inverse function defined by the relations t^←(t) = inf{s ∈ S : t(s) > t} for t ∈ [0, T) and t^←(T) = T + M. This approach was independently proposed in [17] and [21], and comprehensively treated in [14]. We note that system (5)–(7) has a specific dynamical structure, is endowed with a terminal constraint of a particular form, and is not convex in general. In view of these observations, one can regard systems of type (5)–(7) and related variational problems as a substantial mathematical object in their own right, independently of their impulsive prototypes. The object of our present investigation is, thus, an optimal control problem for ordinary control systems of structure (5)–(7), and the main issue is a constructive strengthening of the Maximum Principle by means of the technique of [4, 5, 6, 7], based on certain "feedback control variations". This strengthening is given by necessary optimality conditions of a relatively new type. The conditions extend the so-called "feedback minimum principle" to the discussed terminally constrained dynamics. The background of feedback optimality conditions is the technique of modified Lagrangians majorizing the cost increment (the increment of the objective function), which, for state-linear problems, leads to an exact increment formula. These majorants are designed with the use of auxiliary functions that are weakly monotone w.r.t. the system's dynamics [3]. By the known principle of extremal aiming, such weakly monotone functions produce specific feedback controls, which can be used to define ordinary control processes that potentially "improve" a given reference solution.
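System (5)–(7) itself is absent from the extracted text. A typical form of the reduced system obtained by the discontinuous time reparametrization of (3), (4) (offered here only as a sketch; the exact equations of the original may differ) is, on S ≐ [0, T + M]:

```latex
\frac{dt}{ds} = 1 - |v(s)|, \qquad
\frac{dx}{ds} = \big(1 - |v(s)|\big)\big(A(u(s))\,x + a(u(s))\big)
              + v(s)\big(B(u(s))\,x + b(u(s))\big),
\tag{5}
\\
t(0) = 0, \quad x(0) = x_0, \qquad u(s) \in U, \quad |v(s)| \le 1,
\tag{6}
\\
t(T + M) = T.
\tag{7}
```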
A remarkable fact is that feedback necessary optimality conditions can practically discard non-optimal extrema of the Maximum Principle, and furthermore, can be thought of as iterative algorithms for optimal control.
2. Model statement. This section contains the setup of the main object of our study. Given a finite interval T ≐ [0, T], a compact set U ⊂ R^m, continuous matrix- and vector-valued functions A, B : U → R^{n×n} and a, b : U → R^n, vectors c, x_0 ∈ R^n, and a positive real y_T, consider the following optimal control problem (P). As usual, ⟨·, ·⟩ denotes the scalar product in R^n. A collection σ ≐ (z, w) ≐ (x, y, u, v) is said to be a control process of system (8), (9), where
• controls w ≐ (u, v) are measurable functions taking values u(t) ∈ U and v(t) ∈ [−1, 1], and
• trajectories z ≐ (x, y) are absolutely continuous functions.
A process σ is called admissible if it satisfies all the conditions (8)-(10).
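For the reader's convenience, here is a plausible reconstruction of the lost relations (8)–(10) of problem (P); the role of y as the accumulated resource and the bound |v| ≤ 1 are inferred from the reduction discussed in the Introduction and from the example in Section 4, so the original statement may differ in detail:

```latex
(P):\quad \min\ I(\sigma) \doteq \langle c,\, x(T) \rangle
\quad \text{subject to} \\
\dot x = \big(1 - |v|\big)\big(A(u)\,x + a(u)\big) + v\big(B(u)\,x + b(u)\big),
\qquad \dot y = |v|,
\tag{8} \\
x(0) = x_0, \qquad y(0) = 0,
\tag{9} \\
y(T) = y_T.
\tag{10}
```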
3. Classical and feedback maximum principles. Let σ̄ = (z̄, w̄) denote an admissible reference process, whose optimality is the question of our interest. Let us stress that problem (P) is not assumed to be convex. This means that the Maximum Principle [16] does not become here a sufficient optimality condition, and can therefore, potentially, be improved. A rather challenging goal is to strengthen the classical result by extracting an extra piece of information immediately from its standard relations. As we show below, such a strengthening can be provided by feedback controls of a specific "extremal" structure, which remains within the formalism of the Maximum Principle.
Finally, note that the set of Carathéodory feedback solutions can be empty, while a sampling solution of (8), (9) always exists. We denote by Z(w) the set of solutions of both types (a) and (b).
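Since the formal definitions of the two solution types were lost in extraction, the sampling construction can at least be illustrated in code. Below is a minimal sketch of a sample-and-hold (Euler polygonal) feedback solution on a uniform partition; all function names and the toy dynamics are illustrative assumptions, not taken from the paper.

```python
# A minimal sketch of a "sampling" (sample-and-hold) feedback solution:
# on each cell of a partition of [0, T] the feedback is frozen at the value
# computed from the state at the cell's left endpoint, and the resulting
# piecewise-constant open-loop control drives the ODE.
import numpy as np

def sampling_solution(f, feedback, z0, T, num_cells, substeps=50):
    """Polygonal (Euler) arc produced by sample-and-hold of `feedback`.

    f(t, z, w)      -- right-hand side of the dynamics
    feedback(t, z)  -- feedback control law w = (u, v)
    """
    grid = np.linspace(0.0, T, num_cells + 1)
    z = np.asarray(z0, dtype=float)
    path = [(0.0, z.copy())]
    for t0, t1 in zip(grid[:-1], grid[1:]):
        w = feedback(t0, z)             # freeze the control on [t0, t1]
        h = (t1 - t0) / substeps
        t = t0
        for _ in range(substeps):       # explicit Euler inside the cell
            z = z + h * np.asarray(f(t, z, w), dtype=float)
            t += h
        path.append((t1, z.copy()))
    return path

# Toy dynamics of the reduced type: x' = (1 - |v|) x + v,  y' = |v|
def f(t, z, w):
    u, v = w
    x, y = z
    return np.array([(1.0 - abs(v)) * x + v, abs(v)])

# A hypothetical feedback: spend the resource while y < 0.5, then coast
# (the small tolerance avoids a knife-edge float comparison at y = 0.5)
fb = lambda t, z: (0.0, 1.0 if z[1] < 0.5 - 1e-9 else 0.0)

path = sampling_solution(f, fb, z0=[1.0, 0.0], T=1.0, num_cells=20)
print(path[-1])
```

As diam ρ → 0 (here, num_cells → ∞), such polygonal arcs converge in the uniform norm to what the text calls a sampling solution.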

3.2. Formalism of the Maximum Principle. Let us introduce some necessary objects related to the Maximum Principle for problem (P): the Pontryagin function (the non-maximized Hamiltonian) H, the "partial Hamiltonians" H_0 and H_1 (note that H is independent of y), and the adjoint (dual) system. The "variable" ξ = const is dual to y; note that, by the Maximum Principle, ξ is not fixed by a transversality condition for extremal processes. For processes that do not satisfy the Maximum Principle, ξ can be treated as a free parameter.
In what follows, this property will be essentially used in the definition of auxiliary feedback controls, which potentially discard local extrema. The maximized Hamiltonian and its maximizers, the extremal multifunctions U_ξ and V_ξ, are defined in the standard way. (Here and below, Sign is the multivalued signum with Sign 0 = {−1, 1}.) Notice that the Maximum Principle for the reference process σ̄ = (z̄, w̄) in fact reduces to the existence of an adjoint solution (ψ̄, ξ̄) such that the inclusions ū(t) ∈ U_ξ̄(x̄(t), ψ̄(t)) and v̄(t) ∈ V_ξ̄(x̄(t), ψ̄(t)) hold a.e. on T. The idea of [4, 5, 6, 7] consists in employing feedback controls w defined by a formal release of the state position in inclusions (13): u(t, x) ∈ U_ξ(x, ψ̄(t)), v(t, x) ∈ V_ξ(x, ψ̄(t)).
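Assuming reduced dynamics of the form ẋ = (1 − |v|)(A(u)x + a(u)) + v(B(u)x + b(u)), ẏ = |v| (an assumption about the lost display (8), not a quotation of it), these objects can be plausibly written out as follows:

```latex
H(x, \psi, \xi, u, v) = (1 - |v|)\,H_0(x, \psi, u) + v\,H_1(x, \psi, u) + \xi\,|v|, \\
H_0 \doteq \langle \psi,\ A(u)\,x + a(u) \rangle, \qquad
H_1 \doteq \langle \psi,\ B(u)\,x + b(u) \rangle, \\
-\dot\psi = \big(1 - |\bar v|\big)\,A^{\top}(\bar u)\,\psi
          + \bar v\,B^{\top}(\bar u)\,\psi,
\qquad \psi(T) = -c, \qquad \xi = \mathrm{const}, \\
\bar H(x, \psi, \xi) = \max\Big\{ \max_{u \in U} H_0(x, \psi, u),\
\xi + \max_{u \in U} |H_1(x, \psi, u)| \Big\}.
```

In this reading, v enters the maximizers through Sign H_1 whenever the second branch of the maximum is active, which is where the multivalued signum of the text appears.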

3.3. Terminal constraint.
It is clear that a solution of (8), (9), in whatever sense, does not generically satisfy the condition y(T) = y_T. To take the terminal constraint into account, we introduce the following "corrected" multifunctions (in their definition, we employ the obvious description of the set from which the trajectory component y is controllable to the point (T, y_T)). Let W_ξ denote the ξ-parameterized (ξ ∈ R) set of feedback controls w = (u, v) which are selections of the multivalued maps (14), (15) contracted to the adjoint ψ̄ of the reference trajectory x̄, i.e., u(t, z) ∈ Ǔ_ξ(t, z, ψ̄(t)) and v(t, z) ∈ V̌_ξ(t, z, ψ̄(t)).
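The formulas (14), (15) themselves were lost in extraction. If the resource dynamics are ẏ = |v| with |v| ≤ 1 (an assumption consistent with the rest of the text), then the component y can be steered from (t, y) to (T, y_T) iff 0 ≤ y_T − y ≤ T − t, which suggests the following guess at the corrective mechanism (surely cruder than the authors' actual definition):

```latex
\check U_\xi(t, z, \psi) = U_\xi(x, \psi), \qquad
\check V_\xi(t, z, \psi) = V_\xi(x, \psi) \cap
\Big\{ v \in [-1, 1] :\ |v| = 1 \ \text{if}\ y_T - y = T - t,
\ \ v = 0 \ \text{if}\ y = y_T \Big\}.
```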

3.4. Feedback maximum principle. Introduce the following accessory variational problem (AP): minimize ⟨c, x(T)⟩ over ξ ∈ R, w ∈ W_ξ and z = (x, y) ∈ Z(w). The assertion below is a trivial implication of inclusions (13).
Lemma 3.1. Let σ̄ = (z̄, w̄) be a Pontryagin extremal for (P). Then z̄ is admissible for (AP), i.e., there exist ξ ∈ R and w ∈ W_ξ such that z̄ coincides with a Carathéodory feedback solution z ∈ Z(w).
Theorem 3.2 (Feedback maximum principle). Let σ̄ = (z̄, w̄) be an optimal process of (P). Then z̄ is an optimal solution of (AP), i.e., ⟨c, x̄(T)⟩ ≤ ⟨c, x(T)⟩ for all ξ ∈ R, w ∈ W_ξ and z = (x, y) ∈ Z(w).
Proof. First, note that σ̄ is admissible for (AP) by Lemma 3.1. Assume that σ̄ is optimal for (P), but there exist ξ ∈ R, a feedback w ∈ W_ξ and a respective feedback solution z = (x, y) ∈ Z(w) such that ⟨c, x(T)⟩ < ⟨c, x̄(T)⟩. If z is a Carathéodory feedback solution, the contradiction is obvious. Let z be a sampling solution. Then, by definition, there is a sequence z_ρ of polygonal arcs converging to z in the uniform norm as diam ρ → 0. Since z_ρ is a Carathéodory solution of (8), (9) produced by a piecewise constant open-loop control w_ρ satisfying (10), (z_ρ, w_ρ) is a control process of (P), which may still violate the terminal constraint y(T) = y_T within accuracy diam ρ, i.e., |y_ρ(T) − y_T| ≤ diam ρ. By shifting the rightmost point of ρ within its final segment T_ρ (note that the length of T_ρ does not exceed diam ρ!), we can perturb z_ρ so that the perturbed function z̃_ρ = (x̃_ρ, ỹ_ρ) does satisfy ỹ_ρ(T) = y_T. Clearly, z̃_ρ is also a Carathéodory solution of (8), (9) under another input w̃_ρ satisfying (10) (in fact, w̃_ρ = w_ρ outside T_ρ). Given ε > 0, consider a partition ρ such that diam ρ < ε and ‖x(T) − x_ρ(T)‖ ≤ ε. Since the control perturbation is performed on a set of measure not exceeding diam ρ, standard arguments based on Gronwall's inequality ensure that the quantity ‖x_ρ(T) − x̃_ρ(T)‖ is of order ε, i.e., there exists a constant K > 0 independent of ε such that ‖x_ρ(T) − x̃_ρ(T)‖ ≤ Kε. Therefore, ‖x(T) − x̃_ρ(T)‖ ≤ (K + 1)ε. Now, let ε > 0 be chosen such that ⟨c, x̄(T) − x(T)⟩ > (K + 1)ε‖c‖. Then, by the Cauchy–Bunyakovsky–Schwarz inequality, ⟨c, x̃_ρ(T)⟩ < ⟨c, x̄(T)⟩. Thus, we have exhibited a (P)-admissible process of smaller cost, which contradicts the optimality of σ̄.
4. Discussion and example.
1) Though the necessary global optimality condition proposed by Theorem 3.2 is hard to verify directly, even for a given process σ̄ (let alone applying this condition to search for "suspicious" processes), the feedback maximum principle is rather efficient in its contrapositive version, i.e., as a sufficient condition for the non-optimality of a reference process. Indeed, to discard σ̄ as a non-optimal process, one can try to find a real ξ ∈ R and a feedback control w ∈ W_ξ producing a feedback solution z = (x, y) ∈ Z(w) with the property ⟨c, x(T)⟩ < ⟨c, x̄(T)⟩ = I(σ̄) (in this case, the admissibility of z for (AP) does not require validation). Clearly, in this constructive form, the feedback maximum principle serves as a conceptual kernel of iterative algorithms for problem (P).
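This contrapositive recipe can be sketched in code. The snippet below is a toy illustration only (the dynamics, the reference control and the candidate feedback are invented for the sketch and are not the paper's example): it simulates a reference process and a candidate feedback solution under the resource budget y_T, and reports whether the candidate improves the cost ⟨c, x(T)⟩.

```python
# Toy illustration of the contrapositive use of the feedback maximum
# principle: minimize x(T) for  x' = v,  y' = |v|,  |v| <= 1,  y(T) = y_T.
# All data here are invented for the sketch.
import numpy as np

def simulate(feedback, T=1.0, y_T=0.5, n=1000):
    """Euler simulation honoring the resource budget y(T) <= y_T."""
    h = T / n
    x, y = 0.0, 0.0
    for k in range(n):
        t = k * h
        v = feedback(t, x, y)
        cap = max(0.0, (y_T - y) / h)   # resource still spendable this step
        if abs(v) > cap:
            v = np.sign(v) * cap        # clip so the budget is met exactly
        x += h * v
        y += h * abs(v)
    return x, y

# A non-optimal "reference" process: spend the whole resource pushing x up
ref = lambda t, x, y: 1.0
# A candidate feedback (as if produced by an extremal multifunction with a
# different choice of the free parameter xi): push x down instead
cand = lambda t, x, y: -1.0

x_ref, _ = simulate(ref)
x_new, _ = simulate(cand)
improved = x_new < x_ref                # cost <c, x(T)> with c = 1
print(x_ref, x_new, improved)
```

If `improved` holds, the reference process is non-optimal and the candidate feedback solution serves as an improving comparison process, exactly in the spirit of item 1).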
2) Note that the set of potentially discarding feedback controls and trajectories of (AP) is enlarged due to the presence of the parameter ξ ∈ R. On the other hand, for practical implementation, the range of this parameter should be estimated a priori. In fact, an extremal process σ̄ generically admits a number of adjoint trajectories (ψ̄, ξ̄) arising in Theorem 3.2. In general, one can identify the range of the parameter ξ as follows: assume we are given a multivalued map O : T ⇒ R^n which contains the trajectory tube of system (8) started at (t, x) = (0, x_0); then the parameter ξ ranges over an interval [ξ_−, ξ_+] determined by O.
3) The feedback maximum principle strengthens the classical Pontryagin Maximum Principle in the following sense: assume that a function z̄, being a Carathéodory solution of (8), (9) under a control w̄ satisfying (10), is admissible for (AP). Then σ̄ = (z̄, w̄) is a Pontryagin extremal for (P).
The absence of the inverse implication is shown by the following example.
Example. Consider an optimal control problem of type (P) with the control constraint |v| ≤ 1.

STEPAN SOROKIN AND MAXIM STARITSYN
Finally, note that the considered model is a reduced version of an optimal impulsive control problem of the sort discussed in the Introduction.
5. Duality.
It is clear that the cost functional of problem (P*) can be reduced to the Mayer (terminal) form K(ψ, y, η, w) = η(0) + ⟨ψ(0), x_0⟩ by introducing an extra state variable η satisfying appropriate relations. After this transformation, we can observe that the dual problem is also linear w.r.t. the state variable. Then, the developed feedback minimum principle can be adapted to problem (P*).
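The Mayer reduction mentioned here is standard and can be written out explicitly. Assuming the cost of (P*) contains an integral term ∫_0^T ℓ(t) dt alongside ⟨ψ(0), x_0⟩ (the concrete integrand ℓ is not recoverable from the extracted text), one introduces η by

```latex
\dot\eta(t) = -\,\ell(t), \qquad \eta(T) = 0
\quad\Longrightarrow\quad
\eta(0) = \int_0^T \ell(t)\,dt,
```

so that the full cost becomes the terminal functional K(ψ, y, η, w) = η(0) + ⟨ψ(0), x_0⟩.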
The latter necessary optimality condition enjoys the same properties as the one proposed by Theorem 3.2. Note that, as certain examples show [7], similar conditions for free-endpoint problems are not equivalent to each other, i.e., neither implies the other.
6. Conclusion. In this paper, we made a rather successful attempt to strengthen the Pontryagin Maximum Principle for a class of terminally constrained variational problems. Though the derived optimality conditions, the feedback and dual feedback maximum principles, take variational forms, they can be thought of as iterative algorithms for control improvement.
Note that, when implementing such algorithms, one inevitably faces similar issues within the framework of discrete optimal control problems. In fact, similar results for discrete problems were recently obtained by the authors [19].
Finally, returning to the impulsive origin of the addressed class of models, one should point out a natural issue left beyond this work. A rightful question here is the transcription of the obtained results into the terms of impulsive control. Such a transcription should exploit an appropriate, correct notion of feedback impulsive control, which is not yet well established in the literature on measure-driven systems of the sort (3), (4). In this respect, the derivation of feedback optimality conditions in the impulsive setting is a challenging issue of our nearest study.