NORMALITY AND UNIQUENESS OF LAGRANGE MULTIPLIERS

In this paper we study, for certain problems in the calculus of variations and optimal control, two different questions related to the uniqueness of multipliers appearing in first order necessary conditions. One deals with conditions under which a given multiplier associated with an extremal of a fixed function is unique, a property which, in nonlinear programming, is known to be equivalent to the strict Mangasarian-Fromovitz constraint qualification. We show that, for isoperimetric problems in the calculus of variations, a similar characterization holds, but not in optimal control, where the corresponding condition is only sufficient for the uniqueness of the multiplier. The other question is related to the set of multipliers associated with all functions for which a solution to the constrained problem is given. We prove that, for both types of problems, this set is a singleton if and only if a strong normality assumption holds.


KARLA L. CORTEZ AND JAVIER F. ROSENBLUETH

called in that paper the "strict Mangasarian-Fromovitz constraint qualification" (introduced, according to [15], by Fujiwara, Han and Mangasarian in [10]), and the second order conditions then hold on the set of tangential constraints relative to a subset of the previous one which takes into account the sign of the corresponding Lagrange multipliers. Moreover, as shown first in [15] (by using theorems of the alternative) and later, similarly, in [12], uniqueness of the Lagrange multipliers turns out to be equivalent precisely to that strict constraint qualification.

Now, the definition of this strict version of the Mangasarian-Fromovitz constraint qualification requires the existence of Lagrange multipliers given beforehand. The set of tangential constraints where the second order conditions hold includes inequalities whenever these multipliers vanish, and equalities when they are positive. However, according to [12], it could not properly be considered a constraint qualification, since the set of active indices with positive multipliers is not known before the validation of the first order necessary conditions. A similar statement is made by Wachsmuth in [22], where uniqueness of the multipliers is studied in terms of (non-strict) constraint qualifications. There, it is shown that the linear independence constraint qualification is the weakest constraint qualification which ensures the existence and uniqueness of Lagrange multipliers.
In this paper we study similar questions for certain problems in the calculus of variations and optimal control. For these problems, we are interested in characterizing uniqueness of the multipliers appearing in first order conditions. In particular, we show that the results of [15] and [22] can be extended to problems in the calculus of variations involving isoperimetric inequality constraints. We also show how the results of [22] can be generalized, in terms of the corresponding linear independence constraint qualification, to optimal control problems with inequality and equality constraints on the control functions. However, for this kind of problem, we provide examples showing that the characterization given in [15] of the uniqueness of Lagrange multipliers in nonlinear programming, in terms of the strict Mangasarian-Fromovitz constraint qualification, may fail to hold. In general, as we shall see, the corresponding constraint qualification implies uniqueness of the multipliers, but the converse, contrary to the result of [15] in the finite dimensional case, is not necessarily true.

The finite dimensional case.
In this section we shall elaborate in more detail on some of the ideas mentioned above. The main objective is to summarize (and explain in some detail) the results on uniqueness of Lagrange multipliers for the finite dimensional case established in [15] and [22]. We begin with a classical approach, based on the notion of "constraint qualifications," yielding the main results of those two references. We then provide a second approach, based on the notion of "regularity" as presented in [13,14], which allows us to explain the main ideas stated before in (what we believe to be) a clearer and more succinct way.
The nonlinear programming problem we shall deal with, which we label (N), is that of minimizing f on the set S, where f, g i : R n → R (i ∈ A ∪ B) are given functions, A = {1, . . . , p}, B = {p + 1, . . . , m}, and
S := {x ∈ R n | g α (x) ≤ 0 (α ∈ A), g β (x) = 0 (β ∈ B)}.
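To fix ideas, the following minimal numerical sketch verifies the (KKT) stationarity condition for a hypothetical instance of problem (N) with a single inequality constraint; the data f, g 1, the point x 0 and the multiplier are all assumptions of this example, not taken from the text.

```python
import numpy as np

# Hypothetical instance of (N): minimize f(x) = x1 + x2 on
# S = {x in R^2 : g1(x) = x1^2 + x2^2 - 2 <= 0}  (A = {1}, B empty).
f_grad = lambda x: np.array([1.0, 1.0])                  # f'(x)
g1_grad = lambda x: np.array([2.0 * x[0], 2.0 * x[1]])   # g1'(x)

x0 = np.array([-1.0, -1.0])   # minimizer; g1(x0) = 0, so the constraint is active
lam = 0.5                     # candidate Lagrange multiplier, lam >= 0

# (KKT) stationarity: f'(x0) + lam * g1'(x0) = 0.
residual = f_grad(x0) + lam * g1_grad(x0)
print(residual)  # -> [0. 0.]
```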

• A classical approach
For this approach we shall assume, as in [5,15,22], that the functions defining the problem are continuously differentiable and, when second derivatives occur, that they are twice continuously differentiable. In general, if x 0 affords a local minimum to f on S, the (KKT) conditions may not hold at x 0, and some additional assumptions must be imposed to guarantee that Λ(f, x 0 ) ≠ ∅. Assumptions of this nature are usually referred to as constraint qualifications (see [12]) since they involve only the constraints and are independent of the geometric structure of the feasible set S (a broader definition, in terms of critical directions, is given in [5]). Equivalently, they correspond to conditions which assure the positiveness of the cost multiplier λ 0 in the Fritz John necessary optimality condition, which can be stated as follows.
A simple proof of this result, based on the theory of augmentability, is provided by McShane [18]. In [12], the proof given uses Motzkin's theorem of the alternative.
There are many well-known constraint qualifications. A detailed explanation of some of them is given in [12], which includes, to mention a few, those of Slater. For all x ∈ S, denote by I(x) := {α ∈ A | g α (x) = 0} the set of active (or effective, or binding) indices at x. In [15], two well-known constraint qualifications are mentioned, and they can be stated as follows.
(MF) Mangasarian-Fromovitz at x 0. The set {g ′ β (x 0 ) | β ∈ B} is linearly independent and there exists h ∈ R n such that g ′ α (x 0 ; h) < 0 (α ∈ I(x 0 )) and g ′ β (x 0 ; h) = 0 (β ∈ B).

A third condition ("a new condition introduced in [10]"), more restrictive than (MF) but less restrictive than (LI), corresponds to the following.
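These conditions can be tested numerically on toy data. The sketch below, in which all gradients are hypothetical, illustrates that (LI) can fail while (MF) holds: the two active-constraint gradients are parallel, yet a direction h satisfying the strict inequalities exists.

```python
import numpy as np

# Hypothetical active-constraint gradients at x0 (B empty, both
# inequalities active): g1'(x0) = (1, 0) and g2'(x0) = (2, 0).
G = np.array([[1.0, 0.0],
              [2.0, 0.0]])

# (LI): the active gradients must be linearly independent.
li_holds = np.linalg.matrix_rank(G) == G.shape[0]
print(li_holds)  # False: the gradients are parallel

# (MF): there must exist h with g'_a(x0; h) < 0 for every active index a
# (the equality part is vacuous here).  The direction h = (-1, 0) works.
h = np.array([-1.0, 0.0])
mf_holds = bool(np.all(G @ h < 0))
print(mf_holds)  # True
```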
The characterization of uniqueness of Lagrange multipliers given in [15] corresponds to the following result.
A second result given in [15], connecting the strict Mangasarian-Fromovitz constraint qualification with second order necessary conditions, is stated as follows.
The proof of this result, given in [15], says: "The proof parallels exactly the proof of Theorem 3.3 in Ben-Tal [5] which in turn follows directly from a more general Theorem 3.2 in [5]." Let us mention that Theorem 3.3, as stated in [5], corresponds precisely to the false result mentioned in [12], and the correct result mentioned in [12] is precisely the one given in [15], that is, Theorem 2.4 above.
Explicitly, Theorem 3.3 in [5] states that, if x 0 affords a local minimum to f on S and (LI) holds, then there exists λ ∈ Λ(f, x 0 ) for which the corresponding second order conditions hold. A simple counterexample to this result is given in [12].

Let us turn now to the results given in [22], treating also, though from a different viewpoint, the question of uniqueness of Lagrange multipliers. To do so, let us begin with the concept of tangent cone. When choosing a specific definition of this notion (also for modified cone approximations, including Clarke's tangent cone) it feels, as Aubin and Frankowska put it in [1], "like opening the door of a ménagerie of tangents, and facing the choice of a favorite pet!" For our purposes, we shall find it convenient to choose the definition given by Hestenes [13] which, as shown in [12], is equivalent to the one introduced by Bouligand (1932), also known as the contingent cone to S at x 0. Other authors, such as Bazaraa, Goode, Nashed, Varaiya, Kurcyusz, Rockafellar, Saks, Rogak, Scott-Thomas, Elster, Thierfelder, and many more, have given various equivalent definitions of such a cone.

Definition 2.5. We shall say that a sequence {x q } ⊂ R n converges to x 0 in the direction h if x q → x 0, x q ≠ x 0 and (x q − x 0 )/|x q − x 0 | → h. The tangent cone of S at x 0, denoted by T S (x 0 ), is the (closed) cone determined by the unit vectors h for which there exists a sequence {x q } in S converging to x 0 in the direction h. Equivalently (see [14]), T S (x 0 ) is the set of all h ∈ R n for which there exist a sequence {x q } in S and a sequence {t q } of positive numbers such that x q → x 0 and t q (x q − x 0 ) → h.

Let us just briefly mention that the latter is the definition of tangent cone chosen by Wachsmuth in [22], except that the requirement that the sequence {x q } should belong to S is omitted. This requirement is, however, crucial in the definition of the tangent cone of a specific set S at a point. Now, clearly, if {x q } converges to x 0 in the direction h and f has a differential at x 0, then
lim q→∞ (f (x q ) − f (x 0 ))/|x q − x 0 | = f ′ (x 0 ; h).
If f has a second differential at x 0 and f ′ (x 0 ; h) = 0, then
lim q→∞ 2(f (x q ) − f (x 0 ))/|x q − x 0 | 2 = f ′′ (x 0 ; h).
From these facts, first and second order necessary conditions follow straightforwardly.
Proof. Let h ∈ T S (x 0 ) be a unit vector and {x q } ⊂ S a sequence converging to x 0 in the direction h. For large values of q we have f (x q ) ≥ f (x 0 ) and, therefore, f ′ (x 0 ; h) ≥ 0.
Now, recall that, for any B ⊂ R n, the set
B * := {y ∈ R n | ⟨y, x⟩ ≤ 0 for all x ∈ B}
is a closed convex cone, called the dual or polar cone of B. The dual cone T * S (x 0 ) of the tangent cone of S at x 0 is called the normal cone of S at x 0. By the first part of Theorem 2.6, if x 0 is a local minimum point of a C 1 function f on a set S (actually, mere differentiability at x 0 is required), then the negative gradient satisfies −f ′ (x 0 ) ∈ T * S (x 0 ).

Denote by
R S (x 0 ) := {h ∈ R n | g ′ α (x 0 ; h) ≤ 0 (α ∈ I(x 0 )), g ′ β (x 0 ; h) = 0 (β ∈ B)}
the set of vectors satisfying the tangential constraints at x 0 (see [13,14]), also called the linearized tangent cone or the cone of locally constrained directions (see [12]). From the theory of convex cones (see, for example, [13,14]) or using the Farkas-Minkowski theorem of the alternative (see [12]), it follows that its dual cone is the set of nonnegative combinations
R * S (x 0 ) = { ∑ α∈I(x 0 ) λ α g ′ α (x 0 ) + ∑ β∈B λ β g ′ β (x 0 ) | λ α ≥ 0 (α ∈ I(x 0 )), λ β ∈ R (β ∈ B) }.
Note that T S (x 0 ) ⊂ R S (x 0 ).

In [22], the term constraint qualification corresponds to assumptions on the constraints which ensure that the condition −f ′ (x 0 ) ∈ R * S (x 0 ) is a necessary optimality condition for our problem. In view of the remarks given above, this coincides with our previous definition of constraint qualification. Now, as pointed out in [22], constraint qualifications are independent of the objective function f. Hence, if a constraint qualification implies a certain property for the multipliers satisfying (KKT), this property holds for all objective functions (for which x 0 affords a local minimum). With this in mind, define F(x 0 ) as the set of all functions f ∈ C 1 (R n , R) for which x 0 affords a local minimum to f on S. The result on uniqueness of Lagrange multipliers given in [22] is the following.
Note that this result and that of Kyparisis [15], Theorem 2.3, are quite different. In contrast with the former, the latter states that, for a given objective f and a given multiplier λ ∈ Λ(f, x 0 ), the condition (SMF) holds if and only if Λ(f, x 0 ) = {λ}. Since the (SMF) condition relies on the existence of Lagrange multipliers and depends (indirectly) on the objective f, Wachsmuth [22] refrains from calling it a constraint qualification and points out that (LI) is indeed a constraint qualification which ensures the existence and uniqueness of Lagrange multipliers.
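The dual-cone condition −f ′ (x 0 ) ∈ R * S (x 0 ) mentioned above can be illustrated numerically. In the sketch below the active gradients and −f ′ (x 0 ) are assumed data; with no equality constraints, the Farkas-Minkowski theorem says membership amounts to expressing −f ′ (x 0 ) as a nonnegative combination of the active gradients.

```python
import numpy as np

# Assumed data: two active inequality constraints at x0, no equalities.
G = np.array([[1.0, 0.0],    # g1'(x0)
              [0.0, 1.0]])   # g2'(x0)
minus_f_grad = np.array([2.0, 3.0])  # -f'(x0), hypothetical

# Solve for the combination coefficients and check their signs: by the
# Farkas-Minkowski theorem, nonnegative coefficients certify membership
# of -f'(x0) in the dual cone R*_S(x0), i.e. the (KKT) conditions hold.
lam = np.linalg.solve(G.T, minus_f_grad)
print(np.all(lam >= 0))  # True: -f'(x0) belongs to R*_S(x0)
```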

• A regularity approach
We shall now give a different approach which shows in a clear way how some of the constraint qualifications have emerged and how second order conditions can be easily established. It is based on the notions of regularity, normality and properness, and we refer to [13,14] for a full account of these ideas. Recall that x 0 is said to be regular relative to S if R S (x 0 ) ⊂ T S (x 0 ). This condition is also known as Abadie's constraint qualification (see [12]). The first order Lagrange multiplier rule is a consequence of Theorem 2.6 and the following auxiliary result on linear functionals derived in [13,14] through the theory of convex cones.
Proof. By Theorem 2.6 and regularity of x 0 relative to S, f ′ (x 0 ; h) ≥ 0 for all h ∈ R S (x 0 ). The result then follows by Lemma 2.8.
The second order Lagrange multiplier rule is also a straightforward consequence of Theorem 2.6.
Proof. Since x 0 minimizes F on S 1 and F ′ (x 0 ) = 0, the result follows by Theorem 2.6.
Note that S 1 defined above depends on the Lagrange multiplier λ ∈ R m and, if we set Γ = {α ∈ A | λ α > 0} as before, then
S 1 = {x ∈ S | g α (x) = 0 (α ∈ Γ)}.
Therefore, by definition of tangential constraints, we have
R S1 (x 0 ) = {h ∈ R n | g ′ α (x 0 ; h) ≤ 0 (α ∈ I(x 0 ), λ α = 0), g ′ α (x 0 ; h) = 0 (α ∈ Γ), g ′ β (x 0 ; h) = 0 (β ∈ B)}.

Observe that, in the definition of regularity, sequential tangent vectors (elements of T S (x 0 )) are used. If we replace them with curvilinear tangent vectors, that is, elements of
C S (x 0 ) := {h ∈ R n | there exist ε > 0 and a C 1 arc x: [0, ε) → R n with x(0) = x 0 , ẋ(0) = h and x(t) ∈ S (t ∈ [0, ε))},
we obtain a modified regularity condition known as the Kuhn-Tucker constraint qualification. Note that C S (x 0 ) ⊂ T S (x 0 ) ⊂ R S (x 0 ), and the constraint qualification corresponds to the condition R S (x 0 ) ⊂ C S (x 0 ). An even weaker condition, introduced in [14], is that of quasiregularity relative to S, which holds in case the outer normals of S at x 0, that is, the elements w of T * S (x 0 ), are expressible as w = ∑ α∈I(x 0 ) λ α g ′ α (x 0 ) + ∑ β∈B λ β g ′ β (x 0 ) with λ α ≥ 0 (α ∈ I(x 0 )).

In general, it may be difficult to test for regularity, and one usually requires some criterion implying that condition. A simple criterion is that of normality. As pointed out by Hestenes [13], "it is customary in the calculus of variations to call a condition on the gradients g ′ 1 (x 0 ), . . . , g ′ m (x 0 ) a normality condition if it implies regularity at x 0." In [14], normality with respect to S is defined as follows.
Let us remark that the extended multiplier rule stated in Theorem 2.2 (Fritz John necessary optimality condition) yields in a natural way this definition of normality since in that theorem, if x 0 is also a normal point of S, then λ 0 > 0 and the multipliers can be chosen so that λ 0 = 1, implying the nonemptiness of Λ(f, x 0 ). Now, as shown in [14], the desired relation between normality and regularity does indeed hold. It is the basic result relating the notions of regularity and normality. The proof of this result, given in [14], relies strongly on the following characterization of normality.
A proof of the fact that normality and properness are equivalent is given in [14]. In [12] and the classical literature, both are known as constraint qualifications, the former due to Cottle-Dragomirescu and the latter to Mangasarian-Fromovitz. Let us mention that the proof of Theorem 2.3 given in [12,15] is based on this equivalence, which is proved in both references by using theorems of the alternative (see Motzkin's theorem in [12, Theorem 2.4.19]). Now, the notion of normality relative to S can certainly be applied to the subset S 1 of S. It yields the following condition.
The following fundamental result on second order necessary conditions is a consequence of Theorems 2.10 and 2.12. Note that this result and Theorem 2.4 are the same, since the condition (SMF) is no other than properness relative to S 1 (λ), which is equivalent to normality relative to S 1 (λ). Thus, the correct result by Kyparisis [15] (mentioned in [12]) had been previously established by Hestenes in [14, Theorem 7.5, p 227 and Theorem 10.4, p 241]. Of course, Theorem 2.3 can also be stated in terms of the set S 1 (λ).

Theorem 2.16. Suppose x 0 ∈ S and λ ∈ Λ(f, x 0 ). Then x 0 is normal relative to S 1 (λ) if and only if Λ(f, x 0 ) = {λ}.

Let us turn now to the linear independence constraint qualification (LI) seen through this regularity approach. Let us first point out that, surprisingly, the definition of normality given by Hestenes in [13] is not equivalent to the previous one given in [14]. To explain this other definition, suppose x 0 ∈ S, and consider the set defined by treating the active constraints as equality constraints:
S 0 := {x ∈ S | g α (x) = 0 (α ∈ I(x 0 ))}.
Applying Definition 2.11 to this set, normality relative to S 0 amounts to the condition that [λ ∈ R m with λ i = 0 (i ∉ I(x 0 ) ∪ B) and λ * g ′ (x 0 ) = 0] ⇒ λ = 0 (here '*' denotes transpose). This is equivalent to the condition that the linear functionals g ′ i (x 0 ; h) (i ∈ I(x 0 ) ∪ B) in h be linearly independent, which is precisely the (LI) constraint qualification. And this is the way normality is defined in [13].
Note that, given x 0 ∈ S and λ ∈ R m with λ α ≥ 0 (α ∈ A), we have R S0 (x 0 ) ⊂ R S1 (x 0 ) ⊂ R S (x 0 ). Also, if x 0 is a normal point of S 0 , then it is a normal point of S 1 , and hence a normal point of S. Moreover, as mentioned before, normality relative to S implies regularity relative to S.
This definition has several implications in the theory of necessary optimality conditions. One of them, as explained in [5], is that in most textbooks (see a list of well-known references in [5]) a result weaker than Theorem 2.15 is cited. Namely, in that theorem, the assumption of normality relative to S 1 (λ) is replaced by (the stronger assumption of) normality relative to S 0, and the set of tangential constraints R S1 (x 0 ) is replaced by (the, in general, smaller set of tangential constraints) R S0 (x 0 ). Explicitly, this rather "well-worn result" (as Ben-Tal puts it in [5]) is the following.
As pointed out in [5], "The source of this weaker result can be attributed to the traditional way of treating the active inequality constraints as equality constraints." Let us also explain a second implication of this condition, related to first order conditions and uniqueness of multipliers. As mentioned before, one is interested in obtaining criteria for regularity, and a simple one is that of normality; that is, normality relative to S implies regularity relative to S. Moreover, by a simple application of the implicit function theorem, one can easily prove that, given x 0 ∈ S, normality relative to S 0 implies regularity relative to S (see [13, Lemma 10.1, p 35]). By Theorem 2.9, if x 0 affords a local minimum to f on S and x 0 is normal relative to S 0, then Λ(f, x 0 ) ≠ ∅. Moreover, as one readily verifies, the element λ ∈ Λ(f, x 0 ) is in this case unique (see [13, Theorem 10.1, p 36]). On the other hand, if normality is assumed relative to S, then there exists λ ∈ Λ(f, x 0 ), but λ may not be unique.
Let us end this approach with the main results of [15] and [22]. Let x 0 ∈ S. Theorem 2.3 states that, given f ∈ C 1 (R n , R) and λ ∈ Λ(f, x 0 ), x 0 is normal relative to S 1 (λ) if and only if Λ(f, x 0 ) = {λ}. Theorem 2.7 states that Λ(f, x 0 ) is a singleton for all f ∈ C 1 (R n , R) such that x 0 affords a local minimum to f on S if and only if x 0 is normal relative to S 0.
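The role of these normality assumptions can be illustrated with a hypothetical two-dimensional instance in which (LI) fails at the minimizer and, accordingly, the Lagrange multiplier is not unique; all data below are assumptions of this sketch.

```python
import numpy as np

# Hypothetical instance: minimize f(x) = -x1 on
# S = {x in R^2 : x1 <= 0, 2*x1 <= 0}.  The minimizer is x0 = (0, 0),
# both constraints are active, and their gradients are parallel, so the
# (LI) constraint qualification fails at x0.
f_grad = np.array([-1.0, 0.0])       # f'(x0)
G = np.array([[1.0, 0.0],            # g1'(x0)
              [2.0, 0.0]])           # g2'(x0)

# Every lam = (s, (1 - s)/2) with 0 <= s <= 1 satisfies the KKT
# conditions, so Lambda(f, x0) is far from being a singleton.
for s in (0.0, 0.5, 1.0):
    lam = np.array([s, (1.0 - s) / 2.0])
    assert np.all(lam >= 0.0)                      # sign condition
    assert np.allclose(f_grad + G.T @ lam, 0.0)    # stationarity
print("non-unique multipliers")
```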

Isoperimetric inequality constraints.
In this section we shall deal with a fixed endpoint problem of Lagrange in the calculus of variations posed over piecewise smooth arcs and involving inequality and equality isoperimetric constraints. We pose, as before, the question of uniqueness of Lagrange multipliers appearing in first order necessary conditions. As we shall see, there is a striking likeness between the main results stated in the previous section and those for this problem.
To state the problem, suppose we are given an interval T := [t 0 , t 1 ] in R, two points ξ 0 , ξ 1 in R n, functions L and L γ mapping T × R n × R n to R, and scalars b γ in R (γ = 1, . . . , q). Denote by X the space of piecewise C 1 functions mapping T to R n, let R = {1, . . . , r}, Q = {r + 1, . . . , q}, and set
I(x) := ∫ t1 t0 L(t, x(t), ẋ(t)) dt,  I γ (x) := ∫ t1 t0 L γ (t, x(t), ẋ(t)) dt − b γ (γ = 1, . . . , q).
Consider the problem, which we label (V), of minimizing I on S, where
S := {x ∈ X | x(t 0 ) = ξ 0 , x(t 1 ) = ξ 1 , I α (x) ≤ 0 (α ∈ R), I γ (x) = 0 (γ ∈ Q)}.
Elements of X will be called arcs, those of S admissible arcs, and an admissible arc x is a (local) solution to the problem (V) if (upon shrinking T × R n if necessary) any other admissible arc y satisfies I(x) ≤ I(y). Given x ∈ X we use the notation (x(t)) to represent (t, x(t), ẋ(t)). Also, we assume that the functions L, L γ are C 1 and, when second derivatives occur, they are C 2.
The first and second variations of other integrals such as I γ are defined in a similar way. Define the set of admissible variations as Y := {y ∈ X | y(t 0 ) = y(t 1 ) = 0}.
The following result provides well-known first order necessary conditions (see, for example, [13]). It is the analogue of Theorem 2.2, the Fritz John necessary optimality condition for problem (N).
Based on this result, we define normality relative to S (compare with Definition 2.11) or weak normality, as follows.

Definition 3.2.
An arc x 0 will be said to be normal relative to S, or weakly normal, if λ = 0 is the only solution to
i. λ α ≥ 0 and λ α I α (x 0 ) = 0 (α ∈ R).
ii. ∑ q 1 λ γ I ′ γ (x 0 ; y) = 0 for all y ∈ Y.
Clearly, if x 0 solves (V) and is weakly normal, then λ 0 > 0 in Theorem 3.1 and the multipliers can be chosen so that λ 0 = 1. In this event, the couple (x 0 , λ) ∈ S × R q will be called an extremal and we denote by E the set of all extremals.
This condition, as is well known, is equivalent to the existence of c ∈ R n such that
F ẋ (x 0 (t)) = c + ∫ t t0 F x (x 0 (s)) ds (t ∈ T),
where F := L + ∑ q 1 λ γ L γ (see [13]).
By Theorem 3.1, we have the following first order necessary condition. Normality of a local solution to the problem relative to S implies nonemptiness of the set of Lagrange multipliers, but not uniqueness. The same occurs with the nonlinear programming problem. For problem (V), a stronger assumption, which implies uniqueness of the multipliers as well as second order necessary conditions, is usually imposed.
To introduce this stronger assumption, denote the set of active indices at an admissible arc x 0 by I a (x 0 ) = {α ∈ R | I α (x 0 ) = 0} and, as in the previous section, consider the set
S 0 := {x ∈ S | I α (x) = 0 (α ∈ I a (x 0 ))}.
Note that, by definition, x 0 is normal relative to S 0 if λ = 0 is the only solution to
i. λ α I α (x 0 ) = 0 (α ∈ R).
ii. ∑ q 1 λ γ I ′ γ (x 0 ; y) = 0 for all y ∈ Y.
This condition is clearly equivalent to the linear independence on Y of the first variations I ′ γ (x 0 ; y) (γ ∈ I a (x 0 ) ∪ Q) of I γ along x 0, which in turn is equivalent to the existence of y γ ∈ Y (γ ∈ I a (x 0 ) ∪ Q) such that the matrix with entries I ′ γ (x 0 ; y δ ) (γ, δ ∈ I a (x 0 ) ∪ Q) is nonsingular. We refer to [3] for these and other characterizations of normality relative to S 0, a property which we shall call strong normality.
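This characterization suggests a simple numerical test for strong normality: exhibit variations y γ ∈ Y making the matrix of first variations nonsingular. The sketch below does this for two hypothetical isoperimetric functionals along an assumed arc; the arc x 0, the integrands and the chosen variations are all assumptions of the example.

```python
import numpy as np

# Assumed toy data on T = [0, 1]: x0(t) = t and isoperimetric integrands
# L1 = x, L2 = x^2, so that along x0 the first variations are
#   I1'(x0; y) = int y(t) dt,   I2'(x0; y) = int 2 t y(t) dt.
t = np.linspace(0.0, 1.0, 2001)

def integrate(v):
    # composite trapezoid rule on the grid t
    return float(np.sum((v[1:] + v[:-1]) / 2.0 * np.diff(t)))

# Two admissible variations (y(t0) = y(t1) = 0):
y1 = t * (1.0 - t)
y2 = t**2 * (1.0 - t)

# M[gamma][delta] = I'_gamma(x0; y_delta); nonsingularity of such a matrix
# certifies linear independence of the first variations on Y.
M = np.array([[integrate(y1),           integrate(y2)],
              [integrate(2.0 * t * y1), integrate(2.0 * t * y2)]])

print(abs(np.linalg.det(M)) > 1e-6)  # True: x0 is strongly normal for this data
```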
The way first and second order necessary conditions for problem (V) are usually established (see, for example, [13]) can be stated as follows. Both require the assumption of strong normality.

Theorem 3.5. If x 0 solves (V) and is strongly normal, then Λ(L, x 0 ) is a singleton, that is, there exists a unique λ ∈ R q such that (x 0 , λ) ∈ E.

Theorem 3.6. Suppose λ ∈ Λ(L, x 0 ). If x 0 solves (V) and is strongly normal, then J ′′ (x 0 ; y) ≥ 0 for all y ∈ Y satisfying
(a) I ′ α (x 0 ; y) ≤ 0 (α ∈ I a (x 0 ) with λ α = 0), I ′ α (x 0 ; y) = 0 (α ∈ I a (x 0 ) with λ α > 0);
(b) I ′ γ (x 0 ; y) = 0 (γ ∈ Q).

In a recent paper (see [4]) the same second order condition of Theorem 3.6 is derived, but under a weaker assumption. As in Theorem 2.15, it is expressed in terms of the set S 1 (λ) = {x ∈ S | J(x) = I(x)}, where a multiplier λ ∈ R q with λ α ≥ 0 (α ∈ R) is given and the function J is as in Definition 3.3(ii), that is,
J(x) := I(x) + ∑ q 1 λ γ I γ (x).
If we define the set of tangential constraints R S1 (x 0 ) at x 0 ∈ S 1 as the set of those y ∈ Y satisfying (a) and (b) of Theorem 3.6, then the result obtained in [4, Theorem 1.5] can be stated as follows.
The proof of this result relies strongly on a characterization of normality, given in [4], in terms of the notion of properness (see below), similar to the one given in the previous section. It corresponds to a Mangasarian-Fromovitz type condition. In [4, Proposition 2.3], normality relative to S and properness relative to S are shown to be equivalent.
We are now in a position to state and prove the results of [15] and [22] corresponding to our isoperimetric problem. Note that, by [4], we can replace the condition 3.9(a) below with properness of x 0 relative to S 1 (λ).
Let us turn now to the result of [22]. Denote by F(x 0 ) the set of all C 1 functions L mapping T × R n × R n to R such that x 0 solves the problem V(L) of minimizing I(x) = ∫ t1 t0 L(t, x(t),ẋ(t))dt over S. The theorem on uniqueness of multipliers given in [22,Theorem 2] corresponds, in the context of isoperimetric constraints, to the following result.

Optimal control.
In this section we shall deal with an optimal control problem posed over piecewise C 1 trajectories and piecewise continuous controls, and involving inequalities and equalities in the control functions.
We shall encounter crucial differences between this and the previous optimization problems, mainly due to the fact that the constraints are no longer constant but depend explicitly on the time interval under consideration.
To state the problem, suppose we are given an interval T := [t 0 , t 1 ] in R, two points ξ 0 , ξ 1 in R n, and functions L and f mapping T × R n × R m to R and R n respectively, and φ = (φ 1 , . . . , φ q ) mapping R m to R q (q ≤ m). Denote by X the space of piecewise C 1 functions mapping T to R n, and by U k the space of piecewise continuous functions mapping T to R k (k ∈ N). Let Z := X × U m, let R = {1, . . . , r}, Q = {r + 1, . . . , q}, and consider the following two sets:
U := {u ∈ R m | φ α (u) ≤ 0 (α ∈ R), φ β (u) = 0 (β ∈ Q)},
S := {(x, u) ∈ Z | ẋ(t) = f (t, x(t), u(t)) and u(t) ∈ U (t ∈ T ), x(t 0 ) = ξ 0 , x(t 1 ) = ξ 1 }.
The problem we shall deal with, which we label (P), is that of minimizing the functional
I(x, u) := ∫ t1 t0 L(t, x(t), u(t)) dt
over S. Elements of Z will be called processes, those of S admissible processes, and a process (x, u) solves (P) (locally) if (x, u) is admissible and (as in the previous section, upon shrinking T × R n if necessary) we have I(x, u) ≤ I(y, v) for all admissible processes (y, v). Given (x, u) ∈ Z we shall use the notation (x(t)) to represent (t, x(t), u(t)), and '*' denotes transpose.
With respect to the functions delimiting the problem, we assume that, if F := (L, f ), then F (t, ·, ·) is C 1 for all t ∈ T and φ is C 1 ; F (·, x, u), F x (·, x, u) and F u (·, x, u) are piecewise continuous for all (x, u) ∈ R n × R m ; and there exists an integrable function α: T → R such that, at any point (t, x, u) ∈ T × R n × R m, |F (t, x, u)| + |F x (t, x, u)| + |F u (t, x, u)| ≤ α(t). These assumptions are standard for the derivation of first order necessary conditions (see, for example, [19,20]). Also, we assume that the q × (m + r)-dimensional matrix
( ∂φ γ /∂u i (u)  δ γα φ α (u) )  (γ = 1, . . . , q; i = 1, . . . , m; α = 1, . . . , r)
has rank q on U (here δ αα = 1, δ αβ = 0 (α ≠ β)), where I(u) := {α ∈ R | φ α (u) = 0} denotes the set of active indices at u ∈ U. This condition is equivalent to the condition that, at each point u in U, the matrix whose rows are φ ′ γ (u) (γ ∈ I(u) ∪ Q) has full rank. First order necessary conditions are well established and one version can be stated as follows (see [13,19]).
Based on this result, we define normality relative to S as we did in the two previous sections; that is, (x 0 , u 0 ) is normal relative to S if, when the cost multiplier vanishes in the above system, the only solution is the null one.
The elements of E in the above definition will be called extremals. For all (x 0 , u 0 ) ∈ S, denote by Λ(L, x 0 , u 0 ) the set of all (p, µ) ∈ X × U q such that (x 0 , u 0 , p, µ) ∈ E. From the above definitions it follows that, in Theorem 4.1, if (x 0 , u 0 ) is a normal process of S, then Λ(L, x 0 , u 0 ) ̸ = ∅, that is, there exists (p, µ) ∈ X × U q such that (x 0 , u 0 , p, µ) is an extremal.
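The rank condition imposed on φ above can be checked numerically at a given control value. The sketch below assumes the q × (m + r) matrix takes the bordered-Jacobian form described above (the Jacobian of φ with one extra column δ γα φ α (u) per inequality index); the map φ and the point u are hypothetical data of the example.

```python
import numpy as np

# Hypothetical data: m = 2 controls, q = 2 constraints, r = 1 inequality.
#   phi1(u) = u1            (inequality, phi1 <= 0)
#   phi2(u) = u1 + u2 - 1   (equality,   phi2  = 0)
def phi(u):
    return np.array([u[0], u[0] + u[1] - 1.0])

def phi_jac(u):
    return np.array([[1.0, 0.0],
                     [1.0, 1.0]])

u = np.array([-0.5, 1.5])        # phi(u) = (-0.5, 0): feasible, phi1 inactive

# Bordered matrix: Jacobian plus the columns delta_{gamma,alpha} * phi_alpha(u)
# (one column per inequality index alpha = 1, ..., r).
border = np.array([[phi(u)[0]],  # gamma = 1, alpha = 1
                   [0.0]])       # gamma = 2, alpha = 1 (delta = 0)
M = np.hstack([phi_jac(u), border])   # shape q x (m + r) = 2 x 3

print(np.linalg.matrix_rank(M) == 2)  # True: the rank condition holds at u
```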