A modified Nelder-Mead barrier method for constrained optimization

An interior point modified Nelder Mead method for nonlinearly constrained optimization is described. This method neither uses nor estimates objective function or constraint gradients. A modified logarithmic barrier function is used. The method generates a sequence of points which converges to KKT point(s) under mild conditions including existence of a Slater point. Numerical results are presented that show the algorithm performs well in practice.


1. Introduction. This paper describes a direct search method for inequality constrained nonlinear optimization. The proposed method is a synthesis of the ideas behind barrier methods [10] and a variant [26] of the Nelder Mead algorithm [22]. The latter means only function values are needed, and the use of barrier techniques allows the objective to be undefined or infinite at points which violate constraints. Hence the method can be applied to problems which lack available derivatives. More recent developments in direct search methods [6,15,23,25,31,32] allow an asymptotic convergence theory to be constructed. An instance of the method is also numerically tested on a variety of both smooth and non-smooth test problems.
The form of the nonlinearly constrained optimization problem considered here is

(1)   minimize f(x) over x ∈ R^n subject to c_i(x) ≤ 0, i = 1, . . . , q,

where f is the objective function, and c = (c_1, c_2, . . . , c_q)^T is the vector of constraint functions. It is assumed the objective and constraint functions are 'black-box,' with only function values able to be calculated at selected points.
In essence, barrier methods [10] construct a function B(x, w) with a minimizer near a solution of (1). This minimizer is dependent on the barrier parameter w > 0 and strictly satisfies the constraints (i.e. c_i < 0, i = 1, . . . , q). Two well known examples of barrier functions for (1) are the inverse and logarithmic barrier functions

B(x, w) = f(x) − w Σ_{i=1}^q 1/c_i(x)   and   B(x, w) = f(x) − w Σ_{i=1}^q ln(−c_i(x)).

These two barrier functions both fit the standard form B = f + w Σ_i P(c_i(x)) for a barrier function, where P is an increasing function of c_i such that P(c_i) → ∞ as c_i → 0 from below. Each P(c_i(x)) term applies a penalty to B which increases to infinity as any point where c_i(x) = 0 is approached. An immediate consequence of this is that any minimizer of B must have c_i < 0 for all i = 1, . . . , q, and so is an unconstrained minimizer.
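As a concrete illustration, the two classical barrier functions can be sketched as follows. This is a minimal sketch in Python; the function names and the list-valued constraint interface are illustrative, not from the paper.

```python
import math

def inverse_barrier(f, c, x, w):
    """Inverse barrier B(x, w) = f(x) - w * sum_i 1/c_i(x), for c_i(x) < 0."""
    ci = c(x)
    assert all(v < 0.0 for v in ci), "defined only on the strict interior"
    return f(x) - w * sum(1.0 / v for v in ci)

def log_barrier(f, c, x, w):
    """Logarithmic barrier B(x, w) = f(x) - w * sum_i ln(-c_i(x))."""
    ci = c(x)
    assert all(v < 0.0 for v in ci), "defined only on the strict interior"
    return f(x) - w * sum(math.log(-v) for v in ci)
```

Both penalties blow up as any c_i approaches zero from below, which is what keeps minimizers strictly inside the feasible region.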
A barrier method typically minimizes B(x, w) for each of a sequence {w m } of values of w, where w m → 0 as m → ∞. The reason a sequence of w values is needed is that the minimizer of each B(x, w m ) with respect to x serves as the starting point for the minimization of B(x, w m+1 ). This sequential process is much more robust than simply minimizing B(x, w) once with a very small w value because B almost always becomes increasingly ill conditioned as w → 0. Since each minimizer of B(x, w) has no active constraints, the process of minimizing B can, in principle, be done by an unconstrained optimization method such as a variant of Nelder Mead.
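The sequential process just described can be sketched as follows. All names are hypothetical; the inner unconstrained solver is passed in as a black box, and geometric reduction of w is one simple choice for the sequence {w_m}.

```python
def barrier_method(minimize, B, x0, w0=1.0, shrink=0.5, w_min=1e-6, max_outer=40):
    """Sketch of the classical outer barrier loop: approximately minimize
    B(., w) for a decreasing sequence of barrier parameters w_m, warm-starting
    each inner solve at the previous approximate minimizer."""
    x, w = x0, w0
    for _ in range(max_outer):
        if w <= w_min:
            break
        x = minimize(lambda y: B(y, w), x)   # inner unconstrained solve
        w *= shrink                           # e.g. w_{m+1} = 0.5 * w_m
    return x
```

The warm start is the point of the sequence: each solve begins near the next minimizer, which offsets the growing ill conditioning of B as w shrinks.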
The Nelder Mead method was first introduced in 1965 [22], and was initially popular. Its fortunes later waned somewhat due to competition from gradient based methods, before enjoying a resurgence driven in part by an increasing interest in problems which lacked available gradients [30]. This resurgence included a deeper examination of Nelder Mead's characteristics [14,16,17], as well as a counterexample showing that the original Nelder Mead method is not guaranteed to converge even on twice continuously differentiable (C^2) problems in 2 dimensions [19]. In response to [19], several convergent variants of Nelder Mead for unconstrained optimization [21,26,29] have been developed. These methods provide motivation for a Nelder Mead variant adapted to the role of minimizing barrier functions.
Nelder Mead begins with a polytope (or simplex) of n+1 points in R n , where this initial polytope is typically large. At each iteration it seeks an improved polytope by modifying the current polytope. Preferentially, it tries to replace the highest point in the polytope with a lower point along the line through this highest point and the centroid of the remaining polytope points. If successful, this yields a new polytope, and completes an iteration. Otherwise, all polytope points except the lowest are shifted partway toward the lowest, yielding a new smaller polytope and completing an iteration. The strategy of replacing the highest polytope point is then resumed. The method terminates when the polytopes become sufficiently small.
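The iteration described above can be sketched as follows. This is a simplified illustration only: the expansion step and the original tie-breaking rules are omitted, and all names are hypothetical.

```python
import numpy as np

def nelder_mead_iteration(simplex, f, alpha=1.0, beta=0.5):
    """One simplified Nelder Mead iteration in the spirit described above:
    try to replace the worst vertex with a point on the line through it and
    the centroid of the remaining vertices (reflection, then inside
    contraction); failing both, shrink every vertex halfway toward the best."""
    simplex = sorted(simplex, key=f)                 # best vertex first
    worst, best = simplex[-1], simplex[0]
    centroid = np.mean(simplex[:-1], axis=0)
    for step in (alpha, -beta):                      # reflect, then contract
        trial = centroid + step * (centroid - worst)
        if f(trial) < f(simplex[-2]):                # beats second-worst vertex
            return simplex[:-1] + [trial]
    return [best] + [best + 0.5 * (v - best) for v in simplex[1:]]
```

Because the best vertex is always retained, the lowest function value over the polytope never increases from one iteration to the next.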
A jejune approach to constructing a barrier based Nelder Mead method for solving (1) would be to use a variant of Nelder Mead to minimize B(x, w) for each value of w in the sequence {w m }. As such, a complete execution of the Nelder Mead variant is embedded inside each iteration of the barrier method. Much of each Nelder Mead execution might be wasted if Nelder Mead uses an initial polytope that is far larger than necessary for the current B(x, w), or if B is minimized to an unnecessarily high accuracy, or both. Moreover, how to transfer information from the very small polytope at the end of one minimization of B to the large initial polytope at the start of the next minimization is far from clear.
A more nuanced approach is to interlace changes in w with iterations of a single execution of the Nelder Mead variant. This provides scope for increased efficiency via a reduction in the total number of Nelder Mead iterations. It also retains the current polytope across changes in w, thus preserving valuable information.
This approach is the subject of this paper, which is organized as follows. The next section describes the barrier function and gives a template for the method. Section 3 details the modified Nelder Mead sub-algorithm. Section 4 establishes asymptotic convergence theory under somewhat more restrictive conditions than are needed to execute the method. These include continuity of f, that the constraint functions c_i are locally Lipschitz, and strict differentiability of the objective and all active constraint functions at relevant limit point(s). Continuous differentiability is not required. Numerical results and a discussion are given in Section 5, with concluding remarks appearing in Section 6. The new method is named penmeco, which is short for 'Pseudo-Expand Nelder Mead for Constrained Optimization.'

2. The Nelder Mead Barrier Algorithm. The method uses a modified logarithmic barrier function which is finite on the set of points Ω which strictly satisfy all constraints. That is to say Ω is the set

Ω = {x ∈ R^n : c_i(x) < 0 for all i = 1, . . . , q}.

This set Ω is usually a strict subset of the feasible region F, which is defined as

F = {x ∈ R^n : c_i(x) ≤ 0 for all i = 1, . . . , q}.

The fact that the constraints are black-box means it is non-trivial to distinguish between interior points of F and boundary points of F at which one or more constraints equal zero. Hence Ω is used as a proxy for the interior F° of F, and the closure of Ω in turn is used as a proxy for the feasible region F. The modified logarithmic barrier function B(x, w) used herein is given by

(2)   B(x, w) = f(x) + w Σ_{i=1}^q P(c_i(x)).

Here w is the barrier parameter, and P is the modified (natural) logarithm function

(3)   P(c_i) = ln(1 − r/c_i),

where r > 0 is a fixed parameter. A different value of r can be used for each constraint. The P in (3) is an increasing smooth function mapping (−∞, 0) into (0, ∞), where P ↑ ∞ as c_i ↑ 0. The first two derivatives of P are

P′(c_i) = r/(c_i^2 − r c_i)   and   P″(c_i) = −r(2c_i − r)/(c_i^2 − r c_i)^2,

which shows that P′ is positive and bounded above by r/c_i^2 when c_i < 0.
The nonnegativity of P means B has two important properties needed to show convergence: B(x, w) ≥ f(x) for all x and for all r > 0 and w ≥ 0; and B(x, w) is a non-strictly increasing function of w. The barrier function in (2) is defined at all points in R^n, allowing an unconstrained optimization algorithm such as a modified Nelder Mead algorithm to be applied to it.
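For illustration, assuming the modified logarithm takes the form P(c) = ln(1 − r/c), which is consistent with all of the stated properties (positive, increasing and smooth on (−∞, 0), with P → ∞ as c → 0 from below), the barrier can be sketched as below. Extending P by +∞ at c ≥ 0 is an assumption of this sketch, used to make B defined everywhere.

```python
import math

def P(c, r=1.0):
    """Assumed modified logarithm P(c) = ln(1 - r/c): positive and increasing
    on (-inf, 0), with P -> infinity as c -> 0 from below. Points with
    c >= 0 receive the extended value +inf in this sketch."""
    if c >= 0.0:
        return math.inf
    return math.log(1.0 - r / c)

def barrier(f, c, x, w, r=1.0):
    """Modified logarithmic barrier B(x, w) = f(x) + w * sum_i P(c_i(x))."""
    return f(x) + w * sum(P(ci, r) for ci in c(x))
```

Since every P term is nonnegative, B ≥ f pointwise, and B is non-decreasing in w, matching the two properties above.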
The barrier algorithm requires an initial point z_0 ∈ Ω. Each B(x, w_m) is approximately minimized with respect to x for each member of a non-strictly decreasing sequence {w_m} of positive barrier parameter values. The initial point for the first minimization is z_0, and for each subsequent minimization it is the approximate minimizer from the previous minimization. Since w is held fixed during each minimization, the notation B_m(x) = B(x, w_m) is used frequently.
A modified Nelder Mead algorithm is used to approximately minimize each B m by generating a sequence of polytopes until descent is poor. The final polytope is reshaped if needed, and one extra point added to form a structure called a frame of n + 1 points around the polytope's lowest point. If completing this frame does not yield sufficient descent the minimization of B m ends, otherwise the Nelder Mead iterations recommence. The frames that do not yield sufficient descent are central to the convergence theory.
2.1. Frames. Frames have been discussed in depth before in [6,7,8], and so the treatment here will be brief. Herein all frames are defined using minimal positive bases [9], where a minimal positive basis is simply a set V_+ = {v_1, . . . , v_{n+1}} ⊂ R^n with the property that any vector in R^n can be written as a linear combination of v_1, . . . , v_{n+1} with non-negative coefficients. It can be shown [9] that all vectors in a minimal positive basis must be non-zero. Minimal positive bases have the useful property that, given g ∈ R^n, if v^T g ≥ 0 for all v ∈ V_+, then g = 0. This is easily seen by writing −g = Σ_{i=1}^{n+1} η_i v_i with η_i ≥ 0 for all i = 1, . . . , n + 1, giving

(4)   −‖g‖^2 = Σ_{i=1}^{n+1} η_i v_i^T g ≥ 0,

which forces g = 0. This result is the crux of frame based direct search methods. A minimal positive basis V_+ is used to form a frame Φ(z, h, V_+) around z, which is the set of points

Φ(z, h, V_+) = {z + h v : v ∈ V_+}.

The vector z ∈ R^n and scalar h > 0 are called the frame centre and frame size respectively, where z ∉ Φ. Each such frame used by the algorithm is constructed from the current Nelder-Mead polytope. The point in the polytope with the lowest B_m value is the frame's centre z. All remaining polytope points are included in the frame Φ. One extra point is added to complete the frame. In mesh or grid based unconstrained optimization [1,28] a sequence of frames is generated, where each frame satisfies

(5)   B(z) ≤ B(x) for all x ∈ Φ(z, h, V_+).

Such frames are called minimal. When B is smooth, a minimal frame yields finite difference approximations of the directional derivatives at z along each v of the form (B(z + hv) − B(z))/h, and (5) implies each of these approximations is non-negative. Loosely speaking, minimal frame centres are a discrete equivalent of local minimizers, and by taking limits [25] it can be shown that these minimal frame centres converge to stationary points of B under appropriate conditions. Herein (5) is relaxed slightly, giving a quasi-minimal frame [6], viz.

(6)   B(z) < B(x) + ε for all x ∈ Φ(z, h, V_+),
where ε > 0 is the maximum acceptable deviation from minimality. A frame is either quasi-minimal, or contains a point x with B(x) ≤ B(z) − ε, i.e. a point of sufficient descent. Conditions must also be imposed on the sequence {V_+^(m)}_{m=1}^∞ to ensure that its limits are also positive bases. One convenient approach is to assume the members of each V_+^(m) have a specific order. This means limit(s) of a sequence of ordered positive bases can be simply defined as follows: V_+^(m) → V_+^* if and only if v_j^(m) → v_j^* for each j = 1, . . . , n + 1, where m is restricted to some infinite subset of the natural numbers N.
Provided the members of all V_+^(m) are uniformly bounded, the sequence {V_+^(m)} must have at least one limit point. Hence it is assumed that there exists a K such that

(7)   ‖v_j^(m)‖ ≤ K for all m ∈ N and all j = 1, . . . , n + 1.

Finally, it is necessary that the positive bases do not collapse in the limit m → ∞. This motivates the following assumption, which requires the positive bases to remain uniformly non-degenerate (Assumption 1). This assumption can be enforced by resetting collapsed or nearly collapsed polytopes whenever sufficient descent is not obtained [26], via any of the options given in [6]. Here sufficient descent means a reduction in B of more than ε.
Assumption 1 and the fact that no member of a minimal positive basis can be zero mean that v_j^(m) is not only bounded above via (7), it is also bounded away from zero for all m ∈ N and j = 1, . . . , n + 1.
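A standard concrete example of a minimal positive basis, and the corresponding frame, can be sketched as below. The coordinate vectors plus the negative of their sum is one common choice; the function names are illustrative.

```python
import numpy as np

def minimal_positive_basis(n):
    """One common minimal positive basis for R^n: the n coordinate vectors
    together with the negative of their sum (n + 1 vectors in total)."""
    V = [np.eye(n)[i] for i in range(n)]
    V.append(-sum(V))
    return V

def make_frame(z, h, V):
    """Frame about centre z with frame size h: the points z + h*v, v in V+."""
    return [z + h * v for v in V]
```

The defining property, that no nonzero g can satisfy v^T g ≥ 0 for every basis vector v, can be checked numerically for any particular g.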

2.2. The Barrier Method.
A framework for a Nelder Mead based barrier algorithm is as follows. Step 1 is an initialization step that requires an initial point z 0 in Ω.
Step 2 performs iterations of the modified Nelder Mead method until a quasi-minimal frame is found. The condition B_m(z_m) ≤ B_m(z_{m−1}) in step 2 can be ensured simply by including z_{m−1} in the new polytope in step 3 before the minimization of B_m is commenced.
Step 3 adjusts the frame size h, the measure of sufficient descent ε, and the barrier parameter w. Adjusting the former two must satisfy

h_m → 0 and ε_m/h_m → 0 as m → ∞.

These can easily be achieved by using, for example, h_m = h_{m−1}/2 and ε_m = G h_m^{5/2}, for a fixed G > 0. Simple choices also exist for the initial and new polytopes in steps 1 and 3. Herein the initial polytope is z_0, together with points obtained from taking steps of length h_1 along each coordinate axis from z_0. A similar construction can be used in step 3 with h_m and z_{m−1} in place of h_1 and z_0. A slightly more sophisticated approach is to then align one edge of this new polytope along a favourable direction by reflecting it through a plane containing z_{m−1}.
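The parameter updates above are simple enough to sketch directly; the function name is illustrative. The key point is that ε_m = G h_m^{5/2} shrinks strictly faster than h_m, so the ratio ε_m/h_m = G h_m^{3/2} tends to zero.

```python
def update_parameters(h, G=1.0, sigma=0.5):
    """Frame-size and sufficient-descent updates from the text:
    h_m = sigma * h_{m-1} and eps_m = G * h_m**(5/2), so that both
    h_m -> 0 and eps_m / h_m -> 0 as m grows."""
    h_new = sigma * h
    eps_new = G * h_new ** 2.5
    return h_new, eps_new
```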

2.3. Adjusting w. The basic strategy behind adjusting w is to reduce it towards zero sufficiently slowly so that the frames' sizes shrink to zero faster than the frames' centres approach the boundary of Ω. This means that the logarithmic singularities become distant on the scale of a frame, which is needed to show convergence. To this end the maximum difference D_i^(m) of each constraint's values between the frame and its centre is specified via

D_i^(m) = max{ |c_i(x) − c_i(z_m)| : x ∈ Φ^(m) }, i = 1, . . . , q.

Herein w is only reduced in step 3 of Algorithm 1 when

(8)   D_i^(m) ≤ W √h_m |c_i(z_m)| for all i = 1, . . . , q,

for a fixed positive constant W.
Assumption 3. For any w_1 > 0 the updating process for w ensures (a) w_m → 0 as m → ∞ provided w is updated infinitely often; and (b) w is reduced only at iterations for which (8) holds.

Part (b) of this assumption means the variation in c_i values across a frame becomes small relative to the constraint values at the frame's centre for some subsequence of quasi-minimal frames. Specifically, for any m at which (8) holds,

(9)   |c_i(x) − c_i(z_m)| ≤ W √h_m |c_i(z_m)| for all x ∈ Φ^(m) and all i = 1, . . . , q,

and the right hand side converges to zero as m → ∞ by Assumption 2. When W √h_m < K_w, (9) implies c_i(x) < 0 for all i = 1, . . . , q and all x ∈ Φ^(m), giving Φ^(m) ⊂ Ω.
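The w-reduction test can be sketched as below. The exact form of condition (8) is a reconstruction here, assumed to read D_i ≤ W √h |c_i(z)| for every constraint; the function name and list-of-lists interface are illustrative.

```python
import math

def should_reduce_w(frame_c_vals, centre_c_vals, h, W):
    """Sketch of the w-reduction test: reduce w only when, for every
    constraint i, the largest deviation D_i of c_i over the frame from its
    value at the frame centre satisfies D_i <= W * sqrt(h) * |c_i(centre)|.
    (Assumed form of condition (8), not taken verbatim from the paper.)"""
    for i, ci_z in enumerate(centre_c_vals):
        D_i = max(abs(cx[i] - ci_z) for cx in frame_c_vals)
        if D_i > W * math.sqrt(h) * abs(ci_z):
            return False
    return True
```

With this reading, a small frame size h makes the test harder to fail, matching the intent that w falls only when the frame is small on the scale of the distance to the boundary.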
3. The Nelder Mead sub-algorithm. The modified Nelder Mead sub-algorithm uses a polytope in R^n consisting of n + 1 points x_0, . . . , x_n which have been sorted in increasing order of their B_m values, with b_i = B_m(x_i) for i = 0, . . . , n. At each iteration the worst point x_n is shifted along a line through it and the centroid of the remaining n points. When this fails to yield sufficient descent, a step called the pseudo-expand step [26] is attempted, which might require reshaping the polytope first. This pseudo-expand step either yields sufficient descent, or gives a quasi-minimal frame. In the former case iterations of Nelder Mead are recommenced. In the latter, the sub-algorithm halts and returns the quasi-minimal frame.
In the following description of the modified Nelder Mead sub-algorithm, iteration numbers on various quantities are omitted in order to minimize clutter. An illustration of the points which could occur in one iteration of the modified Nelder Mead sub-algorithm is given in Figure 1.
Otherwise return the quasi-minimal frame {y_1, . . . , y_n, x_pe} about x_0 and halt.
Steps 1 to 4 form a loop which executes iterations of the original Nelder Mead method (without shrinks) until sufficient descent from the polytope's worst point no longer occurs. The parameters α, β, and γ are the reflection, contraction, and expansion coefficients of standard Nelder Mead, and must satisfy 0 < β < min{α, 1} and γ > max{α, 1}.
Step 5 reshapes the polytope if necessary, so that the frame generated in step 6 via the pseudo-expand step satisfies the necessary conditions for convergence. Specifically, this reshaping together with condition (7) must ensure the underlying sequence of positive bases satisfies Assumption 1. A strategy for identifying which polytopes require reshaping is given in [26]. A simpler approach is to reshape all polytopes in step 5 whether they need it or not. This uses more function evaluations, but these extra evaluations either provide additional evidence that an approximate minimizer of B_m has been found, or yield a point of sufficient descent.
Step 6 attempts the pseudo-expand step by assuming x_0 is the reflection point from a fictitious previous polytope, where the n best points in that fictitious polytope are y_1, . . . , y_n. The expansion point for this fictitious polytope is the pseudo-expand point x_pe for the actual polytope. Continuing to treat x_pe as an expansion point from a fictitious polytope, step 7 replaces x_0 with x_pe if the latter gives sufficient reduction below x_0. Otherwise Φ^(m) = {y_1, . . . , y_n, x_pe} forms a quasi-minimal frame around the frame centre x_0 = z_m.

4. Convergence Results. In this section the convergence properties of methods conforming to the template given in Algorithm 1 are examined. The convergence results are asymptotic, and assume the method is performed without stopping in exact arithmetic. Assumptions 1, 2, and 3 can be guaranteed directly by the way an algorithm is constructed. The remaining assumptions are directly or indirectly problem dependent, and are listed here.

Once case (b) has occurred for the final time, case (a) occurs in all future iterations. Case (a) does not reset the polytope because step 5 is not performed. Hence all polytope points other than x_n are retained from one iteration to the next, and the quantity Σ_{i=0}^n B_m(x_i) is reduced by more than ε_m in each such iteration. This also shows that the sequence of B_m values on the polytopes is unbounded below, again yielding a contradiction. Hence {z_m} must be an infinite sequence.

Corollary 1. Assumption 4(a) implies the sequence of quasi-minimal frame centres {z_m} has one or more limit points.
Theorem 4.1 also implies h is reduced infinitely often, which is important in satisfying Assumption 2. The next two results show w is also reduced infinitely often, which is needed if Assumption 3(a) is to hold.
B_m(z_m) ≤ B_m(z_{m−1}) ≤ B_{m−1}(z_{m−1}),

where the left hand inequality is because B_m(z_m) ≤ B_m(z_{m−1}) from step 2 of Algorithm 1. The right hand inequality is because B is a non-strictly increasing function of w, and w_m ≤ w_{m−1}.

Proof. The second part is a direct consequence of the first, by Assumption 3. The proof that inequality (8) holds infinitely often is by contradiction. Assume the final time inequality (8) holds is in iteration J. Then w_m = w_* > 0 for all m ≥ J. Now the first iterate z_0 ∈ Ω, which means B(z_0, w_0) is finite. Since B(z, w) ≥ f(z) for all w ≥ 0 and z ∈ Ω, and f is bounded below on F by f_min, the values B_m(z_m) are bounded below, which yields the required contradiction.

To connect the method with the KKT conditions, the Lagrangian

L(x, ζ) = f(x) − Σ_{i=1}^q ζ_i c_i(x)

is used, where the ζ_i play the role of Lagrange multiplier estimates. The barrier function B and the Lagrangian can be connected as follows. Using the estimates

(12)   ζ_i^(m) = −w_m P′(c_i(z_m)), i = 1, . . . , q,

a first order expansion of each P(c_i(x)) about c_i(z_m) gives ΔB = Δf + w_m Σ_i P′(c_i(z_m)) Δc_i + higher order terms = ΔL + higher order terms, and so ΔB and ΔL agree to the first order at z_m. Consequently when a frame centre appears to be a discrete stationary point of B, it also appears to be a discrete stationary point of L. This fact is used to establish the KKT conditions for selected limit points z_* of the sequence of quasi-minimal frame centres {z_m} under mild conditions. These conditions include that the sequence {ζ^(m)} of Lagrange multiplier estimates remains bounded, where the notation ζ^(m) = (ζ_1^(m), . . . , ζ_q^(m))^T is used. A function ψ is called strictly differentiable at z if there exists a g ∈ R^n such that

(ψ(y) − ψ(x) − g^T(y − x)) / ‖y − x‖ → 0

as (x, y) → (z, z) over (x, y) ∈ Ω × Ω. The notation ∇ψ_* will be used for g when ψ is strictly differentiable at the point z_*.

The bound on the positive basis vectors in (7) implies ‖(z_m + h_m v_j^(m)) − z_m‖ ≤ K h_m, and quasi-minimality of each frame gives

(13)   ψ(z_m + h_m v_j^(m)) − ψ(z_m) ≥ −ε_m for each j = 1, . . . , n + 1.

Strict differentiability of ψ at z_* implies there exists a g ∈ R^n such that ψ(z_m + h_m v_j^(m)) − ψ(z_m) = h_m g^T v_j^(m) + o(h_m). Hence from (13), h_m g^T v_j^(m) + o(h_m) ≥ −ε_m. Dividing by h_m and allowing m → ∞, this gives g^T v_j^* ≥ 0 for all j = 1, . . . , n + 1, since ε_m/h_m → 0 and v_j^(m) → v_j^* as m → ∞. Hence g = 0 by, for example, Theorem 2.1 of [7], or via (4).

Proof. First, for feasibility, z_m ∈ Ω for all m implies z_* lies in the closure of Ω. However this closure is contained in F, and so z_* is feasible.
Second, the form of the Lagrange multiplier estimates (12) means ζ_* ≤ 0. Third is complementarity. If constraint i is inactive at z_*, then c_i(z_*) < 0 and ζ_i^(m) = −w_m P′(c_i(z_m)) → 0 as m → ∞, giving ζ_i^* = 0, as required. Since ζ_i^* = 0 for all inactive constraints at z_*, the ζ_i^* c_i terms in L(z, ζ_*) for the inactive constraints all vanish. Hence L(z, ζ_*) is strictly differentiable at z = z_* because f and all constraints with c_i(z_*) = 0 are strictly differentiable at z_*.
Finally, each V_+^(m) is a minimal positive basis for all m ∈ M. This in turn implies, for all m ∈ M and all i = 1, . . . , q, there exist θ_i^(m) ∈ (0, 1) giving a second order expansion of each P(c_i) across the frame, where Δc_i has been used for the change in c_i between the frame centre and a frame point. For all m ∈ M sufficiently large, inequality (9) implies |Δc_i| < |c_i|/2 for all i since h_m → 0 as m → ∞. This, together with the bound on P″, and hence inequality (8) and Assumption 2, yield the required limit. The Lipschitz continuity of c_i and the boundedness of ζ^(m) complete the proof.

Proof. To see this we start from inequality (14), which has been established without use of any property of {ζ^(m)}. Additionally, inequality (9) implies |Δc_i| < |c_i|/2 for all i when m ∈ M is sufficiently large. This, and the bound on P″, bound the second derivative term in (14); similarly for Δc_i, where ζ_i^(m) = −w_m P′(c_i(z_m)) has been used without further assumption. Now for inactive constraints c_i(z_*) < 0 holds, which, together with w_m → 0 as m → ∞, implies ζ_i^(m) → 0.

The proof that {ζ^(m)} remains bounded is by contradiction: if ‖ζ^(m)‖ → ∞ on some subsequence, then dividing the discrete stationarity relation by ‖ζ^(m)‖ and taking limits yields a relation in which the ∇f term has vanished because ‖ζ^(m)‖ is unbounded.

The convergence theory only applies to an ideal algorithm which is run forever in exact arithmetic. Realisable implementations necessarily deviate from this ideal on both counts, and hence might fail on some problems. The behaviour of one such implementation is examined in depth in the next section.
5. Implementation and Numerical Testing. The algorithm tested is as described above, with initial values G = h = w = r = 1 and W = n. The values used for the Nelder Mead parameters were those recommended in [22]: the reflection, contraction, and expansion coefficients were α = 1, β = 1/2 and γ = 2 respectively. The values recommended by Gao and Han [11] were tried, but gave a slightly worse performance than those from [22]. The shrink coefficient σ ∈ (0, 1) for h was set at σ = 1/2, in line with [22]. Specifically h_{m+1} = σ h_m was used, along with w_{m+1} = σ w_m when condition (8) holds.

4. If h_{m+1} < h_min, halt. Otherwise increment m, form the new polytope using the reshaping process given below, and go to step 2.
Here e_i is the i-th unit coordinate vector. The algorithm also halted if a budget of 80,000 function evaluations was reached. In addition, a record of the best feasible point was kept, and this point was returned as the solution found by penmeco.

5.1. Reshaping the Polytope. In order to reduce algorithmic complexity in the implementation, every polytope x_0, . . . , x_n is automatically reshaped in step 5 of Algorithm 2 and in step 4 of Algorithm 3 as follows. Assuming the points x_0, . . . , x_n in the polytope are in increasing order of B_m value, the vector s = x_1 − x_0 is calculated. A Householder matrix H = I − 2uu^T is constructed, where u is the unit vector parallel to s ∓ ‖s‖e_1. Here e_1 is the first column of the identity matrix I and the ∓ sign is chosen to match the sign of the first element of s. This u gives Hs = ±‖s‖e_1. Noting H^{−1} = H^T = H, this also gives s = ±‖s‖He_1. The points in the reshaped polytope x_0, y_1, y_2, . . . , y_n are given by

y_i = x_0 + (−1)^κ h He_i, i = 1, . . . , n,

where e_i is the i-th unit coordinate vector, and κ is the number of times the reshaping process has been performed. Hence each reshaped polytope has orthogonal edges from the vertex x_0 of equal length. This ensures both (7) and Assumption 1 hold. The (−1)^κ factor means if the direction of the vector s = x_1 − x_0 remains fixed for several iterations, successive polytopes look along both each He_i direction and its negative. This slightly increases the reliability of the method.
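The Householder construction above can be sketched as follows. The common edge length h, and taking the vertices as rows of a matrix, are assumptions of this sketch.

```python
import numpy as np

def reshape_polytope(x, h, kappa):
    """Reshape the polytope whose vertices are the rows of x (sorted so x[0]
    has the lowest B value) into one with orthogonal, equal-length edges from
    x[0], following the Householder construction in the text. kappa counts
    how many times reshaping has been performed."""
    s = x[1] - x[0]
    n = s.size
    e1 = np.zeros(n)
    e1[0] = 1.0
    sign = 1.0 if s[0] >= 0.0 else -1.0
    u = s + sign * np.linalg.norm(s) * e1    # sign chosen to avoid cancellation
    u /= np.linalg.norm(u)
    H = np.eye(n) - 2.0 * np.outer(u, u)     # Householder reflector, H = H^T = H^-1
    Y = [x[0] + (-1.0) ** kappa * h * H[:, i] for i in range(n)]
    return np.vstack([x[0]] + Y)
```

Because H is orthogonal, the edges y_i − x_0 are mutually orthogonal with length h, which is what guarantees the non-degeneracy of the resulting positive basis.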

5.2. Numerical Results and Discussion. The algorithm was numerically tested on a variety of smooth and nonsmooth constrained problems. Problems 1-8 are from [12] and problems 12-15, 17, 18, and 20 are from [18]. One of these (LV4.7) was modified because its original form in [18] has one equality constraint, which means barrier methods are not applicable to it. This equality constraint was used to eliminate x_1 from LV4.7, yielding the test problem used herein. Problem 9 is the crescent problem from [3]. Problems 10, 11 and 16 are modifications of those in [20]. The objectives for problems 21-26 are as given in [13]. Problems 21 and 22 have the Broyden set of constraints as given in [13,24], P23 and P24 have the single constraint 1 − ‖x‖_∞ ≤ 0, and P25 and P26 have the sole constraint ‖x‖_2 ≤ √n. The initial points for P21-P24 are as listed in [13], and the initial points for P25 and P26 are x_i = e for all i.
The modified Zakharov problem (P19) has the initial point x_i = i for i = 1, . . . , 8, and optimal point x_* = 0. The modified Beale problem (P16) uses the constants y_1 = 1.5, y_2 = 2.25 and y_3 = 2.625; its initial point was x_i = 1 for i = 1, . . . , 5. P11 is the modified Variably Dimensioned problem.

Numerical comparisons are made with a variant v-orthomads of orthomads [4] programmed by this author. V-orthomads is an extreme barrier feasible point direct search method which asymptotically looks in all directions, and as such is a good comparator for penmeco. Several of the test problems have infeasible starting points. This was dealt with by both methods via a phase I which minimizes max(c) until a point in F (for v-orthomads) or Ω (for penmeco) is found. This phase I solution was then used as the initial point when solving the actual problem.
Numerical results for the 26 test problems appear in Table 1, where the first 11 problems listed are smooth, and the remaining 15 have either a nonsmooth objective or nonsmooth constraints. Table 1's legend is as follows. Columns headed 'nf' and 'nc' list the number of function and constraint evaluations used to solve the problem. For penmeco the total number of constraint evaluations ('nc(all)') and the number used by the first phase to find a feasible point ('nc(I)') are listed separately. Columns headed 'E_f' list the relative error in f, which is (f − f_*)/max{1, |f_*|}, where f_* is the optimal value of f. The column headed 'max(c)' lists the maximum constraint value at the solution. This gives an indication of how closely penmeco was able to approach the boundary of the feasible region.
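The accuracy measure from Table 1 is simple enough to state in code; the function name is illustrative.

```python
def relative_error(f_val, f_star):
    """Relative error E_f = (f - f*) / max(1, |f*|) as used in Table 1."""
    return (f_val - f_star) / max(1.0, abs(f_star))
```

The max(1, |f*|) denominator makes the measure behave like an absolute error when |f*| is small and like a relative error when |f*| is large.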
Numerical tests show penmeco is significantly more robust than v-orthomads, at the expense of significantly more function evaluations. The differences in function counts can be partially attributed to the fact that v-orthomads has a line search whereas penmeco does not. The discrepancy in robustness is partly due to the differences between the extreme and modified logarithmic barrier functions. The ability of the logarithmic barrier to initially keep iterates away from the feasible region's boundary reduces the risk of being trapped on the boundary and unable to find a feasible descent direction. The effect of this can be seen in the relative numbers of objective and constraint function evaluations for each method. For penmeco, the number of objective and constraint evaluations is similar on most problems, showing that penmeco largely moves through the interior towards the solution, only rarely encountering infeasible points. In contrast v-orthomads typically uses significantly fewer objective evaluations than constraint evaluations.
There were eight problems (P2, 3, 11-15, and 24) which both methods solved to an acceptable accuracy of E_f < 10^{−4}, and two further problems (P9 and P19) on which v-orthomads almost achieved E_f < 10^{−4}. The spectacular performance of v-orthomads on P2 and P3 is due to the fact that the initial points and solutions both lie on one of the coarse grids used by v-orthomads in its early iterations, allowing that method to step exactly to the solution quickly. As measured by each algorithm's stopping conditions (see Table 1), v-orthomads was significantly faster on all but one of these ten problems. If the methods are stopped when E_f < 10^{−4} is first reached, the situation becomes somewhat more balanced, but with v-orthomads still having the edge. Nevertheless, there are many problems which penmeco can solve, but v-orthomads cannot. Table 1 also lists results for MATLAB's patternsearch algorithm. The default settings were used except that the mesh and step tolerances were set at 10^{−10}, and the constraint tolerance was set at 10^{−18}. The former lines up with h_min = 10^{−10} for penmeco. The latter puts all three algorithms on the same footing with regard to constraint violations. The polling method used was 'MADSPositiveBasis2N', which is a Mesh Adaptive Direct Search [2,4] method that uses an augmented Lagrangian to handle the non-linear constraints [27]. In addition patternsearch used the inbuilt Nelder Mead search step 'searchneldermead'. Since patternsearch is a stochastic method, all results for patternsearch presented in Table 1 are 30 run averages.
The use of an augmented Lagrangian by patternsearch means that it can use infeasible iterates, in contrast to the other two methods. The nature of many test problems changes depending on whether or not f can be evaluated at infeasible points, which makes comparison between patternsearch and the other two methods less straightforward. Disregarding this issue, Table 1 shows penmeco solves more problems than patternsearch, but is slower.
Overall, the performance of penmeco on the smooth and nonsmooth test problems is similar, with only two smooth problems and one nonsmooth problem not solved to an acceptable accuracy. Although the convergence theory does not extend to the nonsmooth problems, this performance shows the method has a significant capability on such problems.

5.3. Comparison on the Cones Test Functions. Penmeco and v-orthomads were also compared on the cones set of randomly generated test functions from [24]. Constructing a cones test problem starts with a nonsmooth convex objective and one convex hyperspherical constraint which is active at the solution. All other constraints are cone shaped with the solution at each cone's apex. The region which violates a conical constraint has a semi-angle of π/3, and a randomly oriented axis. The objective and constraints are then twisted using a random cubic transformation which destroys convexity, but does not create any other local minimizers. In high dimensions the region excluded by the conical constraints is relatively small, and v-orthomads is largely unaffected by them because it ignores constraints until it generates a point which actually violates a constraint. In contrast, any logarithmic barrier method will be affected by the proximity of the conical constraints when near the solution, even if no point violating these constraints is ever generated.
The results for the cones test set appear in Figure 2, which plots the number of randomly generated problems solved by each method against the number of simplex gradients N , for problems of dimension n = 12, 18, and 24, each with 11 constraints.
Initially v-orthomads finds solutions in approximately half the number of function evaluations as penmeco, but is eventually overtaken as N increases. On most problems v-orthomads does not halt, but rather makes negligible progress until the maximum value of N is reached. In contrast, penmeco has a sufficient descent condition, which limits this behaviour. If the maximum value of N is raised to 25,000 then penmeco halts on all problems, with 0, 2, and 13 fails for n = 12, 18, and 24 respectively. V-orthomads was still going on 4, 15, and 22 problems, and for the problems on which it halted, it chalked up 13, 27, and 50 fails for n = 12, 18, and 24 respectively.

Figure 2: Data profiles for penmeco and v-orthomads on 100 randomly generated cones test functions from [24], in 12, 18, and 24 dimensions with 11 constraints. Each curve gives the number of problems solved to an accuracy E_f < 10^{−4} within N simplex gradients, which is N(n + 1) evaluations of f.

6. Conclusion. A barrier function based variant of the Nelder Mead method for nonlinearly constrained optimization has been described. Asymptotic convergence is guaranteed under mild conditions which include strict differentiability at each solution found, but do not include continuous differentiability. Numerical tests show the method performs well in low to moderate dimensions, and that it can be effective on some nonsmooth problems outside the scope of its convergence theory.