OPTIMAL CONTROL FOR CANCER CHEMOTHERAPY UNDER TUMOR HETEROGENEITY WITH MICHAELIS-MENTEN PHARMACODYNAMICS

Abstract. We analyze mathematical models for cancer chemotherapy under tumor heterogeneity as optimal control problems. Tumor heterogeneity is incorporated into the model through a potentially large number of states which represent sub-populations of tumor cells with different chemotherapeutic sensitivities. In this paper, a Michaelis-Menten type functional form is used to model the pharmacodynamic effects of the drug concentrations. In the objective, a weighted average of the tumor volume and the total amounts of drugs administered (taken as an indirect measurement for the side effects) is minimized. Mathematically, incorporating a Michaelis-Menten form creates a nonlinear structure in the controls with partial convexity properties in the Hamiltonian function for the optimal control problem. As a result, optimal controls are continuous and this fact can be utilized to set up an efficient numerical computation of extremals. Second-order Jacobi type necessary and sufficient conditions for optimality are formulated that allow one to verify the strong local optimality of numerically computed extremal controlled trajectories. Examples which illustrate the cases of both strong locally optimal and non-optimal extremals are given.


1. Introduction. Tumor heterogeneity is one of the main reasons for the failure of cancer chemotherapy in practice. The genetic instability of cancer cells [6,7], often coupled with faster mutation rates, fosters the creation of diverse subpopulations of tumor cells which have widely varying phenotypes and chemotherapeutic sensitivities. According to the Norton-Simon hypothesis [22,23], generally faster replicating cell lines are more susceptible to a chemotherapeutic attack while slower growing ones are more resistant. One possible explanation for this feature is that resistance is achieved through pathways that use up more energy which thus cannot be used for proliferation [5].
There exists a large literature that deals with tumor heterogeneity from a mathematical modeling point of view in which generally a small number of different subpopulations, such as sensitive and resistant ones, are considered (e.g., [10,11,14,15,28]). More recently, Lorz et al. [20,21] have formulated a mathematical model for phenotypic heterogeneity and drug resistance in solid tumors that allows for a continuum of possible traits. This model was expanded by Greene, Lavi, Gottesman and Levy [9,12] as a means to explain the roles played by increasing cell densities and mutations in the emergence of specific traits that become dominant. Mathematically, allowing for a continuum of traits in the model gives the problem a distributed aspect and the resulting optimization problem becomes non-standard and of great complexity [30]. It therefore makes sense to approximate this structure by a finite-dimensional model (and this is the approach pursued in this paper), but for models with a potentially very large dimension. Optimal control problems of this type are inherently difficult since they aim to control a large number of distinct cell populations by means of what in reality typically is a small number of controls (the concentrations of just a few drugs, maybe 1 to 4), giving rise to the notion of multi-targeted optimal control problems: a large number of variables needs to be controlled with a small number of inputs. Systems with similar features are also studied under the name of ensemble control [17,18].
One fundamental difference between this paper and earlier work is that we include a saturating model for the pharmacodynamic (PD) effects of the drug concentrations, i.e., the effects the drug concentrations have. Specifically, we use the E_max-model in pharmacology which describes the effect of a fast-acting drug through a Michaelis-Menten relationship of the form E_max c/(EC_50 + c) where c denotes the drug concentration, E_max its maximum effect, and EC_50 is the concentration for which half of this effect is realized. In this paper, for simplicity, pharmacokinetics (PK), i.e., the relation between drug dosages and their concentrations in the body, is not modeled. The rationale is that we consider a time horizon that is large relative to the PK timescale (days or months relative to minutes or hours) and this also justifies the use of the 'fast' acting E_max-model. In principle, a standard linear model for PK could easily be incorporated. But in this paper we identify drug dosages with their concentrations and these become the controls in our model.
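The saturating E_max relationship can be sketched in a few lines. The parameter values below (E_max = 1, EC_50 = 1) are illustrative defaults, not values taken from the model; note that normalizing concentrations by EC_50 reduces the effect to the form c/(1 + c) used later in the dynamics.

```python
# Minimal sketch of the E_max pharmacodynamic model E(c) = E_max * c / (EC_50 + c).
# Parameter values are illustrative, not taken from the paper.

def emax_effect(c, e_max=1.0, ec50=1.0):
    """Saturating drug effect as a function of the concentration c >= 0."""
    return e_max * c / (ec50 + c)

# At c = EC_50 exactly half of the maximum effect is realized,
# and the effect saturates toward E_max as c grows.
assert abs(emax_effect(1.0) - 0.5) < 1e-12   # c = EC_50 gives E_max / 2
assert emax_effect(100.0) > 0.99             # near saturation
```

With concentrations measured in multiples of EC_50 (i.e., ec50=1), emax_effect reproduces the factor u/(1+u) that appears in the control system below.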
Mathematically, using a saturating model for the drug effects changes the structure of the dynamics. While the dynamics is a control-linear system when the log-kill hypothesis is used to model PD [30], now the model is fully nonlinear in the controls. This feature, coupled with the high dimensionality of the problem, could become an insurmountable obstacle to analyzing this dynamics as an optimal control problem. However, in our case the dynamics is sparse. In fact, the interactions between the state variables are only through their average value. This simplifies the analysis and leads to effective numerical procedures for the computation of extremal trajectories. While the question of an optimal synthesis, i.e., a global solution of the underlying problem, is a difficult one that will not be addressed in this paper, we derive and test efficient second-order necessary and sufficient conditions for the strong local optimality of numerically computed extremal controlled trajectories. Naturally, these Jacobi-type conditions are related to the absence, respectively existence, of conjugate points along the reference extremal. In our formulation we base the statements on a constructive approach which generates a local field of extremals using the method of characteristics [26] and describe the non-existence of conjugate points in terms of the existence of a solution to a certain matrix Riccati differential equation.
We formulate the mathematical model and its background in Section 2 and then analyze first the conceptually simpler single-input case in Section 3. This model follows the mathematical structure considered in [16] and we draw on the results of that paper to formulate sufficient conditions for local optimality for the model under consideration here. The multi-input case is new and we differentiate between the cases when there is little interaction between the various drugs in Section 4 and when there may be synergistic effects in Section 5. We close with a brief discussion of the numerical procedures to compute the extremals in Section 6.
2. A mathematical model for cancer chemotherapy under tumor heterogeneity with Michaelis-Menten type pharmacodynamic equations. In [20,21] Lorz et al. formulated a model for tumor growth in which a continuum of possible traits (phenotypes) was considered. This model was then expanded upon by Greene et al. [9] and Lavi et al. [12] to include the effects of varying cell densities and can be summarized in the following trait-dependent model of exponential growth:

∂n/∂t (t, x) = { G(N(t)) r(x) [1 − ϕ(x)c(t)] − G(N(t)) µ(x) } n(t, x).    (1)

In this equation x, x ∈ [0, 1], denotes the different phenotypes of cells, n(t, x) denotes the population density of cells with trait x at time t and r = r(x) and µ = µ(x) are phenotype-specific replication and natural death rates. The linear log-kill hypothesis is made for the action of a drug concentration c = c(t) with ϕ = ϕ(x) a trait-dependent cytotoxic killing parameter and, since most cytotoxic agents merely prevent further cell divisions, the drug-related cytotoxic term is interpreted as lowering the replication rate. The functional parameter G is a strictly increasing function which corrects for the fact that the rates for cell division and apoptosis depend not only on the trait x, but also on the cell density [8]. This density is closely related to the total tumor mass N = N(t) of cancer cells at time t. It is incorporated into the model as a quotient by means of a reparametrization of the time-scale assuming that the replication rates are strictly decreasing with total tumor size N while the apoptosis rates are increasing [12,27].
In this paper, we replace the log-kill assumption with a Michaelis-Menten relation of the E_max-type to describe pharmacodynamic effects and also consider a cocktail of therapeutic agents that may have different activation mechanisms, i.e., act on different pathways. The Michaelis-Menten form for PD is explored here as it is common in the pharmaceutical industry since it better describes the long-term effects of drugs. Furthermore, resistance of cells is not absolute, but drug specific: while certain sub-populations of cells may not react to a particular drug, they may be sensitive to another. However, once multiple drugs are involved, their interactions may be complicated. While we do not need to worry much about this aspect if drugs act on different phenotypes x, i.e., have activation mechanisms which essentially separate their targets, overlaps may occur and different drugs may have synergistic or antagonistic mechanisms. Using a basic underlying probabilistic reasoning of independent particles and their actions, we model these effects through quadratic terms in the concentrations. Normalizing the concentrations in terms of their EC_50-values, we arrive at the following modified dynamical model:

∂n/∂t (t, x) = { G(N(t)) r(x) [ 1 − Σ_i ϕ_i(x)c_i(t)/(1 + c_i(t)) + Σ_{i<j} Ω_{i,j}(x)c_i(t)c_j(t)/((1 + c_i(t))(1 + c_j(t))) ] − G(N(t)) µ(x) } n(t, x).    (2)

We note that this dynamics is quite different in its formulation from the ones originally given in [20,21,9,12] while relying on the same premises.
In an optimal control formulation, the aim becomes to choose the controls, which in our case are given by the drug concentrations c_i, in such a way as to minimize a performance criterion imposed on the dynamics. Here we use a functional of the form

J(c) = αN(T) + β ∫_0^T N(t) dt + ∫_0^T γc(t) dt → min,    (3)

where α and β are positive constants while γ is a row vector of positive weights which acts upon the vector c = (c_1, . . . , c_m)^T of concentrations. This objective consists of a weighted average of the total tumor population αN(T) at the end of a fixed therapy interval [0, T], an integral term β ∫_0^T N(t) dt that measures the tumor size during the therapy, and a penalty term ∫_0^T γc(t) dt that limits the toxicity of treatment. Recall that we do not include an equation for PK in the model and thus the concentration c becomes the control. This integral represents the AUC ("area under the curve") customarily used in the pharmaceutical industry and serves as an indirect measure of side effects. The coefficients α, β and γ represent variables of choice and generally need to be fine-tuned to achieve a system performance that would be considered adequate with respect to other aspects possibly not included in the modeling. For example, including the integral terms ∫_0^T c_i(t) dt as penalty terms in the objective is just one way of limiting the side effects of treatment. Alternatively, one could assume that the total amounts of drugs to be given are specified a priori based on a medical assessment of their toxicity and then the question becomes how to best administer the specified amounts of drugs. Mathematically, this leads to similar problem formulations and the penalty approach we use here is in some sense complementary, but fully compatible with this hard limit approach. Thus the optimal control problem is to minimize the functional (3) over all possible concentrations c = (c_1, . . . , c_m)^T subject to the dynamics (2).
Because of its high complexity, we discretize the number of traits and form a finite-dimensional approximation with n traits and denote the state vector by N = (N_1, . . . , N_n)^T. Also changing the notation for the concentrations to the more common notation u_i in control theory, we write the control vector as u = (u_1, . . . , u_m)^T and we limit the controls to lie in compact intervals, 0 ≤ u_i ≤ u_i^max. Let R = diag(r_1, . . . , r_n) and M = diag(µ_1, . . . , µ_n) denote diagonal matrices whose entries are, respectively, given by the replication and apoptosis rates of the ith population, i = 1, . . . , n. We also discretize the effectiveness of the drugs and denote by Φ_j = diag(ϕ_{j,1}, . . . , ϕ_{j,n}) and Ω_{jk} = diag(ω_{jk,1}, . . . , ω_{jk,n}) diagonal matrices whose entries give the effectiveness of the jth drug, j = 1, . . . , m, on the ith state, i = 1, . . . , n, and the interaction correction terms, respectively. Also, we set e = (1/n)(1, . . . , 1), so that the average number of tumor cells is given by eN(t). This then leads to the following optimal control problem:

(P): minimize the objective J(u) = αN(T) + ∫_0^T (βN(t) + γu(t)) dt over all Lebesgue measurable functions u: [0, T] → U = [0, u_1^max] × · · · × [0, u_m^max] subject to the dynamics

Ṅ(t) = { G(eN(t)) R [ Id − Σ_j Φ_j u_j(t)/(1 + u_j(t)) + Σ_{j<k} Ω_{jk} u_j(t)u_k(t)/((1 + u_j(t))(1 + u_k(t))) ] − G(eN(t)) M } N(t),  N(0) = N_0,

where α and β now are n-dimensional row vectors of positive weights. The dynamics is sparse with the interaction between the states only through the term G(eN(t)). This feature allows us to analyze this problem even if the dimension is large. We start with the conceptually simpler single-input case (m = 1).
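To make the sparse structure concrete, here is a minimal forward-Euler sketch of the finite-dimensional dynamics for a single drug with a constant control. The function `simulate`, the two-trait setup and all parameter values are illustrative assumptions, not the paper's calibrated data; we only use the features stated in the text: the drug lowers the replication rate through the saturating factor u/(1+u), the density correction G (here ln(1+s), as in the numerical examples below) multiplies both rates, and the sub-populations interact only through their average value.

```python
# Forward-Euler sketch of dN_i/dt = G(mean(N)) * (r_i*(1 - phi_i*u/(1+u)) - mu_i) * N_i.
# All names and numbers are illustrative assumptions for this sketch.
import math

def simulate(N0, r, phi, mu, u, T, steps=1000):
    """Integrate the discretized dynamics with a constant control u."""
    N = list(N0)
    dt = T / steps
    kill = u / (1.0 + u)                      # Michaelis-Menten drug effect
    for _ in range(steps):
        g = math.log(1.0 + sum(N) / len(N))   # density correction G(eN)
        N = [Ni + dt * g * (ri * (1.0 - pi * kill) - mi) * Ni
             for Ni, ri, pi, mi in zip(N, r, phi, mu)]
    return N

# Two traits: a fast-growing sensitive one and a slower, less sensitive one.
N_end = simulate(N0=[10.0, 10.0], r=[2.0, 1.0], phi=[1.0, 0.3],
                 mu=[0.5, 0.5], u=3.0, T=1.0)
```

Note that each Euler step touches the coupling only through the scalar mean of N; this is the sparsity that keeps the computations tractable even for large n.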

3. Optimal control under tumor heterogeneity with a single drug. For the case of a single drug, the optimal control problem (P) takes the following form:

(P1): For a fixed terminal time T, n-dimensional row vectors α and β of positive weights, and a positive coefficient γ, minimize the objective

J(u) = αN(T) + ∫_0^T (βN(t) + γu(t)) dt

over all Lebesgue measurable functions u: [0, T] → [0, u_max] subject to the dynamics

Ṅ(t) = { G(eN(t)) R [ Id − Φ u(t)/(1 + u(t)) ] − G(eN(t)) M } N(t),  N(0) = N_0.

Recall that R, Φ and M are diagonal matrices. First-order necessary conditions for optimality are given by the Pontryagin maximum principle [24] (for some more recent references on optimal control, see [1,2,19,26]). The control Hamiltonian function for problem (P1) takes the form

H(λ, N, u) = βN + γu + λ { G(eN) R [ Id − Φ u/(1 + u) ] − G(eN) M } N.

Since there are no terminal constraints in the problem formulation, abnormal extremals (for which the multiplier at the objective functional vanishes) do not arise (e.g., see [2,26]) and we already anticipate this by normalizing the coefficient at the objective to be 1. If (N*, u*) is an optimal controlled trajectory for problem (P1) defined over the interval [0, T], then there exists a multiplier λ = (λ_1, . . . , λ_n): [0, T] → (R^n)* (which we write as a row vector) that satisfies the adjoint equation

λ̇(t) = −β − λ(t) { G(eN*(t)) A(t) + G′(eN*(t)) (A(t)N*(t)) e },  A(t) = R [ Id − Φ u*(t)/(1 + u*(t)) ] − M,    (9)

with terminal condition

λ(T) = α    (10)

such that the optimal control u* minimizes the Hamiltonian function H in u over the control set [0, u_max] along (λ(t), N*(t)), i.e.,

H(λ(t), N*(t), u*(t)) = min over 0 ≤ v ≤ u_max of H(λ(t), N*(t), v).    (11)

(We use primes to denote the derivatives of the function G.)

SHUO WANG AND HEINZ SCHÄTTLER
Any controlled trajectory (N, u) for which there exists a multiplier λ such that these conditions are satisfied is called an extremal and the triple (N, u, λ) is called an extremal lift.
Theorem 3.1. Optimal controls for problem (P1) are continuous. Defining the indicator function Ψ as Ψ(t) = λ(t)ΦN*(t), we have that

u*(t) = 0 if Ψ(t) ≤ γ,  u*(t) = √(Ψ(t)/γ) − 1 if γ ≤ Ψ(t) ≤ γ(1 + u_max)²,  u*(t) = u_max if Ψ(t) ≥ γ(1 + u_max)².    (12)

Proof. This directly follows from Theorem 3.1 and the subsequent Corollary 3.2 in [16].

The name indicator function for Ψ merely reflects the fact that, very much like the switching function for control-affine optimal control problems, this function determines the form of the optimal control by means of the relation (12). Essentially, while Ψ(t) is negative, the Hamiltonian is concave in u with a minimum at u = 0 whereas it is convex in u if Ψ(t) is positive. In this case, the minimum of H over all of R is attained at the stationary point, but when restricted to the control set [0, u_max], the minimum is given by (12). These convexity properties of the Hamiltonian allow for efficient numerical computations of extremal controlled trajectories and finite concatenations between constant and interior controls become the natural candidates for optimality. For such controls, sufficient conditions for strong local optimality of the associated extremals were developed in [16] based on the method of characteristics [26] and we formulate these conditions here for problem (P1). We call a time τ ∈ (0, T) a junction time if the control changes between a boundary value (given by 0 or u_max) and the interior control u_st = √(Ψ(t)/γ) − 1; the point N*(τ) is a junction. A junction is said to be regular if the derivative of the indicator function Ψ at the junction time τ does not vanish, i.e., Ψ̇(τ) ≠ 0. We then have the following result:

Theorem 3.2 ([16, Thm. 5.5]). Let Γ* = (N*, u*, λ) be an extremal lift with regular junctions at times t_i, i = 1, . . . , k, 0 = t_0 < t_1 < · · · < t_k < t_{k+1} = T. Let S* denote the solution to the matrix Riccati differential equation

Ṡ + S H_{λN} + H_{Nλ} S + H_{NN} − ι (S H_{λu} + H_{Nu}) H_{uu}^{−1} (H_{uλ} S + H_{uN}) = 0    (13)

with terminal value S*(T) = 0.
All partial derivatives are evaluated along the reference extremal Γ* and ι = 1 on intervals where the control u* takes values in the interior of the control set and ι = 0 otherwise. If the solution S* exists on the full interval [0, T], then there exists a neighborhood P of p* = N*(T) in the positive orthant of R^n, the natural state space for the problem, such that the flow

(t, p) ↦ (t, N(t, p))

defined by the graphs of the controlled trajectories is a diffeomorphism on the domain D = {(t, p): 0 ≤ t ≤ T, p ∈ P}. In this case, the reference control u* is a strong local minimum for problem (P1). Specifically, the reference controlled trajectory (N*, u*) is optimal relative to any other controlled trajectory (N, u) for which the graph of the trajectory N lies in the image of D under this flow.
An essential step in this argument is that, given a reference extremal controlled trajectory with a finite number of regular junctions, it is possible to embed this extremal into a parameterized family of broken extremals, (N(t, p), u(t, p), λ(t, p)), p ∈ P ⊂ R^n, with regular junctions [16, Prop. 5.2]. Note that we assume that the terminal time is not a junction time as such a construction would not be feasible otherwise. The local injectivity of the corresponding flow map along the reference trajectory can be characterized in terms of the existence of a solution to equation (13). This connection is through the relation

S(t, p) = ∂λ/∂p (t, p) [ ∂N/∂p (t, p) ]^{−1}    (15)

with S*(t) = S(t, p*). In our case, since the control can switch between interior and boundary values, the character of this equation changes at junction times. While the control lies in the interior of the control set, we have that

∂²H/∂u² (λ(t, p), N(t, p), u(t, p)) > 0

and thus those segments are what in the engineering literature are called nonsingular extremals [3,26]. This leads to the standard matrix Riccati differential equation for S along interior controls. On the other hand, while controls are constant, since ∂u/∂p ≡ 0, the quadratic term drops out and this differential equation reduces to a linear Lyapunov equation which always has a solution. Finite explosion times are thus only possible along segments when the control lies in the interior. We also mention that it is the fact that the controls remain continuous which implies that the solutions S(t, p) remain continuous at the switching surfaces. Discontinuities in the control would lead to jumps in these solutions at the switching surfaces (e.g., see [26, Sect. 6.1]). Further analysis of the flow mapping also gives us a second-order Jacobi-type necessary condition for optimality. It is clear from the above discussion that loss of local optimality can only happen over an interval where the control lies in the interior.

Theorem 3.3. Let Γ* = (N*, u*, λ) be an extremal lift with regular junctions at times t_i, i = 1, . . .
, k, 0 = t_0 < t_1 < · · · < t_k < t_{k+1} = T and suppose there exists a time τ which is not a junction time such that the solution S* to the differential equation (13) has a finite explosion. Then the reference controlled trajectory (N*, u*) is not a strong local minimum for the optimal control problem (P1).
We recall that τ is a finite explosion time for the solution S* of the Riccati differential equation (13) if at least one of its entries diverges to ±∞ as t → τ, t > τ. Generally, such finite explosion times correspond to conjugate points of the underlying optimal control problem.
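Numerically, such explosion times are detected by integrating backward from the terminal condition and monitoring the size of the solution. The sketch below uses a scalar toy equation as a stand-in (an assumption for illustration only, not equation (13) itself): for s' = −(1 + s²) the backward solution from s(T) = 0 is s(t) = tan(T − t), which explodes at t = T − π/2; the matrix case is handled the same way by monitoring the entries of S.

```python
# Detect a finite explosion time of a Riccati equation integrated backward
# from its terminal condition.  The scalar test equation s' = -(1 + s^2) is
# an illustrative stand-in whose explosion time T - pi/2 is known exactly.
import math

def backward_explosion_time(T, rhs, s_T=0.0, steps=200000, bound=1e6):
    """Integrate s' = rhs(s) backward from s(T) = s_T with the Euler method;
    return the first time at which |s| exceeds `bound` (a numerical proxy
    for a finite explosion), or None if s stays bounded on [0, T]."""
    dt = T / steps
    s, t = s_T, T
    for _ in range(steps):
        s -= dt * rhs(s)        # backward Euler step: s(t - dt) ~ s(t) - dt*s'(t)
        t -= dt
        if abs(s) > bound:
            return t            # conjugate-point indicator
    return None

tau = backward_explosion_time(T=3.0, rhs=lambda s: -(1.0 + s**2))
# tau lies near T - pi/2 ~ 1.43 for this toy equation
```

On segments where the quadratic term is switched off (constant controls), the equation is linear and no explosion can occur, so in practice the monitoring only matters along interior-control segments.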
Proof. The reasoning is classical and we only outline the main steps. The solution S * to the terminal value problem (13) agrees with the matrix S(t, p * ) defined in (15) and the existence of the solution S * is equivalent to the local invertibility of the flow around the reference controlled trajectory [26]. If there exists a time τ > 0 at which the solution S * has an explosion, this is a time when the control takes values in the interior of the control set. By assuming that this time is not a junction time, we have the typical situation for a flow of non-singular extremals and well-known results about Jacobi necessary conditions for optimality are applicable (e.g., see [3] or Sections 5.3 and 5.4 in [26]). For example, if the flow undergoes a fold singularity, it is possible to construct an envelope and local optimality of the flow ceases at the envelope (e.g., see [25,26]).
If the time when the Riccati differential equation has an explosion is a junction time, then the above argument does not apply and one needs to look more closely at the geometric relations between the junction surface and the corresponding singular manifold. Actually, in this case it is conceivable that the overall flow remains injective even if the solution to the Riccati equation explodes at the junction, but the flow no longer is a diffeomorphism. We do not consider this more degenerate case here. We still record the corresponding formulas for the second partial derivatives; as in (16), all derivatives are evaluated along the reference extremal.

We illustrate these results with a numerical example. The parameter values and functions which we use are obtained from a discretization of data for the continuum model which are taken from [9]. These are not based on specific medical data, but merely reflect qualitative biological properties. We discretized the state evenly spaced over the interval with n = 21 states corresponding to the levels x_i = (i − 1)/20, i = 1, . . . , 21. This simply represents one particular example to illustrate the theory: n = 21 is merely a compromise between 'large' and 'small' while choosing the partition equidistant has no bearing on the applicability of the computational procedures as these points merely define the matrices in the discretization. For the replication rate r and the cytotoxic killing parameter ϕ we chose r(x) = 2/(1.1 + 2x^5) and we kept the death rate constant, µ ≡ 0.50. This is merely done to better focus on the effects that different replication rates have. Here the chemotherapeutic sensitivities decay with increasing index i; all sub-populations are sensitive to the drug, but with different rates. For the reparameterization function we have chosen the strictly increasing function G(τ) = ln(1 + τ). The upper limit on the control is set to u_max = 3, i.e., three times the concentration level at which 50% of the maximum effect is realized, and the final time T is fixed.
In all the calculations we choose the initial distribution of cells over the traits uniform with value n(0, x) ≡ 200/21. Figure 1 shows a numerically computed extremal control for the weights α = (1, . . . , 1), β = 500α and γ = 5000 along with the values of the states at the end of the therapy horizon T = 10 represented as the curve n(T, x). We discuss the numerical procedures used to compute this and the other examples below in Section 6. This control is a strong local minimum as the entries of the corresponding solution to the Riccati equation (13) remain finite over the full therapy horizon [0, T].

4. Optimal control under tumor heterogeneity with multiple non-interacting drugs. We now consider the multi-input case (combination therapy), but we assume that the overlap of the cell populations on which the agents act is relatively small so that we can neglect the interactions of the drugs. Mathematically this is reflected in a small intersection of the supports of the functions ϕ_i which define the effectiveness of the drugs or in a small effectiveness if indeed there is such an overlap. The resulting optimal control problem (Pm) is problem (P) with the interaction terms Ω_{jk} dropped from the dynamics. Once again, all matrices are diagonal matrices. The necessary conditions for optimality are straightforward multi-input extensions of the conditions in Section 3 and we do not repeat them. Since the control set is the compact box U = [0, u_1^max] × · · · × [0, u_m^max], and since the Hamiltonian is the sum of m separate terms which each only depend on a single control u_i, the earlier analysis directly carries over to the multi-input system and gives us the following result:

Theorem 4.1. Denoting by Ψ_i(t) = λ(t)Φ_i N*(t) the indicator function for the ith control, i = 1, . . . , m, we have that the relation (12) holds componentwise, i.e., u_i*(t) = 0 if Ψ_i(t) ≤ γ_i, u_i*(t) = √(Ψ_i(t)/γ_i) − 1 if γ_i ≤ Ψ_i(t) ≤ γ_i(1 + u_i^max)², and u_i*(t) = u_i^max if Ψ_i(t) ≥ γ_i(1 + u_i^max)². In particular, optimal controls for problem (Pm) are continuous.
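The componentwise structure is easy to implement. The sketch below assumes, as our reading of the relation (12), that the u_i-dependent part of the Hamiltonian has the scalar form γ_i u − Ψ_i u/(1+u); any additional positive factors absorbed into Ψ_i do not change the clipping logic. All numerical values are illustrative.

```python
# Componentwise Hamiltonian minimization: each u_i minimizes a scalar
# function g(u) = gamma_i*u - psi_i*u/(1+u) over [0, u_i_max].  Setting
# g'(u) = gamma_i - psi_i/(1+u)^2 = 0 gives the stationary value
# sqrt(psi_i/gamma_i) - 1, which is then clipped to the control interval.
# (Illustrative sketch; the paper's indicator may carry extra positive
# factors, which leave the clipping unchanged.)
import math

def minimizing_control(psi, gamma, u_max):
    """Minimizer of gamma*u - psi*u/(1+u) over [0, u_max]."""
    if psi <= gamma:                 # g is nondecreasing: boundary value 0
        return 0.0
    u_st = math.sqrt(psi / gamma) - 1.0
    return min(u_st, u_max)         # interior stationary point, or saturation

def extremal_controls(psis, gammas, u_maxs):
    return [minimizing_control(p, g, um)
            for p, g, um in zip(psis, gammas, u_maxs)]

u = extremal_controls(psis=[0.5, 4.0, 100.0], gammas=[1.0, 1.0, 1.0],
                      u_maxs=[3.0, 3.0, 3.0])
# u = [0.0, 1.0, 3.0]: inactive, interior, and saturated control
```

Since the stationary value depends continuously on psi and matches the boundary values at the thresholds, the computed controls are continuous along extremals, in line with Theorem 4.1.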
The sufficient condition for the strong local optimality of a numerically computed control carries over almost verbatim if we assume that there are no simultaneous junctions in the controls; we call such junctions simple. Simultaneous junctions in the controls lead to more degenerate and more difficult settings (e.g., see [4] for the case of switchings in the controls) as the sequence between interior and boundary controls may change locally near the reference trajectory at such points. This is not the case when at each junction only one of the controls changes from an interior to a boundary value. This clearly is the least degenerate scenario and under this assumption for each control there exists a well-defined sequence for when the controls lie in the interior of the control set or on the boundary and this order does not change over a small enough neighborhood of the reference extremal controlled trajectory. We can therefore again construct a parameterized family of broken extremals by integrating the dynamics and adjoint equations backward from the terminal points by following this sequence. As before, this family defines a local field if there exists a solution to a Riccati type differential equation for the matrix S defined in (15). As in the single-input case, we merely need to keep track of which of the controls lie in the interior and which ones on the boundary of the control set. Since the Hamiltonian is a sum of m terms which are functions of u_i, i = 1, . . . , m, this can be done componentwise for each control and the resulting Riccati differential equation takes the form

Ṡ + S H_{λN} + H_{Nλ} S + H_{NN} − Σ_{i∈I(t)} (S H_{λu_i} + H_{Nu_i}) (H_{u_iu_i})^{−1} (H_{u_iλ} S + H_{u_iN}) = 0    (21)

where the set I(t) consists of all indices i ∈ {1, . . . , m} for which at time t the ith control lies in the interior of the control set. As before, all partial derivatives are evaluated along the reference extremal. We thus have the following second-order necessary, respectively sufficient, Jacobi-type conditions for a local minimum:

Theorem 4.2.
Let Γ* = (N*, u*, λ) be an extremal lift with simple regular junctions at times t_i, i = 1, . . . , k, 0 = t_0 < t_1 < · · · < t_k < t_{k+1} = T, i.e., at each junction only one of the controls changes from an interior to a boundary value and the derivative of the corresponding indicator function does not vanish. If there exists a solution S* to the differential equation (21) with terminal condition S*(T) = 0 over the full interval [0, T], then there exists a neighborhood P of N*(T) such that the corresponding flow of extremal trajectories is a diffeomorphism on the domain D = {(t, p): 0 ≤ t ≤ T, p ∈ P}. In this case, the reference control u* is a strong local minimum for problem (Pm): the reference controlled trajectory (N*, u*) is optimal relative to any other controlled trajectory (N, u) for which the graph of the trajectory N lies in the image of D under this flow. On the other hand, if there exists a time τ ∈ (0, T) which is not a junction time such that the solution S* to the differential equation (21) has a finite explosion, then the reference controlled trajectory (N*, u*) is not a strong local minimum for the optimal control problem (Pm).
We list the partial derivatives of the Hamiltonian which are the obvious modifications of those for the single-input case.

We illustrate this procedure with two examples: one where the numerically computed extremal is a strong local minimum and one where it is not. Indeed, even with the partial convexity of the Hamiltonian in the controls, there are no a priori guarantees that numerically computed solutions are strongly locally optimal. For both examples we choose the same replication rate r and death rate µ as in the single-input case, but we vary the functions that describe the effectiveness slightly. In Figure 2, which shows an example of a strongly locally optimal solution, we take = (600, 1400). In this case, the first drug only acts on the populations N_i with i ≥ 11 while the second one acts on the populations with index i ≤ 11. However, the first group of populations grows faster than the drug's effectiveness can compensate, which leads to an overall growth of these populations, while the first drug can eliminate the slower growing populations with higher index. But the average of the cell populations is increasing at the end of the therapy interval. Yet this control is a strong local minimum as the solution to the Riccati equation exists over the full interval. In Figure 3 we give an example of a numerically computed extremal which no longer is a strong local minimum. Here we again vary the effectiveness functions and once more interpolate linearly in the connecting segments. Thus the first drug is able to fully control the populations N_i for i ≤ 5 and the second control dominates the populations for i ≥ 7, but both controls together are not able to limit the population N_6 which spikes under this treatment. The solution to the Riccati equation has a finite explosion and thus this no longer is a strongly locally optimal solution. While the numerically computed extremal control is able to diminish the cell populations for which the drugs are effective, it is unable to limit the so-to-speak resistant population.
Clearly, here a different strategy is needed, possibly one that limits this population by crowding effects that let the other populations survive at higher rates. Figure 4 shows an example of constant ad hoc controls which (as far as the value of the objective functional is concerned) do significantly better than the extremal shown in Fig. 3, yet still are not able to limit the strong dominance of the resistant trait.

5. Optimal control under tumor heterogeneity with multiple interacting drugs. In this section, we allow for more significant interactions between the controls, but for simplicity we only consider the case of two controls. The Michaelis-Menten functional relations in the dynamics represent the pharmacodynamic effects of the drugs. A tumor cell can only be killed once and thus, taking a probabilistic, semi-mechanistic point of view, we correct these terms by a quadratic expression of the form

Ω u_1u_2 / ((1 + u_1)(1 + u_2)),

where Ω can also be utilized to incorporate possible synergistic effects. This leads to the optimal control problem (P) with m = 2 and the single interaction term Ω = Ω_{12} retained. The Hamiltonian is now given by

H = βN + γ_1u_1 + γ_2u_2 + λ { G(eN) R [ Id − Φ_1 u_1/(1 + u_1) − Φ_2 u_2/(1 + u_2) + Ω u_1u_2/((1 + u_1)(1 + u_2)) ] − G(eN) M } N.

The minimum of the Hamiltonian over the control set can be found by first minimizing over u_1 for fixed u_2 ∈ [0, u_2^max] and then minimizing the resulting function of u_2. We fix the time t, set φ_i = λ(t)Φ_i N(t), i = 1, 2, and ω = λ(t)ΩN(t), and write H(u_1, u_2) for the Hamiltonian as a function of the controls, suppressing the other variables which are constant once time is fixed. For a fixed value u_2 of the second control, the minimization problem in u_1 has the same structure as before with indicator function given by

Ψ̃(t, u_2) = φ_1 − ω u_2/(1 + u_2).

In particular, the minimizing control u_1 is unique and we write u_1 = Υ(u_2) for this functional relation. If ω = 0, the minimizing control does not depend on u_2 and Υ is constant. Otherwise the indicator Ψ̃ is strictly increasing in u_2 if ω < 0 and strictly decreasing if ω > 0. In the first case, the function Υ consists at most of an initial segment where Υ(u_2) ≡ 0 followed by a segment where Υ(u_2) lies in the interior of the control set and is strictly increasing, possibly closing with another segment where Υ(u_2) ≡ u_1^max. These monotonicity relations are reversed if ω > 0. In any case, Υ is piecewise smooth and monotonically non-decreasing or non-increasing. Furthermore, denoting the derivative of Υ by Υ′, we have that

(∂H/∂u_1)(Υ(u_2), u_2) · Υ′(u_2) ≡ 0.

For, on intervals where the minimizing control lies in the interior of the control set the first factor vanishes while the second factor is zero when the minimizing control is constant.
As a result, the composite function defined by η(u_2) = H(Υ(u_2), u_2) is smooth. In particular, it has a well-defined minimum over the control set [0, u_2^max]. Suppose this minimum is attained at an interior point u_2*. Then we have that

η′(u_2*) = (∂H/∂u_1)(Υ(u_2*), u_2*) Υ′(u_2*) + (∂H/∂u_2)(Υ(u_2*), u_2*) = 0.

Hence any such point where Υ(u_2) is constant in a neighborhood of u_2* is a local minimum. If Υ(u_2) lies in the interior of the control interval near u_2*, then it follows from (∂H/∂u_1)(Υ(u_2), u_2) ≡ 0 that

η″(u_2*) = (∂²H/∂u_2²)(Υ(u_2*), u_2*) + (∂²H/∂u_1∂u_2)(Υ(u_2*), u_2*) Υ′(u_2*)

and its sign is indeterminate. Computing Υ′(u_2) by implicit differentiation of (∂H/∂u_1)(Υ(u_2), u_2) ≡ 0, it follows that η″(u_2*) > 0 if and only if

4γ_1γ_2 (1 + u_1*)³ (1 + u_2*)³ > ω²,

where u_1* = Υ(u_2*). In principle, it can happen that this inequality is reversed and then possibly multiple extrema in the interior of the control set may exist. This raises the specter of possible jumps in the controls. Since our model does not include a pharmacokinetic model for the drug actions, this is feasible, even if we think of the controls as representing the 'concentrations' of these drugs. With a pharmacokinetic model, these discontinuities simply are eliminated by the smoothing of the jumps, a standard feature in such models. In the various simulations we ran, though, we never encountered jumps and the extremal controls always were continuous. Figure 5 gives a snapshot of how the functions Υ, η and the full Hamiltonian H look as functions of the controls for one of the numerical examples below. If the controls are continuous, then, under analogous assumptions as before (switchings between interior and boundary controls occur one at a time, the derivatives of the indicator functions at the junctions are nonzero, and explosion times are not junction times), second-order necessary and sufficient conditions for optimality can be formulated as above in terms of the existence of a solution S to a Riccati differential equation.
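The two-stage minimization can be sketched with a simple grid search over u_2. As above, we assume the control-dependent part of H has the form γ_1u_1 + γ_2u_2 − φ_1u_1/(1+u_1) − φ_2u_2/(1+u_2) + ω u_1u_2/((1+u_1)(1+u_2)), with state and multiplier factors absorbed into φ_1, φ_2 and ω; all numerical values are illustrative.

```python
# Two-stage minimization for two interacting drugs: for fixed u2 the
# u1-terms reduce to gamma1*u1 - Psi~ * u1/(1+u1) with effective indicator
# Psi~ = phi1 - omega*u2/(1+u2), so u1 = Upsilon(u2) has the single-drug
# closed form; eta(u2) = H(Upsilon(u2), u2) is then minimized on a grid.
import math

def upsilon(u2, phi1, omega, gamma1, u1_max):
    """Minimizer of the u1-part of H for fixed u2."""
    psi = phi1 - omega * u2 / (1.0 + u2)   # effective indicator
    if psi <= gamma1:
        return 0.0
    return min(math.sqrt(psi / gamma1) - 1.0, u1_max)

def hamiltonian_u(u1, u2, phi1, phi2, omega, gamma1, gamma2):
    """Control-dependent part of H (state/multiplier factors absorbed)."""
    return (gamma1 * u1 + gamma2 * u2
            - phi1 * u1 / (1.0 + u1) - phi2 * u2 / (1.0 + u2)
            + omega * u1 * u2 / ((1.0 + u1) * (1.0 + u2)))

def minimize_two_drugs(phi1, phi2, omega, gamma1, gamma2,
                       u1_max, u2_max, n=2001):
    best = None
    for k in range(n):
        u2 = u2_max * k / (n - 1)
        u1 = upsilon(u2, phi1, omega, gamma1, u1_max)
        val = hamiltonian_u(u1, u2, phi1, phi2, omega, gamma1, gamma2)
        if best is None or val < best[0]:
            best = (val, u1, u2)
    return best

val, u1, u2 = minimize_two_drugs(phi1=4.0, phi2=4.0, omega=1.0,
                                 gamma1=1.0, gamma2=1.0,
                                 u1_max=3.0, u2_max=3.0)
```

With ω > 0 (an antagonistic overlap for these illustrative numbers) the minimizer is pulled into the interior of the control box rather than to its corners.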
In this case, however, the Riccati differential equation does not simplify as in (21); rather, the matrices of the full second-order partial derivatives of the Hamiltonian with respect to the control vector u need to be computed in the equation for Ṡ, with all second-order partial derivatives evaluated along the reference extremal.

We close with a couple of examples that illustrate the behavior of solutions for this model. We again choose the same birth and death rates as before and r(x) = (1 − x)^6. The difference between the two runs lies in the time horizon. Over the full therapy horizon T = 20 the control is not optimal (see Fig. 7, which shows that the solution of the Riccati equation explodes around t = 9), while it still is locally optimal over the shorter therapy horizon T = 5 (Fig. 8). Over the shorter horizon, the control is able to diminish all subpopulations, with the populations N_i for i = 8, 9 and 10 the highest. Over a prolonged time period, the population N_9 becomes dominant and leads to a spike in the density, with the populations for a low or high index almost eliminated. Note that the oncoming explosion can already be noticed in the results for the shorter time horizon, as the solutions of the Riccati equation tend to get larger near the initial time.
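The explosion test can be illustrated on a scalar toy problem. The coefficients below are generic linear-quadratic data, not the matrices of the tumor model; the point is only that the backward solution of a Riccati equation either stays bounded on [0, T] (the second-order test is passed) or escapes before the initial time (a conjugate point, so the extremal is not optimal). For q = −1 the exact solution is S(t) = −tan(T − t), which blows up at t = T − π/2.

```python
import math

def riccati_backward(q, a=0.0, b=1.0, r=1.0, T=2.0, h=1e-4, bound=1e6):
    """Integrate the scalar Riccati equation
       -S' = 2*a*S - S**2 * b**2 / r + q,  S(T) = 0,
    backward in time by explicit Euler. Returns (t_stop, S); t_stop > 0
    signals an escape of the solution before the initial time."""
    S, t = 0.0, T
    while t > 0.0:
        dS = -(2*a*S - S*S*b*b/r + q)   # S'(t) from the Riccati equation
        S -= h * dS                      # step backward to S(t - h)
        t -= h
        if abs(S) > bound:
            return t, S                  # blow-up detected at time t
    return 0.0, S

t1, S1 = riccati_backward(q=1.0)    # benign data: bounded solution on [0, T]
t2, S2 = riccati_backward(q=-1.0)   # indefinite data: escape before t = 0
print(f"q=+1: reached t=0 with S(0) = {S1:.3f}")
print(f"q=-1: escape detected near t = {t2:.3f} (predicted {2.0 - math.pi/2:.3f})")
```

In the model itself S is matrix-valued and the same test monitors the norm of the backward solution along the reference extremal.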
6. Comments about the numerical algorithm used to compute extremals. For the numerical computations we used a standard shooting method. Starting with an initial guess for λ(0), the dynamics and the adjoint equation are integrated forward in time while the control is chosen as the minimizer of the Hamiltonian function over the control set. We then update the guess using the gradient (sensitivities) of the multiplier λ(T) at the terminal time with respect to λ(0). Numerically, we compute the matrix ∂λ(T)/∂λ(0) ∈ R^{n×n} and then take λ(0) − η [∂λ(T)/∂λ(0)]^{−1}(λ(T) − α), with η > 0 a step size, as the new guess for λ(0). This procedure is iterated until a desired error ε in the transversality condition on the multiplier λ at the end-point is achieved, i.e., ‖λ(T) − α‖ < ε. The pseudo-code in Algorithm 1 illustrates this shooting algorithm.
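The following sketch shows the structure of this shooting iteration on a scalar linear-quadratic toy problem standing in for the tumor model (dynamics x′ = −x + u, Hamiltonian H = (x² + u²)/2 + λ(−x + u), so the minimizing control is u = −λ and the adjoint equation is λ′ = −∂H/∂x = −x + λ; the terminal value α is hypothetical). The sensitivity ∂λ(T)/∂λ(0) is approximated by a finite difference and the guess for λ(0) is corrected by a damped Newton step.

```python
# Toy stand-in for the shooting method: all model data are illustrative.
T, h, x0, alpha = 1.0, 1e-3, 1.0, 0.5

def terminal_multiplier(lam0):
    """Integrate state and adjoint forward (Euler) with the minimizing
    control u = -lam and return lam(T)."""
    x, lam = x0, lam0
    for _ in range(int(round(T / h))):
        u = -lam                       # pointwise minimizer of H over u
        x_new = x + h * (-x + u)       # state equation x' = -x + u
        lam = lam + h * (-x + lam)     # adjoint equation lam' = -x + lam
        x = x_new
    return lam

def shoot(lam0=0.0, eta=1.0, eps=1e-8, fd=1e-6, max_iter=100):
    """Damped Newton iteration on lam(0) until |lam(T) - alpha| < eps."""
    for _ in range(max_iter):
        F = terminal_multiplier(lam0) - alpha
        if abs(F) < eps:
            return lam0
        # finite-difference sensitivity d lam(T) / d lam(0)
        dF = (terminal_multiplier(lam0 + fd) - terminal_multiplier(lam0)) / fd
        lam0 -= eta * F / dF           # step size eta > 0 (full Newton here)
    return lam0

lam0_star = shoot()
print(f"lam(0) = {lam0_star:.6f}")
```

In the model the state, adjoint and sensitivity matrix are n-dimensional, but the structure of the iteration is identical.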
These computations are able to generate extremal controlled trajectories for both the single-control and the multiple-control case of the model. However, the shooting method becomes very sensitive to the initial guess once the interaction between the drugs is taken into account and thus requires an initial guess for λ(0) which is close to the true initial condition of λ. In order to obtain such a guess, we first run the iterative procedure shown in the pseudo-code of Algorithm 2, in which the nonlinear ensemble dynamics are synthesized by solving a linear ensemble approximation in each iteration. This procedure has been proven to be convergent [29] and yields a good candidate for the solution of the optimization problem.
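A minimal scalar sketch of this iterative linearization, assuming a hypothetical state-dependent rate a(x) in place of the ensemble matrices, is the following: at each iteration the dynamics are frozen along the previous trajectory, the resulting linear-quadratic problem is solved by a backward sweep for (S, ν) under the ansatz λ = S x + ν, and the trajectory is updated forward in time until the iterates stop changing.

```python
import numpy as np

# Illustrative stand-in for the iterative method: scalar nonlinear dynamics
# x' = a(x)*x + u with running cost (q*x^2 + r*u^2)/2 and terminal multiplier
# lambda(T) = alpha. All coefficients are hypothetical.
T, n, x0, q, r, alpha = 1.0, 1000, 1.0, 1.0, 1.0, 0.5
h = T / n

def a(x):
    return -1.0 - 0.5 * x * x       # hypothetical state-dependent rate

def sweep(x_prev):
    """One iteration: freeze a along x_prev, backward pass for (S, nu)
    with lambda = S*x + nu, then forward pass for the new trajectory."""
    A = a(x_prev)                    # coefficients frozen along x^(k-1)
    S = np.zeros(n + 1)
    nu = np.zeros(n + 1)
    nu[-1] = alpha                   # lambda(T) = alpha => S(T)=0, nu(T)=alpha
    for i in range(n, 0, -1):        # backward Euler sweep
        dS = -(2*A[i]*S[i] + q - S[i]**2 / r)   # Riccati equation
        dnu = -(A[i] - S[i]/r) * nu[i]          # affine correction term
        S[i-1] = S[i] - h * dS
        nu[i-1] = nu[i] - h * dnu
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):               # forward pass, frozen linear dynamics
        u = -(S[i]*x[i] + nu[i]) / r            # minimizing control
        x[i+1] = x[i] + h * (A[i]*x[i] + u)
    return x

x_prev = np.full(n + 1, x0)          # initial trajectory N^(0)
for k in range(100):
    x_new = sweep(x_prev)
    if np.max(np.abs(x_new - x_prev)) < 1e-8:
        break
    x_prev = x_new
print(f"converged after {k+1} iterations, x(T) = {x_new[-1]:.4f}")
```

The converged trajectory and the associated multiplier λ = S x + ν then furnish the warm start for the shooting method.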

Algorithm 2 Iterative Method
Require: N(t) > 0 for all t ∈ [0, T]; pick a threshold ε > 0.
Ensure: an initial guess for λ(0).
Start with an initial trajectory N^(0) and set k = 1.
while ‖N^(k) − N^(k−1)‖ > ε do
    A^(k−1) ⇐ function of N^(k−1)
    O^(k−1) ⇐ function of N^(k−1)
    Solve for S^(k), ν^(k) backward in time by assuming λ^(k) = S^(k) N^(k) + ν^(k).
    Update N^(k) forward in time.
end while

7. Conclusion. In this paper we formulated Jacobi-type second-order necessary and sufficient conditions for the local optimality of numerically computed extremal controls for a mathematical model of chemotherapy under tumor heterogeneity in which a large number of different traits is considered. Different from previous papers, here a Michaelis-Menten type functional relationship (given by strictly concave functions) for the pharmacodynamics of the drugs was explored. This induces partial convexity properties of the Hamiltonian function of the control problem which allow efficient numerical computations even for the large systems considered here, albeit aided by the sparse dynamics. Our numerical computations highlight the possibility of conjugate points, and several examples of non-optimal numerically computed extremals are given, stressing the importance of efficient second-order tests for optimality.