Strict dissipativity analysis for classes of optimal control problems involving probability density functions

Motivated by the stability and performance analysis of model predictive control schemes, we investigate strict dissipativity for a class of optimal control problems involving probability density functions. The dynamics are governed by a Fokker-Planck partial differential equation. However, for the particular classes under investigation involving linear dynamics, linear feedback laws, and Gaussian probability density functions, we are able to significantly simplify these dynamics. This enables us to perform an in-depth analysis of strict dissipativity for different cost functions.


1. Introduction. Strict dissipativity of optimal control problems is a pivotal property for the rigorous stability and infinite horizon performance analysis of general (often also termed economic) Model Predictive Control (MPC) schemes. This fact was revealed in a series of recent papers, see, e.g., [10,2,16] or the monographs and survey papers [28,19,11], which provide an abstract framework for the analysis of MPC schemes based on strict dissipativity. This framework has triggered a renewed interest in this classical systems theoretic property, which goes back to [34].
In this paper, we are going to check strict dissipativity, i.e., the central assumption of this abstract framework, for a class of optimal control problems for probability density functions (PDFs). Such problems, in which the entire distribution of an Itô-stochastic control system is shaped via a suitable optimization objective, provide an interesting alternative to classical stochastic optimal approaches that optimize the mean or higher moments. The dynamics of the optimal control problem to be solved in this setting is determined by the Fokker-Planck partial differential equation (FP-PDE) [29], and the approach became particularly popular due to a series of papers in which an MPC-like scheme was applied to this problem [3,4]. Since then, this approach was used in different contexts, e.g. in [7,30].
One way of establishing strict dissipativity is by using the necessary optimality conditions provided by Pontryagin's Maximum Principle (PMP) and the corresponding Hamiltonian optimality system or the corresponding Riccati equation. Most papers following this PMP approach, such as [24,31,32,8,20,22], do not show strict dissipativity but rather aim at proving certain stability properties of the optimally controlled system, typically in the form of the so-called turnpike property, which we discuss at the end of Section 3. However, as shown in [18,17], this property is in many situations equivalent to strict dissipativity. While the PMP-based approach is very powerful, it only applies to problems with linear dynamics and quadratic cost, because it requires this structure (and in particular the convexity implied by it) to obtain the global strict dissipativity property from the local information in the optimality system. The class of systems we discuss in this paper, however, does not have linear dynamics. For such systems, the PMP approach in the form used in the literature only provides local stability results via linearization (e.g., in [32]) but no global stability or strict dissipativity results, which are needed for the analysis of MPC schemes. Since the PDF control problem we consider in this paper is nonlinear, the PMP approaches from the literature do not apply and we need to proceed in a different way.
For the simplest case, in which the optimization objective penalizes the distance of the state-control pair to a desired equilibrium PDF and the corresponding control input, the stability and performance of MPC for this problem was analyzed in [12,13]. However, the setting analyzed in these references assumes that the control input corresponding to the desired equilibrium PDF is known and used in the optimization objective. Both assumptions may be unrealistic: the computation of the control may be difficult, as it is determined by an inverse problem involving the FP-PDE, and it may not be desirable to penalize the distance of the control to this particular control but rather an economically more meaningful quantity, such as the overall control effort. When proceeding this way, one ends up at a more general MPC problem, a so-called unreachable setpoint problem [27]. This problem falls into the more general class of economic MPC schemes, for which strict dissipativity is the main ingredient for guaranteeing stability and near optimal performance. We note that recently, largely motivated by mean-field type control problems, a wealth of theoretical results and numerical methods for optimal control problems involving the FP-PDE have been developed, e.g., in [1,4,5,6]. These can be used for solving the subproblems in an MPC scheme, making MPC numerically efficient for solving infinite horizon optimal control problems, provided the subproblems have the strict dissipativity property.
Motivated by these facts, we decided to analyze strict dissipativity for certain classes of this optimal control problem. This analysis was started in [14], where objectives involving the $L^2$-norm for penalizing the distance to the desired equilibrium PDF were investigated; the results from that paper will be briefly summarized below. Here we extend and complement this analysis to alternative cost functions including the Wasserstein distance $W_2$. While both the $L^2$ and the $W_2$ cost are perfectly suited for being used in a nonlinear setting for general PDFs (the resulting control scheme performs excellently in numerical tests), it turned out that for a rigorous mathematical analysis the problem must be simplified. Hence, as in [14], we perform our analysis for linear SDE dynamics governed by the Ornstein-Uhlenbeck process, linear feedback controllers, and Gaussian PDFs. While this is clearly a restricted setting, we believe that the insights from this analysis are nevertheless very valuable for the general nonlinear setting, in particular those results that clarify that certain storage functions are not appropriate for the strict dissipativity analysis. Clearly, if certain approaches provably do not work in the linear Gaussian setting, they will inevitably also fail in more general settings. Moreover, the linear Gaussian setting allows us to compare our results for general purpose cost functions to results for a cost function that is particularly tailored to the linear Gaussian setting. This cost function combines the 2-norm for the mean and the Frobenius norm for the covariance matrix of the Gaussian PDF and is thus termed 2F cost. Despite its similarity with the $W_2$ cost, the results on strict dissipativity are strikingly different for these two cost functions, which is another important result of this paper.
The paper is organized as follows. Section 2 introduces the problem and the cost functions under consideration. In Section 3 we introduce strict dissipativity and briefly summarize the main results for MPC schemes that can be derived from this property, in order to motivate our subsequent analysis. Section 4 collects a few auxiliary results and summarizes the main results from [14] for the $L^2$ cost before we present our new results for the $W_2$ cost and the 2F cost in Section 5. We end the paper with concluding remarks in Section 6.
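2. Problem setting. In this paper we study the optimal control of probability density functions associated with linear continuous-time stochastic processes

$$ dX_t = \big(A X_t + B u(t)\big)\,dt + D\,dW_t, \quad t > 0, \tag{1} $$

with an initial condition $\mathring{X} \in \mathbb{R}^d$ that is normally distributed, i.e., $\mathring{X} \sim \mathcal{N}(\mathring{\mu}, \mathring{\Sigma})$ with initial mean $\mathring{\mu} \in \mathbb{R}^d$ and covariance matrix $\mathring{\Sigma} \in \mathbb{R}^{d \times d}$, which is symmetric and positive definite. The matrices $A \in \mathbb{R}^{d \times d}$, $B \in \mathbb{R}^{d \times l}$, and $D \in \mathbb{R}^{d \times m}$ and the $m$-dimensional Wiener process $W_t \in \mathbb{R}^m$ are given. The control $u(t)$ is defined by

$$ u(t) := -K(t) X_t + c(t) \tag{2} $$

with functions $K : \mathbb{R}_{\ge 0} \to \mathbb{R}^{l \times d}$ and $c : \mathbb{R}_{\ge 0} \to \mathbb{R}^{l}$. Since the control $u(t)$ exhibits this special structure, whenever beneficial, we identify with $u$ the pair $(K, c)$. Plugging (2) into (1) leads to

$$ dX_t = \big((A - BK(t)) X_t + B c(t)\big)\,dt + D\,dW_t. \tag{3} $$

In this setting, $X_t \in \mathbb{R}^d$ is normally distributed for all $t \ge 0$ and the corresponding PDF $\rho$ reads

$$ \rho(x,t) := |2\pi\Sigma(t)|^{-1/2} \exp\Big(-\tfrac12 (x - \mu(t))^T \Sigma(t)^{-1} (x - \mu(t))\Big), \tag{4} $$

where for matrices $A \in \mathbb{R}^{d \times d}$, throughout the paper, we write $|A| := \det(A)$. The evolution of the PDF (associated with the stochastic differential equation (SDE) (1) or (3)) is described by the Fokker-Planck equation, a parabolic linear partial differential equation:

$$ \partial_t \rho(x,t) - \sum_{i,j=1}^{d} \partial^2_{ij}\big(\alpha_{ij}\,\rho(x,t)\big) + \sum_{i=1}^{d} \partial_i\big(b_i(x,t;u)\,\rho(x,t)\big) = 0 \ \text{ in } Q, \qquad \rho(\cdot,0) = \mathring{\rho} \ \text{ in } \Omega, \tag{5} $$

where $\alpha_{ij} := \sum_k D_{ik} D_{jk}/2$, $b(X_t, t; u) := (A - BK(t)) X_t + Bc(t)$, $Q := \Omega \times (0, T_E)$, and $\Omega := \mathbb{R}^d$. This approach works for more general Markov processes and is not limited to normal distributions, cf. [29,25,26]. As mentioned in the introduction, in order to enable the analysis in this paper, we limit ourselves to the case of Gaussian distributions with mean $\mu(t) \in \mathbb{R}^d$ and covariance matrix $\Sigma(t) \in \mathbb{R}^{d \times d}$. In this case, we can replace the Fokker-Planck equation by the following system of ODEs for $\mu$ and $\Sigma$:

$$ \dot{\mu}(t) = (A - BK(t))\mu(t) + Bc(t), \quad \mu(0) = \mathring{\mu}, \qquad \dot{\Sigma}(t) = (A - BK(t))\Sigma(t) + \Sigma(t)(A - BK(t))^T + DD^T, \quad \Sigma(0) = \mathring{\Sigma}. \tag{6} $$

Using this ODE system will enable us to analyze strict dissipativity for the optimal control problem we will introduce below.

Remark 1. The considered systems (3) and (6) form only a small subclass of the more general problem class for which MPC is known to yield good numerical results.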
Here, we follow the common practice in systems and control theory to first look at these arguably simpler systems because they are more amenable to a rigorous mathematical analysis. We note, however, that both dynamics (3) and (6) are bilinear if K(t) and c(t) are to be optimized. Although simplified, we expect several benefits from our study for more general settings: cost functions that turn out to work well for the bilinear problem might also perform well for more general nonlinear problems. Conversely, cost functions that do not perform well for these simpler systems are likely to perform poorly for more general nonlinear problems, as well. Finally, the results for the bilinear case provide the basis for obtaining local nonlinear results via bilinearization.
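The reduction from the Fokker-Planck equation to moment equations can be illustrated numerically. The following Python sketch integrates the mean and covariance of a linear SDE under the feedback u = -K x + c with forward Euler steps. It assumes the standard moment equations for linear SDEs, μ' = (A - BK)μ + Bc and Σ' = (A - BK)Σ + Σ(A - BK)ᵀ + DDᵀ, constant K and c, and NumPy; the function name `simulate_moments` and all parameter values are illustrative and not taken from the paper.

```python
import numpy as np

def simulate_moments(mu0, Sigma0, A, B, D, K, c, dt, steps):
    """Forward-Euler integration of the Gaussian moment ODEs (illustrative sketch)."""
    mu = np.asarray(mu0, dtype=float).copy()
    Sigma = np.asarray(Sigma0, dtype=float).copy()
    Acl = A - B @ K      # closed-loop drift matrix A - BK (constant gain assumed)
    Q = D @ D.T          # diffusion contribution D D^T
    for _ in range(steps):
        dmu = Acl @ mu + B @ c
        dSigma = Acl @ Sigma + Sigma @ Acl.T + Q
        mu = mu + dt * dmu
        Sigma = Sigma + dt * dSigma
    return mu, Sigma

# Scalar example: dX = -(theta + K) X dt + c dt + sigma dW (hypothetical values)
theta, sigma, K_gain = 1.0, 0.8, 0.5
A = np.array([[-theta]])
B = np.array([[1.0]])
D = np.array([[sigma]])
mu_T, Sigma_T = simulate_moments([2.0], [[3.0]], A, B, D,
                                 np.array([[K_gain]]), np.array([0.0]),
                                 dt=0.01, steps=5000)
# Compare with the stationary variance sigma^2 / (2 (theta + K))
print(mu_T, Sigma_T, sigma**2 / (2 * (theta + K_gain)))
```

In the scalar case the stationary variance of the continuous-time system and of its Euler discretization coincide, which is why the comparison value in the last line is a meaningful reference.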
Particularly, we will carry out the analysis in this paper for the Ornstein-Uhlenbeck process.
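Our aim is to steer the probability density function $\rho$ to a desired Gaussian PDF $\bar{\rho}$, starting from an initial (Gaussian) PDF $\mathring{\rho}$. In continuous time, this can be formulated as the following optimal control problem:

$$ \min_{u}\ J_\infty(\mathring{\rho}, u) := \int_0^\infty \ell\big(\rho(\cdot,t), u(t)\big)\,dt \quad \text{s.t. (6)}, \tag{11} $$

where

$$ \ell(\rho, u) := \tfrac12 \|\rho - \bar{\rho}\|^2 + \tfrac{\gamma}{2} \|u\|^2 \tag{12} $$

for some norm $\|\cdot\|$ and some weight $\gamma \ge 0$ and where we use (4) to calculate the PDF from the solution of (6).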
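The choice of $\ell$, i.e., the choice of the norms in (12), is important. For the control cost, we identify with $u$ the pair $(K, c)$ and suggest the Frobenius norm for $K$ and the Euclidean norm for $c$, which fit well together. For the state cost, we consider three options. The first possibility is to use the $L^2$ norm,

$$ \ell_{L^2}(\rho, u) := \tfrac12 \|\rho - \bar{\rho}\|^2_{L^2(\mathbb{R}^d)} + \tfrac{\gamma}{2}\|u\|^2, \tag{13} $$

which is the standard norm used in costs for optimal control problems governed by parabolic PDEs [33]. This choice of the cost was analyzed in [14]. We can express the $L^2$ norm in terms of $\Sigma$ and $\mu$, which proves useful when focusing on the ODE system (6) instead:

$$ \|\rho - \bar{\rho}\|^2_{L^2(\mathbb{R}^d)} = 2^{-d}\pi^{-d/2}\Big[ |\Sigma|^{-1/2} + |\bar{\Sigma}|^{-1/2} - 2\,\big|\tfrac12(\Sigma + \bar{\Sigma})\big|^{-1/2} \exp\Big(-\tfrac12 (\mu - \bar{\mu})^T (\Sigma + \bar{\Sigma})^{-1} (\mu - \bar{\mu})\Big) \Big]. $$

Looking at the cost from the ODE perspective, the $L^2$ penalization (13) does not seem standard or intuitive at all. One alternative is to use the Wasserstein metric, which is specifically designed to measure the distance between two PDFs. For the general definition of this metric we refer to [15]. Here we only use the formula for the Wasserstein metric for normal distributions derived in [15]. In case $\Sigma$ and $\bar{\Sigma}$ commute¹, this formula yields the following stage cost:

$$ \ell_{W_2}(\mu, \Sigma, K, c) := \tfrac12\Big( \|\mu - \bar{\mu}\|_2^2 + \big\|\Sigma^{1/2} - \bar{\Sigma}^{1/2}\big\|_F^2 \Big) + \tfrac{\gamma}{2}\Big( \|K\|_F^2 + \|c\|_2^2 \Big). \tag{15} $$

The third option we discuss in this paper is very similar to the Wasserstein distance from (15). The only difference is to consider $\Sigma$ and $\bar{\Sigma}$ instead of $\Sigma^{1/2}$ and $\bar{\Sigma}^{1/2}$, respectively. Thus, we end up with

$$ \ell_{2F}(\mu, \Sigma, K, c) := \tfrac12\Big( \|\mu - \bar{\mu}\|_2^2 + \|\Sigma - \bar{\Sigma}\|_F^2 \Big) + \tfrac{\gamma}{2}\Big( \|K\|_F^2 + \|c\|_2^2 \Big). \tag{16} $$

This form of the cost function is commonly used in the optimization of systems governed by ODEs. The index 2F in the notation for this cost indicates the combination of Euclidean and Frobenius norm. In the special case $\bar{\Sigma} = I$ we have $\ell_{W_2}(\mu, \Sigma^2, K, c) = \ell_{2F}(\mu, \Sigma, K, c)$, i.e., considering the squared covariance matrix $\Sigma^2$ instead of $\Sigma$ in the $W_2$ cost leads to the 2F cost. All three stage costs, minus the control cost, are illustrated in Figure 1.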
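To make the relation between the Wasserstein-type cost and the 2F cost tangible, the following Python sketch evaluates both for Gaussian moments and checks the special case in which the target covariance is the identity. It assumes the state penalties ‖μ - μ̄‖² + ‖Σ^{1/2} - Σ̄^{1/2}‖_F² (valid for commuting covariances) and ‖μ - μ̄‖² + ‖Σ - Σ̄‖_F², each weighted by 1/2, plus a control cost (γ/2)(‖K‖_F² + ‖c‖²); these weights, the helper names, and the numerical values are assumptions made for illustration.

```python
import numpy as np

def sqrtm_spd(S):
    """Principal square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(w)) @ V.T

def control_cost(K, c, gamma):
    # Frobenius norm for the gain K, Euclidean norm for the offset c
    return 0.5 * gamma * (np.linalg.norm(K, "fro") ** 2 + np.linalg.norm(c) ** 2)

def cost_w2(mu, Sigma, K, c, mu_bar, Sigma_bar, gamma):
    """W2-type stage cost; the state part assumes Sigma and Sigma_bar commute."""
    state = (np.linalg.norm(mu - mu_bar) ** 2
             + np.linalg.norm(sqrtm_spd(Sigma) - sqrtm_spd(Sigma_bar), "fro") ** 2)
    return 0.5 * state + control_cost(K, c, gamma)

def cost_2f(mu, Sigma, K, c, mu_bar, Sigma_bar, gamma):
    """2F stage cost: Euclidean norm for the mean, Frobenius norm for the covariance."""
    state = (np.linalg.norm(mu - mu_bar) ** 2
             + np.linalg.norm(Sigma - Sigma_bar, "fro") ** 2)
    return 0.5 * state + control_cost(K, c, gamma)

# Hypothetical data with target covariance equal to the identity
mu = np.array([0.3, -0.2])
mu_bar = np.zeros(2)
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
I2 = np.eye(2)
K = np.array([[0.1, 0.0], [0.0, 0.2]])
c = np.array([0.05, 0.0])
gamma = 0.5

lhs = cost_w2(mu, Sigma @ Sigma, K, c, mu_bar, I2, gamma)
rhs = cost_2f(mu, Sigma, K, c, mu_bar, I2, gamma)
print(lhs, rhs)
```

The identity holds because the principal square root of Σ² is Σ itself whenever Σ is symmetric positive definite.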
To summarize, we consider infinite horizon optimal control problems for probability density functions governed by different types of cost functions. For these problems we investigate strict dissipativity in this paper. The motivation for this analysis is given by one of the most popular computational approaches to such optimal control problems, namely Model Predictive Control (MPC). Before we turn to the dissipativity analysis in Sections 4 and 5, we explain this motivation and define strict dissipativity in the next section.
3. Model predictive control. In this section we briefly introduce the concept of (nonlinear) MPC. More details can be found in the monographs [19] and [28]. Suppose we have a process whose state $z(k)$ is measured at discrete times $t_k$, $k \in \mathbb{N}_0$. Furthermore, suppose we can control it on the time interval $[t_k, t_{k+1})$ via a control signal $u(k)$. Then we can consider nonlinear discrete time control systems

$$ z(k+1) = f(z(k), u(k)), \quad z(0) = z_0, \tag{17} $$

with state $z(k) \in \mathbb{X} \subset Z$ and control $u(k) \in \mathbb{U} \subset U$, where $Z$ and $U$ are metric spaces and the state and control constraint sets are given by $\mathbb{X}$ and $\mathbb{U}$, respectively. Whenever clear from the context, we abbreviate the control system (17) by $z^+ = f(z, u)$. Continuous time models such as the one presented in Section 2 can be considered in the discrete-time setting by sampling with a (constant) sampling time $T > 0$, i.e., $t_k = t_0 + kT$. Given an initial state $z_0$ and a control sequence $(u(k))_{k \in \mathbb{N}_0}$, the solution trajectory is denoted by $z_u(\cdot; z_0)$. Note that we do not require the control $u(k)$ to be constant on $[t_k, t_{k+1})$; in general, each $u(k)$ can be a time-dependent function on $[t_k, t_{k+1})$.
As we have seen in Section 2, stabilization and tracking problems can be recast as infinite horizon optimal control problems (11). However, solving these is in general computationally hard. The idea behind MPC is to circumvent this issue by iteratively solving optimal control problems on a shorter, finite time horizon and to use the resulting optimal control values to construct a feedback law $F : \mathbb{X} \to \mathbb{U}$ for the closed loop system

$$ z_F(k+1) = f\big(z_F(k), F(z_F(k))\big). \tag{18} $$

In the discrete-time setting, the infinite horizon functional (11) translates to

$$ J_\infty(z_0, u) := \sum_{k=0}^{\infty} \ell\big(z_u(k; z_0), u(k)\big). $$

We construct the feedback law $F$ through the following MPC scheme:
0. Given an initial value $z_F(0) \in \mathbb{X}$, fix the length of the receding horizon $N \ge 2$ and set $n = 0$.
1. Initialize the state $z_0 = z_F(n)$ and solve the following optimal control problem:

$$ \min_{u \in \mathbb{U}^N}\ J_N(z_0, u) := \sum_{k=0}^{N-1} \ell\big(z_u(k; z_0), u(k)\big) \quad \text{s.t. } z_u(k+1; z_0) = f\big(z_u(k; z_0), u(k)\big),\ z_u(0; z_0) = z_0. \tag{OCP$_N$} $$

Apply the first value of the resulting optimal control sequence, denoted by $u^* \in \mathbb{U}^N$, i.e., set $F(z_F(n)) := u^*(0)$.
2. Evaluate $z_F(n+1)$ according to relation (18), set $n := n + 1$ and go to step 1.
Whenever we want to point out the importance of $N$, we will denote the feedback by $F_N$ instead of $F$.
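The receding horizon loop described in steps 0 to 2 can be sketched in a few lines of Python. The sketch below is a deliberately naive illustration: it solves each finite horizon problem by exhaustive search over a finite control grid, which is only feasible for tiny examples and is not how the subproblems would be solved in practice. The function names, the toy system z⁺ = z + u, and the stage cost z² + u² are hypothetical choices for this example.

```python
import itertools

def mpc_feedback(z, f, stage_cost, control_grid, N):
    """Return the first element of a minimizing control sequence of length N."""
    best_cost, best_first = float("inf"), None
    for seq in itertools.product(control_grid, repeat=N):
        x, cost = z, 0.0
        for u in seq:
            cost += stage_cost(x, u)   # accumulate the stage cost along the prediction
            x = f(x, u)                # propagate the predicted state
        if cost < best_cost:
            best_cost, best_first = cost, seq[0]
    return best_first

def mpc_closed_loop(z0, f, stage_cost, control_grid, N, steps):
    """Apply only the first control of each open-loop solution, then re-optimize."""
    traj = [z0]
    for _ in range(steps):
        u = mpc_feedback(traj[-1], f, stage_cost, control_grid, N)
        traj.append(f(traj[-1], u))
    return traj

# Toy example (hypothetical): scalar integrator with a positive definite stage cost
f = lambda z, u: z + u
ell = lambda z, u: z**2 + u**2
grid = [i / 10 for i in range(-10, 11)]   # admissible controls in [-1, 1]
traj = mpc_closed_loop(2.0, f, ell, grid, N=3, steps=10)
print(traj)
```

Because this toy stage cost is positive definite with respect to the equilibrium (0, 0), the example corresponds to stabilizing MPC rather than to the economic setting studied in the paper.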
When passing from the infinite horizon formulation to the MPC scheme, a priori it is not clear, at all, whether we will obtain approximately optimal trajectories. In fact, it is not even clear whether the closed loop system is asymptotically stable.
A key difference for the analysis of MPC schemes is whether $\ell$ is positive definite with respect to some given equilibrium pair $(z^e, u^e)$ of (17), i.e., $\ell(z^e, u^e) = 0$ and $\ell(z, u) > 0$ for $(z, u) \neq (z^e, u^e)$, where $f(z^e, u^e) = z^e$. A prime example is the stage cost

$$ \ell(z, u) = \tfrac12 \|z - z^e\|^2 + \tfrac{\gamma}{2} \|u - u^e\|^2 $$

for some norm $\|\cdot\|$ and some $\gamma > 0$. For this case, called stabilizing MPC, we have answered the question of minimal stabilizing horizon lengths for a class of linear stochastic processes in [13].
However, the above cost function may be difficult to set up because one needs to know the control $u^e$ corresponding to a desired $z^e$ beforehand, which may be cumbersome to compute. A stage cost that is less complicated to design and thus easier to implement is

$$ \ell(z, u) = \tfrac12 \|z - z^e\|^2 + \tfrac{\gamma}{2} \|u\|^2. \tag{21} $$

This function is also more common in the optimal control literature and structurally similar to the costs (13), (15), and (16). For $u^e \neq 0$, the new stage cost is not positive definite w.r.t. $(z^e, u^e)$ since $\ell(z^e, u^e) \neq 0$.² The specific stage cost (21) models a so-called unreachable setpoint problem [27], which is a particular instance of an economic MPC problem. The conceptual difference between stabilizing and economic MPC is that, instead of stabilizing a prescribed equilibrium pair $(z^e, u^e)$ via a stage cost that is positive definite w.r.t. that pair, in economic MPC the interplay of the stage cost and the dynamics determines the optimal (long-term) behavior. As such, equilibria stay equally important, but the definition of the decisive optimal equilibrium changes.
Assuming such an equilibrium pair $(z^e, u^e)$ exists (which is the case for the dynamics considered in Section 2) and if $f$ and $\ell$ are continuous and $\mathbb{X} \times \mathbb{U}$ is compact, then an optimal equilibrium exists, see, e.g., [19, Lemma 8.4]. The next natural question is under which circumstances, if at all, it is asymptotically stable for the MPC closed loop. In [2,21] it was shown that strict dissipativity is the decisive property. In order to define it, we use the notation $|z_1|_{z_2}$ for the distance from $z_1$ to $z_2$ and recall the notion of comparison functions, introduced by Hahn in [23].
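A continuous function β :

Definition 3.3. (a) The optimal control problem (OCP$_N$) with stage cost $\ell$ is called strictly dissipative at an equilibrium pair $(z^e, u^e) \in \mathbb{X} \times \mathbb{U}$ if there exist a function $\lambda : \mathbb{X} \to \mathbb{R}$ that is bounded from below and satisfies $\lambda(z^e) = 0$ and a function $\varrho \in \mathcal{K}_\infty$ such that for all $(z, u) \in \mathbb{X} \times \mathbb{U}$:

$$ \ell(z, u) - \ell(z^e, u^e) + \lambda(z) - \lambda(f(z, u)) \ge \varrho(|z|_{z^e}). \tag{23} $$

(b) If (a) holds with $\varrho \equiv 0$, then the optimal control problem is called dissipative.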
(c) The function λ in (a) is called storage function.
(d) The left-hand side of (23), i.e.,

$$ \tilde{\ell}(z, u) := \ell(z, u) - \ell(z^e, u^e) + \lambda(z) - \lambda(f(z, u)), \tag{24} $$

is called modified cost or rotated cost.
Note that the requirement $\lambda(z^e) = 0$ in Definition 3.3(a) can always be satisfied by a constant translation of $\lambda$ without influencing the inequality (23).
In a classical interpretation of (23), $\lambda(z)$ serves as a quantifier for the amount of energy stored at state $z$, $\ell(z, u) - \ell(z^e, u^e)$ can be viewed as a supply rate that tracks the amount of energy supplied to or withdrawn from the system via the control $u$, and $\varrho(|z|_{z^e})$ is the amount of energy the system releases (or dissipates) to the environment in each step. Note, however, that in the optimal control problems we discuss here there is not necessarily a notion of "energy" in a physical sense.
For strictly dissipative optimal control problems satisfying appropriate continuity properties³, the following statements hold.
• The optimal equilibrium $z^e$ is practically asymptotically stable for the MPC closed loop, where the neighborhood around $z^e$ to which the closed-loop trajectory converges shrinks down to $z^e$ as the horizon $N \to \infty$.
• The MPC closed-loop trajectories are approximately averaged optimal with approximation error tending to 0 as $N \to \infty$.
• On any finite horizon $K$, the MPC closed-loop trajectories are approximately optimal among all other trajectories converging to $z^e$, with an approximation error that grows linearly in $K$ and tends to 0 as $N \to \infty$.
• For suitable terminal constraints and costs these properties can be improved to exact (as opposed to practical) asymptotic stability, exact averaged optimality, and finite-horizon optimality with an approximation error that is independent of $K$.
In summary, strict dissipativity is the decisive structural property that makes MPC work. This is the main motivation why we analyze it in this paper. Before turning to this analysis, we briefly discuss its relation to another important property of optimal control problems, the so-called turnpike property. This property demands that there exists $\sigma \in \mathcal{L}$ such that for all $N, P \in \mathbb{N}$, $z \in \mathbb{X}$ and the optimal trajectories $z^*(k, z)$ with horizon $N$ the set

$$ \big\{ k \in \{0, \dots, N-1\} \,:\, |z^*(k, z)|_{z^e} \ge \sigma(P) \big\} $$

has at most $P$ elements. In words, most of the time the finite-horizon optimal trajectories stay close to the optimal equilibrium $z^e$.
Under a boundedness condition on the optimal value function (known as cheap reachability), it can be shown that strict dissipativity implies the turnpike property, and under a controllability condition, these two properties are even equivalent [18]. Hence, the turnpike property is often a good indicator for strict dissipativity. In contrast to strict dissipativity, the turnpike property is more difficult to check analytically, because it involves the knowledge of optimal trajectories. On the other hand, the turnpike property is more easily checked numerically by simulating optimal trajectories. Hence, these two properties complement each other in a nice way when analyzing strict dissipativity of optimal control problems.

4. Auxiliary results regarding dissipativity. After the introduction of MPC, we now turn our attention to dissipativity. More precisely, we analyze whether the optimal control problems under consideration are (strictly) dissipative in the sense of Definition 3.3. For this, we first rephrase our objective of steering a PDF to a target PDF in discrete time. To this end, we have to specify the dynamics at hand.
As mentioned in Section 2, we will carry out our analysis for the Ornstein-Uhlenbeck process from Example 1. The reason for this is its simple but bilinear structure. For the sake of better comparability to [10,9], in which dissipativity of linear discrete time dynamics was considered, we would like to keep the bilinear structure in the discrete time setting. Moreover, in any numerical implementation of MPC the dynamics must be approximated by a numerical scheme. In order to allow for a fast computation of the optimal open-loop trajectories, simple but less accurate schemes are often preferred in MPC implementations to more expensive high-order methods. For these reasons, as in [14], we perform our analysis for the forward Euler approximation of the ODE system (9). This discretization both maintains the bilinear structure and defines a scheme that is frequently used in practice. It is given by

$$ \mu(k+1) = \mu(k) + T\big(-(\theta + K(k))\mu(k) + c(k)\big), \qquad \Sigma(k+1) = \Sigma(k) + T\big(-2(\theta + K(k))\Sigma(k) + \varsigma^2\big). \tag{25} $$

In contrast to [14], in which strict dissipativity for the stage cost $\ell_{L^2}(\rho, u)$ from (13) was analyzed, the stage cost we consider in this paper is given by either (15) or (16).

Remark 2. Note that the state constraint $\Sigma > 0$ automatically holds for (6) and (9). However, when switching to the Euler approximation (25), we have to impose $\Sigma(k) > 0$ as a constraint for all $k \in \mathbb{N}_0$. In conjunction with $K_\theta(k) > 0$, cf. (8), this can be incorporated as control constraints, cf. (26).

The optimal control problem (OCP$_N$) that is solved in the MPC algorithm is then

$$ \min_{(K, c)}\ \sum_{k=0}^{N-1} \ell\big(\mu(k), \Sigma(k), K(k), c(k)\big) \quad \text{s.t. (25), (26)}, \tag{27} $$

with $\ell$ given by either (15) or (16). To prove that (27) is strictly dissipative, we need to find a suitable storage function $\lambda$ for which the inequality (23) in Definition 3.3 holds. In general, it is not easy to find such a function. However, for OCPs with linear discrete-time dynamics, a convex constraint set, and a strictly convex stage cost $\ell$, it is known [10] that the linear function

$$ \lambda_l(z) := \bar{\lambda}^T z \tag{29} $$

is a suitable storage function; for a proof, see, e.g., [9].⁴ Here, $\bar{\lambda}$ is the Lagrange multiplier in the optimization problem of finding the optimal equilibrium $(z^e, u^e)$:

$$ \min_{(z, u)}\ \ell(z, u) \quad \text{s.t. } z - f(z, u) = 0. \tag{30} $$

The reason for this linear storage function is the close connection between the Lagrange function $L(z, u, \lambda)$ associated with (30) and the resulting modified cost $\tilde{\ell}$:

$$ L(z, u, \lambda) = \ell(z, u) + \lambda^T\big(z - f(z, u)\big), \qquad \tilde{\ell}(z, u) = L(z, u, \bar{\lambda}) - \ell(z^e, u^e). \tag{31} $$

In this particular form of dissipativity, also known as strict duality in optimization theory, the strict convexity of $\ell$ carries over to $L$ and therefore to $\tilde{\ell}$, with the global minimum being attained at $(z^e, u^e)$. In the final step, due to $L(z^e, u^e, \bar{\lambda}) = \ell(z^e, u^e)$, we have that $\tilde{\ell}$ is positive definite with respect to the optimal equilibrium $(z^e, u^e)$, which allows us to conclude (23), i.e., strict dissipativity. Although this is, in general, not true for nonlinear $f(z, u)$, in the following we analyze how far the approach of a linear storage function can be successfully extended to bilinear OCPs, such as (27) with stage cost $\ell$ given by (15) or (16). To this end, in the rest of this section, we state some auxiliary results. These were presented in [14, Lemmas 4, 5, 6] for the $L^2$ cost (13). Since they trivially extend to the stage costs (15) and (16), we omit the proofs.
The first result characterizes equilibria. We recall that the imposed constraints ensure $\bar{K}_\theta = \theta + \bar{K} > 0$, cf. (26). (9) and (25) and is given by Without loss of generality, we assume that $(\bar{\mu}, \bar{\Sigma}) = (0, 1)$. Otherwise we introduce a new random variable $Y_t := \bar{\Sigma}^{-1/2}(X_t - \bar{\mu})$ and get a new ODE system similar to (9). With this assumption, due to (32), we have $\bar{c} = 0$, which allows us to further simplify the dynamics under consideration for the chosen cost criteria.
with the same $\ell$, is strictly dissipative at the equilibrium $(\bar{\Sigma}, \bar{K})$.
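Thus, in the following, we only need to examine whether (33) is strictly dissipative with the respective stage cost. In this setting the two different stage cost functions under consideration, (15) and (16), can be simplified to

$$ \ell_{W_2}(\Sigma, K) = \tfrac12\big(\sqrt{\Sigma} - 1\big)^2 + \tfrac{\gamma}{2} K^2 \tag{34} $$

and

$$ \ell_{2F}(\Sigma, K) = \tfrac12(\Sigma - 1)^2 + \tfrac{\gamma}{2} K^2, \tag{35} $$

respectively. For these cost functions we state auxiliary results about optimal equilibria.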
Lemma 4.3. Let $(\Sigma^e, K^e)$ be an optimal equilibrium for one of the stage cost func-

For the stage costs considered here, parameters satisfying $\varsigma^2/2 - \theta = 0$ correspond to stabilizing MPC. In this case, the respective OCPs are strictly dissipative with storage function $\lambda \equiv 0$. Hence, that case is excluded in the ensuing analysis.
Before we present our new results for the $W_2$ stage cost (15) and the 2F stage cost (16), we briefly recall the results from [14] for the $L^2$ stage cost (13). The different cases in Lemma 4.3 were decisive for this analysis:
• For $\varsigma^2/2 - \theta > 0$, strict dissipativity cannot hold with a linear storage function.
• In the case $\varsigma^2/2 - \theta < 0$, strict dissipativity with a linear storage function has to be checked on a case-by-case basis.
• For both $\varsigma^2/2 - \theta > 0$ and $\varsigma^2/2 - \theta < 0$, a nonlinear storage function was constructed for which strict dissipativity holds for certain values of $\theta$ and $\varsigma$. However, the verification is tedious and must be done on a case-by-case basis.
• Numerical verification of the turnpike property suggests that strict dissipativity holds for many parameters for which the analytical verification is not (yet) possible.
As we will see, these cases will also play a role in the analysis in the next section.

5. Results on strict dissipativity. In Section 4 we simplified the OCP under consideration, (27), by finding an equivalent formulation (33), which is sufficient for analyzing dissipativity. This section is dedicated to the dissipativity analysis of the OCP (33) for the 2F cost (35) and the $W_2$ cost (34).
As mentioned in Section 4, we cannot directly apply the dissipativity results from [10,9] because our dynamics are not linear but bilinear. In particular, convexity of the stage cost $\ell$ does not necessarily carry over to the modified cost $\tilde{\ell}$, cf. Definition 3.3. Hence, we will perform an ad-hoc analysis for the three different stage costs, which includes a convexity analysis of the respective modified stage costs $\tilde{\ell}$ and, if $\tilde{\ell}$ is not convex, a closer look at stationary points and boundary values of $\tilde{\ell}$. With the structural insight we gain from these computations we are not only able to identify settings where strict dissipativity with a linear storage function can be shown, but also to provide alternative, nonlinear storage functions to prove strict dissipativity in settings where a linear storage function cannot be used for that purpose.

5.1. 2F cost. In this section we consider the OCP (33) with the 2F stage cost (35). In the one-dimensional case, this amounts to penalizing the quadratic deviation of the variance in addition to the control effort. Overall, the optimization problem in this section is given by

$$ \min_{K}\ \sum_{k=0}^{N-1} \tfrac12(\Sigma(k) - 1)^2 + \tfrac{\gamma}{2} K(k)^2 \quad \text{s.t. } \Sigma(k+1) = \Sigma(k) + T\big(\varsigma^2 - 2(\theta + K(k))\Sigma(k)\big),\ \Sigma(0) = \mathring{\Sigma}, \text{ and (26)}. $$

For the linear storage function $\lambda_l(z)$, the corresponding modified cost $\tilde{\ell}_{2F}(\Sigma, K)$, cf. (24), reads

$$ \tilde{\ell}_{2F}(\Sigma, K) = \tfrac12(\Sigma - 1)^2 + \tfrac{\gamma}{2} K^2 - \ell_{2F}(\Sigma^e, K^e) + \bar{\lambda} T\big(2(\theta + K)\Sigma - \varsigma^2\big). $$

Throughout this section, the pair $(\Sigma^e, K^e)$ denotes an optimal equilibrium, cf. Definition 3.1, i.e., a solution of

$$ \min_{(\Sigma, K)}\ \ell_{2F}(\Sigma, K) \quad \text{s.t. } 2(\theta + K)\Sigma - \varsigma^2 = 0. $$

The unique⁵ Lagrange multiplier $\bar{\lambda} \in \mathbb{R}$ is obtained from the associated Lagrange function

$$ L_{2F}(\Sigma, K, \lambda) = \tfrac12(\Sigma - 1)^2 + \tfrac{\gamma}{2} K^2 + \lambda T\big(2(\theta + K)\Sigma - \varsigma^2\big). \tag{40} $$

Note that we have not included state or control constraints in the Lagrange function. This is to keep the close connection to the modified cost $\tilde{\ell}$, cf. (31). From Lemma 4.3 we know that these constraints are always satisfied for optimal equilibria. A necessary condition for strict dissipativity at an equilibrium $(\Sigma^e, K^e)$ is that this equilibrium is the unique global minimum of the modified cost $\tilde{\ell}_{2F}(\Sigma, K)$. Other stationary points of $\tilde{\ell}_{2F}$ will therefore be of interest in the following, and for those we will have to check admissibility.
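We characterize these stationary points for a fixed $\bar{\lambda}$ next. To this end, we introduce the notation $Z := 2\bar{\lambda}T$, which we will use throughout this section. The gradient of $\tilde{\ell}_{2F}$ is then given by

$$ \nabla\tilde{\ell}_{2F}(\Sigma, K) = \begin{pmatrix} \Sigma - 1 + Z(\theta + K) \\ \gamma K + Z\Sigma \end{pmatrix} \tag{41} $$

and it holds that

$$ \nabla\tilde{\ell}_{2F}(\Sigma, K) = \nabla_{\Sigma,K} L_{2F}(\Sigma, K, \bar{\lambda}). \tag{42} $$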
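Lemma 5.1. For a fixed $\bar{\lambda} \in \mathbb{R}$ and thus $Z$, the stationary points of $\tilde{\ell}_{2F}$ are given by either

$$ \Sigma = \frac{\gamma(1 - Z\theta)}{\gamma - Z^2}, \qquad K = -\frac{Z(1 - Z\theta)}{\gamma - Z^2} \tag{43} $$

in case $\gamma - Z^2 \neq 0$, or

$$ \Sigma = -\frac{K}{\theta} \tag{44} $$

for arbitrary $K$ in case $\gamma - Z^2 = 0$.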
Proof. Solving $\partial_\Sigma \tilde{\ell}_{2F}(\Sigma, K) = 0$ for $\Sigma$ yields

$$ \Sigma = 1 - Z(\theta + K), \tag{45} $$

cf. (41). Plugging this into $\partial_K \tilde{\ell}_{2F}(\Sigma, K) = 0$ results in

$$ (\gamma - Z^2) K + Z(1 - Z\theta) = 0. \tag{46} $$

Assuming that $\gamma - Z^2 \neq 0$, one can solve for $K$, which results in the equation for $K$ in (43). Plugging this $K$ into (45) gives the equation for $\Sigma$ in (43). If $\gamma - Z^2 = 0$, then $Z \neq 0$ since $\gamma > 0$. Since $(\Sigma^e, K^e)$ is always a stationary point due to (42), we infer from (46) that $1 - Z\theta = 0$, i.e., $Z = 1/\theta$. In this case, from (45) we get (44) for arbitrary $K$.

As indicated in the proof of Lemma 5.1, the sign of $\gamma - Z^2$ is indeed crucial for the rest of this subsection:

$$ \nabla^2\tilde{\ell}_{2F}(\Sigma, K) = \begin{pmatrix} 1 & Z \\ Z & \gamma \end{pmatrix}. \tag{47} $$

Since the Hessian is constant, the necessary condition for (strict) dissipativity, namely that the optimal equilibrium is a (strict) global minimum of the modified cost $\tilde{\ell}_{2F}$, is also sufficient and thus equivalent. This requirement is met if and only if $\gamma - Z^2 > 0$, i.e., if the modified cost $\tilde{\ell}_{2F}$ is strongly convex. Hence, strict dissipativity with the linear storage function $\lambda_l(z)$ is equivalent to strong convexity of the modified cost $\tilde{\ell}_{2F}$. This is in contrast to the $L^2$ cost from [14], where the modified cost is not convex for sufficiently large $\Sigma$ and convexity of the modified cost is only sufficient for (strict) dissipativity. In fact, the 2F cost is similar to the linear setting considered in [9], where strict convexity of the stage cost is sufficient for strict dissipativity. The difference, of course, lies in the bilinear terms in the dynamics $f$, which cause nonzero off-diagonal entries in the Hessian (47). Because of this, convexity of the stage cost $\ell_{2F}$ does not necessarily carry over to the modified cost $\tilde{\ell}_{2F}$. Hence, we check the convexity of $\tilde{\ell}_{2F}$ directly. To this end, the decisive factor is the sign of $\gamma - Z^2$. Thus, in the following, we focus on finding sets of parameters for which a certain sign of $\gamma - Z^2$ can be guaranteed. Based on Lemma 4.3 we consider the two cases $\varsigma^2/2 - \theta > 0$ and $\varsigma^2/2 - \theta < 0$ separately.
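The convexity criterion discussed above can be probed numerically for given parameters. The following Python sketch is an illustration under explicit assumptions: it uses the scalar stage cost ½(Σ - 1)² + (γ/2)K² and the equilibrium relation Σ = ς²/(2(θ + K)), locates the cheapest equilibrium with θ + K > 0 via the roots of the stationarity polynomial of the reduced cost, recovers Z from γK + ZΣ = 0, and reports γ - Z² together with the eigenvalues of the matrix [[1, Z], [Z, γ]]. Any upper bound on K stemming from the Euler constraint is ignored, and the function name `convexity_gap` and the parameter values are hypothetical.

```python
import numpy as np

def convexity_gap(theta, sigma, gamma):
    """Return gamma - Z^2 and the Hessian eigenvalues at the cheapest equilibrium."""
    s = sigma**2 / 2
    # Stationarity of the reduced cost in x = theta + K:
    #   gamma*x^4 - gamma*theta*x^3 + s*x - s^2 = 0
    roots = np.roots([gamma, -gamma * theta, 0.0, s, -s**2])
    xs = [r.real for r in roots if abs(r.imag) < 1e-9 and r.real > 0]

    def reduced_cost(x):
        return 0.5 * (s / x - 1) ** 2 + 0.5 * gamma * (x - theta) ** 2

    x_e = min(xs, key=reduced_cost)       # cheapest admissible stationary point
    K_e, Sigma_e = x_e - theta, s / x_e
    Z = -gamma * K_e / Sigma_e            # from gamma*K + Z*Sigma = 0
    hessian = np.array([[1.0, Z], [Z, gamma]])
    return gamma - Z**2, np.linalg.eigvalsh(hessian)

# Hypothetical parameters with sigma^2/2 - theta > 0
gap, eigs = convexity_gap(theta=0.5, sigma=1.5, gamma=1.0)
print(gap, eigs)

# Hypothetical parameters with gamma = 1/theta^2 and 2*sigma^2 < theta
gap_special, _ = convexity_gap(theta=4.0, sigma=1.0, gamma=1.0 / 16.0)
print(gap_special)
```

A positive first return value indicates a strongly convex modified cost for the linear storage function; a value near zero signals the degenerate case in which only dissipativity, not strict dissipativity, can be expected.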
The case ς 2 /2 − θ > 0. In contrast to the L 2 cost, where for ς 2 /2 − θ > 0 (strict) dissipativity cannot hold with a linear storage function (see the summary at the end of Section 4 or [14, Prop. 7]), with the 2F cost strict dissipativity does hold with a linear storage function.
Proof. The assertion follows from the fact that for ς 2 2 − θ > 0 the Hessian ∇ 2˜ 2F (Σ, K) is positive definite. Indeed, in this case the function˜ 2F in (24) is strictly convex, which immediately implies the existence of a quadratic lower bound ∈ K ∞ in the dissipativity inequality (23). It is thus sufficient to prove the Hessian is positive definite, which holds if and only if γ − Z 2 > 0. To prove this, we need some information about the Lagrange multiplierλ, which we get by taking a closer look at the Lagrange function (40). Since (Σ e , K e ) is an optimal equilibrium, ∇L 2F (Σ e , K e ,λ) = 0. In particular, we can use the results of Lemma 5.1 due to (42).
The case ς 2 /2 − θ = 0 is of no particular interest as it corresponds to the case of stabilizing MPC, cf. Lemma 4.3. Therefore, the natural follow-up question is what happens in case of ς 2 /2 − θ < 0.
Without the restriction on $\gamma$, there is one problematic case, in which we indeed lose strict dissipativity due to $\gamma - Z^2 = 0$. According to Remark 3(b), for this to happen it is necessary that $\gamma = 1/\theta^2$. The following proposition deals with this special case.
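Proof. We first calculate the stationary points that are equilibria. To this end, we use the equilibrium state $\Sigma = \frac{\varsigma^2}{2(\theta + K)}$ (51) and plug this state into the cost function $\ell_{2F}$, i.e., $\ell_{2F}\bigl(\frac{\varsigma^2}{2(\theta + K)}, K\bigr) = \frac12 \bigl(\frac{\varsigma^2}{2(\theta + K)} - 1\bigr)^2 + \frac\gamma2 K^2 =: \hat\ell_{2F}(K)$ (52). Then we compute the stationary points of the reduced cost function $\hat\ell_{2F}(K)$ in the special case $\gamma = 1/\theta^2$: $K_{1,2} = -\frac\theta2 \pm \frac12\sqrt{\theta(\theta - 2\varsigma^2)}$, $K_3 = -\theta + \varsigma\sqrt{\theta/2}$, and $K_4 = -\theta - \varsigma\sqrt{\theta/2}$. Since $K_4$ violates the constraint $K > -\theta$, we ignore this solution. Moreover, we only care about real solutions. Therefore, we have three distinct solutions if and only if $2\varsigma^2 - \theta < 0$. Now we consider the three different cases in the Proposition.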
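(a) Let $2\varsigma^2 - \theta < 0$. Then the controls $K_1$, $K_2$, and $K_3$ satisfy (26) with $\Sigma$ as in (51). The respective cost is given by $\hat\ell_{2F}(K_1) = \hat\ell_{2F}(K_2) = \frac12 - \frac{\varsigma^2}{2\theta}$ and $\hat\ell_{2F}(K_3) = \bigl(\frac{\varsigma}{\sqrt{2\theta}} - 1\bigr)^2$. We can exclude a minimum of $\hat\ell_{2F}(K)$ on the boundary since $\hat\ell_{2F}(K) \to \infty$ for $K \searrow -\theta$ and for $K \to \infty$. Since $\hat\ell_{2F}(K_1) = \hat\ell_{2F}(K_2) < \hat\ell_{2F}(K_3)$, there are two optimal equilibria, characterized by $K_1$ and $K_2$. Thus, strict dissipativity is out of the question. However, we argue that dissipativity with $\lambda_l(z)$ does hold. For this, we show that $\gamma - Z^2 = 0$, i.e., that $\tilde\ell_{2F}(\Sigma, K)$ is convex but not strictly convex. With the corresponding states $\Sigma_i = \frac{\varsigma^2}{2(\theta + K_i)}$, $i = 1, 2$, a short calculation using $\gamma K_i + Z_i \Sigma_i = 0$ yields the associated Lagrange multipliers $Z_1 = \frac1\theta = Z_2$. In particular, we have $\gamma - Z_i^2 = 0$.

(b) For $2\varsigma^2 - \theta = 0$, we get the same result, i.e., dissipativity but not strict dissipativity.

(c) For $2\varsigma^2 - \theta > 0$, the equilibrium given by $K_3$ with $\Sigma_3 = \frac{\varsigma}{\sqrt{2\theta}}$ is the unique optimal equilibrium and an analogous calculation reveals that $\gamma - Z_3^2 > 0$, i.e., strong convexity of $\tilde\ell_{2F}$ and thus strict dissipativity. This concludes the proof.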
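The degenerate case $\gamma = 1/\theta^2$ can be explored numerically. The sketch below is illustrative only: it assumes the reduced cost $\hat\ell_{2F}(K) = \frac12\bigl(\frac{\varsigma^2}{2(\theta+K)} - 1\bigr)^2 + \frac\gamma2 K^2$ and the relation $Z = -\gamma K/\Sigma$ at an equilibrium, and it uses closed-form candidates obtained by factoring the stationarity condition of this assumed cost. Function names and parameter values are hypothetical.

```python
import math


def reduced_cost_2f(k, theta, gamma, varsigma):
    """Assumed reduced 2F cost along the equilibrium curve; returns (cost, Sigma)."""
    sigma = varsigma ** 2 / (2.0 * (theta + k))
    return 0.5 * (sigma - 1.0) ** 2 + 0.5 * gamma * k ** 2, sigma


def degenerate_candidates(theta, varsigma):
    """Admissible stationary points of the reduced cost for gamma = 1/theta^2.

    Returns a list of tuples (K, Sigma, cost, Z). The first entry is K_3;
    the pair K_1, K_2 is appended only if 2*varsigma^2 - theta < 0.
    """
    gamma = 1.0 / theta ** 2
    candidates = [-theta + varsigma * math.sqrt(theta / 2.0)]
    disc = theta * (theta - 2.0 * varsigma ** 2)
    if disc > 0.0:
        root = math.sqrt(disc)
        candidates += [(-theta + root) / 2.0, (-theta - root) / 2.0]
    result = []
    for k in candidates:
        cost, sigma = reduced_cost_2f(k, theta, gamma, varsigma)
        z = -gamma * k / sigma  # from gamma*K + Z*Sigma = 0
        result.append((k, sigma, cost, z))
    return result


if __name__ == "__main__":
    # Illustrative parameters with 2*varsigma^2 - theta < 0.
    for k, sigma, cost, z in degenerate_candidates(theta=4.0, varsigma=1.0):
        print(f"K={k:.6f}  Sigma={sigma:.6f}  cost={cost:.6f}  Z={z:.6f}")
```

For the illustrative values above, the two additional candidates attain the same cost, below that of the first one, and both give $Z = 1/\theta$, which is the situation described in case (a).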

Remark 4. Coinciding with the requirement on $\gamma$ in Proposition 2, the reduced cost $\hat\ell_{2F}(K)$ from (52) is convex if and only if $\gamma \geq 1/(4\varsigma^4)$, cf. Figure 2(b). However, as we will see in the subsequent section, in general, convexity of the reduced cost $\hat\ell_{2F}(K)$ does not transfer to the modified cost $\tilde\ell_{2F}(\Sigma, K)$.

We briefly summarize the case $\varsigma^2/2 - \theta < 0$. Instead of the case-by-case analysis that was required for the $L^2$ cost (see the summary at the end of Section 4), we have shown strict dissipativity provided that $\gamma > 1/(4\varsigma^4)$. Furthermore, we have identified cases in which strict dissipativity does not hold due to the existence of two optimal equilibria, which can only happen if $\gamma = 1/\theta^2$. Even for $\varsigma^2/2 - \theta < 0$ and $\gamma \leq 1/(4\varsigma^4)$, as long as $\gamma \neq 1/\theta^2$, our numerous simulations indicate that $\gamma - Z^2 > 0$. Thus, we conjecture that strict dissipativity (with a linear storage function) holds for the $2F$ cost provided that $\gamma \neq 1/\theta^2$. To prove this rigorously, one could solve $\nabla L_{2F}(\Sigma, K, \bar\lambda) = 0$ for arbitrary $\gamma > 0$. Ultimately, as (52) indicates, this requires finding the roots of a fourth-order polynomial. We refrain from carrying out this computation here for the sake of brevity.
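The convexity threshold in Remark 4 can be checked by a short computation. The following sketch assumes that the reduced cost has the form $\hat\ell_{2F}(K) = \frac12\bigl(\frac{\varsigma^2}{2(\theta+K)} - 1\bigr)^2 + \frac\gamma2 K^2$; the shorthands $u = \theta + K > 0$ and $a = \varsigma^2/2$ are introduced here only for illustration.

```latex
\begin{align*}
\hat\ell_{2F} &= \tfrac12\Bigl(\tfrac{a}{u} - 1\Bigr)^2 + \tfrac{\gamma}{2}(u - \theta)^2, \\
\frac{d^2 \hat\ell_{2F}}{du^2} &= \frac{3a^2}{u^4} - \frac{2a}{u^3} + \gamma, \\
\min_{u > 0}\Bigl(\frac{3a^2}{u^4} - \frac{2a}{u^3}\Bigr)
  &= -\frac{1}{16a^2} = -\frac{1}{4\varsigma^4}
  \quad \text{attained at } u = 2a .
\end{align*}
```

Under these assumptions, the second derivative is nonnegative for all $u > 0$ exactly when $\gamma \geq 1/(4\varsigma^4)$, in agreement with the stated condition.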
Modifications to the stage cost $\ell_{2F}$. In this part we propose two modifications to the stage cost $\ell_{2F}$ and discuss whether they facilitate the analysis. The first proposal is a scaling of the stage cost.
Remark 5. One could argue that, due to the forward Euler approximation (25), the dynamics are effectively scaled by the sampling time $T$, and this scaling should also be applied to the stage cost, i.e., use $T \cdot \ell_{2F}$ instead of $\ell_{2F}$. In that case, $T$ can be factored out of the Lagrange function (40), arriving at $T$ times the Lagrange function for the (unscaled) stage cost $\ell_{2F}$ and the continuous dynamics, cf. (9). The associated Lagrange multiplier $\bar\lambda_c$ is unique and independent of the sampling time $T$. Thus, while the Lagrange multiplier $\bar\lambda$ from (40) changes with $T$, the product $\bar\lambda T$ and the optimal solutions are independent of $T$. Since in this subsection only the product $\bar\lambda T$ is of relevance, we avoid scaling the stage cost.
The second proposal concerns the control cost $\frac\gamma2 K^2$ in the stage cost $\ell_{2F}$. When switching from linear to bilinear systems, it appears reasonable to replace the term penalizing the control effort, $K^2$, with $(\theta + K)^2$ in $\ell_{2F}(\Sigma, K)$ because this removes the discrepancy between the control term $K^2$ in the stage cost and the bilinear term $(\theta + K)\Sigma$ in the dynamics. Indeed, this considerably simplifies the analysis.
then (37) is strictly dissipative with the linear storage function $\lambda_l(z)$.
Note that (67)-(68) coincides with (43) in the case $\theta = 0$. Hence, in finding an optimal equilibrium, this is equivalent to setting $\theta$ to zero in the original stage cost. For $\theta = 0$ the requirements of Proposition 1 are met and thus the result of Proposition 4 is not surprising. Although the stage cost $\ell_{2F,\theta}(\Sigma, K)$ is much easier to handle, the price to pay is the loss of optimal equilibria with $\Sigma^e \in (0, 1)$: we can see from (68) that $\Sigma^e = 1 + \frac{Z^2}{\gamma - Z^2} > 1$ since $\gamma - Z^2 > 0$.
Summary. We summarize our results for the $2F$ cost in a similar form as for the $L^2$ cost at the end of Section 4:
• For $\varsigma^2/2 - \theta > 0$, strict dissipativity holds with a linear storage function.
We emphasize once more that for the $2F$ stage cost considered in this section, proving strict dissipativity with a linear storage function is equivalent to proving strict convexity of $\tilde\ell_{2F}(\Sigma, K)$. This is in contrast to the $L^2$ cost (13) considered in [14], where the modified cost was never convex, but for some parameters the OCP was nevertheless strictly dissipative with a linear storage function, cf. [14, Example 10]. In this sense, the $W^2$ cost considered in the following section is more similar to the $L^2$ cost than to the $2F$ cost.

5.2. $W^2$ cost. The $W^2$ cost is designed to measure the distance between two PDFs. In our case, it differs only slightly from the cost in the previous section: Instead of $(\Sigma - 1)^2$, the square roots of the current and the desired state are taken and a quadratic cost is imposed on their difference, i.e., $(\sqrt{\Sigma} - 1)^2$. In this one-dimensional case, this amounts to penalizing the difference in the standard deviation instead of in the variance. Surprisingly, this small difference changes the dissipativity analysis considerably.
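Overall, the optimization problem in this section is given by (69). As before, $(\Sigma^e, K^e)$ denotes an optimal equilibrium, i.e., a solution of the equilibrium problem (70), in which $\ell_{W^2}(\Sigma, K)$ is minimized over $(\Sigma, K)$ subject to $\Sigma = f(\Sigma, K)$. For the linear storage function $\lambda_l(z)$ the corresponding modified cost $\tilde\ell_{W^2}(\Sigma, K)$ is given by (71). Due to $\nabla(\Sigma - f(\Sigma, K)) \neq 0$ the Lagrange multiplier $\bar\lambda \in \mathbb{R}$ obtained from the associated Lagrange function (72) is unique. We proceed as in Subsection 5.1 and do not include state or control constraints in the Lagrange function. These constraints are satisfied for optimal equilibria, cf. Lemma 4.3, but have to be enforced for other stationary points of $\tilde\ell_{W^2}$, where, with $Z = 2\bar\lambda T$, the gradient reads $\nabla \tilde\ell_{W^2}(\Sigma, K) = \bigl(\frac12\bigl(1 - \frac{1}{\sqrt{\Sigma}}\bigr) + Z(\theta + K),\ \gamma K + Z\Sigma\bigr)$ (73). We count these stationary points next.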
Proposition 5. For a fixed $\bar\lambda$ and thus fixed $Z$ the modified cost $\tilde\ell_{W^2}(\Sigma, K)$ has at most two admissible stationary points. If $Z = 0$, then only one admissible stationary point of $\tilde\ell_{W^2}(\Sigma, K)$ exists and it is given by $(\Sigma^e, K^e) = (1, 0)$.
The result of Proposition 5 is in contrast to the $2F$ cost, cf. Lemma 5.1: Apart from the degenerate case $\gamma = 1/\theta^2$, in which infinitely many stationary points of $\tilde\ell_{2F}$ exist, $\tilde\ell_{2F}$ exhibits a unique stationary point for a fixed $Z$. However, the result of Proposition 5 coincides with the $L^2$ stage cost, cf. [14, Prop. 9]. Hence, concerning stationary points of the modified cost, the $W^2$ cost is more similar to the $L^2$ cost than to the $2F$ cost.
The similarity of the $W^2$ cost to the $L^2$ cost appears in the Hessian as well: For any fixed $Z \neq 0$, it is obvious from the Hessian that $\tilde\ell_{W^2}$ is not convex for sufficiently large $\Sigma$. This is in contrast to Subsection 5.1, where the constant Hessian considerably simplified the analysis. Of course, strong convexity of $\tilde\ell_{W^2}$ is only a sufficient condition for strict dissipativity. A necessary condition, however, is that the optimal equilibrium $(\Sigma^e, K^e)$ is the unique global minimum of the modified cost $\tilde\ell_{W^2}$. Hence, in the following, we will take a closer look at the structure of $\tilde\ell_{W^2}$. As in Subsection 5.1, we separate the two cases $\varsigma^2/2 - \theta > 0$ and $\varsigma^2/2 - \theta < 0$.
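The loss of convexity for large $\Sigma$ can be made concrete with a small computation. The following sketch assumes the stage cost $\ell_{W^2}(\Sigma, K) = \frac12(\sqrt{\Sigma} - 1)^2 + \frac\gamma2 K^2$ and a modified cost that adds the bilinear term $Z(\theta + K)\Sigma$, so that the Hessian has the entries $\frac14\Sigma^{-3/2}$, $Z$, and $\gamma$. These forms are reconstructed from the text, and the function names and parameter values are hypothetical.

```python
import math


def hessian_det_w2(sigma, z, gamma):
    """Determinant of the (assumed) Hessian [[Sigma^(-3/2)/4, Z], [Z, gamma]]."""
    return 0.25 * gamma * sigma ** -1.5 - z * z


def convexity_limit_sigma(z, gamma):
    """Largest Sigma for which the assumed Hessian is still positive semidefinite."""
    if z == 0.0:
        return math.inf  # without the bilinear coupling the cost is convex everywhere
    return (gamma / (4.0 * z * z)) ** (2.0 / 3.0)


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the paper.
    z, gamma = -0.2, 0.25
    limit = convexity_limit_sigma(z, gamma)
    print(limit, hessian_det_w2(0.9 * limit, z, gamma), hessian_det_w2(1.1 * limit, z, gamma))
```

Since the upper-left Hessian entry decays like $\Sigma^{-3/2}$ while the off-diagonal entry $Z$ stays fixed, the determinant eventually becomes negative whenever $Z \neq 0$.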
The case $\varsigma^2/2 - \theta > 0$. Similar to the $L^2$ cost and in contrast to the $2F$ cost, in the case of the $W^2$ cost (strict) dissipativity does not hold with a linear storage function for a large set of parameters.
Proof. The idea of the proof is to show that the modified cost $\tilde\ell_{W^2}$ can assume negative values, which violates (23). To this end, we first note the representation (77) of $\tilde\ell_{W^2}$. Next, we show that $Z < 0$. From the Lagrange function (72) we deduce $\partial_K L_{W^2}(\Sigma, K, \bar\lambda) = \gamma K + Z\Sigma$, recalling that $Z = 2\bar\lambda T$. Due to $\partial_K L_{W^2}(\Sigma^e, K^e, \bar\lambda) = 0$, we can exclude $Z = 0$: If $Z = 0$, then $K^e = 0$ and thus $\Sigma^e = 1$, cf. (73). But this contradicts (32) since $\varsigma^2/2 - \theta > 0$, i.e., $\frac{\varsigma^2}{2\theta} > 1$. Thus, we have $\Sigma^e = -\gamma K^e / Z$ and $K^e \neq 0$, which, together with Lemma 4.3, results in $K^e > 0$. Then due to $\gamma > 0$ and $\Sigma^e > 0$ we arrive at $Z < 0$.

Due to $Z < 0$, the term $(K + \theta)Z$ from (77) decreases as $K$ increases. Taking into account the control constraint (26), we consider the limiting case in which $(K + \theta)Z + \frac12$ approaches $\frac{Z}{2T} + \frac12$, which, due to $\Sigma \to \infty$, cf. (77), determines the sign of the limit of $\tilde\ell_{W^2}(\Sigma, K)$. Thus, if $\frac{Z}{2T} + \frac12 < 0$, then $\operatorname{sgn}\bigl((K + \theta)Z + \frac12\bigr) = -1$ for large enough admissible $K$. In this case, $(\Sigma^e, K^e)$ cannot be a global minimum, contradicting dissipativity. As for the $2F$ cost, cf. Remark 5, the product $\bar\lambda T$ and thus $Z$ is constant in $T$. Hence, due to $Z < 0$, one can always achieve $\frac{Z}{2T} + \frac12 < 0$ for small enough $T > 0$.
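The sampling-time condition at the end of the proof reduces to a one-line test. The sketch below is illustrative: it takes a value $Z < 0$ as given (in the paper $Z$ is determined from the optimal equilibrium) and checks the condition $\frac{Z}{2T} + \frac12 < 0$ stated above, which for $T > 0$ is equivalent to $T < -Z$. Function names and the numerical value of $Z$ are hypothetical.

```python
def asymptotic_slope(k, z, theta):
    """Coefficient (K + theta)*Z + 1/2 that governs the sign of the limit Sigma -> infinity."""
    return (k + theta) * z + 0.5


def linear_storage_fails_asymptotically(z, sampling_time):
    """True if Z/(2T) + 1/2 < 0, i.e. the modified cost becomes negative for large Sigma."""
    if sampling_time <= 0.0:
        raise ValueError("sampling time must be positive")
    return z / (2.0 * sampling_time) + 0.5 < 0.0


def critical_sampling_time(z):
    """For Z < 0, the condition Z/(2T) + 1/2 < 0 holds exactly for 0 < T < -Z."""
    if z >= 0.0:
        raise ValueError("this threshold is only meaningful for Z < 0")
    return -z


if __name__ == "__main__":
    z = -0.3  # illustrative value, not taken from the paper
    for t in (0.1, 0.3, 1.0):
        print(t, linear_storage_fails_asymptotically(z, t))
    print(critical_sampling_time(z))
```

This only captures the asymptotic obstruction for $\Sigma \to \infty$; a second stationary point with negative modified cost is a separate issue that does not depend on $T$.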
The result of Proposition 6 is very similar to the $L^2$ case, see the end of Section 4. For the $W^2$ cost, however, the statement depends on the sampling time $T > 0$. For instance, one can verify that the OCP (69) with parameters $\varsigma = 5$, $\theta = 2$, $\gamma = 1/4$, and $T = 1$ is indeed strictly dissipative with $\lambda_l(z)$. Of course, this does not mean that increasing the sampling time always helps. Consider the following example.
We want to construct the modified cost $\tilde\ell_{W^2}(\Sigma, K)$. First, we determine the optimal equilibrium $(\Sigma^e, K^e)$ and the corresponding Lagrange multiplier $\bar\lambda$. We formulate the Lagrange function associated with (70) and solve the problem numerically. Note from (73) and (76) that the interest is in $Z = 2\bar\lambda T$ rather than in $\bar\lambda$. In particular, the optimal equilibrium is independent of the sampling time $T$. With the resulting values, we can construct the modified cost $\tilde\ell_{W^2}(\Sigma, K)$, which is depicted in Figure 3. All pairs $(\Sigma, K)$ illustrated in this figure satisfy the constraints (26). The white area depicts negative values, i.e., pairs $(\Sigma, K)$ in which (23) is violated. Thus, (strict) dissipativity does not hold with a linear storage function.

Example 2 and the corresponding Figure 3 illustrate two reasons why strict dissipativity with $\lambda_l(z)$ does not hold in this example. The first is the asymptotic behavior for $\Sigma \to \infty$, which might be fixed for large enough sampling times $T$. The second reason is the second stationary state of $\tilde\ell_{W^2}$, cf. Proposition 5. In Example 2, it is given by $(\Sigma^s, K^s) \approx (2.6621866, 0.749609)$. It is important to see that the stationary points of $\tilde\ell_{W^2}(\Sigma, K)$ depend not on $T$ but on $Z$ (see (73)), and $Z$ is unaffected by a change in $T$ (since $\bar\lambda$ changes accordingly). Likewise, the modified cost $\tilde\ell_{W^2}(\Sigma, K)$ itself is unaffected by a change in $T$. Hence, the problem of a second stationary state attaining negative values persists independently of $T$.
Moreover, note that in Example 2, $\gamma$ is such that the reduced cost is strictly convex.$^8$

$^8$ One can show that $\hat\ell_{W^2}$ is strictly convex for $\gamma > \frac{5^5}{2^{16}\varsigma^4}$. However, as this fact is not crucial for the subsequent statements we refrain from giving a rigorous proof.
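The constant in footnote 8 can be made plausible by a direct computation, which is a sketch rather than a rigorous proof. It assumes the reduced cost $\hat\ell_{W^2}(K) = \frac12\bigl(\sqrt{\varsigma^2/(2(\theta+K))} - 1\bigr)^2 + \frac\gamma2 K^2$; the shorthands $u = \theta + K > 0$ and $s = \varsigma/\sqrt{2}$ are introduced here only for illustration.

```latex
\begin{align*}
\hat\ell_{W^2} &= \tfrac12\bigl(s\,u^{-1/2} - 1\bigr)^2 + \tfrac{\gamma}{2}(u - \theta)^2, \\
\frac{d^2 \hat\ell_{W^2}}{du^2} &= s^2 u^{-3} - \tfrac34\, s\, u^{-5/2} + \gamma, \\
\min_{u > 0}\bigl(s^2 u^{-3} - \tfrac34\, s\, u^{-5/2}\bigr)
  &= -\frac{5^5}{2^{18} s^4} = -\frac{5^5}{2^{16}\varsigma^4}
  \quad \text{attained at } \sqrt{u} = \tfrac{8}{5}\, s .
\end{align*}
```

Under these assumptions, the second derivative is positive for all $u > 0$ whenever $\gamma > 5^5/(2^{16}\varsigma^4)$, which matches the bound in the footnote.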
In short, the properties that were used for the $2F$ cost (see Propositions 1 and 2 and Remark 4) to guarantee strict dissipativity of (37) are not appropriate to prove strict dissipativity of (69). Instead, a case-by-case analysis is required if $\varsigma^2/2 - \theta > 0$.
The case $\varsigma^2/2 - \theta < 0$. If $\varsigma^2/2 - \theta < 0$, then as in the proof of Proposition 6 one can show that $Z > 0$. Hence, $\lim_{\Sigma \to \infty} \tilde\ell_{W^2}(\Sigma, K) = \infty$, cf. (77). Moreover, $\lim_{K \to \infty} \tilde\ell_{W^2}(\Sigma, K) = \infty$. However, the two boundaries $\Sigma \searrow 0$ and $K \searrow -\theta$ and the potential second stationary state from Proposition 5 need to be checked in order to verify strict dissipativity with a linear storage function. Hence, similar to the $L^2$ cost, a case-by-case analysis is required, as the following two examples demonstrate.
As in Example 2, we determine the optimal equilibrium $(\Sigma^e, K^e)$ and the associated $Z$ numerically. The reduced cost $\hat\ell_{W^2}$, cf. (86), is strictly convex, since $\frac{5^5}{2^{16}\varsigma^4} = \frac{5^5}{2^8 3^4} < \frac15 = \gamma$. Furthermore, the Hessian of the modified cost $\tilde\ell_{W^2}$ evaluated at $(\Sigma^e, K^e)$ is positive definite. The second stationary state $(\Sigma^s, K^s)$ from Proposition 5 is not an issue, since $\tilde\ell_{W^2}(\Sigma^s, K^s) \approx 9.2315 \cdot 10^{-6} > 0$. However, we face problems when looking at the boundaries $K = -\theta$ and $\Sigma = 0$: At the boundary $\Sigma = 0$, the modified cost $\tilde\ell_{W^2}(0, K)$ is minimal at $K = 0$. Analogously, at the boundary $K = -\theta$, the modified cost $\tilde\ell_{W^2}(\Sigma, -\theta)$ is minimal at $\Sigma = 1$. In total, we require that both of these boundary minima are positive. Otherwise, due to continuity of $\tilde\ell_{W^2}$, strict dissipativity with this storage function does not hold. Indeed, in this example, this requirement is violated, see Figure 4, and thus there is no strict dissipativity with $\lambda_l(z)$.
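The boundary check described in this example can be sketched in a few lines. This is an illustration under stated assumptions, not the computation used in the paper: it assumes $\ell_{W^2}(\Sigma, K) = \frac12(\sqrt{\Sigma} - 1)^2 + \frac\gamma2 K^2$, the equilibrium relation $\Sigma = \varsigma^2/(2(\theta + K))$, the modified cost $\tilde\ell_{W^2} = \ell_{W^2} - \ell_{W^2}(\Sigma^e, K^e) + Z((\theta + K)\Sigma - \varsigma^2/2)$, and a strictly convex reduced cost so that a ternary search finds the optimal equilibrium. All function names are hypothetical, and the value of $\theta$ below is a placeholder because the example's parameters are not restated here.

```python
import math


def stage_cost_w2(sigma, k, gamma):
    """Assumed W^2 stage cost."""
    return 0.5 * (math.sqrt(sigma) - 1.0) ** 2 + 0.5 * gamma * k * k


def optimal_equilibrium_w2(theta, gamma, varsigma, u_max=50.0, iters=200):
    """Ternary search over u = theta + K; assumes the reduced cost is strictly convex."""
    def reduced(u):
        return stage_cost_w2(varsigma ** 2 / (2.0 * u), u - theta, gamma)

    lo, hi = 1e-9, u_max
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if reduced(m1) < reduced(m2):
            hi = m2
        else:
            lo = m1
    u = 0.5 * (lo + hi)
    sigma, k = varsigma ** 2 / (2.0 * u), u - theta
    z = -gamma * k / sigma  # from gamma*K + Z*Sigma = 0
    return sigma, k, z


def modified_cost_w2(sigma, k, equilibrium, theta, gamma, varsigma):
    """Assumed modified cost for the linear storage function, written in terms of Z."""
    sigma_e, k_e, z = equilibrium
    return (stage_cost_w2(sigma, k, gamma) - stage_cost_w2(sigma_e, k_e, gamma)
            + z * ((theta + k) * sigma - 0.5 * varsigma ** 2))


def boundary_minima(equilibrium, theta, gamma, varsigma):
    """Minimum at Sigma = 0 (attained at K = 0) and at K = -theta (attained at Sigma = 1)."""
    at_sigma_zero = modified_cost_w2(0.0, 0.0, equilibrium, theta, gamma, varsigma)
    at_k_boundary = modified_cost_w2(1.0, -theta, equilibrium, theta, gamma, varsigma)
    return at_sigma_zero, at_k_boundary


if __name__ == "__main__":
    theta, gamma, varsigma = 1.0, 0.2, 0.75  # theta is a placeholder value
    eq = optimal_equilibrium_w2(theta, gamma, varsigma)
    print(eq, boundary_minima(eq, theta, gamma, varsigma))
```

If either of the two returned values is negative, the modified cost takes negative values near the corresponding boundary, so strict dissipativity with the linear storage function fails under these assumptions.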
Numerically, we identify the optimal equilibrium and the corresponding value for $Z$. We also determine the second stationary state of $\tilde\ell_{W^2}$ numerically: $(0.8642951, -1.4314914) =: (\Sigma^s, K^s)$.
Modifications to the stage cost $\ell_{W^2}$. In this part we discuss the two modifications to the stage cost $\ell_{W^2}$ that were introduced in Subsection 5.1.

Remark 6. For the $W^2$ cost, scaling the stage cost by a factor $T$ as mentioned in Remark 5 could help in verifying strict dissipativity using linear storage functions in the case of $\varsigma^2/2 - \theta > 0$, at least for some parameters. However, as mentioned in this subsection, this scaling does not help if a stationary point with a negative function value exists, since it exists independently of $T$. Hence, analogously to the $2F$ cost, we do not scale the stage cost $\ell_{W^2}(\Sigma, K)$ in this subsection.

Remark 7. Modifying the cost function $\ell_{W^2}$ by penalizing $(\theta + K)^2$ instead of $K^2$ does not guarantee strict dissipativity with a linear storage function: Since the modified cost function yields the same optimal equilibria as considering $\theta = 0$, in particular, $\varsigma^2/2 - \theta > 0$ holds. However, this property neither guarantees strict dissipativity$^9$ (in contrast to the $2F$ cost), cf. Example 2, nor does it rule out strict dissipativity (in contrast to the $L^2$ cost).

Figure 5. Modified cost $\tilde\ell_{W^2}(\Sigma, K)$ for Example 4 zoomed in (left) and zoomed out (right). The optimal equilibrium $(\Sigma^e, K^e)$ is illustrated by the orange circle. The white area on the right plot is due to control constraints (26).
A nonlinear storage function. Despite the similarity of the two cost functions $\ell_{W^2}$ and $\ell_{2F}$, the results are very different. In fact, regarding dissipativity with the linear storage function $\lambda_l(z)$, the Wasserstein cost $\ell_{W^2}$ has more in common with the $L^2$ cost considered in [14]. This includes that, when running numerical simulations, the MPC closed loop converges to the optimal equilibrium $(\Sigma^e, K^e)$, even for the parameters in Examples 2 and 3, see Figures 6 and 7. These figures indicate that the turnpike property holds even in cases where the linear storage function fails.
Due to the close relationship between dissipativity and the turnpike property, see Section 3, this strongly suggests that strict dissipativity does indeed hold, but with a nonlinear storage function. Thus, in the rest of this section, we revisit these examples with the nonlinear storage function $\lambda_s(z)$ from (100). The parameter $\alpha \in \mathbb{R}$ is chosen such that the optimal equilibrium $(\Sigma^e, K^e)$ is a stationary point of the new modified cost $\tilde\ell^s_{W^2}$. One notable advantage of $\lambda_s(z)$ over $\lambda_l(z)$ is the asymptotic behavior of the modified cost: While $\lim_{\Sigma \to \infty} \tilde\ell_{W^2}(\Sigma, K) = \operatorname{sgn}\bigl((K + \theta)Z + \frac12\bigr) \cdot \infty$ (102) depends on the value of $Z$, the nonlinear storage function $\lambda_s(z)$ yields $\tilde\ell^s_{W^2}(\Sigma, K) \to \infty$ for $\Sigma \to \infty$ or $K \to \infty$ irrespective of the value of $\alpha$. Thus, when looking for a suitable storage function $\lambda(z)$, the asymptotic behavior of $\lambda(z)$ should be compared to that of the cost $\ell(\Sigma, K)$.

$^9$ Neither does the condition $\varsigma^2/2 - \theta < 0$, see Example 3.
Ideally, the storage function can be chosen such that the Hessian $\nabla^2 \tilde\ell(\Sigma, K)$ is constant. Then one can avoid checking everything by hand, i.e., the boundary values and the stationary points of the modified cost function. Unfortunately, the Hessian $\nabla^2 \tilde\ell^s_{W^2}(\Sigma, K)$ is not constant. However, the level sets in Figure 8 clearly suggest that strict dissipativity holds for both Example 2 (left) and Example 3 (right). We take a closer look at both examples.

Figure 8. New modified cost $\tilde\ell^s_{W^2}(\Sigma, K)$ for Examples 2 (left) and 3 (right). The optimal equilibrium $(\Sigma^e, K^e)$ is illustrated by the orange circle. The white area on the right plot is due to the control constraints (26).
Summary. As before, we end our analysis by summarizing our main results in short form:
• For $\varsigma^2/2 - \theta > 0$ and small enough sampling times $T > 0$, strict dissipativity cannot hold with a linear storage function. For large enough $T > 0$ strict dissipativity may hold, but has to be checked on a case-by-case basis.
• In the case $\varsigma^2/2 - \theta < 0$, strict dissipativity with a linear storage function is independent of the sampling time $T$, but has to be checked on a case-by-case basis.
• For various values of $\theta$ and $\varsigma$ strict dissipativity holds with the nonlinear storage function (100). However, the verification is tedious and must be done on a case-by-case basis.
• Numerical verification of the turnpike property suggests that strict dissipativity holds for many parameters for which the analytical verification is not (yet) possible.
The above examples show that the $W^2$ cost is more difficult to manage than the $2F$ stage cost. The most striking difference is that positive definiteness of the Hessian $\nabla^2 \tilde\ell_{W^2}(\Sigma^e, K^e)$ is not sufficient for strict dissipativity with a linear storage function since $\nabla^2 \tilde\ell_{W^2}(\Sigma, K)$ is not constant. Although this property can be used to conclude local strict convexity in a neighborhood of $(\Sigma^e, K^e)$ (which implies strict dissipativity if state and control are constrained to that region), in general it will not yield global convexity. Another difficulty arises due to the second stationary state of $\tilde\ell_{W^2}$ (see Proposition 5), for which, on top of that, there is no analytic formula, as opposed to the $2F$ cost, cf. Lemma 5.1. Moreover, one needs to take into account the boundary, which was unnecessary for the $2F$ cost due to the constant Hessian $\nabla^2 \tilde\ell_{2F}(\Sigma, K)$, cf. (47). All in all, even though the $W^2$ cost $\ell_{W^2}(\Sigma, K)$ from (34) looks more similar to the $2F$ cost $\ell_{2F}(\Sigma, K)$ from (35) than to the $L^2$ cost, the $W^2$ cost behaves more like the $L^2$ cost than like the $2F$ cost when it comes to analyzing strict dissipativity.
In conclusion, the Wasserstein metric, which is in many respects very suitable for measuring distances between PDFs, does not allow for a simple analysis of strict dissipativity, although our results give a strong indication that strict dissipativity holds for many parameter values.

6. Conclusion. In this work we have analyzed whether a particular optimal control problem with bilinear dynamics connected to the Fokker-Planck equation is strictly dissipative. To this end, we have considered two cost functions: an often suggested Wasserstein cost, $\ell_{W^2}$, and a quadratic cost function commonly used in tracking objectives, $\ell_{2F}$. We have found that for the latter cost, a linear storage function can be used to prove strict dissipativity for a large parameter set. The linear storage function is convenient due to its similarity with the Lagrange function. However, we have demonstrated that it is unsuitable if the $W^2$ cost is used. To show that the optimal control problems are strictly dissipative in the $W^2$ case, we have introduced a class of nonlinear storage functions.
Regarding MPC, our results suggest that all considered costs (including the $L^2$ cost considered in [14]) work well, although a mathematically rigorous proof for large sets of parameters could only be achieved for the $2F$ cost. Unfortunately, this is the only cost that is not derived from a metric for general PDFs, and thus it is only applicable to the Gaussian setting. It will be an interesting question for further research to see whether it is possible to extend this cost and the associated strict dissipativity results beyond the Gaussian case.