Numerical Study of Polynomial Feedback Laws for a Bilinear Control Problem

An infinite-dimensional bilinear optimal control problem with infinite-time horizon is considered. The associated value function can be expanded in a Taylor series around the equilibrium; the series involves multilinear forms which are uniquely characterized by generalized Lyapunov equations. A numerical method for solving these equations is proposed. It is based on a generalization of the balanced truncation model reduction method and on techniques of tensor calculus, in order to attenuate the curse of dimensionality. Polynomial feedback laws are derived from the Taylor expansion and are numerically investigated for a control problem of the Fokker-Planck equation. Their efficiency is demonstrated for initial values which are sufficiently close to the equilibrium.


1. Introduction. In this article, we consider the following bilinear optimal control problem:

min_{u ∈ L²(0,∞;R^m)} J(u) := ∫₀^∞ (1/2)‖y(t)‖²_Y + (β/2)|u(t)|² dt, (1)

subject to: ẏ(t) = A y(t) + Σ_{j=1}^m (N_j y(t) + B_j) u_j(t), for t > 0, y(0) = y_0.

Here, V ⊂ Y ⊂ V* is a Gelfand triple of real Hilbert spaces and A : D(A) ⊂ Y → Y is the infinitesimal generator of an analytic C_0-semigroup e^{At} on Y. The precise conditions on B_j and N_j are given further below. The value function, denoted by V, associates with any initial condition y_0 the value of problem (1). In our previous work [9], we analysed polynomial feedback laws of the form

u_p(y)_j = −(1/β) DV_p(y)(N_j y + B_j), for all j = 1, ..., m,

resulting from a Taylor expansion V_p of the value function V around 0,

V_p(y) = Σ_{k=2}^p (1/k!) T_k(y, ..., y), (2)

where T_k : Y^k → R denotes a bounded multilinear form of order k. The multilinear form T_2 is determined by solving an algebraic Riccati equation. For k ≥ 3, the multilinear form T_k is characterized by a generalized operator Lyapunov equation of the form

Σ_{i=1}^k T_k(z_1, ..., z_{i−1}, A_Π z_i, z_{i+1}, ..., z_k) = R_k(z_1, ..., z_k), for all z_1, ..., z_k ∈ D(A). (3)

In this equation, the operator A_Π is associated with the linearized closed-loop system obtained from a Riccati-based stabilization approach, and the right-hand side R_k is computed by induction.
In the present contribution, we provide a detailed description of the numerical implementation of the feedback laws u_p and investigate their behavior in practice. A version of the Lyapunov equations (3) in a finite-dimensional space is obtained by discretizing the state equation with a finite-difference scheme, which preserves the bilinear structure of the system. The numerical realisation of the discretized Lyapunov equations is not straightforward, because of the curse of dimensionality: the size of the linear system to be solved grows exponentially with the dimension of the domain Ω and with the degree p of the Taylor expansion. We therefore propose to use a generalization of the balanced truncation model reduction method [6] to reduce the dimension of the state equation and of the Lyapunov equations. We also use a technique of [13] for solving the discretized and reduced Lyapunov equations.
The method is tested on an optimal control problem of the Fokker-Planck equation, in dimensions one and two. The impact of model reduction on the corresponding feedback laws is analysed. The efficiency of the feedback laws is also analysed; in particular, we investigate how much improvement can be obtained when using high-order feedback laws rather than Riccati-based feedback laws. At a theoretical level, the method is of local nature: the well-posedness of the closed-loop system associated with the feedback law u_p is only guaranteed for initial conditions close to the equilibrium [9]. We therefore investigate the influence of the distance of the initial condition to the origin on the success of the method. The influence of the cost parameter β is also investigated.
The concept of series expansion of the value function has inspired researchers in optimal control for a long time. We refer to [16] for a very useful survey on this topic. Numerical tests are mostly carried out in the context of ordinary differential equations for systems of significantly smaller order than those which typically arise from discretized infinite-dimensional systems, which are in the focus of the present work. Let us mention that an interesting and natural extension of the expansion method consists in computing the expansion in an array of reference points. This relates to the concept of the patchy technique [1,3]. Many additional references have been gathered in [9]. Important efforts have been made recently to develop new methods for the feedback control of partial differential equations. The present paper contributes to this developing field. For quadratic cost functionals and in the absence of additional constraints, the most noted and investigated technique consists in applying linear quadratic regulator theory after linearization of the state equation around a steady state, see for instance [4,20]. Most other techniques, and especially those involving the value function and the Hamilton-Jacobi-Bellman equation, rely on system reduction. In [2,17], dimension reduction of the system is based on proper orthogonal decomposition, while the results in [15] are based on the high-order approximation properties of spectral methods. In [14], the reduced basis method was used for open-loop control of fluid flow.
The structure of the article is as follows. In Section 2, we briefly recall the main theoretical results obtained in [9] and generalize them to the case of multiple inputs. In Section 3, we describe the bilinear control problem of the Fokker-Planck equation used for the numerical results. We provide in Section 4 a detailed description of our numerical approach for discretizing, reducing, and solving the Lyapunov equations. Numerical results are reported in Section 5.
2. Summary of the theoretical results. In this section, we recall the main theoretical results proved in [9]. In that previous paper, we worked with scalar-valued controls, while we now consider the multi-input control case u(t) ∈ R^m. The extension of our results to the case m > 1 is however straightforward. Throughout this section, we assume that the following assumptions are satisfied. (A1) The operator A can be associated with a bounded V-Y bilinear form a : V × V → R such that there exist λ ∈ R and δ > 0 satisfying a(v, v) ≥ δ‖v‖²_V − λ‖v‖²_Y for all v ∈ V. (A2) For all j = 1, ..., m, N_j ∈ L(V, Y) and N_j* ∈ L(V, Y). (A3) A regularity assumption on the operators B_j holds, formulated for A_0 := A − γI, with γ > 0 sufficiently large, in terms of the real interpolation space with indices 2 and 1/2; see [7, Proposition 6.1, Part II, Chapter 1]. (A4) A stabilizability assumption on the pair (A, B), with B := (B_1, ..., B_m), holds; it guarantees that the linearized system can be exponentially stabilized by feedback. In the approach developed in [9], we first characterize the multilinear forms T_2, T_3, ... of the Taylor approximation. The equations satisfied by the multilinear forms are obtained by successive differentiation of the Hamilton-Jacobi-Bellman equation stated below.
Proposition 1 (Proposition 9, [9]). Assume that there exists an open neighborhood Y_0 of the origin in Y such that the following two statements hold: 1. For all y_0 ∈ Y_0, problem (1) possesses a solution u which is right-continuous at time 0. 2. The value function is continuously differentiable on Y_0.
Then, for all y_0 ∈ D(A) ∩ Y_0,

DV(y_0)(A y_0) + (1/2)‖y_0‖²_Y − (1/(2β)) Σ_{j=1}^m (DV(y_0)(N_j y_0 + B_j))² = 0. (4)

Moreover, for all solutions u to problem (1) with initial condition y_0, if u is right-continuous at 0, then

u_j(0) = −(1/β) DV(y_0)(N_j y_0 + B_j), for all j = 1, ..., m. (5)

The first nontrivial term T_2 of the Taylor expansion is given by T_2(z_1, z_2) = ⟨Π z_1, z_2⟩, where Π is the unique nonnegative self-adjoint operator satisfying the following algebraic operator Riccati equation:

⟨A z_1, Π z_2⟩ + ⟨Π z_1, A z_2⟩ + ⟨z_1, z_2⟩_Y − (1/β) Σ_{j=1}^m ⟨Π z_1, B_j⟩⟨Π z_2, B_j⟩ = 0, for all z_1, z_2 ∈ D(A). (6)

It is obtained by differentiating equation (4) twice. It is well-known that the linearized closed-loop operator A_Π := A − (1/β) Σ_{j=1}^m B_j ⟨Π ·, B_j⟩ generates an exponentially stable semigroup on Y, thanks to assumption (A4). Further differentiation of the HJB equation allows us to characterize the multilinear forms T_3, T_4, ... as solutions to generalized Lyapunov equations of the form (3), whose right-hand sides are defined recursively. The precise structure of these equations is given in Theorem 2.1 below. In the definition of the right-hand sides of the generalized Lyapunov equations, we make use of a specific symmetrization technique, which we define now. For i and j ∈ N, consider the set S_{i,j} of permutations σ of {1, ..., i + j} satisfying σ(1) < ... < σ(i) and σ(i+1) < ... < σ(i+j). Let T be a multilinear form of order i + j. We denote by Sym_{i,j}(T) the multilinear form defined by

Sym_{i,j}(T)(z_1, ..., z_{i+j}) = (1/|S_{i,j}|) Σ_{σ ∈ S_{i,j}} T(z_{σ(1)}, ..., z_{σ(i+j)}).

Theorem 2.1 (Theorem 15, [9]). There exists a unique sequence of bounded symmetric multilinear forms (T_k)_{k≥2}, with T_k : Y^k → R, and a unique sequence of bounded multilinear forms (R_{j,k})_{k≥3}, j = 1, ..., m, such that for all k ≥ 3 and all (z_1, ..., z_k) ∈ D(A)^k, the generalized Lyapunov equation (3) holds, with right-hand sides (9a)-(9c) defined recursively from the lower-order forms T_2, ..., T_{k−1} and R_{j,k}, the operators N_j and B_j, and the symmetrization operators Sym_{i,j}. For all p ≥ 2, we define the polynomial approximation V_p by

V_p(y) = Σ_{k=2}^p (1/k!) T_k(y, ..., y), (10)

where the sequence (T_k)_{k≥2} is given by Theorem 2.1. We deduce from V_p the polynomial feedback law u_p : y ∈ V → R^m, defined by

u_p(y)_j = −(1/β) Σ_{k=2}^p (1/(k−1)!) T_k(N_j y + B_j, y, ..., y), for j = 1, ..., m.

Its form is suggested by (5) and (10). A justification of the differentiability of V_p and a formula for its derivative, used in the above expression, can be found in [9, Lemma 7].
We consider now the closed-loop system associated with the feedback law u_p:

ẏ(t) = A y(t) + Σ_{j=1}^m (N_j y(t) + B_j) u_p(y(t))_j, y(0) = y_0. (12)

For a given initial condition y_0, its solution is denoted by S(u_p, y_0). We also denote by U_p(y_0) the open-loop control defined by U_p(y_0; t) = u_p(S(u_p, y_0; t)), for a.e. t ≥ 0.
The following theorem states that for ‖y_0‖_Y small enough, the closed-loop system (12) has a unique solution and generates an open-loop control in L²(0,∞; R^m). The solution to the closed-loop system is obtained in the space W_∞ := {y ∈ L²(0,∞; V) | ẏ ∈ L²(0,∞; V*)}. Theorem 2.2 (Theorem 21 and Corollary 22, [9]). There exist two constants δ_0 > 0 and C > 0 such that for all y_0 with ‖y_0‖_Y ≤ δ_0, the closed-loop system (12) admits a unique solution S(u_p, y_0) ∈ W_∞, satisfying ‖S(u_p, y_0)‖_{W_∞} ≤ C‖y_0‖_Y; moreover, U_p(y_0) ∈ L²(0,∞; R^m).
Finally, the following theorem states that V_p is an approximation of V of order p + 1 in the neighborhood of 0 and gives an error estimate on the efficiency of the open-loop control generated by u_p. Theorem 2.3 (Proposition 2, Theorem 30, and Theorem 32, [9]). Let δ_0 be given by Theorem 2.2. There exist δ ∈ (0, δ_0] and a constant C > 0 such that for all y_0 ∈ Y with ‖y_0‖_Y ≤ δ, the following estimate holds: |V(y_0) − V_p(y_0)| ≤ C ‖y_0‖_Y^{p+1}. Moreover, for all y_0 ∈ Y with ‖y_0‖_Y ≤ δ, problem (1) with initial condition y_0 possesses a solution ū, which is approximated by the open-loop control U_p(y_0) at a rate of the same order in ‖y_0‖_Y. Remark 1. The constants δ_0, δ, and C involved in Theorems 2.2 and 2.3 depend on p. They also depend on the data of the problem. In particular, when β converges to 0, the algebraic Riccati equation (6) becomes degenerate and the operator norm of the right-hand sides of the Lyapunov equations (9a) possibly increases, because of the factor 1/(2β). Therefore, one can expect that the radius of convergence of the Taylor expansion and the constant δ_0 both converge to 0 as β converges to 0.
3. Fokker-Planck equation. We describe in this section a specific optimal control problem of the form (1) which we shall investigate numerically in Section 5.
3.1. Problem formulation. Following the setup discussed in [8], we consider the following controlled Fokker-Planck equation:

∂ρ/∂t (x, t) = ν Δρ(x, t) + ∇·(ρ(x, t) ∇V(x, t)) in Ω × (0, ∞),
(ν ∇ρ(x, t) + ρ(x, t) ∇V(x, t)) · n = 0 on Γ × (0, ∞),
ρ(x, 0) = ρ_0(x) in Ω, (15)

where Ω ⊂ R^d denotes a bounded domain with smooth boundary Γ. The Fokker-Planck equation models the evolution of the probability distribution of a very large set of particles. More precisely, ρ(·, t) is the probability density function of the random variable X_t, solution to the following stochastic differential equation:

dX_t = −∇V(X_t, t) dt + √(2ν) dW_t,

where (W_t)_{t≥0} is a Brownian motion and where the potential V is controlled by u in the following manner:

V(x, t) = G(x) + Σ_{j=1}^m α_j(x) u_j(t).

Each particle moves along the negative direction of the gradient of the potential V and is subject to random perturbations. When no control is used (i.e. u = 0), the potential V equals the ground potential G. The functions α_1, ..., α_m are called control shape functions. The reflecting boundary conditions model the fact that the particles are confined in Ω and ensure preservation of probability, i.e. ∫_Ω ρ(x, t) dx = ∫_Ω ρ_0(x) dx = 1, for all t ≥ 0.
We introduce now the stationary probability distribution ρ_∞, defined by ρ_∞(x) = e^{−G(x)/ν} / ∫_Ω e^{−G(x̃)/ν} dx̃. The solution ρ to (15) is known to converge to ρ_∞ when t → ∞. This convergence depends on ν and on the ground potential G and can be extremely slow. We therefore consider the following optimization problem:

min_{u ∈ L²(0,∞;R^m)} ∫₀^∞ (1/2)‖ρ(t) − ρ_∞‖²_{L²(Ω)} + (β/2)|u(t)|² dt, (16)

in order to speed up the convergence to ρ_∞.

3.2. Abstract formulation and projection.
As is discussed in detail in [8] (for the case m = 1), system (15) can be considered as an abstract bilinear control system of the form

ρ̇(t) = A ρ(t) + Σ_{j=1}^m N_j ρ(t) u_j(t), (17)

where the operators A and N_j are given by A ρ = ν Δρ + ∇·(ρ ∇G) and N_j ρ = ∇·(ρ ∇α_j), with the boundary conditions induced by (15), and where their L²(Ω)-adjoints are given by A* v = ν Δv − ∇G·∇v and N_j* v = −∇α_j·∇v, with homogeneous Neumann boundary conditions. Setting y = ρ − ρ_∞, (17) is equivalent to

ẏ(t) = A y(t) + Σ_{j=1}^m (N_j y(t) + B_j) u_j(t), where B_j := N_j ρ_∞. (20)

Denoting by 1 the constant function on Ω equal to 1, one can easily see that ⟨A y, 1⟩_{L²(Ω)} = 0 and ⟨N_j y, 1⟩_{L²(Ω)} = 0. The mass conservation property follows directly from this observation: ∫_Ω y(·, t) dx = ∫_Ω y(·, 0) dx, for all t ≥ 0. Consider the space Y_P := {y ∈ L²(Ω) | ∫_Ω y dx = 0}. The mass conservation property implies that y(t) does not converge to 0 if y(0) does not lie in Y_P. Therefore, condition (A4) is not satisfied if (20) is considered as a dynamical system in L²(Ω). Instead, it must be regarded as a dynamical system in Y_P. As detailed in [8], this can be done by first considering the projection P on 1^⊥ along ρ_∞, given by P y = y − (∫_Ω y dx) ρ_∞. Then, we have

ẏ(t) = A_P y(t) + Σ_{j=1}^m (N_{P,j} y(t) + B_{P,j}) u_j(t), (21)

where A_P := P A I_P, N_{P,j} := P N_j I_P, B_{P,j} := P B_j, and where I_P denotes the injection of Y_P into L²(Ω). Assumptions (A1)-(A4) are now satisfied for system (21), as proved in [9, Section 8].
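In a discretized setting, the projection P onto 1^⊥ along ρ_∞ takes a particularly simple form. The following minimal sketch (with a hypothetical grid and a stand-in stationary density; the discrete mass is h̄·1ᵀy) illustrates its two defining properties: the projected state has zero discrete mass, and ρ_∞ is annihilated.

```python
import numpy as np

# Discrete analogue of P y = y - (integral of y) * rho_inf (a sketch with
# illustrative data, not the discretization used in the paper).
n = 200
hbar = 12.0 / n                                # assumed mesh size on (-6, 6)
x = np.linspace(-6, 6, n)
rho_inf = np.exp(-x**2)                        # stand-in stationary density
rho_inf /= hbar * rho_inf.sum()                # normalization: hbar * 1^T rho_inf = 1

def P(y):
    """Project y onto the zero-mass subspace along the direction rho_inf."""
    return y - (hbar * y.sum()) * rho_inf

y = np.sin(x) + 0.3                            # arbitrary state
z = P(y)                                       # z has zero discrete mass
```

By construction P is idempotent, its kernel is spanned by ρ_∞, and its range is the discrete analogue of Y_P.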
4. Algorithmic approach. Our numerical implementation of the feedback laws is based on the following approach. We first discretize the Fokker-Planck equation with a finite-difference scheme, leading to a finite-dimensional bilinear optimal control problem. Because of the curse of dimensionality, the tensors T 2 , T 3 , . . . cannot be directly computed for the discretized problem. A reduction of the discretized model is therefore necessary. The Lyapunov equations (9) can then be solved using techniques from [13].

4.1.1. Discretization of the original state equation.
The spatial discretization is obtained with a finite-difference method. We use a uniform grid with n points. The discrete approximations of the individual operators are subsequently denoted with a subscript n. Due to the simpler structure of the boundary conditions for the operator A*, we employ a finite-difference scheme for A* rather than for A itself. Then, the transpose of the resulting matrix serves as a discrete approximation of A. It is denoted by A_n ∈ R^{n×n}. For the discretization of the advective term ∇G · ∇, an upwinding-like scheme which utilizes backward/forward differences based on the sign of the derivatives G_{x_i} is used. Since the sign of the controls is not known a priori, central differences are used for the discretization of N_j. The discretization of N_j is denoted by N_{j,n} ∈ R^{n×n}.
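A minimal one-dimensional sketch of this construction is given below. The mirror treatment of the boundary nodes and the omission of the advective term in the boundary rows are simplifying assumptions of this illustration, not details taken from the paper; the grid and potential are likewise illustrative.

```python
import numpy as np

def fokker_planck_matrix(G, nu, x):
    """Sketch of the finite-difference discretization of A described above:
    discretize A* v = nu * v'' - G' * v' (homogeneous Neumann conditions,
    upwind differences for the advective term), then transpose."""
    n, h = x.size, x[1] - x[0]
    Gx = np.gradient(G(x), h)          # derivative of the ground potential
    # central second difference for the Laplacian, mirrored ghost nodes
    L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1)) / h**2
    L[0, 1] += 1.0 / h**2
    L[-1, -2] += 1.0 / h**2
    # upwind differences for -G' * v', switching on the sign of G'
    D = np.zeros((n, n))
    for i in range(1, n - 1):          # boundary rows left empty in this sketch
        if Gx[i] > 0:                  # backward difference
            D[i, i], D[i, i - 1] = -Gx[i] / h, Gx[i] / h
        else:                          # forward difference
            D[i, i], D[i, i + 1] = Gx[i] / h, -Gx[i] / h
    Astar = nu * L + D
    return Astar.T                     # discrete approximation A_n of A
```

Since the rows of the discretized A* sum to zero (the adjoint annihilates constants), the columns of A_n sum to zero, which is the discrete counterpart of mass conservation.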
Remark 2. A finite-difference scheme has been used because it preserves the bilinear structure of the Fokker-Planck equation and therefore allows the computation of a reduced-order model and provides a natural way of discretizing the Riccati equation (6) and the Lyapunov equations (9). Other popular schemes for the discretization of the Fokker-Planck equation, like the Chang-Cooper algorithm [11] or semi-Lagrangian methods (see the detailed bibliography of [10]), have nice features (in particular, positivity preservation); however, they do not maintain the bilinear structure.
Remark 3. The central finite-difference scheme used for the operators N_j worked well in our simulations, even though, in principle, it might lead to numerical instabilities. The design of a scheme for discretizing N_j without prior knowledge of the properties of the control remains a challenging issue.
Since the operator A is known to have a real spectrum with ρ_∞ corresponding to the smallest eigenvalue (in magnitude) (see [8, Section 3]), a discretization ρ_{∞,n} ∈ R^n can be efficiently computed, even for large-scale problems, by an inverse iteration applied to A_n. Denoting by h_{x_i} the mesh size in the direction x_i and setting h̄ := ∏_{i=1}^d h_{x_i} and 1 := [1, ..., 1]^T ∈ R^n, we normalize the stationary distribution ρ_{∞,n} so that h̄ 1^T ρ_{∞,n} = 1. The initial probability distribution is also normalized: h̄ 1^T ρ_{n,0} = 1. Finally, we use B_{j,n} := N_{j,n} ρ_{∞,n} ∈ R^n for the discretization of B_j and set y_n := ρ_n − ρ_{∞,n}. All together, the spatially discretized problem reads:

min_{u ∈ L²(0,∞;R^m)} ∫₀^∞ (h̄/2) ‖y_n(t)‖² + (β/2)|u(t)|² dt, (22)

subject to: ẏ_n(t) = A_n y_n(t) + Σ_{j=1}^m (N_{j,n} y_n(t) + B_{j,n}) u_j(t), y_n(0) = ρ_{n,0} − ρ_{∞,n}. (23)

4.1.2. Discretization of the projected state equation. As explained in Section 3, the state equation must be regarded on a subspace Y_P of L²(Ω) to guarantee stabilizability. The underlying projection must be numerically implemented to allow an efficient resolution of the algebraic Riccati equation. We recall the main steps of the computation of the corresponding discretized and projected operators; details can be found in [8]. Consider a matrix R ∈ R^{n×n}, together with its inverse, whose first n − 1 columns build a basis of the orthogonal complement of the vector 1. After the corresponding state space transformation, we obtain a block system (24) in transformed variables (ỹ_n(t), z_n(t)) ∈ R^{n−1} × R.

Because of the normalization of ρ_{0,n} and ρ_{∞,n}, we have z_n(0) = 0. Moreover, the second block row in (24) is null; therefore z_n(t) = 0 for all t ≥ 0. Finally, we obtain the following equivalent formulation of problem (22)-(23):

min_{u ∈ L²(0,∞;R^m)} ∫₀^∞ (1/2) ‖C_n ỹ_n(t)‖² + (β/2)|u(t)|² dt, (25)

subject to: ỹ_n'(t) = Ã_n ỹ_n(t) + Σ_{j=1}^m (Ñ_{j,n} ỹ_n(t) + B̃_{j,n}) u_j(t), ỹ_n(0) = ỹ_{0,n}, (26)

where C_n = √h̄ R Q.
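The inverse iteration used above to compute ρ_{∞,n} can be sketched as follows. The shift, tolerance, and sign handling below are implementation choices of this illustration, not taken from the paper, and a dense inverse stands in for a sparse LU factorization.

```python
import numpy as np

def stationary_density(A_n, hbar, tol=1e-12, maxit=200):
    """Inverse iteration for the eigenvector of A_n associated with the
    eigenvalue of smallest magnitude, normalized so that hbar * 1^T rho = 1."""
    n = A_n.shape[0]
    # a tiny shift keeps the factorization nonsingular if 0 is an eigenvalue
    M = np.linalg.inv(A_n - 1e-10 * np.eye(n))   # reused in every iteration
    rho = np.ones(n) / n
    for _ in range(maxit):
        rho_new = M @ rho
        if rho_new.sum() < 0:                    # fix the sign: densities are >= 0
            rho_new = -rho_new
        rho_new /= np.linalg.norm(rho_new)
        if np.linalg.norm(rho_new - rho) < tol:
            rho = rho_new
            break
        rho = rho_new
    return rho / (hbar * rho.sum())
```

On a toy diagonal matrix with eigenvalues 0, −1, −2, the iteration converges in a few steps to the eigenvector of the zero eigenvalue.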

4.2. Computation of the feedback tensors. In theory, the polynomial feedback laws associated with the discretized problem (25)-(26) can be obtained by solving the algebraic Riccati equation and the generalized Lyapunov equations associated with the discretized operators A, N_1, ..., N_m, B (for simplicity, we omit the subscript n in this subsection). However, the generalized Lyapunov equation of order k, corresponding to the discretized system, is equivalent to a linear system of size (n − 1)^k. As a remedy, we propose to replace system (26) by a reduced-order model. We describe below our approach for reducing the discretized state equation and explain how to solve the corresponding reduced Lyapunov equations.

4.2.1. Model reduction. We construct a reduced-order model for (26) of the form

ẏ_r(t) = A_r y_r(t) + Σ_{j=1}^m N_{j,r} y_r(t) u_j(t) + B_r u(t), y_r(0) = y_{0,r}, (27)

where the matrices A_r, N_{j,r} ∈ R^{r×r}, B_r ∈ R^{r×m}, with r ≪ n − 1, are computed in such a way that, for some matrix C_r ∈ R^{n×r}, C_r y_r(t) ≈ C ỹ_n(t) for a range of controls u_j(t). Our construction is based on a known generalization of the method of balanced truncation for bilinear systems, see e.g. [6]. It has already been used in the context of the Fokker-Planck equation in [5]. Let us briefly summarize it. As in the case of linear systems, a reduced-order model is obtained as a truncation of a system that is balanced with respect to certain Gramians. In the bilinear case ([6]), reachability and observability of a bilinear system can be associated with the definiteness of the Gramians X and Y given as the solutions of the generalized Lyapunov equations

A X + X A^T + Σ_{j=1}^m N_j X N_j^T + B B^T = 0,  A^T Y + Y A + Σ_{j=1}^m N_j^T Y N_j + C^T C = 0.

Since an explicit computation of X and Y based on vectorization requires O(n^6) operations, we use a fixed-point iteration, as discussed in [12]. More precisely, we compute iterates X_{l+1} solving the standard Lyapunov equation

A X_{l+1} + X_{l+1} A^T = −B B^T − Σ_{j=1}^m N_j X_l N_j^T,

and stop when the relative residual falls below a prescribed tolerance ε. The same procedure is applied for computing Y. Once (approximations of) the Gramians X and Y have been computed, the steps for balancing and truncation are the same as in the linear case. Based on the product of the Cholesky factors S and R, respectively, of the Gramians X = S^T S and Y = R^T R, a singular value decomposition U Σ V^T = S R^T is computed. Using the best rank-r approximation of S R^T then yields the final reduced-order model via a Petrov-Galerkin projection, A_r = W_r^T A V_r, N_{j,r} = W_r^T N_j V_r, B_r = W_r^T B, C_r = C V_r, where V_r = S^T U_{(:,1:r)} Σ_{(1:r,1:r)}^{−1/2} and W_r = R^T V_{(:,1:r)} Σ_{(1:r,1:r)}^{−1/2}. The initial condition is obtained as follows: y_{0,r} = W_r^T ỹ_n(0). Some comments concerning the reduced-order modeling approach are in order.

Remark 4. In contrast to the linear case, the generalized method of balanced truncation does not exhibit an a priori error bound. In the next section, we therefore provide several comparisons between the original and the reduced model and the corresponding feedback laws. For applicability of model order reduction techniques, one typically assumes that the number of inputs and outputs is small. This is clearly not the case for C ∈ R^{n×(n−1)}. On the other hand, when at least the input space is finite-dimensional, analytic control systems are still known to have rapidly decaying singular values ([19]). System-theoretic model reduction techniques typically assume that the initial value is zero, i.e., ρ_{0,n} = ρ_{∞,n}. Obviously, this leads to a trivial stabilization problem for (26). For nonzero initial values, the initialization of the reduced-order model is not obvious and might potentially yield a significantly different transient response. While the projected initial condition y_{0,r} = W_r^T ỹ_n(0) might still lead to deviations between the original and the reduced-order model, we expect this effect to be comparably small, since for the theoretical results on the feedback law, the initial value is assumed to be close to the origin.
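The Gramian fixed-point iteration and the balancing steps summarized above can be sketched as follows. This is a dense small-scale illustration; the stopping rule (successive-iterate distance as a stand-in for the residual test) and the data in the usage example are assumptions.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

def bilinear_balanced_truncation(A, N_list, B, C, r, tol=1e-10, maxit=500):
    """Sketch of bilinear balanced truncation: Gramians via a fixed-point
    iteration over standard Lyapunov equations, then balancing by Cholesky
    factors and an SVD (cf. [6], [12])."""
    def gramian(A, N_list, B):
        X = solve_continuous_lyapunov(A, -B @ B.T)
        for _ in range(maxit):
            rhs = -B @ B.T - sum(N @ X @ N.T for N in N_list)
            X_new = solve_continuous_lyapunov(A, rhs)
            if np.linalg.norm(X_new - X) <= tol * np.linalg.norm(X_new):
                return X_new
            X = X_new
        return X
    X = gramian(A, N_list, B)                        # reachability Gramian
    Y = gramian(A.T, [N.T for N in N_list], C.T)     # observability Gramian
    S = cholesky(X)          # upper-triangular factor, X = S^T S
    R = cholesky(Y)          # upper-triangular factor, Y = R^T R
    U, sig, Vt = svd(S @ R.T)
    V_r = S.T @ U[:, :r] / np.sqrt(sig[:r])          # Petrov-Galerkin bases
    W_r = R.T @ Vt.T[:, :r] / np.sqrt(sig[:r])
    A_r = W_r.T @ A @ V_r
    N_r = [W_r.T @ N @ V_r for N in N_list]
    return A_r, N_r, W_r.T @ B, C @ V_r, sig
```

The returned singular values sig are the quantities whose decay is monitored in Section 5 to choose the reduced dimension r.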

4.2.2. Lyapunov equations. It is now possible to compute the higher-degree feedback laws associated with the following reduced problem:

min_{u ∈ L²(0,∞;R^m)} ∫₀^∞ (1/2) ‖C_r y_r(t)‖² + (β/2)|u(t)|² dt, (28)

subject to: ẏ_r(t) = A_r y_r(t) + Σ_{j=1}^m N_{j,r} y_r(t) u_j(t) + B_r u(t), y_r(0) = y_{0,r}.

Note that the above problem has a slightly different structure from problem (1), because of the operator C_r; however, only the algebraic Riccati equation has to be modified. It reads:

A_r^T Π_r + Π_r A_r − (1/β) Π_r B_r B_r^T Π_r + C_r^T C_r = 0.

We set A_{Π,r} := A_r − (1/β) B_r B_r^T Π_r. For solving the generalized Lyapunov equations, we represent any multilinear form S : (R^r)^k → R by an array in R^{r×...×r}. The associated vectorization is denoted by vec(S) ∈ R^{r^k}. The generalized Lyapunov equation of order k corresponding to (28) can be formulated as a tensor-structured linear system:

A_{k,r} vec(T_{k,r}) = vec(R_{k,j,r}),  with A_{k,r} := Σ_{i=1}^k I ⊗ ... ⊗ I ⊗ A_{Π,r}^T ⊗ I ⊗ ... ⊗ I (A_{Π,r}^T in the i-th factor), (29)

where ⊗ is the Kronecker product and where R_{k,j,r} is computed with (9b)-(9c).
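For k = 3, the correspondence between the multilinear formulation and the Kronecker-structured matrix can be checked directly. The following self-contained illustration uses random data and row-major vectorization; it is a consistency check, not part of the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
r = 4
A_pi = rng.standard_normal((r, r))       # stands in for A_{Pi,r}
T = rng.standard_normal((r, r, r))       # coefficient tensor of a 3-form

# left-hand side of the generalized Lyapunov equation: sum over i of
# T(z_1, ..., A_pi z_i, ..., z_k), i.e. a mode-i product with A_pi^T
lhs = (np.einsum('ia,ajk->ijk', A_pi.T, T)
       + np.einsum('ja,iak->ijk', A_pi.T, T)
       + np.einsum('ka,ija->ijk', A_pi.T, T))

# the same operation after vectorization: a sum of Kronecker products
I = np.eye(r)
A_3 = (np.kron(np.kron(A_pi.T, I), I)
       + np.kron(np.kron(I, A_pi.T), I)
       + np.kron(np.kron(I, I), A_pi.T))
assert np.allclose(A_3 @ T.ravel(), lhs.ravel())
```

Solving the order-k equation directly would thus amount to one dense solve with the r^k × r^k matrix A_{k,r}, which motivates the quadrature approach of Remark 5.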

Remark 5. Because of the symmetrization operations involved in (9c), the term R_{k,j,r} must be computed as a sum of (k − 1) + Σ_{i=2}^{k−2} (k choose i) ≈ 2^k terms. Note that an explicit computation of the inverse of A_{k,r} requires O(r^{3k}) operations, which would be infeasible even for moderate reduced dimensions r. However, the specific tensor structure allows us to approximate the solution to (29) by a quadrature formula. The method is described and analyzed in [13]. The main idea consists in combining an explicit integral representation of the inverse A_{k,r}^{−1} (namely, −A_{k,r}^{−1} = ∫₀^∞ e^{t A_{k,r}} dt, since A_{Π,r} is stable) with a separability property of the matrix exponential of tensor-structured matrices. The obtained approximation of A_{k,r}^{−1} takes the form

A_{k,r}^{−1} ≈ −Σ_{s=1}^l ω_s e^{t_s A_{Π,r}^T} ⊗ ... ⊗ e^{t_s A_{Π,r}^T} (k factors).

We refer to [13] for the choice of the weights ω_s and points t_s. For our numerical simulations, we have used l = 50, leading to a sufficiently accurate approximation.
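The exponential-sum approximation of Remark 5 can be illustrated as follows (k = 3, dense random data). For simplicity, this sketch uses a plain trapezoidal rule after the substitution t = e^s instead of the quadrature of [13]; the stable test matrix is an assumption, and the Kronecker-product exponential is applied through reshaping rather than being formed explicitly.

```python
import numpy as np
from scipy.linalg import expm

r = 4
rng = np.random.default_rng(1)
A = -3.0 * np.eye(r) + 0.1 * rng.standard_normal((r, r))  # stable stand-in for A_{Pi,r}
b = rng.standard_normal(r**3)                             # vectorized right-hand side

# -A_3^{-1} = int_0^inf exp(t A_3) dt and exp(t A_3) = E(t) (x) E(t) (x) E(t)
# with E(t) = exp(t A^T); nodes from the substitution t = e^s
h, s_nodes = 0.05, np.arange(-12.0, 5.0, 0.05)
x = np.zeros(r**3)
for s in s_nodes:
    w, E = h * np.exp(s), expm(np.exp(s) * A.T)
    Tb = b.reshape(r, r, r)            # apply E (x) E (x) E without forming it
    Tb = np.einsum('ia,ajk->ijk', E, Tb)
    Tb = np.einsum('ja,iak->ijk', E, Tb)
    Tb = np.einsum('ka,ija->ijk', E, Tb)
    x -= w * Tb.ravel()                # minus sign from -int exp(t A_3) dt

# reference: dense solve with the Kronecker-sum matrix
I = np.eye(r)
A_3 = (np.kron(np.kron(A.T, I), I) + np.kron(np.kron(I, A.T), I)
       + np.kron(np.kron(I, I), A.T))
x_ref = np.linalg.solve(A_3, b)
```

With these nodes, x agrees with the direct solve x_ref to a small relative error, while only r × r exponentials are ever formed.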
Remark 6. Let us emphasize that the model reduction step does not entirely resolve the curse of dimensionality, since the cost of computing the feedback tensor T_{k,r} still grows exponentially with k. The use of low-rank tensor formats would possibly allow an increase of the degree of the polynomial approximation of the feedback law. However, this would introduce a further approximation error. Moreover, in our numerical examples, we obtained sufficiently accurate approximations of the optimal control, and thus we refrain from a more detailed discussion of tensor calculus.
Once the feedback tensors T_{k,r} have been computed up to a degree p, we arrive at the following reduced closed-loop system:

ẏ_r(t) = A_r y_r(t) + Σ_{j=1}^m N_{j,r} y_r(t) u_p(y_r(t))_j + B_r u_p(y_r(t)), y_r(0) = y_{0,r}, (30)

where the reduced feedback law u_p is given by:

u_p(y_r)_j = −(1/β) Σ_{k=2}^p (1/(k−1)!) T_{k,r}(N_{j,r} y_r + B_{j,r}, y_r, ..., y_r).
If ‖y_{0,r}‖_{R^r} is sufficiently small, then (30) is well-posed and the feedback law generates a control u_p ∈ L²(0,∞; R^m), given by u_p(t) = u_p(y_r(t)).
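Evaluating the polynomial feedback law amounts to repeated tensor contractions. The following single-input (m = 1) sketch assumes T_list holds the symmetric coefficient arrays of T_{2,r}, ..., T_{p,r}; names and data layout are illustrative.

```python
import math
import numpy as np

def poly_feedback(y, T_list, N, b, beta):
    """u_p(y) = -(1/beta) * sum_k 1/(k-1)! * T_k(N y + b, y, ..., y),
    with T_list[k-2] the order-k coefficient array (m = 1 sketch)."""
    d = N @ y + b                        # control direction N_r y_r + B_r
    u = 0.0
    for k, T in enumerate(T_list, start=2):
        v = T
        for _ in range(k - 1):           # contract k-1 trailing modes with y
            v = v @ y
        u += (v @ d) / math.factorial(k - 1)
    return -u / beta

# toy check with p = 2 and T_2 = I: u = -(1/beta) * (N y + b) . y
y = np.array([3.0, 4.0])
u = poly_feedback(y, [np.eye(2)], np.zeros((2, 2)), np.array([1.0, 0.0]), 1.0)
# u == -3.0
```

For symmetric tensors the order of the contractions is irrelevant, so the control direction can be contracted last, as done here.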
In the numerical results below, once the control u_p has been computed (by solving the reduced closed-loop system (30)), its efficiency is tested with the discretized system (23), that is to say, by solving:

ẏ_n(t) = A_n y_n(t) + Σ_{j=1}^m N_{j,n} y_n(t) u_p(t)_j + B_n u_p(t), y_n(0) = y_{0,n}.
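Solving the reduced closed-loop system is then a standard ODE integration. The following self-contained sketch closes the loop for p = 2 and m = 1 with purely illustrative matrices (T2 standing in for the Riccati-based tensor); it is not the reduced model of the paper.

```python
import numpy as np
from scipy.integrate import solve_ivp

r, beta = 3, 1.0
A_r = -np.eye(r)                   # illustrative reduced matrices (assumed)
N_r = 0.1 * np.eye(r)
b_r = np.ones(r)                   # single input: B_r as a vector
T2 = np.eye(r)                     # assumed feedback tensor of order 2

def feedback(y):                   # u = -(1/beta) * T2(N y + b, y)
    return -((N_r @ y + b_r) @ (T2 @ y)) / beta

def rhs(t, y):
    u = feedback(y)
    return A_r @ y + (N_r @ y + b_r) * u

sol = solve_ivp(rhs, (0.0, 20.0), 0.1 * np.ones(r), rtol=1e-8, atol=1e-10)
```

For a small initial state the trajectory decays to the origin, mirroring the local stabilization result of Theorem 2.2.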
5. Numerical results. We report on numerical tests in dimensions one and two. The main discussion focuses on the one-dimensional example, while the two-dimensional example illustrates the applicability of the method for larger dynamical systems. All simulations were done on an Intel Xeon CPU E31270 @ 3. For solving the algebraic Riccati equation, we use the MATLAB routine care. The matrix exponential involved in the approximation formula for A_{k,r}^{−1} is implemented with the routine expm.

5.1. One-dimensional example. The first example that we consider is of the form (15) with d = 1, m = 1, ν = 1, and Ω = (−6, 6). The ground potential G that we use is represented in Figure 1a and the corresponding probability distribution is shown in Figure 1b. The potential G has three local minima, reached at x_1, x_2, and x_3, and two local maxima, reached at y_1 ≈ −2.24 and y_2 ≈ 2.43. The global minimum is reached at x_1. The activation energy Q, defined as the highest potential barrier that a particle has to overcome to reach the most stable equilibrium x_1, is approximately Q = G(y_1) − G(x_2) ≈ 1.11. For small values of ν, the rate of convergence of the uncontrolled system is approximately C e^{−Q/ν}, where C is a constant (see [18, Section 2]).
The control shape function α : Ω → R is chosen such that ∇α · n = 0 on Γ. It is constructed by (twice continuously differentiable) Hermite interpolation on the intervals (−5.9, −5.8) and (5.8, 5.9). The control u(t) is scalar-valued and allows one to interact with the system by tilting one half of the ground potential while raising the other. Our numerical tests are guided by the following three issues. First, we show the effect of model reduction on the corresponding feedback laws. Then, we investigate the convergence of the controls generated by the different polynomial feedback laws towards the optimal control, as the order p increases. Finally, we study the influence of the initial condition and of the value of β on the efficiency and the convergence of these controls. The last item relates to the local behavior of the method. 5.1.1. Reduced vs. original model. As described in Subsection 4.2.1, we rely on a reduced-order model for the computation of the feedback laws. We expect the reduced-order model to faithfully replicate the original dynamics. Due to the absence of a rigorous error bound, we provide numerical results for some of our test cases. For this purpose, we first compare the controls obtained with the original model (for a finite-difference discretization with n = 100) with the controls obtained with a reduced model of dimension r = 25. The dimension of the reduced model is determined by neglecting states corresponding to singular values of the product of the generalized Gramians whose magnitude is smaller than 10^{−6}. Two different initial conditions are considered. The first one is represented in Figure 2a and results from a random perturbation of the stationary distribution. The second one, represented in Figure 3a, models a set of particles located close to x = 0, in the second well of the potential G. For both situations, we chose β = 10^{−4}.
Due to the size of the original model, we are only able to compute the first three feedback laws when no model reduction is applied. The controls obtained with the original model and the reduced model are shown in Figures 2b and 3b. In both cases, the control obtained with the reduced model accurately replicates the one obtained with the original model. The only visible deviation occurs at the beginning of the simulation, for the first initial condition. In view of the nonzero initial condition and Remark 4, this is to be expected. Since the reduction from n = 100 to r = 25 is only moderate, we investigate further values of n and r. Figure 4a shows the decay of the singular values of the product of the Gramians for an original model of dimension n = 1000. We include the thresholds for relative magnitudes smaller than 10^{−3} and 10^{−6}. Let us emphasize that, in contrast to n = 100, the relative accuracy of 10^{−6} is already obtained for r = 21 rather than r = 25. For larger values of n, this threshold remains constant at r = 21. Figure 4b shows a comparison between the controls obtained for reduced-order models of dimension r = 21 (ε = 10^{−6}) and r = 9 (ε = 10^{−3}). Two comments are in order: a) comparing Figure 3b with Figure 4b, the control laws are visually (almost) indistinguishable; b) the first singular values remain approximately the same for discretizations with a larger value of n.

5.1.2. Convergence of higher-order feedback laws. We investigate in this subsection the behavior of the controls u_p derived from the feedback laws u_p for large values of p. More precisely, we investigate the convergence of u_p towards the solution u_opt of the problem, as p increases. The method used for computing u_opt is described below. Two different initial conditions are tested. The first one, represented in Figure 5a, is a random perturbation of the stationary distribution. The second one is the uniform distribution on Ω. We use β = 10^{−4} and choose n = 1000 for the discretization. The original model is reduced to r = 9 (ε = 10^{−3}), so that the first seven feedback laws can be computed. The obtained controls are shown in Figures 5b and 6b, respectively. As can be observed in the case of the randomly perturbed initial condition, the Riccati-based feedback law differs significantly from all higher-order feedback laws. Let us emphasize that the bilinear term characterized by the operator N does not influence the computation of the first feedback tensor T_2. This potentially explains the strong deviations between u_2 and all other controls. We further see that the higher-order control laws quickly approach the optimal control u_opt. A clear deviation between the Riccati-based feedback law u_2 and all higher-order controls can also be observed for the case of a uniform initial condition. The convergence, however, appears to be slightly slower than in the first case, as indicated by a deviation of u_3 and u_4 from the other controls. This might be due to the different initial condition, which is further away from the stationary distribution, i.e., y_0 is further away from the origin. Solving the open-loop problem. An approximation of the solution of the problem, denoted by u_opt, is obtained by solving

min_{u ∈ L²(0,T)} J_r(u) := ∫₀^T (1/2) ‖C_r y_r(t)‖² + (β/2)|u(t)|² dt, (32)

where y_r is the solution to the reduced-order model (27) and where T = 20.
To this purpose, we use a gradient-descent algorithm: u_{k+1} = u_k − t_k ∇J_r(u_k), where t_k is computed with Armijo's stepsize rule: t_k = C θ^{i_k}, with i_k the smallest integer i ≥ 0 such that J_r(u_k − C θ^i ∇J_r(u_k)) ≤ J_r(u_k) − σ C θ^i ‖∇J_r(u_k)‖²_{L²(0,T)}, with C = 500, θ = 0.7, and σ = 0.05. The stopping criterion ‖∇J_r(u_k)‖_{L²(0,T)} ≤ δ is used, with δ = 3 · 10^{−4}. Note that the control provided by such a numerical method may only be an approximation of a local solution to the problem. Even though the gradient-descent algorithm is rather slow, it has the advantage, in the current framework, of being easy to implement and robust. Since the focus of our study is the computation and the analysis of feedback laws, more sophisticated methods for solving (32) have not been considered. Let us mention that variants of J_r incorporating a penalty term on the final state y_r(T) provide extremely similar solutions, since the chosen value of T is large.
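The descent loop with Armijo's rule can be sketched generically as follows. Here J and gradJ stand in for the reduced cost and its adjoint-based gradient; only the parameter values C, θ, σ, and the stopping tolerance are taken from the text, while the toy quadratic in the usage example is purely illustrative.

```python
import numpy as np

def armijo_descent(J, gradJ, u0, C=500.0, theta=0.7, sigma=0.05,
                   tol=3e-4, maxit=5000):
    """Gradient descent u_{k+1} = u_k - t_k grad J(u_k) with Armijo
    backtracking: t_k = C * theta^i for the smallest admissible i."""
    u = u0
    for _ in range(maxit):
        g = gradJ(u)
        ng2 = np.dot(g, g)
        if np.sqrt(ng2) <= tol:          # stopping criterion on the gradient
            break
        t = C
        # backtrack until the sufficient-decrease condition holds
        while J(u - t * g) > J(u) - sigma * t * ng2:
            t *= theta
        u = u - t * g
    return u

# usage on a toy quadratic (stand-in for J_r)
Q = np.diag([1.0, 10.0])
J = lambda u: 0.5 * u @ Q @ u
gradJ = lambda u: Q @ u
u_star = armijo_descent(J, gradJ, np.array([1.0, 1.0]))
```

For a quadratic cost the iterates converge to the minimizer; in the setting of (32), gradJ would be evaluated by one forward and one adjoint solve of the reduced model.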

5.1.3. Influence of the initial value and the control costs. The efficiency of the polynomial feedback laws is only guaranteed in a neighborhood of the origin (i.e. for ρ_0 sufficiently close to ρ_∞, in the context of the Fokker-Planck equation). The size of the neighborhood may decrease for small values of β, as explained in Remark 1.
In this subsection, we investigate the efficiency and the convergence of the controls u_p for three different initial conditions, which are respectively close, rather close, and far from the stationary distribution. Different values of β are tested. For the discretization and for the reduction of the model, the values n = 1000 and r = 21 (ε = 10^{−6}) are used. The integral (25) is reduced to the interval (0, T), with T = 20, for the evaluation of the cost function. Test case 1: uniform initial condition. Table 1 provides results for a uniform initial condition. This initial condition can be regarded as very close to the stationary distribution, since the cost of the uncontrolled system, J(u = 0) = 0.045, is small. We have ‖ρ_0 − ρ_∞‖_{L²(Ω)} = 0.24. In such a situation, the control u_2 generated by the feedback law u_2 is almost optimal, as can be seen in Table 1a. For p = 6, the L²-distance of u_p to u_opt is approximately 7 times smaller than for p = 2, for the three considered values of β. Test case 2: centered initial distribution. In this second test case, we consider an initial condition modeling a set of particles located around the origin. The results are shown on page 575 in Figure 7 and Table 2. In order to reach the stationary distribution, a large proportion of the particles has to overcome the barrier of the reference potential G located at y_1 ≈ −2.24. The optimal control takes positive values, in order to lower the barrier by tilting the potential on the left side. The cost associated with the uncontrolled system is now significantly larger than in the first test case: J(u = 0) = 0.174. The L²-distance to the equilibrium is also larger: ‖ρ_0 − ρ_∞‖_{L²(Ω)} = 0.57. For all the considered values of β, a substantial reduction of ‖u_p − u_opt‖_{L²(0,T)} is observed when the order of the feedback law increases. For p = 6, the L²-distance is at least 10 times smaller than for p = 2.
Convergence is achieved for values of β larger than 5·10⁻⁵, but is not observed for β = 10⁻⁵, as shown in Figure 7d. For this intermediate initial condition, the convergence of the controls as well as a substantial reduction of the costs can be observed, at least for the smallest values of β. For values of β larger than 5·10⁻⁴, the controls u_2, ..., u_6 are all almost optimal, while for β ranging from 10⁻⁴ to 5·10⁻⁵, a significant difference between u_2 and u_3 is observed. For β = 10⁻⁵, the costs of u_4 are roughly half those of u_2 and u_3.
Test case 3: right-sided initial distribution. A third test case is presented in Figure 8 and Table 3 (page 576), where the set of particles is assumed to be located in the third potential well. This initial configuration appears to be more challenging than the two others, since a large proportion of the set of particles now has to overcome two barriers of the reference potential, a first one at y_2 ≈ 2.43 and a second one at y_1 ≈ −2.24. The cost of the uncontrolled system is J(u = 0) = 0.865 and the L²-distance of the initial condition to the stationary distribution is ‖ρ_0 − ρ_∞‖_{L²(Ω)} = 0.76. For a comparably high control cost parameter β = 10⁻², the controls rapidly converge. Lowering the parameter to β = 10⁻³, convergence is in question (at least, it cannot be determined from the numerical results). Finally, for β = 10⁻⁴, only the control laws u_2, u_3 and u_4 actually converge to zero. Higher-order closed-loop systems appear to be attracted by a further (nontrivial) steady state. This is indicated by the symbol ∞ in Table 3. This behavior can be explained by the fact that the closed-loop system is a nonlinear (polynomial) equation for which different steady states might occur. In the case of the Fokker-Planck equation, the feedback laws u_5 and u_6 introduce a shift of the ground potential such that the particles remain in the (stable) stationary distribution associated with this shifted potential. This last test case shows the local nature of the method.

(b) L²-distance between the controls u_p and the optimal control u_opt.

The ground potential G is represented in Figure 9a and the corresponding probability distribution is shown in Figure 9b.
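The attraction of higher-order closed loops by nontrivial steady states, observed in test case 3 above, can be illustrated on a toy scalar model. The following sketch uses an invented polynomial vector field (not the one arising from the paper's feedback laws) whose quintic term creates, besides the equilibrium 0, a second stable steady state at y = 1; depending on the initial value, the explicit Euler trajectory converges to one or the other.

```python
import numpy as np

def simulate(y0, f, dt=1e-3, T=20.0):
    """Explicit Euler integration of the scalar closed loop dy/dt = f(y)."""
    y = y0
    for _ in range(int(T / dt)):
        y = y + dt * f(y)
    return y

# Toy polynomial closed-loop vector field: f(y) = -y + 3 y^3 - 2 y^5.
# Its stable steady states are 0 and +/-1, separated by the unstable
# points +/- 1/sqrt(2); the coefficients are chosen for illustration only.
f = lambda y: -y + 3 * y**3 - 2 * y**5

y_small = simulate(0.5, f)   # starts inside the basin of the equilibrium 0
y_large = simulate(0.9, f)   # attracted by the nontrivial steady state 1
```

This mirrors the local nature of the method: a polynomial feedback law is only guaranteed to drive the state to the equilibrium when the initial condition lies in the corresponding basin of attraction.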
The potential G has four local minimizers. Two control shape functions, represented in Figure 10, are used. The control shape function α_1 is constructed by interpolation on ((−6, −5.8) ∪ (5.8, 6)) × (−6, 6) so that ∇α_1 · n = 0 on Γ, as in the one-dimensional case. The same technique is used for α_2. A negative value of u_1 shifts the distribution along the first coordinate axis of Ω, and a negative value of u_2 shifts it along the second coordinate axis.
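The interpolation used to enforce the zero normal derivative can be sketched in one dimension as follows. This is a minimal illustration, assuming a tilting shape function equal to x in the interior of (−6, 6), blended by cubic Hermite interpolation on boundary strips of width 0.2 so that its derivative vanishes at the boundary; the boundary values a + d/2 and b − d/2 are an arbitrary choice, and the function names are hypothetical.

```python
import numpy as np

def hermite(x0, x1, p0, p1, m0, m1, x):
    """Cubic Hermite interpolant on [x0, x1] with endpoint values p0, p1
    and endpoint slopes m0, m1."""
    h = x1 - x0
    s = (x - x0) / h
    return ((2*s**3 - 3*s**2 + 1) * p0 + (s**3 - 2*s**2 + s) * h * m0
            + (-2*s**3 + 3*s**2) * p1 + (s**3 - s**2) * h * m1)

def alpha1(x, a=-6.0, b=6.0, d=0.2):
    """Shape function equal to x on (a + d, b - d), flattened on the strips
    (a, a + d) and (b - d, b) so that alpha1'(a) = alpha1'(b) = 0 while the
    function stays C^1 at the junctions (slope 1 there, matching x)."""
    x = np.asarray(x, dtype=float)
    y = x.copy()
    left = x < a + d
    right = x > b - d
    y[left] = hermite(a, a + d, a + d/2, a + d, 0.0, 1.0, x[left])
    y[right] = hermite(b - d, b, b - d, b - d/2, 1.0, 0.0, x[right])
    return y

x = np.linspace(-6.0, 6.0, 1201)
a1 = alpha1(x)
```

In two dimensions, the same one-dimensional blending along the normal direction of each boundary strip yields ∇α · n = 0 on Γ.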
As in the one-dimensional case, we investigate the influence of the initial condition and of the value of β on the efficiency of the feedback laws. We present below the results obtained for two different initial conditions, for a reduced model of order r = 47, obtained from a finite-difference discretization with n = 50 · 50 degrees of freedom and a tolerance of ε = 10⁻⁴. For such a dimension, only the first four feedback laws can be computed. Figure 11 shows the decay of the singular values of the product of the Gramians for the unreduced discretized system. As can be observed, the decay is significantly slower than in the one-dimensional case. This can be partly explained by the fact that the ground potential G has a more complicated structure and that a wider range of controls is taken into account. The open-loop control problem is solved with the same parameters: C = 500, θ = 0.7, σ = 0.05, δ = 3·10⁻⁴.
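The reduced order r is determined from the decay of the singular values of the Gramian product. As a minimal sketch, one may take the smallest r such that all discarded singular values are below ε relative to the largest one; the paper's actual truncation criterion for the balanced truncation may differ (e.g. a threshold on sums of Hankel singular values), and the function below is illustrative only.

```python
import numpy as np

def reduced_order(sigma, eps):
    """Smallest r such that sigma[r] / sigma[0] < eps, where sigma contains
    the singular values of the Gramian product in decreasing order."""
    sigma = np.asarray(sigma, dtype=float)
    rel = sigma / sigma[0]
    idx = np.nonzero(rel < eps)[0]
    return int(idx[0]) if idx.size else sigma.size

# Synthetic example with geometric decay sigma_k = 10^{-k}:
sig = 10.0 ** -np.arange(10)
r = reduced_order(sig, eps=1e-4)
```

A slow singular value decay, as observed in Figure 11, directly translates into a larger reduced order r for a given tolerance ε.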
The final time used for solving (32) and for evaluating (25) is set to T = 200.

Test case 4: a random perturbation of the initial condition. Figure 12 and Table 4 (page 579) show the results obtained for an initial condition generated by randomly perturbing the stationary distribution. The initial condition is therefore close to the stationary distribution. The cost of the uncontrolled system is J(u = 0) = 0.593 and ‖ρ_0 − ρ_∞‖_{L²(Ω)} = 0.18. Good convergence results are observed for values of β ranging from 10⁻³ to 5·10⁻⁵. For p = 5, the L²-distance ‖u_p − u_opt‖_{L²(0,T)} is approximately 10 times smaller than for p = 2. For β = 10⁻⁴ and β = 5·10⁻⁵, a significant reduction of the costs can be observed as k increases. For β = 10⁻⁵, the situation is more complex. Convergence of the controls u_k as k increases is not achieved. The values of the costs decrease with k, except for k = 4: in this case, the associated closed-loop system still converges to 0, but strong oscillations of the control result in a large value of J(u_4).

Test case 5: initial condition with support in the second potential well. Figure 13 and Table 5 show the results obtained for an initial condition located in the second potential well (around x_B). A large proportion of the distribution must be transported to x_A, along the first coordinate axis and from the negative values to the positive ones. Therefore, one can expect that the first coordinate of the control takes negative values and that the second coordinate has a smaller amplitude than the first one. This initial condition is much further from the stationary distribution than the previous one. Consequently, the cost of the uncontrolled system is much larger: J(u = 0) = 9.04 and ‖ρ_0 − ρ_∞‖_{L²(Ω)} = 0.63. As a consequence, the feedback laws are only efficient for larger values of β than those considered previously.
Convergence can be observed for values of β larger than 5·10⁻³. The reduction factor of the L²-distance is smaller than for test case 4, but a significant reduction of the cost is still noted for β = 10⁻² and β = 5·10⁻³. For β = 10⁻³, convergence with respect to k cannot be achieved. The closed-loop system associated with u_4 quickly converges to a non-trivial stationary point. The closed-loop system associated with u_5 generates a control which oscillates strongly in time and eventually converges to a non-trivial stationary point.

6. Conclusion. A numerical method for computing polynomial feedback laws for an infinite-dimensional optimal control problem with infinite-time horizon has been proposed. It consists in particular in reducing the state equation in order to attenuate the curse of dimensionality, which prevents a direct resolution of the involved Lyapunov equations. The applicability of the method has been demonstrated on an optimal control problem for the Fokker-Planck equation in dimensions 1 and 2. The effect of model reduction on the feedback laws has been numerically analysed and the relevance of the reduction approach has been shown. Good convergence results for high-order feedback laws have been obtained in many situations for which the initial condition was close enough to the equilibrium or for which the value of the cost parameter β was not too small. The influence of the initial condition and of the cost parameter β on the success of the method has been investigated in a systematic manner. Further research will focus on the design of polynomial feedback laws for infinite-dimensional systems with a more complicated structure. At a numerical level, the use of low-rank tensor formats could be investigated to facilitate the numerical resolution of the Lyapunov equations and the simulation of closed-loop systems.
It may also be of interest to design a heuristic mechanism which selects an appropriate order for the feedback law, in order to avoid convergence to a non-trivial stationary point and to allow for a practical implementation. At a theoretical level, the derivation of an error estimate for the efficiency of controls generated by reduced feedback laws could also be a topic for future work.
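Such a heuristic could, for instance, simulate the reduced closed-loop system for each candidate order and retain the highest order whose trajectory actually approaches the equilibrium. The following sketch illustrates the idea on hypothetical scalar closed-loop vector fields; the dictionary of candidates and all coefficients are invented for illustration and do not come from the paper's computations.

```python
import numpy as np

def select_order(closed_loops, y0, dt=1e-3, T=20.0, tol=1e-2):
    """Among candidate scalar closed-loop vector fields, indexed by the
    feedback order p, return the highest p whose explicit Euler trajectory
    from y0 ends within tol of the equilibrium 0."""
    best = None
    for p in sorted(closed_loops):
        y = y0
        for _ in range(int(T / dt)):
            y = y + dt * closed_loops[p](y)
        if abs(y) < tol:
            best = p
    return best

# Hypothetical candidates: the order-2 law stabilizes from y0 = 0.9, while
# the higher-order law is attracted by a nontrivial steady state near y = 1.
loops = {2: lambda y: -y, 4: lambda y: -y + 3*y**3 - 2*y**5}
p_star = select_order(loops, y0=0.9)
```

In practice, the simulation would be performed on the reduced model, so that the selection remains computationally cheap compared to solving the open-loop problem.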