A convergent hierarchy of non-linear eigenproblems to compute the joint spectral radius of nonnegative matrices

We show that the joint spectral radius of a finite collection of nonnegative matrices can be bounded by the eigenvalue of a non-linear operator. This eigenvalue coincides with the ergodic constant of a risk-sensitive control problem, or of an entropy game, in which the state space consists of all switching sequences of a given length. We show that, by increasing this length, we arrive at a convergent approximation scheme to compute the joint spectral radius. The complexity of this method is exponential in the length of the switching sequences, but it is quite insensitive to the size of the matrices, allowing us to solve very large scale instances (several matrices in dimensions of order 1000 within a minute). A key idea of this method is to replace a hierarchy of optimization problems, introduced by Ahmadi, Jungers, Parrilo and Roozbehani, by a hierarchy of nonlinear eigenproblems. To solve the latter eigenproblems, we introduce a projective version of the Krasnoselskii-Mann iteration. This method is of independent interest, as it applies more generally to the nonlinear eigenproblem for a monotone positively homogeneous map. Here, this method allows for scalability by avoiding the recourse to linear or semidefinite programming techniques.


1. Introduction.
1.1. Motivation. A fundamental issue in optimal control is to develop efficient numerical schemes that provide globally optimal solutions. Dynamic programming does provide a guaranteed global optimum, but it is subject to the well-known curse of dimensionality. Indeed, the main numerical methods, including monotone finite difference or semi-Lagrangean schemes [16,14,19,15] and the antidiffusive schemes [13], are grid-based. It follows that the time needed to obtain an approximate solution with a given accuracy is exponential in the dimension of the state space.
Recently, some innovative methods have been introduced in optimal control, which somehow attenuate the curse of dimensionality, for structured classes of problems.
McEneaney considered in [34] hybrid optimal control problems in which a discrete control allows one to switch between different linear quadratic models. The max-plus type method that he introduced approximates the value function by a supremum of quadratic forms. Its complexity, which is exponential in some parameters, has the remarkable feature of being polynomial in the dimension [35,42]. To produce approximations of the value function that are as concise as possible, the method makes an intensive use of semidefinite programming [22].
A different problem consists in computing the joint spectral radius of a finite set of matrices [29]. This boils down to computing an ergodic value function, known as the Barabanov norm. Specific numerical methods have been developed, which approximate the Barabanov ball by a polytope [26], or are of semi-Lagrangean type [30]. Ahmadi, Jungers, Parrilo and Roozbehani [8] developed a new method, based on a path-complete automaton. It approximates the Barabanov norm by a supremum of quadratic norms. Whereas the worst case complexity estimates in [8] are still subject to a curse of dimensionality, in practice, the efficiency of the method is determined by the complexity of the optimal switching law rather than by the dimension itself. This allows one to solve instances of dimension inaccessible by grid-based methods.
In the max-plus method of McEneaney, and in the method of Ahmadi et al., solving large scale semidefinite programs appears to be the bottleneck, limiting the applicability range.
In our recent work [24,43], we introduced a new method to approximate the joint spectral radius. We replaced the solution of large scale SDP problems by the solution of eigenproblems involving non-linear operators, the "tropical Kraus maps". The latter are the analogues of completely positive maps, or of "quantum channels", acting on the space of positive semidefinite matrices, the operation of addition being now replaced by a multivalued supremum operation in the Löwner order. To solve these eigenproblems, we used iterative power type schemes, allowing us to deal with large scale instances (the algorithm of [24,43] could handle several matrices of order 500 in a few minutes). The convergence of these iterative schemes, however, is only guaranteed so far under restrictive assumptions, since the "tropical Kraus maps" are typically nonmonotone and expansive in the natural metrics.

1.2. Contribution. In this paper, we develop a non-linear fixed point approach to approximate the joint spectral radius in the special case of nonnegative matrices. We exploit a result of Guglielmi and Protasov [23], showing that for nonnegative matrices, it suffices to look for a monotone norm. We show here that such a monotone norm can be approximated by a finite supremum of linear forms, which are found as the solution of a non-linear eigenproblem. This is in contrast with earlier polyhedral approximation schemes, relying for instance on linear programming.
More precisely, we introduce a hierarchy of non-linear eigenproblems, parametrized by a certain "depth", inspired by [8,24], and we show that, as the depth tends to infinity, the non-linear eigenvalue does converge to the joint spectral radius. We remark that the initial ("depth 0") bound in our hierarchy coincides with the bound of the joint spectral radius introduced by Blondel and Nesterov [11].
The non-linear operator arising in our construction actually belongs to a known class: it can be identified with the dynamic programming operator of an ergodic risk sensitive control problem [1], or of a (one player) "entropy game" [2,4]. This operator enjoys remarkable properties, like log-convexity, monotonicity, and nonexpansiveness with respect to Thompson's part metric or Hilbert's projective metric. As a result, computing the non-linear eigenvalue is a tractable problem; it is shown to be polynomial time in [4]. Moreover, large scale instances can be solved by power-type schemes. In particular, we introduce a projective version of the Krasnoselskii-Mann iteration. We present this numerical scheme in a more general setting, for a monotone positively homogeneous map on the standard orthant. The convergence of this scheme is obtained as a corollary of the convergence of the original scheme. This projective scheme for nonlinear eigenproblems may be of wider interest and applicability: it has universal convergence properties and explicit bounds independent of the dimension, unlike the nonlinear power algorithm considered classically, see e.g. [38,7,20]. It also has a geometric convergence property, under less restrictive assumptions.
We report numerical results on large scale instances, up to dimension 5000, obtained by an OCaml implementation of the present algorithm.
A comparison with our companion work [24] may help to appreciate the present approach: it appears to be a "dequantization" of the non-linear fixed point approach of [24]. By "dequantization", we mean that we use here operators acting on the standard orthant, whereas the operator in [24] acts on the cone of positive semidefinite matrices. Whereas the approach of [24] is more general, leading to a convergent approximation scheme for any family of matrices, the present algorithm only applies to families of nonnegative matrices. However, it is experimentally faster, and it has stronger theoretical convergence guarantees. This suggests that the joint spectral radius problem is easier for nonnegative matrices.
1.3. Organization of the paper. In Section 2, we recall some basic results on Barabanov norms of nonnegative matrices. In Section 3, we introduce the family of non-linear eigenproblems used to approximate the joint spectral radius. We show that these eigenproblems are solvable, under an appropriate irreducibility condition. In Section 4, we show that the non-linear eigenvalues in this hierarchy do converge to the joint spectral radius. The projective Krasnoselskii-Mann iterative scheme is analysed in Section 5. Benchmarks are presented in Section 6.
2. The joint spectral radius of nonnegative matrices. The joint spectral radius ρ(A) of a finite collection of n × n real matrices A = {A_1, ..., A_p} is defined by
\[ \rho(\mathcal{A}) := \lim_{k\to\infty} \; \max_{a_1,\dots,a_k \in [p]} \| A_{a_1} \cdots A_{a_k} \|^{1/k} . \]
When the set of matrices A is irreducible (meaning that there is no nontrivial subspace of R^n that is left invariant by all matrices), a fundamental result by Barabanov [9] shows that there is a norm ν on R^n such that
\[ \max_{a \in [p]} \nu(A_a x) = \lambda\, \nu(x) \quad \text{for all } x \in \mathbb{R}^n , \tag{1} \]
for some positive real number λ. The scalar λ is unique and coincides with the joint spectral radius ρ(A).
A norm that satisfies Equation (1) is called an invariant norm. A norm that only satisfies the inequality
\[ \max_{a \in [p]} \nu(A_a x) \le \lambda\, \nu(x) \quad \text{for all vectors } x \in \mathbb{R}^n \]
is called a λ-extremal norm. In that case, it is readily seen that λ ≥ ρ(A), so that λ-extremal norms provide safe upper bounds of the joint spectral radius.
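Dually, spectral radii of finite products always provide lower bounds: for any product of length k, ρ(A_{a_1} ⋯ A_{a_k})^{1/k} ≤ ρ(A). The following sketch (with hypothetical example matrices; not the algorithm of this paper) enumerates short products to obtain such lower bounds. The `spectral_radius` helper is a plain power iteration, which assumes the matrix has a dominant eigenvalue.

```python
from itertools import product

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def spectral_radius(A, iters=200):
    # Power iteration on a nonnegative matrix (assumes a dominant eigenvalue).
    n = len(A)
    x = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        lam = max(y)
        if lam == 0.0:
            return 0.0
        x = [yi / lam for yi in y]
    return lam

def jsr_lower_bound(mats, max_len):
    # max over products of length k <= max_len of rho(product)^(1/k),
    # a certified lower bound on the joint spectral radius.
    best = 0.0
    for k in range(1, max_len + 1):
        for word in product(range(len(mats)), repeat=k):
            P = mats[word[0]]
            for a in word[1:]:
                P = mat_mul(P, mats[a])
            best = max(best, spectral_radius(P) ** (1.0 / k))
    return best
```

For instance, for the pair A = [[1,1],[0,1]], B = [[1,0],[1,1]] (whose joint spectral radius is the golden ratio), products of length 2 already give the bound ρ(AB)^{1/2} ≈ 1.618.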
We now assume that the matrices in A are nonnegative, i.e. their entries take nonnegative values. It is then readily seen that all matrices in A leave the (closed) cone of nonnegative vectors invariant. The latter cone, denoted by R^n_+, induces an ordering on R^n: we have x ≤ y if and only if y − x is nonnegative. We note that a vector belongs to the interior of R^n_+ if and only if its entries are positive. Recall that the cone R^n_+ is self-dual, so that x ≤ y if and only if ⟨u, y − x⟩ ≥ 0 for all u ∈ R^n_+. This cone also induces a lattice structure on R^n, meaning that the supremum of two vectors x, y always exists and is given coordinate-wise by sup(x, y)_i = sup(x_i, y_i). A norm ν defined on R^n is called monotone if 0 ≤ x ≤ y implies ν(x) ≤ ν(y).
The (extreme) faces of R^n_+ are the sets F = {x ∈ R^n_+ : x_i = 0 if i ∉ I} for I ⊆ {1, ..., n}. The cases I = ∅ (corresponding to F = {0}) and I = {1, ..., n} (giving F = R^n_+) yield the trivial faces. When the matrices in A are nonnegative, the irreducibility assumption on A can be weakened to positive-irreducibility, meaning that there is no non-trivial face of the cone of nonnegative vectors that is left invariant by all matrices in A. A theorem by Guglielmi and Protasov [23] shows that, in this setting, the norm in Equation (1) can be chosen to be monotone.

Theorem 1 (Corollary 1 in [23]). A positively-irreducible family of nonnegative matrices has a monotone invariant norm.
We shall say that a map ν from R^n_+ to R is a monotone hemi-norm if it is convex and positively homogeneous of degree 1, if 0 ≤ x ≤ y implies ν(x) ≤ ν(y), and if ν(x) = 0 with x ≥ 0 implies x = 0. The term hemi-norm is borrowed from [25]; functions of this kind are also known as weak Minkowski norms in metric geometry [41]. Note that a monotone hemi-norm ν defined on R^n_+ extends to a monotone norm on R^n:
\[ \nu(x) := \inf\{ \nu(y) \vee \nu(z) : x = y - z , \text{ with } y, z \ge 0 \} . \tag{2} \]
We shall say that ν is a monotone λ-extremal hemi-norm on R^n_+ if
\[ \max_{a \in [p]} \nu(A_a x) \le \lambda\, \nu(x) \quad \text{for all } x \in \mathbb{R}^n_+ . \]
This implies that the associated monotone norm ν is a λ-extremal norm. In this way, it suffices to study monotone λ-extremal hemi-norms defined on R^n_+.
The operator considered at the 0-level of the hierarchy is given by
\[ T_0(x) = \sup_{a \in [p]} A_a^T x , \qquad x \in \mathbb{R}^n_+ . \tag{3} \]
Higher levels of the hierarchy are built by introducing a memory process that keeps track of the past matrix products, up to a given depth. More precisely, given an integer d ≥ 1, the operator considered at the d-level is a self-map T_d of the product cone ∏_{s∈[p]^d} R^n_+. It sends the vector x = (x_s)_{s∈[p]^d}, where each x_s ∈ R^n_+, to the vector T_d(x), whose s-component is the vector of R^n_+ given by
\[ (T_d(x))_s = \sup\{ A_a^T x_r : r \in [p]^d ,\ a \in [p] ,\ \tau_d(r, a) = s \} . \tag{4} \]
Here, the map τ_d : [p]^d × [p] → [p]^d sends (r_1 ⋯ r_d, a) to r_2 ⋯ r_d a. In other words, the transition forgets the initial symbol of a sequence, and concatenates the letter a representing the most recent switch, to this sequence. The map T_d is monotone with respect to the product cone, and it is (positively) homogeneous (of degree one), meaning that T_d(λx) = λ T_d(x) holds for all positive λ.
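The operators above can be transcribed directly. The following minimal sketch (assumed data layout: matrices as lists of rows, and, for d ≥ 1, x as a dictionary keyed by the words s ∈ [p]^d, represented as tuples) evaluates T_0 and T_d; the supremum is taken entrywise.

```python
from itertools import product

def matT_vec(A, x):
    # y = A^T x, for a matrix A given as a list of rows
    n = len(A)
    return [sum(A[i][j] * x[i] for i in range(n)) for j in range(n)]

def T0(mats, x):
    # 0-level operator: entrywise sup over a of A_a^T x
    vecs = [matT_vec(A, x) for A in mats]
    return [max(v[j] for v in vecs) for j in range(len(x))]

def Td(mats, d, x):
    # d-level operator (d >= 1) on the product cone indexed by words in [p]^d.
    # (T_d(x))_s = sup { A_a^T x_r : tau_d(r, a) = s }, where tau_d drops the
    # first letter of r and appends the most recent switch a.
    p = len(mats)
    out = {}
    for r in product(range(p), repeat=d):
        for a in range(p):
            s = r[1:] + (a,)
            y = matT_vec(mats[a], x[r])
            out[s] = [max(u, v) for u, v in zip(out[s], y)] if s in out else y
    return out
```

Note that the state space has p^d components of dimension n each, which accounts for the exponential cost in the depth d but only linear-algebraic cost in n.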

Some results of non-linear Perron-Frobenius theory.
Monotone and homogeneous maps are studied in non-linear Perron-Frobenius theory.We recall some basic results here, referring the reader to [38,32] for background.
The spectral radius of a monotone and homogeneous map f defined on a cone C, denoted by r(f), is defined by
\[ r(f) := \lim_{k\to\infty} \| f^k(x) \|^{1/k} , \qquad x \in \operatorname{int} C . \]
This value is independent of the choice of x and of the norm ‖·‖ if the cone C is included in a finite dimensional space, see [36,6,32].
We say that a monotone and homogeneous map f : R^n_+ → R^n_+ is positively irreducible if it does not leave invariant a non-trivial face of R^n_+. A basic result of non-linear Perron-Frobenius theory, which follows as a consequence of Brouwer's fixed point theorem, shows that a positively irreducible map has an eigenvector in the interior of the cone. Then, the associated eigenvalue λ coincides with the spectral radius r(f). The same conclusion holds, in fact, under less demanding assumptions [21]; however, for the present class of operators, positive irreducibility will suffice.
Nussbaum proved in [37] that the classical variational characterization of the Perron root of a nonnegative matrix carries over to the non-linear setting. The next theorem follows by combining results of [37], [21] and [3].

Theorem 2 (Collatz-Wielandt formulae, see [37], [21] and [3]). Given a continuous, monotone and homogeneous map f on the cone R^N_+, we have
\[ r(f) = \inf\{ \lambda > 0 : \exists u \in \operatorname{int} \mathbb{R}^N_+ ,\ f(u) \le \lambda u \} = \max\{ \lambda \ge 0 : \exists u \in \mathbb{R}^N_+ \setminus \{0\} ,\ f(u) \ge \lambda u \} . \]
We write "inf", as the infimum is not attained in general, whereas the maximum is always attained.
In particular, if the map f is not positively-irreducible, it may be the case that f(u) = λu holds for some nonzero vector u in the boundary of the cone R^n_+. Then, we can only conclude from the second of the Collatz-Wielandt formulae that λ ≤ r(f). However, by the first formula, we do have r(f) = λ if u belongs to the interior of R^n_+.

Construction of the hierarchy.
For every integer d ≥ 0, the d-level of the hierarchy consists in solving the non-linear eigenproblem:
\[ T_d(u) = \lambda u , \qquad u \in \prod_{s\in[p]^d} \mathbb{R}^n_+ \setminus \{0\} ,\ \lambda \ge 0 . \tag{E_d} \]
The first main result shows that every problem (E_d) has a solution, and that a solution provides an upper bound λ_d on the joint spectral radius ρ(A), together with a corresponding monotone λ_d-extremal hemi-norm.

Theorem 3. Suppose that the set of nonnegative matrices A is positively-irreducible. Then, for every d ≥ 0, the eigenproblem (E_d) has a solution (λ_d, u), and ρ(A) ≤ λ_d ≤ r(T_d). Moreover, the map x ↦ ‖x‖_u := max_s ⟨u_s, x⟩ is a monotone λ_d-extremal hemi-norm.

Proof. First, note that the map T_d, which is continuous and positively homogeneous on the cone ∏_{s∈[p]^d} R^n_+, has an eigenvector u, i.e., T_d(u) = λ_d u for some λ_d ≥ 0. This is indeed a standard result, which follows by applying the Brouwer fixed point theorem to the map x ↦ T_d(x)/‖T_d(x)‖_1, where ‖·‖_1 denotes the ℓ1 norm; this map continuously sends the standard simplex of ∏_{s∈[p]^d} R^n_+ to itself. Let us write u = (u_s)_s for this eigenvector and, for each s, let F_s denote the minimal face of R^n_+ containing the vector u_s; we introduce the set of indices s such that F_s is a non-trivial face. Taking the supremum over r and a, we arrive at max_a ‖A_a x‖_u ≤ λ_d ‖x‖_u, hence ρ(A) ≤ λ_d. We deduce from the second of the Collatz-Wielandt formulae in Theorem 2 that λ_d ≤ r(T_d).
The positive-irreducibility of T_d can be decided by checking whether a lifted version of the set of matrices A is positively irreducible. In the following, {e_r : r ∈ [p]^d} denotes the canonical basis of the space R^{p^d} and ⊗ is the Kronecker product.

Proposition 4. The map T_d is positively-irreducible if and only if the set of matrices {(e_r e_s^T) ⊗ A_a : r, s ∈ [p]^d, a ∈ [p], τ_d(r, a) = s} is positively-irreducible.
Proof. First, we consider the case d = 0. Suppose the map T_0 is not positively irreducible, that is, there is a non-trivial face F of the cone R^n_+ that is invariant by T_0. Given any x ∈ F, we have A_a^T x ≤ T_0(x) ∈ F, thus A_a^T x ∈ F, so F is also invariant by all matrices A_a^T, which is equivalent to saying that the set A is not positively irreducible.
In the general case, we can rewrite the map T_d, originally defined on the space ∏_{s∈[p]^d} R^n, on the space R^{p^d} ⊗ R^n. To this end, we "stack" the (vector) components of the vector (x_r)_{r∈[p]^d} as one vector l[x] := ∑_r e_r ⊗ x_r.
Moreover, we obtain by "stacking" the components of T_d(x) the vector
\[ l[T_d(x)] = \sup_{r, s, a} A_{r,s,a}^T \, l[x] , \]
where we have used the fact that the coefficients of e_s besides position s are zero, that e_r^T e_t = 0 if r ≠ t, and where we have denoted A_{r,s,a} := (e_r e_s^T) ⊗ A_a when τ_d(r, a) = s, and A_{r,s,a} := 0 otherwise.
Hence, the map T_d is positively irreducible if and only if the map y ↦ sup_{r,s,a} A_{r,s,a}^T y is positively irreducible. This reduces to the case d = 0, where the set of matrices A is replaced by the set {A_{r,s,a}}.
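The d = 0 test itself is easy to mechanize: a family of nonnegative matrices leaves no non-trivial face of R^n_+ invariant exactly when the single nonnegative matrix ∑_a A_a is irreducible, i.e., when the support graph of the family is strongly connected. A minimal sketch of this check (a standard reduction, stated here under that equivalence, not taken verbatim from the paper):

```python
def positively_irreducible(mats):
    # A family of nonnegative matrices leaves no non-trivial face of the
    # cone invariant iff sum_a A_a is an irreducible nonnegative matrix,
    # i.e. iff the support graph (edge i -> j whenever some A_a has
    # A_a[i][j] > 0) is strongly connected.
    n = len(mats[0])
    adj = [[any(A[i][j] > 0 for A in mats) for j in range(n)] for i in range(n)]

    def reachable(src):
        seen, stack = {src}, [src]
        while stack:
            i = stack.pop()
            for j in range(n):
                if adj[i][j] and j not in seen:
                    seen.add(j)
                    stack.append(j)
        return seen

    return all(len(reachable(i)) == n for i in range(n))
```

Applied to the lifted family {A_{r,s,a}} of Proposition 4, this decides the positive-irreducibility of T_d.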
The term "hierarchy" for the sequence of problems (E d ) is justified by the following proposition.

Proposition 5. Suppose that the set of nonnegative matrices A is positively-irreducible. Then r(T_{d+1}) ≤ r(T_d) for every d ≥ 0.

Proof. Let u denote a positive vector in ∏_{s∈[p]^d} R^n_+ and λ a positive real number such that T_d(u) ≤ λu. Let v denote the vector in ∏_{s∈[p]^{d+1}} R^n_+ defined for all r ∈ [p]^d and a ∈ [p] by v_{ar} := u_r. We have
\[ A_b^T v_{ar} = A_b^T u_r \le \lambda\, u_{\tau_d(r,b)} = \lambda\, v_{\tau_{d+1}(ar,\,b)} . \]
Taking the supremum over r and a, we obtain T_{d+1}(v) ≤ λv. Each vector v_{ar} is positive, so taking the infimum over λ, by the first of the Collatz-Wielandt formulae in Theorem 2, we arrive at r(T_{d+1}) ≤ r(T_d).
4. Convergence of the hierarchy of nonlinear eigenproblems. The next theorem shows that the spectral radius of the map T_d approximates the joint spectral radius ρ(A) up to a factor n^{1/(d+1)}. The proof of this result is inspired by the ones found in [8,39] in the case of piecewise quadratic approximations of norms. The latter proofs rely on the approximation of a symmetric convex body by the Löwner-John ellipsoid. Here, we use the fact that a monotone hemi-norm ν can be approximated by a monotone linear map, up to a factor n, as shown by the following observation.

Proposition 6. Given a monotone hemi-norm ν, there is a vector c with positive entries such that ⟨c, x⟩ ≤ ν(x) ≤ n ⟨c, x⟩ for all nonnegative vectors x.
Proof. Let c̃ denote the vector defined by c̃_i := ν(e_i), with (e_i)_{1≤i≤n} the canonical basis of R^n, and observe that c̃_i is positive since ν is a hemi-norm. Let x denote a nonnegative vector. By convexity and homogeneity of the map ν, we have ν(x) ≤ ∑_i x_i ν(e_i) = ⟨c̃, x⟩. By monotonicity of ν, we have ν(x) ≥ ν(x_i e_i) = x_i c̃_i for every i, hence ν(x) ≥ n^{-1} ⟨c̃, x⟩. The proposition follows by taking c := n^{-1} c̃.

We now show that the sequence of approximations provided by this hierarchy does converge when d tends to infinity.

Theorem 7. Suppose that the set of nonnegative matrices A is positively-irreducible. Then r(T_d) ≤ n^{1/(d+1)} ρ(A).
Proof. We first prove the case d = 0, in which case the map T_0 is positively-irreducible. By Theorem 1, the set A admits a monotone invariant norm; as noted earlier, its restriction to R^n_+ is a monotone invariant hemi-norm, denoted by ν. By Proposition 6, there is a positive vector c such that ⟨c, x⟩ ≤ ν(x) ≤ n ⟨c, x⟩ holds for all nonnegative vectors x. Since ν is a monotone invariant hemi-norm, we have ⟨A_a^T c, x⟩ = ⟨c, A_a x⟩ ≤ ν(A_a x) ≤ ρ(A) ν(x) ≤ n ρ(A) ⟨c, x⟩ for all such vectors x, which implies that A_a^T c ≤ n ρ(A) c. Taking the supremum over a, we get T_0(c) ≤ n ρ(A) c. Thus r(T_0) ≤ n ρ(A) by Theorem 2.
We now prove the general case. It will be convenient to consider the variant of the map T_d obtained by replacing the set of matrices A by the set A^{d+1} of products in A of length d + 1, yielding the map on R^n_+ given by:
\[ \tilde{T}_d(x) = \sup_{w \in [p]^{d+1}} ( A_{w(1)} \cdots A_{w(d+1)} )^T x . \]
Now, let v denote a positive vector in R^n_+ and µ a positive real number such that T̃_d(v) ≤ µv. By Theorem 2, the infimum of such real numbers µ is equal to the spectral radius r(T̃_d). We introduce the collection of vectors u = (u_s)_{s∈[p]^d} defined by
\[ u_s := \sum_{k=0}^{d} \mu^{-k/(d+1)} \, ( A_{s(d-k+1)} \cdots A_{s(d)} )^T v , \tag{5} \]
where s(i) denotes the i-th letter of the word s ∈ [p]^d, with the convention that the product in the summation (5) is equal to the identity when k = 0, so that the corresponding term is v. It is readily seen that this collection satisfies the set of inequalities A_a^T u_r ≤ µ^{1/(d+1)} u_{τ_d(r,a)}. Moreover, by definition of u_s, u_s ≥ v, and so u_s is positive. We deduce that T_d(u) ≤ µ^{1/(d+1)} u, hence r(T_d) ≤ r(T̃_d)^{1/(d+1)} by Theorem 2.
It remains to be shown that r(T̃_d) ≤ n ρ(A)^{d+1}. To this end, we consider the perturbed map T̃_d^ε, defined for ε > 0 as the map T̃_d originating from the family of perturbed matrices A_ε := {A_a + εJ}_{a∈[p]}, where J is the square matrix with all entries equal to 1. The matrices in A_ε are positive, hence the latter set is positively-irreducible. Thus we fall in the case d = 0, and we obtain r(T̃_d^ε) ≤ n ρ(A_ε)^{d+1}. Moreover, the inequality T̃_d(x) ≤ T̃_d^ε(x) holds for all nonnegative vectors x, hence r(T̃_d) ≤ r(T̃_d^ε) by [32]. Combined with the fact that lim_{ε→0} ρ(A_ε) = ρ(A), as proved in [28], we obtain the desired inequality.
We obtain as an immediate corollary of Theorems 3 and 7 that the hierarchy is convergent, in the sense that any sequence of eigenvalues of the maps T_d converges towards the joint spectral radius.

Corollary 8. Suppose that the set of nonnegative matrices A is positively-irreducible. If λ_d denotes an eigenvalue of the map T_d, for all d, then
\[ \rho(\mathcal{A}) \le \lambda_d \le n^{1/(d+1)} \rho(\mathcal{A}) , \qquad \text{so that} \quad \lim_{d\to\infty} \lambda_d = \rho(\mathcal{A}) . \]
In particular, the sequence of spectral radii r(T_d) is non-increasing and its limit is equal to ρ(A).

5. Solving the non-linear eigenproblem.
Several numerical methods allow one to solve the nonlinear eigenproblem (E_d). First, the log-convexity property of T_d allows a reduction to convex programming, which entails a polynomial time bound (see for instance the part of [4] concerning "Despot free" entropy games). There are also algorithms, more efficient in practice, that do not have polynomial time bounds. Protasov proposed a "spectral simplex" algorithm [40]. A policy iteration scheme was proposed in [4]. The spectral simplex method, like policy iteration, involves at each step the computation of the spectral radius of a nonnegative matrix, which is generally the bottleneck.
For the huge scale instances which are of interest here, it is more convenient to employ a simpler iterative scheme. We propose to use a projective version of the Krasnoselskii-Mann iteration [33,31]. The Krasnoselskii-Mann iteration can be written as x_{k+1} = 2^{-1}(x_k + F(x_k)); it was originally considered when F is a nonexpansive mapping acting on a uniformly convex Banach space [31]. The uniform convexity assumption was relaxed by Ishikawa:

Theorem 9 ([27, Theorem 1]). Let D be a closed convex subset of a Banach space X, and let F be a nonexpansive mapping sending D to a compact subset of D. Then, for any initial point x_0 ∈ D, the sequence defined by x_{k+1} = 2^{-1}(x_k + F(x_k)) converges to a fixed point of F.
The analysis of this iteration, by Edelstein and O'Brien [18], involves the notion of asymptotic regularity. The latter property means that F(x_k) − x_k tends to 0 as k tends to infinity. The quantity ‖F(x_k) − x_k‖ provides a convenient way to measure the convergence. Baillon and Bruck obtained in [10] the following quantitative asymptotic regularity estimate:
\[ \| F(x_k) - x_k \| \le \frac{ \operatorname{diam}(D) }{ \sqrt{\pi k} } , \tag{6} \]
see [17] for more information. Observe that the rate 1/√k is independent of the dimension.
Here, we adapt the idea of the Krasnoselskii-Mann iteration to the eigenproblem, by considering it as a fixed point problem in the projective space.
It will be convenient to consider an arbitrary monotone and positively homogeneous (of degree one) map f : R^N_+ → R^N_+, having a positive eigenvector u. In the present application, we will consider the special case f := T_d; however, the scheme does converge in a rather general setting, and it may have other applications.

Definition 10 (Projective Krasnoselskii-Mann iteration). Starting from any positive vector v_0 such that ∏_{i∈[N]} (v_0)_i = 1, define
\[ v_{k+1} = \frac{ ( v_k \cdot f(v_k) )^{1/2} }{ G\big( ( v_k \cdot f(v_k) )^{1/2} \big) } , \tag{7} \]
where · denotes the entrywise product of two vectors, the square root is taken entrywise, and G(x) := (∏_{i∈[N]} x_i)^{1/N} denotes the geometric mean of the components of the vector x.
By comparison with the original Krasnoselskii-Mann iteration, the arithmetic mean is replaced by the geometric mean, and a normalization is introduced to deal with the projective setting.
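The scheme is a few lines of code. The following sketch (assuming f is monotone, positively homogeneous, and has a positive eigenvector, as in Theorem 11 below; the stopping test and iteration budget are arbitrary choices, not taken from the paper) implements (7), renormalizing so that the geometric mean of the iterate stays equal to 1.

```python
import math

def geo_mean(x):
    # geometric mean G(x) of the components of a positive vector
    return math.exp(sum(math.log(xi) for xi in x) / len(x))

def projective_km(f, v0, iters=1000, tol=1e-10):
    # v_{k+1} = w / G(w), with w = sqrt(v_k * f(v_k)) entrywise: the
    # geometric mean here plays the role of the arithmetic mean in the
    # classical Krasnoselskii-Mann iteration, in log coordinates.
    g = geo_mean(v0)
    v = [vi / g for vi in v0]            # normalize so that prod(v_i) = 1
    for _ in range(iters):
        fv = f(v)
        w = [math.sqrt(vi * fi) for vi, fi in zip(v, fv)]
        g = geo_mean(w)
        v_next = [wi / g for wi in w]
        done = max(abs(a - b) for a, b in zip(v, v_next)) < tol
        v = v_next
        if done:
            break
    return v, geo_mean(f(v))             # approximate eigenvector, eigenvalue
```

For example, for the linear map given by the positive matrix [[2,1],[1,2]], the iteration converges to the Perron pair: eigenvalue 3 and eigenvector proportional to (1,1).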
To show that this iteration does converge, we need to recall some metric properties of monotone positively homogeneous maps. We shall use a seminorm called Hopf's oscillation [12] or Hilbert's seminorm [21]. The latter is defined on R^N by
\[ \| x \|_H = \inf\{ \beta - \alpha : \alpha, \beta \in \mathbb{R} ,\ \alpha e \le x \le \beta e \} , \]
with e = (1, ..., 1)^T. This seminorm is invariant by addition with a constant (‖x + αe‖_H = ‖x‖_H). Observe that Hopf's oscillation defines a norm on the vector space X := {x ∈ R^N : ∑_i x_i = 0}. Hilbert's projective metric [38], defined on the interior of the cone R^N_+, is given by
\[ d_H(x, y) = \| \log x - \log y \|_H , \]
where log is understood entrywise, meaning that log x := (log x_i)_{i∈[N]}. Observe that d_H(αx, βy) = d_H(x, y) for all α, β > 0; so d_H defines a metric on the space of rays included in the interior of the cone. We shall also use Thompson's metric, defined on the interior of R^N_+ by
\[ d_T(x, y) = \| \log x - \log y \|_\infty , \]
where ‖·‖_∞ is the sup-norm. It is known that if f is monotone and positively homogeneous, and if it preserves the interior of R^N_+, then it is nonexpansive both in Hilbert's projective metric and in Thompson's metric, see e.g. [7, Lemma 4.1].
We next show that the scheme (7) does converge.
Theorem 11. Suppose that f is a monotone positively homogeneous map R^N_+ → R^N_+ having a positive eigenvector. Then, the iteration in (7), initialized at any positive vector v_0 with ∏_{i∈[N]} (v_0)_i = 1, converges towards an eigenvector of f, and G(f(v_k)) converges to r(f).
The proof reduces this statement to the convergence of the original Krasnoselskii-Mann scheme, after a suitable transformation.
Proof. Let u denote a positive eigenvector of f, so that f(u) = λu for some λ > 0. For all x in the interior of R^N_+, we can write x ≥ αu for some α > 0, and since f is order preserving and positively homogeneous, we deduce that f(x) ≥ α f(u) = αλu, so f preserves the interior of R^N_+. We now define the self-map S of R^N by
\[ S(y) = \log f( \exp(y) ) , \tag{8} \]
where, again, the notation log for a vector is understood entrywise, and similarly for exp. The map S is monotone and commutes with the addition of a constant. It follows that the map S and the map
\[ \bar{S} : y \mapsto S(y) - N^{-1} \langle e, S(y) \rangle\, e \tag{9} \]
are nonexpansive with respect to Hopf's oscillation, see e.g. [7, Lemma 4.1].
We also note that S(log u) = log u + (log λ)e.
Given r > 0, we consider B_r := {x ∈ R^N : ‖x − log u‖_H ≤ r}. The set B_r is invariant by the map S since, using the nonexpansiveness of this map in Hopf's oscillation and the invariance of ‖·‖_H by addition of a constant, ‖S(x) − log u‖_H = ‖S(x) − S(log u)‖_H ≤ ‖x − log u‖_H ≤ r for all x ∈ B_r. The same holds for the map S̄, since it only differs from S by the addition of a multiple of the vector e. Moreover, the vector log v_0 belongs to B_r for r large enough. We fix such an r in the sequel.
Observe that S̄ is a nonexpansive self-map of the normed space X equipped with Hopf's oscillation, and that this map leaves invariant the set B_r ∩ X. The latter set is closed and bounded in the Euclidean metric, hence it is compact. It is also convex. By Theorem 9, the iterative process defined by
\[ y_{k+1} = 2^{-1} \big( y_k + \bar{S}(y_k) \big) , \tag{10} \]
initialized at any point y_0 ∈ B_r ∩ X, converges towards some vector ȳ ∈ B_r ∩ X. This limit satisfies S̄(ȳ) = ȳ. By writing v̄ = exp(ȳ) and v_k = exp(y_k), we rewrite Equation (10) to obtain the iteration in Equation (7), and observe that the condition ∏_{i∈[N]} (v_0)_i = 1 entails y_0 ∈ X. Hence, the sequence v_k converges to v̄. Recall that f is nonexpansive in Thompson's metric and that the latter induces, on the interior of the cone R^N_+, the Euclidean topology. It follows that f is continuous, with respect to this topology, on the interior of the cone. Hence, passing to the limit in (7), we obtain v̄ = (v̄ · f(v̄))^{1/2} / G((v̄ · f(v̄))^{1/2}), and so f(v̄) = G(f(v̄)) v̄, showing that v̄ is an eigenvector of f with eigenvalue G(f(v̄)) = r(f).

The following quantitative version of Theorem 11 shows that f(v_k) becomes approximately proportional to v_k as k tends to infinity, i.e., v_k is an "approximate eigenvector".

Corollary 12. Suppose that f is a monotone positively homogeneous map R^N_+ → R^N_+ having a positive eigenvector u. Then, the sequence v_0, v_1, ... constructed by the projective Krasnoselskii-Mann iteration satisfies
\[ d_H\big( f(v_k), v_k \big) \le \frac{ \operatorname{diam}(D) }{ \sqrt{\pi k} } , \tag{11} \]
where D := B_r ∩ X is the invariant set arising in the proof of Theorem 11, the diameter being taken with respect to Hopf's oscillation.

Proof. Since d_H(αx, βy) = d_H(x, y) for all α, β > 0, and since f is positively homogeneous, we may assume that ∏_{i∈[N]} (v_0)_i = 1, and so log v_0 ∈ X. Then, it follows from the final part of the proof of Theorem 11 that S̄ leaves invariant D. Since d_H(f(v_k), v_k) = ‖S(y_k) − y_k‖_H = ‖S̄(y_k) − y_k‖_H, the inequality (11) follows from (6).
We also deduce that the projective Krasnoselskii-Mann iteration provides convergent lower and upper approximations of the spectral radius r(T d ).
Corollary 13. Suppose that f is a monotone positively homogeneous map R^N_+ → R^N_+ having a positive eigenvector u. Let v_0, v_1, ... be the sequence constructed by the projective Krasnoselskii-Mann iteration, and let
\[ \underline{\lambda}_k := \min_{i \in [N]} \frac{ f(v_k)_i }{ (v_k)_i } , \qquad \overline{\lambda}_k := \max_{i \in [N]} \frac{ f(v_k)_i }{ (v_k)_i } . \]
Then,
\[ \underline{\lambda}_k \le r(f) \le \overline{\lambda}_k , \tag{12} \]
and
\[ \overline{\lambda}_k / \underline{\lambda}_k = \exp\big( d_H( f(v_k), v_k ) \big) \longrightarrow 1 \quad \text{as } k \to \infty . \tag{13} \]

Proof. By definition of Hilbert's projective metric, we have d_H(f(v_k), v_k) = log(λ̄_k / λ̲_k). The inequality (12) follows from the Collatz-Wielandt formulae (Theorem 2), whereas (13) follows from (11).
Corollary 14. Suppose that the set of nonnegative matrices A is positively-irreducible, and let v_0, v_1, ... be the sequence constructed by the projective Krasnoselskii-Mann iteration applied to f = T_d (assumed to have a positive eigenvector u, see Remark 16). Then, with the notation of Corollary 13,
\[ \rho(\mathcal{A}) \le \overline{\lambda}_k \le n^{1/(d+1)} \exp\Big( \frac{ \operatorname{diam}(D) }{ \sqrt{\pi k} } \Big)\, \rho(\mathcal{A}) . \tag{14} \]

Proof. We combine the inequalities in Theorem 7 and Corollary 13.
Remark 15. One can give an a priori bound on the vector u, to get an explicit control of d_H(v_0, u), and hence of the right hand side of (14). See Lemma 16 of [4].
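The two-sided bounds of Corollary 13 are cheap to evaluate along the iterates: each step of the iteration already computes f(v_k), so the bracket [λ̲_k, λ̄_k] around r(f) comes essentially for free. A minimal sketch (the example map is a hypothetical positive linear map, used only to illustrate the bracket):

```python
def cw_bounds(f, v):
    # Collatz-Wielandt bracket for a positive vector v:
    #   min_i f(v)_i / v_i  <=  r(f)  <=  max_i f(v)_i / v_i
    # (the upper bound requires v to lie in the interior of the cone).
    ratios = [fi / vi for fi, vi in zip(f(v), v)]
    return min(ratios), max(ratios)
```

By (13), the ratio of the upper to the lower bound tends to 1 along the projective Krasnoselskii-Mann iterates, yielding convergent enclosures of r(T_d).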
Remark 16.By Proposition 4 and the Perron-Frobenius theorem, the map T d has an eigenvector with positive entries when the set of matrices given in Proposition 4 is positively-irreducible.We point out that the same iteration also converges under the weaker assumption that A is positively-irreducible, but it must be initialized with a vector belonging to the interior of a non-trivial face of the cone that is invariant by T d and that has minimal dimension.
We now show that the projective Krasnoselskii-Mann iteration converges at a geometric rate under an additional assumption. Let u denote a positive eigenvector of f, so that f(u) = λu with λ > 0, and suppose that f is differentiable at the point u. Since f is order preserving, the derivative f′(u) can be identified with a nonnegative matrix. By homogeneity of f, we have f(su) = λsu, and so, differentiating s ↦ f(su) at s = 1, we get f′(u)u = λu. Hence, by the Perron-Frobenius theorem, λ is the spectral radius of f′(u). So we can list the eigenvalues of f′(u) as λ = µ_1, µ_2, ..., µ_N, counting multiplicities, with |µ_i| ≤ λ for all i ∈ [N] \ [1]. We set
\[ \vartheta := \max_{i \in [N] \setminus [1]} \frac{ | 1 + \lambda^{-1} \mu_i | }{2} . \tag{15} \]
As soon as λ is a simple eigenvalue of f′(u), we have |1 + λ^{-1}µ_i| < 2 for all i ∈ [N] \ [1]. Then, the assumption ϑ < 1 is satisfied under this simplicity condition. The next theorem shows that this entails the geometric convergence with rate ϑ of the projective Krasnoselskii-Mann iteration.
Theorem 17. Suppose that f is a monotone positively homogeneous map R^N_+ → R^N_+ having a positive eigenvector u, normalized so that ∏_{i∈[N]} u_i = 1, suppose that f is differentiable at the point u, let ϑ be defined by (15), and suppose finally that ϑ < 1. Then,
\[ \limsup_{k\to\infty} \| v_k - u \|^{1/k} \le \vartheta . \tag{16} \]

Proof. The proof idea is inspired by the analysis of the power algorithm in [20].
The power algorithm defines the sequence v_{k+1} = f(v_k)/G(f(v_k)); i.e., the difference with the projective Krasnoselskii-Mann iteration is the damping in (7). We showed in the proof of Theorem 11 that the projective Krasnoselskii-Mann iteration, after the change of variable v_k = exp(y_k), is equivalent to the iteration y_{k+1} = 2^{-1}(y_k + S̄(y_k)). Let F := λ^{-1} Diag(u)^{-1} f′(u) Diag(u) denote the derivative of the map S at the point log u, and let M := (I − N^{-1} e e^T) F denote the derivative of S̄ at the same point. From f′(u)u = λu, we deduce that Fe = e. It follows that Me = 0. Moreover, it is shown in the proof of Corollary 5.2 of [20] that the eigenvalues of M are precisely 0, λ^{-1}µ_2, ..., λ^{-1}µ_N. Hence, the sequence y_k satisfies y_{k+1} = H(y_k), where H : y ↦ 2^{-1}(y + S̄(y)) is a self-map of the space X, with fixed point y := log u, and H′(y) = 2^{-1}(I + M) has spectral radius ϑ on X. By a standard argument (end of the proof of Corollary 5.2, ibid.), it follows that there is a neighborhood Y of y such that lim sup_{k→∞} ‖y_k − y‖^{1/k} ≤ ϑ if y_0 ∈ Y. However, we already showed in Theorem 11 that y_k does converge to y for every initial condition y_0 ∈ X. Hence, we deduce that lim sup_{k→∞} ‖y_k − y‖^{1/k} ≤ ϑ for all y_0 ∈ X. Since v_k = exp(y_k), we deduce that (16) holds.
Remark 18. Theorem 17 is easily applicable in situations in which the map f is known a priori to be differentiable, for instance when f is a polynomial map associated to a nonnegative tensor, as in [20]. It is shown in [20] that the power algorithm converges with a geometric rate ϑ′ = max_{i∈[N]\[1]} |λ^{-1}µ_i| as soon as ϑ′ < 1. The condition ϑ′ < 1 is more restrictive than ϑ < 1, as it excludes the presence of a non-trivial peripheral spectrum of f′(u). Hence, Theorem 17 improves on the results of [20], by showing that the Krasnoselskii-Mann iteration does converge geometrically under more general circumstances than the power algorithm.
Remark 19. When f is not everywhere differentiable, verifying the assumption of Theorem 17 can be difficult. In particular, the map f = T_d, defined as a finite supremum of linear maps, is differentiable except at the exceptional points x where the supremum in (4) is achieved twice. To apply Theorem 17 to the eigenproblem for T_d, we need to know a priori that the eigenvector u is a differentiability point of T_d. For certain classes of maps, including max-plus linear maps, this property can be shown to hold under some genericity assumptions, exploiting methods of [5]. However, the map T_d has an explicit structured form which makes it hard to use such genericity arguments.

Remark 20. When f is not everywhere differentiable, an easier route to a geometric convergence rate is to use the notion of semidifferential of f, as in [7]. Thus, we suppose now that f(u + h) = f(u) + f′_u(h) + o(‖h‖), where f′_u is a continuous positively homogeneous map, not necessarily linear, called the semidifferential of f at the point u. We refer the reader to [7] for more background on semidifferentials. It follows in particular from Theorem 3.8, ibid., that the map f = T_d has a semidifferential at every point. The power iteration for a semidifferentiable monotone positively homogeneous map is analysed in [7, Theorem 7.8]. It is shown there to converge with a geometric rate r(f′_u), where r, defined in the same reference, is the spectral radius with respect to the local norm attached to Hilbert's projective metric. We leave it to the reader to verify that a modification of the proof of Theorem 7.8, ibid., leads to the conclusion that the sequence generated by the projective Krasnoselskii-Mann iteration satisfies

(17)    lim sup_{k→∞} d_H(x^k, u)^{1/k} ≤ (1 + r(f′_u))/2.

When f is differentiable, it can be checked that r(f′_u) = ϑ′ = max_{i∈[N]\{1}} |λ^{-1} μ_i|, with the same notation as in Theorem 17 and in Remark 18, and so, in this case, the estimate of the convergence rate provided by (17) can be coarser than the one provided by Theorem 17. However, in the nondifferentiable case, the assumption that r(f′_u) < 1 is often easily verifiable, e.g., by Doeblin-type contraction arguments, as in the final section of [7].
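The projective Krasnoselskii-Mann iteration itself can be sketched in a few lines. In log coordinates z = log x, where Hilbert's projective metric becomes the sup-norm modulo additive constants, the iteration averages z^k with the centered map S̄(z) = S(z) − N^{-1}⟨e, S(z)⟩e, where S = log ∘ f ∘ exp. The Python sketch below is illustrative (the paper's implementation is in OCaml, and the damping constant 1/2 and the 2 × 2 test instance are assumptions of this sketch, not the paper's benchmarks).

```python
import numpy as np

def projective_km(f, x0, iters=500):
    """Projective Krasnoselskii-Mann iteration for a monotone,
    positively homogeneous map f preserving the positive cone.

    Works in log coordinates z = log x; returns an estimate (lam, u)
    of the eigenpair f(u) = lam * u.
    """
    z = np.log(np.asarray(x0, dtype=float))
    z -= z.mean()
    beta = 0.0
    for _ in range(iters):
        s = np.log(f(np.exp(z)))      # S(z) = log(f(exp(z)))
        beta = s.mean()               # log-eigenvalue estimate
        s_bar = s - beta              # centered map on the quotient space
        z = 0.5 * (z + s_bar)         # Krasnoselskii-Mann averaging step
    return np.exp(beta), np.exp(z)

# A nonlinear monotone positively homogeneous map: componentwise max of
# two matrix actions.  Its eigenvalue upper-bounds the joint spectral
# radius of {A1, A2} (illustrative 2x2 instance).
A1 = np.array([[1.0, 1.0], [0.0, 1.0]])
A2 = np.array([[1.0, 0.0], [1.0, 1.0]])
lam, u = projective_km(lambda x: np.maximum(A1 @ x, A2 @ x),
                       np.array([1.0, 2.0]))
```

Note that the iteration only requires evaluations of f (here, matrix-vector products and a componentwise max), which is the source of the scalability discussed below: no linear or semidefinite program is solved.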
6. Benchmarks. The present method has been implemented in OCaml and has been run on one core of a 2.2 GHz Intel Core i7 processor with 8 GB of RAM. We report two numerical experiments, showing respectively the convergence of the scheme and the gain in scalability. By definition of the joint spectral radius, the spectral radius of a product of N matrices in A is no larger than the N-th power of the joint spectral radius ρ(A). When such a product achieves equality, we say that the set A has a spectrum maximizing product [23] of length N. The pair {A, B} has a spectrum maximizing product of length 6, given by A²B⁴, yielding a joint spectral radius equal to 2.0273.
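The notion of spectrum maximizing product can be illustrated on a small instance. The Python sketch below (the pair of 2 × 2 matrices is a classical illustrative example, not the 5 × 5 benchmark pair of this section) enumerates all switching words of length N and returns the best lower bound ρ(A_{i1}⋯A_{iN})^{1/N} on the joint spectral radius.

```python
import numpy as np
from itertools import product

def best_product_bound(mats, N):
    """Best lower bound on the joint spectral radius obtained from
    products of length N: max over all switching words of
    rho(A_{i1} ... A_{iN}) ** (1/N)."""
    n = mats[0].shape[0]
    best = 0.0
    for word in product(range(len(mats)), repeat=N):
        P = np.eye(n)
        for i in word:
            P = P @ mats[i]
        rho = max(abs(np.linalg.eigvals(P)))
        best = max(best, rho ** (1.0 / N))
    return best

# Among the length-2 words of this classical pair, AB attains the best
# bound, giving the golden ratio as a lower bound on the JSR.
mats = [np.array([[1.0, 1.0], [0.0, 1.0]]),
        np.array([[1.0, 0.0], [1.0, 1.0]])]
bound = best_product_bound(mats, 2)   # ≈ (1 + sqrt(5)) / 2
```

The cost of this enumeration is exponential in N, which is why such lower bounds are complemented by the upper bounds λ_d produced by the hierarchy.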
We report in Table 1 the eigenvalue obtained by solving the hierarchy (E_d) for 1 ≤ d ≤ 9, as well as the computation time. We observe that the hierarchy is stationary at d = 7 and that we recover the exact value of the joint spectral radius. The last column indicates the relative error (λ_d − ρ(A))/ρ(A). Finally, we also observe that the computation time grows exponentially with the level d of the hierarchy.

6.2. Scalability of the approach. We demonstrate the scalability of our method on quadruplets of matrices of increasing size, with random entries between 0 and 0.9. We show in Table 2 the computation time associated with each dimension. The iteration process converges in fewer than 50 iterations in all examples, with a 10^{-6} numerical stopping criterion. A monotone extremal hemi-norm has been computed as the supremum of 16 or 64 linear forms (respectively for d = 2 and d = 3).

Conclusion.
We have proposed a new approach for computing a convergent sequence of upper bounds of the joint spectral radius of nonnegative matrices, by solving a hierarchy of non-linear eigenproblems. At any level of this hierarchy, the non-linear eigenvalue λ provides an upper bound for the joint spectral radius, whereas the eigenvector encodes a monotone λ-extremal norm. The non-linear eigenproblem is solved efficiently by a projective version of the Krasnoselskii-Mann iteration. We have implemented this approach, and the numerical results demonstrate its scalability, compared to other works based on the solution of optimization problems.

Corollary 14.
Let f := T_d and suppose that T_d has a positive eigenvector u. Then the sequence (β_k)_k defined in Corollary 13 converges geometrically. To see this, set S̄(z) = S(z) − N^{-1}⟨e, S(z)⟩e, where S(z) = log ∘ f ∘ exp(z). Let y := log u, and let δ(u) denote the diagonal matrix with entries u_1, . . ., u_N. A simple computation shows that the matrix F := S′(y) satisfies F = λ^{-1} δ(u)^{-1} f′(u) δ(u).
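The closed form of F can be sanity-checked numerically. The Python sketch below (the 2 × 2 nonnegative matrix and its Perron pair are an illustrative assumption) compares a central finite-difference Jacobian of S(z) = log(f(exp z)) at y = log u with the claimed expression λ^{-1} δ(u)^{-1} f′(u) δ(u), for the linear map f(x) = Ax.

```python
import numpy as np

# Illustrative nonnegative matrix with Perron pair A u = lam u.
A = np.array([[1.0, 2.0], [3.0, 2.0]])
lam, u = 4.0, np.array([2.0, 3.0])   # indeed A @ (2, 3) = (8, 12) = 4 (2, 3)

def S(z):
    """S = log o f o exp for the linear map f(x) = A x."""
    return np.log(A @ np.exp(z))

y = np.log(u)
eps = 1e-6
J = np.zeros((2, 2))
for j in range(2):
    h = np.zeros(2)
    h[j] = eps
    J[:, j] = (S(y + h) - S(y - h)) / (2 * eps)  # central differences

# Claimed closed form F = lam^{-1} delta(u)^{-1} f'(u) delta(u);
# for f(x) = A x, the derivative f'(u) is simply A.
F = np.diag(1.0 / u) @ A @ np.diag(u) / lam
```

Note that F is row-stochastic (each row sums to 1), reflecting that the constant direction e is fixed by S′(y), which is why the iteration is performed on the quotient by constants.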

6.1. Convergence of the hierarchy. We illustrate the convergent nature of the hierarchy on a pair of 5 × 5 matrices A and B.

TABLE 1. Convergence of the hierarchy on 5 × 5 matrices. Columns: level d, CPU time (s), eigenvalue λ_d, relative error.