CLOSED-FORM EXPRESSION FOR THE INVERSE OF A CLASS OF TRIDIAGONAL MATRICES

Despite the simplicity of tridiagonal matrices, they have proven remarkably resistant to closed-form inversion. We consider a class of tridiagonal stiffness matrices that stems from a variety of lumped element models in mechanical, acoustical and electrical systems. The computational effort in such models lies in solving the generalized eigenvalue problem and in finding the inverse of the stiffness matrix. To improve accuracy, it is desirable to discretize the problem as finely as possible, at the expense of growing matrices. This paper improves the efficiency of finding the inverse by a factor of at least three, and the computational memory involved is at least halved. Moreover, the result provides an analytical expression for the stable position, which might be used in control systems. Surprisingly, it is the practical application itself that guides the proof.


1. Motivation. In this paper we develop an expression for the inverse of the n-dimensional square matrix K, which is given by

        | k_1+k_2    -k_2                                |
        |  -k_2     k_2+k_3    -k_3                      |
    K = |            -k_3     k_3+k_4    ...             |
        |                       ...       ...    -k_n    |
        |                               -k_n  k_n+k_{n+1}|,

where the k_i's can be any complex numbers satisfying the restriction that the determinant is not equal to zero. The matrix K appears in the canonical example where springs connect point masses in a sequence. Describing this problem leads to a system of second-order ordinary differential equations, which is treated in many mathematical textbooks. A description of the mechanical aspect of the mass-spring problem is given in [13], but in that book n is limited to two. The extension to n dimensions is given, for instance, in the book of Demmel [3]. The sequential mass-spring example is used several times to illustrate numerical methods in linear algebra, such as finding eigenvalues. Chapter 12 in [12] contains a discussion of other physical systems that obey the same set of differential equations; that is, these systems are analogous to the sequential mass-spring example. From [12], it is clear that K appears in mechanical systems involving rotation and in electrical systems as well.
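As a concrete illustration, K is straightforward to assemble from the spring constants. The following sketch is ours (the paper's experiments used Matlab; Python/numpy and the function name are our assumptions for illustration):

```python
import numpy as np

def stiffness_matrix(k):
    """Assemble the n x n tridiagonal stiffness matrix K from the
    n + 1 spring constants k[0] = k_1, ..., k[n] = k_{n+1}."""
    n = len(k) - 1
    K = np.zeros((n, n))
    for i in range(n):
        K[i, i] = k[i] + k[i + 1]      # diagonal entry: k_i + k_{i+1}
        if i + 1 < n:
            K[i, i + 1] = -k[i + 1]    # superdiagonal: -k_{i+1}
            K[i + 1, i] = -k[i + 1]    # subdiagonal (K is symmetric)
    return K

K = stiffness_matrix([1.0, 2.0, 3.0, 4.0])  # example with n = 3
```

For the example values, this yields the 3 x 3 matrix with diagonal (3, 5, 7) and off-diagonal entries -2 and -3.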
In [3], the mass-spring problem is horizontal, but as soon as the gravity forces are not perpendicular to the directions of the springs, the stable position of the masses is shifted by a term that is expressed through K^{-1}. This motivates finding an analytical expression for K^{-1}. We will return to this discussion in section 2.
The literature on the inverse of tridiagonal matrices consists of papers involving general approaches and papers involving special cases.
Meurant's paper [10] includes a well-written review of the literature on the inverse of symmetric tridiagonal matrices. We will not repeat that discussion here, but focus on the main results that are relevant for this study. The special cases involve closed-form expressions related to Toeplitz matrices. A Toeplitz matrix is a matrix where all α_i = α, all β_i = β and all γ_i = γ. Paper [7] contains a closed-form expression for symmetric Toeplitz matrices, while properties of general Toeplitz matrices are given in [9]. In [6], an analytical solution is given for a pseudo-Toeplitz matrix; this matrix is symmetric Toeplitz except that the first and the last element on the diagonal are different. It is clear that K is not a special case of any of these papers.
Typically, the goal of the general approaches is to describe general properties of the matrices, where the ultimate goal is to find numerical algorithms for efficient and stable inversion of large matrices, described in for instance [8] and [4]. Of particular interest are some papers where the entries of the inverse are given by recurrence relations.
The most general result in this direction is the result from Usmani [15]. For a tridiagonal matrix T with diagonal entries α_i, superdiagonal entries β_i and subdiagonal entries γ_i, it states that

(1)   (T^{-1})_{ij} = (-1)^{i+j} β_i ··· β_{j-1} θ_{i-1} φ_{j+1} / θ_n   for i ≤ j,
      (T^{-1})_{ij} = (-1)^{i+j} γ_j ··· γ_{i-1} θ_{j-1} φ_{i+1} / θ_n   for i > j.

Here, the θ_i's and the φ_i's are given by the recurrence relations

(2)   θ_i = α_i θ_{i-1} - β_{i-1} γ_{i-1} θ_{i-2},   φ_i = α_i φ_{i+1} - β_i γ_i φ_{i+2},

with the initial conditions θ_0 = 1, θ_1 = α_1, φ_{n+1} = 1 and φ_n = α_n. In this expression, θ_n is the determinant of the matrix and is required to be nonzero for the inverse to exist. Simply inserting our problem into this framework, with α_i = k_i + k_{i+1} and β_i = γ_i = -k_{i+1}, yields

      θ_i = (k_i + k_{i+1}) θ_{i-1} - k_i^2 θ_{i-2},   φ_i = (k_i + k_{i+1}) φ_{i+1} - k_{i+1}^2 φ_{i+2},

where the initial conditions are θ_0 = 1, θ_1 = k_1 + k_2, φ_{n+1} = 1 and φ_n = k_n + k_{n+1}. It seems difficult to find a closed-form expression for K^{-1} directly from these expressions. Another route to finding K^{-1} is using the fact that the inverse of an invertible irreducible tridiagonal matrix is a generator-representable semiseparable matrix and vice versa. This is proven on page 42, in corollary 1.39 of [16]. This means that the lower triangular part tril(K^{-1}) is equal to tril(uv^T) and the upper triangular part triu(K^{-1}) is equal to triu(pq^T), where u, v, p and q are column vectors of length n.
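To make the recurrences concrete, the following sketch (Python/numpy for illustration; the function name and example values are our assumptions) computes the θ_i's and φ_i's for our K and checks θ_n against the determinant:

```python
import numpy as np

def theta_phi(k):
    """Usmani's theta and phi recurrences specialized to K, where
    alpha_i = k_i + k_{i+1} and beta_i = gamma_i = -k_{i+1}.
    Input: k[0] = k_1, ..., k[n] = k_{n+1}."""
    n = len(k) - 1
    theta = [0.0] * (n + 1)
    theta[0] = 1.0
    theta[1] = k[0] + k[1]                 # theta_1 = k_1 + k_2
    for i in range(2, n + 1):
        # theta_i = (k_i + k_{i+1}) theta_{i-1} - k_i^2 theta_{i-2}
        theta[i] = (k[i - 1] + k[i]) * theta[i - 1] - k[i - 1] ** 2 * theta[i - 2]
    phi = [0.0] * (n + 2)
    phi[n + 1] = 1.0
    phi[n] = k[n - 1] + k[n]               # phi_n = k_n + k_{n+1}
    for i in range(n - 1, 0, -1):
        # phi_i = (k_i + k_{i+1}) phi_{i+1} - k_{i+1}^2 phi_{i+2}
        phi[i] = (k[i - 1] + k[i]) * phi[i + 1] - k[i] ** 2 * phi[i + 2]
    return theta, phi

k = [1.0, 2.0, 3.0, 4.0]                   # n = 3, arbitrary example values
theta, phi = theta_phi(k)

# theta_n should equal det(K); phi_1 yields the same determinant
K = np.diag([k[i] + k[i + 1] for i in range(3)]) \
    - np.diag(k[1:3], 1) - np.diag(k[1:3], -1)
assert abs(theta[3] - np.linalg.det(K)) < 1e-9
```

For these example values both recurrences return the determinant 50, as expected.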
We can also use the fact that when K is symmetric, then triu(K^{-1}) = triu(vu^T), as reported in [1] and [5]. In [10], recurrence relations for these entries are given. Translating to our problem gives a recurrence (3) for the v_i's and the auxiliary quantities e_i, together with

(4)   u_n = v_n / e_n,   u_{n-i} = (k_{n-i+1} ··· k_n) / (e_{n-i} ··· e_n) · v_n   for i = 1, ···, n - 1.

As long as the matrix is weakly diagonally dominant, which K is, this algorithm is computationally stable, as reported in [10]. Finding the v_i's requires 6n - 3 operations, while finding the u_i's requires 6n - 2 operations. This totals 12n - 5 operations and a storage space of 4n.
To compute the full matrix, n(n+1)/2 further operations are required, which dominates the computation. However, in many applications the inverse of the matrix is multiplied by a vector or used in a way where the full matrix is not needed. In those cases, we only need to compute the u_i's and the v_i's. Since K is weakly diagonally dominant, this is the gold standard against which we compare our new expression for K^{-1}.
In this paper, we seek an analytical solution and we show that it computes the inverse faster and requires less storage space. Moreover, u and v can be calculated from this solution and computation times can be compared directly.
In order to find the inverse, we take inspiration from Newton and the laws of physics. In section 2, the mass-spring element model is outlined and an expression for K^{-1} is proposed. The proposition is proven, and the computational efficiency is discussed in section 3. Some final remarks are given in section 4.

Figure 1. The figure shows a rod that is modeled as five springs and four blocks. On the left side, we see the rod and its model when it is nowhere in tension nor in compression. On the right side, we see the effect of gravity on the rod and its model. Since the radius of the rod is non-uniform, the m_i's and the k_i's are not all equal.

2. Mass-spring element model of a rod. We consider a vertical rod with non-uniform radius that is mounted at the bottom and at the top. When the rod was mounted, it was nowhere in tension nor in compression. However, because of gravity, parts of the rod go into tension and parts into compression. We model this rod as a set of n blocks connected by n + 1 spring elements. This is illustrated in figure 1.
A one-dimensional coordinate system is introduced, where the positive direction points downwards. The center point of each block is denoted X_i(t), where t is time. In the hypothetical case where no spring is in compression or in tension, these coordinates are denoted X_{s,i}. The distance between X_{s,i} and X_i(t) is defined to be q_i(t); that is, X_i(t) = X_{s,i} + q_i(t). The physical state of the rod at any time is therefore uniquely defined by the generalized coordinates q_i(t).
Newton's second law applied to each block gives

(5)   m_i q̈_i(t) = -k_i (q_i(t) - q_{i-1}(t)) + k_{i+1} (q_{i+1}(t) - q_i(t)) + m_i g,   with q_0 = q_{n+1} = 0,

where g is the gravity constant, m_i is the mass of block i and k_i is the spring constant of the spring above block i. Equation (5) is a system of n coupled second-order ordinary differential equations. It can be written in matrix form as

(6)   M q̈ + K q = g,

where M is a diagonal matrix with the m_i's on the diagonal and the elements of g are the constants m_i g. Here, the coefficients of the matrix K are restricted to be positive. If we set g = 0 and k_{n+1} = 0, the mathematical description reduces to the equations in Demmel's book [3]. In Demmel's book, the springs and the masses are laid out horizontally and the last spring does not exist.
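The equivalence between the per-block force law (5) and the matrix form (6) can be checked numerically. The sketch below is ours (Python/numpy, with arbitrary illustrative values) and compares the net force on each block computed both ways:

```python
import numpy as np

# Hypothetical data: 4 blocks, 5 springs, unit gravity constant.
k = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # k_1, ..., k_{n+1}
m = np.array([1.0, 1.5, 2.0, 2.5])        # m_1, ..., m_n
g = 1.0
n = len(m)

# Stiffness matrix K: diagonal k_i + k_{i+1}, off-diagonals -k_{i+1}
K = np.diag(k[:-1] + k[1:]) - np.diag(k[1:-1], 1) - np.diag(k[1:-1], -1)

q = np.array([0.1, -0.2, 0.3, 0.05])      # arbitrary displacements

# Equation (5): force on block i, with q_0 = q_{n+1} = 0 at the mounts
qe = np.concatenate(([0.0], q, [0.0]))
f_blocks = np.array([m[i] * g - k[i] * (qe[i + 1] - qe[i])
                     + k[i + 1] * (qe[i + 2] - qe[i + 1]) for i in range(n)])

# Equation (6): the same forces in matrix form, m_i g - (K q)_i
f_matrix = m * g - K @ q

assert np.allclose(f_blocks, f_matrix)
```

Both expressions yield the same right-hand side m_i q̈_i, confirming that K encodes the spring couplings of the chain.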
If we make the coordinate transformation q = y + K^{-1}g, then equation (6) becomes

(7)   M ÿ + K y = 0,

which is homogeneous. Notice that when the initial conditions are y(0) = 0 and ẏ(0) = 0, then y stays at zero. The solution of this system fluctuates around its stable position, which is y = 0 [14]. The stable position in terms of q is K^{-1}g.
In fact the elements of the vector K −1 g are the vertical displacements of all the blocks in figure 1.
In the stable position, when nothing is moving, we see from Newton's first law that the forces on each block balance, that is, K q = g. Clearly, we can then reason that the displacement of each block is a sum of spring elongations, each elongation being a spring force divided by the corresponding k_i. This inspires the following theorem:

Theorem 2.1. If θ_n, as given in equation (2), is not equal to zero, then

      K^{-1} = A - B,   where A_{ij} = c_{min(i,j)},   B_{ij} = c_i c_j / c_{n+1}   and   c_i = 1/k_1 + ··· + 1/k_i.

Notice that the k_i's are now allowed to be complex, which is more general than in the example with the rod. Moreover, this result is in compliance with Asplund [1] and Fadeev [5]. In particular, we can use theorem 2.1 to simplify equations (3) and (4), since the generator vectors can be read off directly: v_i = c_i and u_i = (c_{n+1} - c_i)/c_{n+1}. It is known that if one of the k_i's is zero, the symmetric tridiagonal problem reduces to two smaller sub-problems, as discussed in [2]. In theorem 2.1, some of the k_i's are allowed to be zero as long as the determinant is not zero. Division into sub-problems is therefore not necessary as long as θ_n is not equal to zero. It is interesting that this is the case, despite the fact that some of the c_i's become infinite.
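A quick numerical sanity check of theorem 2.1 is straightforward. The following Python/numpy sketch (ours; the function name is an assumption) builds A - B from the c_i's and compares it with a general-purpose inverse:

```python
import numpy as np

def inverse_closed_form(k):
    """K^{-1} = A - B as in theorem 2.1:
    c_i = 1/k_1 + ... + 1/k_i, A_ij = c_min(i,j), B_ij = c_i c_j / c_{n+1}."""
    c = np.cumsum(1.0 / np.asarray(k, dtype=float))   # c_1, ..., c_{n+1}
    n = len(c) - 1
    idx = np.minimum.outer(np.arange(n), np.arange(n))
    A = c[idx]                                        # A_ij = c_min(i,j)
    B = np.outer(c[:n], c[:n]) / c[n]                 # B_ij = c_i c_j / c_{n+1}
    return A - B

k = [1.0, 2.0, 3.0, 4.0]                              # arbitrary example, n = 3
K = np.diag([k[i] + k[i + 1] for i in range(3)]) \
    - np.diag(k[1:3], 1) - np.diag(k[1:3], -1)
Kinv = inverse_closed_form(k)
assert np.allclose(Kinv, np.linalg.inv(K))
```

Note that only the c_i's need to be stored; the entries of A and B are then available on demand.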
Conversely, when none of the k_i's is zero, all c_i's are finite and the inverse exists whenever θ_n is nonzero. Notice from the closed-form solution that unless some of the k_i's are very close to zero, the solution is robust: it is basically a matter of summing reciprocal k_i's. In a practical problem, we typically discretize a physical object into a set of discrete entities, which means that K grows in size. Moreover, the k_i's increase in magnitude. If we, for instance, model a uniform rod with stiffness k, then the partial stiffnesses will be k times the number of elements. This means that the more we discretize, the more stable the inversion becomes.
Proof. In order to prove theorem 2.1 rigorously, we need to show that K(A - B) = I and also that (A - B)K = I. The two arguments are very similar, and we will only show that K(A - B) = I.
It makes sense to let K = K_1 + K_2, where K_2 is zero everywhere except for the entry (K_2)_{nn} = k_{n+1}. The proof is concluded when we have shown that

      K_1(A - B) + K_2(A - B) = I.

First, we see that K_1 = E K̃ E^T, where K̃ is a diagonal matrix with the n first k_i's on the diagonal and E is bidiagonal with ones on the diagonal and minus ones on the superdiagonal. A direct computation, using c_i - c_{i-1} = 1/k_i, then gives

      K_1(A - B) = I - e_n c^T / c_{n+1},

where e_n is the n-th unit vector and c = (c_1, ···, c_n)^T. As K_2 only contains one element, it is clear that K_2(A - B) is a matrix with zeros everywhere except on the last row. In fact, the last row is

      k_{n+1} (c_{n+1} - c_n) (c_1, ···, c_n) / c_{n+1} = (c_1, ···, c_n) / c_{n+1},

since c_{n+1} - c_n = 1/k_{n+1}. Adding the two products gives the identity, which concludes the proof.

3. Computational efficiency of computing large matrices. Computing the closed-form expression is very fast. Computing the c_i's requires 2n - 1 operations, and the storage is approximately n. In many applications the actual inverse is not needed, and further quantities can be derived from the c_i's. For comparison, computing the u_i's and the v_i's involves 4n - 1 operations, which is much less than the 12n - 5 operations needed when using equations (3) and (4). Moreover, in the special case k_{n+1} = 0, the full matrix is obtained after just 2n - 1 operations. This is a drastic improvement over the method in [10], where there is an O(n^2) term for computing the entries of the matrix.

In order to discuss the efficiency of inverting large matrices, we compared the computation of the u_i's and the v_i's using the new method and the standard method, here defined as using the recurrence relations in equations (3) and (4). We generated n k_i's with a random generator, choosing them between 1 and 2 so that none of them could be too close to zero. The code was implemented in Matlab (MathWorks Inc.) on an iMac with a 3.5 GHz Intel Core i7 processor and 32 GB of 1600 MHz DDR3 memory. Table 1 shows the computational speed for various n. For large n, we see the expected linear relationship between CPU time and n. Moreover, the new method is roughly three times faster than the standard method, which is expected since it involves three times fewer operations.
When n = 10 9 , the standard method is more than four times slower. The reason for this may be that the standard method requires more storage space than the new method.
It is also worth mentioning that in many applications it is not necessary to compute the v_i's, since they are directly derived from the u_i's. In this case, the computation time is even less than what is reported here.
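The point that the full matrix is often unnecessary can be sketched as follows: with generator vectors consistent with theorem 2.1 (our labels: v_i = c_i and u_i = (c_{n+1} - c_i)/c_{n+1}), the product K^{-1}b can be formed in O(n) with two prefix sums. This Python/numpy sketch is ours, not the paper's Matlab code:

```python
import numpy as np

def apply_inverse(k, b):
    """Compute K^{-1} b in O(n) via the semiseparable structure:
    (K^{-1})_{ij} = u_i v_j for i >= j and v_i u_j for i < j,
    with v_i = c_i and u_i = (c_{n+1} - c_i) / c_{n+1}."""
    c = np.cumsum(1.0 / np.asarray(k, dtype=float))
    v = c[:-1]
    u = (c[-1] - c[:-1]) / c[-1]
    b = np.asarray(b, dtype=float)
    low = np.cumsum(v * b)                 # partial sums over j <= i
    upp = np.cumsum((u * b)[::-1])[::-1]   # partial sums over j >= i
    upp_excl = np.append(upp[1:], 0.0)     # partial sums over j > i
    return u * low + v * upp_excl

k = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 2.0, 3.0]
K = np.diag([k[i] + k[i + 1] for i in range(3)]) \
    - np.diag(k[1:3], 1) - np.diag(k[1:3], -1)
x = apply_inverse(k, b)
assert np.allclose(K @ x, b)
```

Neither the full inverse nor the full matrix K is ever formed; only vectors of length n are stored.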

Matrix size n    Cpu time in seconds
                 Standard method    New method
10^5             0.01               0.00
10^6             0.07               0.03
10^7             0.73               0.40
10^8             7.62               2.22
10^9             98.94              22.19

Table 1. Computational speed of computing the u_i's and the v_i's using the standard method, i.e. solving equations (3) and (4), and the new method, which involves computing the c_i's.
4. Final remarks. The theorem provides an analytical and comprehensible insight into the stable position of the problem described by equation (6). The solution of the differential equations oscillates around K^{-1}g. If we add a damping term to the problem, the system will be either under-, over-, or critically damped. In all cases, the behavior moves toward the stable position.
In many real systems, the coefficients in the matrices may have a time dependence. This complicates the full solution of the differential equations, but the "stable position" can still be calculated from the inverse at each time step. Solving such a system can be done with, for instance, adaptive observers [11]. It is possible that the closed-form expression for the inverse can help in regulating the adaptation.
A last point regarding the consequences of theorem 2.1 is that it allows a discussion of the limit as n goes to infinity. The lumped element model simplifies the description of a spatially distributed system into a topology of n entities. This simplification can be studied by letting n go to infinity. Such discussions are difficult without analytical expressions for the inverse.