Energy Minimization and Preconditioning in the Simulation of Athermal Granular Materials in Two Dimensions

Granular materials are heterogenous grains in contact, which are ubiquitous in many scientific and engineering applications such as chemical engineering, fluid mechanics, geomechanics, pharmaceutics, and so on. Granular materials pose a great challenge to predictability, due to the presence of critical phenomena and large fluctuation of local forces. In this paper, we consider the quasi-static simulation of the dense granular media, and investigate the performances of typical minimization algorithms such as conjugate gradient methods and quasi-Newton methods. Furthermore, we develop preconditioning techniques to enhance the performance. Those methods are validated with numerical experiments for typical physically interested scenarios such as the jamming transition, the scaling law behavior close to the jamming state, and shear deformation of over jammed states.

1. Introduction. Granular materials are conglomerates of discrete, macroscopic, solid particles, such as sand, soils, pills, etc. They are ubiquitous in many industrial and natural processes. Analytical study is usually difficult due to the complicated nonlinear heterogeneous multi-body interactions in the dense granular system. Numerical methods have remarkable significance in the study of granular materials since most perturbative analytical methods in condense matter physics requires either the low density in the ideal gas limit or a prefect lattice for crystals. On the meanwhile, it is difficult for experimental methods to deal with large number of particles, measure physical quantities in 3D and investigate phase transitions such as jamming phenomena quantitatively [9,24]. Numerical simulations allow access to all detailed information of the particle system, and usually it is straightforward to calculate all the relavant quantities (e.g. fabric, stress, energy) of the system. However, it remains challenging to develop efficient and robust numerical methods for granular system, especially for large particle systems. Molecular dynamics and energy minimization are two main classes of simulation methods. Discrete element method (soft-particle method) is one of the typical molecular dynamics methods in granular applications [11], where a set of Langevin equations of motion are formulated, with relaxation and random fluctuation terms to quantify the physical relaxation time scale and thermal effect, respectively. Related methods include event-driven method, non-smooth contact dynamics, etc, see references in [10]. On the other hand, energy minimization methods can be applied to the situation of quasi-static deformation, when the applied load changes slowly over time with respect to the inertial forces. One thus can significantly speed up the simulation if time scales are not needed to be fully resolved. Also, the high accuracy of the energy minimization methods provides advantage for the quantitative study of the jamming transition and the scaling law behavior close to the jamming point. Although molecular dynamics method have been extensively used in the simulation of granular materials, the application of energy minimization methods is relatively recent. For example, Luding et. al. have applied trust region methods to study the relaxation and shear of granular system [10]. In this paper, we will study the performance of several typical minimization methods for the granular simulation, such as conjugate gradient methods and quasi-Newton type methods.
Preconditioning is the main bottleneck for the development of efficient and robust numerical algorithms for large scale molecular simulation [1,18], boh in the case of molecular dynamics and energy minimization. General purpose preconditioners are not specifically targeted at large-scale atomistic systems, and are not particularly effective. In this paper, we will also stress on the development of preconditioning techniques for the efficient energy minimization of granular systems.
Remark 1.1. The molecular details of molecular dynamics methods such as softparticle methods and energy minimization methods are usually not the same. However, it is important to note that the equilibrium physical quantities such as fabric, stress and energy remain similar [10]. Also, there is no temperature in the granular model simulated by energy minimization methods, namely, we are only interested in athermal granular materials .
1.1. Outline. The paper is organized as follows. In Section 2, we present the physical model for the granular system, as well as the critical jamming transition which is important for the numerical investigation of dense granular system. We present the optimization and preconditioning techniques used for granular simulation in Section 3. The numerical methods are validated and benchmarked in the numerical experiments in Section 4. We conclude in Section 5 and point out some future improvements.
2. Physical Setup. In this section, we introduce the basic physical models. The potential energy of the system is the sum of interaction energy of all particles. It is important to note that particles can interact unless they overlap. In other words, there is only repulsive force therefore the system is qualitatively different from the repulsive-attractive forces such as Lennard-Jones [4]. For simplicity, we only consider the contact models without tangential forces, and also only the two dimensional case.
2.1. Granular configuration. An admissible configuration C of particles is given by a list of center x i and radius r i of particles. The volume (or packing) fraction φ is a key parameter in the simulation of granular materials, PRECONDITIONING IN GRANULAR SIMULATION   3 where V is the volume of the containing domain (container). The geometric structure of the granular materials can be characterized by the following fabric tensor, 2.2. Contact Model. Here we will introduce a contact potential of the following form, Remark 1. The exponent α ≥ 2 identifies the nonlinearity of the pair potentials. When α = 2, the interaction between particles follows the Hooke's law, which models a repulsive harmonic spring. The force between two overlapped particles linearly depends on the overlap. Hooke's model is one of the simplest contact model, which is still of great importance. From equation (6) and (8), it is clear that the first derivatives of the Hooke potential are continuous at δ c = 0, while the second derivatives are discontinuous there. When α > 2, we obtain the so-called Modlin-Hertzian model or Hertzian model, which is based on the Hertz contact theory [8].
The force between two overlapped particles has a power law nonlinearity. Both the first and second derivatives of the Hertzian potential are continuous if α > 2. We take α = 5/2 in the numerical examples of this paper.
2.3. Macroscopic Variables. The average stress tensor can be calculated by the following formula, where V is the volume of the containing domain and C is the collection of all the contact points in the domain. According to equation (9), the pressure of the system can be defined as, where d is the dimension of the domain. The bulk modulus can then be defined as, For a simple shear [16] such that the deformation matrix is given by the shear stress is −σ 12 , and the shear modulus can be calculated by: In addition, the average contact number or coordination number, denoted byz, is a key geometric parameter in granular simulation. For a granular system consisting of N particles,z can be simply defined by, z = |C| N where |C| is the number of all contact points.
We note that there exist particles with zero contacts, namely, the so called "rattlers". In addition, in our 2D frictionless circular disk system, some particle are called "dynamic rattlers" [5] if their numbers of contacts are less than 3, which can lead to mechanical instability. Thus, these rattlers are excluded from the calculation of the coordination number. We denote the number of particles with at least 3 contacts by N 3 , and the set of contact points for particles with at least 3 contacts by C 3 . The modified definition of coordination number is, 2.4. Jamming formation. The jamming transition is an important physical process in granular system [12,17,23], which refers to the critical slowing down of the system due to "overcrowded" particles. Close to the transition point, the geometric constraints prevent the system to explore the phase space, and the system changes its behavior from gas or liquid like to solid like. During the transition, the system has to adjust its configurations more and more in a collective manner instead of via local movements, in order to achieve energy equilibrium. This corresponds to the essential difficulty for the numerical simulation which will be discussed later.
In this paper, we adopt the approach by O'Hern et al. [17] to generate a jamming configuration: An initial configuration is sampled with sufficiently low packing fraction φ init such that there is no overlapping particles after relaxation (energy minimization). The total energy and pressure for the relaxed system with volume fraction φ init is 0. Starting from φ init , we repeatedly increase the volume fraction φ with a small increment δφ by enlarging the size of particles accordingly, and then carry out energy relaxation. At some critical fraction φ cr , there will be no free space for particles to explore and unavoidably, they will come into contact. As the system is further compressed, some particles overlap, and the total energy and pressure become nonzero and starts to grow. The system will also posses nonzero bulk modulus since the pressure increases upon compression. φ cr is called the critical volume fraction or jamming point. In 2D bi-disperse system with number ratio 1:1 and size ratio 1.4:1, φ cr is approximately 0.842 [10].
We can adopt a bisection algorithm to increase the accuracy of φ cr . Let φ app cr be an approximate critical fraction associated with a configuration with nonzero energy and pressure, the previous unjammed configuration with zero energy and pressure has the volume fraction φ pre cr := φ app cr − δφ. We can then compress the unjammed configuration at φ pre cr by increasing its volume fraction to φ m = φ pre cr + δφ/2, and see if the jamming transition already happens at φ m . This procedure can be repeated until for example, δφ < ε, where ε represents the desired accuracy. We note that this procedure can achieve very high accuracy for the determination of the jamming point, which is not feasible for experimental methods.
The overall procedure is summarized in the following algorithm for the generation of the jamming configuration. Denote by min E(φ, C ref ) the energy minimization in the configuration space C with the packing fraction φ and the initial configuration C ref , which can be solved by certain optimization algorithm in Section 3. The minimizer is denoted by C eq = arg min E(φ, C ref ). Here we use a square shaped container Ω and the periodic boundary condition for the granular system. Algorithm 1 Generation of the Jamming Configuration. Input: particle number N ; initial packing fraction φ init ; increment δφ; prescribed accuracy ε. Output: critical volume fraction φ cr ; jamming configuration C jam with φ cr . 1: Generate an initial configuration C init with N particles and volume fraction Fix the position of the particles, and enlarge their radii uniformly to generate a new configuration C ref with volume fraction φ;

18:
Let C eq = arg min(φ m , C m ), and calculate E(C eq ) and p(C eq ). 19: end if 24: end while 25: Denote φ cr = φ app cr , C jam = C eq . 26: return φ cr , C jam ; 3. Numerical Method. The minimization of the total potential energy (3) relies on the state-of-the-art optimization and precondition techniques which will be presented in this section. The performance of those methods is contingent upon the particular applications. In particular, the construction of a good preconditioner, can be regarded as "an art rather than a science" [22].
3.1. Energy minimization. In this paper, we use the limited-memory Broyden-FletcherGoldfarbShanno (L-BFGS) method [14] and Fletcher-Reeves conjugate gradient (FR-CG) method [15] for the energy minimization of granular system. We denote the total energy of a system as f (x) where x is the positions of all particles. If x (k) is the iterate at step k, we denote the energy at step k by f (k) = f (x (k) ), and the gradient at step k by g (k) = ∇f (k) .
3.1.1. L-BFGS method. L-BFGS is one type of the quasi-Newton methods, namely, it utilizes an approximation of the inverse Hessian matrix to generate the search direction. Instead of the full dense n × n approximations to the inverse Hessian (n is the number of variables, which is N d with N number of particles and d the dimension in our problem), L-BFGS uses a low rank approximation with only a few vectors of length n to represent the approximation implicitly.
We denote the approximate inverse Hessian at step k by B (k) . Assuming we have stored the last m updates of the form s can be obtained using the following algorithm.
Algorithm 2 L-BFGS two loop recursion. in the boxed step in the algorithm is an rough approximation to the inverse of the exact Hessian, and an effective choice is: This choice ensures that the search direction is well scaled and therefore the unit step length is accepted in most iterations. Besides, the diagonal matrix makes it much simpler to compute the multiplication r = B (k) 0 q.

FR-CG method.
We use the Fletcher-Reeves variant [6] of the nonlinear conjugate gradient (CG) method [19]. As an extension of the linear conjugate gradient method [7], the search direction p (k) at kth step of FR-CG method is defined by,
Once we calculate the new search direction using either L-BFGS in Algorithm 2 or FR-CG in Algorithm 3, the next iteration is given by where α (k) is chosen by a line search method which satisfies Wolfe conditions or strong Wolfe conditions [15], such that the update is stable.

3.2.
Preconditioning. Preconditioning is important for the efficiency of the minimization algorithms, especially for large scale problems. We have to choose the preconidtioner matrix P (k) such that it is similar to the exact Hessian, and at the same time easy to invert.

Preconditioned minimization algorithm.
For L-BFGS algorithm 2, we replace the boxed step in the algorithm with: in order to obtain a preconditioned L-BFGS algorithm. For FR-CG algorithm 3, we have the following preconditioned version for p (k) .

Algorithm 4 Preconditioned FR-CG method.
1: y (k) = P (k),−1 g (k) ; The equation (15) or Step 1 in Algorithm 4 is equivalent to solve a linear system: An alternative point of view for the preconditioning is to make a change of coordinates: x = P (k),1/2 x. Considering the function F ( x) = f (P (k),−1/2 x), we have Applying the L-BFGS algorithm 2 or FR-CG algorithm 3 to optimize the transformed function F (x), we can obtain the preconditioned version of the corresponding algorithm.

3.2.2.
Construction of preconditioners for large scale molecular system. Preconditioning is well established in numerical linear algebra and numerical PDE problems [22]. However, in many applications general purpose preconditioners do not work particularly well, and specifically designed preconditioned are preferable. For large scale atomic/molecular simulations, Packwood et. al. proposed the so-called "universal preconditioner" P [18]. They have successfully applied this preconditioner to atomic simulation and electronic structure calculation of crystalline system of the size from hundreds to 10 4 atoms, with a speedup of order 1 or 2 [13]. In this paper, we will test the performance of those universal preconditioners for granular systems.
The universal preconditioner P is defined via the quadratic form Here r cut is the cutoff radius, µ is the energy scale, A is a parameter which should be large enough to ensure that the nearest neighbor interactions dominant. The numerical experiments in [18] and in our numerical results indicate that as long as A is of order 1, it does not significantly affect the performance of the preconditioner.
In particular, if we take A = 0, we choose the preconditioner matrix P as the stabilized adjacency matrix of the particles [18], namely, where d ij is the distance between particle i and j, C stab is a stabilization term to make sure the matrix is strictly positive definite, we choose C stab = 0.1 in our application.
In our model problem, the bi-disperse granular system is composed of two types of particles, whose radii are denoted by r min and r max . We take the cutoff radius r cut > 2r max so that all the possibly overlapping particles are covered. For example, we can take r cut = 2.2r max . The preconditioner matrix will be recalculated when the maximum displacement of particles is beyond r cut /4, therefore it will not be recalculated frequently. The energy scale µ is chosen to make sure that the precondition matrix is of the same order as the real hessian, so we may choose the unit step-length when the L-BFGS method is applied. One way to obtain µ is, where P µ=1 is the matrix when µ = 1, and ν is a test perturbation with the following form where L is the lengths of the periodic cell and M is a constant, here we choose M = 10 −3 r cut .
Since we need to solve linear system (16) with respect to P , when the particle number is small, we use direct method and rearrange the index of grains by sparse reverse Cuthill-McKee ordering method [3], the band width of P can be reduced so that the equation (16) can be easily solved. If the number of particles is large, iterative solvers such as preconditioned CG or AMG [2] can be utilized. 4. Numerical Results. In our numerical experiments, we use the bidisperse granular system which consists of two types of particles with number ratio 1:1 and size ratio r max /r min = 1.4, so the system will not crystallize [20]. The particles are contained in a unit square domain Ω = [0, 1] 2 . We use the periodic boundary condition in order to reduce the boundary layer effect. Solid wall boundary conditions can also be implemented.
In the numerical experiments, we test the performance of preconditioned L-BFGS method and preconditioned FR-CG method, by running several benchmark simulations: the jamming formation, the scaling law behavior for the jammed configuration, and the shear deformation of the over-jammed states. In those different scenarios, we will show that the proposed preconditioned algorithm has better performance compared to their un-preconditioned counterparts, and preconditioned L-BFGS is currently the method of choice. 4.1. Jamming formation. We first run Algorithm 1 to generate the jamming configuration, starting from a relaxed configuration with volume fraction φ < φ cr , where φ cr is the critical volume fraction (or jamming fraction). We let the particles grow with small increment δφ, then we use either the preconditioned or un-preconditioned version of Algorithm 2 and Algorithms 3, 4 as the energy minimizer in Algorithm 1.
In Figure 2 , the residual norm of each iteration during a relaxation process is plotted. We apply four methods (L-BFGS, preconditioned L-BFGS, FR-CG, Preconditioned FR-CG) to minimize the energy of an unjamming configuration. A high accuracy tolerance is adopted here when an accurate jamming point is pursued. The figure shows that: L-BFGS is better than FR-CG, preconditioned L-BFGS method has the best convergence curve compared to other methods, and preconditioner provides a factor of two speed up for L-BFGS. We also notice that convergence curves for all methods are extremely rough and have long asymptotic regimes, that means the energy landscape is very complex and there are many local minimizers to explore. Figure 3 shows the evolution of the granular system up to the jamming point. The volume fraction of the initial configuration is small, therefore particles can move around in order to achieve a zero energy configuration. When the volume fraction increases, the particles start to contact with each other, but initially it is still possible to achieve a zero energy configuration through energy relaxation (minimization). As the free space becomes less and less, at a critical volume fraction, the particles are forced to contact and overlap, and the potential energy of the system becomes non-zero. Figure 3. Schematic of the evolution from non-jamming to jamming state of bi-disperse granular system using Algorithm 1, a Initial non-jamming configuration. b Intermediate configuration. c Jamming configuration, Visualization tool: OVITO [21].

Scaling Law.
Once the jamming configuration C jam is obtained, we can continue increasing the volume fraction in a quasi-static manner. It has been proposed by physicists that the granular system has the so-called scaling law behavior close to the jamming point [17]. Here, we numerically justify this scaling law using our energy minimization techniques, and more interestingly, we observe that the iteration number of the energy minimization methods also has similar scaling law behavior.  . pressure p vs φ − φ cr for a 2D bi-disperse granular system with particle number 4096.
In the Figure 4, we plot the log-log curve of pressure p vs φ − φ cr . It is clear that p follows a power law with respect to φ − φ cr close to the jamming point, and the critical exponent depends on the nature of interactions. For harmonic (α = 2) and Hertzian (α = 2.5) potentials, the critical exponent are approximately 1.0 and 1.5, respectively. In addition, this scaling law between p and φ − φ cr is robust for different initial configurations (they may have different φ cr ).  Figure 5. Shear modulus G vs φ−φ cr for a 2D bi-disperse granular system with particle number 4096.
If we exert small simple shear strain γ to these configurations, we can compute their shear modulus using equation (12). In Figure 5, we plot shear modulus for different volume fractions. It shows that the shear modulus G of the granular systems also follows a similar power scaling law with respect to φ − φ cr but with different exponents, which is approximately 0.5 when α = 2.0, and 1.0 when α = 2.5.  Figure 6. Average contact numberz vs φ for a 2D bi-disperse granular system with particle number 4096. z 0 is the average contact number of the granular system with volume fraction φ cr . In our 2D simulation, particles are frictionless, z 0 = 4.0.
The average contact numberz defined in (13) also exhibits a scaling law in regards to volume fraction difference φ − φ cr . In Figure 6, the lines in the log-log plots have similar slopes close to 0.5 for both α = 2.0 and α = 2.5. The critical average contact number z 0 = 4 is the average contact number when a system reaches exactly the jamming state. The average contact number of a system takes a leap from zero to 4 when jamming transition occurs. In Figure 7, we plot the iteration number vs. volume fraction difference φ−φ cr for the preconditioned L-BFGS and L-BFGS methods. Starting from certain jamming configuration, we compress the granular system by increasing the volume fraction φ. Then for each relaxed over-jammed configuration, we exert small shear strains from 10 −6 to 10 −5 quasi-statically, in 10 steps. The iteration numbers exhibit "random" behavior, especially close to the jamming point. We plot both the average and the variance (as error bars) of the iteration number. We observe that the iteration number of each numerical method also follows a scaling law, which ranges from 10 3 − 10 4 close to the jamming point (φ − φ cr = 10 −6 ), to about hundreds when φ−φ cr = 10 −2 . The result shows that the preconditioned algorithm performs better, and also it is more robust to rounding error as the variance of the iteration number for the preconditioned algorithm is also smaller away from the jamming point.
In Table 1, we illustrate the performance of four optimization methods: L-BFGS, preconditioned L-BFGS, FR-CG, and preconditioned FR-CG in three different situations given an over-jammed configuration with 4096 particles and volume fraction φ = 0.8438.
• Case 1, fixing the positions of the particles, and enlarging them slightly with ratio 1 + δ, δ = 5 × 10 −5 ; • Case 2, applying a small shear strain (γ = 10 −4 ) to the configuration; • Case 3, exerting a small (O(10 −4 )) random perturbation to the configuration. The comparison of iteration number and computational time in Table 1 clearly demonstrates that the preconditioner can improve the efficiency of the energy minimization methods. Also, it seems we can simply take the parameter A = 0 (18) in the preconditioner.   (18)). The granular system has 4096 particles and volume fraction φ = 0.8438.

Shear Test.
In the shear test, we deform the over-jammed configuration with pure shear (stretching in x direction and compressing in y direction, while keeping the area unchanged). The deformation matrix is given by,  . Stress (σ) vs. strain and fabric (F ) vs. strain curve for pure shear, γ is the shear strain. The subscripts iso and dev, represent the isotropic and deviatoric parts of these tensors (stress, fabric), respectively. Let λ 1 and λ 2 be the eigenvalues of those tensors, the isotropic part is defined as (λ 1 + λ 2 )/2 and the deviatoric part is |λ 1 − λ 2 |/2 The stress vs. strain and fabric vs. strain curve for the pure shear simulation is shown in Figure 8. Notice that we use increments of different sizes, 4 × 10 −4 , 2 × 10 −4 , and 10 −4 to reach a final 2% shear strain, we still obtain qualitatively (quantitatively for small strain) similar curves.

5.
Conclusion. In this paper, we introduce the energy minimization techniques for the efficient simulation of dense athermal granular systems in two dimensions. Preconditioners are used to enhance the performance of the simulation. We carry out numerical experiments for some typical scenarios of granular simulation in order to validate the numerical methods, which include the jamming formation, scaling law phenomena close to the jamming point, and deformation of over-jammed states. Speedup of the preconditioned minimization methods are observed.
This work opens avenue for several possible improvements: First of all, we are going to extend the current work to three dimensions, which is physically more relevant to real applications, and still difficult for experimental studies [9,24] .
Secondly, from the practical point of view, we will optimize the implementation of preconditioners using, for example, AMG [2] and other types of state-of-the-art techniques, especially for the three dimensional case.
Last but not least, we observe that the iteration number of our current preconditioned algorithms follows a scaling law behavior close to the jamming point, which is a fundamental bottleneck for the energy minimization approach. It remains open whether one can find a preconditioner which is robust close to the jamming point. Usually, an efficient preconditioner takes account of the long wavelength modes of the system. However, close to the phase transition point, more and more high frequency information might be needed. The development of numerical techniques in this direction relies on the understanding of the physical origin of the jamming transition.