The finite volume method (FVM) can be applied straightforwardly to both global and local gravity field modelling. However, precise numerical solutions require a very fine discretization, which leads to large-scale parallel computations. To optimize such computations, we present a class of numerical techniques based on a physical decomposition of the computational domain. Domain decomposition (DD) methods, such as the additive Schwarz method, are very efficient for solving partial differential equations. We briefly present their mathematical formulations and test their efficiency in numerical experiments dealing with gravity field modelling. In our applications we use overlapping DD methods, since they do not require solving special interface problems between neighbouring subdomains. Finally, we present a numerical experiment using the FVM approach with 93 312 000 000 unknowns, which could not be performed on the available computing facilities without the aforementioned methods, which efficiently reduce the numerical complexity of the problem.
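To make the idea of an overlapping DD method concrete, the following is a minimal sketch of a damped additive Schwarz iteration. It is not the FVM implementation described here: the 1D Poisson model problem, the two index ranges, the damping factor of 0.5 and the function names `poisson_1d` and `additive_schwarz` are all assumptions chosen for illustration.

```python
import numpy as np


def poisson_1d(n):
    """Model problem (assumption): -u'' = 1 on (0, 1), u(0) = u(1) = 0, n interior nodes."""
    h = 1.0 / (n + 1)
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
    f = np.ones(n)
    return A, f


def additive_schwarz(A, f, subdomains, damping=0.5, tol=1e-10, max_iter=5000):
    """Damped additive Schwarz iteration with overlapping index sets.

    Every subdomain solves its local problem on the same global residual,
    and the local corrections are summed. With two overlapping subdomains,
    a damping factor of 0.5 keeps this stationary iteration convergent.
    """
    u = np.zeros_like(f)
    local_blocks = [A[np.ix_(idx, idx)] for idx in subdomains]
    f_norm = np.linalg.norm(f)
    for it in range(max_iter):
        r = f - A @ u
        if np.linalg.norm(r) <= tol * f_norm:
            return u, it
        correction = np.zeros_like(u)
        for idx, A_loc in zip(subdomains, local_blocks):
            # Local solves are independent of each other, so they could run in parallel.
            correction[idx] += np.linalg.solve(A_loc, r[idx])
        u += damping * correction
    return u, max_iter


n = 99
A, f = poisson_1d(n)
# Two subdomains sharing 20 grid nodes (hypothetical split).
subdomains = [np.arange(0, 60), np.arange(40, n)]
u, iterations = additive_schwarz(A, f, subdomains)

x = np.linspace(0.0, 1.0, n + 2)[1:-1]
u_exact = 0.5 * x * (1.0 - x)
max_error = np.max(np.abs(u - u_exact))
```

Each local block is much smaller than the global matrix, which is the source of the memory savings such methods aim for. No separate interface problem appears, because the subdomains exchange information only through the overlap.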
Table 1. Efficiency comparison of the stationary and nonstationary methods in the experiment with 259 200 unknowns, tested on one CPU core
Solver | CPU time [s] | Number of iterations |
GS | 703 | 10000 |
SOR | 92 | 1136 |
BiCG | 68 | 568 |
Bi-CGSTAB | 41 | 348 |
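The gap between GS and SOR in Table 1 can be reproduced qualitatively on a small model problem. The sketch below is an illustration under stated assumptions, not the experiment from the table: it uses a 30-node 1D Poisson matrix, a relative residual tolerance of 1e-8, and the relaxation factor that is known to be optimal for this particular model matrix. The function name `sor` is hypothetical.

```python
import numpy as np


def sor(A, b, omega=1.0, tol=1e-8, max_iter=10000):
    """Successive over-relaxation; omega = 1.0 gives the Gauss-Seidel method."""
    n = len(b)
    x = np.zeros(n)
    b_norm = np.linalg.norm(b)
    for it in range(1, max_iter + 1):
        for i in range(n):
            # Sum over off-diagonal entries, using already updated values of x.
            sigma = A[i] @ x - A[i, i] * x[i]
            x[i] = (1.0 - omega) * x[i] + omega * (b[i] - sigma) / A[i, i]
        if np.linalg.norm(b - A @ x) <= tol * b_norm:
            return x, it
    return x, max_iter


n = 30
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)

# Optimal relaxation factor for this tridiagonal model matrix (assumption:
# it does not carry over to the FVM system of the experiment).
omega_opt = 2.0 / (1.0 + np.sin(np.pi / (n + 1)))

x_gs, its_gs = sor(A, b, omega=1.0)
x_sor, its_sor = sor(A, b, omega=omega_opt)
x_ref = np.linalg.solve(A, b)
print(f"GS: {its_gs} iterations, SOR: {its_sor} iterations")
```

On this model problem SOR needs far fewer sweeps than GS, the same qualitative behaviour as in the table, although the actual counts depend on the matrix, the tolerance and the chosen relaxation factor.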
Table 2. Efficiency comparison for various Bi-CGSTAB linear solvers in the experiment with 4 374 000 unknowns, tested on one CPU core
Solver | Number of iterations | CPU time [s] | Additional memory for solver [MB] |
Bi-CGSTAB | 1053 | 403.82 | 184.26 |
BiCGstab(2) | 554 | 494.14 | 258.02 |
BiCGstab(4) | 272 | 629.01 | 405.46 |
BiCGstab(8) | 130 | 860.86 | 700.34 |
Table 3. Comparison of Processes/Threads parallelization in the experiment with 4 374 000 unknowns, tested on 4 quad-core CPUs
MPI processes | OpenMP threads | CPU time [s] | Speedup ratio | RAM [MB] | Memory increase |
1 | 1 | 403.82 | - | 237.108 | - |
1 | 2 | 232.40 | 1.73 | | |
1 | 4 | 191.36 | 2.11 | | |
1 | 8 | 87.31 | 4.63 | | |
1 | 16 | 57.51 | 7.02 | | |
2 | 1 | 216.84 | 1.86 | 245.868 | +3.7% |
2 | 2 | 126.17 | 3.20 | | |
2 | 4 | 98.46 | 4.10 | | |
2 | 8 | 85.88 | 4.70 | | |
4 | 1 | 114.01 | 3.54 | 266.040 | +12.2% |
4 | 2 | 79.72 | 5.06 | | |
4 | 4 | 55.56 | 7.26 | | |
8 | 1 | 79.34 | 5.09 | 308.456 | +30.0% |
8 | 2 | 70.81 | 5.70 | | |
16 | 1 | 59.51 | 6.78 | 390.068 | +64.5% |
Table 4. Efficiency comparison for the different number of subdomains in the DD experiment with 4 374 000 unknowns, tested on one CPU core
Number of subdomains | CPU time [s] | Speedup ratio | RAM [MB] | Memory saving |
1 | 403.82 | - | 237.108 | - |
5 | 1651.68 | 0.24 | 89.868 | -62.1% |
10 | 907.99 | 0.44 | 71.308 | -69.9% |
15 | 856.04 | 0.46 | 65.248 | -72.5% |
30 | 854.24 | 0.47 | 57.816 | -75.6% |
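The derived columns of Table 4 follow from the measured ones by simple ratios against the single-domain run. The short sketch below recomputes them from the tabulated CPU times and RAM values; the variable names are illustrative, and the formulas are the usual definitions (reference time divided by time, and relative change in memory), which is an assumption about how the table was produced.

```python
# Rows of Table 4: (number of subdomains, CPU time [s], RAM [MB])
rows = [
    (1, 403.82, 237.108),
    (5, 1651.68, 89.868),
    (10, 907.99, 71.308),
    (15, 856.04, 65.248),
    (30, 854.24, 57.816),
]

_, t_ref, ram_ref = rows[0]  # single-domain run is the reference

derived = {}
for n_sub, cpu_time, ram in rows[1:]:
    speedup = t_ref / cpu_time                  # < 1 means slower than the reference
    memory_change = (ram / ram_ref - 1.0) * 100.0  # negative means memory saved, in percent
    derived[n_sub] = (speedup, memory_change)
    print(f"{n_sub:>2} subdomains: speedup {speedup:.2f}, memory {memory_change:+.1f}%")
```

The recomputed memory savings agree with the table to one decimal place. The recomputed speedups agree to within about 0.01, which suggests the tabulated ratios may have been truncated rather than rounded.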
Table 5. Efficiency comparison for the different number
$ \eta $ | CPU time [s] | Speedup ratio |
1 | 854.24 | - |
5 | 308.02 | 2.77 |
10 | 252.33 | 3.38 |
15 | 224.65 | 3.80 |
20 | 236.56 | 3.61 |
25 | 265.73 | 3.21 |
Table 6. Comparison for the different number of subdomains using parallel DD method in the experiment with 4 374 000 unknowns with
Number of subdomains | CPU time [s] | Speedup ratio | RAM [MB] | Memory saving |
1 | 55.56 | - | 266.040 | - |
5 | 55.52 | 1.00 | 115.508 | -56.6% |
10 | 28.47 | 1.95 | 97.568 | -63.3% |
15 | 17.44 | 3.18 | 91.156 | -65.7% |
30 | 18.67 | 2.97 | 84.128 | -68.3% |
Table 7. Efficiency comparison for different computation strategies in the experiment with 4 374 000 unknowns where we use 30 subdomains and
Computation strategies | CPU time [s] | Speedup ratio | RAM [MB] | Memory saving |
Serial without DD | 403.82 | - | 237.108 | - |
Serial with DD | 224.65 | 1.79 | 57.816 | -75.6% |
Parallel without DD | 55.56 | 7.26 | 266.040 | +10.8% |
Parallel with DD | 18.67 | 21.6 | 84.128 | -64.5% |
Table 8. Comparison for the different number of subdomains using Parallel-Domain decomposition method in the experiment with 34 992 000 000 unknowns, tested on 28 octo-core CPUs
No. of subdomains | CPU time [s] | CPU time saving | RAM [GB] | Memory saving |
1 | 706.8 | - | 1 652 | - |
2 | 683.6 | 1.03 | 968 | -41.4% |
5 | 703.5 | 1.00 | 557 | -66.3% |
10 | 700.9 | 1.01 | 420 | -74.5% |
15 | 710.0 | 0.99 | 375 | -77.3% |
30 | 718.5 | 0.98 | 329 | -80.0% |
Illustration of solution after: a)
Illustration of data management in parallel DD implementations, where blue illustrates subdomains and yellow illustrates parallelization
Global gravity field model with the resolution