doi: 10.3934/dcdss.2020383

Transformation of a Nucleon-Nucleon potential operator into its SU(3) tensor form using GPUs

1. Faculty of Nuclear Sciences and Physical Engineering, Czech Technical University in Prague, Trojanova 13, Praha 2, 120 00, Czech Republic

2. Department of Physics and Astronomy, Louisiana State University, Baton Rouge, LA 70803, USA; Nuclear Physics Institute, Czech Academy of Sciences, Řež 25068, Czech Republic

3. Department of Physics and Astronomy, Louisiana State University, Baton Rouge, LA 70803, USA

4. Faculty of Information Technology, Czech Technical University, Prague 16000, Czech Republic; Aerospace Research and Test Establishment, Prague 19905, Czech Republic

5. Department of Physics and Astronomy, Louisiana State University, Baton Rouge, LA 70803, USA

* Corresponding author

Received January 2019; Revised February 2020; Published June 2020

Starting from the matrix elements of a nucleon-nucleon potential operator provided in a basis of spherical harmonic oscillator functions, we present an algorithm for expressing a given potential operator in terms of irreducible tensors of the SU(3) and SU(2) groups. We further introduce a GPU-based implementation of this algorithm and compare its performance with that of a CPU-based version. We find that the CUDA implementation delivers speedups of 2.27x – 5.93x.

Citation: Tomáš Oberhuber, Tomáš Dytrych, Kristina D. Launey, Daniel Langr, Jerry P. Draayer. Transformation of a Nucleon-Nucleon potential operator into its SU(3) tensor form using GPUs. Discrete & Continuous Dynamical Systems - S, doi: 10.3934/dcdss.2020383
Figure 1.  Data layout of the buffer used to transfer the Wigner coupling coefficients and the quantum numbers of irreducible tensors, both computed on the CPU, to the GPU.
Figure 2.  Data layout of the buffer holding records with triplets of references to records in buffer $ B_W $. Each such triplet represents one irreducible tensor.
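
The two captions describe a flat buffer of records plus a second buffer of reference triplets. The figures are not reproduced here, so the following Python sketch is purely illustrative: the record header (`lam`, `mu`, coefficient count), the helper names `append_record` and `read_record`, and all numeric values are assumptions made for the example, not the layout used in the paper.

```python
from array import array

HEADER = 3  # hypothetical header fields: lam, mu, number of coefficients


def append_record(buf, lam, mu, coeffs):
    """Append one record to the flat buffer and return its start offset."""
    offset = len(buf)
    buf.extend([float(lam), float(mu), float(len(coeffs))])
    buf.extend(coeffs)
    return offset


def read_record(buf, offset):
    """Decode the record that starts at the given offset."""
    lam, mu, n = (int(x) for x in buf[offset:offset + HEADER])
    start = offset + HEADER
    return lam, mu, list(buf[start:start + n])


# Stand-in for the buffer B_W: one contiguous array of doubles,
# which is the kind of block that can be copied to a device in one transfer.
b_w = array("d")
offsets = [
    append_record(b_w, 2, 0, [0.5, -0.5]),         # made-up values
    append_record(b_w, 0, 2, [1.0]),               # made-up values
    append_record(b_w, 2, 2, [0.25, 0.75, -1.0]),  # made-up values
]

# Stand-in for the second buffer: three offsets into b_w per tensor.
tensors = array("q", offsets)

for i in range(0, len(tensors), 3):
    triplet = tensors[i:i + 3]
    print([read_record(b_w, off) for off in triplet])
```

Storing offsets rather than copies keeps each coefficient record in the buffer once, even when several tensors refer to it.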
Table 1.  Performance results obtained on the Blue Waters system. Notation $ +1 $ in the column with MPI processes denotes one master process which just assigns HO shells to other MPI processes. The master process is not taken into account for efficiency evaluation.

| $ N_{\max} $ | MPI procs. | CPU only: Time [s] | CPU only: Efficiency | CPU+GPU: Time [s] | CPU+GPU: Speed-up |
|---|---|---|---|---|---|
| 8 | 7+1 | 295.3 | | 75.3 | 3.92 |
| 8 | 15+1 | 137.6 | 1 | 36.8 | 3.73 |
| 8 | 31+1 | 66.5 | 1 | 17.5 | 3.79 |
| 8 | 63+1 | 39.4 | 0.83 | 9.23 | 4.26 |
| 8 | 127+1 | 32.8 | 0.56 | 6.35 | 5.17 |
| 8 | 255+1 | 31.0 | 0.52 | 5.22 | 5.94 |
| 10 | 7+1 | 2219 | | 648 | 3.42 |
| 10 | 15+1 | 1034 | 1 | 318 | 3.24 |
| 10 | 31+1 | 499 | 1 | 151 | 3.28 |
| 10 | 63+1 | 248 | 0.99 | 74 | 3.32 |
| 10 | 127+1 | 165 | 0.74 | 43 | 3.75 |
| 10 | 255+1 | 138 | 0.59 | 32 | 4.25 |
| 12 | 7+1 | 13083 | | 4493 | 2.91 |
| 12 | 15+1 | 6097 | 1 | 2116 | 2.88 |
| 12 | 31+1 | 2943 | 1 | 1054 | 2.79 |
| 12 | 63+1 | 1447 | 1 | 515 | 2.80 |
| 12 | 127+1 | 776 | 0.92 | 269 | 2.88 |
| 12 | 255+1 | 565 | 0.68 | 169 | 3.33 |
| 14 | 5+1 | 64865 | | 26104 | 2.48 |
| 14 | 15+1 | 30227 | 1 | 12204 | 2.47 |
| 14 | 31+1 | 14596 | 1 | 5944 | 2.45 |
| 14 | 63+1 | 7179 | 1 | 2932 | 2.44 |
| 14 | 127+1 | 3581 | 0.99 | 1461 | 2.45 |
| 14 | 255+1 | 2142 | 0.83 | 838 | 2.55 |
Table 2.  Performance results obtained for 32 MPI processes running on two 16-core IBM Power9 CPUs and a single NVIDIA V100 GPU.

| $ N_{\max} $ | CPU only: Time [s] | CPU+GPU: Time [s] | Speed-up |
|---|---|---|---|
| 8 | 41.65 | 15.78 | 2.63 |
| 10 | 274.14 | 97.99 | 2.79 |
| 12 | 1649.7 | 611.1 | 2.69 |
| 14 | 7761.9 | 3407.6 | 2.27 |
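
The speed-up column can be checked directly from the two time columns. The sketch below assumes the usual definition, speed-up = (CPU-only time) / (CPU+GPU time); the page does not state the formula, so this is an assumption. The helper names `speedup` and `truncate` are illustrative. With the Table 2 timings, the ratios agree with the tabulated values when cut to two decimals, which suggests the published figures are truncated rather than rounded.

```python
import math

# (N_max, CPU-only time [s], CPU+GPU time [s], tabulated speed-up), from Table 2
TABLE_2 = [
    (8, 41.65, 15.78, 2.63),
    (10, 274.14, 97.99, 2.79),
    (12, 1649.7, 611.1, 2.69),
    (14, 7761.9, 3407.6, 2.27),
]


def speedup(t_cpu, t_gpu):
    """Assumed definition: ratio of CPU-only time to CPU+GPU time."""
    return t_cpu / t_gpu


def truncate(x, digits=2):
    """Cut x to the given number of decimals without rounding up."""
    scale = 10 ** digits
    return math.floor(x * scale) / scale


for n_max, t_cpu, t_gpu, tabulated in TABLE_2:
    s = speedup(t_cpu, t_gpu)
    print(f"N_max={n_max:2d}: ratio {s:.4f}, truncated {truncate(s)}, tabulated {tabulated}")
```

The smallest ratio, at $ N_{\max} = 14 $, corresponds to the lower bound of 2.27x quoted in the abstract.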