| Alg./size of $ p $ | 256 | 512 | 1024 |
| Montgomery | 121 | 432 | 1709 |
| This work | 122 | 443 | 1728 |
The Polynomial Modular Number System (PMNS) offers an alternative to the conventional binary multi-precision representation system for large integers. Its effectiveness has been demonstrated for various cryptosystems using prime field arithmetic [2,4,6], with prime sizes ranging from 256 to 736 bits. However, as the size of $ p $ increases, the relative performance of PMNS compared to standard arithmetic diminishes. Furthermore, the generation process of a PMNS has a worst-case complexity of $ {\mathcal O}(2^n) $, where $ n $ denotes the number of symbols used to represent an integer modulo $ p $ in this representation system. In this paper, we present several alternatives and improvements to the construction and implementation processes of PMNS, which are tailored to the size of $ p $.
| Citation: |
Table 1. Number of cycles to compute a PMNS modular multiplication, gcc 12.1.0, i9-11900KF
| Alg./size of $ p $ | 256 | 512 | 1024 |
| Montgomery | 121 | 432 | 1709 |
| This work | 122 | 443 | 1728 |
Table 2. Optimal polynomial degrees vs practical ones used in implementation per size of prime considered
| size of $ p $ | 256 | 512 | 1024 | 2048 | 4096 | 8192 |
| $ n_{\text{opt}64} $ | 5 | 9 | 17 | 33 | 65 | 129 |
| $ n_{64} $ | 5 | 9 | 19 | 40 | 83 | 187 |
| $ n_{\text{opt}128} $ | 3 | 5 | 9 | 17 | 33 | 65 |
| $ n_{128} $ | 3 | 5 | 9 | 18 | 36 | 72 |
Table 3. Comparative table of performances for Red-64 and Red-128 in number of processor cycles for one modular multiplication on Intel processor i9-11900KF with gcc 12.1.0
| size of $ p $ | 256 | 512 | 1024 | 2048 | 4096 | 8192 |
| Red-64 | 122 | 443 | 1709 | 7690 | 33471 | 169070 |
| Red-128 | 245 | 720 | 2367 | 10920 | 39446 | 156310 |
Table 4. Number of cycles to compute a modular multiplication using GnuMP and PMNS on Intel processor i9-11900KF with gcc 12.1.0
| Size | 1024 | 2048 | 4096 | 8192 |
| GnuMP | ||||
| $ n $ | 16 | 32 | 64 | 128 |
| Low level | 1715 | 5501 | 16897 | 53042 |
| Classical Mont. | 1534 | 4653 | 14522 | 44587 |
| Mont. CIOS | 1187 | 4661 | 16793 | 61478 |
| This work | ||||
| $ n $ | 19 | 40 | 84 | 187 |
| Red-64 | 1709 | 7690 | 33471 | 169070 |
| $ n $ | - | - | - | 72 |
| Red-128 | - | - | - | 156310 |
Table 5.
Number of cycles to compute a modular multiplication between two 8192-bit integers using GnuMP and PPMNS (for any
| 5-core | 6-core | 8-core | Low level | Classical Mont. | Mont. CIOS |
| $ n $=187 | $ n $=128 | ||||
| 51979 | 46224 | 40778 | 53042 | 44567 | 61478 |
Table 6.
Number of cycles to compute a modular multiplication between two 8192-bit integers using GnuMP and PPMNS (for
| 6-core | Low level | Classical Mont. | Mont. CIOS |
| $ n $=189 | $ n $=128 | ||
| 42510 | 53042 | 44567 | 61478 |
Table 7. Number of cycles to compute a modular multiplication for 2048-bit and 4096-bit integers using GnuMP, PMNS and PPMNS on an Intel processor i9-11900KF with gcc 12.1.0
| Method/Size | 2048 | 4096 |
| $ n $ | 32 | 64 |
| Low level | 5501 | 16897 |
| Classical Mont. | 4653 | 14524 |
| Mont. CIOS | 4607 | 16793 |
| $ n $ | 40 | 84 |
| Red-64 | 7491 (1-core) 10165 (2-core) | 36195 (1-core) 17756 (6-core) |
| Red-128 | 10920 (1-core) 15314 (2-core) | 39446 (1-core) 25797 (4-core) |
Table 8. Number of cycles to compute a modular multiplication for 2048, 4096 and 8192-bit integers using Toeplitz form vs random matrix form on Intel processor i9-11900KF with gcc 12.1.0
| Size | 2048 | 4096 | 8192 |
| 8 cores | |||
| Toeplitz Montgomery-like | 14421 | 18193 | 37026 |
| This work | 14529 | 22782 | 40778 |
Table 9. Number of cycles to compute a modular multiplication for 1024, 1664 and 2048-bit integers using Toeplitz form vs GnuMP on Intel processor i9-11900KF with gcc 12.1.0
| Size | 1024 | 1664 | 2048 |
| Low level | 1715 | 4039 | 5501 |
| Classical Mont. | 1539 | 3466 | 4653 |
| Montgomery CIOS | 1187 | 3153 | 4607 |
| Toeplitz (this work) | 1449 | 3454 | 5489 |
| [1] |
J.-C. Bajard, L. Imbert and T. Plantard, Modular number systems: Beyond the mersenne family, In Selected Areas in Cryptography, 11th International Workshop, SAC 2004, Waterloo, Canada, 2004,159-169.
doi: 10.1007/978-3-540-30564-4_11.
|
| [2] |
C. Bouvier and L. Imbert, An alternative approach for sidh arithmetic, In Juan A. Garay, editor, Public-Key Cryptography – PKC 2021, 27-44, Cham, 2021. Springer International Publishing.
doi: 10.1007/978-3-030-75245-3_2.
|
| [3] |
R. P. Brent and P. Zimmermann, Modern Computer Arithmetic, volume 18 of Cambridge Monographs on Applied and Computational Mathematics, Cambridge University Press, 2010.
|
| [4] |
T. Coladon, P. Elbaz-Vincent and C. Hugounenq, MPHELL: A fast and robust library with unified and versatile arithmetics for elliptic curves cryptography, In ARITH 2021, Transactions on Emerging Topics in Computing, Torino, Italy, 2021.
doi: 10.1109/ARITH51176.2021.00026.
|
| [5] |
L.-S. Didier, F. Y. Dosso and P. Véron, Efficient modular operations using the adapted modular number system, Journal of Cryptographic Engineering, 10 (2020), 111-133.
doi: 10.1007/s13389-019-00221-7.
|
| [6] |
F. Y. Dosso, J.-M. Robert and P. Véron, PMNS for efficient arithmetic and small memory cost, IEEE Transactions on Emerging Topics in Computing, 10 (2022), 1263-1277.
doi: 10.1109/ARITH54963.2022.00023.
|
| [7] |
H. Fan and M. A. Hasan, Alternative to the karatsuba algorithm for software implementation of GF(2n) multiplication, IET Information Security, 3 (2009), 60-65.
doi: 10.1049/iet-ifs.2007.0132.
|
| [8] |
M. A. Hasan and C. Nègre, Multiway splitting method for toeplitz matrix vector product, IEEE Transactions on Computers, 62 (2013), 1467-1471.
doi: 10.1109/TC.2012.95.
|
| [9] |
Ç. K. Koç, T. Acar and B. S. Kaliski, Analyzing and comparing montgomery multiplication algorithms, IEEE Micro, 16 (1996), 26-33.
doi: 10.1109/40.502403.
|
| [10] |
P. L. Montgomery, Modular multiplication without trial division, Mathematics of Computation, 44 (1985), 519-521.
doi: 10.1090/S0025-5718-1985-0777282-X.
|
| [11] |
C. Negre and T. Plantard, Efficient modular arithmetic in adapted modular number system using lagrange representation, In Information Security and Privacy, 13th Australasian Conference, ACISP 2008, Wollongong, Australia, (2008), 463-477.
doi: 10.1007/978-3-540-70500-0_34.
|
| [12] |
P. van Emde Boas, Another np-complete problem and the complexity of computing short vectors in a lattice, Technical report, University of Amsterdam, Department of Mathematics, Netherlands, 1981.
|