Int. J. Communications, Network and System Sciences, 2011, 4, 609615 doi:10.4236/ijcns.2011.410073 Published Online October 2011 (http://www.SciRP.org/journal/ijcns) Copyright © 2011 SciRes. IJCNS Integer Factorization of SemiPrimes Based on Analysis of a Sequence of Modular Elliptic Equations* Boris S. Verkhovsky Computer Science Department, New Jersey Institute of Technology, Newark, USA Email: verb73@gmail.com Received September 4, 2011; revised October 3, 2011; accepted October 11, 2011 Abstract In this paper is demonstrated a method for reduction of integer factorization problem to an analysis of a se quence of modular elliptic equations. As a result, the paper provides a nondeterministic algorithm that computes a factor of a semiprime integer n=pq, where prime factors p and q are unknown. The proposed algorithm is based on counting points on a sequence of at least four elliptic curves 222 modyxxbn , where b is a control parameter. Although in the worst case, for some n the number of required values of pa rameter b that must be considered (the number of basic steps of the algorithm) substantially exceeds four, hundreds of computer experiments indicate that the average number of the basic steps does not exceed six. These experiments also confirm all important facts discussed in this paper. Keywords: Integer Factorization, Factorization of SemiPrimes, NonDeterministic Algorithm, Elliptic Curves, Counting Points on Elliptic Curves, CryptoImmunity, Dual Elliptic Curves 1. Introduction and Problem Statement Security of information transmission via communication networks is provided by various cryptographic protocols. Cryptoimmunity of these protocols is mostly based on hardness of either the integer factorization or the discrete logarithm problem. There are several algorithms that factorize a semi prime n=pq, where n is known, but its integer factors p and q are not. Fermat, Euler and other mathemati cians/computer scientists introduced various algorithms for integer factorization. A survey of methods for factor ing is provided in [1], and modern factoring algorithms are described in [2]. Various special methods are considered in [35]; an application of cubic forms for factorization, as one of these special methods, is pro vided in [6]. A comparison and analysis of factoring algorithms with exponential time complexity is provided in [7]. Algorithms based on the quadratic sieve (QS) are discussed in [8,9] while integer factoring via the number field sieve (NFS) is provided in [10]. Both the QS and NFS are the algorithms with subexponential time complexity. The application of special devises for fac toring is described in [11,12]. A pioneering paper on application of quantum computing for integer factoriza tion is discussed in [13]. A new factoring algorithm proposed in this paper is based on the analysis of several modular elliptic equa tions {called elliptic curves} and counting how many integer points {integer pairs (x,y)} satisfy these curves. The application of elliptic curves for factoring is de scribed in [1417]. Methods of counting points on elliptic curves are considered in [18,19] and more generally on modular equations with several variables in [20,21]. A relationship between integer factorization and constrai ned discrete logarithm problems is analyzed in [22]. Consider n=pq, where both p and q are multidigit long primes. There are three special cases: where 1) each factor is congruent to 1 modulo4: 1mod4pq ; (1.1) 2) mod 40pq ; (1.2) 3) 3mod4pq . In this paper we discuss the factorization algorithm for (1.1) and (1.2) cases only. Consider a sequence of elliptic curves (EC) modulo n: 222 (,): modEnb yxxbn . (1.3) *Dedicated to the memory of my mother Alla L. Verkhovsky.
B. VERKHOVSKY 610 Here is an integer control parameter. 1b Let denote the number of points on the EC (1.3). (,)Pnb 2. Integer Factorization Algorithm Input: n is a semiprime; Output: integer factors p and q of n; 1: if nmod4=3, then for a randomly chosen b co m pute ; (,)Pnb 2: {Assign}: :gcd, ,;pPnb n:qnp; {end of al gorithm}; (2.1) 3: if nmod4=1, then compute; (,1)Pn 4: if , then p=q=3(mod4): the algorithm is not applicable; (,1)Pn n else for b=2, 3, 5, 7, 11,···coprime with n compute until four distinct integers A, Q, R, U are found; (2.2) (,)Pnb else, if b that divides n is found, then p:=b; :qnp ; {end of algorithm}; (2.3) 5: Let Q:=max(A, Q, R, U); U:=min(A, Q, R, U); (2.4) 6: Compute :SQUAR4 ; (2.5) 7: {Assign}: :gcd,pnS ; :qnp; {end of algorithm}. (2.6) Remark 2.1: For sake of reference, the (2.1)(2.6) algo rithm is called the SQUARalgorithm. Remark 2.2: For large n, computation of A, Q, R and U can be performed in paral l el . Remark 2.3: If b is not coprime with n, then b divides n, i.e., either p or q is equal b. Remark 2.4: Notice that S min (A, Q, R, U). This obser vation allows us to simplify the computations of S for large n {see the example with “larger” n in the Appendix}. 3. Numeric Illustrat i ons Table 3.1. Number of points ,k nb on four elliptic curves E(n, b). n ,1 k Pnb ;11k ,2 k Pnb ;2 k ,3 k Pnb ;3 k ,4 k Pnb ; 4 kmax k 24869 37981;1 13993;2 34713;3 12789;4 4 3813809 3850233;1 3774993;2 3674789;3 3955221;11 11 3858521 3996001;1 3652173;3 3717945;4 4067965;17 17 4549289 4255713;1 4558669;3 4852633;4 4530141;7 7 Table 3.2. Major steps of factorization algorithm. n A Q R U S p; q 24869 37981 34713 13993 12789 1118 13; 1913 3813809 3955221 38502333774993367478951298 1973; 1933 3858521 4067965 39960013717945365217334434 1913; 2017 4549289 4852633 455866945301414255713142098 gcd , nS We leave to readers the computation of p and q for the last semiprime in Table 3.2. RSA Factoring Challenge Consider a product of two integers: n:=pq, where p=34905295108476509491478496199038\ 98133417764638493387843990820577; and q=327691329932667095499619881908344\ 61413177642967992942539798288533. This n is called a RSA129 Challenge [23]: given n, it was necessary to find its factors p and q. Since in the RSA129 nmod4=1 and, therefore the proposed algorithm (2.1)(2.6) is applicable to solve this problem. 1mod4pq 4. Algorithm Validation Definition 4.1: A nonzero integer a is called a quad ratic residue (QR) modulo p if there exists an integer z such that 2modza p; (4.1) otherwise a is called a quadratic nonresidue (QNR) modulo p. By the Euler criterion [1], if p is a prime, then a is a QR if and only if (1)/2 mod 1 p ap . The algorithm (2.1)(2.6) is based on the following pro position. Conjecture 4.1: If p=q=1(mod4), then there ex ist two positive integers c<p and d<q, and four sets such that 123 4 , , and SSS S 1:, :SbPnbUpcqd ; (4.2) 2:, :SbPnbApcqd; (4.3) 3:, :SbPnbRpcqd ; (4.4) C opyright © 2011 SciRes. IJCNS
B. VERKHOVSKY611 4:, :SbPnbQpcqd ; (4.5) and where for every ijij SS 5,7,11,1 3, ; and ; {set of all pri m es} (4.6) 1234 SSSS 1, 2,3, Example 4.1: For semiprime n=1352513 the sets are as follows: 123 4 , , and SSS S 1{1,2,7,41, ;1267905}Sb U ; 25,13,17,43,61,67,71,79,;1313517Sb A ; 33,11,23,29,37,47,59,73, ;1389325Sb R ; 419,31,53,;1439305Sb Q . A priori it is not obvious how this conjecture can help to factorize n, but, as it is shown below, this is the case. It is assumed in this paper that there exists an efficient algo rithm that computes A, Q, R and U {see (4.2)(4.5)}. From the definitions (4.2)(4.5), it is easy to see that Q>max(A, R) and U<min(A, R), i.e., that and max ,;npdqccd AR . min , Rnpdqccd RA On the other hand, , otherwise it implies that pd=qc. Since both p and q are primes, the latter equation holds only if c=p and d=q, which is impossible by the conjecture. Consider the smallest product U and the largest product Q {see (4.2) and (4.5)}. As a result, we derive a system of three equations with four integer unknowns p, q, c and d: pq=n; ; (4.7) ; pcqdQ pcqdU i.e., ; (4.8) ()npdqdcdQ and . (4.9) ()npdqccdU Therefore, (4.8) and (4.9) imply that 2.pdqcQ U (4.10) Now consider another system of three equations with the same four unknowns: pq=n; and pcqdApcqd R ; (4.11) i.e., ; (4.12) ()npdqccdA and . (4.13) ()npdqccdR Then (4.12) and (4.13) imply that pd–qc 2AR; (4.14) and, as a result, from (4.10) and (4.14) 224qcQ UAR c . (4.15) Finally, from it follows that qc n gcd ,qnq, and .pnq (4.16) 5. Alternative Computation of Factors It is easy to see that one of the factors is an average arithmetic of two greatest common divisors gcd ,gcd ,2pAQRU , and qnp ; (5 .1). However, computation of (5.1) is twice more complicated than the previously described procedure. Indeed, in (5.1) we must compute the two greatest common divisors and one addition, while in (2.6) it is necessary to compute only one greatest common divisor and three subtractions. 6. Generalized Factorization Algorithm The factorization procedure described in (2.1)(2.6) {SQUARalgorithm} can be generalized and based on one of two following conjectures. Consider a set of modular elliptic curves 22 (,): modEnayxx an , (6.1) where 0a , n=pq, p=q=1(mod4) an d P(n, a) denotes the number of p oints on E(n, a). Conjecture 6.1: If a is a quadratic residue modulo n, then there exist two positive integers c<p and d<q, and four sets such that 123 4 , , and SSS S 11 :, :SaPnau pcqd; (6.2) 22 :, :SaPnau pcqd ; (6.3) 33 :, :SaPnau pcqd; (6.4) 44 :, :SaPnau pcqd; (6.5) an d w h e r e for every ij ; ij SS and 1234 set of all modulo SSSSQR n . (6.6) Conjecture 6.2: If a is a quadratic nonresidue modulo n, then there exist two positive integers g<p an d h<q, and four sets such that 123 4 , , and TTTT 11 :, :TaPnawpgqh; (6.7) :, :TaPna wpgqh 22 ; (6.8) 33 :, :TaPnaw pgqh ; (6.9) 44 :, :TaPna wpgqh; (6.10) and where for every ij ; and ij TT 1234set of all modulo TTTTQNRn . (6.11) 7. Algorithm Acceleration The numeric experiments provided in Tables 3.1 and 7.1 show that for n=3813809 it is necessary to compute six times {for b=1,2,3,5,7 and 11} and for (,)Pnb Copyright © 2011 SciRes. IJCNS
B. VERKHOVSKY 612 ) n=3858521 eight times {for b=1,2,3,5,7,11,13 and 17} until four distinct integers A, Q, R and U are found. These numbers can be decreased if the following prop erty holds. Let p=q=1(mod4); n=pq and M(n, b) denote the number of points on a dual EC. 222 ()(mod xx bn; (7.1) Conjecture 7.1: If primes p and q are randomly se lected, then with probability 3/4 ,1 ,1Pn Mn; (7.2) and, as a result, for every integer b ,Pnb Mnb,; (7.3) otherwise for every integer b ,Pnb Mnb, . (7.4) Proposition 7.2: If the factors p and q are congruent to 1 modulo n=pq, then the following identities hold for every positive integer m: 11 ,2 ,2 mm Pn Mn and 1 ,2 ,2 m Mn Pn 1m . (7.5) A proof is provided in the Appendix. Proposition 7.3: If the factors p and q are congruent to 1 modulo n=pq, and 12 bb a nd 12 ,,Pnb Mnb, then 1 ,2 , nbP nb. (7.6) Example 7.1 : Compute P(3813809,1)=38500233 and M(3813809,1)=3774993. Hence, (7.5) implies that there is no reason to compute P(3813809,2). Instead, compute P(n,3)=3674789. Since P(n,3) is the third distinct value, thus, (7.2) and (7.3) imply that M(n,3) is the forth distinct value. Indeed, M(n,3)=3955221. There fore, the algorithm requires only four basic steps instead of six. However, this acceleration does not work in 25% of the cases. For instance, it does not works for n=3858521, since neither (7.2) nor (7.3) hold. Indeed, because P(3858521,1)=M(3858521,1), the computation of M(3858521,3) provide s no acceleration. 8. Dual Factorization Algorithm It is assumed that nmod4=1 and , otherwise the algorithm is not applicable. (,1)Pn n 1) Compute and (,1)Pn (,1) n (7.1); Table 7.1. Excerpts from Table 3.1. N=3813809 3850233;1 3774993;2 3674789;3 3955221;11 N=3858521 3996001;1 3652173;3 3717945;4 4067965;17 2) Case (,1) (,1)PnMn : for b=3, 5, 7, 11,…coprime with n compute until (,)Pnb * (,)(,1) PnbMnand * (, )(,1)Pnb Pn (8.1) if b that divides n is found, then p:=b; q:=n/p; {end of algorithm}; (8.2) else compute * (, ) nb ; {all four distinct integers A, Q, R, and U are found}; 3) Let ** :min (,), (,1), (,),(,1)UPnb MnMnb Pn; (8.3) ** :max(, ),(,1),(, ),(,1)QPnb MnMnbPn; (8.4) and let A and R be other two values, {i.e., U < A < Q and U<R<Q}; (8.5) 4) Compute :SQUAR/4; (8.6) 5) {Factors}: :gcd ,pnS; q:=n/p; {end of algori thm}; (8.7) 6) Case (,1) (,1)Pn Mn : for b=3,5,7,11,… coprime with n compute until four distinct integers A, Q, R, U, are found (2.2); (,)Pnb 7) Let U:=min(A, Q, R, U); Q:=max(A, Q, R, U) (2 .4); 8) Compute :SQUAR4 (2.5); 9) {Factors}: :gcd ,pnS; q:=n /p; {end of algor ithm} (2.6). For the sake of simplicity of notations, let :(,) kk PPnb and:(, k) k Mnb (8.8) Example 8.1: For semiprime n=6525401 the sets are as follows: 123 4 , , and SSS S 1{2,5,13,23,37,59, ;6055665}Sb U ; 23,19,47, ;6514053SbA ; 3{Sb 43, ; (8.9) 53,67,83,;6519205}R 41,7,11,17,29,31,41, ;7012681Sb Q . Therefore, the algorithm (2.1)(2.6) requires at least fifteen basic steps, because 43 is the fourteenth prime (8.9). Yet, 12 ;PP and ; imply that for every 2 PM1 1kkk MP (7.2)(7.4). Hence, instead of counting points 1235743 ,, ,,,..,PPPPP P in fifteen elliptic curves, compute 36519205M . Thus, the algorithm (8.17) requires only four basic steps instead of fifteen. Proposition 8.1: Let 1 PP 2 ; and suppose that there is k>2 for whi ch (8.10) 12 1 ,,, kk PPP P t hen 12 1 ,,..,, kkk PPP P (8.11) C opyright © 2011 SciRes. IJCNS
B. VERKHOVSKY613 i Proof: Let assume that kj PP , where and i=1 or i=2. 3jk1 i Therefore, (7.5)(7.6). However, this contradicts with the assumption (8.10). Q.E.D. 3kji PM MP 9. Conclusions An algorithm and its generalizations for the integer fac torization are proposed. These algorithms are as compu tationally efficient as an algorithm that counts points on elliptic curves (1.3). Numerous computer experiments demonstrate that, if P(n,b) is computed for sequential values of prime b, then on ave rage there are four distinct values among the first six ones. The SQUARalgorithm (2.1)(2.6) and its enhanced modification (8.1)(8.7) described above is based on Conjecture 4.1 and its generalization {Conjecture 5.1}. Although an analogous algorithm can be designed on the basis of Conjecture 5.2, such an algorithm is computa tionally less efficient since it is a timeconsuming pro cedure to find a QNR modulo n. 10. Acknowledgements I express my appreciation to Professor W. Gruver, Pro fessor I. V. Semushin and to R. Rubino for constructive suggestions that improved the style of this paper. For computer experiments, the counting of points on the EC is performed with applets created by S. Sadik and by my former student B. Saraswat. I am grateful to them, and to my former students Dr. Y. S. Polyakov and S. Medicherla for their assistance in running computer experiments and to the reviewers for their fruitful comments. I deeply appreciate advice of Dr. D. Kanevsky. 11. References [1] R. Crandall and C. Pomerance, “Prime Numbers: A Computational Perspective,” Springer, New York, 2001. [2] H. Cohen, “A Course in Computational Algebraic Num ber Theory,” SpringerVerlag, New York, 1996. [3] D. Shanks, “Class Number, a Theory of Factorization and Genera,” Proceedings of Symposium of Pure Mathemat ics, Vol. 20, 1969, pp. 415440. [4] S. Lehman, “Factoring Large Integers,” Mathematics of Computation, Vol. 28, 1974, pp. 637646. doi:10.1090/S00255718197403401632 [5] J. Pollard, “Theorems on Factorization and Primality Testing,” Mathematical Proceedings of the Cambridge Philosophical Society, Vol. 76, 1974, pp. 521528. doi:10.1017/S0305004100049252 [6] J. Pollard, “Factoring with Cubic Integers,” The Devel opment of the Number Field Sieve, Lecture Notes in Mathematics, Vol. 1554, 1993, pp. 410. doi:10.1007/BFb0091536 [7] C. Pomerance, “Analysis and Comparison of Some Inte ger Factoring Algorithms,” In: H. W. Lenstra and R. Ti jdeman, Eds., Computational Methods in Number Theory, Math Centre Tracts—Part 1, Math Centrum, Amsterdam, 1982, pp. 89139. [8] C. Pomerance, “The Quadratic Sieve Factoring Algori thm,” Advances in Cryptology, Proceedings of Eurocrypt’84, LNCS, SpringerVerlag, Berlin, 1985, 169182. [9] R. D. Silverman, “The Multiple Polynomial Quadratic Sieve,” Mathematics of Computation, Vol. 48, 1987, pp. 329339. doi:10.1090/S00255718198708661198 [10] J. Buhler, H. W. Lenstra and C. Pomerance, “Factoring Integers with the Number Field Sieve,” In: A. K. Lenstra and H. W. Lenstra, Eds., The Development of the Number Field Sieve, Lecture Notes in Mathematics, SpringerVerlag, Berlin, Vol. 1554, 1993, pp. 5094. doi:10.1007/BFb0091539 [11] A. K. Lenstra and A. Shamir, “Analysis and Optimization of the TWINKLE Factoring Device,” Advances in Cryp tology—EUROCRYPT 2000, Lecture Notes in Computer Science, SpringerVerlag, New York, Vol. 1807, 2000, pp. 3552. [12] A. Shamir and E. Tromer, “Factoring Large Numbers with the TWIRL Device,” Advances in Cryptology— CRYPTO 2003, Lectu re Notes in Computer Scie nce, Sprin gerVerlag, New York, Vol. 2729, 2003, pp. 126. [13] P. W. Shor, “PolynomialTime Algorithms for Prime Fac torization and Discrete Logarithms on a Quantum Com puter,” SIAM Journal on Computing, Vol. 26, No. 5, 1997, pp. 14841509. doi:10.1137/S0097539795293172 [14] R. P. Brent, “Some Integer Factorization Algorithms Using Elliptic Curves,” Proceedings of 9th Australian Computer Science Conference, Canberra, January 1985. [15] H. W. Lenstra Jr., “Factoring Integers with Elliptic Cur ves,” Annals of Mathematics, Vol. 126, No. 2, 1987, pp. 649673. doi:10.2307/1971363 [16] P. L. Montgomery, “A FFT Extension of the Elliptic Curve Method of Factorization,” PhD Thesis, University of California, Los Angeles, 1992. [17] R. Schoof, “Counting Points of Elliptic Curves over Fi nite Fields,” Journal de Théorie des Nombres de Bor deaux, Vol. 7, No. 1, 1995, pp. 219254. doi:10.5802/jtnb.142 [18] R. Lencier, D. Lubicz and F. Vercauteren, “Point Count ing on Elliptic and Hyperelliptic Curves,” In: H. Cohen and G. Frey, Eds., Handbook of Elliptic and Hyperelliptic Curve Cryptography, Chapman & Hall/CRC, Boca Raton, 2006, pp. 407453. [19] A. G. B. Lauder and D. Wan, “Counting Points on Varie ties over Finite Fields of Small Characteristics,” In: J. P. Buhler and P. Stevenhagen, Eds., Algorithmic Number Theory, Cambridge University Press, Cambridge, 2008, pp. 579612. Copyright © 2011 SciRes. IJCNS
B. VERKHOVSKY Copyright © 2011 SciRes. IJCNS 614 [20] A. Weil, “Number of Solutions of Equations in Finite Fields,” Bulletin of American Mathematical Society, Vol. 55, 1949, pp. 497508. doi:10.1090/S000299041949092194 [22] Boris S. Verkhovsky, “Integer Factorization: Solution via Algorithm for Constrained Discrete Logarithm Problem,” Journal of Computer Science, Vol. 5, No. 9, 2009, pp. 674679. doi:10.3844/jcssp.2009.674.679 [23] “RSA Factoring Challenge,” http://en.wikipedia.org/wiki/RSA_Factoring_Challenge [21] A. G. B. Lauder, “Counting Solutions to Equations in Many Variables over Finite Fields,” Foundation of Com putational Mathematics, Vol. 4, No. 3, 2004, pp. 221267. doi:10.1007/s102080030093y Appendix A1: Integer fa ctori zation of “l arger” n Suppose that n = 5,912,473,983,049,810,121,582,491,435,559,753. 1. Compute four distinct values A, Q, R, and U {see (2.2) and Table A.1 }, where Q > max(A, R) > min(A,R) > U; 2. Reduce ; 28 28 2828 :mod10 ;:mod10 ;:mod10 ;:mod10QQAARR UU 3. ; :()/4SQUAR 4. p=gcd(n,S) = 59,604,64 4,783,353,249; 5. q=n/p=99,194,853,094,755 ,497. Table A.1. Values of A, Q, R, U and S. Outputs Number of Points on Elliptic Curves and S A 5,912,473,961,382,574,071,288,527,255,437,165 Q 5,912,474,044,194,428,121,806,637,304,594,777 R 5,912,474,004,717,045,895,410,530,286,258,045 U 5,912,473,921,905,192,397,824,270,895,949,025 S 19,738,690,974,965,090,844,456,218 A2: Proof of Proposition 7.2 Consider two elliptic curves for an integer positive m: ECP: ; (A.1) 221 2mod m yxx p ECN: . (A.2) 221 2mod m YXX p Let us show that there exist such integers u and w that substitutions x:=uX modp; and y:=wY modp (A.3) establish an onetoone correspondence between points of ECP and ECN for every in teger m. First of all, from (A.1) we derive 222 21 2mod m wYuXu Xp . (A.4) Let us select integers u and w, each coprime with p, for which hold 23 md ;owu p (A.5) and ; or . (A.6) 3 4moduu p 24modu p If integer solutions of (A.5) and (A.6) exist, then after cancellation of equal terms in both sides of (A.4) we derive (A.2). Therefore, from (A.6) 21modup p; (A.7) and from (A.5) 3modwu p 1 . (A.8) Integer u exists if pmod4=1; since (p–1)/2 is even. Therefore, , i.e., p–1 is a QR. 1/2 1mod pp On the other hand, integer w also exists, because u itself is QR modulo p.
B. VERKHOVSKY615 Indeed, 1/2 1/2 1/2 2121 mod1 pp p pp ;p (A.9) since, as it shown in Table A.2, both 2 and 1p are simultaneously either QR or QNR modulo p. Q.E.D. Table A.2. Parity of quadratic residuosities of 2 and 1p . pmod8 pmod8=1 pmod8=5 2 QR QNR 1p QR QNR ExampleA1: Let p=13; find u and w, such that 23 m3od1wu; and , i.e., 29mod13u3u . Then 23 27 mod131.wu Therefore w=1 and u=3. Indeed, 23 3 13mod13; 343mod13. Therefore, . 3 ;(mod13)xXyY ECP ECN Table A.3 shows onetoone c orrespondence betwe en ECP and ECN Table A.3. Correspondence between (x,y) and (X,Y). ECP (0,0) (3,0)(10,0)(2,4)(2,9)(11,6)(11,7) ECN (0,0) (1,0)(12,0)(5,4)(5,9)(8,6)(8,7) ExampleA2: Let p=41; find u and w, such that 23 m1od4wu; and , i.e., 18. 2437mod41u u Then 23583210 mod41wu , where both 10 and 31 are QR modu lo 4 1. Theref o re w=16; and u=18. I nd eed, 23 3 1618mod41; 18418mod41 . Thus, . 18 ;16mod41xXyY ECP ECN Table A.4 shows onetoone corres po ndence between points on ECP and ECN for several nonBlum primes. Table A.4. (u,w) as function of p. p 13 29 37 41 (u,w) (3, 1) (5, 3) (12, 10) (18, 16) Copyright © 2011 SciRes. IJCNS
