Factorization cliff-edge? #29

hvds · 2022-04-17T20:38:44Z

This may well be expected, but reporting it just in case - what's worst case time expected to be for factorizing something just under 100 digits?

I'm running my code [1] on a slightly harder case than before, involving slightly higher numbers (by a few digits). However it appears to have got stuck on one number for over 20h so far, when previous runs tended to stall for at most a few minutes.

I attached to the process to get the current stack trace, below. Assuming mpz_t is storing numbers in an obvious way, 64 bits per limb, I believe the number it is factorizing is: 363277160032028232403370760029415771263823736047870707805583581797268868441456882284309326109905669 (about 3.6e98).

(gdb) where
#0  next_prime_in_segment (p=5793758587, segment_bytes=24560, 
    segment_start=5793529590, 
    sieve=0x55d9ce5a4010 "\377\377\277\377\376_\347\362\376\375{T\177.\373\377\377\375\367\376\377\263\037\333\177\377\333\257\347{\365\311\376\373\377\377\363\337\357\367\371gow\377\337\366\377\377\377\225\277\377\377\333\372\357w\277\376\237j\377\377\334\363\377\377\357\277\373\377\357\376\327\337w}\177\363\357\377\357\377\377\376\337\377\327\375\272\377\376\377\177']\367~\377\367\355\337\177\317\177\365\377\377\377\376w\347\337\351\376}\365\377\377\371\377\337\336\343\376\377\275\377m\377\337\067\377\277\377?\377\376\373\357{\277\376\357\371\372\373\276\277\177\317\377\177\377\177\377\373\337\177\276?\375\343\377\377\377\223\356\337\257{g\345\337\177\377\373\375\377\366\377\374\277\347\376\376\376\377\367\267\301\377\335z\337\355\177\377{"...) at prime_iterator.c:40
#1  next_prime_in_segment (p=5793758587, segment_bytes=24560, 
    segment_start=5793529590, 
    sieve=0x55d9ce5a4010 "\377\377\277\377\376_\347\362\376\375{T\177.\373\377\377\375\367\376\377\263\037\333\177\377\333\257\347{\365\311\376\373\377\377\363\337\357\367\371gow\377\337\366\377\377\377\225\277\377\377\333\372\357w\277\376\237j\377\377\334\363\377\377\357\277\373\377\357\376\327\337w}\177\363\357\377\357\377\377\376\337\377\327\375\272\377\376\377\177']\367~\377\367\355\337\177\317\177\365\377\377\377\376w\347\337\351\376}\365\377\377\371\377\337\336\343\376\377\275\377m\377\337\067\377\277\377?\377\376\373\357{\277\376\357\371\372\373\276\277\177\317\377\177\377\177\377\373\337\177\276?\375\343\377\377\377\223\356\337\257{g\345\337\177\377\373\375\377\366\377\374\277\347\376\376\376\377\367\267\301\377\335z\337\355\177\377{"...) at prime_iterator.c:28
#2  prime_iterator_next (iter=iter@entry=0x7ffe5805e170)
    at prime_iterator.c:437
#3  0x00007fadbd940e9d in ec_stage2 (f=0x7ffe5805e2d0, z=0x7ffe5805e1c0, 
    x=0x7ffe5805e1b0, B2=8192000000, B1=81920000) at ecm.c:535
#4  _GMP_ecm_factor_projective (n=n@entry=0x7ffe5805e2e0, 
    f=f@entry=0x7ffe5805e2d0, B1=B1@entry=81920000, B2=8192000000, B2@entry=0, 
    ncurves=ncurves@entry=100) at ecm.c:667
#5  0x00007fadbd93b3a6 in factor (input_n=input_n@entry=0x7ffe5805efe0, 
    pfactors=pfactors@entry=0x7ffe5805ef70, 
    pexponents=pexponents@entry=0x7ffe5805ef78) at factor.c:303
#6  0x00007fadbd93d1a2 in is_semiprime (n=0x7ffe5805efe0) at factor.c:821
#7  0x00007fadbd96a2b7 in XS_Math__Prime__Util__GMP_liouville (
    cv=<optimized out>) at XS.xs:768
...
(gdb) up 6
(gdb) p n
$1 = (__mpz_struct *) 0x7ffe5805efe0
(gdb) p *n
$2 = {_mp_alloc = 7, _mp_size = 6, _mp_d = 0x55d9cd2976c0}
(gdb) p n->_mp_d[0]@6
$3 = {14963268348004307717, 17357550526725585807, 9582347205594536685, 
  15962625503134170351, 1376293721990814124, 170}
(gdb) p /x n->_mp_d[0]@6
$4 = {0xcfa83548a1a49305, 0xf0e2683d645f7b8f, 0x84fb560397eb5aed, 
  0xdd86a3524ce73cef, 0x131993d925dcd9ac, 0xaa}

[1] Using the injecting branch, I'm running ./oul -x32e104 -g1000000 92 6 with perl-5.34.0, MPU-0.73, MPUG-0.52, configured with:

./Configure -des -Dprefix=/opt/maths-5.34.0 -Dcc=gcc-9 -Doptimize="-g -O6" -Duse64bitall -Uversiononly

The text was updated successfully, but these errors were encountered:

hvds · 2022-04-18T00:11:46Z

The machine crashed 3 hours later, so I don't get to discover if it would have completed. :(

I'll focus on other inputs that use smaller numbers for now. If there is no bug here - if 24+ hours is a realistic time to expect for these factorizations - I'll need to find a different approach to have a chance at finding those larger values.

danaj · 2023-06-26T03:27:34Z

Unfortunately this module doesn't have state of the art factoring. The ECM is pretty good, but tuned more for smaller numbers than GMP-ECM. ECM really struggles past 60 or so digits unless there is a small factor. The Quadratic Sieve is great to have but certainly isn't close to state of the art (the code is over 15 years old now). An improved version could perhaps work on this size (I mean, either SIMPQS 2.0 or new code that might practically make it into this module -- yafu's QS definitely works at this size).

We only run our QS for numbers between 30 digits and 300 bits (90-ish digits). Running this number I get lots of small attempts then it goes into the loop doing obnoxious amounts of ECM. ECM for a semiprime this size has a miniscule chance of succeeding (this is quite a bit larger than the ECM record, so I would basically give it zero chance), but we're really out of options in this module and we still aren't sure if maybe there is a small-ish factor (20-30 digits) that will let us succeed.

$ tmmpu 'prime_set_config(verbose=>3); say for factor("363277160032028232403370760029415771263823736047870707805583581797268868441456882284309326109905669");'
# p-1 trying 363277160032028232403370760029415771263823736047870707805583581797268868441456882284309326109905669 (B1=20000 B2=200000)
# p-1: no factor
# ecm trying 363277160032028232403370760029415771263823736047870707805583581797268868441456882284309326109905669 (B1=200 B2=20000 ncurves=4)
# ecm: no factor
<.......>
starting large ECM on 363277160032028232403370760029415771263823736047870707805583581797268868441456882284309326109905669
# ecm trying 363277160032028232403370760029415771263823736047870707805583581797268868441456882284309326109905669 (B1=1280000 B2=128000000 ncurves=100)
# ecm: no factor
<......>

I'd recommend YAFU for these sorts of numbers. With a bit of infrastructure it can be called from Perl.

danaj · 2023-06-26T09:34:13Z

I tested with Pari/GP 2.14.0. It successfully factored after 5 hours 48 minutes.

danaj · 2023-06-26T19:06:13Z

SIMPQS 2.0 factored successfully in 9.5 hours.

hvds · 2024-04-04T21:15:02Z

FWIW I tried this today with the code from #48, after applying the patch below as a naive extrapolation of appropriate configuration parameters. This was on a loaded machine (load average 6):

...
56545 full, 62305 combined, 796640 partial
11469876 curves, 796640 partials.
118850 relations found in total!
reduced to 103983 x 104047 in 4 passes
lanczos halted after 1646 iterations
63 nullspace vectors found.
  32177148682644876313598264901319792229056727
  11289911471489891284233137775351271575289506866245788547
25785.92user 0.44system 7:09:45elapsed 100%CPU (0avgtext+0avgdata 597528maxresident)k
0inputs+0outputs (0major+183517minor)pagefaults 0swaps

.. which I think would probably beat your time for PARI had the machine been unloaded. So (depending on the difference between our computers) I believe a less naive extension of the config values would likely support extending the range of sizes that factor() targets for QS.

--- a/simpqs.c
+++ b/simpqs.c
@@ -2696,14 +2696,15 @@ int _GMP_simpqs(mpz_t n, mpz_t *farray) {
      *   largeprime >= 1000 * 500000 (not 1/10 of that)
      *   errorbits >= 33
      */
-    numPrimes = 64000;
+    int excess = decdigits - 91;
+    numPrimes = 80000 + 4000 * excess;
     Mdiv2 = 192000 / SIEVEDIV;
-    largeprime = numPrimes * 10 * decdigits;
+    largeprime = 1000 * (500000 + 50000 * excess);
     secondprime = SECONDPRIME;
     midprime = MIDPRIME;
-    firstprime = 30;
-    errorbits = decdigits / 4 + 2;
-    threshold = 43 + (7 * decdigits) / 10;
+    firstprime = 30 + excess / 5;
+    errorbits = 34 + excess / 5;
+    threshold = 102 + excess;
   }
 
   if (verbose > 2)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Factorization cliff-edge? #29

Factorization cliff-edge? #29

hvds commented Apr 17, 2022 •

edited

Loading

hvds commented Apr 18, 2022

danaj commented Jun 26, 2023

danaj commented Jun 26, 2023

danaj commented Jun 26, 2023

hvds commented Apr 4, 2024

Factorization cliff-edge? #29

Factorization cliff-edge? #29

Comments

hvds commented Apr 17, 2022 • edited Loading

hvds commented Apr 18, 2022

danaj commented Jun 26, 2023

danaj commented Jun 26, 2023

danaj commented Jun 26, 2023

hvds commented Apr 4, 2024

hvds commented Apr 17, 2022 •

edited

Loading