RcppBigIntAlgos
R Package for Factoring Big Integers using the C Library GMP (GNU Multiple Precision Arithmetic)
Install / Use
/learn @jwood000/RcppBigIntAlgosREADME
RcppBigIntAlgos
Overview
RcppBigIntAlgos is an R package for efficiently factoring large integers. It is multithreaded and uses the C library GMP (GNU Multiple Precision Arithmetic). For very large integers, prime factorization is carried out by a variant of the quadratic sieve algorithm that implements multiple polynomials. For smaller integers, a simple elliptic curve algorithm is attempted followed by a constrained version of the Pollard’s rho algorithm (original code from https://gmplib.org/… this is the same algorithm found in the R gmp package called by the function factorize). Finally, one can quickly obtain a complete factorization of a given number n via divisorsBig.
Installation
## install.packages("RcppBigIntAlgos")
## Or install the development version
## devtools::install_github("jwood000/RcppBigIntAlgos")
The Quadratic Sieve
The function quadraticSieve implements the multiple polynomial quadratic sieve algorithm. Currently, quadraticSieve can comfortably factor numbers with less than 70 digits (~230 bits) on most standard personal computers. If you have access to powerful computers with many cores, factoring 100+ digit semiprimes in less than a day is not out of the question. Note, the function primeFactorizeBig(n, skipECM = T, skipPolRho = T) is the same as quadraticSieve(n).
library(gmp)
#>
#> Attaching package: 'gmp'
#> The following objects are masked from 'package:base':
#>
#> %*%, apply, crossprod, matrix, tcrossprod
library(RcppBigIntAlgos)
## Generate large semi-primes
semiPrime120bits <- prod(nextprime(urand.bigz(2, 60, 42)))
#> Seed default initialisation
#> Seed initialisation
semiPrime130bits <- prod(nextprime(urand.bigz(2, 65, 1)))
#> Seed initialisation
semiPrime140bits <- prod(nextprime(urand.bigz(2, 70, 42)))
#> Seed initialisation
## The 120 bit number is 36 digits
nchar(as.character(semiPrime120bits))
#> [1] 36
## The 130 bit number is 39 digits
nchar(as.character(semiPrime130bits))
#> [1] 39
## The 140 bit number is 42 digits
nchar(as.character(semiPrime140bits))
#> [1] 42
## Using factorize from gmp package which implements pollard's rho algorithm
##**************gmp::factorize*********************
system.time(print(factorize(semiPrime120bits)))
#> Big Integer ('bigz') object of length 2:
#> [1] 638300143449131711 1021796573707617139
#> user system elapsed
#> 100.788 0.291 101.112
system.time(print(factorize(semiPrime130bits)))
#> Big Integer ('bigz') object of length 2:
#> [1] 14334377958732970351 29368224335577838231
#> user system elapsed
#> 1253.060 2.612 1256.065
system.time(print(factorize(semiPrime140bits)))
#> Big Integer ('bigz') object of length 2:
#> [1] 143600566714698156857 1131320166687668315849
#> user system elapsed
#> 1673.666 3.154 1676.838
##**************quadraticSieve*********************
## quadraticSieve is much faster and scales better
system.time(print(quadraticSieve(semiPrime120bits)))
#> Big Integer ('bigz') object of length 2:
#> [1] 638300143449131711 1021796573707617139
#> user system elapsed
#> 0.041 0.000 0.042
system.time(print(quadraticSieve(semiPrime130bits)))
#> Big Integer ('bigz') object of length 2:
#> [1] 14334377958732970351 29368224335577838231
#> user system elapsed
#> 0.051 0.001 0.052
system.time(print(quadraticSieve(semiPrime140bits)))
#> Big Integer ('bigz') object of length 2:
#> [1] 143600566714698156857 1131320166687668315849
#> user system elapsed
#> 0.077 0.001 0.078
Using Multiple Threads
As of version 0.3.0, we can utilize multiple threads. Below, are a few examples:
- The largest Cunnaningham Most Wanted number from the first edition released in 1983 in less than 14 seconds.
- RSA-79 under 2 minutes.
- A 300-bit (91-digits) semiprime in 1 hour.
- RSA-99 under 5 hours.
- RSA-100 under 10 hours.
Below are my machine specs and R version info:
## MacBook Air (2022)
## Processor: Apple M2
## Memory; 24 GB
sessionInfo()
#> R version 4.3.1 (2023-06-16)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Ventura 13.4.1
#>
#> Matrix products: default
#> BLAS: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> time zone: America/New_York
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] RcppBigIntAlgos_1.1.0 gmp_0.7-2
#>
#> loaded via a namespace (and not attached):
#> [1] compiler_4.3.1
## Maximum number of available threads
stdThreadMax()
#> [1] 8
Most Wanted 1983
mostWanted1983 <- as.bigz(div.bigz(sub.bigz(pow.bigz(10, 71), 1), 9))
quadraticSieve(mostWanted1983, showStats = TRUE, nThreads = 8)
#>
#> Summary Statistics for Factoring:
#> 11111111111111111111111111111111111111111111111111111111111111111111111
#>
#> | MPQS Time | Complete | Polynomials | Smooths | Partials |
#> |--------------------|----------|-------------|------------|------------|
#> | 10s 940ms | 100% | 16183 | 4089 | 4345 |
#>
#> | Mat Algebra Time | Mat Dimension |
#> |--------------------|--------------------|
#> | 2s 107ms | 8301 x 8434 |
#>
#> | Total Time |
#> |--------------------|
#> | 13s 173ms |
#>
#> Big Integer ('bigz') object of length 2:
#> [1] 241573142393627673576957439049
#> [2] 45994811347886846310221728895223034301839
RSA-79
rsa79 <- as.bigz("7293469445285646172092483905177589838606665884410340391954917800303813280275279")
quadraticSieve(rsa79, showStats = TRUE, nThreads = 8)
#>
#> Summary Statistics for Factoring:
#> 7293469445285646172092483905177589838606665884410340391954917800303813280275279
#>
#> | MPQS Time | Complete | Polynomials | Smooths | Partials |
#> |--------------------|----------|-------------|------------|------------|
#> | 1m 35s 89ms | 100% | 91221 | 5651 | 7096 |
#>
#> | Mat Algebra Time | Mat Dimension |
#> |--------------------|--------------------|
#> | 5s 175ms | 12625 x 12747 |
#>
#> | Total Time |
#> |--------------------|
#> | 1m 40s 558ms |
#>
#> Big Integer ('bigz') object of length 2:
#> [1] 848184382919488993608481009313734808977
#> [2] 8598919753958678882400042972133646037727
Random 300-bit Semiprime
semi_prime_300_bit <- prod(nextprime(urand.bigz(nb = 2, size = 150, seed = 42)))
#> Seed default initialisation
#> Seed initialisation
nchar(as.character(semi_prime_300_bit))
#> [1] 91
quadraticSieve(semi_prime_300_bit, showStats=TRUE, nThreads=8)
#>
#> Summary Statistics for Factoring:
#> 1598678911004402782180963020655649301157676037614983547537229086778878619660017752047003277
#>
#> | MPQS Time | Complete | Polynomials | Smooths | Partials |
#> |--------------------|----------|-------------|------------|------------|
#> | 1h 3m 4s 604ms | 100% | 2447868 | 6835 | 12143 |
#>
#> | Mat Algebra Time | Mat Dimension |
#> |--------------------|--------------------|
#> | 13s 202ms | 18892 x 18978 |
#>
#> | Total Time |
#> |--------------------|
#> | 1h 3m 18s 733ms |
#>
#> Big Integer ('bigz') object of length 2:
#> [1] 1262920060924323380524693068285713630353017593
#> [2] 1265859146963220756974764147023611562195937589
RSA-99
rsa99 <- "256724393281137036243618548169692747168133997830674574560564321074494892576105743931776484232708881"
quadraticSieve(rsa99, showStats=TRUE, nThreads=8)
#>
#> Summary Statistics for Factoring:
#> 256724393281137036243618548169692747168133997830674574560564321074494892576105743931776484232708881
#>
#> | MPQS Time | Complete | Polynomials | Smooths | Partials |
#> |--------------------|----------|-------------|------------|------------|
#> | 4h 41m 28s 410ms | 100% | 7351308 | 9203 | 15846 |
#>
#> | Mat Algebra Time | Mat Dimension |
#> |--------------------|--------------------|
#> | 31s 359ms | 24929 x 25049 |
#>
#> | Total Time |
#> |--------------------|
#> | 4h 42m 1s 55ms |
#>
#> Big Integer ('bigz') object of length 2:
#> [1] 4868376167980921239824329271069101142472222111193
#> [2] 52733064254484107837300974402288603361507691060217
RSA-100
rsa100 <- "152260502792253336053
