SkillAgentSearch skills...

RcppBigIntAlgos

R Package for Factoring Big Integers using the C Library GMP (GNU Multiple Precision Arithmetic)

Install / Use

/learn @jwood000/RcppBigIntAlgos

README

CRAN status R build status Dependencies Coverage status Codacy Badge

RcppBigIntAlgos

Overview

RcppBigIntAlgos is an R package for efficiently factoring large integers. It is multithreaded and uses the C library GMP (GNU Multiple Precision Arithmetic). For very large integers, prime factorization is carried out by a variant of the quadratic sieve algorithm that implements multiple polynomials. For smaller integers, a simple elliptic curve algorithm is attempted followed by a constrained version of the Pollard’s rho algorithm (original code from https://gmplib.org/… this is the same algorithm found in the R gmp package called by the function factorize). Finally, one can quickly obtain a complete factorization of a given number n via divisorsBig.

Installation

## install.packages("RcppBigIntAlgos")

## Or install the development version
## devtools::install_github("jwood000/RcppBigIntAlgos")

The Quadratic Sieve

The function quadraticSieve implements the multiple polynomial quadratic sieve algorithm. Currently, quadraticSieve can comfortably factor numbers with less than 70 digits (~230 bits) on most standard personal computers. If you have access to powerful computers with many cores, factoring 100+ digit semiprimes in less than a day is not out of the question. Note, the function primeFactorizeBig(n, skipECM = T, skipPolRho = T) is the same as quadraticSieve(n).

library(gmp)
#> 
#> Attaching package: 'gmp'
#> The following objects are masked from 'package:base':
#> 
#>     %*%, apply, crossprod, matrix, tcrossprod
library(RcppBigIntAlgos)

## Generate large semi-primes
semiPrime120bits <- prod(nextprime(urand.bigz(2, 60, 42)))
#> Seed default initialisation
#> Seed initialisation
semiPrime130bits <- prod(nextprime(urand.bigz(2, 65, 1)))
#> Seed initialisation
semiPrime140bits <- prod(nextprime(urand.bigz(2, 70, 42)))
#> Seed initialisation

## The 120 bit number is 36 digits
nchar(as.character(semiPrime120bits))
#> [1] 36

## The 130 bit number is 39 digits
nchar(as.character(semiPrime130bits))
#> [1] 39

## The 140 bit number is 42 digits
nchar(as.character(semiPrime140bits))
#> [1] 42

## Using factorize from gmp package which implements pollard's rho algorithm
##**************gmp::factorize*********************
system.time(print(factorize(semiPrime120bits)))
#> Big Integer ('bigz') object of length 2:
#> [1] 638300143449131711  1021796573707617139
#>    user  system elapsed 
#> 100.788   0.291 101.112 

system.time(print(factorize(semiPrime130bits)))
#> Big Integer ('bigz') object of length 2:
#> [1] 14334377958732970351 29368224335577838231
#>      user   system  elapsed 
#>  1253.060    2.612 1256.065 
 
system.time(print(factorize(semiPrime140bits)))
#> Big Integer ('bigz') object of length 2:
#> [1] 143600566714698156857  1131320166687668315849
#>     user   system  elapsed 
#> 1673.666    3.154 1676.838 


##**************quadraticSieve*********************
## quadraticSieve is much faster and scales better
system.time(print(quadraticSieve(semiPrime120bits)))
#> Big Integer ('bigz') object of length 2:
#> [1] 638300143449131711  1021796573707617139
#>    user  system elapsed 
#>   0.041   0.000   0.042

system.time(print(quadraticSieve(semiPrime130bits)))
#> Big Integer ('bigz') object of length 2:
#> [1] 14334377958732970351 29368224335577838231
#>    user  system elapsed 
#>   0.051   0.001   0.052

system.time(print(quadraticSieve(semiPrime140bits)))
#> Big Integer ('bigz') object of length 2:
#> [1] 143600566714698156857  1131320166687668315849
#>    user  system elapsed 
#>   0.077   0.001   0.078

Using Multiple Threads

As of version 0.3.0, we can utilize multiple threads. Below, are a few examples:

  1. The largest Cunnaningham Most Wanted number from the first edition released in 1983 in less than 14 seconds.
  2. RSA-79 under 2 minutes.
  3. A 300-bit (91-digits) semiprime in 1 hour.
  4. RSA-99 under 5 hours.
  5. RSA-100 under 10 hours.

Below are my machine specs and R version info:

## MacBook Air (2022)
## Processor: Apple M2
## Memory; 24 GB

sessionInfo()
#> R version 4.3.1 (2023-06-16)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Ventura 13.4.1
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRblas.0.dylib 
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: America/New_York
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] RcppBigIntAlgos_1.1.0    gmp_0.7-2
#> 
#> loaded via a namespace (and not attached):
#> [1] compiler_4.3.1

## Maximum number of available threads
stdThreadMax()
#> [1] 8

Most Wanted 1983

mostWanted1983 <- as.bigz(div.bigz(sub.bigz(pow.bigz(10, 71), 1), 9))
quadraticSieve(mostWanted1983, showStats = TRUE, nThreads = 8)
#> 
#> Summary Statistics for Factoring:
#>     11111111111111111111111111111111111111111111111111111111111111111111111
#> 
#> |      MPQS Time     | Complete | Polynomials |   Smooths  |  Partials  |
#> |--------------------|----------|-------------|------------|------------|
#> |      10s 940ms     |   100%   |    16183    |    4089    |    4345    |
#> 
#> |  Mat Algebra Time  |    Mat Dimension   |
#> |--------------------|--------------------|
#> |      2s 107ms      |     8301 x 8434    |
#> 
#> |     Total Time     |
#> |--------------------|
#> |      13s 173ms     |
#> 
#> Big Integer ('bigz') object of length 2:
#> [1] 241573142393627673576957439049           
#> [2] 45994811347886846310221728895223034301839

RSA-79

rsa79 <- as.bigz("7293469445285646172092483905177589838606665884410340391954917800303813280275279")
quadraticSieve(rsa79, showStats = TRUE, nThreads = 8)
#> 
#> Summary Statistics for Factoring:
#>     7293469445285646172092483905177589838606665884410340391954917800303813280275279
#> 
#> |      MPQS Time     | Complete | Polynomials |   Smooths  |  Partials  |
#> |--------------------|----------|-------------|------------|------------|
#> |     1m 35s 89ms    |   100%   |    91221    |    5651    |    7096    |
#> 
#> |  Mat Algebra Time  |    Mat Dimension   |
#> |--------------------|--------------------|
#> |      5s 175ms      |    12625 x 12747   |
#> 
#> |     Total Time     |
#> |--------------------|
#> |    1m 40s 558ms    |
#> 
#> Big Integer ('bigz') object of length 2:
#> [1] 848184382919488993608481009313734808977 
#> [2] 8598919753958678882400042972133646037727

Random 300-bit Semiprime

semi_prime_300_bit <- prod(nextprime(urand.bigz(nb = 2, size = 150, seed = 42)))
#> Seed default initialisation
#> Seed initialisation

nchar(as.character(semi_prime_300_bit))
#> [1] 91

quadraticSieve(semi_prime_300_bit, showStats=TRUE, nThreads=8)
#> 
#> Summary Statistics for Factoring:
#>     1598678911004402782180963020655649301157676037614983547537229086778878619660017752047003277
#> 
#> |      MPQS Time     | Complete | Polynomials |   Smooths  |  Partials  |
#> |--------------------|----------|-------------|------------|------------|
#> |   1h 3m 4s 604ms   |   100%   |   2447868   |    6835    |    12143   |
#> 
#> |  Mat Algebra Time  |    Mat Dimension   |
#> |--------------------|--------------------|
#> |      13s 202ms     |    18892 x 18978   |
#> 
#> |     Total Time     |
#> |--------------------|
#> |   1h 3m 18s 733ms  |
#> 
#> Big Integer ('bigz') object of length 2:
#> [1] 1262920060924323380524693068285713630353017593
#> [2] 1265859146963220756974764147023611562195937589

RSA-99

rsa99 <- "256724393281137036243618548169692747168133997830674574560564321074494892576105743931776484232708881"
 
quadraticSieve(rsa99, showStats=TRUE, nThreads=8)
#> 
#> Summary Statistics for Factoring:
#>     256724393281137036243618548169692747168133997830674574560564321074494892576105743931776484232708881
#> 
#> |      MPQS Time     | Complete | Polynomials |   Smooths  |  Partials  |
#> |--------------------|----------|-------------|------------|------------|
#> |  4h 41m 28s 410ms  |   100%   |   7351308   |    9203    |    15846   |
#> 
#> |  Mat Algebra Time  |    Mat Dimension   |
#> |--------------------|--------------------|
#> |      31s 359ms     |    24929 x 25049   |
#> 
#> |     Total Time     |
#> |--------------------|
#> |   4h 42m 1s 55ms   |
#> 
#> Big Integer ('bigz') object of length 2:
#> [1] 4868376167980921239824329271069101142472222111193 
#> [2] 52733064254484107837300974402288603361507691060217

RSA-100

rsa100 <- "152260502792253336053
View on GitHub
GitHub Stars13
CategoryDevelopment
Updated1y ago
Forks1

Languages

C++

Security Score

65/100

Audited on Aug 16, 2024

No findings