SkillAgentSearch skills...

Msieve

MSIEVE: A Library for Factoring Large Integers

Install / Use

/learn @upiter/Msieve
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

MSIEVE: A Library for Factoring Large Integers

Jason Papadopoulos

Introduction

Msieve is the result of my efforts to understand and optimize how integers are factored using the most powerful modern algorithms.

This documentation corresponds to version 1.54 of the Msieve library. Do not expect to become a factoring expert just by reading it. I've included a relatively complete list of references that you can and should look up if you want to treat the code as more than a black box to solve your factoring problems.

What Msieve Does

Factoring is the study (half math, half engineering, half art form) of taking big numbers and expessing them as the product of smaller numbers.

If I find out 15 = 3 * 5, I've performed an integer factorization on the number 15. As the number to be factored becomes larger, the difficulty involved in completing its factorization explodes, to the point where you can invent secret codes that depend on the difficulty of factoring and reasonably expect your encrypted data to stay safe.

There are plenty of algorithms for performing integer factorization.

The Msieve library implements most of them from scratch, and relies on optional external libraries for the rest of them.

Trial division and Pollard Rho is used on all inputs; if the result is less than 25 digits in size, tiny custom routines do the factoring.

For larger numbers, the code switches to the GMP-ECM library and runs the P-1, P+1 and ECM algorithms, expending a user-configurable amount of effort to do so.

If these do not completely factor the input number, the library switches to the heavy artillery. Unless told otherwise, Msieve runs the self-initializing Quadratic Sieve (QS) algorithm, and if this doesn't factor the input number then you've found a library problem.

If you know what you're doing, Msieve also contains a complete implementation of the Number Field Sieve (NFS), that has helped complete some of the largest public factorization efforts known. Information specific to the quadratic sieve implementation is contained in Readme.qs, while the number field sieve variant is described in Readme.nfs

The maximum size of numbers that can be given to the library is hardwired at compile time. Currently the code can handle numbers up to ~310 digits; however, you should bear in mind that I don't expect the library to be able to complete a factorization larger than about 120 digits by itself.

The larger size inputs can only really be handled by the number field sieve, and the NFS sieving code is not efficient or robust enough to deal with problems larger than that. Msieve can complete very large NFS factorizations as long as you use the NFS sieving tools from other open-source packages.

Goals

Msieve was written with several goals in mind:

  • To be fast. I originally started this project in 2003 because I was dismayed at the lack of progress in high-performance implementations of the quadratic sieve. By 2006 msieve was the fastest quadratic sieve available anywhere, by a wide margin. Libraries like YAFU have since pushed the performance envelope for the quadratic sieve much harder, and the focus in msieve has shifted to the number field sieve.

  • To be complete and comprehensive. I've tried to make as many parts of the library state-of-the-art as possible. Parts of the NFS code have turned into research platforms in their own right.

  • To be simple to use. The only input is the integer to be factored. Everything else happens automatically, though a certain amount of user control is possible if you know what you're doing.

  • To be free (as in beer). The entire code base is released into the public domain. This is hobby stuff for me, and the more it's used the better. I have made every effort to avoid using external code that is licensed under the GPL, opting instead for less restrictive licenses.

If you choose to use Msieve, please let me know how it goes. I welcome bug reports, suggestions, optimizations, ports to new platforms, complaints, boasts, whatever.

Getting Msieve

My web page ([www.boo.net/~jasonp][jasonp.site]) used to be the main distribution venue for Msieve source and binaries. As the codebase has grown and more people have voluneered to help with it, it became less and less convenient to base things there, and so Msieve development and binary releases are now available on SourceForge ([msieve.sourceforge.net][msieve.sf]).

The source distribution comes with a unix [makefile][msieve.makefile] you can use if you want to build msieve from source. If you have Microsoft Visual Studio, Brian Gladman has kindly provided a set of build files that will generate Windows binaries.

Using Msieve

Just to be confusing, there are two things that I call 'Msieve' interchangeably.

The source distribution builds a self-contained static library 'libmsieve.a', that actually performs factorizations, and also builds a 'msieve' demo application that uses the library. The library has a very lightweight inter- face defined in [msieve.h][msieve.header], and can be used in other applications. While the demo application is (slightly) multithreaded, most the library is single- threaded and all of its state is passed in. The linear algebra code used in the quadratic- and number field sieve is multithread aware, and the entire library is supposed to be multithread-safe.

The demo application has only one job: to act as a delivery vehicle for integers to be factored. Numbers can come from a text file, from redirected input, from the command line directly, or can be manually typed in one at a time. Batch input is also supported, so that you can just point the application to a text file with a collection of numbers, one per line. By default, all output goes to a logfile and a summary goes to the screen.

For the complete list of options, try:

msieve -h

Arithmetic expressions

Starting with v1.08, the inputs to msieve can be integer arithmetic expressions using any of the following operators:

| | | | | |---: |:--- |------------- |------------ | |+ |- |plus, minus |(lowest priority) |% | |integer remainder |* |/ |multiply, integer divide |^ | |power |( |) |grouping |(highest priority)

Hence for example:

	(10^53 - 1) / 9

gives the value:

11111111111111111111111111111111111111111111111111111

The integers used in an expression can be of any length but all intermediate results and the final result are restricted to 275 or less decimal digits.

Intermediate information

While factoring an integer, the library can produce a very large amount of intermediate information. This information is stored in one or more auxiliary savefiles, and the savefiles can be used to restart an interrupted factorization. Note that factoring one integer and then another integer will overwrite the savefiles from the first integer.

Memory usage

The amount of memory that's needed will depend on the size of the number to be factored and the algorithm used. If running the quadratic sieve or the number field sieve, the memory requirements increase towards the end of a factorization, when all of the intermediate results are needed at the same time. For a 100-digit quadratic sieve factorization, most of the time Msieve needs 55-65MB of memory, with the last stage of the factorization needing 100-130MB. The final part of the number field sieve can use up incredible amounts of memory; for example, completing the factorization of a 512-bit number like an RSA key needs about 2GB of memory.

\pagebreak

Frequently Asked Questions

Factor much bigger numbers

Q: I want to factor much bigger numbers. Can't Msieve solve problems larger than you say?

Q: I'm trying to break my ex-girlfriend's RSA key with Msieve, and it's not working. What's wrong?

A: The quadratic sieve really is not appropriate for factoring numbers over ~110 digits in size. On a fast modern CPU, a 110-digit QS factorization takes nearly 120 hours for Msieve, and the time increases steeply beyond that. If you have really big factorization needs, there are several packages that you can use to make up for the low-performance parts of Msieve's NFS code:

  • [CADO-NFS][cado.nfs.web] is a complete NFS suite that includes state-of-the-art (or nearly so) implementations of all phases of the Number Field Sieve. The sieving tools in CADO-NFS produce output compatible with GGNFS and Msieve, so it's up to you which package you want to use for a given phase of NFS. In addition, CADO-NFS is under constant very active development, and in my opinion represents the future of the high-performance NFS community.

  • [GGNFS][ggnfs.sf] has state-of-the-art tools for NFS sieving and NFS polynomial selection. Using the latter is optional, since Msieve actually has very good NFS polynomial selection code, but the former is an absolute must if you are factoring numbers larger than about 95 digits.

  • [Factor-by-GNFS][factor-by-gnfs.sf] is an older, separate package that's not really compatible.

See later for a list of web resources that give an idea of the largest actual factorizations that Msieve has performed.

Network aware client-server Msieve version

Q: Can you make Msieve network aware? Can you make it a client-server thingy? Can I use the internet to factor numbers?

A: The demo application for the Msieve library is just that, a demo. I don't know anything about network programming and I'm not qualified to build a client-server application that's safe in the face of internet threats. If you have these kinds of smarts, you can use Msieve in your own code and I'll help as much as I can. The demo is good enough for people with a few machines on a small private LAN, and this is ~100% of the user community right now.

Library documentation

Q: Where's all the documentation on how your library actually works?

A: There reall

View on GitHub
GitHub Stars15
CategoryDevelopment
Updated13d ago
Forks2

Languages

C

Security Score

80/100

Audited on Mar 15, 2026

No findings