12

RSA public-key encryption requires two very large prime numbers that serve as secrets in its key-generation process. These are typically produced with the help of a cryptographically secure random number generator.

However, random number generators generate random bits; they don't specifically generate primes. And it would seem that proving a number prime requires attempting to factor it -- the very problem that makes RSA hard.

How are the prime numbers found for use in RSA?

watou
Billy ONeal

4 Answers

11

Very important:

Determining whether an integer is prime is significantly easier than factoring it (at least in the current state of our knowledge). We can easily determine whether integers having thousands of decimal digits are prime, but such integers are far beyond the reach of current factoring algorithms.

Now to answer your question: "plain" RSA does not specify how the prime numbers are to be generated. For any particular implementation, you would need to look at its source code or documentation to see how it does it. A popular method is the "generate-and-test" paradigm:

  1. Generate a random number of appropriate length.
  2. See if it is prime, using either a probabilistic or a deterministic test depending on your level of paranoia. If it is prime, you have your prime. Otherwise, go back to step 1.
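As an illustration (not any particular library's implementation), the generate-and-test loop might look like this in Python, with a textbook Miller-Rabin as the probabilistic test:

```python
import secrets

def is_probable_prime(n, rounds=40):
    """Textbook Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # random witness in [2, n-2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # definitely composite
    return True  # prime with overwhelming probability

def random_prime(bits):
    """Generate-and-test: draw fresh random candidates until one passes."""
    while True:
        n = secrets.randbits(bits) | 1  # force oddness; even numbers can't be prime
        if n.bit_length() == bits and is_probable_prime(n):
            return n
```

Each failed candidate is thrown away entirely and a fresh one drawn, which is what keeps the output distribution uniform over primes of the given length.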

A simple application of the prime number theorem can give you an estimate of the probability that the number you picked is prime (depending on its length). This is generally not the same thing as the probability that your primality test will consider it prime, however, if you are using a probabilistic test.
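For instance, the prime number theorem says a number near $N$ is prime with probability roughly $1/\ln N$; a quick sanity check of that estimate for $N = 2^{1024}$:

```python
import math

def prime_density(bits):
    # Prime number theorem: numbers near N = 2**bits are prime
    # with probability roughly 1 / ln(N) = 1 / (bits * ln 2).
    return 1 / (bits * math.log(2))

print(round(1 / prime_density(1024)))        # about 710: 1 in ~710 is prime
print(round(1 / (2 * prime_density(1024))))  # about 355 when only odd numbers are tried
```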

I'll leave it to you as an exercise to implement this in your favourite CAS (e.g., PARI/GP or Sage) and see how it works in practice.

EDIT:

The two "tricks" given in the other answer will not make any difference in most cases.

  • Ensuring that we only give odd numbers to the primality test is useless: any decent primality test will immediately discard even numbers, precisely so as to not waste time on them.
  • Adding $2$ to the previous number instead of generating a new one from scratch also does not make any difference (at least on my machine). I can't see any reason to believe it should, either.
fkraiem
7

You can use the TRIVIAL method, as shown by fkraiem. This generates uniform results, but does use a lot of RNG calls, as Ilmari pointed out. In practice this may be a non-issue or may be very important (e.g. getting results from /dev/random may stall; getting good entropy on an isolated embedded device may be difficult).

You can optimize this slightly by generating the inner $n-2$ bits and setting the first and last bits to one, which saves no time in any sane primality test, but uses less randomness on average.
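A sketch of that trick (the function name is mine): force the top bit so the candidate has full length and the bottom bit so it is odd, drawing only $n-2$ random bits:

```python
import secrets

def candidate(bits):
    """Random candidate with the top and bottom bits forced to 1.
    The top bit guarantees full length, the bottom bit guarantees oddness.
    Consumes only bits - 2 bits of randomness."""
    inner = secrets.randbits(bits - 2)
    return (1 << (bits - 1)) | (inner << 1) | 1
```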

There are faster methods, such as those of Fouque and Tibouchi (2011 and 2014). These no longer generate primes with perfect uniformity, but they bound the deviation from uniform to an extremely small value, and they use far fewer random bits.

I believe what Begueradj was getting at with his "+2" step is the PRIMEINC method, which is basically saying "generate a random value $t$, if $t$ isn't prime, then return ${\rm nextprime}(t)$." This is fast but very non-uniform. While nobody knows how to exploit this for the sizes of numbers we're talking about, it seems dubious to knowingly use a method which returns some numbers many orders of magnitude more often than others.
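A toy demonstration of that non-uniformity, using small numbers and trial division for clarity: a prime sitting after a long gap absorbs every starting point in the gap, so it is drawn far more often.

```python
import random
from collections import Counter

def is_prime(n):
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def primeinc(lo, hi):
    """PRIMEINC: random start t, then walk upward to the next prime."""
    t = random.randrange(lo, hi)
    while not is_prime(t):
        t += 1
    return t

counts = Counter(primeinc(100, 200) for _ in range(20_000))
# 127 follows a gap of 14 (previous prime 113), so 14 starting points
# reach it; 109 follows 107, so only 2 starting points reach it.
print(counts[127], counts[109])
```

At RSA sizes the gaps vary over a much wider range, so some primes come out many orders of magnitude more often than others, exactly as described above.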

There are also the methods of Maurer and Shawe-Taylor. The latter is described in FIPS 186-4; Menezes et al. describe them both. These build up provably prime random numbers -- no primality testing needed (though if you are implementing this I strongly recommend doing tests and assertions throughout the code -- including perhaps having it output a primality proof which is verified by other code). (Added 2017:) As mentioned in a comment, a GMP implementation of Shawe-Taylor is very similar in speed to the trivial method. Maurer's algorithm is perhaps 1.5x slower (it works harder to maintain diversity vs. S-T).

For primality testing, see FIPS 186-4 for some detailed info including detailed algorithms and numbers of tests recommended for given security levels. In practice, random composites of this size are highly unlikely to pass even a single Miller-Rabin test, hence you will almost certainly be doing only 1 test per composite, followed by the full suite for your probable prime.

Details of test performance optimization can be seen in something like this answer. The goal is to reject composites as fast as possible.

As fkraiem points out, primality testing is far, far easier than factoring. Some example times for 2048-bit (617-digit) numbers:

  • a single M-R test takes ~2 milliseconds
  • a random prime using BPSW plus some extra M-R tests takes about 0.25 seconds on a MacBook
  • a random constructively proven prime takes 0.25-1 second
  • a random BPSW prime plus a generic proof (APR-CL or ECPP) takes about 30 seconds

The last (random prime followed by proof) is almost never used, except perhaps for the smaller DSA primes (proving a 160-bit or 256-bit number is very fast). But this shows even for this size it can be done in a reasonable time.

DanaJ
4

By the end of 2009, RSA-768 had been factored using the general number field sieve. This result means RSA moduli must be considerably longer than 768 bits (1024 bits at the very least, 2048 bits in current practice), so each of the two secret primes you are asking about must be at least 512 bits long, i.e. about 154 decimal digits. So we may ask: do enough primes with 154 decimal digits exist? This question matters because it bears directly on the security of the RSA protocol.

The answer is yes. Why? By the prime number theorem, the density of primes around $10^{154}$ is about $1/\ln(10^{154}) \approx 1/355$. What does this mean? It means that if you pick 1000 random numbers of 154 decimal digits, on average about 3 of them will be prime. As you can guess, there are astronomically many primes of this length -- far more than billions.
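That estimate can be checked directly (these are expected counts, not guarantees):

```python
import math

# Density of primes near 10**154, by the prime number theorem:
density = 1 / math.log(10 ** 154)
print(round(1 / density))     # about 355: roughly 1 number in 355 is prime
print(round(1000 * density))  # about 3 primes expected among 1000 random picks
```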

That said, here is a summary of how the prime numbers are generated in practice:

  • Choose RANDOMLY a number of 154 decimal digits (for a processor, this means choosing a random 512-bit number, as in coin flipping). Once this number is picked, set its lowest bit to 1 so the candidate is odd. Why? Because an even number greater than 2 cannot be prime, so only odd candidates are worth testing.

  • Next, test whether the picked number is prime, using a probabilistic test such as Fermat's test or Miller-Rabin; these are far faster than any known deterministic method of comparable confidence.

  • If the probabilistic test says the randomly picked number is not prime, add 2 to it (adding 2 instead of 1 keeps the number odd) and repeat the test of the previous step.
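The steps above can be sketched as follows; I use Fermat's test as the probabilistic test purely for illustration, and the function names are mine:

```python
import secrets

def fermat_test(n, rounds=20):
    """Fermat test: a prime n satisfies a^(n-1) = 1 (mod n) for every base a."""
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # random base in [2, n-2]
        if pow(a, n - 1, n) != 1:
            return False  # definitely composite
    return True  # probably prime (rare pseudoprimes can slip through)

def incremental_prime(bits=512):
    """Random odd candidate of full length, then step by 2 until a test passes."""
    n = secrets.randbits(bits) | (1 << (bits - 1)) | 1
    while not fermat_test(n):
        n += 2
    return n
```

Note that this incremental search is the PRIMEINC method discussed in another answer, with its attendant non-uniformity.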

This method has proved fast enough to generate primes for the RSA algorithm: by the density estimate above, on average only a few hundred candidates need to be run through Miller-Rabin before a prime is found.

To give you a hint of how fast p and q can be generated, if you scroll down this page you will find an application that produces 512-bit p and q with a single click.

Maarten Bodewes
kafir
0

Maurer has an algorithm to generate random, almost uniformly distributed k-bit primes that are provably prime, in contrast to schemes that generate highly probable primes (commonly employing the Miller-Rabin test). See A. J. Menezes et al., Handbook of Applied Cryptography, p. 153 (freely available online). I have Python code for that algorithm at http://s13.zetaboards.com/Crypto/topic/7234475/1/

Mok-Kong Shen