I have a program (a game) that uses Java's nextInt function with parameter 0x10000 to get a random 16-bit number.
Java's nextInt() is backed by an LCG with a 48-bit seed, using the following formula:

seed[n+1] = (seed[n] * 0x5DEECE66DL + 0xB) mod (1 << 48)

nextInt() is implemented in such a way that the random number I receive is the current seed's most significant 16 bits when given argument 0x10000. [Java only does this for an argument that is a power of two]
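For reference, here is a minimal Python sketch of that behaviour, assuming the standard `java.util.Random` constants. The helper names (`step`, `scramble`, `next_int_pow2_16`) are made up for illustration and are not part of any Java API; Python is used only because its integers do not overflow.

```python
MULT = 0x5DEECE66D          # LCG multiplier
ADD = 0xB                   # LCG increment
MASK = (1 << 48) - 1        # keeps the state to 48 bits


def step(seed):
    """Advance the 48-bit LCG state by one step."""
    return (seed * MULT + ADD) & MASK


def scramble(user_seed):
    """Initial scrambling that `new Random(user_seed)` applies to its argument."""
    return (user_seed ^ MULT) & MASK


def next_int_pow2_16(seed):
    """Mimic nextInt(0x10000): step the state, return (new_state, top 16 bits)."""
    seed = step(seed)
    return seed, seed >> 32


state = scramble(0)
state, r = next_int_pow2_16(state)
print(hex(r))  # 0xbb20
```

The printed value is the top 16 bits of the state after one step from `scramble(0)`, which is what the description above implies `new Random(0).nextInt(0x10000)` would return.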

So let me describe the situation: in the game, I can observe two consecutive outputs of nextInt(0x10000), R[i] and R[i+1] (which are the top 16 bits of seed[i] and seed[i+1], respectively).

My question is: what information can I infer about the next output of nextInt(0x10000), i.e. R[i+2]?

EDIT: I'm in a situation in which a brute-force search of more than about 30,000 steps is not practical, and all I have are these two consecutive outputs. So, is it even feasible to extract some information given these restrictions? (And please be gentle, I don't have any knowledge of math beyond high school.)

Mike Edward Moras
Zerith

1 Answer

Assuming this generator is well-seeded, you probably can't learn much about the next output. You observe two outputs, so 32 bits in total. The state of the generator is 48 bits, which leaves 48 - 32 = 16 bits undetermined. Thus, there will probably be about $2^{16}$ states of the generator that are compatible with your two observations. This means that most of the 16-bit values for the next output are (heuristically) likely to be possible.

Thus, your best bet might be to look at how the generator is seeded. I have a vague recollection that Java's PRNG is poorly seeded, using a seed that has less than 48 bits of entropy. If this is the case, then it might be possible to predict the next output, by enumerating candidates for the initial seed and seeing which ones are consistent with the outputs you've observed.


If you need to enumerate which values for the next output are possible, here is a way to do it.

The naive way is to enumerate all $2^{32}$ possible values of seed[i] (you know its top 16 bits, thanks to your knowledge of R[i], so you just need to enumerate all possibilities for its low 32 bits), then test which ones are compatible with the observed value of R[i+1], and for each one that survives, compute the resulting value of R[i+2]. This takes about $2^{32}$ steps of computation, so you can probably make it run in a few seconds to tens of seconds.

With a little bit of cleverness, you can make this run a lot faster. Since we know the value of R[i+1], we know the range of possible values for seed[i+1]: in particular, we know that 2^32 * R[i+1] <= seed[i+1] <= 2^32 * R[i+1] + 2^32 - 1. Also, reducing mod 2^48 just subtracts some multiple of 2^48, so seed[i+1] = 0x5DEECE66D * seed[i] + 11 - c * 2^48 for some integer c. Because 2^32 * R[i] <= seed[i] < 2^32 * (R[i]+1) and 0x5DEECE66D / 2^16 lies between 0x5DEEC and 0x5DEED, this c is in the range 0x5DEEC * R[i] <= c <= 0x5DEED * (R[i]+1). Moreover, we know the top 16 bits of seed[i], so we can write seed[i] = 2^32 * R[i] + x, for some integer x satisfying 0 <= x <= 2^32 - 1.

Combining all of this, it follows that

2^32 * R[i+1] <= 0x5DEECE66D * 2^32 * R[i] + 0x5DEECE66D * x + 11 - c * 2^48 <= 2^32 * R[i+1] + 2^32 - 1

or equivalently,

(2^32 * R[i+1] - 0x5DEECE66D00000000 * R[i] - 11 + c * 2^48)/0x5DEECE66D <= x <= (2^32 * R[i+1] - 0x5DEECE66D00000000 * R[i] + 2^32 - 12 + c * 2^48)/0x5DEECE66D

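This range has width (2^32 - 1)/0x5DEECE66D, which is about 0.17 and so less than 1. It therefore contains at most one integer: if you know the left-hand side and the right-hand side, you can immediately infer x, or conclude that no integer x fits and that this value of c is impossible.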

Thus, if we know c, then the value of x is determined, which reveals seed[i] and thus the next output of the generator. Finally, there are only about 0x5DEED ($\approx 380,000$) possible values of c.

This gives an algorithm that enumerates all possible values for the next output, using about $380,000$ steps of computation. We enumerate over all $380,000$ possible values of c; for each possible value of c, we round (2^32 * R[i+1] - 0x5DEECE66D00000000 * R[i] + 2^32 - 12 + c * 2^48)/0x5DEECE66D down to the nearest integer, call it x, and then we infer the candidate value of seed[i] (namely, seed[i] = 2^32 * R[i] + x). We run the generator for one step and discard the candidate unless 0 <= x <= 2^32 - 1 and the top 16 bits of the resulting seed[i+1] equal R[i+1]; this filters out the values of c for which no valid x exists. For each surviving candidate, we run the generator one more step to observe the implied next output. Collecting these gives us the set of all possible values for the next output. Each iteration uses just a few simple arithmetic operations, and we do $380,000$ iterations, so this procedure should run extremely fast: in milliseconds or less.
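The enumeration over c can be sketched in Python as follows. This is an illustration under stated assumptions rather than a tuned implementation: the function and variable names are hypothetical, Python is used because its integers do not overflow, and the range of c is computed exactly from x = 0 and x = 2^32 - 1 instead of using the slightly looser bounds given above.

```python
MULT = 0x5DEECE66D
ADD = 0xB
MASK = (1 << 48) - 1


def step(seed):
    """Advance the 48-bit LCG state by one step."""
    return (seed * MULT + ADD) & MASK


def candidate_next_outputs(r0, r1):
    """Return every R[i+2] consistent with R[i] = r0 and R[i+1] = r1."""
    base = MULT * (r0 << 32) + ADD            # 0x5DEECE66D * 2^32 * R[i] + 11
    hi = (r1 << 32) + (1 << 32) - 1           # largest allowed seed[i+1]
    c_min = base >> 48                        # value of c when x = 0
    c_max = (base + MULT * ((1 << 32) - 1)) >> 48   # value of c when x = 2^32 - 1
    results = set()
    for c in range(c_min, c_max + 1):
        # Round the upper bound on x down to the nearest integer.
        x = (hi + (c << 48) - base) // MULT
        if not 0 <= x < (1 << 32):
            continue                          # x must fit in the low 32 bits
        s0 = (r0 << 32) + x                   # candidate seed[i]
        s1 = step(s0)
        if s1 >> 32 != r1:
            continue                          # this c admits no valid x
        results.add(step(s1) >> 32)           # implied R[i+2]
    return results


# Usage: simulate a generator with a known state and check the true
# next output is among the candidates.
s0 = 0x123456789ABC
s1 = step(s0)
s2 = step(s1)
cands = candidate_next_outputs(s0 >> 32, s1 >> 32)
print(len(cands), (s2 >> 32) in cands)
```

The explicit check against R[i+1] matters because most values of c admit no integer x in the allowed range; only candidates that reproduce the observed second output are kept.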

D.W.