Suppose that I am generating random numbers with Python's random module, so that the random number generator is known (Mersenne Twister in this case). I've read: "[...] observing a sufficient number of iterations (624 in the case of MT19937, since this is the size of the state vector from which future iterations are produced) allows one to predict all future iterations." The post "Cracking Random Number Generators - Part 3" goes into how to actually crack it.
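To make the quoted claim concrete, here is a minimal sketch of the well-known "untempering" approach, assuming the attacker sees 624 consecutive full 32-bit outputs (here obtained with `getrandbits(32)`). The helper names are my own and this is not necessarily the exact method of the linked post.

```python
import random

MASK32 = 0xFFFFFFFF

def undo_right_shift_xor(y, shift):
    """Invert y = x ^ (x >> shift) for a 32-bit x."""
    x = y
    # Each pass fixes `shift` more high bits; this many passes covers all 32.
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x

def undo_left_shift_xor(y, shift, mask):
    """Invert y = x ^ ((x << shift) & mask) for a 32-bit x."""
    x = y
    # Each pass fixes `shift` more low bits.
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x & MASK32

def untemper(y):
    """Undo the MT19937 tempering steps in reverse order."""
    y = undo_right_shift_xor(y, 18)
    y = undo_left_shift_xor(y, 15, 0xEFC60000)
    y = undo_left_shift_xor(y, 7, 0x9D2C5680)
    y = undo_right_shift_xor(y, 11)
    return y

def clone_from_outputs(outputs):
    """Build a random.Random that continues the observed output stream."""
    if len(outputs) != 624:
        raise ValueError("need exactly 624 consecutive 32-bit outputs")
    # Index 624 forces a state regeneration ("twist") on the next draw.
    state = tuple(untemper(y) for y in outputs) + (624,)
    clone = random.Random()
    clone.setstate((3, state, None))
    return clone

# Usage: the seed is arbitrary and unknown to the "attacker".
victim = random.Random(2024)
observed = [victim.getrandbits(32) for _ in range(624)]
clone = clone_from_outputs(observed)
predicted = [clone.getrandbits(32) for _ in range(10)]
actual = [victim.getrandbits(32) for _ in range(10)]
```

If the sketch is right, `predicted` and `actual` agree. It relies on seeing every output bit, which is exactly what the ranking scenario below takes away.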

What if, instead of seeing the random numbers themselves, you saw only repeated rankings of them? As an example, the generator repeatedly draws 100 uniform numbers in [0, 1), and you observe only the vector of their 100 ranks:

first 100: (47, 22, 1, 12, ...)
second 100: (18, 44, 99, 24, ...)
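For concreteness, this is roughly how such rank vectors could be produced. It is only a sketch: the function names are made up, and I am assuming ranks run from 1 (smallest) to 100 (largest).

```python
import random

def ranks_of(values):
    """Return the 1-based rank of each value (1 = smallest)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks

def rank_vector(rng, n=100):
    """Draw n uniform numbers from rng and reveal only their ranks."""
    values = [rng.random() for _ in range(n)]
    return ranks_of(values)

rng = random.Random()  # seed unknown to the observer
first_100 = rank_vector(rng)
second_100 = rank_vector(rng)
```

Note that in CPython each call to `random()` consumes two 32-bit Mersenne Twister outputs, so one rank vector of 100 numbers corresponds to 200 generator outputs.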

How much would this slow down the cracking process? I feel that this has a pretty precise answer, but I have no idea how one would even get started with the analysis/math.

otus
dcc310

1 Answer

Assuming $w = 32$ (32-bit words for the internal state), the internal state holds $624 \times 32 = 19968$ bits of information. Each ranking of 100 outputs, on the other hand, reveals at most $\lfloor\log_2(100!)\rfloor = 524$ bits. So $\lceil 19968/524 \rceil = 39$ rankings of 100 outputs each is the minimum I would expect to need to identify the state. This is only an information-theoretic lower bound, however: I do not know of an algorithm that uses the revealed ranking information to calculate the state.
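The arithmetic can be checked with a few lines of Python. This is just a sketch of the counting argument (the function name and parameters are mine), not an attack:

```python
import math

def min_rankings(n_items=100, state_words=624, word_bits=32):
    """Lower bound on the rankings needed to pin down the state by counting bits."""
    state_bits = state_words * word_bits
    # A ranking of n_items is one of n_items! permutations,
    # so it carries at most log2(n_items!) bits.
    bits_per_ranking = math.floor(math.log2(math.factorial(n_items)))
    return state_bits, bits_per_ranking, math.ceil(state_bits / bits_per_ranking)

state_bits, bits_per_ranking, rankings = min_rankings()
print(state_bits, bits_per_ranking, rankings)
```

The bound treats every ranking as if it delivered its full 524 bits about the state, which is the most optimistic case for the attacker.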

You talk about "leaking" information and ask about a "cracking process". I hope you are not attempting to use a Mersenne Twister when a cryptographically secure pseudo-random number generator is needed.

Adrian Self