I have heard many people claim that the output of a SHA-512 hash is random.

Does anyone know what method was used to reach this conclusion? If the output is not random, how could that be shown?

Maarten Bodewes
Jon Hutton

2 Answers

Whilst the random oracle model(1) is the theoretical construct that says hash output should 'look' random, the scientific method also requires that we test that hypothesis empirically.

So let us assume a simple set of inputs to the function. If the null hypothesis is that the output will appear random(2), the output will have certain testable properties:-

  • For a simple counter input, we expect every output bit to be uniformly distributed, with $P(X = 0) = P(X = 1) = \frac{1}{2}$, and, when conditioning on other bits, $P(X | Y) = \frac{1}{2}$. These formulae extend as $P(\dots)= \frac{1}{2^n}$ and $P(\dots|\dots)= \frac{1}{2^n}$ for $n$ bit tuples.
  • For random inputs, we can count the numbers of set input and output bits and determine whether an avalanche effect is occurring. We expect about 50% of the output bits to change for any change to a random input. The probability mass function of the number of changed bits is expected to be Binomial, tending to a Normal distribution as the output width $(n)$ increases. We'd expect $\mu = 0.5n, \sigma = 0.5 \sqrt{n}$.
  • Standardised randomness testing (Diehard, NIST etc.) can also be used, and we expect SHA-512 to pass all of these tests whilst hashing simple inputs. There is also no known specialised distinguisher that can separate the hash output from a pseudorandom sequence (3).
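The first two checks above can be sketched in a few lines of Python. This is only an illustrative experiment, not a rigorous test suite: the function names, the 8-byte counter encoding, the 32-byte message length, the sample counts, and the fixed seed are all assumptions chosen for the example. For SHA-512, $n = 512$, so the avalanche counts should have a mean near 256 and a standard deviation near $0.5\sqrt{512} \approx 11.3$.

```python
import hashlib
import math
import random

N_BITS = 512  # SHA-512 output width in bits


def sha512_int(data: bytes) -> int:
    """Return the SHA-512 digest of data as a 512-bit integer."""
    return int.from_bytes(hashlib.sha512(data).digest(), "big")


def bit_frequencies(samples: int) -> list:
    """Hash the counter values 0..samples-1; return the fraction of ones per output bit."""
    ones = [0] * N_BITS
    for counter in range(samples):
        digest = sha512_int(counter.to_bytes(8, "big"))
        for i in range(N_BITS):
            ones[i] += (digest >> i) & 1
    return [count / samples for count in ones]


def avalanche_distances(trials: int, seed: int = 1) -> list:
    """Flip one random bit of a random 32-byte input; return changed output bits per trial."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs are repeatable
    distances = []
    for _ in range(trials):
        msg = bytearray(rng.getrandbits(256).to_bytes(32, "big"))
        before = sha512_int(bytes(msg))
        pos = rng.randrange(len(msg) * 8)
        msg[pos // 8] ^= 1 << (pos % 8)  # flip exactly one input bit
        after = sha512_int(bytes(msg))
        distances.append(bin(before ^ after).count("1"))
    return distances


def mean_and_std(values: list) -> tuple:
    """Return the sample mean and (population) standard deviation."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(var)


if __name__ == "__main__":
    freqs = bit_frequencies(2000)
    print("per-bit frequency range:", min(freqs), max(freqs))
    mean, std = mean_and_std(avalanche_distances(2000))
    print("avalanche mean:", mean, "std:", std)
    print("expected mean:", 0.5 * N_BITS, "expected std:", 0.5 * math.sqrt(N_BITS))
```

A result close to the expected values is consistent with the hypothesis but does not prove it; a large deviation would count as evidence against it.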

This 'random' hypothesis has so far held in the laboratory. So in your parlance, the output is random, and no one has yet proved otherwise. Had the output been biased, some of these experiments might have exposed it, depending on the degree of bias.


Notes.

(1) A very imperfect model, as explained here.

(2) By random output, I take it you mean independent and identically distributed bits, not easily related to the input, or each other.

(3) Just be aware that these lab tests cannot establish any measure of security.

Paul Uszak

SHA-512's output is neither random nor pseudorandom. A reasonable assumption is that one can define a function $F_k(x)$ based on SHA-512 that is a pseudorandom function. However, this holds only when a random (secret) key $k$ is used. Having said all of the above, the random oracle model is a heuristic that can be applied when using SHA-512. I suggest looking up the random oracle model to see what people mean by this.
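As a concrete illustration of a keyed function $F_k(x)$ built on SHA-512, here is a minimal sketch using HMAC-SHA-512 from the Python standard library. The choice of HMAC is an assumption made for the example, since no particular construction is named above, and the function name `prf` and the 64-byte key length are likewise hypothetical.

```python
import hashlib
import hmac
import secrets


def prf(key: bytes, x: bytes) -> bytes:
    """F_k(x) instantiated as HMAC-SHA-512 (an assumed construction, for illustration)."""
    return hmac.new(key, x, hashlib.sha512).digest()


if __name__ == "__main__":
    k = secrets.token_bytes(64)  # random secret key k
    print(prf(k, b"some input").hex())
```

Without knowledge of $k$, the outputs are assumed to be indistinguishable from random. With no key at all, anyone can recompute SHA-512 of a known input, which is why the bare hash cannot be called pseudorandom in this sense.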

Yehuda Lindell