Why XOFs are more convenient than Hash Functions in modeling Random Oracles

Question

In this answer, it is mentioned that

Easier instantiation of random oracles. Some security proofs rely on the so-called random oracle model to prove the security of a given scheme. Normally you'd use some artificial construction around a fixed-size hash function to get the desired output size, but with XOFs you can just plug them right in without having to fear any mistakes on your side potentially breaking the proofs (most people don't know/understand in many cases, me included).

and NIST.FIPS.202 says

Extendable-output functions are different from hash functions, but it is possible to use them in similar ways, with the flexibility to be adapted directly to the requirements of individual applications, subject to additional security considerations.

Is it just the fixed-size problem of hash functions, as in the question "Can MGF1 within OAEP and PSS be replaced by a XOF?"

Actually, we can build arbitrary-length output from a hash function by setting $h_0 = H(m)$ and $h_i = H(h_0 \| i)$ for $i > 0$, and taking the output to be $h_0 \| h_1 \| \ldots$ (I'm not claiming this is collision-free).

However, page 24 of the NIST document, in the section titled Additional Consideration for Extendable-Output Functions, says:

XOFs have the potential for generating related outputs—a property that designers of security applications/protocols/systems may not expect of hash functions.
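
The "related outputs" mentioned in that passage can be seen directly. The following is a minimal sketch using Python's standard `hashlib`; the message and the output lengths are arbitrary choices made here for illustration. It shows that a shorter SHAKE128 output is a prefix of a longer one for the same input, whereas SHA3-256 is not a truncation of SHA3-512.

```python
import hashlib

msg = b"example message"  # arbitrary illustrative input

# Two SHAKE128 outputs of different lengths for the same input.
xof_short = hashlib.shake_128(msg).digest(16)
xof_long = hashlib.shake_128(msg).digest(32)

# The shorter XOF output is exactly a prefix of the longer one.
xof_outputs_related = xof_long[:16] == xof_short

# Two fixed-size hash functions of different lengths for the same input.
fixed_short = hashlib.sha3_256(msg).digest()
fixed_long = hashlib.sha3_512(msg).digest()

# SHA3-256 is not simply the first 32 bytes of SHA3-512.
fixed_outputs_related = fixed_long[:32] == fixed_short
```

A protocol that asks for both a 16-byte and a 32-byte value from the same XOF input therefore gets overlapping values, which is the kind of relation the NIST text warns about.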

So: why are XOFs better than hash functions for modeling random oracles?

score 3 · Accepted Answer · answered Sep 11 '19 at 15:07

Suppose you are instantiating a cryptosystem, such as RSA-FDH signatures with a 2048-bit modulus, or EdDSA over a $b$-bit curve for $b \gg 256$ like edwards448, which needs a sizable hash function modeled by a random oracle. But the only hash functions at your disposal are the SHA-2 family.

- EdDSA needs a $2b$-bit hash function. For edwards25519, it was enough to use SHA-512, but not for edwards448.
- RSA-FDH with a 2048-bit modulus needs a ${\approx}2048$-bit hash function. None of the SHA-2 hash functions are big enough for this.

What do you do? In order to instantiate one of these cryptosystems, you have to pick some way to compose the SHA-2 functions into a larger-output hash function. It's not a priori obvious what instantiation to choose.

1. One might naively choose $H(x) \mathbin\| H(H(x)) \mathbin\| \dotsb \mathbin\| H^n(x)$, but this is manifestly not indifferentiable from a random oracle. (You and I might not choose this, but I have seen people reach for it in practice.)
2. One might choose $H(x \mathbin\| 1) \mathbin\| H(x \mathbin\| 2) \mathbin\| \dotsb \mathbin\| H(x \mathbin\| n)$; of course, one must also choose how wide the numbers are, how they're padded, etc.
3. One might choose $H(H(x) \mathbin\| 1) \mathbin\| H(H(x) \mathbin\| 2) \mathbin\| \dotsb \mathbin\| H(H(x) \mathbin\| n)$ like you suggested, but it's not clear what purpose the extra $H(x)$ serves.

Personally I would choose option (2) with little-endian 64-bit integers, but justifying that choice requires further thought and explanation, and not everyone will make the same choice. With a standard XOF there is no need to think about these questions: you just ask SHAKE128 for as long an output as you want.
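
For comparison, here is a minimal sketch of the XOF route using Python's standard `hashlib`. The message is an arbitrary placeholder, and the 2048-bit length is taken from the RSA-FDH example; a real full-domain hash would still have to map the output into the range of the modulus, which is not shown.

```python
import hashlib

msg = b"message to be signed"  # arbitrary illustrative input

# Ask SHAKE128 directly for 2048 bits (256 bytes) of output.
# No counters, widths, or padding conventions have to be chosen.
wide_digest = hashlib.shake_128(msg).digest(2048 // 8)

print(len(wide_digest) * 8, "bits of output")
```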

This is part of why Ed448 was designed not to use SHA3-512 (in some counter mode or whatever) by analogy with Ed25519's use of SHA-512, but instead uses SHAKE256 with an appropriately long output. The complexity of the OAEP standard arises in part because no XOFs were available at the time, so one had to be invented (the ‘MGF’); the standard would have been much simpler and easier to understand had XOFs been available.
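
To make the contrast concrete, the following sketch places MGF1, as I understand it from PKCS #1 (hash the seed followed by a 4-byte big-endian counter starting at 0, concatenate, truncate), next to a single SHAKE256 call with the 114-byte ($2 \times 456$-bit) output length that Ed448 uses. The function name `mgf1`, the choice of SHA-256 as the underlying hash, and the inputs are assumptions made here for illustration; Ed448's domain-separation prefix is omitted.

```python
import hashlib


def mgf1(seed: bytes, mask_len: int, hash_name: str = "sha256") -> bytes:
    """Sketch of MGF1: H(seed || counter) blocks with a 4-byte big-endian counter."""
    out = b""
    counter = 0
    while len(out) < mask_len:
        block_input = seed + counter.to_bytes(4, "big")
        out += hashlib.new(hash_name, block_input).digest()
        counter += 1
    # Keep only the leading mask_len bytes of the concatenated blocks.
    return out[:mask_len]


seed = b"illustrative seed"  # arbitrary illustrative input

# Counter-mode construction built around a fixed-size hash.
mask = mgf1(seed, 114)

# The XOF equivalent: one call, with the output length as a parameter.
xof_out = hashlib.shake_256(seed).digest(114)
```

Both produce 114 bytes, but the first requires a specification of the counter width, byte order, starting value, and truncation rule, while the second needs only the output length.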
