
Are there any cryptographic hash functions for which there is a known pre-image attack, or a known second pre-image attack, but not both?

The attack doesn't have to be practical - just anything that beats the security claim of the hash function.

Intuitively, second pre-image attacks would be easier to find than pre-image attacks, but I'm not aware of any examples, nor of any proof that either property implies the other.

I believe there are no known pre-image attacks on the common crop of hash functions in use today, so I'm interested in historic, obsolete and proposed-but-not-adopted hash functions, but not toy hash functions constructed specifically to demonstrate the possibility of such attacks.

Michael

3 Answers


A cryptographic hash function $f : \{0,1\}^{*} \to \{0,1\}^n$ is expected to have three properties: (1) preimage resistance, (2) second-preimage resistance, and (3) collision resistance. Moreover, these properties form a hierarchy in which each property implies the one before it: a collision-resistant function is also second-preimage resistant, and a second-preimage resistant function is also preimage resistant (subject to a condition on $f$).

In the case of (3) ⇒ (2), it's not too hard to see why: if an adversary cannot find any colliding message pairs, then they certainly cannot find a colliding message when one of the messages is fixed.

However, (2) ⇒ (1) is substantially trickier. For some intuition, consider a second-preimage resistant hash function $f$ that is not preimage resistant (modeled by giving the adversary access to a preimage-finding oracle). Suppose you are given a message $m_1$; then you can compute $f(m_1)$ and consult the oracle for a preimage of $f(m_1)$. The oracle returns an $m_2$ such that $f(m_1) = f(m_2)$.

This is very nearly a second preimage. The only question is whether $m_1 \ne m_2$. Intuitively, given that $f$ maps infinitely many inputs to a finite number of outputs, there "should be" a high probability that $m_1 \ne m_2$. For all real-life hash functions, this is pretty much the case, so a second-preimage resistant hash function should not lack preimage resistance.
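The reduction above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `toy_hash` is a hypothetical 16-bit hash (a truncated SHA-256) chosen so that brute force can stand in for the preimage oracle, and `preimage_oracle` is that brute-force stand-in, not a real attack.

```python
import hashlib

def toy_hash(msg: bytes) -> int:
    """Hypothetical toy hash: the first 16 bits of SHA-256 (heavily compressing)."""
    return int.from_bytes(hashlib.sha256(msg).digest()[:2], "big")

def preimage_oracle(digest: int) -> bytes:
    """Stand-in for a preimage attack: brute force over 4-byte counters."""
    for i in range(2**32):
        candidate = i.to_bytes(4, "big")
        if toy_hash(candidate) == digest:
            return candidate
    raise ValueError("no preimage found in the search space")

def second_preimage(m1: bytes):
    """Ask the oracle for a preimage of toy_hash(m1).

    Returns m2 != m1 with the same digest, or None if the oracle
    happened to hand back m1 itself (the one case where the reduction fails).
    """
    m2 = preimage_oracle(toy_hash(m1))
    return m2 if m2 != m1 else None

m1 = b"attack at dawn"
m2 = second_preimage(m1)
print(m1, m2, toy_hash(m1) == toy_hash(m2))
```

Because the toy hash compresses so heavily, the oracle's answer is almost never the original message, which is exactly the "high probability that $m_1 \ne m_2$" intuition.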

However, it is possible to define "pathological" hash functions that have perfect, provable second-preimage resistance but not preimage resistance. The example given in chapter 9 of the Handbook of Applied Cryptography is this:

$$f(x) = \begin{cases} 0 || x & \text{if } x \text{ is } n \text{ bits long}\\ 1 || g(x) & \text{otherwise} \end{cases}$$

where $g$ is a collision-resistant hash function with $n$-bit output. In this case, for digests beginning with $0$, it's trivial to find a preimage (just drop the leading bit), but such digests are provably second-preimage resistant, as there are no possible second preimages. In other words, this $f$ is injective on the set of $n$-bit inputs, and no other input can produce a digest beginning with $0$.
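A minimal sketch of this construction, assuming bit strings are represented as Python strings of `'0'`/`'1'` characters and that SHA-256 stands in for the collision-resistant $g$ (so $n = 256$). The names `f`, `g`, and `invert` are illustrative only.

```python
import hashlib

N = 256  # digest length of g in bits (assumption: SHA-256 plays the role of g)

def g(bits: str) -> str:
    """Stand-in for the collision-resistant g; returns N bits as a '0'/'1' string."""
    digest = hashlib.sha256(bits.encode("ascii")).digest()
    return format(int.from_bytes(digest, "big"), f"0{N}b")

def f(bits: str) -> str:
    """The pathological hash: identity (with a 0 prefix) on N-bit inputs, g otherwise."""
    return "0" + bits if len(bits) == N else "1" + g(bits)

def invert(digest: str):
    """Trivial preimage finder; it only works on digests that begin with '0'."""
    return digest[1:] if digest[0] == "0" else None

x = "01" * (N // 2)           # an N-bit input
print(invert(f(x)) == x)      # preimage recovered exactly
print(invert(f("101")))       # digests starting with '1' are not inverted
```

The only input that can map to a digest beginning with `0` is the $n$-bit string that follows that bit, so the preimage found by `invert` is always the original message and never a second preimage.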

To be more precise about when (2) ⇒ (1), Rogaway and Shrimpton have presented a theoretical analysis of the various relations between the three properties listed above in their paper Cryptographic Hash-Function Basics. Essentially, their analysis treats a hash function as having a finite, fixed-length domain, i.e. $f : \{0,1\}^m \to \{0,1\}^n$, wherein they show

  1. "conventional implications", like the implication (3) ⇒ (2); these are essentially "true" implications in the sense that they are unconditional, and

  2. "provisional implications", like the implication that (2) ⇒ (1); these are conditional in nature, relying on how much $f$ compresses the message space (the larger the message space is relative to the digest space, the "stronger" the implication in a probabilistic sense).

So, provisional implications are essentially true if a hash function compresses the message space to a sufficient degree. (The "sufficient" example they provide is a hash compressing 256-bit messages to 128 bits.) Hence, second-preimage resistance implies preimage resistance only if the function in question compresses its input sufficiently. For length-preserving, length-extending, or low-compression functions, second-preimage resistance does not necessarily imply preimage resistance (as stated by the authors on page 8 about halfway down the page).

This should be intuitive given the above algorithm for finding second preimages given a preimage oracle. If you are expanding 6-bit inputs to 256 bits, it's actually quite unlikely that a preimage oracle would return anything other than the original message, since almost every digest has exactly one preimage. This isn't a formal argument, by any means, but it's a nice heuristic one.
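As a rough sanity check on this heuristic, assume (purely for illustration; this model is not taken from the paper) that $f$ behaves like a uniformly random function from $m$-bit inputs to $n$-bit outputs. Then the probability that a fixed input $m_1$ has any second preimage at all is

$$\Pr[\exists\, m_2 \ne m_1 : f(m_2) = f(m_1)] = 1 - \left(1 - 2^{-n}\right)^{2^m - 1} \le \frac{2^m - 1}{2^n}.$$

For $m = 6$ and $n = 256$ this is below $2^{-250}$, so a preimage oracle would almost always just hand back $m_1$. For $m = 256$ and $n = 128$ the same expression is approximately $1 - e^{-2^{128}}$, which is indistinguishable from 1.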


Now, back to real life. Given the above algorithm for using a preimage oracle to find second preimages, I would not expect any real-life hash functions to have preimage attacks and not second-preimage attacks, especially since real hash functions typically compress data well.

On the other hand, I'm not personally aware of any historically-used, non-toy cryptographic hash function which has a second-preimage attack but not a preimage attack. Typically, collision resistance is the first thing attacked by cryptanalysts since it is (in a sense) the "hardest" property to satisfy. But if a hash function is found to be broken with regard to collisions, cryptanalysts typically go straight for the heart: preimage attacks. So, I don't know how much luck you'll have trying to find such a hash function.

You can look at the hash function lounge for some historic hash functions; it hasn't been updated since 2008, apparently, but still contains some useful info. I glanced through a few attacks and found mostly collision and preimage attacks, but you may have more luck.

Reid

In their paper Second Preimages on $n$-Bit Hash Functions for Much Less than $2^n$ Work, Kelsey and Schneier provide:

"a second preimage attack on all $n$-bit iterated hash functions with Damgård-Merkle strengthening and $n$-bit intermediate states, allowing a second preimage to be found for a $2^k$-message-block message with about $k\times2^{n/2+1}+ 2^{n-k+1}$ work"

as opposed to the expected $2^n$ work. This is firmly in the realm of theoretical attacks, but it meets the criteria: it is a second-preimage attack, not a preimage attack, and it applies to all the commonly used hash functions prior to SHA3.

The example they provide: a second pre-image can be found for a $2^{60}$ byte message processed with SHA1 in $2^{106}$ work rather than $2^{160}$. This is close to the largest message SHA1 can process (the limit is just under $2^{64}$ bits, or $2^{61}$ bytes). If we pick a more plausible but still large message size, say a SHA1 hash over a 4TB disk, or approximately $2^{36}$ 64-byte message blocks, the work to find a second pre-image is approximately $36 \times 2^{160/2 + 1} + 2^{160-36+1} \approx 2^{125}$.
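To get a feel for how the quoted work estimate scales with message length, here is a small sketch that evaluates $k \times 2^{n/2+1} + 2^{n-k+1}$ for a few values of $k$. The function name `log2_work` is made up for this example, and the formula is taken directly from the quoted abstract rather than from a detailed reading of the attack.

```python
import math

def log2_work(n: int, k: int) -> float:
    """log2 of k * 2^(n/2 + 1) + 2^(n - k + 1), for an n-bit hash (n even)
    and a target message of 2^k blocks."""
    if n % 2 != 0:
        raise ValueError("this sketch assumes an even digest length")
    work = k * 2 ** (n // 2 + 1) + 2 ** (n - k + 1)  # exact integer arithmetic
    return math.log2(work)

# SHA1 has n = 160; try a few message lengths, measured in 64-byte blocks.
for k in (30, 36, 38, 40):
    print(f"2^{k} blocks -> about 2^{log2_work(160, k):.1f} work")
```

For these message lengths the second term dominates, so the work is essentially $2^{n-k+1}$: every doubling of the target message length halves the cost of the attack.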

Thanks to Reid for the pointer to the Hash Function Lounge where I found this paper.

Michael

Great answers here already! That said, the specific example given in Reid's answer kind of confuses my brain. So, here's a really simple example, to help out anyone else in the same boat.

Take any cryptographic hash function $H_0$, and define $H_1(x) = H_0(r(x))$, where $r$ is defined to be the function that strips at most one trailing zero bit from the message it's given. Clearly $H_1$ is not second-preimage resistant: if we fix a message $x$ that does not already end in a zero, we can easily find another message (namely $x \| 0$) that collides with it, since $r(x \| 0) = x = r(x)$. Nevertheless, solving $y = H_1(x)$ is not really any easier than solving $y = H_0(x)$. To see this, assume toward a contradiction that we manage to find a message $x$ such that $y = H_1(x)$. By definition, this message also satisfies $y = H_0(r(x))$. Since we've found $x$, we can also compute $r(x)$. But this contradicts the assumption that finding a value $x'$ with $y = H_0(x')$ should be infeasible - QED.
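Here is a minimal sketch of this construction, assuming messages are modeled as strings of `'0'`/`'1'` characters and that SHA-256 plays the role of $H_0$; both choices are illustrative, not part of the argument.

```python
import hashlib

def r(bits: str) -> str:
    """Strip at most one trailing zero bit."""
    return bits[:-1] if bits.endswith("0") else bits

def h0(bits: str) -> str:
    """Stand-in for H_0 (assumption: SHA-256 over the ASCII-encoded bit string)."""
    return hashlib.sha256(bits.encode("ascii")).hexdigest()

def h1(bits: str) -> str:
    """H_1(x) = H_0(r(x))."""
    return h0(r(bits))

x = "1011"                      # does not end in a zero bit
print(h1(x) == h1(x + "0"))     # x || 0 is a second preimage of H_1(x)
print(h1(x) == h0(x))           # inverting H_1 here still means inverting H_0
```

Note that the second preimage $x \| 0$ only exists when $x$ does not end in a zero bit, so the trivial attack works on about half of all messages, which is more than enough to break second-preimage resistance.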

More generally, if we truncate the original message, or otherwise ignore easily describable bits of information contained in that message, this will always break second-preimage resistance, but it will often leave first-preimage resistance intact.