2

Let $f$ be a collision resistant function i.e. it is computationally impossible to find $x_0, x_1$ such that $f(x_0) = f(x_1)$. If a computationally bounded adversary demonstrates that he knows some $x_0$ and $f(x_0)$, in what sense is he unable to determine $x_1$?

My instinct is that he can do no better than choosing a random guess from the domain of $f$ but is this how collision resistance is defined? Moreover, if $f$ has a trapdoor, is the definition still the same i.e. collision resistance implies that either the adversary knows the trapdoor or he can do no better than pick a random element from the domain of $f$?

I apologize that this is a bit of a yes/no question although if the answer is no, then there is more to say!

EDIT: Alternatively, what is the mathematical statement that captures the idea that there is no efficient algorithm to find a collision?

user1936752
  • 778
  • 1
  • 4
  • 17

2 Answers2

2

Let the range $Y$ of the function $f$ have size $N$ A collision resistant function cannot be better than an ideal function which maps uniformly into $Y$ since collision probability is minimized by the uniform distribution. Thus, given $(x_0,f(x_0))$ the number $k$ of random points $x_i\neq x_0$ that must be tried before the probability that $f(x_i)=f(x_0)$ is significant, say at least $1/2$ is $k\approx \sqrt{\ln 2 N}$ by the birthday paradox.

If necessary see @fgrieu's detailed post on the mathematics of the birthday paradox here.

kodlu
  • 25,146
  • 2
  • 30
  • 63
1

Defining a hash function formally is not an easy task. There are even entire papers covering that formalization, especially if there is a requirement that the function should be unkeyed (see for the keyed variant below).

The most popular variant to formalize hash functions are the use of choosing a hash function chosen from a family of hash functions HashFamily(KeyGen, Hash):

  • Given security parameter $1^n$, algorithm KeyGen outputs a key k.
  • Given key k, a family of hashes $H := \{h_0, ..., h_{2^n - 1}\}$ and a message m, algorithm Hash outputs $h_k(m)$ with $h_k \in H$.

Now consider the following game Col:

  • The adversary A is given H and a random key k.
  • The adversary A outputs $m_1$ and $m_2$.
  • The adversary A wins if $h_k(m_1) = h_k(m_2)$ and $m_1 \neq m_2$.

Collision resistant means that $\forall PPT A: Pr[Col(A) = 1] \leq negl(n)$.

The above game implies that the adversary A can compute $Hash(k, H, \cdot)$ by himself. However, since the adversary is computationally bounded, he can only query a limited number of messages (if the adversary is unbounded, then the adversary always wins the game unless $h_\cdot$ is injective of course).

PPT A means probabilistic polynomial time algorithm. That means that given some randomness r, algorithm A computes x using any strategy in $O(n^z)$ (with z being constant and independent of the input size).

I would also recommend you reading Introduction to Modern Cryptography by Katz & Lindell. There is the topic explained in more detail.

Martin Kromm
  • 407
  • 2
  • 8