27

Is there a hash function which has no collisions?

To clarify: it would be some function which would produce variable-length output, and never produce the same output for differing input. It would also be computationally hard to derive the input from the output.

Mike Edward Moras
benj

2 Answers

20

Your hypothetical hash function would need an output at least as long as its input to satisfy your conditions, so it would not be a hash function in the usual sense. See the pigeonhole principle.

Remember that an $n$-bit hash function is a function from $\{0,1\}^*$ to $\{0,1\}^n$; no such function can meet both of your conditions. Essentially, if the output is $n$ bits long, it can only guarantee uniqueness for at most $2^n$ distinct inputs (for example, all inputs of exactly $n$ bits), and even then it would not be a good PRF, since it would be a permutation, which is not what hash functions are. So you would want the output to be longer than the input, which is now really far from the definition of a hash function.
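
To make the pigeonhole argument concrete, here is a minimal sketch. It assumes a hypothetical 8-bit "hash" built by truncating SHA-256 to one byte (`tiny_hash` and `find_collision` are illustrative names, not part of any library). Feeding it $2^8 + 1$ distinct inputs must produce a collision, whatever the underlying function is.

```python
import hashlib


def tiny_hash(data: bytes, n_bytes: int = 1) -> bytes:
    """Hypothetical fixed-length hash: SHA-256 truncated to n_bytes bytes."""
    return hashlib.sha256(data).digest()[:n_bytes]


def find_collision(n_bytes: int = 1):
    """Hash 2^(8*n_bytes) + 1 distinct inputs; two of them must collide."""
    seen = {}  # maps digest -> first message that produced it
    for i in range(2 ** (8 * n_bytes) + 1):
        msg = i.to_bytes(4, "big")  # distinct 4-byte inputs
        digest = tiny_hash(msg, n_bytes)
        if digest in seen:
            return seen[digest], msg
        seen[digest] = msg
    raise AssertionError("unreachable: the pigeonhole principle guarantees a collision")


a, b = find_collision()
print(a.hex(), b.hex(), tiny_hash(a).hex())
```

The same argument applies to a full 256-bit digest; the collision is just far too expensive to find by brute force.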


Now, if you are willing to call it something other than a hash function, then yes, it is possible to construct such primitives, provided the output length is chosen so that if there are $m$ possible inputs and the output is $n$ bits long, then $m \leq 2^n$. The obvious one is a block cipher, which satisfies your conditions except that it has the additional property that all outputs have a corresponding input, which may not be what you want.
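
As an illustration of the "block cipher is a permutation" point, here is a toy sketch: a 4-round Feistel network on 16-bit blocks, using SHA-256 as a stand-in round function. The names `toy_feistel` and `_round_function`, the key, and the block size are all assumptions made for this example; it is far too small to be secure and only shows that every input maps to a distinct output and every output is hit.

```python
import hashlib


def _round_function(half: int, key: bytes, round_index: int) -> int:
    """Hypothetical round function: one byte of SHA-256(key || round || half)."""
    digest = hashlib.sha256(key + bytes([round_index]) + bytes([half])).digest()
    return digest[0]


def toy_feistel(block: int, key: bytes, rounds: int = 4) -> int:
    """Encrypt a 16-bit block with a Feistel network (toy parameters only)."""
    left, right = block >> 8, block & 0xFF
    for i in range(rounds):
        # A Feistel round is invertible no matter what the round function is.
        left, right = right, left ^ _round_function(right, key, i)
    return (left << 8) | right


key = b"demo key"
outputs = {toy_feistel(x, key) for x in range(2 ** 16)}
# All 2^16 inputs give distinct outputs, so the map is a bijection on 16-bit blocks.
print(len(outputs) == 2 ** 16)
```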

As you can see, if you don't want a permutation, you are basically left with a function which "expands" the input pseudorandomly, such that all inputs have outputs but not all outputs have inputs. For instance, CodesInChaos's example of $y = g^x \bmod p$ is collision-free as long as the inputs are restricted to $0 \leq x < \operatorname{ord}(g)$, where $\operatorname{ord}(g)$ is the order of $g$ (so at most $p - 1$ distinct inputs), and is one-way for a sufficiently large prime $p$ (actually, the group needs a subgroup of large prime order; $p = 2q + 1$ for a large prime $q$ is a common choice), as you would need to solve the discrete logarithm problem to reverse it.
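
A small sketch of the modular exponentiation example, using toy parameters chosen for this illustration only: $p = 23 = 2 \cdot 11 + 1$, $q = 11$, and $g = 4$, which generates the subgroup of order $q$. With numbers this small the function is trivially invertible; the point is only that distinct exponents below the order of $g$ never collide, while exponents at or beyond it do.

```python
# Toy parameters (assumptions for illustration, far too small to be one-way):
# p = 2q + 1 with q prime, and g of order q modulo p.
p, q, g = 23, 11, 4


def f(x: int) -> int:
    """Candidate collision-free map x -> g^x mod p, for 0 <= x < q."""
    return pow(g, x, p)


outputs = [f(x) for x in range(q)]
# Every exponent in [0, q) gives a different group element.
print(len(set(outputs)) == q)
# Once x reaches the order of g, outputs repeat: f(q) equals f(0).
print(f(q) == f(0))
```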

Thomas
5

Yes, they are called perfect hash functions on Wikipedia; I've also seen them called collision-free hash functions. If you follow the link at the bottom of the page, there are links to articles and source code.

vbms