Is there a hash function $f$ such that, for every input $x$, if $f(x) = y$ then $f(x \, || \, y) = y$? In other words, appending the output to the input does not change the result.

Furthermore, is there a simple construction for such functions?

Example

Let $x =$ "hello" and $f(x) =$ "123". Then $f($"hello123"$) =$ "123".

For existing hashing algorithms, finding such $y$s is clearly hard, but it is easy to create hashing functions with a known fixed point (see this answer). This problem is a simple generalization of the fixed point problem, which gives me hope that it is not much harder.

If the security of $f$ is incompatible with the desired property, then how about an $f$ whose outputs $y$ are at least uniformly distributed?

Applications

Hashes of files often serve as "links" to them. Some files can link to other files, and this way loops may come up. The requested hashing functions should be able to handle the loops, whereas common hashing algorithms cannot.

prog

1 Answer

Yes. You can construct such a function. Let $g:\{0,1\}^* \to \{0,1\}^k$ be any hash function. We will define $f$ as follows. On input $x = x_1 \cdots x_n$, compute:

  • For $i := 1,2,3,\dots,n$, do:

    • If $i \ge k+1$ and $z_{i-k} = x_{i-k+1} \cdots x_i$, then set $z_i = x_{i-k+1} \cdots x_i$; otherwise, set $z_i = g(x_1 \cdots x_i)$.
  • Output $z_n$.

We define $f(x)$ to be the output of the above algorithm. This defines a function $f$, and you can verify that it satisfies the condition you required. Why? Each $z_j$ depends only on the prefix $x_1 \cdots x_j$, so we have the invariant $f(x_1 \cdots x_j) = z_j$. If $f(x) = y$ with $|x| = n$, then running the algorithm on $x \, || \, y$ yields $z_n = y$ at step $n$, and at step $n+k$ the last $k$ symbols of the input are exactly $y = z_n$, so the if-statement sets $z_{n+k} = y$. Moreover, if $g$ is a good hash function, we can expect $f$ will be a good hash function (at the very least, as good as is possible, given the condition you require).
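The construction above can be sketched in Python. This is only an illustration under some assumptions that are not part of the answer: it works on bytes rather than bits, it takes $g$ to be SHA-256 truncated to `K` bytes, `K = 4` is an arbitrarily small demonstration value, and it rejects empty input because the algorithm indexes from $i = 1$. The names `f` and `K` are chosen here for the example.

```python
import hashlib

K = 4  # output length k in bytes (assumption: small value for demonstration)


def f(x: bytes, k: int = K) -> bytes:
    """Sketch of the construction, with g = SHA-256 truncated to k bytes."""
    if not x:
        # Assumption: the construction starts at i = 1, so empty input is rejected.
        raise ValueError("input must be non-empty")

    h = hashlib.sha256()  # running hash of the prefix x_1 ... x_i
    z = []                # z[i - 1] stores z_i

    for i in range(1, len(x) + 1):
        h.update(x[i - 1:i])
        window = x[i - k:i] if i >= k + 1 else None  # x_{i-k+1} ... x_i
        if window is not None and z[i - k - 1] == window:
            # The last k symbols equal z_{i-k}: keep that value.
            z.append(window)
        else:
            # Otherwise fall back to g applied to the current prefix.
            z.append(h.copy().digest()[:k])

    return z[-1]


y = f(b"hello")
print(f(b"hello" + y) == y)
```

Keeping a running hash object and copying it before each `digest()` call avoids rehashing every prefix from scratch; storing every $z_i$ in a list is the simplest choice, although only the last $k$ values are ever consulted.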

D.W.