
Say I have a SHA256 hash result that was produced from a 2000-bit input. For example, 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef (64 hex digits, i.e. 256 bits) is the SHA256 output.

Theoretically, collisions must exist because the number of input bits is greater than the number of SHA256 output bits.

My question is: if I use brute force to try every possible 2000-bit input, how much effort or time is needed to find a second or third input that generates the same SHA256 output, 0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef (64 hex digits)?

kelalaka
Pi-Turn

1 Answer


A generic hash collision means finding two different inputs $x$ and $y$ such that $H(x) = H(y)$. For a cryptographic hash function with an $n$-bit output, the cost of finding such a pair with 50% probability is $\mathcal{O}(2^{n/2})$, by the birthday paradox. For SHA256, the cost of finding a collision with 50% probability is therefore around $2^{128}$, i.e. one needs to try $c_1 \cdot 2^{128}$ different inputs, where $c_1$ is a small constant.
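To make the $2^{n/2}$ birthday bound concrete, here is a minimal sketch that searches for a collision on a truncated SHA256. The truncation to 24 bits and the helper names `truncated_sha256` and `find_collision` are assumptions made purely for illustration; the full 256-bit search is out of reach.

```python
import hashlib

def truncated_sha256(data: bytes, n_bits: int) -> int:
    """Return the top n_bits of SHA-256(data) as an integer."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - n_bits)

def find_collision(n_bits: int):
    """Birthday search: hash successive counter values until two of them
    share the same truncated digest. Returns (x, y, number_of_tries)."""
    seen = {}  # truncated digest -> first message that produced it
    counter = 0
    while True:
        msg = counter.to_bytes(8, "big")
        h = truncated_sha256(msg, n_bits)
        if h in seen:
            return seen[h], msg, counter + 1
        seen[h] = msg
        counter += 1

# Illustrative size only: 24 bits, so roughly 2**12 tries are expected.
x, y, tries = find_collision(24)
print(x.hex(), y.hex(), tries)
```

With a 24-bit digest the search usually ends after a few thousand hashes, which is on the order of $2^{12}$ rather than $2^{24}$.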

What you are seeking is not a collision. You are looking for a second-preimage attack (since the input bits are known, it is not a pre-image attack either). A second-preimage attack is: given $x$ and its hash value $a = H(x)$, find another input $y \neq x$ such that $H(x) = H(y)$.

The cost of the generic second-preimage attack is $\mathcal{O}(2^{n})$, the same as for the generic pre-image attack. For SHA256, this means you need to test around $c_2 \cdot 2^{256}$ different inputs to find a second preimage, where $c_2$ is a small constant. In other words, find $y \neq x$ that hashes to the given hash value of $x$: $\operatorname{SHA256}(x) = \operatorname{SHA256}(y)$.

For the second and third such inputs, the expected cost is the same each time. Simply continue the search with values not yet tried. The easiest way is to use a standard counter to keep track of the inputs.
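The counter-based search can be sketched as follows, again on a truncated SHA256 so that it finishes quickly. The 16-bit truncation, the sample input `x`, and the function names are assumptions for illustration only; against the full 256 bits the same loop would need about $2^{256}$ iterations per hit.

```python
import hashlib

def truncated_sha256(data: bytes, n_bits: int) -> int:
    """Return the top n_bits of SHA-256(data) as an integer."""
    digest = int.from_bytes(hashlib.sha256(data).digest(), "big")
    return digest >> (256 - n_bits)

def second_preimages(x: bytes, n_bits: int, how_many: int):
    """Counter-based brute force: try inputs 0, 1, 2, ... (encoded with the
    same length as x) and collect those whose truncated digest matches x's.
    Returns a list of (y, total_tries_so_far) pairs."""
    target = truncated_sha256(x, n_bits)
    found = []
    counter = 0
    while len(found) < how_many:
        y = counter.to_bytes(len(x), "big")
        counter += 1
        if y != x and truncated_sha256(y, n_bits) == target:
            found.append((y, counter))
    return found

# A 2000-bit (250-byte) example input; the contents are arbitrary.
x = bytes(range(250))
hits = second_preimages(x, n_bits=16, how_many=2)
for y, tries in hits:
    print(tries, y[-4:].hex())
```

Each additional hit costs about as many tries as the first one, since the search simply continues from where the counter left off.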

Your input space of $2^{2000}$ values is far larger than the required search space of about $2^{256}$ trials. So, if we assume that the output of SHA256 is uniformly random, we expect about $2^{2000}/2^{256} = 2^{1744}$ inputs for each hash value.
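The division above can be checked with exact integer arithmetic. This is only a sanity check of the exponent; the variable names are introduced here for illustration.

```python
INPUT_BITS = 2000
OUTPUT_BITS = 256

# Number of possible inputs divided by number of possible digests,
# i.e. the average number of inputs per digest under the uniformity assumption.
inputs_per_digest = 2**INPUT_BITS // 2**OUTPUT_BITS

print(inputs_per_digest == 2**(INPUT_BITS - OUTPUT_BITS))
print(inputs_per_digest.bit_length() - 1)  # the exponent of the power of two
```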

Now, if your question is: for a given input $x$ and $h=\operatorname{SHA256}(x)$, find second preimages just by permuting the input bits, then the problem is a bit vague. This is because we have a permutation with repetition: there are only ones and zeros. This can only generate $$\frac{2000!}{t! \cdot u!}$$ distinct values, where $t$ is the number of ones and $u$ is the number of zeros, with $t + u = 2000$.
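The count $\frac{2000!}{t! \cdot u!}$ is the binomial coefficient $\binom{t+u}{t}$, so it can be evaluated exactly. The sketch below does this for the balanced case $t = u = 1000$, which is an example value chosen here; `distinct_arrangements` is a hypothetical helper name.

```python
import math

def distinct_arrangements(t: int, u: int) -> int:
    """Number of distinct bit strings with t ones and u zeros: (t+u)! / (t! * u!)."""
    return math.comb(t + u, t)

count = distinct_arrangements(1000, 1000)
log2_count = math.log2(count)  # math.log2 accepts arbitrarily large ints

print(f"log2(count) = {log2_count:.1f}")
print(count > 2**256)
```

For a balanced input the number of distinct arrangements is only slightly below $2^{2000}$, so it still vastly exceeds $2^{256}$.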

If we assume that half of the bits are ones and half are zeros, and rely on the uniform distribution of SHA256's output, then you can find the second input as above, and the third, and so on. There are efficient permutation generation algorithms; however, I am not aware of an efficient algorithm for permutations with repetition.

The interesting cases are $t \gg u$ and $t \ll u$. Some of these cannot produce enough distinct inputs to reach $2^{256}$. Which case applies depends on the input, so the number of possible second preimages is left to the questioner.
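As a rough way to locate the unbalanced cases, the sketch below finds the smallest number of ones $t$ for which the arrangements of a 2000-bit input number at least $2^{256}$. By symmetry the same threshold applies to the number of zeros. The function name `min_ones_to_reach` is an assumption for illustration, and reaching $2^{256}$ arrangements only makes a second preimage expected under the uniformity assumption, not guaranteed.

```python
import math

def min_ones_to_reach(n_bits: int, digest_bits: int):
    """Smallest t <= n_bits // 2 with C(n_bits, t) >= 2**digest_bits, or None.
    C(n_bits, t) grows with t on this range, so the first hit is the minimum."""
    target = 2**digest_bits
    for t in range(n_bits // 2 + 1):
        if math.comb(n_bits, t) >= target:
            return t
    return None

threshold = min_ones_to_reach(2000, 256)
print(threshold)
```

Inputs with fewer ones (or fewer zeros) than this threshold have fewer than $2^{256}$ distinct bit arrangements, so a permutation-only search is not expected to find a second preimage for them.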

kelalaka