I have concerns regarding truncated SHA-256 hashes in an application I am building at the moment:

Nomenclature

secret - the full 256-bit SHA-256 result of hashing 16 random bytes
public - a unique identifier for an object.
hash - the output of SHA-256(secret || - || public)
prefix - the first 8 characters (32-bit) of hex encoded hash.


Scenario

secret is generated once and remains constant throughout the lifecycle of the scenario.
It is not known to the attacker

Identifiers for all objects (publics) are known to the attacker.

prefix will be the base for further computation and information retrieval for a given object.
The first 4 bytes (32 bit) of hash have to be sufficient for that.

It is important that a potential attacker cannot generate a valid prefix for a given public.


Concerns

The attacker's utopia is to find secret so they can generate a valid hash for every object.
This is unrealistic to brute force (computation would take forever).

However, because only the first 32 bits of hash matter, is there a mechanism or cryptographic property that makes it feasible for the attacker to guess or compute a valid secret that would allow them to generate prefix values for given publics?


Example

secret = 'af8b81c94d68...' (256-bit)
public = '123456'
hash = 'fe13c815ab44...' (256-bit)
prefix = hash[0...8] = 'fe13c815'
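
For concreteness, the construction above can be sketched as follows. This is a minimal sketch, not the application's actual code: the helper names `make_secret` and `prefix_for` are hypothetical, and it assumes that secret is used as a hex string and that the three parts are joined as UTF-8 text with a literal `-` in between.

```python
import hashlib
import secrets


def make_secret() -> str:
    # Assumption: secret is the hex-encoded SHA-256 digest of 16 random bytes.
    return hashlib.sha256(secrets.token_bytes(16)).hexdigest()


def prefix_for(secret: str, public: str, hex_chars: int = 8) -> str:
    # hash = SHA-256(secret || - || public); the encoding is an assumption.
    data = f"{secret}-{public}".encode("utf-8")
    full_hash = hashlib.sha256(data).hexdigest()
    # prefix = first 8 hex characters = first 32 bits of the hash.
    return full_hash[:hex_chars]


secret = make_secret()
prefix = prefix_for(secret, "123456")
print(prefix)
```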

Can the attacker guess secret such that they would end up with a valid prefix?

Can the attacker use that guessed, validated secret and compute prefix values for different publics?

1 Answer

Can the attacker guess secret such that they would end up with a valid prefix?

Can the attacker use that guessed, validated secret and compute prefix values for different publics?

You don't need a collision attack here. You need a pre-image attack: given a hash value $h$, find an $x$ such that $h = \operatorname{SHA-256}(x)$. Note that the attacker doesn't need to find the original value; any value that produces the required hash value counts as a successful pre-image. For a given $h$ there can be more than one such value, or none, in the range $[0, 2^{256})$. We expect that some hash values have no pre-image in that range because of collisions.

The 16 random bytes give 128 bits, and searching that space is out of reach for anyone. Recovering the original value is therefore beyond any attacker, since there is no known pre-image attack on SHA-256 that is better than exhaustive search.

Since you truncated the result, finding a pre-image for the prefix now takes only around $2^{32}$ trials. The attacker will look for a candidate secret so that

$$\operatorname{SHA-256}(\operatorname{SHA-256}(\text{candidate secret}) \mathbin\|\text{-} \mathbin\| \text{public})$$ will produce valid output.
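
To make the cost explicit, here is a short worked estimate. It assumes that the truncated SHA-256 output behaves like a uniformly random 32-bit string, so each trial matches a fixed prefix with probability $2^{-32}$; the symbol $k$ (number of trials) is introduced here for illustration only.

$$\Pr[\text{at least one match in } k \text{ trials}] = 1 - \left(1 - 2^{-32}\right)^{k} \approx 1 - e^{-k/2^{32}}$$

Under that assumption the expected number of trials is $2^{32}$, and $k = 2^{32}$ trials already succeed with probability about $1 - e^{-1} \approx 0.63$.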

Once a successful candidate is found (it is unlikely to be the original, since that was the output of SHA-256), the system will accept it as valid input. The attacker only needs to vary the first 32 bits of the 256-bit candidate and can set the remaining 224 bits to zero.

$$\text{candidate secret} = \overbrace{\texttt{0x00000000 to 0xFFFFFFFF}}^{32 \;bits}\;\overbrace{\texttt{0x00000...0}}^{224 \;bits}$$
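
The search described above can be sketched as a simple loop. This is only an illustration under stated assumptions: the helper names `prefix_for` and `find_candidate` are hypothetical, the encoding (hex secret, UTF-8, literal `-` separator) is assumed rather than known, and the demo uses a 16-bit prefix (4 hex characters) instead of 32 bits so that it finishes quickly in plain Python.

```python
import hashlib
from typing import Optional


def prefix_for(secret_hex: str, public: str, hex_chars: int) -> str:
    # Assumed construction: SHA-256(secret || - || public), truncated.
    data = f"{secret_hex}-{public}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()[:hex_chars]


def find_candidate(public: str, target_prefix: str, max_tries: int) -> Optional[bytes]:
    hex_chars = len(target_prefix)
    for counter in range(max_tries):
        # 32 varying bits followed by 224 zero bits, as in the formula above.
        candidate = counter.to_bytes(4, "big") + bytes(28)
        # The secret is itself a SHA-256 output, so hash the candidate first.
        secret_hex = hashlib.sha256(candidate).hexdigest()
        if prefix_for(secret_hex, public, hex_chars) == target_prefix:
            return candidate
    return None


# Demo with a stand-in secret and a shortened 16-bit prefix.
real_secret = hashlib.sha256(b"demo only, not a real secret").hexdigest()
target = prefix_for(real_secret, "123456", 4)
found = find_candidate("123456", target, max_tries=2**20)
print(target, found.hex() if found else None)
```

With the full 8-character prefix the loop is the same, only the expected number of iterations grows to about $2^{32}$.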

That search is quite doable if you consider some of the top machines:

  1. Summit can reach $\approx 2^{63}$ SHA-1 hashes in around one hour, $\approx 2^{72}$ hashes in one year.
  2. Titan can reach $\approx 2^{63}$ SHA-1 hashes in around two hours, $\approx 2^{71}$ hashes in one year.
  3. Bitcoin miners reached $\approx 2^{92}$ SHA-256 hashes per year as of 6 August 2019.

The advice: keep at least 16 bytes of the result if you don't fear a possible future quantum attack. Otherwise, double the size against Grover's attack and triple it against Brassard et al.'s attack. Of course, it is not clear how one could set up the sequential evaluation without a significant delay that could make the attack infeasible.
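
The sizing rule of thumb above can be written down as a small helper. This is a sketch of the arithmetic only; the function name `min_output_bytes` and the attack labels are hypothetical, and the factors simply encode "double for Grover, triple for Brassard et al." from the paragraph above.

```python
def min_output_bytes(security_bits: int = 128, attack: str = "classical") -> int:
    # Multipliers taken from the advice above: 1x classical,
    # 2x against Grover's attack, 3x against Brassard et al.'s attack.
    factor = {"classical": 1, "grover": 2, "brassard": 3}[attack]
    # Round up to whole bytes.
    return (security_bits * factor + 7) // 8


for attack in ("classical", "grover", "brassard"):
    print(attack, min_output_bytes(128, attack))
```

For a 128-bit target this gives 16, 32, and 48 bytes respectively, compared with the 4 bytes kept by the 8-character prefix.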

kelalaka