Are there parts of a SHA-512 hash that occur more often than others?

Question

Suppose I have a 'begin' password that I can remember.

I calculate its SHA-512 hash, encode it in base 64, and check whether it has at least 1 lowercase letter, 1 number, 1 capital, and 1 sign.

If this is not the case, I repeat this procedure (but now using the previous hash as input) until I have a hash that fulfills these requirements.

I will now use this final hash as the 'final' real password.
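The procedure described above can be sketched roughly as follows. This is only an illustration of the question's scheme, not a recommendation. Several details are assumptions that the description leaves open: the password is encoded as UTF-8, the raw 64-byte digest (not its base 64 text) is fed back in as the next input, only `+` and `/` count as a "sign" (the `=` padding does not), and the names `meets_requirements`, `derive_password` and `max_rounds` are made up for this sketch.

```python
import base64
import hashlib
import string


def meets_requirements(candidate: str) -> bool:
    """True if candidate has a lowercase letter, a digit, a capital and a sign."""
    return (
        any(c in string.ascii_lowercase for c in candidate)
        and any(c in string.digits for c in candidate)
        and any(c in string.ascii_uppercase for c in candidate)
        and any(c in "+/" for c in candidate)  # assumption: '=' padding is not a sign
    )


def derive_password(begin_password: str, max_rounds: int = 1000) -> str:
    """Hash repeatedly until the base 64 form meets the requirements."""
    data = begin_password.encode("utf-8")  # assumption: UTF-8 input encoding
    for _ in range(max_rounds):
        digest = hashlib.sha512(data).digest()
        candidate = base64.b64encode(digest).decode("ascii")
        if meets_requirements(candidate):
            return candidate
        data = digest  # assumption: the raw digest is the next input
    raise RuntimeError("no suitable hash found within max_rounds")


if __name__ == "__main__":
    print(derive_password("my begin password"))
```

The result is deterministic: the same 'begin' password always yields the same 'final' password.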

Will every possible combination of bits be (approximately) equally likely? (If I ignore the first lowercase, number, capital and sign in the password and only look at the other chars)

In short: Will every possible 'final' password be equally likely?

kelalaka · Answer 1 · 2019-11-28T21:18:15.410

In short: Will every possible 'final' password be equally likely?

Yes, we expect them to be equally likely. Cryptographic hash functions are expected to have well-distributed output. If you find any bias, you can convert it into an attack: even a tiny bias can be used to attack the hash function.

It seems you are going to use the SHA-512 hash function as a password generator. Your scheme's weak point will be your master password: one can still write a password cracker for your construction. Cryptographic hash functions are not designed to be slow; they are designed to be fast. That is why Key Derivation Functions use iteration. A better solution is to use PBKDF2 or Argon2 and use their output. If the output does not meet the requirements, you can use the iteration count as the next try: first try 100000, then 100001, etc.
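A minimal sketch of that suggestion, using PBKDF2 from Python's standard library. The salt, the character-class check, the choice of HMAC-SHA-512, and the names `has_all_classes`, `derive_with_pbkdf2` and `max_tries` are assumptions made for illustration; the answer only specifies the idea of bumping the iteration count.

```python
import base64
import hashlib
from typing import Tuple


def has_all_classes(candidate: str) -> bool:
    """True if candidate has a lowercase letter, a capital, a digit and a sign."""
    return (
        any(c.islower() for c in candidate)
        and any(c.isupper() for c in candidate)
        and any(c.isdigit() for c in candidate)
        and any(c in "+/" for c in candidate)
    )


def derive_with_pbkdf2(
    master_password: str,
    salt: bytes,  # assumption: e.g. a per-site value; not part of the answer
    start_iterations: int = 100_000,
    max_tries: int = 100,
) -> Tuple[str, int]:
    """Try 100000, 100001, ... iterations until the output qualifies."""
    for iterations in range(start_iterations, start_iterations + max_tries):
        key = hashlib.pbkdf2_hmac(
            "sha512", master_password.encode("utf-8"), salt, iterations
        )
        candidate = base64.b64encode(key).decode("ascii")
        if has_all_classes(candidate):
            return candidate, iterations
    raise RuntimeError("no suitable output found within max_tries")


if __name__ == "__main__":
    password, used = derive_with_pbkdf2("my master password", b"example.com")
    print(used, password)
```

Each try costs a full PBKDF2 run, so an attacker guessing master passwords pays that cost per guess as well.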

score 4 · Accepted Answer · answered Nov 28 '19 at 21:02

If you skip the padding and final non-padding characters then yes, each character should be equally likely. The output of SHA-256 is indistinguishable from random if you cannot guess the input. The same holds when the hash is iterated over its own output.

The final characters do not encode just SHA-256 output but are padded with zero bits. This is necessary since 256 is not a multiple of 6, and each character in the base 64 alphabet encodes 6 bits. So the final 256 % 6 = 4 bits need to be encoded by adding two 0 bits (and possibly one = sign for the padding).

Of course, this doesn't matter much, since you'd still have all the entropy of the original password encoded into the base 64 representation, provided you don't shorten the 43 or 44 characters of the base 64 hash; many password entry fields will not accept that many characters. Actually, you would be just as secure if you used the hash without special characters, but that's often not accepted either.
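The padding arithmetic can be checked with a short script. It is only a sketch: the inputs are arbitrary counter strings and `last_chars` is a made-up helper name. It prints the encoded lengths for SHA-256 (the case worked out above) and for SHA-512 (the hash in the question, where 512 % 6 = 2 bits are left over and four 0 bits are added), and collects which characters can appear in the final non-padding position.

```python
import base64
import hashlib


def last_chars(hash_name: str, samples: int = 2000) -> set:
    """Collect the final non-padding base 64 character over many hashes."""
    seen = set()
    for i in range(samples):
        digest = hashlib.new(hash_name, str(i).encode("ascii")).digest()
        encoded = base64.b64encode(digest).decode("ascii")
        seen.add(encoded.rstrip("=")[-1])
    return seen


if __name__ == "__main__":
    for name in ("sha256", "sha512"):
        digest = hashlib.new(name, b"example").digest()
        encoded = base64.b64encode(digest).decode("ascii")
        # Total length, and how many of those characters are '=' padding.
        print(name, len(encoded), encoded.count("="))
        # The last real character carries fewer than 6 hash bits, so it
        # only ever takes a subset of the 64 possible values.
        print(name, sorted(last_chars(name)))
```

For SHA-256 the last real character carries 4 hash bits and can take 16 values; for SHA-512 it carries 2 hash bits and can take only 4 values. All other characters use the full alphabet.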

Unfortunately, an attacker only has to deal with one additional hash calculation in your scheme. No salt or iteration count is added as would be in a good password hash algorithm. So while your assumption may be correct, the method of deriving a password hash - which this basically is - isn't.

Furthermore, your scheme may be vulnerable to somebody deliberately having you try a password whose hash does not contain any symbols, as each of the two symbols has only a 1/64 chance of appearing at any given position.

If you decide to shorten the hash or its base 64 encoding, then you should not perform additional hashes over this shortened notation, as somebody might be able to make you enter a cycle using the attack above. This chance is about zero if you keep hashing the full hash output of the previous iteration.
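To get a feel for how often a single hash contains no symbol at all, here is a rough calculation. It assumes each unconstrained base 64 character is uniform over the 64-character alphabet, that the symbols are `+` and `/`, and that the last non-padding character never is a symbol because of its zero padding bits. The function name `prob_no_symbol` is made up for this sketch.

```python
def prob_no_symbol(free_chars: int) -> float:
    """Chance that none of free_chars uniform base 64 characters is '+' or '/'."""
    return (62 / 64) ** free_chars


if __name__ == "__main__":
    # SHA-256: 43 non-padding characters, the last of which cannot be a symbol.
    print(prob_no_symbol(42))  # roughly 0.26
    # SHA-512: 86 non-padding characters, the last of which cannot be a symbol.
    print(prob_no_symbol(85))  # roughly 0.07
```

So under these assumptions a symbol-free hash is not rare, and the rehash step would be triggered fairly regularly.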
