How difficult is it to guess a salt based on a hash and known plain text?

Question

I have the following use case:

Alice owns the domain alice.com and Bob owns bob.com. I want users to create a hash for themselves and a specific service, e.g. facebook.com or twitter.com. The environment is a client-side-only application; no servers are involved. Therefore each user owns a unique random string that is generated once, stored locally on the user's device and intended to be used as a salt. The goal is to have unique hashes for each user, tied to a service, without an attacker being able to guess the hash of a user on a specific other service. The hash is intended to be the local part of a catch-all email address.

Alice's generated email address for facebook.com: tcpcAwdfMGw5S22/98avSvqrO8vyEefLGRpmu5zHWKw@alice.com

A pseudo-code example with argon2 as hashing function would look like this: argon2.hash("<service>+<user-domain>", {salt: <user-salt>}) -> hash

A concrete example for Alice, her domain and facebook would be: argon2.hash("facebook.com+alice.com", {salt: <SaltOfAlice>}) -> tcpcAwdfMGw5S22/98avSvqrO8vyEefLGRpmu5zHWKw

Please note that there is no actual password to be hashed, just a string consisting of the service's domain and the user's domain. Argon2 is repurposed to generate a hash based on this string, where the salt is the actual secret part.

This leaves us with a table of users, services, personal domains and hashes.

User     Domain         Service         Hashed Mail
-----------------------------------------------------------------------------------
Alice    alice.com      facebook.com    tcpcAwdfMGw5S22/98avSvqrO8vyEefLGRpmu5zHWKw@alice.com    
Alice    alice.com      twitter.com     yQYCTrtMbQCq1ufLGCJAsc+Y1SIIYR7rC7Cbgp6e2IA@alice.com
Bob      bob.com        facebook.com    B+sKf7kXTH13tfj8ueZdmnTIpWRYCUYeNyQzFFx0jWs@bob.com
Bob      bob.com        twitter.com     /EIqXNIG/orOcQfYz/j1IYWTIRvTAoXJ4Di8cESsXwg@bob.com

These hashes are going to be publicly accessible. So when an attacker obtains a list of hashes of Alice and the associated services (facebook.com, twitter.com, etc) and her own domain (alice.com) would the attacker be able to extract the salt?

The attacker should not be able to guess Alice's hashed email address for a new service (e.g. stackexchange.com).

Is this guaranteed even when each user always uses the same salt and the salts are kept private?

Maarten Bodewes · Answer 1 · 2019-02-17T16:20:54.160

It seems you are looking for a good way of creating different "passwords" from a user-specific secret. For that you do not need a password hash; you can use a common key-based KDF, assuming that your secret has the properties of a symmetric key.

In that case the secret should take the place of the secret key or password, not the hash. The information on the service and domain should then be part of the "other information" or, if you are dead set on using a password hash, the salt. The password hash would be needed if it would otherwise be too easy for an adversary to guess the secret, but note that it only gives you a small, constant amount of protection against offline guessing attacks.

To do this you should probably prefix each byte-encoded value with its size in bytes, to make sure there is only one possible encoding for both values together. It could make sense to first hash the service and domain with a secure hash (SHA-256) if they are large, because algorithms may not have been designed for a large info field or salt; the CPU time may otherwise depend on the encoded size.

So you would get something like:

$$\text{info} = \text{Encode}(\text{service}, \text{domain})$$
$$\text{passwordBits} = \text{KDF}(\text{secret}, \text{info}, \text{passwordSizeInBits})$$
$$\text{password} = \text{Format}(\text{passwordFormatting}, \text{passwordBits})$$

or, if you take the hashing of the info into account:

$$\text{info} = \text{Hash}(\text{Encode}(\text{service}, \text{domain}))$$
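As a rough illustration of the Encode and Hash steps, here is a minimal Python sketch. The function names `encode_info` and `hashed_info` and the 4-byte big-endian length prefix are my own assumptions; any fixed-width length prefix would serve the same purpose.

```python
import hashlib
import struct


def encode_info(service: str, domain: str) -> bytes:
    """Length-prefix each UTF-8 field so the joint encoding is unambiguous."""
    out = b""
    for field in (service, domain):
        raw = field.encode("utf-8")
        # 4-byte big-endian length (an assumed width), then the bytes themselves.
        out += struct.pack(">I", len(raw)) + raw
    return out


def hashed_info(service: str, domain: str) -> bytes:
    """Optional step: compress the encoding to a fixed 32 bytes with SHA-256."""
    return hashlib.sha256(encode_info(service, domain)).digest()


# Plain concatenation cannot tell these two inputs apart; the prefixed form can.
assert "ab" + "c" == "a" + "bc"
assert encode_info("ab", "c") != encode_info("a", "bc")
```

Without the length prefix, two different (service, domain) pairs could produce the same info bytes and therefore the same derived value.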

It is generally possible to include a random salt in the KDF as well. This makes it possible to create a slightly more secure result, at the obvious expense of added complexity in the function and additional storage in your table.

I'd also store the $\text{passwordFormatting}$ string in your table of course. Different services will accept differently formatted password strings, even in 2019. You could mimic/copy the password formatting from an open source password manager, where the password is generated from a random value. Check the license first of course.

A very simple encoding that doesn't require a formatting string is base64url. Many sites would allow that if you don't make your password - and therefore the size of the password in bits - too large.

The password size in bits of course depends on how many bits you require for your password formatting. It may be larger than needed, e.g. 128 bits, and then you can toss away the bits you don't require during the formatting / base conversion / whatever to create a password.
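A small sketch of the base64url formatting, where surplus bits are simply dropped by truncating the encoded text. The helper name `format_base64url` and its `chars` parameter are hypothetical; each base64url character carries 6 bits.

```python
import base64


def format_base64url(password_bits: bytes, chars: int) -> str:
    """Encode as unpadded base64url and keep only the first `chars` characters."""
    text = base64.urlsafe_b64encode(password_bits).rstrip(b"=").decode("ascii")
    if chars > len(text):
        raise ValueError("not enough input bits for the requested length")
    # Truncation discards the bits that are not needed.
    return text[:chars]


# 16 bytes (128 bits) give 22 unpadded characters; keep 16 of them (96 bits).
print(format_base64url(bytes(range(16)), 16))
```

The base64url alphabet uses `-` and `_` instead of `+` and `/`, which tends to be friendlier to sites and to email local parts.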

A good key-based KDF (KBKDF) is HKDF, but you might also use the ANS X9.63 KDF if no KBKDF is available to you - that's probably the easiest one to implement.
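For reference, HKDF with SHA-256 is short enough to sketch with the Python standard library alone. This follows the extract-then-expand construction of RFC 5869. The function name `hkdf_sha256` is my own, the `info` value in the usage lines is only a placeholder, and a vetted library implementation is preferable when one is available.

```python
import hashlib
import hmac
import secrets


def hkdf_sha256(secret: bytes, info: bytes, length: int, salt: bytes = b"") -> bytes:
    """Derive `length` bytes from `secret`, bound to `info` (RFC 5869 construction)."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF")
    # Extract: an absent salt is replaced by a string of zero bytes.
    prk = hmac.new(salt or bytes(hash_len), secret, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is available.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


secret = secrets.token_bytes(32)          # the user's locally stored secret
info = b"placeholder for encoded info"    # assumed: the encoded service and domain
password_bits = hkdf_sha256(secret, info, 16)
print(password_bits.hex())
```

The same secret with a different `info` yields an unrelated output, which is the property the scheme above relies on.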

Below is a more generic explanation on how a password hash should be used:

"The goal is to have unique hashes for each user at each service without an attacker being able to guess the hash of a user on a specific other service."

No. The main goal is to keep the password secret (and to verify it when using it in the correct use case, of course). That identical hashes leak information or even the password is the reason for the salt. That the hashes are unique is a means to an end, in other words.

"This leaves us with a table of users, services, personal domains and hashes."

Commonly these tables also include the salt, for instance by including it into the password hash. The salt is not kept secret; it would require you to keep a secret value for each user. If that would be possible then you might not need the password hash in the first place; you could just compare the password to one in secure storage.

I would certainly also store something like a version indicator that represents the number of iterations, password derivation function etc. in use, so you can upgrade the entries (one by one) when this is required.

"The attacker should not be able to guess Alice's hash for a new service (e.g. stackexchange.com)."

This part is not applicable to the question above.

You cannot login with the hash on a service; you need the password and that is converted to a password hash by the service. If you enter the password hash it would calculate the password hash over the password hash and the login should fail (unless the implementers have made a huge implementation mistake). Again, protecting the hash is not the goal; the password needs to be protected.

"Is this guaranteed even when each user always uses the same salt and the salts are kept private?"

Well, yes, if there is a secret part within the input parameters of the password hash, and if this secret is one of the parts that is hashed, and if that secret has enough entropy then it would become impossible to calculate the password hash.

However, as explained above, it doesn't make much sense to keep the salt secret.

What is sometimes done is to have a secret available that is made part of the salt. The salt then consists of the secret part and a (usually randomly generated) nonce (which could be used as a salt by itself). The secret part is then called a pepper.

Without this secret value an adversary cannot start offline attacks as it is impossible to check if a password guess is correct without the pepper. However, this pepper is then identical for all (or possibly large groups of) passwords. It is not a salt by itself.
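To make the pepper idea concrete, here is a minimal sketch in which the salt fed to the password hash is the pepper followed by a per-entry nonce. PBKDF2 from the Python standard library stands in for Argon2 here purely because it needs no extra packages; the function names, the iteration count and the fixed-length pepper are all assumptions.

```python
import hashlib
import hmac
import secrets


def peppered_hash(password: bytes, pepper: bytes, nonce: bytes,
                  iterations: int = 600_000) -> bytes:
    """Password hash whose salt is a secret pepper plus a public nonce."""
    # The pepper is assumed to have a fixed length, so pepper + nonce is unambiguous.
    salt = pepper + nonce
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)


def verify(password: bytes, pepper: bytes, nonce: bytes, expected: bytes,
           iterations: int = 600_000) -> bool:
    """Recompute the hash and compare it in constant time."""
    candidate = peppered_hash(password, pepper, nonce, iterations)
    return hmac.compare_digest(candidate, expected)


pepper = secrets.token_bytes(32)   # one secret shared by all entries, kept out of the table
nonce = secrets.token_bytes(16)    # stored next to the hash, not secret
stored = peppered_hash(b"correct horse", pepper, nonce)
print(verify(b"correct horse", pepper, nonce, stored))
```

Someone who obtains `nonce` and `stored` but not `pepper` has nothing to test password guesses against.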

For this specific problem, I'd use the scheme written down at the start of the answer instead of using a password hash with a pepper.

score 2 · Answer 2 · edited Feb 16 '19 at 15:46

I don't think a salt is what you are looking for, and probably not Argon2 either, though it would probably work if that is your question. Argon2 is a key derivation function, not a general-purpose hash. It is good for hashing passwords. If you want to generate a key from a poor entropy source, Argon2 is an excellent choice. Salts are used to make sure we can't attack two passwords together, use precomputed tables, identify identical passwords, etc. They are normally prefixed to the hash and assumed to be public.

You seem to be asking for something else: each user has a secret key, and you want to combine this key with service provider names so that each user+service pair gets a unique string, but without the secret key nobody can create more such strings for other services or identify which service a given string belongs to. It is unclear whether you want a user with the key to be able to reverse the process; if so, you might prefer encryption to hashing, so that reversing by the legitimate user won't require guessing the service provider name.

But let us stick to hashing. Doing something simple like HMAC with the key and service provider should give the desired property. If the user secret is a password as opposed to a good random key, you will want to use a KDF such as Argon2 first. But once a user has computed such a strong secret key, he could reuse it with a cheap HMAC for many service names without repeating the expensive Argon2. If the user can securely store a random key, no use of Argon2 or the like is required at all.
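A minimal sketch of the HMAC approach, assuming the user already holds a random 32-byte key. The function name `service_alias` and the unpadded base64url output are my own choices, the latter made so the result can serve as the local part of an email address.

```python
import base64
import hashlib
import hmac
import secrets


def service_alias(user_key: bytes, service: str) -> str:
    """HMAC-SHA256 of the service name under the user's secret key, as base64url."""
    tag = hmac.new(user_key, service.encode("utf-8"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(tag).rstrip(b"=").decode("ascii")


user_key = secrets.token_bytes(32)  # generated once and stored on the user's device
print(service_alias(user_key, "facebook.com") + "@alice.com")
print(service_alias(user_key, "twitter.com") + "@alice.com")
```

A 32-byte tag encodes to 43 characters, which fits within the usual 64-character limit for an email local part.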
