I am a bit new to encryption and I've seen a lot of recommendations to use a KDF for encryption and decryption. I'm confused because I encrypt the data with a key derived using PBKDF2 and a salt generated by get_random_bytes, but when I decrypt I derive a new key from a new salt, and this throws off my decryption. When I make the salt a constant, for example b'salt', decryption works as expected, but if I regenerate the salt and then regenerate the key, decryption doesn't work. My question is: should I be using the same salt for key derivation in both encryption and decryption? Also, if encryption happens on a server and decryption happens on the client, doesn't sending the salt over the network make using a KDF pointless? Where am I going wrong?
1 Answer
As a quick aside, something newer like Argon2, the winner of the 2015 Password Hashing Competition, is better than PBKDF2. Deriving a single key with a Key Derivation Function (KDF) and reusing that same key to encrypt many messages (for example) isn't particularly secure. There are also methods of exchanging keys, like Diffie-Hellman key exchange, although they may not be applicable in certain use cases.
From re-reading your question, it seems you think a KDF is a function that lets you keep a shared secret key and generate the same derived key even with a random salt. If that were the case, the salt would have no effect whatsoever.
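Note that PBKDF2 is built from a hash function, used through HMAC. In pycrypto this can be, for example, HMAC-SHA512 (as far as I know the library falls back to HMAC-SHA1 if you don't choose one). Because the output is a deterministic function of the password and the salt, the salt needs to be the same in order for the derived key to be the same.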
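To make that concrete, here is a minimal sketch of both sides of the exchange. It uses `hashlib.pbkdf2_hmac` from the Python standard library rather than pycrypto so it runs without extra packages; the password, iteration count, and key length are assumptions chosen for illustration, not recommendations from the question.

```python
import hashlib
import os

ITERATIONS = 100_000  # assumed work factor; tune for your own hardware
KEY_LEN = 32          # 32 bytes, e.g. enough for an AES-256 key


def derive_key(password: bytes, salt: bytes) -> bytes:
    """Derive a key with PBKDF2-HMAC-SHA512 (standard library)."""
    return hashlib.pbkdf2_hmac("sha512", password, salt, ITERATIONS, dklen=KEY_LEN)


password = b"hypothetical shared password"

# Encrypting side: pick a random salt once and keep it with the ciphertext.
salt = os.urandom(16)
encryption_key = derive_key(password, salt)

# Decrypting side: reuse the stored salt instead of generating a new one.
decryption_key = derive_key(password, salt)
same_salt_matches = encryption_key == decryption_key

# A fresh salt gives an unrelated key, which is why decryption failed.
fresh_salt_matches = derive_key(password, os.urandom(16)) == encryption_key

print(same_salt_matches, fresh_salt_matches)
```

In practice the salt is simply stored or sent next to the ciphertext (for example as a prefix), since it is not a secret.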
When you generate a new salt, you generate a different derived key. That is exactly why a salt is used. Also, deriving multiple keys directly with PBKDF2 can be risky: once your password is found, every key derived from it is gone. You can instead use PBKDF2 once to generate a master key and then use a key-based KDF (such as HKDF) to derive the individual keys from it.
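One way to read "use PBKDF2 to generate a key and then use a KDF" is to run PBKDF2 once and expand the result with HKDF. The sketch below assumes that reading. `hkdf_sha256` is a hand-written helper following RFC 5869, and the `info` labels, password, and iteration count are hypothetical. Note that this only separates the keys from each other; it does not help if the password itself is guessed.

```python
import hashlib
import hmac
import os


def hkdf_sha256(key_material: bytes, info: bytes, length: int = 32,
                salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF")
    if not salt:
        salt = bytes(hash_len)  # RFC 5869 default: hash_len zero bytes
    # Extract: condense the input into a pseudorandom key.
    prk = hmac.new(salt, key_material, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output is produced.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


password = b"hypothetical password"
salt = os.urandom(16)

# The slow, password-based step runs only once.
master_key = hashlib.pbkdf2_hmac("sha256", password, salt, 100_000, dklen=32)

# Cheap, independent-looking subkeys, separated by their info label.
enc_key = hkdf_sha256(master_key, info=b"encryption")
mac_key = hkdf_sha256(master_key, info=b"authentication")
```

The receiver repeats the same two steps with the same salt and labels to get the same pair of keys.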
The purpose of the salt is to protect against pre-computed attacks on the hash (e.g. rainbow tables).
Sending the salt over the network does not make the KDF pointless. The salt is not a secret, and it still does its job when the attacker can see it. As a simple example (I am the attacker):
Without a salt
I download a file with a list of all hashes for likely n-character passwords. I watch traffic and match many transmitted digests against my downloaded file. All I need is an internet connection, significant storage space, and network hacking skills.
With a salt
I cannot download (or generate) a file with all likely passwords as every password has too many possible hashes. I watch traffic, and must generate a new hash file for every salt I see.
As you can see (hopefully), the salt makes it far more difficult to pre-compute anything before you see the salt.
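The two scenarios above can be sketched in a few lines. This is a toy model: the password list is tiny, and a single SHA-256 call stands in for the real KDF so the example stays fast. All names here are made up for illustration.

```python
import hashlib
import os

likely_passwords = [b"123456", b"password", b"letmein", b"qwerty"]


def digest(password: bytes, salt: bytes = b"") -> bytes:
    """Toy stand-in for a KDF: hash the salt followed by the password."""
    return hashlib.sha256(salt + password).digest()


# Built once, ahead of time, and reusable against every victim.
precomputed = {digest(p): p for p in likely_passwords}

victim_password = b"letmein"

# Without a salt: one dictionary lookup recovers the password.
recovered_unsalted = precomputed.get(digest(victim_password))

# With a salt: the observed digest is not in the precomputed table.
salt = os.urandom(16)
observed = digest(victim_password, salt)
recovered_from_table = precomputed.get(observed)

# Guessing still works for weak passwords, but only after the salt is
# known, and the work has to be redone for every salt that is seen.
recovered_by_guessing = next(
    (p for p in likely_passwords if digest(p, salt) == observed), None
)

print(recovered_unsalted, recovered_from_table, recovered_by_guessing)
```

The last step is why a salt is not a substitute for a strong password or a slow KDF: it removes the shortcut of pre-computation, not the possibility of guessing.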