What is the correct way to implement PBKDF2 + AES CBC + HMAC?

Question

I've been reading a lot about the proper way to implement AES in CBC mode with HMAC authentication. I've seen many explanations, but I've had a hard time finding a real example (with code) that covers all the steps, including key derivation, encryption, and decryption. My application uses PBKDF2 for key derivation. Here is what I have so far:

Derive Keys

Generate 256 bits of "key material" with KDF

key_material = pbkdf2(user_password, salt, iterations, size:256, alg:sha256)

Split the key material in half, yielding two 128-bit keys for encryption and authentication
```
enc_key = key_material.split(0, 127)
mac_key = key_material.split(128, 255)
```

Encrypt

Generate a 128-bit initialization vector
```
iv = secureRandomBits(128)
```

Do AES encrypt

cipher = aesEncrypt(plaintext, iv, mode:cbc, key:enc_key)

Compute MAC with concatenated IV and cipher bits

mac = hmac(iv + cipher, alg:sha256, key:mac_key)

Store the resulting values on the server (considered insecure storage)
```
storeValuesToServer(cipher, iv, mac)
```

Decrypt

Fetch data from storage

cipher, iv, mac = getValuesFromServer()

Recompute mac

new_mac = hmac(iv + cipher, alg:sha256, key:mac_key)

Compare macs

if(mac != new_mac)
  throw 'Failed MAC check!'

Do AES decrypt

plaintext = aesDecrypt(cipher, iv, mode:cbc, key:enc_key)

Questions

Does this look correct? Main concerns here are:

Have I derived the keys correctly, splitting them in half based on the output of the KDF?
Am I computing the HMAC correctly by combining the iv + cipher? Would it matter if I had done cipher + iv instead?

score 12 · Accepted Answer · edited May 23 '17 at 12:41

You'll find there's a lot of hair-splitting on this topic, especially around key derivation. But yes, your pseudocode is fine, although you may want to revise (0, 128) => (0, 127) and (129, 256) => (128, 255) ... (correct me if I'm wrong?). Also, you might want to use a constant-time comparison function for verifying the MAC.

Have I derived the keys correctly, splitting them in half based on the output of the KDF?

This is fine. Some people would disagree. See here here and here for more in-depth discussion.
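
A minimal sketch of the derive-and-split step, using `hashlib.pbkdf2_hmac` from Python's standard library. The password, salt size, and iteration count are placeholder assumptions, not values from the question. Python slices bytes rather than bits, so the two 128-bit halves are `[:16]` and `[16:]`.

```python
import hashlib
import secrets

# Placeholder parameters for illustration only; tune the count for your hardware.
ITERATIONS = 600_000
SALT_BYTES = 16


def derive_keys(password: str, salt: bytes, iterations: int = ITERATIONS) -> tuple[bytes, bytes]:
    """Derive 32 bytes with PBKDF2-HMAC-SHA256 and split them into two 16-byte keys."""
    key_material = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )
    enc_key = key_material[:16]  # bytes 0..15  (bits 0..127)
    mac_key = key_material[16:]  # bytes 16..31 (bits 128..255)
    return enc_key, mac_key


# The salt is not secret; store it next to the ciphertext so the keys can be re-derived.
salt = secrets.token_bytes(SALT_BYTES)
enc_key, mac_key = derive_keys("correct horse battery staple", salt)
```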

Am I computing the HMAC correctly by combining the iv + cipher? Would it matter if I had done cipher + iv instead?

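Yes, you're computing the HMAC correctly. No, the order does not matter for security, as long as you use the same order when computing and verifying the MAC; because the IV has a fixed length, either concatenation is unambiguous. If it's available to you, you might want to consider a mode like GCM, which takes care of both encryption and authentication with a single key.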
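
As a rough illustration of the MAC step, the sketch below computes and verifies an HMAC-SHA256 tag over IV || ciphertext with Python's standard `hmac` module. The helper names `compute_tag` and `verify_tag` are made up for this example, and random bytes stand in for real AES-CBC output.

```python
import hashlib
import hmac
import secrets


def compute_tag(mac_key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """HMAC-SHA256 over IV || ciphertext (the order used in the question)."""
    return hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()


def verify_tag(mac_key: bytes, iv: bytes, ciphertext: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it without leaking where a mismatch occurs."""
    expected = compute_tag(mac_key, iv, ciphertext)
    return hmac.compare_digest(expected, tag)


mac_key = secrets.token_bytes(16)
iv = secrets.token_bytes(16)
ciphertext = secrets.token_bytes(48)  # stand-in for real AES-CBC output

tag = compute_tag(mac_key, iv, ciphertext)
ok = verify_tag(mac_key, iv, ciphertext, tag)

# Swapping the order yields a different (equally valid) tag, so the encrypting
# and decrypting sides only need to agree on one convention.
swapped_tag = hmac.new(mac_key, ciphertext + iv, hashlib.sha256).digest()
```

Verification should happen before any decryption is attempted, as in the question's pseudocode.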

score 7 · Answer 2 · answered Apr 27 '17 at 15:35

IMO, your code looks pretty solid. A few things I might suggest taking a closer look at are:

You haven't specified what iteration count you're using for PBKDF2. You should make the iteration count as high as practical.

PKCS #5 suggests a minimum of 1000 iterations, but that recommendation comes from nearly two decades ago. IMO, nowadays there's very little reason to ever use less than a million iterations, and (at least for client-side apps, where denial of service attacks are not a concern) a billion iterations is often more reasonable.

You should also include the iteration count in the public description of your encryption scheme. Just saying that your software uses PBKDF2, without specifying the iteration count, gives very little useful information to users wishing to evaluate the security of your software.
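
One possible way to find out what "as high as practical" means on a given machine is to time a short probe run and scale it linearly to a delay you can tolerate. The function name `estimate_iterations`, the probe size, and the 0.5 second target are all assumptions made for this sketch.

```python
import hashlib
import time


def estimate_iterations(target_seconds: float, probe_iterations: int = 100_000) -> int:
    """Time a probe run of PBKDF2-HMAC-SHA256 and scale linearly to the target duration."""
    start = time.perf_counter()
    hashlib.pbkdf2_hmac("sha256", b"probe-password", b"probe-salt-16byt", probe_iterations, dklen=32)
    elapsed = time.perf_counter() - start
    scaled = int(probe_iterations * target_seconds / elapsed)
    # Never return fewer iterations than the probe itself used.
    return max(probe_iterations, scaled)


iterations = estimate_iterations(target_seconds=0.5)
print(f"Roughly {iterations:,} iterations fit in about 0.5 s on this machine")
```

The result varies between machines and runs, so the chosen count needs to be stored with the salt rather than recomputed at decryption time.
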
While using PBKDF2 is a lot better than not using any key-stretching KDF at all, there are nowadays even better alternatives like scrypt, Argon2 or Balloon hashing. These new KDFs are designed to consume not just a lot of time but also lots of memory, making them more resistant to massively parallel brute-force password cracking using e.g. GPUs or ASICs.
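
For example, scrypt is available in Python's standard library as `hashlib.scrypt` (it requires a Python build linked against OpenSSL 1.1 or later). The cost parameters below are commonly seen illustrative values, not a recommendation from this answer; with them scrypt uses about 16 MiB of memory.

```python
import hashlib
import secrets

salt = secrets.token_bytes(16)

# Illustrative cost parameters; raise n to make each guess more expensive.
key_material = hashlib.scrypt(
    b"correct horse battery staple",
    salt=salt,
    n=2**14,  # CPU/memory cost, must be a power of two
    r=8,      # block size
    p=1,      # parallelization factor
    dklen=32,
)

# Same split as with PBKDF2: two 128-bit keys.
enc_key, mac_key = key_material[:16], key_material[16:]
```
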
Requesting 256 bits of key material from PBKDF2-HMAC-SHA256, and splitting it into two 128-bit keys, is a perfectly good choice if you're sure that 128 + 128 bits is all you're going to need. If there's a chance that you might some day need more than that, however (say, because you want to switch to AES-256 for post-quantum security, or because you might need to derive more than one set of keys from the same passphrase), then you might want to consider re-deriving your AES and HMAC keys from the PBKDF2 output using another (non-iterated) KDF like HKDF-Expand.
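
A sketch of that approach, with HKDF-Expand (RFC 5869) written out using only `hmac` and `hashlib`. The `info` labels, salt, password, and iteration count are hypothetical placeholders; the point is that PBKDF2 runs once and each key is then expanded from its output under a distinct label.

```python
import hashlib
import hmac


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869) instantiated with HMAC-SHA256."""
    hash_len = hashlib.sha256().digest_size
    if length > 255 * hash_len:
        raise ValueError("requested output is too long for HKDF-Expand")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) || info || i), with T(0) empty
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# The (slow) PBKDF2 output serves as the pseudorandom key for the (fast) expansion.
master = hashlib.pbkdf2_hmac(
    "sha256", b"correct horse battery staple", b"example-salt-16b", 600_000, dklen=32
)
enc_key = hkdf_expand(master, b"example: encryption key", 32)
mac_key = hkdf_expand(master, b"example: authentication key", 32)
```

Adding another key later only takes another `hkdf_expand` call with a new label, without rerunning PBKDF2 or changing the existing keys.
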
If your ciphertexts can be long, first concatenating the IV and the ciphertext and then passing the result to HMAC might be needlessly inefficient. Most HMAC implementations allow you to feed the input in multiple chunks, so you don't have to do any explicit string concatenation. Of course, this is just a performance issue, not a security issue.

(Alternatively, if you're working in a low-level language like C, you might be able to allocate the memory for the IV and the ciphertext as one contiguous chunk, effectively letting you treat them as a single concatenated string without having to do any actual string copying.)
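
In Python, for instance, the `hmac` object can be fed incrementally with `update()`. The sketch below uses random bytes as a stand-in for the ciphertext and an arbitrary 256-byte chunk size; it produces the same tag as hashing the explicit concatenation.

```python
import hashlib
import hmac
import secrets

mac_key = secrets.token_bytes(16)
iv = secrets.token_bytes(16)
ciphertext = secrets.token_bytes(1024)  # stand-in for real AES-CBC output

# Feed the IV first, then the ciphertext in chunks, without building iv + ciphertext.
h = hmac.new(mac_key, digestmod=hashlib.sha256)
h.update(iv)
view = memoryview(ciphertext)  # slicing a memoryview does not copy the data
for offset in range(0, len(view), 256):
    h.update(view[offset:offset + 256])
streamed_tag = h.digest()

# For comparison: the tag over the explicit concatenation.
concatenated_tag = hmac.new(mac_key, iv + ciphertext, hashlib.sha256).digest()
```
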
As noted by hunter, you should use a constant-time string comparison routine for the MAC comparison. Otherwise, an attacker who could time your decryption code precisely enough might be able to discover the correct MAC for a forged message byte-by-byte, by guessing different MACs and timing how long it takes for your code to reject them to see how many initial bytes they've guessed correctly.

In practice, such timing attacks may or may not be actually feasible, depending on just how noisy the timing information available to the attacker is. But using a constant-time string comparison is an easy way to avoid that particular attack entirely.
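
To show the idea, here is a sketch of a comparison that always inspects every byte instead of returning at the first mismatch. The function `constant_time_equal` is written for illustration only: pure Python gives no hard timing guarantees, so in real code the standard library's `hmac.compare_digest` is the safer choice.

```python
import hmac


def constant_time_equal(a: bytes, b: bytes) -> bool:
    """Illustrative comparison that does not exit early on the first differing byte."""
    if len(a) != len(b):
        return False  # MAC lengths are public, so this reveals nothing useful
    diff = 0
    for x, y in zip(a, b):
        diff |= x ^ y  # accumulate differences across all bytes
    return diff == 0


stored_mac = bytes.fromhex("00112233445566778899aabbccddeeff")
good_mac = bytes.fromhex("00112233445566778899aabbccddeeff")
forged_mac = bytes.fromhex("00112233445566778899aabbccddee00")

same_result = constant_time_equal(stored_mac, good_mac)
forged_result = constant_time_equal(stored_mac, forged_mac)

# The library routine to prefer in practice:
library_result = hmac.compare_digest(stored_mac, good_mac)
```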
