429

Most of the time, when some data must be encrypted, it must also be protected with a MAC, because encryption protects only against passive attackers. There are some nifty encryption modes which include a MAC (EAX, GCM...) but let's assume that we are doing old-style crypto, so we have a standalone encryption method (e.g. AES with CBC chaining and PKCS#5 padding) and a standalone MAC (e.g. HMAC with SHA-256). How should we assemble the encryption and the MAC?

  • MAC-then-Encrypt: Compute the MAC on the cleartext, append it to the data, and then encrypt the whole? (That's what TLS does)
  • Encrypt-and-MAC: Compute the MAC on the cleartext, encrypt the cleartext, and then append the MAC at the end of the ciphertext? (That's what SSH does)
  • Encrypt-then-MAC: Encrypt the cleartext, then compute the MAC on the ciphertext, and append it to the ciphertext? (In that case, we do not forget to include the initialization vector (IV) and the encryption method identifier into the MACed data.)

The first two options are often called "MAC-then-encrypt" while the third is "encrypt-then-MAC". What are the arguments for or against either?
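
To make the three layouts concrete, here is a minimal sketch. It assumes a toy stream cipher built from HMAC-SHA256 as a stand-in for AES-CBC (it is not a vetted cipher), and the names `keystream_xor`, `tag`, and `ALG_ID` are hypothetical helpers, not part of any library.

```python
import hashlib
import hmac
import secrets

ALG_ID = b"\x01"  # hypothetical one-byte identifier for the toy cipher below

def keystream_xor(key, nonce, data):
    """Toy stream cipher (NOT AES): XOR with HMAC-SHA256(key, nonce || counter)."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hmac.new(key, nonce + (i // 32).to_bytes(8, "big"), hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[i:i + 32], pad))
    return bytes(out)

def tag(mac_key, data):
    """Standalone MAC: HMAC-SHA256."""
    return hmac.new(mac_key, data, hashlib.sha256).digest()

def mac_then_encrypt(enc_key, mac_key, nonce, plaintext):
    # MAC over the cleartext, appended, then everything is encrypted.
    return nonce + keystream_xor(enc_key, nonce, plaintext + tag(mac_key, plaintext))

def encrypt_and_mac(enc_key, mac_key, nonce, plaintext):
    # MAC over the cleartext, sent in the clear next to the ciphertext.
    return nonce + keystream_xor(enc_key, nonce, plaintext) + tag(mac_key, plaintext)

def encrypt_then_mac(enc_key, mac_key, nonce, plaintext):
    # MAC over algorithm identifier, nonce/IV and ciphertext.
    body = ALG_ID + nonce + keystream_xor(enc_key, nonce, plaintext)
    return body + tag(mac_key, body)

enc_key, mac_key = secrets.token_bytes(32), secrets.token_bytes(32)
nonce = secrets.token_bytes(16)
for build in (mac_then_encrypt, encrypt_and_mac, encrypt_then_mac):
    print(build.__name__, build(enc_key, mac_key, nonce, b"attack at dawn").hex())
```

Note that the encryption and MAC keys are kept separate in all three variants.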

Thomas Pornin

13 Answers

354

I'm assuming you actually know all of this better than I do. Anyway, this paper neatly summarizes all these approaches, and what level of security they do or don't provide. I shall paraphrase it in English, rather than Mathematical notation, as I understand it.

  • Encrypt-then-MAC:

    • Provides integrity of Ciphertext. Assuming the MAC shared secret has not been compromised, we ought to be able to deduce whether a given ciphertext is indeed authentic or has been forged; for example, in public-key cryptography anyone can send you messages. EtM ensures you only read valid messages.
    • Plaintext integrity.
    • If the cipher scheme is malleable we need not be so concerned since the MAC will filter out this invalid ciphertext.
    • The MAC does not provide any information on the plaintext since, assuming the output of the cipher appears random, so does the MAC. In other words, we haven't carried any structure from the plaintext into the MAC.
  • MAC-then-Encrypt:

    • Does not provide any integrity on the ciphertext, since we have no way of knowing until we decrypt the message whether it was indeed authentic or spoofed.
    • Plaintext integrity.
    • If the cipher scheme is malleable it may be possible to alter the message to appear valid and have a valid MAC. This is a theoretical point, of course, since practically speaking the MAC secret should provide protection.
    • Here, the MAC cannot provide any information on the plaintext either, since it is encrypted.
  • Encrypt-and-MAC:

    • No integrity on the ciphertext again, since the MAC is taken against the plaintext. This opens the door to some chosen-ciphertext attacks on the cipher, as shown in section 4 of Breaking and provably repairing the SSH authenticated encryption scheme: A case study of the Encode-then-Encrypt-and-MAC paradigm.
    • The integrity of the plaintext can be verified
    • If the cipher scheme is malleable, the contents of the ciphertext could well be altered, but on decryption, we ought to find the plaintext is invalid. Of course, any implementation error that can be exploited in the decryption process has been by that point.
    • May reveal information about the plaintext in the MAC. Theoretical, of course, but a less than ideal scenario. This occurs if the plaintext messages are repeated, and the MACed data does not include a counter (it does in the SSH 2 protocol, but only as a 32-bit counter, so you should take care to re-key before it overflows).

In short, Encrypt-then-MAC is the best choice. Any modification to the ciphertext that does not carry a valid MAC can be filtered out before decryption, which protects against attacks on the decryption implementation. Nor can the MAC be used to infer anything about the plaintext. MAC-then-Encrypt and Encrypt-and-MAC each provide some of these properties, but not the complete set provided by Encrypt-then-MAC.
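
The "filter before decryption" step can be sketched as follows. This is an illustration under stated assumptions, not a production scheme: the cipher is a toy HMAC-based keystream standing in for a real one, and `seal` / `open_msg` are hypothetical names.

```python
import hashlib
import hmac
import secrets

NONCE_LEN, TAG_LEN = 16, 32

def keystream_xor(key, nonce, data):
    """Toy stream cipher (NOT AES): XOR with HMAC-SHA256(key, nonce || counter)."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hmac.new(key, nonce + (i // 32).to_bytes(8, "big"), hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[i:i + 32], pad))
    return bytes(out)

def seal(enc_key, mac_key, plaintext):
    nonce = secrets.token_bytes(NONCE_LEN)
    body = nonce + keystream_xor(enc_key, nonce, plaintext)
    # The MAC covers the nonce and the ciphertext.
    return body + hmac.new(mac_key, body, hashlib.sha256).digest()

def open_msg(enc_key, mac_key, message):
    if len(message) < NONCE_LEN + TAG_LEN:
        raise ValueError("message too short")
    body, tag = message[:-TAG_LEN], message[-TAG_LEN:]
    expected = hmac.new(mac_key, body, hashlib.sha256).digest()
    # Constant-time comparison; the decryption code never sees a forged message.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad MAC")
    return keystream_xor(enc_key, body[:NONCE_LEN], body[NONCE_LEN:])
```

A receiver would call `open_msg(enc_key, mac_key, message)` and treat any `ValueError` as a rejected message, without reporting which check failed.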

j02
159

Some additional details to the accepted answer.

Encrypt-then-MAC is the mode which is recommended by most researchers. Mostly, it makes it easier to prove the security of the encryption part (because thanks to the MAC, a decryption engine cannot be fed with invalid ciphertexts; this yields automatic protection against chosen ciphertext attacks) and also avoids any trouble to confidentiality from the MAC (since the MAC operates on the encrypted text, it cannot reveal anything about the plaintext, regardless of its quality). Note that the padding oracle attacks, which have been applied in the field to ASP.NET, are chosen ciphertext attacks.

Ferguson and Schneier, in their book Practical Cryptography, have argued the opposite: that MAC-then-encrypt (or MAC-and-encrypt) is the "natural" order and that encrypt-then-MAC is overly complex. The sore point of encrypt-then-MAC is that you have to be careful about what you MAC: you must not forget the initialization vector, or (in case the protocol allows algorithm flexibility) the unambiguous identifier for the encryption algorithm; otherwise, the attacker could change either, inducing a plaintext alteration which would be undetected by the MAC. To prove their point, Ferguson and Schneier describe an attack over an instance of IPsec in which the encrypt-then-MAC was not done properly.

So while encrypt-then-MAC is theoretically better, it is also somewhat harder to get right.
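
A small sketch of the pitfall, assuming a toy HMAC-based stream cipher in place of a real one (all function names here are hypothetical). When the MAC skips the nonce/IV, an attacker can change it and the receiver accepts an altered plaintext. With this toy cipher the whole plaintext turns to garbage; with real CBC, flipping IV bits would flip the same bits in the first plaintext block.

```python
import hashlib
import hmac
import secrets

def keystream_xor(key, nonce, data):
    """Toy stream cipher (NOT AES): XOR with HMAC-SHA256(key, nonce || counter)."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hmac.new(key, nonce + (i // 32).to_bytes(8, "big"), hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[i:i + 32], pad))
    return bytes(out)

def seal(enc_key, mac_key, plaintext, mac_covers_nonce):
    nonce = secrets.token_bytes(16)
    ct = keystream_xor(enc_key, nonce, plaintext)
    maced = nonce + ct if mac_covers_nonce else ct  # False reproduces the mistake
    return nonce + ct + hmac.new(mac_key, maced, hashlib.sha256).digest()

def open_msg(enc_key, mac_key, message, mac_covers_nonce):
    nonce, ct, tag = message[:16], message[16:-32], message[-32:]
    maced = nonce + ct if mac_covers_nonce else ct
    expected = hmac.new(mac_key, maced, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("bad MAC")
    return keystream_xor(enc_key, nonce, ct)

enc_key, mac_key = secrets.token_bytes(32), secrets.token_bytes(32)
original = b"pay 100 to alice"

# Attacker flips one bit of the nonce; the MAC over the ciphertext alone still verifies.
msg = seal(enc_key, mac_key, original, mac_covers_nonce=False)
forged = bytes([msg[0] ^ 1]) + msg[1:]
print(open_msg(enc_key, mac_key, forged, mac_covers_nonce=False) != original)

# With the nonce inside the MACed data, the same forgery is rejected.
msg = seal(enc_key, mac_key, original, mac_covers_nonce=True)
forged = bytes([msg[0] ^ 1]) + msg[1:]
try:
    open_msg(enc_key, mac_key, forged, mac_covers_nonce=True)
except ValueError as exc:
    print("rejected:", exc)
```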

Maarten Bodewes
Thomas Pornin
57

Hugo Krawczyk has a paper titled The Order of Encryption and Authentication for Protecting Communications (or: How Secure Is SSL?). It identifies 3 types of combining authentication (MAC) with encryption:

  1. Encrypt then Authenticate (EtA) used in IPsec;
  2. Authenticate then Encrypt (AtE) used in SSL;
  3. Encrypt and Authenticate (E&A) used in SSH.

It proves that EtA is the secure way to compose the two, and that both AtE and E&A are subject to attacks unless the encryption method is either CBC mode or a stream cipher.

The abstract says everything:

We study the question of how to generically compose symmetric encryption and authentication when building “secure channels” for the protection of communications over insecure networks. We show that any secure channels protocol designed to work with any combination of secure encryption (against chosen plaintext attacks) and secure MAC must use the encrypt-then-authenticate method. We demonstrate this by showing that the other common methods of composing encryption and authentication, including the authenticate-then-encrypt method used in SSL, are not generically secure. We show an example of an encryption function that provides (Shannon’s) perfect secrecy but when combined with any MAC function under the authenticate-then-encrypt method yields a totally insecure protocol (for example, finding passwords or credit card numbers transmitted under the protection of such protocol becomes an easy task for an active attacker). The same applies to the encrypt-and-authenticate method used in SSH.

On the positive side we show that the authenticate-then-encrypt method is secure if the encryption method in use is either CBC mode (with an underlying secure block cipher) or a stream cipher (that xor the data with a random or pseudorandom pad). Thus, while we show the generic security of SSL to be broken, the current practical implementations of the protocol that use the above modes of encryption are safe.

Sadeq Dousti
51

Although there are already many answers here, I wanted to strongly advocate AGAINST MAC-then-encrypt. I fully agree with Thomas' first half of the answer, but completely disagree with the second half. The ciphertext is the ENTIRE ciphertext (including IV etc.), and this is what must be MACed. This is granted.

However, if you MAC-then-encrypt in the straightforward way, then you are completely vulnerable to padding-oracle attacks. By the "straightforward way", I mean that you call the "decrypt" function and afterwards the "MAC verify" function. If you get an error in the decrypt function, you return it straight away as a padding error. You have now just got a full-blown padding oracle attack and you are dead. You can of course hack the API so that it gives a single error message only, but then the time it takes to return the error also has to be the same, whether it's a MAC error or a padding error. If you think that this is easy, then look at the Lucky13 attack on SSL. It's really, really hard (and much harder than just MACing all of the ciphertext).
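
Here is a rough, self-contained sketch of that oracle. Everything in it is an assumption made for illustration: the block cipher is a toy 4-round Feistel network built from HMAC-SHA256 (a stand-in for AES, not a vetted cipher), and all function and exception names are hypothetical. The attacker never learns a key; the only signal is which of the two errors comes back.

```python
import hashlib
import hmac
import secrets

BLOCK = 16

class PaddingError(Exception): pass
class MacError(Exception): pass

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def f(key, i, half):  # Feistel round function
    return hmac.new(key, bytes([i]) + half, hashlib.sha256).digest()[:8]

def block_encrypt(key, block):
    left, right = block[:8], block[8:]
    for i in range(4):
        left, right = right, xor(left, f(key, i, right))
    return left + right

def block_decrypt(key, block):
    left, right = block[:8], block[8:]
    for i in reversed(range(4)):
        left, right = xor(right, f(key, i, left)), left
    return left + right

def cbc_encrypt(key, iv, data):
    out, prev = b"", iv
    for i in range(0, len(data), BLOCK):
        prev = block_encrypt(key, xor(data[i:i + BLOCK], prev))
        out += prev
    return out

def cbc_decrypt(key, iv, data):
    out, prev = b"", iv
    for i in range(0, len(data), BLOCK):
        out += xor(block_decrypt(key, data[i:i + BLOCK]), prev)
        prev = data[i:i + BLOCK]
    return out

def mte_encrypt(enc_key, mac_key, plaintext):
    data = plaintext + hmac.new(mac_key, plaintext, hashlib.sha256).digest()
    n = BLOCK - len(data) % BLOCK  # PKCS#7-style padding
    iv = secrets.token_bytes(BLOCK)
    return iv, cbc_encrypt(enc_key, iv, data + bytes([n]) * n)

def mte_decrypt(enc_key, mac_key, iv, ciphertext):
    """The 'straightforward way': decrypt, unpad, then verify the MAC."""
    data = cbc_decrypt(enc_key, iv, ciphertext)
    n = data[-1]
    if not 1 <= n <= BLOCK or data[-n:] != bytes([n]) * n:
        raise PaddingError("bad padding")  # first distinguishable failure
    data = data[:-n]
    plaintext, tag = data[:-32], data[-32:]
    expected = hmac.new(mac_key, plaintext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise MacError("bad MAC")  # second distinguishable failure
    return plaintext

def recover_last_byte(oracle, prev_block, target_block):
    """Guess the last plaintext byte of target_block from the error type alone."""
    for guess in range(256):
        forged = prev_block[:-1] + bytes([prev_block[-1] ^ guess ^ 0x01])
        try:
            oracle(forged, target_block)
        except PaddingError:
            continue  # padding invalid: wrong guess
        except MacError:
            pass  # padding valid, so the last byte decrypted to 0x01
        return guess
    return None

enc_key, mac_key = secrets.token_bytes(32), secrets.token_bytes(32)
iv, ct = mte_encrypt(enc_key, mac_key, b"PIN=4921;user=alice")

def oracle(iv_, ct_):
    return mte_decrypt(enc_key, mac_key, iv_, ct_)

print(chr(recover_last_byte(oracle, iv, ct[:BLOCK])))  # prints "l"
```

The sketch takes a shortcut: it relies on the target block being printable text, so only a final byte of 0x01 can form valid padding. A real attack would also perturb the preceding byte to rule out longer paddings, then repeat byte by byte.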

The argument by Schneier and Ferguson for MAC-then-encrypt has no formal basis at all. The definition of authenticated-encryption is met by encrypt-then-MAC and is NOT met by MAC-then-encrypt. Furthermore, most implementations of MAC-then-encrypt are actually completely vulnerable to padding oracle attacks and so are actually broken in practice. Don't do this!

Having said all of the above, my recommendation is to not use any of this. You should be using GCM or CCM today (GCM is much faster, so use it as long as you are sure that your IV won't repeat). A combined authenticated-encryption scheme, with a single API call, and now you won't get in trouble.

Yehuda Lindell
25

Moxie Marlinspike calls it in his article https://moxie.org/2011/12/13/the-cryptographic-doom-principle.html the doom principle:

if you have to perform any cryptographic operation before verifying the MAC on a message you’ve received, it will somehow inevitably lead to doom.

He also demonstrates two attacks which are possible because of trying to decrypt before checking the MAC.

To summarize: "Encrypt Then Authenticate" is the way to go.

Mouk
13

I think Encrypt-then-MAC does not deliver Plaintext integrity, but only ciphertext integrity. If the MAC over the ciphertext is OK but then we use the wrong key to decrypt (for whatever reason), then the recipient receives a plaintext that the sender did not send and did not vouch for. If this can happen, this is a violation of plaintext integrity.

So, Encrypt-then-MAC is only secure if you can somehow be sure that decryption won't use the wrong key, and that any other processing/decoding done to the ciphertext after checking the MAC is completely correct. This is a somewhat fragile aspect of Encrypt-then-MAC, and one reason why Ferguson and Schneier advocate against Encrypt-then-MAC.

D.W.
Josef Schuler
9

The really important thing is: do not use encrypt-and-MAC. The other two, you can debate, but both are at least theoretically sound -- one might just practically be better than the other. Encrypt-and-MAC falls apart for a very simple reason, though: the MAC is not meant to keep the plaintext secret.

The MAC is based on the plaintext. Authentication is not designed to obscure the plaintext. A MAC, therefore, provides some information about the plaintext used to make it.

The not-quite-appropriate-but-easy-to-understand example is a checksum. If we have a nine digit number plaintext and a one digit checksum, and ship it with the first nine digits encrypted but the checksum not, the checksum is going to help me learn things about the first nine digits of plaintext. If I can somehow find out eight of the nine digits, I can use the checksum to find out what the last digit is. There might be a lot of other things I can do with that checksum that ruin the integrity of the first nine digits.
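
The checksum example can be worked through in a few lines. The particular checksum (sum of digits modulo 10) and the helper names are assumptions chosen for illustration; the answer does not specify one.

```python
def check_digit(digits):
    """Hypothetical one-digit checksum: sum of the digits modulo 10."""
    return sum(digits) % 10

def recover_missing(digits_with_gap, checksum):
    """Given eight known digits (None marks the unknown one), solve for the ninth."""
    known_sum = sum(d for d in digits_with_gap if d is not None)
    return (checksum - known_sum) % 10

secret = [3, 1, 4, 1, 5, 9, 2, 6, 5]   # the nine encrypted digits
checksum = check_digit(secret)         # shipped unencrypted

leaked = secret.copy()
leaked[4] = None                       # attacker knows all but one digit
print(recover_missing(leaked, checksum))  # prints 5
```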

So, as a recap: do not use encrypt-and-mac. Otherwise, whatever, you're good.

Daniel
7

There is no property of a MAC that states that information about the input should not be leaked. As such, you should encrypt the message first, then apply a MAC. This way, even if the MAC leaks information, all that is leaked is ciphertext.
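
One concrete leak is easy to show: a deterministic MAC over the plaintext reveals when the same message is sent twice, even though the ciphertexts differ. The sketch below assumes a toy HMAC-based stream cipher in place of a real one, and its function names are hypothetical.

```python
import hashlib
import hmac
import secrets

def keystream_xor(key, nonce, data):
    """Toy stream cipher (NOT AES): XOR with HMAC-SHA256(key, nonce || counter)."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hmac.new(key, nonce + (i // 32).to_bytes(8, "big"), hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[i:i + 32], pad))
    return bytes(out)

def encrypt_and_mac(enc_key, mac_key, plaintext):
    nonce = secrets.token_bytes(16)
    body = nonce + keystream_xor(enc_key, nonce, plaintext)
    return body, hmac.new(mac_key, plaintext, hashlib.sha256).digest()  # MAC of plaintext

def encrypt_then_mac(enc_key, mac_key, plaintext):
    nonce = secrets.token_bytes(16)
    body = nonce + keystream_xor(enc_key, nonce, plaintext)
    return body, hmac.new(mac_key, body, hashlib.sha256).digest()  # MAC of ciphertext

enc_key, mac_key = secrets.token_bytes(32), secrets.token_bytes(32)

body1, tag1 = encrypt_and_mac(enc_key, mac_key, b"retreat")
body2, tag2 = encrypt_and_mac(enc_key, mac_key, b"retreat")
print(body1 != body2, tag1 == tag2)   # fresh nonces hide the repeat, the tag reveals it

body3, tag3 = encrypt_then_mac(enc_key, mac_key, b"retreat")
body4, tag4 = encrypt_then_mac(enc_key, mac_key, b"retreat")
print(tag3 != tag4)                   # the tag depends only on the randomized ciphertext
```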

foobarfuzzbizz
7

Besides the security benefits of encrypt-then-MAC that many other answers have mentioned, there's a performance benefit. Checking the MAC first on the receiving end allows you to reject forged messages without doing the work to decrypt them. Bernstein mentions this in http://cr.yp.to/snuffle/design.pdf (in the section "Should the stream be independent of the plaintext?").

Jack O'Connor
4

If you look at the paper "Tweakable Block Ciphers" by Moses Liskov, Ronald L. Rivest, and David Wagner published in Advances in Cryptology - Crypto 2002, Proceedings, 2442, section 4.3 Tweakable Authenticated Encryption (TAE), the MAC is computed over the plaintext, appended to the plaintext, and encrypted along with the plaintext. They then supply a proof of their Theorem 3 "If E is a secure tweakable block cipher, the E used in TAE mode will be unforgeable and pseudorandom".

TomS
2

In order to provide message integrity, a hash or message authentication function (MAC) is used. Sometimes, encryption and integrity are used together as:

  1. Encrypt-then-MAC: provides ciphertext integrity, but no plaintext integrity,
  2. MAC-then-encrypt: provides plaintext integrity, but no ciphertext integrity, and
  3. Encrypt-and-MAC: provides plaintext integrity, but no ciphertext integrity

Encrypt-then-MAC is the most secure of the three, as any change to the ciphertext can be detected by verifying the MAC before decryption, which protects the messages against modification attacks. However, a combined mode is preferred for its security strength: for example Galois/Counter Mode (GCM), which combines counter-mode encryption with Galois-field authentication, or Counter with CBC-MAC (CCM), which combines counter-mode encryption with CBC-MAC.

Squeamish Ossifrage
user24094
1

Reading all of this leads me to think that the best solution would be:

MAC-then-Encrypt-then-MAC

This brings guarantees on both the plaintext and the ciphertext.

I fully agree that both are important:

  • MAC-then-Encrypt if your plaintext is not structured and does not permit confirming its integrity without a MAC
  • Encrypt-then-MAC for the reasons provided in other answers, especially to avoid decrypting bad data
lalebarde
-1

In many applications, only part of the data (m) is encrypted, and some so-called Additional Authenticated Data (AAD, usually some header data including nonce) a is only authenticated but not encrypted.

Here is my argument: when AAD is used, Authentication-then-Encryption provides an additional layer of protection for the AAD compared with Encryption-then-Authentication, so one may argue it could be more secure in certain usages.

When AAD a is used, if we use Encryption-then-Authentication, we will get:

E(m) + A(a + E(m))

as the scheme, which means we encrypt m first, then concatenate the result with a, and then authenticate (MAC) the concatenation. Notice how a is protected by only one layer of cryptographic operation, the MAC operation A.

And if we use Authentication-then-Encryption, we will get

E(m + A(a+m))

which means we first authenticate the concatenation of a and m, then concatenate the resulting MAC with m, and then encrypt the result. Notice that a is effectively protected by two layers of cryptographic operations, both A and E.
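
The two formulas can be sketched side by side. This is only an illustration under several assumptions: E is a toy HMAC-based stream cipher rather than a real one, the first 16 bytes of the header a are taken to be the nonce, and the `+` is implemented with a hypothetical length-prefixing helper `frame` so that the concatenation is unambiguous. In both variants a itself travels in the clear alongside the sealed message.

```python
import hashlib
import hmac
import secrets

NONCE_LEN, TAG_LEN = 16, 32

def keystream_xor(key, nonce, data):
    """Toy stream cipher (NOT AES): XOR with HMAC-SHA256(key, nonce || counter)."""
    out = bytearray()
    for i in range(0, len(data), 32):
        pad = hmac.new(key, nonce + (i // 32).to_bytes(8, "big"), hashlib.sha256).digest()
        out.extend(b ^ p for b, p in zip(data[i:i + 32], pad))
    return bytes(out)

def frame(*parts):
    """Length-prefix every part (an assumption; the formulas just write '+')."""
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)

def tag(mac_key, data):
    return hmac.new(mac_key, data, hashlib.sha256).digest()

def eta_seal(enc_key, mac_key, a, m):
    """Encryption-then-Authentication: E(m) + A(a + E(m))."""
    c = keystream_xor(enc_key, a[:NONCE_LEN], m)
    return c + tag(mac_key, frame(a, c))

def eta_open(enc_key, mac_key, a, sealed):
    c, t = sealed[:-TAG_LEN], sealed[-TAG_LEN:]
    if not hmac.compare_digest(t, tag(mac_key, frame(a, c))):
        raise ValueError("bad MAC")
    return keystream_xor(enc_key, a[:NONCE_LEN], c)

def ate_seal(enc_key, mac_key, a, m):
    """Authentication-then-Encryption: E(m + A(a + m))."""
    return keystream_xor(enc_key, a[:NONCE_LEN], m + tag(mac_key, frame(a, m)))

def ate_open(enc_key, mac_key, a, sealed):
    data = keystream_xor(enc_key, a[:NONCE_LEN], sealed)
    m, t = data[:-TAG_LEN], data[-TAG_LEN:]
    if not hmac.compare_digest(t, tag(mac_key, frame(a, m))):
        raise ValueError("bad MAC")
    return m

enc_key, mac_key = secrets.token_bytes(32), secrets.token_bytes(32)
header = secrets.token_bytes(NONCE_LEN) + b"to=bob"   # nonce plus cleartext header
print(eta_open(enc_key, mac_key, header, eta_seal(enc_key, mac_key, header, b"hi")))
print(ate_open(enc_key, mac_key, header, ate_seal(enc_key, mac_key, header, b"hi")))
```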

Now suppose the authentication method is somehow broken and the encryption is not, which is not that far-fetched since some MAC algorithms (like HMAC-MD5) have indeed been found weak. Then a will be fully exposed to tampering when using Encryption-then-Authentication. The same cannot be said for Authentication-then-Encryption.

Update on 2016-09-27:

I agree with some of the top comments that applying a cipher multiple times doesn't always lead to better security, so I retracted that statement. However, that is not relevant to my main point that AtE provides an additional layer of security, since we are not applying the same cipher to the same data twice in these A/E schemes.

Squeamish Ossifrage
Penghe Geng