154

Frequently, we want to send messages that are (a) encrypted, so passive attackers can't discover the plaintext of the message, and (b) signed with a public-key digital signature (made with the sender's private key), so active attackers can't make Alice think that a message came from Bob when it didn't.

Is it better to

  1. generate the digital signature from the (hashed) plaintext, and then encrypt a file containing both the plaintext message and the digital signature?
  2. encrypt the message first, and then generate a digital signature from the (hashed) encrypted file?
  3. combine encryption and public-key digital signatures in some other way?

A closely related earlier question (Should we MAC-then-encrypt or encrypt-then-MAC?) seems to focus on symmetric-key MAC authentication. As Robert I. Jr. asked earlier, do the same issues with (symmetric key) MAC-then-encrypt apply to (public key) sign-then-encrypt?

avpaderno
David Cary

7 Answers

117

Assuming you are asking about public-key signatures + public-key encryption:

Short answer: I recommend sign-then-encrypt, but prepend the recipient's name to the message first.

Long answer: When Alice wants to send an authenticated message to Bob, she should sign and encrypt the message. In particular, she prepends Bob's name to the message, signs this using her private key, appends her signature to the message, encrypts the whole thing under Bob's public key, and sends the resulting ciphertext to Bob. Bob can decrypt, verify the signature, check that the name inside is his own, and confirm that this indeed came from Alice (or someone she shared her private key with). Make sure you use an IND-CCA2-secure public-key encryption scheme and a UF-CMA-secure public-key signature scheme (i.e., one that is secure against existential forgery under chosen-message attack).
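To make the framing concrete, here is a minimal sketch of the steps above. It deliberately contains no real cryptography: `sign`, `verify`, `encrypt` and `decrypt` are hypothetical callables that you would back with whatever IND-CCA2 encryption and UF-CMA signature library you use, and the length-prefixed packing is my own assumption, chosen only so the name, message and signature parse unambiguously.

```python
import struct
from typing import Callable, Tuple


def _pack(*fields: bytes) -> bytes:
    """Length-prefix each field so the concatenation parses unambiguously."""
    return b"".join(struct.pack(">I", len(f)) + f for f in fields)


def _unpack(blob: bytes, count: int) -> Tuple[bytes, ...]:
    """Inverse of _pack for exactly `count` fields."""
    fields, offset = [], 0
    for _ in range(count):
        (length,) = struct.unpack_from(">I", blob, offset)
        offset += 4
        if offset + length > len(blob):
            raise ValueError("truncated field")
        fields.append(blob[offset:offset + length])
        offset += length
    if offset != len(blob):
        raise ValueError("trailing bytes")
    return tuple(fields)


def sign_then_encrypt(
    recipient: bytes,
    message: bytes,
    sign: Callable[[bytes], bytes],      # hypothetical: signs with the sender's private key
    encrypt: Callable[[bytes], bytes],   # hypothetical: encrypts under the recipient's public key
) -> bytes:
    signed_part = _pack(recipient, message)   # recipient's name is prepended
    signature = sign(signed_part)             # signature covers name + message
    return encrypt(_pack(signed_part, signature))


def decrypt_then_verify(
    my_name: bytes,
    ciphertext: bytes,
    decrypt: Callable[[bytes], bytes],          # hypothetical: uses the recipient's private key
    verify: Callable[[bytes, bytes], bool],     # hypothetical: checks the sender's signature
) -> bytes:
    signed_part, signature = _unpack(decrypt(ciphertext), 2)
    if not verify(signed_part, signature):
        raise ValueError("bad signature")
    recipient, message = _unpack(signed_part, 2)
    if recipient != my_name:
        raise ValueError("message was signed for someone else")
    return message
```

The check on `recipient` is the whole point of prepending the name: a validly signed message that was re-encrypted to a different person is rejected.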

Justification: The reason to do this is to defeat some subtle attacks. These attacks are not necessarily a problem in all scenarios, but it's best to harden the approach as much as possible. A complete explanation would take more space than is available here, but see below for a sketch of the reasoning.

For a detailed analysis about whether to sign first or encrypt first, the following is a good resource: Defective Sign & Encrypt in S/MIME, PKCS#7, MOSS, PEM, PGP, and XML.

I don't recommend encrypt-then-sign. It could work, but it has some subtle pitfalls in some contexts, because the signature does not prove that the sender was aware of the content of the plaintext. For instance, suppose Alice's SSH client sends the message "Dear SSH server, please append my public key to /root/.ssh/authorized_keys -- and you can know that I am authorized because I know the root password is lk23jas0" (encrypted, then signed with Alice's private key), and the SSH server acts on it if the root password is correct. Then Eve can eavesdrop, capture this message, strip off Alice's signature, sign the ciphertext with Eve's own key, and send it to the SSH server, obtaining root-level access even though Eve didn't know the root password.
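The strip-and-re-sign step can be sketched in a few lines. Everything here is an assumption made for illustration: the signature scheme is toy textbook RSA over hard-coded Mersenne primes (completely insecure), the ciphertext is an opaque placeholder, and `server_accepts` is a hypothetical check that only looks at the outer signature.

```python
import hashlib

# TOY textbook RSA with hard-coded Mersenne primes: insecure, illustration only.
E = 65537


def make_key(p: int, q: int):
    """Return (modulus, private exponent) for the toy scheme."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # modular inverse, needs Python 3.8+
    return n, d


def sign(data: bytes, n: int, d: int) -> int:
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
    return pow(h, d, n)


def verify(data: bytes, sig: int, n: int) -> bool:
    h = int.from_bytes(hashlib.sha256(data).digest(), "big") % n
    return pow(sig, E, n) == h


alice_n, alice_d = make_key(2**89 - 1, 2**107 - 1)
eve_n, eve_d = make_key(2**61 - 1, 2**127 - 1)
public_keys = {"alice": alice_n, "eve": eve_n}

# Placeholder for Alice's request encrypted to the server; Eve cannot read it.
ciphertext = b"<opaque ciphertext of Alice's request>"

# Encrypt-then-sign: the signature covers only the ciphertext.
alice_packet = (ciphertext, sign(ciphertext, alice_n, alice_d), "alice")

# Eve drops Alice's signature and signs the very same ciphertext herself.
captured_ciphertext = alice_packet[0]
eve_packet = (captured_ciphertext, sign(captured_ciphertext, eve_n, eve_d), "eve")


def server_accepts(packet) -> bool:
    """Hypothetical server check: is the outer signature valid for the claimed sender?"""
    ct, sig, claimed_sender = packet
    return verify(ct, sig, public_keys[claimed_sender])


print(server_accepts(alice_packet), server_accepts(eve_packet))
```

Both packets pass the signature check, so the server has no way to tell that Eve never knew the plaintext it is about to decrypt and act on.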

D.W.
27

Should we sign-then-encrypt, or encrypt-then-sign? ... Do the same issues with (symmetric-key) MAC-then-encrypt apply to (public-key) sign-then-encrypt?

Yes. From a security engineering standpoint, you are consuming unauthenticated data during decryption if you MAC-then-encrypt or sign-then-encrypt. A very relevant paper is Krawczyk's The Order of Encryption and Authentication for Protecting Communications.

The order may (or may not) be problematic in practice for you. But as the SSL/TLS folks have repeatedly shown, it's problematic in practice.

Another important paper is cited by D.W. and Sashank: Don Davis' Defective Sign & Encrypt in S/MIME, PKCS#7, MOSS, PEM, PGP, and XML.

I think the primitive of sign vs mac is less important. With all things being equal (like security levels, key management and binding), then one of the top criteria is efficiency. Obviously, a symmetric cipher is more efficient than an asymmetric cipher.


Data authentication is a different property than entity authentication. You can use a MAC for data authentication and a signature for entity authentication.

But it's not entirely clear to me if you want data authentication or entity authentication. The security goal you state in (b) calls for data authentication (a MAC), and not entity authentication (a signature).

I think that's why CodesInChaos said he signs then performs authenticated encryption. That's another way to say he signs-then-encrypts-then-macs. If the MAC is good, then he decrypts and verifies the signature to verify who sent the message. If the MAC is bad, then he does not bother decrypting - he just returns FAIL.
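A minimal sketch of that receive path, assuming HMAC-SHA256 as the MAC and hypothetical `decrypt` and `verify_signature` callables (they stand in for whatever cipher and signature scheme are actually used):

```python
import hashlib
import hmac
from typing import Callable, Optional, Tuple


def receive(
    mac_key: bytes,
    ciphertext: bytes,
    tag: bytes,
    decrypt: Callable[[bytes], Tuple[bytes, bytes]],   # hypothetical: returns (message, signature)
    verify_signature: Callable[[bytes, bytes], bool],  # hypothetical: checks the sender's signature
) -> Optional[bytes]:
    """Return the message, or None (FAIL) if any check does not pass."""
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return None  # bad MAC: do not bother decrypting
    message, signature = decrypt(ciphertext)
    if not verify_signature(message, signature):
        return None  # good MAC but the signer is not who we expected
    return message
```

The ordering is the design choice: unauthenticated bytes never reach the decryption routine, and the signature is only consulted once the ciphertext is known to be intact.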

If you look at the link provided by Sashank, CodesInChaos' fix is effectively Sign/Encrypt/Sign from Section 5.2 of the paper. And D.W.'s solution is effectively Naming Repairs from Section 5.1.


There's a third option that's not readily apparent. It combines Encrypt-Then-MAC for bulk encryption with public key cryptography. It's also IND-CCA2, as D.W. suggested you strive for.

The option is an Integrated Encryption Scheme. There are two of them that I am aware of. The first is Shoup's ECIES, which operates over elliptic curves; the second is Abdalla, Bellare and Rogaway's DLIES, which operates over integers. Crypto++ provides both ECIES and DLIES. Bouncy Castle provides ECIES.

ECIES and DLIES combine a Key Encapsulation Mechanism (KEM) with a Data Encapsulation Mechanism (DEM). The system independently derives a symmetric cipher key and a MAC key from a common secret. Data is first encrypted under a symmetric cipher, and then the cipher text is MAC'd under an authentication scheme. Finally, the common secret is encrypted under the public part of a public/private key pair. The output of the encryption function is the tuple {K,C,T}, where K is the encrypted common secret, C is the ciphertext, and T is the authentication tag.

There's some hand waving around the use of a symmetric cipher. The schemes use a stream cipher that XORs the plaintext with the output of the KDF. The design choice here was to avoid a block cipher with padding. You could use a block cipher in a streaming mode, like AES/CTR, to the same effect.

There is some hand waving around the "common secret", since it's actually the result of applying a Key Agreement function and then digesting the shared secret with a KDF. The Key Agreement function uses the recipient's static public key and an ephemeral key pair. The ephemeral key pair is created by the person doing the encryption. The person performing the decryption uses their private key, together with the sender's ephemeral public key, to perform the other half of the key exchange and arrive at the same "common secret".

The KEM and the DEM avoid padding, so padding oracles are not a concern. That's why a KDF is used to digest the large "common secret" under the KEM. Omitting the padding vastly simplifies the security proofs of the system.
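Here is a toy sketch of that data flow in the DLIES style (integers rather than curves), using only the Python standard library. The group (integers modulo the Mersenne prime 2**127 - 1 with base 3), the counter-mode SHA-256 KDF and the key layout are all my assumptions for illustration; this is not the ECIES or DLIES specification and is far too weak for real use.

```python
import hashlib
import hmac
import secrets

# Toy group parameters (assumed, insecure): a 127-bit Mersenne prime and base 3.
P = 2**127 - 1
G = 3


def kdf(shared: int, length: int) -> bytes:
    """Digest the shared secret into `length` bytes (counter-mode SHA-256)."""
    seed = shared.to_bytes(16, "big")
    out, counter = b"", 0
    while len(out) < length:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def encrypt(recipient_pub: int, plaintext: bytes):
    """Return the tuple (K, C, T) described above."""
    eph_priv = secrets.randbelow(P - 3) + 2        # ephemeral key pair
    eph_pub = pow(G, eph_priv, P)                  # K, sent alongside the ciphertext
    shared = pow(recipient_pub, eph_priv, P)       # key agreement
    keys = kdf(shared, len(plaintext) + 32)
    stream, mac_key = keys[:len(plaintext)], keys[len(plaintext):]
    c = bytes(a ^ b for a, b in zip(plaintext, stream))   # XOR with KDF output, no padding
    t = hmac.new(mac_key, c, hashlib.sha256).digest()     # MAC over the ciphertext
    return eph_pub, c, t


def decrypt(recipient_priv: int, eph_pub: int, c: bytes, t: bytes) -> bytes:
    shared = pow(eph_pub, recipient_priv, P)       # other half of the key agreement
    keys = kdf(shared, len(c) + 32)
    stream, mac_key = keys[:len(c)], keys[len(c):]
    expected = hmac.new(mac_key, c, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, t):       # check the tag before decrypting
        raise ValueError("authentication tag mismatch")
    return bytes(a ^ b for a, b in zip(c, stream))
```

A round trip would generate `priv = secrets.randbelow(P - 3) + 2` and `pub = pow(G, priv, P)` for the recipient, then call `decrypt(priv, *encrypt(pub, b"attack at dawn"))`.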

18

It depends on your requirements:

  • If you sign-then-encrypt, then only the receiver can decrypt and then
    verify.
  • If you encrypt-then-sign, then anybody can verify the
    authenticity, and only the receiver can decrypt it.

In practice, though, neither is enough on its own; ideally we have to sign-encrypt-sign. I am not able to recall the paper that discusses this.

There is one more paper that is popular and discusses this issue in general.

sashank
4

Should we (a) sign-then-encrypt, (b) encrypt-then-sign[, or (c) do something else]?

The answer is (c): do something else. Specifically, it is safest to use authenticated encryption with associated data (AEAD) built in encrypt-then-MAC mode, and also to include the associated data in the hash that gets signed (signAD), whether the signature covers the plaintext (which is encrypted afterward) or the ciphertext. Both procedures, signAD-then-AEAD and AEAD-then-signAD, rely on AEAD, and on associated data (AD) generally, to mitigate security issues by committing to contextual values (which can be secret or public) and binding them together to jointly achieve authenticated encryption, authentication of associated data, and entity authentication.

Do the same issues with (symmetric-key) MAC-then-encrypt apply to (public-key) sign-then-encrypt?

Yes, both symmetric and asymmetric versions of authenticate-then-encrypt lead to similar security failures. The general problem stems from trusting unverified plaintexts(0). In this mode, since the authentication tag is itself part of the plaintext that it needs to be checking the authenticity and integrity of, it can be toyed with if the ciphertext can be toyed with.

Justification for answer (c):

Even though using AEAD to canonicalize the context of a communication channel (the sender, recipient, time, purpose, transcript summary, etc...) mitigates many of the issues that can arise in hybrid schemes, like surreptitious forwarding of plaintexts & swappable signatures on ciphertexts, it is insufficient. Disambiguating the who's, where's and why's in the authenticated encryption, without also doing so in the hash that's signed, doesn't prevent all forms of these kinds of attacks(1)(2). Let's look at some examples.


sign-then-AEAD:

Bob: $C_{b_0}=$ sign-then-AEAD$($"do you want to get pizza, Alice?"$) \quad\to$ Alice

Alice: $C_{a_0}=$ sign-then-AEAD$($"sure, Bob! let's go at dawn."$) \quad\to$ Eve $\to$ Bob

Bob: $C_{b_1}=$ sign-then-AEAD$($"Alice, do you want to do that dangerous thing?"$) \quad\to$ Alice

Eve: $C_{e_0}=$ $C_{a_0} \quad\to$ Bob

Bob: $C_{b_2}=$ sign-then-AEAD$($"okay, Alice, I trust you. let's do the dangerous thing at dawn."$) \quad\to$ Alice

Consider an $\mathrm{AEAD}_K$ function that includes associated data $A$ and produces a ciphertext $C$ of three concatenated values: the recipient's public key $R_\mathrm{public}$, a plaintext $P$, and the signature $\sigma$ using a sender's identity key $S$ to sign a hash $h_\sigma$ such that:

$$h_\sigma = H(R_\mathrm{public} \space || \space P)$$ $$\sigma = S.\mathrm{sign}(h_\sigma)$$ $$C = \mathrm{AEAD}_K(\sigma \space || \space R_\mathrm{public} \space || \space P, \space A, \space \cdot \space)$$

The goal of $\mathrm{AEAD}_K$ here is to conceal $\sigma$, $R_\mathrm{public}$ and $P$, while binding knowledge of them to $A$, together to be taken as a message sent by an entity that knows the shared secret $K$ and a signature of $P$ by $S$ to $R$. That last part is crucial. Even when we assume good practice, where $A$ and $h_\sigma$ contain acknowledgement of the sender and receiver's public keys, $\sigma$ is free to be passed around after the fact, by anyone, which proves that the holder of $S$ signed the hash of $P$ to $R$, but that could have been for any purpose at any time in the past. $\sigma$ doesn't know anything about the ciphertext it will be embedded within. So, even though the ciphertext may prove linkage to the whole context of the message, the signature does not.

This is clearly inadequate. For one, it allows for replay attacks, as shown by Eve being able to send Bob a ciphertext $C_{a_0}$ they captured from Alice, and trick Bob into thinking it was a legitimate response from Alice to a new question. Thankfully, $\mathrm{AEAD}_K$ could come to the rescue here. If a unique message number is also included within $A$, then $C_{a_0}$ can't be replayed. So, as long as $\sigma$ remains secret, $A$ prevents this surreptitious forwarding of signatures within ciphertexts. However, what's to stop a group of people from communicating using sign-then-AEAD with a common secret in some multi-party protocol? In that case $\sigma$ won't remain secret, and it can directly be injected into unintended contexts, leaving us once again vulnerable. Not to mention that in this example, the plaintext grows with the number of recipients.

Therefore, it would be wise to include as much information within $A$ as possible (even the nonce, the IV, or SIV), to fully and uniquely define a usage context (this message, right now, to x, from y, for this purpose...). $A$ can be passed into the calculation of $h_\sigma$ (signAD) too, so that both the signature and ciphertext cannot be used in an unintended context. As a side note, leaving associated data implicit if possible may be a good idea. It saves on space and communication costs. So, for instance, consider not concatenating recipients to the plaintext. Stuff them into $A$ instead. If you know you know, and if you don't, then you probably don't need to know the details of the channel / message.
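As a minimal sketch of what "fully and uniquely define a usage context" might look like in code, assuming SHA-256 for $H$ and a simple length-prefixed encoding (both are my choices, and the field list is only an example):

```python
import hashlib
import struct


def canonicalize(*fields: bytes) -> bytes:
    """Encode a field count plus per-field lengths, so that two different
    field lists never produce the same byte string."""
    out = struct.pack(">I", len(fields))
    for f in fields:
        out += struct.pack(">Q", len(f)) + f
    return out


def associated_data(sender_pub: bytes, recipient_pub: bytes,
                    purpose: bytes, message_number: int) -> bytes:
    """A: who is talking to whom, why, and which message in the sequence."""
    return canonicalize(sender_pub, recipient_pub, purpose,
                        struct.pack(">Q", message_number))


def h_sigma(a: bytes, recipient_pub: bytes, plaintext: bytes) -> bytes:
    """signAD: the hash to be signed commits to A as well as R_public and P."""
    return hashlib.sha256(canonicalize(a, recipient_pub, plaintext)).digest()
```

With this, Alice's "sure, Bob!" signed as message number 1 hashes to a different $h_\sigma$ than the same text as message number 2, so neither the ciphertext nor the extracted signature fits the later question.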

keyed-signAD-then-AEAD $(\ref{eq:keyed-signAD-then-AEAD})$:

This example solution is better. It commits to everything and leaks nothing as long as the cipher leaks nothing. A keyed hash is calculated on the data prior to signing. That makes the signatures ephemeral, as well as bound to all the secret and nonsecret values of the channel / message.

\begin{array}{|c|ccccc|} \hline \mathrm{\quad \bf{designation} \quad} & K & S & R & P & \sigma \\ \hline \mathrm{secret} & \checkmark & \checkmark & \checkmark & \checkmark & \checkmark \\ \hline \mathrm{nonsecret} & & & & & \\ \hline \end{array}

$\label{eq:keyed-signAD-then-AEAD}\tag{$\alpha_0$}$ $$K_E, K_A = \mathrm{KDF}(K, \space \mathrm{canonicalize}(\alpha_0, \space A, \space S_\mathrm{public}, \space R_\mathrm{public}))$$ $$\sigma = S.\mathrm{sign}(H_{K_A}(\mathrm{canonicalize}(P)))$$ $$C = \mathrm{AEAD}_{K_E}(\sigma \space || \space P, \space \cdot \space)$$
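A rough sketch of how these three equations could be wired together, under stated assumptions: HMAC-SHA256 stands in for both the KDF and the keyed hash $H_{K_A}$, `sign` and `aead_encrypt` are hypothetical callables for the signature scheme and the AEAD, and $\sigma \space || \space P$ is length-prefixed rather than plainly concatenated so it can be split again after decryption.

```python
import hashlib
import hmac
import struct
from typing import Callable, Tuple


def canonicalize(*fields: bytes) -> bytes:
    """Unambiguous encoding: field count, then length-prefixed fields."""
    out = struct.pack(">I", len(fields))
    for f in fields:
        out += struct.pack(">Q", len(f)) + f
    return out


def derive_keys(k: bytes, label: bytes, a: bytes,
                s_pub: bytes, r_pub: bytes) -> Tuple[bytes, bytes]:
    """K_E, K_A = KDF(K, canonicalize(label, A, S_public, R_public))."""
    context = canonicalize(label, a, s_pub, r_pub)
    k_e = hmac.new(k, b"\x01" + context, hashlib.sha256).digest()  # encryption key
    k_a = hmac.new(k, b"\x02" + context, hashlib.sha256).digest()  # keyed-hash key
    return k_e, k_a


def keyed_digest(k_a: bytes, plaintext: bytes) -> bytes:
    """H_{K_A}(canonicalize(P)): only holders of K_A can recompute it."""
    return hmac.new(k_a, canonicalize(plaintext), hashlib.sha256).digest()


def keyed_sign_then_aead(
    k: bytes, a: bytes, s_pub: bytes, r_pub: bytes, plaintext: bytes,
    sign: Callable[[bytes], bytes],                 # hypothetical: S.sign
    aead_encrypt: Callable[[bytes, bytes], bytes],  # hypothetical: AEAD under K_E
):
    k_e, k_a = derive_keys(k, b"alpha_0", a, s_pub, r_pub)
    sigma = sign(keyed_digest(k_a, plaintext))
    return aead_encrypt(k_e, canonicalize(sigma, plaintext))
```

Because $K_A$ depends on $A$ and both public keys, a signature lifted out of one message verifies against nothing useful in any other context.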


AEAD-then-sign:

Bob: $C_{b_0}=$ AEAD-then-sign$($"you never showed up at the dangerous thing, Alice..."$) \quad\to$ Eve $\to$ Alice

Alice: $C_{a_0}=$ AEAD-then-sign$($"sorry, Bob! someone has been following me, so I stayed home."$) \quad\to$ Eve $\to$ Bob

Bob: $C_{b_1}=$ AEAD-then-sign$($"Alice, that's awful. why are they following you?"$) \quad\to$ Eve $\to$ Alice

Alice: $C_{a_1}=$ AEAD-then-sign$($"idk, Bob! maybe they know we're talking to each other?..."$) \quad\to$ Eve $\to$ Bob

Consider a similar set of values such that:

$$_iC = \mathrm{AEAD}_K(P, \space A, \space \cdot \space)$$ $$h_\sigma = H(R_\mathrm{public} \space || \space _iC)$$ $$\sigma = S.\mathrm{sign}(h_\sigma)$$ $$C = \sigma \space || \space _iC$$

The goal of $\mathrm{AEAD}_K$ here is to conceal $P$, while binding knowledge of it to $\sigma$, $R_\mathrm{public}$ and $A$, together to be taken as a message sent by an entity that knows the shared secret $K$ and the signature of $_iC$ by $S$ to $R$. In this scenario, $\sigma$ is bound to $_iC$, which is good in the sense that $_iC$ encapsulates all of the contextual data of the message. But, it's not so good in that anyone can strip off the signature and replace it with their own. $\mathrm{AEAD}_K$ can prevent this if $S_\mathrm{public}$ and $R_\mathrm{public}$ are added to $A$, which means the recipient will get a failure-to-decrypt message ($\bot$) if the wrong signing key is used to make $\sigma$.

But there's another big problem. Alice realizes that she is being followed, and hints that it may be because of her communications with Bob. Maybe this is just paranoia on Alice's part, but AEAD-then-sign can enable this kind of tracking. How? Well, first Eve gets hold of a ciphertext $C$ from listening in on either Alice or Bob. Then Eve tests some known directory of public keys to find a pair which satisfies

$$S'.\mathrm{verify}(\sigma, \space H(R'_\mathrm{public} \space || \space _iC))$$

This is an issue which stuffing more public values into $A$ or $h_\sigma$ would not solve. It would be understandable to consider if such a structure is even valuable when it can be so detrimental to privacy. Though, there are some documented use cases in the area of publicly verifiable ciphertexts. In which case, it may be appropriate to remove $R_\mathrm{public}$ from the calculation of $h_\sigma$, so the signature can be verified without knowing the recipients. If the verifier isn't fully public, say it's some semi-trusted service, it may also be appropriate to add a secret value, or use a keyed-hash, when calculating $h_\sigma$. That way only entities who are given the secret value will have permission to verify recipients.

AEAD-then-keyed-signAD-1way $(\ref{eq:AEAD-then-keyed-signAD-1way})$:

This example solution enables an entity outside of the communication channel to verify that a ciphertext has been issued by a signing party using an ephemeral key $K_A$. The outside entity doesn't learn anything about the recipient's identity information since $K_A$ is ephemeral.

\begin{array}{|c|ccccc|} \hline \mathrm{\quad \bf{designation} \quad} & K & S & R & P & \sigma \\ \hline \mathrm{secret} & \checkmark & & \checkmark & \checkmark & \\ \hline \mathrm{nonsecret} & & \dagger & & & \dagger \\ \hline \end{array}

$\label{eq:AEAD-then-keyed-signAD-1way}\tag{$\alpha_1$}$ $$K_E, K_A = \mathrm{KDF}(K, \space \mathrm{canonicalize}(\alpha_1, \space A, \space S_\mathrm{public}, \space R_\mathrm{public}))$$ $$C = \mathrm{AEAD}_{K_E}(P, \space \cdot \space)$$ $$\sigma = S.\mathrm{sign}(H_{K_A}(\mathrm{canonicalize}(S_\mathrm{public}, \space C)))$$

AEAD-then-keyed-signAD-2way $(\ref{eq:AEAD-then-keyed-signAD-2way})$:

This example solution enables an entity outside of the communication channel to verify that a ciphertext has been issued by a signing party to specific recipients, using an ephemeral key $K_A$.

\begin{array}{|c|ccccc|} \hline \mathrm{\quad \bf{designation} \quad} & K & S & R & P & \sigma \\ \hline \mathrm{secret} & \checkmark & & & \checkmark & \\ \hline \mathrm{nonsecret} & & \dagger & \dagger & & \dagger \\ \hline \end{array}

$\label{eq:AEAD-then-keyed-signAD-2way}\tag{$\alpha_2$}$ $$K_E, K_A = \mathrm{KDF}(K, \space \mathrm{canonicalize}(\alpha_2, \space A, \space S_\mathrm{public}, \space R_\mathrm{public}))$$ $$C = \mathrm{AEAD}_{K_E}(P, \space \cdot \space)$$ $$\sigma = S.\mathrm{sign}(H_{K_A}(\mathrm{canonicalize}(S_\mathrm{public}, \space R_\mathrm{public}, \space C)))$$


Conclusion:

The main takeaway, which is supported in recent works and in the design of modern libraries, is that key material, signatures, and hashes should unambiguously commit to specific and unique usage contexts. Doing so with care can make both the safest options (signAD-then-AEAD) and the riskier options (AEAD-then-signAD) safer.

aiootp
1

How about having three different keys: $S_{K1}$, $C_K$ and $S_{K2}$?

  1. $S_{K1}$ is used to sign the cleartext message.
  2. $C_K$ is used to encrypt the concatenation of the signature generated in (1) and the cleartext.
  3. Finally, $S_{K2}$ is used to sign the encrypted output of (2). Then, the message to send is the concatenation of the output of (2) with the output of (3).

I think this is what is done by the milimail extension of Thunderbird.

Mike Edward Moras
daruma
-1

For authenticated encryption, the best practice is "encrypt-then-MAC". Encrypt-then-MAC is always AE-secure (assuming the encryption is CPA-secure and the MAC is secure), but MAC-then-encrypt is not always secure. The NIST AES-GCM AEAD scheme is based on encrypt-then-MAC. The SSL padding attack exploits the fact that its AE is based on MAC-then-encrypt. Also, when you MAC-then-encrypt, you encrypt data whose entropy is decreased by the addition of the MAC, which depends on the data and does not add entropy. So, for AEAD it is better to use encrypt-then-MAC. Nevertheless, you are asking about digital signatures, and intuitively I think that the same practice should be used (i.e., encrypt-then-sign), although I did not find a security proof.

Evgeni Vaknin
-4

The only difference between these approaches has to do with hiding information about the sender. If you don't want attackers to know who the signer is, you need to sign-then-encrypt. In other cases it doesn't matter.

Patriot
Pavel Ognev