
It looks like the IETF draft specification of Classic McEliece has penciled in separate "pc" and "non-pc" variants of the algorithm (in addition to the familiar "f" and non-"f" variants); presumably they're going to be assigned separate OIDs for X.509/CMS/PKCS#8/related AlgorithmIdentifier use.

These "pc" and "non-pc" variants are described in this PDF from classic.mceliece.org (excerpted below), but they are, disturbingly, not present in the reference implementation or the specification itself. (I was unable to find any information about the development or rationale of these variants.)

I understand the point of the "f" variants (faster but with more convoluted key generation code), and they are explicitly stated to be interoperable with the non-"f" variants. But I can't find either the rationale for or the interoperability status of the "pc" vs. "non-pc" variants anywhere.

Are they, like the "f" variants, interoperable with their counterparts?

Most importantly: What is the purpose of this variant? What are the trade-offs? When was it introduced? Where did it come from?


For non-pc parameter sets:

A ciphertext $C$ is an element of $\mathbb{F}_2^{mt}$. It is represented as a $\lceil m t / 8\rceil$-byte string.

There are two types of hash inputs: $(1, v, C)$ and $(0, v, C)$. Here $v \in \mathbb{F}_2^n$, and $C$ is a ciphertext. The initial $0$ or $1$ is represented as a byte. The vector $v$ is represented as the next $\lceil n / 8 \rceil$ bytes. The ciphertext, if present, is represented as the next $\lceil m t / 8 \rceil$ bytes. All hash inputs thus begin with byte $0$ or $1$, as mentioned earlier.

For pc parameter sets:

A ciphertext $C$ has two components: $C_0 \in \mathbb{F}_2^{mt}$ and $C_1 \in \mathbb{F}_2^\ell$. The ciphertext is represented as the concatenation of the $\lceil m t / 8\rceil$-byte string representing $C_0$ and the $\lceil \ell / 8 \rceil$-byte string representing $C_1$.

There are three types of hash inputs: $(2, v)$; $(1, v, C)$; and $(0, v, C)$. Here $v \in \mathbb{F}_2^n$, and $C$ is a ciphertext. The initial $0$, $1$, or $2$ is represented as a byte. The vector $v$ is represented as the next $\lceil n / 8 \rceil$ bytes. The ciphertext, if present, is represented as the next $\lceil m t / 8 \rceil + \lceil \ell / 8 \rceil$ bytes. All hash inputs thus begin with byte $0$, $1$, or $2$, as mentioned earlier.
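
To make the byte layout in the excerpt concrete, here is a minimal Python sketch of the length arithmetic and the hash-input serialization. The function names `ciphertext_len` and `hash_input` are hypothetical, and the values $m = 13$, $t = 128$, $n = 6688$, $\ell = 256$ are illustrative assumptions rather than anything stated in the excerpt.

```python
def ceil_div8(bits: int) -> int:
    """Number of bytes needed to hold `bits` bits, i.e. ceil(bits / 8)."""
    return (bits + 7) // 8


def ciphertext_len(m: int, t: int, ell: int = 0) -> int:
    """Ciphertext length in bytes; ell = 0 models a non-pc parameter set."""
    return ceil_div8(m * t) + ceil_div8(ell)


def hash_input(prefix: int, v: bytes, n: int, c: bytes = b"") -> bytes:
    """Serialize (prefix, v) or (prefix, v, C) as a single byte string."""
    if prefix not in (0, 1, 2):
        raise ValueError("prefix byte must be 0, 1, or 2")
    if len(v) != ceil_div8(n):
        raise ValueError("v must occupy ceil(n / 8) bytes")
    return bytes([prefix]) + v + c


# Illustrative parameters (assumed for this sketch).
M, T, N, ELL = 13, 128, 6688, 256

non_pc_len = ciphertext_len(M, T)       # C only
pc_len = ciphertext_len(M, T, ELL)      # C_0 followed by C_1

v = bytes(ceil_div8(N))                 # dummy all-zero vector, layout only
confirm_input = hash_input(2, v, N)                  # (2, v), pc only
key_input = hash_input(1, v, N, bytes(pc_len))       # (1, v, C)
```

Under these assumed parameters the only length difference between the two variants is the $\lceil \ell / 8 \rceil$ bytes of $C_1$, which also lengthens every hash input that includes $C$.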

1 Answer


This is the so-called "Plaintext Confirmation" variant. It was indeed submitted as "the" algorithm in rounds 1–3 of the competition, but was replaced in round 4 with a simpler version that has a shorter ciphertext, because the security features it provides were not strongly valued by the competition judges.

Classic McEliece "pc" is a now-alternate version of the algorithm engineered to keep the cryptosystem secure in these circumstances:

  1. Some decapsulation queries occur.
  2. Then, a single bit is flipped at a random position in a stored private key (either by nature or by an attacker).
  3. Then, further decapsulation queries occur against the modified private key.

To lay out the differences more clearly than in the question:

  • The "round-4" (or "non-pc", or now "regular") version of the algorithm outputs $C := {{({ I \, \vert \, pk })}e}$ as the KEM ciphertext.

  • The "pc" (or "round-3") version of the algorithm outputs $C := { {{({I \, \vert \, pk})}e} \; \Vert \; {H{({2 \; \Vert \; e})}} }$ as the KEM ciphertext.

  • Both algorithms use the same formula, $K := H{({ \text{1} \; \Vert \; e \; \Vert \; C })}$, to derive the shared secret from the ciphertext.

You can see here that the "pc" ciphertext has an extra blob, $H{({2 \; \Vert \; e})}$, appended to it. With $\text{SHAKE256}$ producing 32 bytes of output, this inflates the ciphertext by 32 bytes, or around 15%–20%.
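
As a rough check on that percentage, the following sketch computes the relative growth from a fixed 32-byte confirmation hash. The $(m, t)$ pairs are assumed illustrative values, and `overhead_percent` is a hypothetical helper; which parameter sets actually get a "pc" counterpart is not something this sketch settles.

```python
def overhead_percent(m: int, t: int, extra_bytes: int = 32) -> float:
    """Percentage growth of a ceil(m*t/8)-byte ciphertext when extra_bytes are appended."""
    base = (m * t + 7) // 8
    return 100.0 * extra_bytes / base


# Assumed (m, t) pairs, for illustration only.
for m, t in [(13, 96), (13, 119), (13, 128)]:
    base = (m * t + 7) // 8
    print(f"m={m}, t={t}: {base} -> {base + 32} bytes "
          f"(+{overhead_percent(m, t):.1f}%)")
```

For these assumed pairs the growth lands between roughly 15% and 21%; a shorter base ciphertext would see a proportionally larger increase.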

(Notation: $pk$ is the public key, $e$ is a random error vector of fixed weight generated according to the parameters, $I$ is the identity matrix of appropriate size for the given multiplication, $\Vert$ is octet string concatenation, $1$ and $2$ are single octets of their respective values, $\vert$ is matrix augmentation, and $H$ is the hash function chosen in the parameters.)
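
The two ciphertext formulas and the shared key formula can be sketched with Python's `hashlib`. This is only an illustration of the hashing layer: the syndrome $({I \, \vert \, pk})e$ is passed in as opaque bytes `c0` because the matrix arithmetic is out of scope here, `encap_outputs` is a hypothetical name, and the all-zero `e` and `c0` below are dummy values (a real $e$ has fixed nonzero weight).

```python
import hashlib


def H(data: bytes) -> bytes:
    """SHAKE256 with 32 bytes of output, matching the 32-byte figure above."""
    return hashlib.shake_256(data).digest(32)


def encap_outputs(e: bytes, c0: bytes, pc: bool) -> tuple[bytes, bytes]:
    """Return (C, K) given the encoded error vector e and syndrome bytes c0."""
    if pc:
        c = c0 + H(b"\x02" + e)   # C := (I | pk)e || H(2 || e)
    else:
        c = c0                    # C := (I | pk)e
    k = H(b"\x01" + e + c)        # K := H(1 || e || C)
    return c, k


# Dummy inputs sized for assumed parameters n = 6688 and mt = 1664 bits.
e = bytes(836)
c0 = bytes(208)

c_plain, k_plain = encap_outputs(e, c0, pc=False)
c_pc, k_pc = encap_outputs(e, c0, pc=True)
```

Because $C$ is itself an input to the key derivation, the same $e$ and the same syndrome give both a different ciphertext length and a different $K$ in the two variants.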

And, of course, this is not interoperable with the non-pc variant.