Attacks exploiting decryption failures in KYBER

Question

I am going through the portion mentioned under the heading Original KYBER analysis inside Section 5.5 titled Attacks exploiting decryption failures.

$${\sf Pr}[\|v\|> k\sigma \sqrt{m}]< k^m e^{\frac{m}{2}(1-k^2)} \hskip5em (1)$$

Equation (1) bounds the probability that an $m$-dimensional vector $v$ (drawn from a discrete Gaussian distribution with standard deviation $\sigma$) has norm greater than the average norm by a factor of $k$.

Later on, they use the following inequality.

For any vector $v$, if $z$ is chosen according to a Gaussian with standard deviation $\sigma$, then for any $k$,

$${\sf Pr}[|\langle z, v\rangle |> k \sigma \|v\|] \leq 2 e^{-\frac{k^2}{2}} \hskip5em (2)$$

As far as I understand, the authors are trying to maximize the norm of the term $s^Te_1 + e^Tr$. In equations (1) and (2), take $v = (e_1,r)$ and $z = (s,e)$.

First, the authors use equation (1) to find the probability of picking a vector $v = (e_1,r)$ whose norm is greater than the average norm by a factor of 1.33.
Next, the authors find the value of $k$ in equation (2) for which the RHS equals the decryption failure probability of Kyber-768.
Lastly, they conclude that $k$ in equation (2) can be decreased by a factor of 1.33, and compute the resulting probability from equation (2), which is then used in the complexity calculations.

I have the following questions:

Q1. In step 2, the authors set the RHS of equation (2) to the decryption failure probability of Kyber-768, and later, in step 3, decrease the value of $k$ by a certain factor. I do not understand the reasoning behind these steps.

Q2. Equation (1) assumes that the distribution of $e_1, r$ is close to a discrete Gaussian with standard deviation $\sigma$. What happens in the case of Kyber-512, where the standard deviations of $e_1$ and $r$ differ?

Q3. Are there any automated tools for computing the complexity of a decryption failure attack on a general LWE-based KEM?

Daniel S · Accepted Answer · 2025-02-20T15:37:04.293

A1: The author is trying to show that the total work to exhibit a failure will always be greater than $2^{160}$. They tune failure to occur only when $\langle\mathbf z,\mathbf v\rangle$ is greater than some bound $B$, where $B\approx 15\sigma \tau$ and $\tau$ is the size of a typical $\mathbf v$. They then divide into two cases. In the first, they consider the number of messages that must be generated if the adversary is indifferent to the quality of $\mathbf v$ and just uses a typical $\mathbf v$. In this case, they apply (2) with $\kappa\approx 15$ and $\|\mathbf v\|\approx \tau$, and deduce that the probability of a typical $\mathbf v$ inducing a failure is less than $2^{-160}$, so that $2^{160}$ instances of typical $\mathbf v$ would need to be generated.

In the second case, they assume that the adversary devotes some effort to finding a pathologically large $\mathbf v$ with $\|\mathbf v\|\approx 1.33\tau$. In this case they can only produce a bound for $\mathbb P(\langle\mathbf z,\mathbf v\rangle>B)$ by taking $\kappa\approx 15/1.33$. It follows that such a pathological $\mathbf v$ would induce a failure with probability about $2^{-90}$, meaning that only $2^{90}$ pathological values of $\mathbf v$ need to be found. However, finding these pathological $\mathbf v$ does not come for free. Equation (1) tells us that exhaustively searching for one such $\mathbf v$ would take about $2^{110}$ trials, so that generating $2^{90}$ of them would require about $2^{200}$ candidates. This is more work than is permitted.
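The figures in the two cases can be reproduced, at least approximately, from the right-hand sides of (1) and (2). The sketch below is only an illustration: the function names are made up for this post, $\kappa=15$ is the rounded value used above, and the dimension `M = 768` is an assumption chosen because it makes (1) come out near the quoted $2^{-110}$; check the paper for the dimension it actually uses.

```python
import math

LN2 = math.log(2)

def log2_norm_tail(k, m):
    """log2 of the RHS of (1): k^m * exp((m/2) * (1 - k^2))."""
    return (m * math.log(k) + (m / 2) * (1 - k * k)) / LN2

def log2_inner_tail(kappa):
    """log2 of the RHS of (2): 2 * exp(-kappa^2 / 2)."""
    return 1 - (kappa * kappa / 2) / LN2

KAPPA = 15.0   # rounded value from the answer (assumption)
THETA = 1.33   # norm inflation factor for the pathological v
M = 768        # assumed dimension, picked to match the quoted 2^-110

# Case 1: typical v, failure bound from (2) with kappa = 15.
case1_fail = log2_inner_tail(KAPPA)

# Case 2: pathological v, failure bound from (2) with kappa = 15 / 1.33,
# plus the cost of finding such a v, bounded via (1).
case2_fail = log2_inner_tail(KAPPA / THETA)
case2_find = log2_norm_tail(THETA, M)
case2_work = -(case2_fail + case2_find)

print(f"case 1: log2 failure bound = {case1_fail:.1f}")
print(f"case 2: log2 failure bound = {case2_fail:.1f}")
print(f"case 2: log2 P(find v)     = {case2_find:.1f}")
print(f"case 2: log2 total work    = {case2_work:.1f}")
```

Working in $\log_2$ throughout avoids underflow, since the probabilities involved are far below what a float can represent comfortably once they are multiplied together.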

To be more rigorous, the author should have investigated the work function over the full range of possible sizes of $\mathbf v$. This would involve estimating, for all real $\theta$, the chance of finding a $\mathbf v$ of size $\theta\tau$ and multiplying this by the RHS of (2) with $\kappa=15/\theta$; the expected work is roughly the reciprocal of that product.

A2: The easiest way to handle this would be to replace the condition $\|\mathbf v\|>\kappa\sigma\sqrt m$ with the condition $\|\mathbf v\|>\kappa\max(\sigma_0,\sigma_1)\sqrt m$, where $\sigma_0$ and $\sigma_1$ are the two standard deviations.

A3: The expressions in (1) and (2) are not too messy and could be coded up in a few lines of SageMath or Python. The parameter space is not too large, so I would consider simply writing something from scratch.
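A rough way to explore that work function is to sweep $\theta$ on a grid and combine the two bounds in $\log_2$. This is a sketch under assumptions, not the paper's method: `log2_work` is a hypothetical helper, the defaults $\kappa=15$ and $m=768$ are illustrative values rather than figures taken from the specification, and both (1) and (2) are only upper bounds on the probabilities, so the result is a bound on the work rather than an exact cost.

```python
import math

LN2 = math.log(2)

def log2_work(theta, kappa=15.0, m=768):
    """log2 of 1 / (P_find * P_fail) for a v of norm theta * tau.

    P_find is bounded by the RHS of (1) with k = theta, and P_fail by
    the RHS of (2) with kappa / theta. kappa and m are assumed values.
    """
    log2_find = (m * math.log(theta) + (m / 2) * (1 - theta * theta)) / LN2
    log2_fail = 1 - ((kappa / theta) ** 2 / 2) / LN2
    # A probability bound above 1 carries no information, so cap at 1.
    log2_find = min(0.0, log2_find)
    log2_fail = min(0.0, log2_fail)
    return -(log2_find + log2_fail)

# Grid over theta in [1, 2]; theta = 1 corresponds to a typical v.
thetas = [1 + i / 1000 for i in range(1001)]
best_theta = min(thetas, key=log2_work)
best_work = log2_work(best_theta)

print(f"theta = 1.00: log2 work = {log2_work(1.0):.1f}")
print(f"theta = 1.33: log2 work = {log2_work(1.33):.1f}")
print(f"minimum at theta = {best_theta:.3f}: log2 work = {best_work:.1f}")
```

Comparing the minimum over the grid with the values at $\theta=1$ and $\theta=1.33$ shows whether some intermediate $\theta$ is cheaper for the adversary than either of the two cases considered above.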
