The following is mostly known to people working on lattices (and is mentioned at the end of the Quanta article), but I also reached out to Keegan to clarify some questions I had.
There are (roughly) two main types of lattice basis reduction:
- LLL: Poly-time computable, but relatively weak guarantee on output basis, and
- BKZ: Exponential-time computable, but stronger guarantee on output basis.
Ryan and Heninger's work targets the first kind.
The second kind is used to attack lattice-based PQC.
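For concreteness, here is a minimal sketch of what the two regimes look like in practice, using the fpylll Python bindings; the dimension, entry size, and block size below are arbitrary illustrative choices of mine, not anything taken from Ryan and Heninger's work.

```python
# A minimal sketch contrasting LLL and BKZ, assuming the fpylll Python bindings
# are installed; the dimension, entry size, and block size are toy choices.
from copy import copy
from fpylll import IntegerMatrix, LLL, BKZ

n, bits = 60, 30
A = IntegerMatrix.random(n, "uniform", bits=bits)  # random n x n basis with ~30-bit entries

B = copy(A)
LLL.reduction(B)  # polynomial time, comparatively weak guarantee on the output basis

C = copy(A)
BKZ.reduction(C, BKZ.Param(block_size=20))  # cost grows exponentially in the block size;
                                            # larger blocks give a stronger guarantee
```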
This is to say that Ryan and Heninger's work is purely about improving an (already) "polynomial time" algorithm. Or, in other words, anything that is broken by their algorithm was already "theoretically" broken.
Why do I say "theoretically"? There's an adage that essentially all of the "polynomial time" algorithms that computer scientists study run in time bounded by a small polynomial.
See, for example, Scott Aaronson's survey on P vs. NP, where he says:

> More specifically, many people object to theoretical computer science’s identification of “polynomial” with “efficient” and “exponential” with “inefficient,” given that for any practical value of $n$, an algorithm that takes $1.0000001^n$ steps is clearly preferable to an algorithm that takes $n^{1000}$ steps. This would be a strong objection, if such algorithms were everyday phenomena. Empirically, however, computer scientists found that there is a strong correlation between “solvable in polynomial time” and “solvable efficiently in practice,” with most (but not all) problems in $P$ that they care about solvable in linear or quadratic or cubic time, and most (but not all) problems outside $P$ that they care about requiring $c^n$ time via any known algorithm, for some $c$ significantly larger than 1. Furthermore, even when the first polynomial-time algorithm discovered for some problem takes (say) $n^6$ or $n^{10}$ time, it often happens that later advances lower the exponent, or that the algorithm runs much faster in practice than it can be guaranteed to run in theory.
If you read the introduction of Ryan and Heninger's paper, the original LLL algorithm runs (on lattices of dimension $n$ with $p$-bit entries) in time $O(n^{5+\epsilon}(p+\log n)^{2+\epsilon})$, and the most concretely practical variant of it runs in time $O(n^{4+\epsilon}(p + \log n)(p + n))$.
This is to say that LLL, despite being very important in a number of areas (the LLL paper has 6000+ citations, and multiple books have been written about it), is a "counterexample" to the "strong correlation" that Aaronson mentions.
Ryan and Heninger's work improves this to $O(n^\omega(p+n)^{1+\epsilon})$ for $\omega$ the matrix multiplication exponent.
This is of course still not sub-cubic, but it is much closer than the preexisting concretely efficient variants of LLL, which is the reason for the excitement about (and the large potential impact of) their work.
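To get a rough feel for the gap between these bounds, one can plug a hypothetical instance size into them, dropping the $\epsilon$'s and hidden constants and taking $\omega \approx 2.37$; the numbers below are only meant to show the shape of the asymptotics, not to predict real runtimes.

```python
import math

# Back-of-envelope comparison of the three bounds above, with epsilons and hidden
# constants dropped and omega ~ 2.37; n = 500 and p = 100000 are hypothetical
# values chosen only to illustrate the shape of the gap, not real runtimes.
n, p = 500, 100_000
bounds = {
    "original LLL":          n**5 * (p + math.log2(n))**2,         # O(n^{5+e} (p + log n)^{2+e})
    "practical LLL variant": n**4 * (p + math.log2(n)) * (p + n),  # O(n^{4+e} (p + log n)(p + n))
    "Ryan-Heninger bound":   n**2.37 * (p + n),                    # O(n^w (p + n)^{1+e})
}
for name, ops in bounds.items():
    print(f"{name:>22}: about 2^{math.log2(ops):.0f} operations")
```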
This large impact is mostly not about attacking lattice-based PQC, though.
For that, you generally need a variant of BKZ.
Typically one first runs LLL as a preprocessing step, so this work does improve the (poly-time) preprocessing one often does before mounting a full (exponential-time) attack on PQC, but this is not really the reason their work is so exciting.
Instead, one often uses lattice reduction more broadly in cryptanalysis.
See Section 6.3 of their paper.
I quote it below:

> We implemented Howgrave-Graham's version [28] of Coppersmith's original method [14,15] to solve the problem of decrypting low public exponent RSA with stereotyped messages. We set $e = 3$ for a 2048-bit modulus $N$, and varied the number of unknown bits of the message from 400 (solvable with a dimension 5 lattice) to 678 (solvable with a 382-dimensional lattice with 430,000 bit entries). The asymptotic limit for the method with these parameters (without additional brute forcing) would be expected to be $\approx \lfloor (\log N )/3\rfloor = 682$ bits.
Coppersmith's method is a typical example of a cryptanalytic task where a (weaker) LLL-reduced basis suffices to break things, but where one often has to run this "poly-time" algorithm on massive input sizes.
As an example, the lattices relevant to PQC have roughly 32-bit entries (and are perhaps of dimension 500-1000).
For FHE this is closer to 200-bit entries, and perhaps dimension 2000-16000.
430,000-bit entries are massive in comparison.
Preexisting LLL implementations would often struggle with these large problem instances (see Section 6 of Ryan and Heninger's paper for examples).
Ryan and Heninger's work makes many of these (known to be theoretically attackable) instances of Coppersmith's method practically attackable.
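To make this concrete, below is a minimal sketch of the small-dimension end of such an attack (stereotyped-message RSA with $e = 3$), using a single dimension-4 Howgrave-Graham-style lattice reduced with LLL via fpylll; the 512-bit modulus, the prefix, and the 64-bit unknown suffix are toy choices of mine, and this is a simplified illustration rather than the paper's implementation.

```python
# A minimal sketch of the small end of this attack: stereotyped-message RSA with
# e = 3, attacked via a single dimension-4 Howgrave-Graham-style lattice reduced
# with LLL. Assumes fpylll and sympy are installed. The 512-bit modulus and the
# 64-bit unknown suffix are toy choices (the paper's experiments use a 2048-bit
# modulus and lattices of dimension up to 382).
import secrets
from fpylll import IntegerMatrix, LLL
from sympy import randprime

# Toy RSA parameters (the private key is never needed for this attack).
N = randprime(2**255, 2**256) * randprime(2**255, 2**256)  # ~512-bit modulus
e = 3

# Stereotyped message: a known prefix followed by 64 unknown bits.
prefix = int.from_bytes(b"The launch code for today is: ", "big")
x0 = secrets.randbits(64)        # the secret suffix (used below only to verify)
a = prefix << 64                 # the known part of the message
c = pow(a + x0, e, N)            # the attacker sees N, e, c, and the prefix

# f(x) = (a + x)^3 - c has the small root x0 modulo N.
f0, f1, f2 = (a**3 - c) % N, (3 * a**2) % N, (3 * a) % N
X = 2**64                        # bound on the size of the unknown root

# Basis of coefficient vectors of N, N*x, N*x^2, f(x), with x scaled by X, so
# that short vectors correspond to polynomials with small coefficients.
B = [
    [N,  0,      0,         0],
    [0,  N * X,  0,         0],
    [0,  0,      N * X**2,  0],
    [f0, f1 * X, f2 * X**2, X**3],
]
M = IntegerMatrix.from_matrix(B)
LLL.reduction(M)

# The first reduced vector encodes a polynomial h with h(x0) = 0 mod N and small
# coefficients, which forces h(x0) = 0 over the integers; a real attacker would
# now solve h for its integer roots, while here we simply verify it against x0.
h = [M[0, i] // X**i for i in range(4)]
assert sum(h[i] * x0**i for i in range(4)) == 0
# Larger lattices push the recoverable suffix toward ~floor(log2(N)/e) bits,
# which is the ~682-bit limit quoted above for a 2048-bit modulus.
```

The lattice is built so that any short vector corresponds to a polynomial with small coefficients that vanishes modulo $N$ at the secret, and LLL's worst-case guarantee already suffices at these toy sizes; scaling the same idea up is exactly where the 382-dimensional, 430,000-bit-entry instances quoted above come from.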
There is an application of their work (Section 6) to breaking Gentry's initial 2009 FHE scheme, which is based on a highly non-standard lattice problem.
Gentry's scheme has been broken for a while.
For example, the BKZ 2.0 paper (2011) predicted that LLL should suffice for all but the "Large" parameter set, though since LLL is "slow" polynomial time, they only ran this computation on the toy parameter set.
For this application, Ryan and Heninger find dramatic speedups, since they have a much faster version of LLL.
For example, the toy parameter set took 31 core-days in 2011, but now takes 31 core-hours (single-threaded) or 4 core-hours (multi-threaded).
This is obviously a dramatic speedup.
But this dramatically faster version of LLL doesn't let them solve problems that LLL is not known to be able to solve.
This is to say that for lattice-based PQC, which one attacks with BKZ, a (dramatically) improved LLL variant doesn't have many obvious implications (besides speeding up the poly-time preprocessing, as mentioned before).