4

Can someone explain how $a^x \bmod N$ can be sped up when $a$ and $N$ are known constants? How big is the gain, and what resources are needed?

https://www.imperialviolet.org/2013/05/10/fastercurve25519.html


Just to mention: it can speed up SRP hashes bruteforce, which is calculated as $v = g^x \mod N$ where $x = hash(username, salt, password)$
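As a hedged illustration of that formula (the exact hash layout varies across SRP versions — e.g. RFC 5054 uses $x = H(\text{salt} \mathbin\| H(\text{username} \mathbin{\|} \text{":"} \mathbin\| \text{password}))$ — this sketch just concatenates the inputs with a separator, and the function name is made up):

```python
import hashlib

def srp_verifier(username: bytes, salt: bytes, password: bytes, g: int, N: int) -> int:
    """Sketch of the SRP verifier v = g^x mod N, x = hash(username, salt, password).
    The hash construction here is illustrative, not a specific SRP variant."""
    h = hashlib.sha256(b"|".join([username, salt, password])).digest()
    x = int.from_bytes(h, "big")
    return pow(g, x, N)

# Example with a toy prime modulus (real SRP groups are much larger):
N = 2**61 - 1
v = srp_verifier(b"alice", b"NaCl", b"hunter2", 2, N)
assert 0 < v < N
```

Since $g$ and $N$ are fixed by the SRP group, each candidate password costs one hash plus one fixed-base exponentiation — which is exactly the operation the precomputation techniques below accelerate.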

Smit Johnth
  • 1,731
  • 4
  • 18
  • 27

2 Answers

5

One obvious way is to precompute values $a^{k_1} \bmod N$, $a^{k_2} \bmod N$, ...,$a^{k_i} \bmod N$, and (depending on the value of $x$) multiply together the appropriate elements.

To take a simple example: if we precompute $a^1 \bmod N, a^2 \bmod N, a^4 \bmod N, \dots, a^{2^k} \bmod N$ and (based on the binary expansion of $x$) multiply the appropriate elements together, this gives a method that takes an average of $\frac{1}{2} \log_2 N$ multiplications, which is an obvious improvement over what you can do without any precomputation.
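A minimal sketch of this idea in Python (function names are illustrative; Python's built-in `pow` is used only to check the result):

```python
def precompute_powers(a: int, N: int, bits: int) -> list[int]:
    """Return table[i] = a^(2^i) mod N for i in range(bits),
    built with repeated squaring."""
    table = [a % N]
    for _ in range(bits - 1):
        table.append(table[-1] * table[-1] % N)
    return table

def fixed_base_pow(x: int, table: list[int], N: int) -> int:
    """Compute a^x mod N using only multiplications: for each set bit
    of x, multiply in the precomputed power a^(2^i)."""
    y = 1
    for i, p in enumerate(table):
        if (x >> i) & 1:
            y = y * p % N
    return y

table = precompute_powers(5, 1009, 16)       # offline, once per (a, N)
assert fixed_base_pow(1234, table, 1009) == pow(5, 1234, 1009)
```

The squarings all happen offline in `precompute_powers`; each online exponentiation then costs only about $\frac{1}{2}\log_2 x$ multiplications on average.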

The article linked in the question gives a slightly more aggressive example (treating $x$ as a base-16 rather than a base-2 number).

However, you can do even better: see this paper (or the extended abstract at Eurocrypt 1992) for a survey of the various possibilities.

One note: if you are performing operations on an elliptic curve (that is, doing point additions rather than modular multiplications), then computing the inverse of an element is cheap. Even though the cited paper doesn't cover that case, this can be used to reduce the number of operations even further.

fgrieu
  • 149,326
  • 13
  • 324
  • 622
poncho
  • 154,064
  • 12
  • 239
  • 382
2

In brief: If you know $(a,N)$, you can speed the computation up by precomputing some of the powers of $a$.


Let $x=x_n\dots x_1x_0=\sum_{i=0}^n x_i 2^i$ be the binary expansion of $x$, and let $a_j=a^{2^j}\pmod N$.

Very naively: $$ a^x \pmod N = \overbrace{a*(a*(a*\dots*(a))\dots))}^{x\text{ terms}} $$ This requires $\Theta(x)$ multiplications.

Traditional Square and Multiply: $$ a^x \pmod N = a^{x_n \dots x_0} = a^{2^n x_n+ \dots+ x_0} =(a^{2^n})^{x_n} (a^{2^{n-1}})^{x_{n-1}} \dots (a^{2^0})^{x_0} =\prod_{i=0}^n a_i^{x_i} $$ So, to use this for efficient exponentiation, we maintain a product value $y$ and a squaring variable $e$, initialised with $(y,e)=(1,a)$. Then, we simply continue to square $e$, multiplying $y$ by $e$ each time we reach a power $a^{2^j}$ for which $x_j= 1$. How much work does this require? Well, we must perform $n=\lfloor\log_2 x\rfloor$ squarings to calculate the $a_j$, and then on average $n/2$ multiplications to calculate $a^x$ (where we assume that an "average" value of $x$ has half its bits set), and at most $n$ multiplications. Total? $2n$ worst case, $\frac{3}{2}n$ on average.
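The $(y,e)$ loop above can be sketched directly in Python (a right-to-left binary square-and-multiply; the function name is illustrative):

```python
def square_and_multiply(a: int, x: int, N: int) -> int:
    """Compute a^x mod N: square e at every bit position of x,
    and fold e into the product y whenever that bit is set."""
    y, e = 1, a % N
    while x:
        if x & 1:          # x_j = 1: multiply the running product by a^(2^j)
            y = y * e % N
        e = e * e % N      # advance e from a^(2^j) to a^(2^(j+1))
        x >>= 1
    return y

assert square_and_multiply(7, 560, 561) == pow(7, 560, 561)
```

Each loop iteration costs one squaring and, for set bits, one extra multiplication — matching the $2n$ worst-case, $\frac{3}{2}n$ average count above.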

Precomputational Optimisations of Square and Multiply

By calculating each $a_j=a^{2^j}\pmod N$ in advance, when we calculate $a^x$ we only need to do the (at most) $n$ multiplications. That is, we do not need to do any squarings online at all. However, if $x$ may be very large, this involves storing a large amount of data; by storing only a subset of the $a_j$, we can trade memory against the few squarings required to reach the remaining values. Moreover, if one so wished, one could also store products such as $b=a_4*a_2$, which would reduce the online cost of calculating $a^{10100_2}$ to the cost of looking up $b$.
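One concrete version of storing such products, sketched here under illustrative names: for each pair of adjacent bit positions, precompute all four values $(a^{4^i})^d$ for $d\in\{0,1,2,3\}$, so that a single table lookup covers two bits of $x$ and the online multiplication count is roughly halved:

```python
def precompute_pairs(a: int, N: int, bits: int) -> list[list[int]]:
    """pairs[i][d] = (a^(4^i))^d mod N for d in 0..3, so entry d covers
    the two bits of x at positions 2i and 2i+1."""
    base = a % N
    pairs = []
    for _ in range((bits + 1) // 2):
        sq = base * base % N
        pairs.append([1, base, sq, sq * base % N])
        base = pow(base, 4, N)   # advance two bit positions
    return pairs

def fixed_base_pow_radix4(x: int, pairs: list[list[int]], N: int) -> int:
    """Compute a^x mod N with one multiplication per 2-bit digit of x."""
    y = 1
    for i, tbl in enumerate(pairs):
        d = (x >> (2 * i)) & 3
        if d:
            y = y * tbl[d] % N
    return y

pairs = precompute_pairs(7, 2027, 16)
assert fixed_base_pow_radix4(54321, pairs, 2027) == pow(7, 54321, 2027)
```

This is the radix-4 instance of the general trade-off: a table twice as wide per position, in exchange for half the online multiplications.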

Deciding where to strike this trade-off is an interesting question, since at some point storing too many powers becomes unreasonable. For example, it would be possible to precompute and store $a^x\pmod N$ for all $x\in\{0,\dots,2^t\}$. This would reduce calculating $a^x$ to the cost of a look-up, but such a table would have size $\Theta(2^t)$, which may well be impractically large.
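The extreme end of the trade-off is simple to write down (a sketch with an illustrative name; practical only for small $t$):

```python
def full_table(a: int, N: int, t: int) -> list[int]:
    """Precompute a^x mod N for every x in {0, ..., 2^t - 1}:
    O(1) online lookup, at the cost of Theta(2^t) memory."""
    tbl = [1] * (1 << t)
    for x in range(1, 1 << t):
        tbl[x] = tbl[x - 1] * a % N
    return tbl

tbl = full_table(3, 101, 10)     # 1024 entries
assert tbl[77] == pow(3, 77, 101)
```

For cryptographic exponent sizes ($t$ of several hundred bits) this is of course hopeless, which is why the subset-of-powers schemes above are the ones used in practice.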

Cryptographeur
  • 4,357
  • 2
  • 29
  • 40