22

I was learning about fast exponentiation when I found this algorithm:

int expo(int a, int b) {
  int result = 1;

  while (b) {
    if (b%2==1) {
      result *= a;
    }
    b /= 2;
    a *= a;
  }

  return result;
}

This is supposed to compute $a^b$ in logarithmic time, but I couldn't understand, or find anywhere, a way to arrive at this procedure, other than by noting the following identity (which didn't help me understand why the algorithm is correct):

$$ x^n = \begin{cases} 1 & \text{if $n=0$} \\ \left(\frac{1}{x}\right)^{-n} & \text{if $n<0$} \\ x \cdot \left( x^{\frac{n-1}{2}} \right)^2 & \text{if $n$ is odd} \\ \left( x^{\frac{n}{2}} \right)^2 & \text{if $n$ is even} \end{cases} $$

From what I understand, the transition from this particular identity to the actual algorithm should be quite obvious, but I honestly don't see it, even though I've worked through quite a few examples of the algorithm by hand.

Can anyone explain?

senshin
  • 513
jayno
  • 231
  • Run the loop by hand for, say, $b = 5$. Hint: $a^5 = a^{4+1} = (a^2)^2 \cdot a$. Note that b%2 == 1 when the $n^{th}$ bit is set in the binary representation of $b$. – dxiv Aug 20 '16 at 16:46
  • Does that amount to anything more than the fact that $b$ is odd in that case? – jayno Aug 20 '16 at 16:54
  • Yes. Hope the posted answer explains it better. – dxiv Aug 20 '16 at 17:09
  • Have you read the wikipedia article on this algorithm? https://en.wikipedia.org/wiki/Exponentiation_by_squaring – BlueRaja - Danny Pflughoeft Aug 20 '16 at 20:46
  • It would be very helpful if you could edit your answer to include where you stumbled across this snippet – polynomial_donut Nov 24 '19 at 16:48
  • The identity in the question relates $x^n$ to lower exponents ($n/2$ etc) and forms the basis of Left-to-Right Binary Exponentiation, which processes bits of the exponent ($n$) MSB to LSB. But the algorithm posted is Right-to-Left Binary algorithm (and the relevant identities already available in answers). – Nitin Verma Aug 06 '21 at 08:58

6 Answers

26

Write $b = b_0 + b_1 2 + b_2 2^2 + ... + b_n 2^n$, where the $b_k$ are the digits of $b$ in base $2$. Note that $n$ grows logarithmically with $b$. Then:

$$ a^b = a^{b_0} \cdot (a^2)^{b_1} \cdot (a^{2^2})^{b_2} \cdot ... \cdot (a^{2^n})^{b_n} $$

The terms where bits $b_k = 0$ evaluate to $1$ and can be skipped. The terms with $b_k = 1$ are of the form $a^{2^k}$ and can be calculated by repeatedly squaring $a$ $k$ times. This is precisely what the posted code does.
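To make the correspondence with the posted code concrete, here is that code instrumented to print which factors $a^{2^k}$ get multiplied in (a sketch; `expo_traced` is an illustrative name, not from the question):

```c
#include <stdio.h>

/* The question's loop, instrumented: one factor a^(2^k) is
 * multiplied into result for each set bit b_k of the exponent,
 * exactly matching the product formula above. */
int expo_traced(int a, int b) {
    int result = 1, k = 0;
    while (b) {
        if (b % 2 == 1) {
            printf("bit %d set: multiply by a^(2^%d) = %d\n", k, k, a);
            result *= a;
        }
        b /= 2;
        a *= a;   /* a now holds (original a)^(2^(k+1)) */
        k++;
    }
    return result;
}
```

For $b = 13 = 1101_2$ the set bits are $0$, $2$, $3$, so the factors are $a^1$, $a^4$, $a^8$, whose product is $a^{13}$.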

dxiv
  • 77,867
6

The algorithm uses the binary expansion of the exponent to reduce the number of multiplications one has to do. If you take $a$ and square it, then square the result, and so on, you produce the numbers $a, a^2, a^4, a^8,\ldots,a^{2^k}$ until $2^{k+1}>b$. So that takes about $\log_2 b$ multiplications. Then if you express $b$ in binary, say $b=1101001$, then $a^b = a^{2^6+2^5+2^3+2^0} = a^{2^6}a^{2^5}a^{2^3}a^{2^0}$, and you've just computed all of these powers of $a$, so multiply them together; that's about $\log_2 b$ more multiplications.
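The two phases described here can be written out literally (a sketch; `expo_two_phase` is an illustrative name, and overflow is ignored as in the question's code):

```c
/* Phase 1: tabulate squares[k] = a^(2^k) by repeated squaring.
 * Phase 2: multiply together the entries whose bit is set in b.
 * Each phase costs about log2(b) multiplications. */
int expo_two_phase(int a, int b) {
    int squares[32];              /* enough for a 32-bit exponent */
    squares[0] = a;
    for (int k = 1; (1 << k) <= b; k++)   /* stop once 2^k > b */
        squares[k] = squares[k - 1] * squares[k - 1];

    int result = 1;
    for (int k = 0; b; k++, b >>= 1)
        if (b & 1)                /* bit b_k is set */
            result *= squares[k];
    return result;
}
```

The code in the question fuses the two phases into one loop, so the table of squares never needs to be stored.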

B. Goddard
  • 33,728
  • Thank you very much. Could you please explain which part of the code picks out the binary bits that are 1 and then evaluates $a^{2^k}$? This is the part that still confuses me, but the overall approach is clear to me. – Avv Jun 03 '21 at 14:21
  • @Avra The code isn't really picking out the binary bits. It's using the method of converting to binary here: https://indepth.dev/posts/1019/the-simple-math-behind-decimal-binary-conversion-algorithms I think the best way is to work through a couple examples by hand. – B. Goddard Jun 03 '21 at 14:36
2

I hardly understand your code, but here is a description of the algorithm in pseudo-code. We denote by $S$ the successive values of the squares, and by $P$ the successive partial products of powers of the number $a$:

Input: $\;a, n$

Output: $\;a^n$

$S\leftarrow a$, $P\leftarrow 1$ ;

While $n>1$ do

$\qquad$If odd($n$) then $P\leftarrow P*S\enspace$ fi

$\qquad n\leftarrow \bigl\lfloor n/2\bigr\rfloor$

$\qquad S\leftarrow S^2$

Endwhile

$ P\leftarrow P*S$.

End
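A direct C transcription of this pseudo-code (a sketch; `expo_pseudo` is an illustrative name). Note that the accumulator $P$ must start at $1$: the loop ends when $n=1$, so the final multiplication by $S$ supplies the leading bit of $n$ (starting with $P=a$, already $n=2$ would come out as $a^3$):

```c
/* Pseudo-code from the answer above, for exponents n >= 1.
 * S runs through a, a^2, a^4, ...; P collects the factors for the
 * low bits; the final P*S accounts for the leading 1-bit of n. */
int expo_pseudo(int a, int n) {
    int S = a, P = 1;
    while (n > 1) {
        if (n % 2 == 1)   /* odd(n) */
            P *= S;
        n /= 2;           /* n <- floor(n/2) */
        S *= S;           /* S <- S^2 */
    }
    return P * S;         /* P <- P*S */
}
```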

Bernard
  • 179,256
  • do you find the code poorly formatted ? I'm sorry if that's the case, I will make appropriate edits. – jayno Aug 20 '16 at 17:11
  • That is not the point: I am only a mathematician, not a computer scientist, and I don't know the code. – Bernard Aug 20 '16 at 17:14
  • Somewhere in the translation from code to pseudocode some bugs crept in: when instantiated e.g. with n=2, a^4 is produced instead of a^2. – Jan Stout Aug 21 '16 at 00:39
  • @Jan Stout: You're right. I wrongly copied the order of the instructions in the loop. It's fixed now. Thanks for pointing it! – Bernard Aug 21 '16 at 01:04
1

If you want to do something in logarithmic time, you probably need to do something with the digits of the number in some base, rather than with the number itself. The digital representation of a number is inherently a logarithmy object. Additionally, since $b$ is the main thing that determines how big $a^b$ is, we're probably going to need to think of $b$ in a logarithmy way, or we're not going to have any hope of making our algorithm go logarithmic.

If you now consider $b$ in base $2$, this is the natural algorithm that falls out.

One may also be inspired by the binary search algorithm, which is to find an element in a sorted list. It works by dividing the list into two, determining whether the element is in the first (resp. second) half of the list, and repeating the algorithm using the now-half-sized list which is the first (resp. second) half of the original. This "divide and conquer" idea turns up all over the place if we want to do things in logarithmic time; the algorithm you have given is a natural answer to the question "can we divide-and-conquer this?".
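The divide-and-conquer reading corresponds directly to the recursive identity from the question; as a sketch in C for $n \ge 0$ (`expo_rec` is an illustrative name):

```c
/* Divide and conquer: halve the exponent, recurse, square the
 * result, and multiply by x once more when n is odd.  Recursion
 * depth, and hence multiplication count, is O(log n). */
int expo_rec(int x, int n) {
    if (n == 0)
        return 1;
    int half = expo_rec(x, n / 2);  /* x^(floor(n/2)) */
    if (n % 2 == 1)
        return x * half * half;     /* n odd:  x * (x^((n-1)/2))^2 */
    return half * half;             /* n even: (x^(n/2))^2 */
}
```

Unrolling this recursion from the least significant bit of $n$ upward gives the loop in the question.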

  • Maybe this should be a comment? I don't see how it solves or suggests an approach to the problem at hand. – jayno Aug 20 '16 at 17:06
  • @jayno I consider it to be a full answer to "This is supposed to compute $a^b$ in logarithmic time, but I couldn't understand or find anywhere a way to arrive at this procedure, other than by noting that…". What in particular is the problem you have, if not that, or is my explanation failing somehow? – Patrick Stevens Aug 20 '16 at 17:07
  • It seems that the problem is in my original post, sorry. I will edit. – jayno Aug 20 '16 at 17:14
0

Try to exponentiate by only adding exponents, and do so in the least number of steps. Then think about how to get back an actual exponent, so you can actually do the multiplication. The way it's done in the algorithm ensures that you can recycle every variable.

Let's define our function

slowExpo(x,y)

steps:

  1. Generate a sequence of exponents $y_i$ that sum up to $y$.
  2. Generate the sequence of powers $x^{y_i}$.
  3. Multiply together every member of the sequence from step 2.
  4. Return the output of step 3; it equals $x^y$.

This works because of the exponent addition laws.

$a^b * a^c = a^{b+c}$

The sequence generated in your algorithm is designed for binary and recycles every sequence member generated; it performs steps 1, 2 and 3 together, "in pseudo-parallel", in one loop. This reduces the amount of work, because the lists of sequence members never have to be stored and retrieved, and the powers $x^{y_i}$ don't have to be calculated recursively as slowExpo(x, y_i).
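The four steps above, written out literally and unfused (a sketch; `slowExpo` follows this answer's naming, the body is my guess at its intent, restricted to powers of two as in the binary case):

```c
/* Step 1: pick the powers of two y_i that sum to y.
 * Step 2: compute x^(y_i) for each, by repeated squaring.
 * Steps 3-4: multiply everything together; the product is x^y,
 * by the law a^b * a^c = a^(b+c). */
int slowExpo(int x, int y) {
    int exps[32], n = 0;
    for (int k = 0; (1 << k) <= y; k++)
        if (y & (1 << k))
            exps[n++] = 1 << k;          /* step 1: y_i */

    int pows[32];
    for (int i = 0; i < n; i++) {        /* step 2: x^(y_i) */
        int p = x;
        for (int e = exps[i]; e > 1; e /= 2)
            p *= p;
        pows[i] = p;
    }

    int result = 1;                      /* steps 3-4 */
    for (int i = 0; i < n; i++)
        result *= pows[i];
    return result;
}
```

Compared to the fused loop in the question, this version recomputes the repeated squarings for each $y_i$ and stores both sequences, which is exactly the work the one-loop version avoids.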

-1

Some comments on performance which sadly were too long for a comment.

  1. You can do bit shifts instead of division :

    b >>= 1; // same effect as b/=2, but uses logic instead of expensive integer div instr.
    

    This shifts all bits right one step. You may need to make sure the integer is non-negative (unsigned int), but it will definitely be faster than risking an integer "div" instruction.

  2. If your hardware can locate the highest set bit (e.g. via a "count leading zeros" instruction; the related "find first set bit" instruction is, for some reason, called ffs), you can use that to get a deterministic bound for your loop, which will remove branchery at the cost of about one pipelineable cpu cycle.

  3. Avoid ifs in loops; it may well be better to do

    result *= 1 + (b & 0x1) * (a - 1); // a when the low bit is set, 1 otherwise
    

    replacing the if with logic and arithmetic. (The more obvious result *= a*(b&0x1) would be wrong: it zeroes out result whenever the bit is clear.) Multiplications are very pipelineable and branches risk breaking pipelines. Help the compiler to make the instructions as predictable as possible.
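Putting the tips together (a sketch; `expo_branchless` is an illustrative name, using unsigned as suggested in tip 1; the branchless update relies on 1 + (b & 1)*(a - 1) evaluating to a when the low bit is set and to 1 otherwise):

```c
/* Same algorithm as the question, with the if replaced by
 * arithmetic and the division replaced by a right shift. */
unsigned expo_branchless(unsigned a, unsigned b) {
    unsigned result = 1;
    while (b) {
        result *= 1u + (b & 1u) * (a - 1u);  /* no branch in the loop body */
        b >>= 1;                             /* instead of b /= 2 */
        a *= a;
    }
    return result;
}
```

Whether this beats the branching version in practice depends on the compiler and CPU; modern compilers often perform this transformation themselves, so it's worth measuring.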

mathreadler
  • 26,534