I should like to evaluate $\log_2{256!}$ or other large numbers to find 'bits' of information. For example, I'd need three bits of information to represent the seven days of the week since $\lceil \log_2{7}\rceil = 3$, but my calculator returns an error for large numbers.
-
There's an important distinction between $\log_2 256!$ and $\lceil\log_2 256!\rceil$. If you only need an approximation, then Stirling is the way to go. If you need the actual value, then computers are more helpful (although there are refinements to Stirling). – Teepeemm Mar 17 '18 at 22:24
-
Your title is poor English (despite the fact you rejected an improvement to it). "Is there a (simple) way" would be better. – matt_black Mar 18 '18 at 19:19
-
It was fine as it was originally. In any case, 'improvement' is an opinion and 'simple' would be presumptuous. Let's stick to the mathematics. – Red Book 1 Mar 19 '18 at 00:44
-
Nope - "I should like" is fine. It's rather old-fashioned and perhaps more common in the UK ... e.g. https://english.stackexchange.com/questions/19438/can-one-say-i-should-like-rather-than-i-would-like-is-the-former-grammatica – MartinV Mar 19 '18 at 17:08
-
You say "or other large numbers" without saying what function produces those large numbers. If the numbers are the result of multiplications or exponentiations, then you should be able to apply the standard rules of logarithms. If you're characterizing large numbers by a function that uses something other than multiplication or exponentiation, say what the function is. Note that there exist well-defined functions from integers to very large integers that we cannot determine the log of even approximately. – Eric Lippert Mar 19 '18 at 18:00
-
@RedBook1 “Is there are way” is not grammatical. You probably meant either “Is there a way” or “Is there any way”. (Many of the other changes that people tried making to your question seem unnecessary though.) – ShreevatsaR Mar 19 '18 at 18:13
-
@ShreevatsaR Yes. You are right. That was my obvious oversight (I changed my mind and missed that) but I 'should' say the rest was fine. – Red Book 1 Mar 19 '18 at 23:08
-
One way to approximate it is by counting the number of primes less than that large number. If you're looking for a reasonable way to compute this, on the other hand... – Robin Goodfellow Mar 20 '18 at 12:42
-
What kind of calculator? If it is using floating point representation, your day of week problem is hopeless. If it has extended precision integer, your problem is trivial. Just do something like casting out nines in decimal arithmetic by casting out sevens in octal. – richard1941 Mar 21 '18 at 17:14
-
The logarithm problem is the opposite: if you have floating point representation, the problem is trivial... but some numbers are so large that they cannot be represented in floating point. If they are extended integers, you have more of a problem: count the number of bits, then do the rest of the math to get the fractional part of the base-2 log. For what it's worth, my favorite calculator for serious number crunching is the WP 34S. HP has discontinued the platform, but you can get a really good emulator that runs on an iPhone. – richard1941 Mar 21 '18 at 17:19
7 Answers
By the laws of logarithms
$$ \log_2(256!) = \sum_{i=1}^{256} \log_2(i)$$
This is easily evaluated (albeit not on all calculators).
In Python:
>>> import math
>>> sum(math.log2(i) for i in range(1,257))
1683.9962872242136
-
This seems to me like a better answer given the scale of the numbers the OP asks about, as it will be more accurate. – Davis Yoshida Mar 17 '18 at 22:04
-
@DavisYoshida: Actually, because of floating-point issues, this is in general likely to be less accurate than using Stirling's approximation to sufficiently many terms. – ShreevatsaR Mar 18 '18 at 08:55
-
(Hmm, computer experimentation doesn't seem to bear out my previous comment. Probably because of the division; need to look deeper… anyway, floating-point issues from adding up many terms are something to take into account. Note that for $n = 256$, the correct value is $1683.9962872242146…$ while adding up the logs as above seems to give $1683.9962872242136$, an error in the 12th decimal place. Probably not a big deal yet.) – ShreevatsaR Mar 18 '18 at 09:10
-
@ShreevatsaR As a rough rule of thumb, the error in adding up 256 floats of roughly the same magnitude should be around 256(machine_epsilon), which would be (for 64-bit floats) $256\cdot 2^{-53} \approx 2.8 \times 10^{-14}$. Your calculation suggests that the error is slightly worse than that (probably since the numerical method to approximate logs has its own error). It is an interesting question how large $N$ needs to be for round-off errors to make the calculation worthless. – John Coleman Mar 18 '18 at 12:28
-
@Davislor True. Out of curiosity I ran
microbenchmark(sum(log2(1:256)), sum(log2(2:256))) in R and see that on my machine you could save around 100 nanoseconds by starting with 2. Interestingly, 1+sum(log2(3:256)) does slightly worse than the simple sum(log2(1:256)) (probably because it isn't fully vectorized). – John Coleman Mar 18 '18 at 12:46 -
The error of adding floating point numbers depends on the order and the difference in magnitudes. In this calculation they are added from smallest to biggest. If you reversed the range, the result would be a lot less precise. – HopefullyHelpful Mar 18 '18 at 14:09
-
@HopefullyHelpful That is why I stipulated that the numbers being added all be of roughly the same magnitude.
sum(math.log2(x) for x in range(256,0,-1)) agrees with the original calculation out to 11 decimal places, so I wouldn't characterize it as "a lot" more imprecise. – John Coleman Mar 18 '18 at 14:18 -
Indeed, logarithms grow slowly, so this should be pretty good to relatively large N. – Kevin Mar 18 '18 at 16:53
-
Since you can group the terms in any way,
log(1*256) + log(2*255) + ... + log(127*129) + log(128) is an alternative way to calculate this. This halves the number of terms, while the magnitude of the terms now differs only by a factor of 64 instead of 256. – MSalters Mar 19 '18 at 11:54 -
@MSalters That is a clever idea. It would be especially useful for larger $N$. – John Coleman Mar 19 '18 at 11:56
-
On second thought, that odd log(128) term is a bit wrong. It should have been log(1*256) + log(2*255) + ... + log(128*129). As noted, you can include or exclude the 1 term, so it's always possible to create pairs. – MSalters Mar 19 '18 at 13:59 -
And the Python
>>> sum(math.log2(i*(257-i)) for i in range(1,129)) gives 1683.9962872242143, which indeed pushes the error back one digit. – MSalters Mar 19 '18 at 14:03 -
@MSalters Nice. I suspect that your approach will be reasonable even for fairly large $N$. – John Coleman Mar 19 '18 at 14:12
-
John, that thing in @MSalters's comment is actually a cleverly disguised "Gaussian" trick. – J. M. ain't a mathematician Mar 19 '18 at 14:40
-
@J.M.isnotamathematician: Well, not really - the terms here are not identical and need to be calculated individually. Note that I'm multiplying the sides of each pair, not adding them up. It's much closer to https://en.wikipedia.org/wiki/Pairwise_summation – MSalters Mar 19 '18 at 15:04
-
Why not
math.log2(math.factorial(256))? (Or even more simply, math.lgamma(257)/math.log(2).) – Mark Dickinson Mar 19 '18 at 15:34 -
@MarkDickinson Fair question. OP's question was about how to calculate log factorials given computational limitations which preclude direct evaluation. Using the laws of logarithms is a natural answer to that question. The 1-line Python snippet was purely illustrative. Python itself has good support for big-int computations, so in Python itself there isn't a strong motivation for doing this. In something like R (where
log2(factorial(256)) returns Inf and gives a warning message) it would be more motivated. – John Coleman Mar 19 '18 at 15:44 -
@MarkDickinson Although in R itself you would naturally use the
lgamma() approach. – John Coleman Mar 19 '18 at 15:57 -
@JohnColeman What you listed is the magnitude of the relative error, not the absolute one. So the result obtained is well within the error range. – Federico Poloni Mar 19 '18 at 16:47
-
@DavisYoshida Actually, the initial comment I made, which I later thought was wrong, was actually correct :-) Compared to writing
sum(math.log2(i) for i in range(1,257)), using Stirling's approximation and writing (n*math.log2(n) - n/math.log(2) + math.log2(n*math.pi*2)/2 + 1/(n*12*math.log(2)) - 1/(n**3*360*math.log(2))) (with n = 256) gives a number closer to the correct value. – ShreevatsaR Mar 19 '18 at 19:05 -
@ShreevatsaR at the time of my comment, I believe only the first two terms of the Stirling series were in the accepted answer, so at the time I believe I was correct – Davis Yoshida Mar 19 '18 at 20:10
-
Expected error grows as sqrt(n), not n. So for 256 terms, it should be around 16*machine_epsilon. – Acccumulation Mar 19 '18 at 21:10
-
I think that internally, the math module implements log2(x) as log(x)/log(2), so if you do
sum([math.log(i)+math.log(257-i) for i in range(1,129)])/math.log(2), you only have to do the log(2) division once, giving a very small improvement in accuracy and speed. And sum([math.log(i)+math.log(257-i) for i in range(1,129)])/math.log(2) seems to be faster than sum(math.log(i)+math.log(257-i) for i in range(1,129))/math.log(2). – Acccumulation Mar 19 '18 at 21:21 -
@Acccumulation:
log2 is easy for binary computers. The integral part is counting bits, and the fractional part is easy with $\log(x)=\frac{1}{16}\log(x^{16})$ – MSalters Mar 20 '18 at 01:03 -
@MSalters Your formatting could use some work. I assume you mean (1/16)(log(x^16)). This requires four multiplication operations, and gives you four bits. That isn't very precise, or efficient. – Acccumulation Mar 20 '18 at 15:12
-
@Acccumulation: Squaring is cheaper than multiplication, as it only take one argument. It can even be done in-place here, which means you need only one CPU register. As for precision, you're likely to hit representation limits first. 24 fractional bits is all you need for a single-precision IEEE754 float. – MSalters Mar 20 '18 at 15:33
-
@Acccumulation: You can see the implementation of
math.log2 here. It's implemented in a way that's usually more accurate than doing log(x)/log(2). – Mark Dickinson Mar 20 '18 at 20:29
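MSalters' pairing idea from the comments above is easy to try out. A small sketch in Python (standard library only), comparing the naive left-to-right sum with the paired version:

```python
import math

# Naive approach: sum the 256 logs left to right.
naive = sum(math.log2(i) for i in range(1, 257))

# Pairing trick: multiply the smallest remaining factor by the largest,
# so the 128 terms log2(i * (257 - i)) are much closer in magnitude.
paired = sum(math.log2(i * (257 - i)) for i in range(1, 129))

print(naive)   # sum of 256 terms
print(paired)  # sum of 128 paired terms
```

As discussed in the comments, both agree with the true value to roughly 12 decimal places, with the paired sum tending to push the rounding error back a digit since it halves the number of additions.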
If it's about factorials, you can use Stirling's approximation:
$$\ln(N!) \approx N\ln(N) - N$$
This follows from the fact that
$$N! \approx \sqrt{2\pi N}\ N^N\ e^{-N}$$
Error Bound
Writing the "whole" Stirling series as
$$\ln(n!)\approx n\ln(n)−n+\frac{1}{2}\ln(2\pi n)+\frac{1}{12n} −\frac{1}{360n^3}+\frac{1}{1260n^5}+\ldots $$
it is known that the error in truncating the series has the same sign as, and at most the same magnitude as, the first omitted term. Thanks to Robbins, we have the bounds:
$$\sqrt{2\pi}\, n^{n+1/2}e^{-n}\, e^{\frac{1}{12n+1}} < n! < \sqrt{2\pi}\, n^{n+1/2}\, e^{-n}\, e^{\frac{1}{12n}}$$
More on Stirling Series in Base $2$
Let's work out the Stirling series in base $2$, for example. The above approximation should be read as
$$\log_2(N!) \approx \log_2(\sqrt{2\pi N}\ N^N\ e^{-N})$$
Since the logarithm is no longer natural, this becomes
$$\log_2(N!) \approx \frac{1}{2}\log_2(2\pi N) + N\log_2(N) - N\log_2(e)$$
Hence one has to be very careful with the last term which is not $N$ anymore, but $N\log_2(e)$.
That being said one can proceed with the rest of Stirling series.
See the comments for numerical results.
Beauty Report
$$\color{red}{256\log_2(256) - 256\log_2(e) + \frac{1}{2}\log_2(2\pi\cdot 256) = 1683.9958175971615}$$
in very good accord with a numerical evaluation (for example in Wolfram Mathematica), which gives $\log_2(256!) = 1683.9962872242145$.
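The three-term base-2 formula is easy to check directly; a quick Python sketch (standard math module only; the comparison value uses Python's exact big-integer factorial):

```python
import math

n = 256

# Three-term Stirling approximation, written directly in base 2:
#   log2(n!) ~ n*log2(n) - n*log2(e) + (1/2)*log2(2*pi*n)
stirling = (n * math.log2(n)
            - n * math.log2(math.e)
            + 0.5 * math.log2(2 * math.pi * n))

# Reference value: compute 256! exactly as a big integer,
# then take a single floating-point log.
exact = math.log2(math.factorial(n))

print(stirling, exact)
```

The truncation error here (about $4.7\times 10^{-4}$) matches the first omitted term of the series, $\frac{1}{12n\ln 2} \approx 4.7\times 10^{-4}$ for $n=256$.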
-
@SolomonUcko you can use the laws of logarithms: $$ \log_2 n! = { \ln n! \over \ln 2 } \approx { n \cdot \ln n - n \over \ln 2 } $$ – Tobia Mar 18 '18 at 21:17
-
@SolomonUcko As Tobia suggested, you can use the change of base rule with ease. – Mar 18 '18 at 21:20
-
@Tobia, thanks. I knew there was something like that, but I didn't remember the details. Also, can it convert from any base, or just base $e$? – Solomon Ucko Mar 19 '18 at 00:28
-
@SolomonUcko Yes, it can convert from any base. The mnemonic rule is that the base is traditionally written below the word "log", so it goes on the bottom when you convert it to a fraction: $$ \log_{base} arg = { \log_x arg \over \log_x base } $$ and this is true for any $x$ (if the log exists, etc.) – Tobia Mar 19 '18 at 08:52
-
So what's the error on $\log_2 256!$? How many correct digits does this method produce? – Federico Poloni Mar 19 '18 at 16:44
-
I mean, how many provably correct digits does this method produce (with your error bound)? – Federico Poloni Mar 29 '18 at 14:14
In Emacs, C-x * * 256 ! 2 B N will readily deliver
1683.99628722
and of course you are free to increase the precision of your calculation. Calc is definitely fast enough that there isn't much incentive to design a solution outside of your editor.
-
@PeterA.Schneider Sure, it could be done with Vim, someone would just have to re-implement Emacs Calc, which is a pretty large piece of software by itself, in Vim. I suspect it was easier to implement in Emacs Lisp than it would be in Vimscript, though. :-) – ShreevatsaR Mar 19 '18 at 19:23
-
@PeterA.Schneider you can run commands or other programs, like
bc, from inside vi (and I suppose Vim). For example: :r ! echo "( l(8*a(1))/2 + (256+1/2)*(l(256)-1) + 1/2 ) / l(2)" | bc -l will calculate and bring back into the editor: 1683.99581759716152712952. Who needs emacs ... – ypercubeᵀᴹ Mar 19 '18 at 20:19 -
Amazing. I just spent 5 minutes admiring how this just works in Emacs, followed by about 10 minutes figuring out how to close the calculator window again, as it had meddled with my accustomed Vim bindings (Evil) without which I'm completely lost in Emacs... – leftaroundabout Mar 19 '18 at 21:14
-
@PeterA.Schneider Sure, if Vim is compiled with the +python option you can do it with ":py from math import *" followed by ":py print log(factorial(256))/log(2)". Much more readable than Emacs :P – Paul Evans Mar 20 '18 at 01:16
Just a comment:
Of course, there are many calculators that can handle $\log_2 256!$ and much "worse" expressions directly. For example, in PARI/GP, if you type
log(256!)/log(2)
you will get a result like:
%1 = 1683.9962872242146194061476793149931595
(the number of digits can be configured with the default called realprecision).
If you want an exact integer logarithm, you can also use logint(256!, 2) which will give you 1683.
Typing 256! alone will give you the full 507-digit decimal expansion of this integer.
If PARI/GP is allowed to use memory (set parisizemax default), it will also immediately say that logint(654321!, 2) is 11697270.
As noted in the comments, with reference to the answer by Charles, if you want to work with floating-point operations (and not huge exact integers), you can use the function lngamma, which equals $\log\circ\Gamma$ for positive real arguments. Remember that, compared to the factorial, the Gamma function is shifted by one. So
$$\log_2 n! = \frac{\log n!}{\log 2} = \frac{\log \Gamma(n+1)}{\log 2}$$
and you can type lngamma(654321 + 1)/log(2) in PARI/GP and everything will be floating point operations. This will work for astronomical values of $n$, for example lngamma(3.14e1000) is fine ($\log\Gamma(3.14\cdot 10^{1000})$).
-
The excellent PARI/GP also implements the much faster lngamma function (proposed by Charles), and lngamma(N+1)/log(2) should reduce the possibility of overflows for $N\gg 1$. – Raymond Manzoni Mar 19 '18 at 09:34
As others have mentioned, your example is small enough to be computed directly on many systems. I should add that many systems implement
$$
\log\Gamma(x)
$$
usually with names like lngamma or lgamma. You can then compute
$$
\log_2(256!)=\frac{\log\Gamma(257)}{\log 2}
$$
with minimal difficulty (and without leaving double precision).
In general some care is needed when working with the gamma function (branch cuts and numerical analysis), but in your case, as long as you stick to factorials of numbers from 1 to about $10^{305}$, you should be just fine with 64-bit doubles.
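As a concrete illustration, in Python this route is a one-liner around the standard library's math.lgamma:

```python
import math

def log2_factorial(n):
    """log2(n!) via the log-gamma function: since Gamma(n+1) = n!,
    we have log2(n!) = lgamma(n+1) / ln(2). Stays entirely in doubles."""
    return math.lgamma(n + 1) / math.log(2)

print(log2_factorial(256))
```

Unlike computing the factorial first, this stays finite far beyond the point where n! would overflow a double.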
-
Although only the positive integers are relevant for the question, it should be noted that $\log(\Gamma(x)) \not= \log\Gamma(x)$ in several common multiprecision programs. (For details see e.g.: D. E. G. Hare's "Computing the Principal Branch of log-Gamma".) This might also be the case for "normal" (double precision) calculators if they support complex numbers, so a bit of caution is advised. For example, the difference between
abs(lngamma(-3.4)) ~ 12.6163 and abs(log(gamma(-3.4))) ~ 1.1212 can cause some severe headaches if hidden deeply in a large formula. – deamentiaemundi Mar 19 '18 at 03:04 -
@deamentiaemundi Yes, good advice for the general case. When working with positive integers it should not matter. I'll edit in some general remarks. – Charles Mar 19 '18 at 03:12
If the number is not a factorial but rather an arbitrary large number, then a cute approach is to use the fundamental theorem of arithmetic,
which says that any positive integer $n$ can be written as $$n =\prod_i p_i^{a_i}$$ where the $p_i$ are prime numbers. Then by the laws of logarithms you get $$\log n= \sum_i a_i\log p_i. $$
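A small illustrative sketch of this idea in Python, using naive trial division for the factorization (which is of course the expensive step for genuinely large $n$; in practice the method shines when $n$ is already given in factored form):

```python
import math

def log2_via_factorization(n):
    """Compute log2(n) by summing a_i * log2(p_i) over the prime
    factorization n = prod p_i^a_i, found here by trial division."""
    total = 0.0
    p = 2
    while p * p <= n:
        while n % p == 0:   # pull out each factor of p
            total += math.log2(p)
            n //= p
        p += 1
    if n > 1:               # whatever remains is itself prime
        total += math.log2(n)
    return total

print(log2_via_factorization(360))  # 360 = 2^3 * 3^2 * 5
```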
Since this is a question about evaluating to get a result rather than understanding the method behind that result, the online computational knowledge engine WolframAlpha is always an option. It is a truly fantastic resource that gives the result to great accuracy almost instantly, without the need for programming or even mathematical experience.
-
If you're using Alpha (or Mathematica for that matter), you can use
LogGamma[256 + 1]/Log[2] instead of separately computing the factorial before taking the logarithm. – J. M. ain't a mathematician Mar 19 '18 at 12:24
