I'm trying to reproduce Excel's MULTINOMIAL function in C# so
$${MULTINOMIAL(a,b,..,n)} = \frac{(a+b +\cdots +n)!}{a!b! \cdots n!}$$
How can I do this without causing an overflow due to the factorials?
I'm trying to reproduce Excel's MULTINOMIAL function in C# so
$${MULTINOMIAL(a,b,..,n)} = \frac{(a+b +\cdots +n)!}{a!b! \cdots n!}$$
How can I do this without causing an overflow due to the factorials?
Another, more efficient way of computing the coefficients exactly that generally shouldn't overflow unless the result does is by using the characterization of the multinomial coefficient as a product of binomial coefficients: $${a+b+c+\cdots+n\choose a\;b\;c\cdots\;n} = {a+b\choose b}{a+b+c\choose c}\cdots{a+b+c+\cdots +n\choose n}$$ This is easy to prove by multiplying out the right hand side and noting that factors of $(a+b)!,\;(a+b+c)!,\;\ldots$ cancel out. This simplifies the problem of computing a multinomial coefficient to just computing a binary one - and there are several straightforward ways of doing that without overflow. The easiest may be use the recursive formula ${n+1\choose k+1} = \frac{n+1}{k+1}{n\choose k}$ along with some careful manipulation of the (exact) results to make sure you're always dividing out GCDs; for more information on this approach, see http://blog.plover.com/math/choose.html . Alternately, you can use Kummer's theorem that the power of a prime $p$ dividing ${a+b\choose b}$ is the number of carries when $a$ and $b$ are written as base-$p$ numbers and added. By doing this for a suitable range of primes, you can find the prime factorization of the result and thus the result itself (indeed, you could just accumulate the result by finding the dividing exponent for all the primes less than the top of your binomial coefficient in turn).
If your computer can handle \logarithms and exponentials of numbers, you could calculate the following: $$\log(\frac{(a+b +\cdots +n)!}{a!b! \cdots n!})=\log(a+b +\cdots +n)!)-\log(a!b! \cdots n!)=$$ $$\log(2)+\log(3)+\dots+\log(a+b +\cdots +n)-\log(a!)-\log(b!)-\ldots -\log(x!)=$$ $$\log(2)+\log(3)+\dots+\log(a+b +\cdots +n)-\log(2)-\ldots-\log(a)-\log(2)\ldots-\log(b)\ldots \log(2)-\ldots \log(x)$$
When you do all of these additions, you exponentiate the result using whatever base you started with (as written the base is ten, but you could choose another base).
As a historical note, this is essentially what people did before computers. They used $\log$ tables. I am not a computer programer, so I do not really know how feasible this is in practice or if their are better methods. You can also multiply huge numbers more quickly with a fast fourier transform, but that might be overkill.
It's question more about programming, but not mathematics :)
There are four solutions (also see code on github):
My implementation:
public static ulong Mutinomonal(params uint[] numbers)
{
uint numbersSum = 0;
foreach (var number in numbers)
numbersSum += number;
uint maxNumber = numbers.Max();
var denomFactorPowers = new uint[numbers.Max() + 1];
foreach (var number in numbers)
for (int i = 2; i <= number; i++)
denomFactorPowers[i]++;
for (int i = 2; i < denomFactorPowers.Length; i++)
denomFactorPowers[i]--; // reduce with nominator;
uint currentFactor = 2;
uint currentPower = 1;
double result = 1;
for (uint i = maxNumber + 1; i <= numbersSum; i++)
{
uint tempDenom = 1;
while (tempDenom < result && currentFactor < denomFactorPowers.Length)
{
if (currentPower > denomFactorPowers[currentFactor])
{
currentFactor++;
currentPower = 1;
}
else
{
tempDenom *= currentFactor;
currentPower++;
}
}
result = result / tempDenom * i;
}
return (ulong)result;
}
Naive implementation:
public static ulong Mutinomonal(params uint[] numbers)
{
uint numbersSum = 0;
foreach (var number in numbers)
numbersSum += number;
ulong nominator = Factorial(numbersSum);
ulong denominator = 1;
foreach (var number in numbers)
denominator *= Factorial(number);
return nominator / denominator;
}
public static ulong Factorial(ulong n)
{
ulong result = 1;
for (ulong i = 1; i <= n; i++)
result = result * i;
return result;
}
I've checked both implementations on $1, 2, 3, 4, 5, 6$ numbers and in last implementation overflow has been occurred, unlike the first.
UPDATE
I estimed @Baby Dragon answer and wrote this logarithm implemetation on C#:
public static List<double> Logarithms = new List<double>();
public static ulong Multinominal(params uint[] numbers)
{
uint numbersSum = 0;
foreach (var number in numbers)
numbersSum += number;
double sum = 0;
for (int i = 2; i <= numbersSum; i++)
{
if (i - 2 < Logarithms.Count)
sum += Logarithms[i - 2];
else
{
var log = Math.Log(i);
Logarithms.Add(log);
sum += log;
}
}
foreach (var number in numbers)
for (int i = 2; i <= number; i++)
sum -= Logarithms[i - 2];
return (ulong)Math.Exp(sum);
}
Another implementation (in pari/GP) avoids logarithms and works completely in integer mode. It is an extension of the method to compute the binomial sequentially : [update] $\binom {a+b}{a} = {a+1\over 1} \cdot{a+2\over 2} \cdot{a+3\over 3}\cdot{a+4\over 4}\cdots{a+b\over b} = (((((a+1)/1 \cdot (a+2) )/2 \cdot (a+3) )/ 3 \cdots(a+b) / b$. (This is also in the answer of Steven) [/update]
Here is the code:
{mulbin(v)=local(pr,g,maxf);
pr = vector(#v,r,1);
g = vector(#v); g[1]=1;for(k=2,#v,g[k]=g[k-1]+v[k-1]);
maxf=1; for(k=2,#v,maxf=max(maxf,v[k]));
for(f=1,maxf,
for(k=2,#v , if(f>v[k],next());
pr[k] *= g[k]; pr[k] /=f;
g[k] ++ ;
);
);
return(prod(k = 2,#v,pr[k])); }
gp> mulbin([8,2,3])
%293 = 12870
It is an optimization to provide the elements of the vector with the highest number first; because the outer loop (maxf) is the maximum of the second to last entries in the vector only.
The method has the further advantage that when one decides to use logs anyway: then because of the organization of the outer and the inner loop it needs not compute the log of a number twice (which is a costly operation). The code would simply change to:
{logmulbin(v)=local(logpr,g,maxf,logf);
logpr = vector(#v,r,0);
g = vector(#v); g[1]=1;for(k=2,#v,g[k]=g[k-1]+v[k-1]);
maxf=1; for(k=2,#v,maxf=max(maxf,v[k]));
for(f=1,maxf, logf=log(f);
for(k=2,#v , if(f>v[k],next());
logpr[k] += log(g[k]); logpr[k] -= logf;
g[k] ++ ;
);
);
return(sum(k = 2,#v,logpr[k])); }
gp> exp(logmulbin([8,3,2]))
%294 = 12870.0000000
The answer that Steven Stadnicki gave is simple to implement using the Python programming language; see this solution (and just updated in January / 2020).
Every multinomial coefficients can be broken down into a product of binomial coefficients. If it is critical to optimize the execution time, a thin/sparse/diagonal table of binomial coefficients could be stored in a lookup table stored in memory with the high level specification
$\tag 1 {\text{BC-Table}_{\;n,k} =\displaystyle {n \choose k}} \text{ where } k \approx \frac{n}{2}$
The program would employ logic to breakdown $\text{MULTINOMIAL}(a,b,..,n)$ into the product of at least one binomial coefficient and two other multinomial coefficients, trying to get the binomial factor to be in the table; if the controlling algorithm 'gives up on a table match', the work on the binomial piece would be passed to an algorithm using Steven's method.
The program would also employ parallel/concurrent programming methods on the generated pieces.
Note: One (of many) complete breakdowns of $\text{MULTINOMIAL}(a_1,a_2,\dots,a_k)$ into binomial coeficients is given by
$\tag 2 \displaystyle{\binom{n}{a_1,a_2,\dots,a_k} = \binom{n}{a_1}\binom{n-a_1}{a_2}\binom{n-a_1-a_2}{a_3}\cdots \binom{a_k}{a_k}}$
Using $\text{(2)}$ we can write
$\quad \displaystyle{\text{MULTINOMIAL}(1,2,3,4,5,6,7) = \binom{28}{1}\binom{27}{2}\binom{25}{3}\binom{22}{4}\binom{18}{5}\binom{13}{6}\binom{7}{7}}$
But our (hypothetical) program would begin with
$\tag 3 \displaystyle{\text{MULTINOMIAL}(1,2,3,4,5,6,7) =}$
$\quad \quad \quad \displaystyle{\text{MULTINOMIAL}(14,14) \times \text{MULTINOMIAL}(1,2,5,6) \times \text{MULTINOMIAL}(3,4,7)}$
If you need to check $\text{(3)}$ see wolfram.
Completing the decomposition into binomial coefficients,
$\tag 4 \displaystyle{\text{MULTINOMIAL}(1,2,5,6) =}$
$\quad \quad \quad \displaystyle{\text{MULTINOMIAL}(7,7) \times \text{MULTINOMIAL}(1,6) \times \text{MULTINOMIAL}(2,5)}$
and
$\tag 5 \displaystyle{\text{MULTINOMIAL}(3,4,7) =}$
$\quad \quad \quad \displaystyle{\text{MULTINOMIAL}(7,7) \times \text{MULTINOMIAL}(3,4) \times \text{MULTINOMIAL}(7)}$