
If I roll, say, 20 dice, what is the probability that at least 5 of them will be the same?

Specifically, I am not asking for the probability of e.g. rolling exactly 5 sixes out of 20 dice. For that I believe I could use the binomial distribution and arrive at ~12.9%.
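For reference, the ~12.9% figure is consistent with the binomial probability of exactly five sixes in 20 rolls (a worked check, with $X$ introduced here as the number of sixes):

$$P(X = 5) = \binom{20}{5} \left( \frac{1}{6} \right)^5 \left( \frac{5}{6} \right)^{15} \approx 0.129$$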

I have made a Monte Carlo simulation in Python, in which I rolled 20 dice a million times. For each iteration (1 iteration = one roll of 20 dice), I took the highest number of occurrences of any single face, ignoring which face it was. I then counted how often each value of that maximum occurred and computed the cumulative probabilities. From my simulation, I arrived at a ~92.8% probability that at least 5 of the 20 dice show the same face.
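A simulation along these lines might look like the minimal sketch below. It is not the code actually used for the question: the function names `max_face_count` and `estimate_at_least`, the default trial count, and the fixed seed are all assumptions made for illustration.

```python
import random
from collections import Counter


def max_face_count(n_dice, rng, sides=6):
    """Roll n_dice dice once and return the largest number of equal faces."""
    rolls = [rng.randint(1, sides) for _ in range(n_dice)]
    return max(Counter(rolls).values())


def estimate_at_least(k, n_dice=20, trials=100_000, seed=0, sides=6):
    """Estimate P(some face appears at least k times among n_dice dice)."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    hits = sum(max_face_count(n_dice, rng, sides) >= k for _ in range(trials))
    return hits / trials


if __name__ == "__main__":
    # Should land near the ~92.8% reported above, up to sampling noise.
    print(estimate_at_least(5))
```

Changing `k` and `n_dice` covers other cases, e.g. `estimate_at_least(10, n_dice=30)`.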

I would love to see how this could be calculated using a specific formula, similar to the binomial distribution, so that I could reproduce it and be able to calculate e.g. probability of having at least 10 the same out of 30 etc.

Many thanks in advance for your advice!

Maciej
  • Not so bad in this case, as there are relatively few ways to fail. Some faces must occur $4$ times...count the number of those that do. You could have $\{4^5,0\}$ meaning that five faces occur $4$ times each and one face does not occur. Or $\{4^4,3,1\}$, $\{4^4,2,2\}$ and so on. Easy to enumerate and easy to get the probability of each. Of course, this method becomes unwieldy as the numbers grow. – lulu Nov 09 '21 at 12:56
  • Hey @lulu, I see where your suggestion is going! I am just starting to learn all about probability, so could you help me and give an example of how you would count the number of combinations for the examples you provided? Thanks in advance! – Maciej Nov 09 '21 at 13:25
  • It's actually easier to enumerate them than it is to count them (unfortunately, as that makes it hard to be sure that you got a full list of them). I wrote down most of them already. Left off $\{4^3,3^2,2\}$ and $\{4^2,3^4\}$. I think that's all of them. – lulu Nov 09 '21 at 13:27
  • here is a discussion of the generalized counting problem. As you can see, it isn't pleasant. – lulu Nov 09 '21 at 13:30
  • I see - so in the end, being able to conduct a Monte Carlo simulation isn't a bad approach. Thanks for your help! – Maciej Nov 09 '21 at 13:35
  • Oh, absolutely. I would simulate this, as you have done. If nothing else, your simulator will (I assume) be flexible...so that if someone changes the $20$ to a $21$, you can quickly modify the code. Note that my counting scheme isn't very flexible at all. – lulu Nov 09 '21 at 13:37
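
The enumeration discussed in the comments above can be checked by brute force. The sketch below is only an illustration under stated assumptions: the helper names `failing_patterns` and `prob_no_face_more_than` are hypothetical, and it loops over every assignment of counts $0$ to $4$ to the six faces rather than counting patterns by hand. Each valid assignment is weighted by the multinomial coefficient $20!/(c_1! \cdots c_6!)$.

```python
from fractions import Fraction
from itertools import product
from math import factorial


def failing_patterns(cap=4, n_dice=20, sides=6):
    """Distinct multisets of face counts, each <= cap, that sum to n_dice."""
    return sorted(
        {tuple(sorted(c, reverse=True))
         for c in product(range(cap + 1), repeat=sides)
         if sum(c) == n_dice},
        reverse=True,
    )


def prob_no_face_more_than(cap=4, n_dice=20, sides=6):
    """Exact P(no face appears more than cap times) by summing multinomials."""
    total = 0
    for counts in product(range(cap + 1), repeat=sides):
        if sum(counts) != n_dice:
            continue
        ways = factorial(n_dice)
        for c in counts:
            ways //= factorial(c)  # exact: partial quotients stay integers
        total += ways
    return Fraction(total, sides ** n_dice)


if __name__ == "__main__":
    for pattern in failing_patterns():
        print(pattern)
    print(1 - prob_no_face_more_than())
```

For 20 dice and a cap of 4, this finds exactly the five patterns listed in the comments. The brute-force loop grows quickly with the cap and the number of faces, which is the inflexibility mentioned above.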

1 Answer


Such problems can be solved with generating functions, but it's best to have a computer algebra system around to do the heavy work. Readers interested in learning about generating functions can find many resources in the answers to this question: How can I learn about generating functions?

Consider the complementary problem: what is the probability that a die is rolled $20$ times and no face appears more than $4$ times? The exponential generating function for the probability that a die is rolled $n$ times and no face appears more than $4$ times is
$$f(x) = \left(1 + \frac{x}{6} + \frac{1}{2!} \left( \frac{x}{6} \right)^2 + \frac{1}{3!} \left( \frac{x}{6} \right)^3 + \frac{1}{4!} \left( \frac{x}{6} \right)^4 \right)^6$$
The probability we want is $20!\; [x^{20}]f(x)$, where $[x^{20}]f(x)$ is the coefficient of $x^{20}$ when $f(x)$ is expanded. This is where a computer algebra system is handy. (I used Mathematica.) The result is
$$ 20!\; [x^{20}]f(x) = \frac{151355579375}{2115832430592} \approx 0.0715348$$
So the answer to the original problem, the probability that at least one face appears $5$ or more times, is $1 - 0.0715348 \approx 0.928465$.
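
Without a computer algebra system, the same coefficient can be extracted with exact rational arithmetic. The following is a minimal Python sketch, not the Mathematica computation used above: the function name `prob_max_count_at_most` and its parameters are assumptions for illustration. It multiplies the truncated exponential series by itself once per face and reads off $n!\,[x^n]f(x)$.

```python
from fractions import Fraction
from math import factorial


def prob_max_count_at_most(cap, n_dice=20, sides=6):
    """Exact P(no face appears more than cap times) via n! [x^n] f(x)."""
    # One face contributes sum_{j <= cap} (x / sides)^j / j!
    base = [Fraction(1, factorial(j) * sides ** j) for j in range(cap + 1)]
    poly = [Fraction(1)]
    for _ in range(sides):
        # Multiply by the one-face series, dropping terms above x^n_dice.
        new = [Fraction(0)] * min(len(poly) + cap, n_dice + 1)
        for i, a in enumerate(poly):
            for j, b in enumerate(base):
                if i + j <= n_dice:
                    new[i + j] += a * b
        poly = new
    coeff = poly[n_dice] if n_dice < len(poly) else Fraction(0)
    return factorial(n_dice) * coeff


if __name__ == "__main__":
    p = prob_max_count_at_most(4)
    print(p, float(1 - p))
```

For the other case raised in the question, `1 - prob_max_count_at_most(9, n_dice=30)` would give the probability of at least 10 equal faces among 30 dice.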

awkward
  • Thank you very much for explaining this, @awkward! Also, thanks for actually calculating the result - now I know my simulation is correct!

    All in all, it seems that for me personally it might be better to use simulations.

    Nevertheless, it is impressive how you are able to actually come up with a function for this!

    – Maciej Nov 09 '21 at 13:51