47

Description

In the game of Yahtzee, 5 dice are rolled to determine a score. One of the resulting rolls is called a Yahtzee.

To roll a Yahtzee you must have 5 of a kind (five 1's, five 2's, five 3's, etc.).

In the game of Yahtzee you can only have 5 dice. However, for the purpose of this question I want to entertain adding more dice to the equation. Therefore I'd like to define a Yahtzee as follows:

To roll a Yahtzee you must have exactly 5 of a kind, no more and no less (five 1's, five 2's, five 3's, etc.).

Examples

Let's look at some rolls with 6 dice

The following would be a Yahtzee:

1 1 1 1 1 4

6 3 3 3 3 3

5 5 3 5 5 5

The following would not be a Yahtzee:

1 1 1 3 3 3

1 1 1 1 5 3

1 1 1 1 1 1

- Note that the last roll does technically contain 5 1's; however, because the roll as a whole contains 6 1's, it is not a Yahtzee.


Let's look at some rolls with 12 dice

The following would be a Yahtzee:

1 1 2 1 2 1 4 4 1 3 6 2

1 1 1 1 1 2 2 2 2 2 3 3

1 1 1 1 1 2 2 2 2 2 2 2

- Note that the first roll is a Yahtzee with 5 1's; it illustrates that order doesn't matter.

- Note that the second roll has 2 Yahtzees (5 1's and 5 2's); it still counts as a Yahtzee.

- Note that the third roll has a Yahtzee with 1's but has 7 2's. This roll is a Yahtzee because it contains exactly 5 1's. The 7 2's do not nullify this roll.

The following would not be a Yahtzee:

1 1 1 2 2 2 3 3 3 4 4 4

1 1 1 1 1 1 6 6 6 6 6 6

- Note that the last roll has 6 1's and 6 6's. Because exactly 5 of one number (no more, no less) is not present, this roll does not contain a Yahtzee.
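The definition and examples above can be turned into a small check. The following Python sketch is not part of the original question; the function name `is_yahtzee` and the parameter `y` (the required count, 5 here) are illustrative assumptions.

```python
from collections import Counter


def is_yahtzee(roll, y=5):
    """Return True if some face appears exactly y times in the roll."""
    # Counter maps each face to how many times it occurs; order is irrelevant.
    return y in Counter(roll).values()


# Rolls taken from the examples above.
print(is_yahtzee([1, 1, 1, 1, 1, 4]))                    # True
print(is_yahtzee([1, 1, 1, 1, 1, 1]))                    # False (six 1's)
print(is_yahtzee([1, 1, 2, 1, 2, 1, 4, 4, 1, 3, 6, 2]))  # True (five 1's)
print(is_yahtzee([1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2]))  # True (seven 2's do not matter)
print(is_yahtzee([1, 1, 1, 1, 1, 1, 6, 6, 6, 6, 6, 6]))  # False
```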

The Question

What number of dice maximizes the probability of rolling a Yahtzee in one roll?

A more general form of the question is as follows: given $n$ dice, what is the probability of rolling a Yahtzee of length $y$ (exactly $y$ of a kind) in one roll?
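For small $n$ the general question can be answered by brute force, which gives reference values to test any closed formula against. This is only a sketch under the definition above; the name `yahtzee_probability_bruteforce` and the `sides` parameter are assumptions made for illustration, and the enumeration visits all `sides**n` rolls, so it is practical only for a handful of dice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def yahtzee_probability_bruteforce(n, y=5, sides=6):
    """Exact probability that some face appears exactly y times among n dice."""
    hits = sum(
        1
        for roll in product(range(1, sides + 1), repeat=n)  # every possible roll
        if y in Counter(roll).values()                       # exact-count test
    )
    return Fraction(hits, sides**n)


for n in (5, 6, 7):
    print(n, yahtzee_probability_bruteforce(n))
```

With 5 dice only the six rolls with all faces equal qualify, so the result there should be $6/6^5 = 1/1296$.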

RobPratt
  • 5
    Is 11111222222222 ok or not? – Emil Jeřábek Feb 06 '20 at 16:24
  • @EmilJeřábeksupportsMonica That roll is okay. I will add an example that covers this. – Michael King Feb 06 '20 at 16:30
  • 3
    is 121212121 a Yahtzee? (PS: I'm not quite sure what is the research angle...? PPS: did you try to estimate this numerically?) – ARG Feb 06 '20 at 18:19
  • 9
    This is oddly difficult. The probability of a Yahtzee from exactly five 1's out of $n$ dice is $$\binom{n}{5}\, 5^{n-5} / 6^n.$$ For integer $n$ this is maximized at both 29 and 30. So the decision between them has to be based on, e.g. the probability of simultaneous Yahtzees with both 1s and 2s starting from 29 vs 30 dice. –  Feb 06 '20 at 18:32
  • Even adding one layer of inclusion-exclusion does not resolve the ambiguity: https://www.wolframalpha.com/input/?i=Table%5BN%5B6+Binomial%5Bn%2C5%5D+5%5E%28n-5%29%2F6%5En+-+15+Binomial%5Bn%2C5%5DBinomial%5Bn-5%2C5%5D+4%5E%28n-10%29%2F6%5En%2C15%5D%2C+%7Bn%2C29%2C30%7D%5D –  Feb 06 '20 at 18:39
  • 2
    @ARG 121212121 is a Yahtzee with 5 1's. There is no research angle on this problem. I thought of the problem during a drive, then proceeded to spend hours with some friends with Math degrees trying to model the probability. We haven't been able to model it yet. – Michael King Feb 06 '20 at 18:54
  • 1
    @MattF. My friend and I have gotten to the same point. We thought we had something at the equation you posted, but of course quickly realized that when you multiply it by 6, you start double counting Yahtzees. – Michael King Feb 06 '20 at 18:55
  • 1
    I wrote a blog post that discusses this and related questions in some detail and generality. For example, it says “Here's the corresponding graph and table for rolling the AABBCDEF pattern on eight dice … you are more likely to roll two pair with eight 11-sided dice than you are with eight of any other sort of dice.” The article is equipped with a tabulator that will produce tables of the probability of rolling various patterns with various types and numbers of dice. https://blog.plover.com/math/yahtzee.html – MJD Feb 08 '20 at 15:45
  • (Of course, in real Yahtzee, you don't get one roll, but three, which complicates the analysis tremendously.) – MJD Feb 08 '20 at 15:48
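The tie at 29 and 30 dice mentioned in the comments can be checked with exact rational arithmetic. The sketch below is an illustration only; `single_face_probability` is a hypothetical helper name, and it evaluates just the single-face term $\binom{n}{5}\,5^{n-5}/6^n$, not the full Yahtzee probability.

```python
from fractions import Fraction
from math import comb


def single_face_probability(n, y=5, sides=6):
    """P(one fixed face shows exactly y times among n dice)."""
    if n < y:
        return Fraction(0)
    # Choose the y positions, fill the other n - y dice with the remaining faces.
    return Fraction(comb(n, y) * (sides - 1) ** (n - y), sides**n)


values = {n: single_face_probability(n) for n in range(5, 101)}
best = max(values.values())
print([n for n, p in values.items() if p == best])  # [29, 30]
```

The tie follows from the ratio of consecutive terms, $\frac{n+1}{n-4}\cdot\frac{5}{6}$, which equals 1 exactly at $n = 29$.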

2 Answers

56

By inclusion-exclusion, the full probability of Yahtzee is: $$\frac{1}{6^n}\sum_{k=1}^{\min(6,n/5)} (-1)^{k+1} \binom{6}{k} (6-k)^{n-5k} \prod_{j=0}^{k-1} \binom{n-5j}{5}.$$

If you prefer, write the product with a multinomial: $$\prod_{j=0}^{k-1} \binom{n-5j}{5}=\binom{n}{5k}\binom{5k}{5,\dots,5}.$$

Looks like $n=29$ is the uniquely optimal number of dice:

$$\begin{array}{c|c} n & p \\ \hline 28 & 0.71591452705020 \\ 29 & 0.71810623718825 \\ 30 & 0.71770441391497 \end{array}$$

[Figure: scatter plot of $p$ against $n$, with a reference line at $n=29$]

Here is the SAS code I used:

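proc optmodel;
   /* n = number of dice */
   set NSET = 1..100;
   /* p[n] = inclusion-exclusion probability of at least one exact five of a kind; */
   /* the if-clause covers k = 6, n = 30, where (6-k)^(n-5*k) would be 0^0 */
   num p {n in NSET} = 
      (1/6^n) * sum {k in 1..min(6,n/5)} (-1)^(k+1) 
      * comb(6,k) * (if k = 6 and n = 5*k then 1 else (6-k)^(n-5*k)) 
      * prod {j in 0..k-1} comb(n-5*j,5);
   print p best20.;
   /* save the probabilities for plotting */
   create data outdata from [n] p;
quit;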

proc sgplot data=outdata;
   scatter x=n y=p;
   refline 29 / axis=x;
   xaxis values=(0 20 29 40 60 80 100);
run;
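For readers without SAS, the same inclusion-exclusion sum can be evaluated in Python with exact fractions. This is a minimal sketch rather than the code used for the answer; the function name `yahtzee_probability` and the generalization to `y` of a kind on `sides`-sided dice are assumptions added here.

```python
from fractions import Fraction
from math import comb, prod


def yahtzee_probability(n, y=5, sides=6):
    """Probability that at least one face appears exactly y times among n dice."""
    total = 0
    for k in range(1, min(sides, n // y) + 1):
        # Ways to place k disjoint groups of y equal dice among the n positions.
        placements = prod(comb(n - y * j, y) for j in range(k))
        # Remaining dice use the other sides - k faces (Python evaluates 0**0 as 1).
        rest = (sides - k) ** (n - y * k)
        total += (-1) ** (k + 1) * comb(sides, k) * placements * rest
    return Fraction(total, sides**n)


for n in (28, 29, 30):
    print(n, float(yahtzee_probability(n)))

print(max(range(5, 101), key=yahtzee_probability))  # 29
```

Because Python integers and `Fraction` are exact, no special case is needed for $k=6$, $n=30$, and the printed floats can be compared directly with the table above.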
RobPratt
  • 1
    If you write the probability as $$\sum_{k=1}^{\min(6,n/5)} f(n,k)$$ then $f(29,k)=f(30,k)$ for $k=1,2,3,4,5$. So the only difference between the sums is that $f(29,6)$ is omitted while $f(30,6)$ is negative, i.e. with 30 dice you have to exclude some cases with 6 Yahtzees. Meanwhile, the reason for the equalities is that $f(30,k)/f(29,k)$ includes factors of $(6-j)/(5-j)$ for each of the binomials in the final product, and these telescope to $6/(6-k)$, balancing out the factor of $(6-k)/6$ outside the binomial product. –  Feb 07 '20 at 14:22
  • 7
    @WGroleau, the event is exactly five, not at least five. Otherwise, 25 dice would be enough to get 100% probability. – RobPratt Feb 07 '20 at 16:40
  • 22
    For those not so fluent with inclusion-exclusion: The $k$th term (without the minus sign) is the number of ways to get at least $k$ numbers exactly $5$ times. There are $6 \choose k$ ways to pick which numbers appear exactly $5$ times, and $(6-k)^{n-5k}$ ways to populate the rest of the roll with the other $6-k$ numbers. The product term is the number of ways to pick $5k$ locations for the $k$ five-of-a-kinds (this is the $n\choose 5k$ factor) multiplied by the number of ways to populate the chosen locations with $5$ copies of each of the $k$ chosen numbers (this is the multinomial factor). – Timothy Chow Feb 06 '20 at 22:14
  • I don’t understand these numbers. If you roll thirty six-sided dice, the probability of having five the same is 100%. – WGroleau Feb 07 '20 at 16:16
  • @RobPratt wow! I know what I'm about to ask wasn't required to answer the question, but would you be willing to show your work? It would be helpful to me as a learning tool!

    I'm going to have my friend look over what you just posted before I accept as the answer. :)

    – Michael King Feb 06 '20 at 20:16
  • Updated just now. – RobPratt Feb 06 '20 at 20:02
  • I added the SAS code just now. The formula itself arises from the principle of inclusion and exclusion. The summation index $k$ corresponds to the number of fives of a kind. – RobPratt Feb 06 '20 at 20:23
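The observation in the comments that $f(29,k)=f(30,k)$ for $k=1,\dots,5$, with only $f(30,6)$ breaking the tie, can be confirmed term by term. The helper `term` below is a hypothetical name for the signed $k$th summand of the formula in the answer; it is a sketch for checking, not part of the original discussion.

```python
from fractions import Fraction
from math import comb, factorial, prod


def term(n, k, y=5, sides=6):
    """Signed k-th inclusion-exclusion term f(n, k); requires y * k <= n."""
    placements = prod(comb(n - y * j, y) for j in range(k))
    rest = (sides - k) ** (n - y * k)  # 0**0 == 1 covers n = 30, k = 6
    return Fraction((-1) ** (k + 1) * comb(sides, k) * placements * rest, sides**n)


for k in range(1, 6):
    print(k, term(29, k) == term(30, k))  # True for every k

# With 30 dice there is one extra, negative term: all six faces appear exactly 5 times.
print(float(term(30, 6)))
```

The extra term equals $-30!/\big((5!)^6\,6^{30}\big)$, which is about $-0.0004$ and matches the gap between the tabulated values for 29 and 30 dice.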
11

As an alternative approach, we can use the symbolic method to deduce that the generating function for the class of all rolls not containing a Yahtzee is given by

$$ f(z) = (e^z - z^5/5!)^6 $$

while the generating function for all rolls is

$$ g(z) = (e^z)^6. $$

The probability that a roll of $n$ dice yields a Yahtzee is given by

$$ 1-[z^n]f(z)/[z^n]g(z). $$

Using Mathematica:

f[z_] := (Exp[z] - z^5/5!)^6;
g[z_] := Exp[z]^6;
ans[n_] := 
  1 - SeriesCoefficient[f[z], {z, 0, n}]/
    SeriesCoefficient[g[z], {z, 0, n}];
DiscretePlot[ans[n], {n, 10, 40}]

[Figure: discrete plot of ans[n] for n from 10 to 40]
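The generating-function computation can also be reproduced without Mathematica by multiplying truncated power series with exact coefficients. The following Python sketch assumes the same exponential generating functions as above; the name `yahtzee_probability_egf` and the parameters `y` and `sides` are illustrative additions.

```python
from fractions import Fraction
from math import factorial


def yahtzee_probability_egf(n, y=5, sides=6):
    """1 - [z^n] (e^z - z^y/y!)^sides / [z^n] e^(sides*z), computed exactly."""
    # Series of e^z - z^y/y! truncated at degree n: drop only the z^y term.
    base = [Fraction(0) if m == y else Fraction(1, factorial(m)) for m in range(n + 1)]
    power = [Fraction(1)] + [Fraction(0)] * n  # the constant series 1
    for _ in range(sides):
        nxt = [Fraction(0)] * (n + 1)
        for i, a in enumerate(power):
            if a == 0:
                continue
            for j in range(n + 1 - i):  # keep only terms of degree <= n
                nxt[i + j] += a * base[j]
        power = nxt
    # n! * [z^n] f(z) counts the rolls in which no face appears exactly y times.
    no_yahtzee_rolls = power[n] * factorial(n)
    return 1 - no_yahtzee_rolls / sides**n


print(float(yahtzee_probability_egf(29)))
```

For 29 dice this should agree with the value $0.71810623718825$ obtained by inclusion-exclusion in the other answer.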