6

Suppose I am rolling a die repeatedly, and I keep a tally of how many times each number has come up.

As soon as a number has come up 3 times, the game is over. It does not need to be 3 times in a row - the tally just needs to reach 3.

What is the expected number of rolls in a given game?

From simulation, I get an answer of approximately 7.29, but I'm trying to figure out how to solve it exactly.

I'm having trouble even beginning to frame this, so any help would be appreciated.
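
For reference, a simulation of the game described above could look like the following minimal sketch. The function names, the trial count, and the seed are illustrative assumptions, not the code that produced the 7.29 figure.

```python
import random

def play_game(rng, sides=6, target=3):
    """Roll until some face's tally reaches `target`; return the number of rolls."""
    tally = [0] * sides
    rolls = 0
    while True:
        face = rng.randrange(sides)  # uniform face index in 0..sides-1
        tally[face] += 1
        rolls += 1
        if tally[face] == target:
            return rolls

def estimate_mean_rolls(trials=100_000, seed=0):
    """Monte Carlo estimate of the expected game length."""
    rng = random.Random(seed)
    return sum(play_game(rng) for _ in range(trials)) / trials

if __name__ == "__main__":
    print(estimate_mean_rolls())
```

The estimate fluctuates from run to run, which is why an exact method is worth having.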

jwd
  • 473
  • 5
    Alternatively to the solution already posted, Poissonization yields $$\int_0^\infty 6 e^{-6t}\left(1+t+\tfrac12t^2\right)^6dt=\frac{4084571}{559872}.$$ More generally, the mean time to get at least $n+1$ times one number is $$\int_0^\infty 6 e^{-6t}\left(1+t+\tfrac12t^2+\cdots+\tfrac1{n!}t^n\right)^6dt.$$ – Did Feb 24 '15 at 18:43
  • 1
    @Did: this appeals to me, thanks. I guess I will have to learn what "Poissonization" means (: – jwd Feb 24 '15 at 19:30
  • @Did, do you mind posting that as an answer? I have been unable to understand your (obviously correct) logic – Cam.Davidson.Pilon May 25 '16 at 03:23
  • 1
    @Cam.Davidson.Pilon Consider that each possible result $i$ from a collection of $r$ (for the die, $r=6$) is produced according to an independent Poisson process with intensity $1$. Let $T$ denote the first time any result appeared at least $n+1$ times (in the question, $n=2$). Then $T=\min\limits_{1\leqslant i\leqslant r}T_i$ where $T_i$ denotes the $n+1$th time result $i$ happens hence, for every positive $t$, $P(T>t)=P(T_1>t)^r=P(N_t\leqslant n)^r$ where $N_t$ is a Poisson random variable with parameter $t$. Finally, the mean number of events until time $T$ is $rE(T)$ hence ... – Did May 25 '16 at 06:18
  • 3
    ... the mean number of rolls in a given game is $$rE(T)=r\int_0^\infty P(T>t)dt=r\int_0^\infty P(N_t\leqslant n)^rdt=r\int_0^\infty \left(e^{-t}\sum_{k=0}^n\frac{t^k}{k!}\right)^rdt.$$ – Did May 25 '16 at 06:18
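
As a sketch of how the integral in the comments above reduces to a finite sum (the coefficients $c_m$ are notation introduced here for illustration, not part of the comments), write $\left(\sum_{k=0}^n t^k/k!\right)^r=\sum_{m=0}^{rn}c_mt^m$ and use $\int_0^\infty t^me^{-rt}dt=m!/r^{m+1}$:

$$r\int_0^\infty e^{-rt}\sum_{m=0}^{rn}c_mt^m\,dt=\sum_{m=0}^{rn}\frac{m!\,c_m}{r^m}.$$

For the die ($r=6$, $n=2$) this is a sum of 13 terms, and the term $m!\,c_m/6^m$ is the probability that no face has come up three times after $m$ rolls.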

3 Answers

4

For $X = (x_1, x_2, \ldots, x_6) \in \{0,1,2,3\}^6$ let $F(X)$ be the expected number of rolls starting in a state where each number $i$ has appeared $x_i$ times. You want $F(0,\ldots,0)$. You have $F(x_1,\ldots,x_6) = 0$ if $\max(x_1, \ldots, x_6) = 3$, otherwise

$$F(x_1,\ldots, x_6) = 1 + \dfrac{1}{6} \left(F(x_1+1,x_2,\ldots,x_6) + \ldots + F(x_1,\ldots,x_5,x_6+1)\right) $$

Maple gives me $F(0,\ldots,0) = \dfrac{4084571}{559872} \approx 7.295544339$.
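
The recursion above can also be evaluated outside Maple. The following is a minimal Python sketch under stated assumptions: the names `expected_rolls`, `SIDES`, and `TARGET` are illustrative, and sorting the state is an added shortcut that relies on the faces being interchangeable.

```python
from fractions import Fraction
from functools import lru_cache

SIDES = 6   # faces on the die
TARGET = 3  # the game ends when some tally reaches this value

@lru_cache(maxsize=None)
def expected_rolls(state):
    """F(state): expected remaining rolls, with state a sorted tuple of tallies."""
    if max(state) == TARGET:
        return Fraction(0)
    total = Fraction(0)
    for i in range(SIDES):
        nxt = list(state)
        nxt[i] += 1
        # Sorting merges states that differ only by relabelling the faces.
        total += expected_rolls(tuple(sorted(nxt)))
    # The leading 1 counts the roll that is about to be made.
    return 1 + total / SIDES

answer = expected_rolls((0,) * SIDES)
print(answer, float(answer))
```

Exact rational arithmetic via `Fraction` avoids rounding, so the result can be compared directly with the fraction quoted above.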

Robert Israel
  • 470,583
  • I'm having trouble connecting the dots, here. I understand that I want to find $F(0,\ldots,0)$. I also understand that $F(x_1,\ldots,x_6) = 0$ if $\max(x_1, \ldots, x_6) = 3$. It's the "otherwise..." part that I'm not following. Where does the leading $1 +$ come from, in the big formula? And how does the $\max(...)$ get incorporated into the final calculation? – jwd Feb 24 '15 at 19:26
  • 1
    If you don't already have three of some number, you take another roll. The $1+$ counts that roll. This roll takes you to one of the six states $(x_1+1, \ldots, x_6)$ to $(x_1, \ldots, x_6 + 1)$, and then your remaining expected number of rolls is given by $F$ of that state. – Robert Israel Feb 24 '15 at 22:45
1

Let me get you started.

After 0 rolls, you have 0 of any number.

After 1 roll, you have 1 of one number, guaranteed.

After 2 rolls, there are two possibilities: $1/6$ of the time you have 2 of one number, and $5/6$ of the time you have 1 of each of two numbers.

After 3 rolls, you can end the game ($1/6 \cdot 1/6 = 1/36$), you can have 2 of one number and 1 of another ($1/6 \cdot 5/6 + 5/6 \cdot 2/6 = 15/36 = 5/12$), or you can have 1 each of three numbers ($5/6 \cdot 4/6 = 20/36 = 5/9$).

Proceed in this fashion and you will find how often it will end after each number of rolls.
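
Carrying this bookkeeping through all 13 possible rolls by hand is tedious, so here is a hedged Python sketch of the same idea. The names `step`, `history`, and `end_probs` are made up for illustration, and states are stored as sorted tallies on the assumption that only the pattern of counts matters.

```python
from collections import defaultdict
from fractions import Fraction

SIDES, TARGET = 6, 3

def step(dist):
    """Advance every live state by one roll.

    Returns (distribution over still-live states, probability the game ends on this roll).
    """
    new = defaultdict(Fraction)
    ended = Fraction(0)
    for state, p in dist.items():
        for i in range(SIDES):
            counts = list(state)
            counts[i] += 1
            if counts[i] == TARGET:
                ended += p / SIDES
            else:
                new[tuple(sorted(counts, reverse=True))] += p / SIDES
    return dict(new), ended

dist = {(0,) * SIDES: Fraction(1)}  # after 0 rolls: nothing has come up
history, end_probs = {}, {}
expected = Fraction(0)
rolls = 0
while dist:  # stops once no live state remains
    rolls += 1
    dist, ended = step(dist)
    history[rolls] = dist
    end_probs[rolls] = ended
    expected += rolls * ended

print(history[3])  # live states after 3 rolls
print(expected, float(expected))
```

Here `end_probs[k]` is the probability that the game ends on exactly roll `k`, and the expectation is the sum of `k * end_probs[k]`.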

Dan Uznanski
  • 11,488
1

Alternative Poissonization Solution

Note: I have no clue how @Did solved this problem using another Poissonization strategy, but I am very intrigued by its terseness.

Let $N$ be the number of rolls needed, and let $X_i$ be the number of times face $i$ has come up. The event $\{N > n\}$ is the same as the event $\{\max(X_i) \le 2\ | \; N=n \;\text{rolls} \}$ (that is, no count has reached 3 after $n$ rolls), the latter of which we denote $A_n$. So if $A_n$ occurs, the game is not yet over.

The equation for the expected value of $N$ is:

$E[N] = \sum^{12}_{n=0} P(N \gt n) = \sum^{12}_{n=0} P(A_n)$

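(Why 12? By the pigeonhole principle, if the game is not over after 12 rolls then every face has come up exactly twice, so the 13th roll must end it; hence $P(N \gt n) = 0$ for $n \ge 13$.)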

Here's the Poissonization step: if we assume $N \sim \text{Poi}(\lambda)$, then $X_i$ are independent $\text{Poi}(\frac{\lambda}{6})$. (See notes in [1] for all this)

Then $P(A_n)$ is $n!$ times the coefficient of $\lambda^n$ in the expansion of $e^{\lambda}\left[e^{-\lambda}\left(1 + \frac{\lambda}{6} + \frac{\lambda^2}{2!6^2} \right)^6\right]$

Using Wolfram Alpha, we can easily get the coefficients of that expansion, copied here into Python:

from fractions import Fraction

coefs = [
    Fraction(1), 
    Fraction(1), 
    Fraction(1,2),
    Fraction(35, 216),
    Fraction(65, 1728),
    Fraction(17, 2592),
    Fraction(41, 46656),
    Fraction(17,186624),
    Fraction(65, 8957952),
    Fraction(35, 80621568),
    Fraction(1, 53747712),
    Fraction(1, 1934917632),
    Fraction(1, 139314069504),
]

Finally we compute the sum above:

from math import factorial

# E[N] = sum over n of P(A_n), where P(A_n) = n! * (coefficient of lambda^n)
ev = 0
for i, coef in enumerate(coefs):
    ev += factorial(i) * coef

print(ev)
# 4084571/559872
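
The coefficients copied from Wolfram Alpha can also be reproduced directly, which guards against transcription slips. This is a minimal sketch: `poly_mul`, `base`, `poly`, and `computed_ev` are illustrative names, and the $e^{\lambda}e^{-\lambda}$ factors are assumed to cancel so that only the sixth power of the quadratic needs expanding.

```python
from fractions import Fraction
from math import factorial

def poly_mul(a, b):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# 1 + lambda/6 + lambda^2 / (2! * 6^2)
base = [Fraction(1), Fraction(1, 6), Fraction(1, 72)]

poly = [Fraction(1)]
for _ in range(6):  # raise the quadratic to the 6th power
    poly = poly_mul(poly, base)

# n! times the coefficient of lambda^n, summed over n = 0..12
computed_ev = sum(factorial(n) * c for n, c in enumerate(poly))
print(poly)
print(computed_ev)
```

The list `poly` should match `coefs` entry by entry.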

[1] Probability for Statistics and Machine Learning, by DasGupta

  • The reasoning at the beginning is quite unclear but anyway, the identity $E[N] = \sum\limits^{12}_{i=1} P(A_N \;|\; N)$ is incorrect, probably on several counts at the same time. Please explain carefully what you mean there. – Did May 25 '16 at 06:23
  • @Did I made some edits to the setup, what do you think? – Cam.Davidson.Pilon May 25 '16 at 12:21
  • That now the statement "The event $\{N > n\}$ is the same as the event $\{\max(X_i) \le 2\ | \; N=n \;\text{rolls} \}$" does not parse, and also that one cannot simultaneously assume that $N \sim \text{Poi}(\lambda)$ and define $N$ as the number of rolls needed. In the end, one really wonders what you call "Poissonization strategy" exactly and also, what is taken for granted and what is proven in your post. – Did May 25 '16 at 12:29
  • If I know that the number of rolls $N$ is greater than 5, then I know that none of my $X_i$ counts is three or more after 5 tosses, which is the same as saying that the max $X_i$ is less than or equal to 2 after 5 tosses. – Cam.Davidson.Pilon May 25 '16 at 12:32
  • Are you able to look at my [1]? The idea is also present in: http://www.stat.purdue.edu/~dasgupta/mult.pdf. – Cam.Davidson.Pilon May 25 '16 at 12:34
  • "If I know..." Yes, and this is how I explained the solution in my comment, but the trouble is that the sequence of characters you posted as an answer is quite far from being rigorous or even notationally coherent. For example, the chain of characters "$\{\max(X_i) \le 2\ | \; N=n \;\text{rolls} \}$" is referring to nothing I can fathom, and in any case not to an event. – Did May 25 '16 at 12:43