I am trying to solve Collecting Coupons (Problem 14) from the book Fifty Challenging Problems in Probability, which is given below:

Coupons in cereal boxes are numbered $1$ to $5$, and a set of one of each is required for a prize. With one coupon per box, how many boxes on the average are required to make a complete set?

I approached the problem in the following way. Let's try to calculate the probability that the set is completed exactly when the $n$th box is opened. This means that the previous $n-1$ boxes had one digit missing. For that last digit we have $5$ choices, and the probability of getting it as the coupon in the $n$th box is $1/5$. We also need to make sure that no box other than the last contains the last digit, so for each box other than the last, the probability of getting any digit other than the last is $4/5$. Hence the probability of needing exactly $n$ boxes to complete the set should be $$\left(\frac{4}{5}\right)^{n-1}\times\frac{1}{5}\times \binom 51.$$

Now I sum $n$ times this probability from $n = 6$ to $\infty$, with the additional term $5 \times \dfrac{5!}{5^5}$, where $\dfrac{5!}{5^5}$ is the probability of getting all $5$ digits in the first $5$ boxes. After computing the series and adding the last term, I get an expected value of $16.576$. The correct expected value is $11.42$. Where am I going wrong?

  • Can't follow your computation. To approach it this way, you would need to compute the probability that, after the $(n-1)^{st}$ draw, you had exactly $4$ of the coupons, and I don't see where you did that. – lulu Sep 29 '24 at 13:32
  • To be precise: since there are more ways to fail than just "missing exactly one coupon", the probability that you have exactly $4$ distinct coupons after $n-1$ trials is strictly less than $\left(\frac 45\right)^{n-1}$, so your computation badly overestimates the desired result. – lulu Sep 29 '24 at 13:38
  • For future reference, always check that probabilities sum to $1$. Here, for instance, you are claiming that, for $n\ge 6$, the probability that it takes exactly $n$ trials to get all $5$ distinct coupons is $p_n=\left(\frac 45\right)^{n-1}$. But even ignoring the issue with $n=5$, which for some reason you treat separately, we have $\sum_{n=6}^{\infty}p_n=\frac {1024}{625}>1$. – lulu Sep 29 '24 at 13:51
  • For related questions, see https://math.stackexchange.com/questions/28905/expected-time-to-roll-all-1-through-6-on-a-die and the questions linked to it – Henry Sep 29 '24 at 17:05

1 Answer

To summarize the discussion in the comments:

Conceptually, this approach is very hard to pull off. To do it, you would need to compute the probability, $p_n$, that it takes exactly $n$ trials to see all $5$ distinct coupons. Here, you simply declare that, at least for $n\ge 6$, we have $$p_n=\left(\frac 45\right)^{n-1}\times\frac 15\times\binom 51=\left( \frac 45\right)^{n-1}.$$

But that is impossible since, even if we ignore $p_5$, we have $$\sum_{n=6}^{\infty}p_n=\frac {1024}{625}>1$$
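
For reference, the total follows from the geometric series with ratio $\frac 45$, starting at the $n=6$ term: $$\sum_{n=6}^{\infty}\left(\frac 45\right)^{n-1}=\frac{(4/5)^5}{1-\frac 45}=5\cdot\frac{1024}{3125}=\frac{1024}{625}\approx 1.64.$$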

Indeed, your computation of $p_n$ incorrectly assumes that the only way to have failed after $n-1$ trials is to have seen exactly $4$ distinct coupons, but of course this is not correct.
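
As a numerical cross-check, the sketch below computes the exact $p_n$ by inclusion-exclusion over which coupons are still missing, and compares it with the formula from the question. It is an illustration only: the function names and the cutoff of 200 boxes used to truncate the infinite sum are assumptions made here, not part of the original discussion.

```python
from fractions import Fraction
from math import comb

K = 5  # number of distinct coupons in the problem


def prob_all_seen_by(n: int, k: int = K) -> Fraction:
    """P(all k coupons have appeared within the first n boxes).

    Inclusion-exclusion over the set of j coupons that are missing.
    """
    return sum(
        (-1) ** j * comb(k, j) * Fraction(k - j, k) ** n
        for j in range(k + 1)
    )


def exact_p(n: int, k: int = K) -> Fraction:
    """P(the set is completed exactly at box n)."""
    return prob_all_seen_by(n, k) - prob_all_seen_by(n - 1, k)


def claimed_p(n: int) -> Fraction:
    """The question's formula (4/5)^(n-1) * (1/5) * C(5,1), simplified."""
    return Fraction(4, 5) ** (n - 1)


def expected_boxes(cutoff: int = 200, k: int = K) -> float:
    """Expected number of boxes, truncating the sum at `cutoff` (assumed
    large enough that the remaining tail is negligible for k = 5)."""
    return float(sum(n * exact_p(n, k) for n in range(k, cutoff + 1)))


if __name__ == "__main__":
    for n in range(5, 11):
        print(n, float(exact_p(n)), float(claimed_p(n)))
    print("expected boxes:", expected_boxes())
```

With exact fractions, `exact_p(5)` equals $5!/5^5 = 24/625$, every `exact_p(n)` for $n\ge 6$ is well below the claimed $\left(\frac 45\right)^{n-1}$, and the truncated expectation agrees with $11.42$ to the quoted precision.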

lulu
  • Thanks @lulu for the prompt answer. I now see where I was going wrong. In my computation, $p_{n}$ also encapsulates scenarios where the total number of distinct coupons by the end of the $n$th trial can be 1 as well, hence overestimating the probability. Is there an elegant approach to calculate the correct value of $p_{n}$? – basilisk608 Sep 29 '24 at 18:25
  • Not really. That's why people usually take other routes to get to the expected value – lulu Sep 29 '24 at 21:18
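
One such route, sketched here as the standard argument rather than something spelled out in this thread, is to split the wait into stages. The symbol $T$ (the total number of boxes) is notation introduced for this sketch. Once $k$ distinct coupons are in hand, each new box yields an unseen coupon with probability $\frac{5-k}{5}$, so the wait for the next new coupon is geometric with mean $\frac{5}{5-k}$. By linearity of expectation, $$E[T]=\sum_{k=0}^{4}\frac{5}{5-k}=5\left(1+\frac 12+\frac 13+\frac 14+\frac 15\right)=\frac{137}{12}\approx 11.42,$$ which matches the expected value quoted in the question without ever computing $p_n$.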