Expected samples to observe all unique observations

Question

Suppose I have a bag with $6$ red balls, $4$ blue balls, and $1$ black ball.

If I sample one ball from the bag at a time, with replacement, what is the expected number of samples required before I observe at least one red, one blue, and one black ball? A similar scenario was described here: Expected time to roll all 1 through 6 on a die. However, that scenario assumes each side of the die is equally likely to be sampled, which is not the case here.

But the same techniques apply. Describe a state by the list of colors you have seen, so you start from the state $(0,0,0)$ and end in the state $(1,1,1)$. Then work out the expectation from each state (so $E[(1,1,0)]=11$, for example). — lulu, Apr 13 '18 at 18:58
Okay, I think I understand now. To clarify, the expected samples for one of each ball is just the summation of the expected samples for sampling the individual balls? — taylor, Apr 13 '18 at 19:08
No. You do it recursively (backwards induction). You can read off $E[(1,0,1)]$ and $E[(0,1,1)]$ say. Now use that information to compute $E[(1,0,0)]$ and so on. — lulu, Apr 13 '18 at 19:22
The dice case is simpler since the only thing you need to know about a state is how many distinct values you have seen. Here, you need to keep track of exactly which values you have seen, not just the number of them. — lulu, Apr 13 '18 at 19:23
@lulu: (+1) Your approach seems best. A friendly clarification of your notation may be to define $T_{i,j,k}$ as the random time to end, given we start in state $(i,j,k)$, and then compute $E[T_{i,j,k}]$, for which $E[T_{1,1,0}]=1/P[\text{black}]=11$ indeed. (And $E[T_{1,1,1}]=0$ since state $(1,1,1)$ is the end.) — Michael, Apr 13 '18 at 19:23
I mention this because, when I see $E[(1,1,0)]$, it looks like the expectation of a constant vector, which is then $(1,1,0)$. That may be written "110" and/or confused with the "11" that happens to be the correct answer under your intended interpretation. — Michael, Apr 13 '18 at 19:32
@lulu I guess it is not clear to me how to apply backward induction to determine the overall expected values based on your example (I do not have a strong background in this if you could not tell). Do you have a reference source you can refer me to? — taylor, Apr 13 '18 at 20:01

score 0 · Answer 1 · answered Apr 13 '18 at 20:22

For this situation, the possible states may be described as $S_{a,b,c}$ according to which colors have been seen. Here, the triple $(a,b,c)$ considers the colors (red, blue, black) and assigns a $1$ if you have seen the color and a $0$ otherwise. Thus $S_{1,0,1}$ refers to the state in which you have seen red and black but not blue. Of course you start in $S_{0,0,0}$ and end in $S_{1,1,1}$.

Warning: What follows should be substantially correct but the arithmetic is messy and error prone, so it should be checked carefully.

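We will proceed by backwards induction. Here $E\left[S_{a,b,c}\right]$ denotes the expected number of additional samples needed to reach $S_{1,1,1}$ when starting from $S_{a,b,c}$. Clearly $E\left[S_{1,1,1}\right]=0$.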

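When only one color is missing, each draw finds it with some fixed probability $p$, so the wait is geometric with mean $\frac 1p$. It is then easy to see that $$E\left[S_{1,1,0}\right]=11\quad E\left[S_{1,0,1}\right]=\frac {11}4\quad E\left[S_{0,1,1}\right]=\frac {11}6$$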

Let's compute $E\left[S_{1,0,0}\right]$:

From state $S_{1,0,0}$ we see that we stay in that state with probability $\frac 6{11}$, we move to $S_{1,1,0}$ with probability $\frac 4{11}$ and we move to $S_{1,0,1}$ with probability $\frac 1{11}$. It follows that $$E\left[S_{1,0,0}\right]=1+\frac 6{11}\times E\left[S_{1,0,0}\right]+\frac 4{11}\times E\left[S_{1,1,0}\right]+\frac 1{11}\times E\left[S_{1,0,1}\right]$$ $$\implies E\left[S_{1,0,0}\right]=\frac {231}{20}$$
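To spell out the algebra behind that last step (a worked sketch that uses only the values $E\left[S_{1,1,0}\right]=11$ and $E\left[S_{1,0,1}\right]=\frac {11}4$ stated above): move the $\frac 6{11}\times E\left[S_{1,0,0}\right]$ term to the left-hand side and substitute, which gives $$\frac 5{11}\times E\left[S_{1,0,0}\right]=1+\frac 4{11}\times 11+\frac 1{11}\times \frac {11}4=1+4+\frac 14=\frac {21}4$$ $$\implies E\left[S_{1,0,0}\right]=\frac {11}5\times \frac {21}4=\frac {231}{20}$$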

Similarly: $$E\left[S_{0,1,0}\right]=1 +\frac 6{11}\times E\left[S_{1,1,0}\right]+\frac 4{11}\times E\left[S_{0,1,0}\right]+\frac 1{11}\times E\left[S_{0,1,1}\right]$$ $$\implies E\left[S_{0,1,0}\right]=\frac {473}{42}$$

And: $$E\left[S_{0,0,1}\right]=1+\frac 6{11}\times E\left[S_{1,0,1}\right]+\frac 4{11}\times E\left[S_{0,1,1}\right]+\frac 1{11}\times E\left[S_{0,0,1}\right]$$ $$\implies E\left[S_{0,0,1}\right]=\frac {209}{60}$$

Finally we get the desired answer by noting that $$E=E\left[S_{0,0,0}\right]=1+\frac 6{11}\times E\left[S_{1,0,0}\right]+\frac 4{11}\times E\left[S_{0,1,0}\right]+\frac 1{11}\times E\left[S_{0,0,1}\right]=\frac {4919}{420}\approx \boxed {11.7119}$$
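Since the arithmetic is easy to get wrong by hand, the recursion can also be run with exact fractions. The following is a minimal sketch, not part of the original answer; the function name `expected_samples` and the tuple encoding of states are assumptions made for illustration. It processes states from most colors seen to fewest, so every successor state is already known when it is needed.

```python
from fractions import Fraction
from itertools import product


def expected_samples(counts):
    """Map each state (tuple of 0/1 flags, one per color) to the expected
    number of further draws, with replacement, until all colors are seen."""
    total = sum(counts)
    probs = [Fraction(c, total) for c in counts]
    n = len(counts)
    full = (1,) * n
    expect = {full: Fraction(0)}  # nothing left to wait for

    # Backward induction: handle states with more colors seen first.
    states = sorted(product((0, 1), repeat=n), key=sum, reverse=True)
    for state in states:
        if state == full:
            continue
        # Probability of drawing an already-seen color (staying put).
        stay = sum(p for p, seen in zip(probs, state) if seen)
        acc = Fraction(1)  # the draw we are about to make
        for i, p in enumerate(probs):
            if not state[i]:
                nxt = state[:i] + (1,) + state[i + 1:]
                acc += p * expect[nxt]
        # Solve E = acc + stay * E for E.
        expect[state] = acc / (1 - stay)
    return expect


table = expected_samples([6, 4, 1])  # red, blue, black
print(table[(0, 0, 0)])         # 4919/420
print(float(table[(0, 0, 0)]))  # roughly 11.7119
```

The intermediate entries `table[(1, 0, 0)]`, `table[(0, 1, 0)]` and `table[(0, 0, 1)]` can be compared against the hand-computed values in the same way.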

Thank you for taking the time to explain this! This appears straightforward and intuitive. — taylor, Apr 13 '18 at 20:59
As I say, check it. I did it quite quickly and could easily have made a blunder or two. — lulu, Apr 13 '18 at 21:06
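One independent way to check the figure $\frac {4919}{420}\approx 11.7119$ is simulation. The sketch below is an illustration only and was not posted in the thread; the function name `simulate_mean`, the trial count, and the fixed seed are assumptions. It draws from a list standing in for the bag until all colors have appeared, and averages the number of draws over many trials.

```python
import random


def simulate_mean(counts, trials=100_000, seed=0):
    """Estimate the mean number of draws (with replacement) needed to see
    every color at least once. `counts[i]` is the number of balls of color i."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    # Build the bag explicitly: color index i appears counts[i] times.
    bag = [color for color, n in enumerate(counts) for _ in range(n)]
    total_draws = 0
    for _ in range(trials):
        seen = set()
        while len(seen) < len(counts):
            seen.add(rng.choice(bag))
            total_draws += 1
    return total_draws / trials
```

Calling `simulate_mean([6, 4, 1])` should give an estimate close to the exact value, with the usual sampling noise of a Monte Carlo run.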

score 0 · Answer 2 · answered Apr 24 '18 at 14:31

I want to follow up with an additional solution, found on another page, that solves my problem. It provides a general functional form for the coupon collector problem.

An explanation is provided here: Expected number of rolling a pair of dice to generate all possible sums
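The linked page is not reproduced here. One standard closed form for the coupon collector problem with unequal probabilities, which may or may not be the exact form that page uses, is the inclusion-exclusion sum over non-empty subsets $A$ of the colors: $E=\sum_{A}(-1)^{|A|+1}\frac 1{\sum_{i\in A}p_i}$. A minimal sketch follows, with the function name `expected_samples_closed_form` chosen here for illustration.

```python
from fractions import Fraction
from itertools import combinations


def expected_samples_closed_form(counts):
    """Inclusion-exclusion formula for the expected number of draws (with
    replacement) until every color has been seen at least once."""
    total = sum(counts)
    probs = [Fraction(c, total) for c in counts]
    result = Fraction(0)
    for size in range(1, len(probs) + 1):
        sign = 1 if size % 2 == 1 else -1  # (-1)^(|A|+1)
        for subset in combinations(probs, size):
            # 1 / (sum of p_i over A) is the mean wait for any color in A.
            result += sign / sum(subset)
    return result


print(expected_samples_closed_form([6, 4, 1]))  # 4919/420
```

For the bag in the question this agrees with the value $\frac {4919}{420}$ from the first answer, and for six equally likely outcomes it gives the familiar fair-die result of $14.7$.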
