
You've got a discrete uniform distribution - what is the expected number of trials until each point is hit at least once?

I started by thinking of a Geometric distribution for each individual point: if the probability of success is 1/100 under the uniform distribution, would the expected number of trials until the first success be 100? That is for one point, so for 100 points would it be 100 × 100 = 10000 trials until all of them are hit? That doesn't sound right, because if you fail to hit one specific point, you've succeeded in hitting a different point, so it's not exactly a matter of first success at one point. What am I missing?

Follow up question: how about after 100 trials, what percentage of points are expected to have been hit?

Newtype
  • Please see Wikipedia, Coupon Collector's Problem. – André Nicolas May 06 '16 at 06:10
  • Yes, this is the classical Coupon Collector's Problem, and yes, you are right to think of the geometric distribution. The important point is that the total "waiting time" can be expressed as $T=\sum_{i=1}^{n}T_i$, where $T_i$ is the time needed to hit the $i$th new point once $i-1$ distinct points have already been hit; $T_i$ follows a geometric distribution with (changing!) parameter $p_i=(n-i+1)/n$ ($p_i$ = probability of success). The rest is well explained in the Wikipedia article. – Jean Marie May 06 '16 at 06:29

1 Answer


Sounds like the Coupon Collector's Problem to me. Think of it this way: if you've already hit a certain number of distinct points (let's call it $n$ out of 100), the probability that the next trial hits a new point is $\displaystyle\frac{100-n}{100}$. Hopefully you know that the expected number of trials until success (i.e. the expectation of a geometric distribution) is $\displaystyle\frac{1}{p}$. So here's the neat trick: the expected number of trials to hit all the points is the expected number of trials to hit the first point + the expected number of additional trials to hit a second, new point + ... And since each of these expectations is $\frac{1}{p}$ with its own $p$, the overall expectation is the sum of the reciprocals of the probabilities of getting a new point, $\displaystyle\sum_{n=0}^{N-1}\frac{N}{N-n}=N\sum_{k=1}^{N}\frac{1}{k}$, i.e. $N$ (the total number of points) times the harmonic sum from $1$ to $N$. For $N=100$ that is about 519 trials, far fewer than 10000.
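As a quick sanity check, here is a minimal Python sketch of the sum described above. The function names (`expected_trials`, `harmonic_form`, `expected_fraction_hit`, `simulate_trials`) are made up for illustration. `expected_fraction_hit` goes a step beyond what is derived above: it assumes independent uniform draws and uses linearity of expectation, so each point is missed in all $m$ trials with probability $(1-1/N)^m$. Treat that part as an assumption-laden aside on the follow-up question rather than something established in this thread.

```python
import random
from fractions import Fraction


def expected_trials(n: int) -> float:
    """Sum 1/p over the stages, where p = (n - k)/n after k distinct points are hit."""
    return float(sum(Fraction(n, n - k) for k in range(n)))


def harmonic_form(n: int) -> float:
    """Same quantity written as n times the harmonic sum 1 + 1/2 + ... + 1/n."""
    return float(n * sum(Fraction(1, k) for k in range(1, n + 1)))


def expected_fraction_hit(n: int, trials: int) -> float:
    """Assumed formula: each point is missed in every trial with probability (1 - 1/n)**trials."""
    return 1.0 - (1.0 - 1.0 / n) ** trials


def simulate_trials(n: int, rng: random.Random) -> int:
    """Draw uniformly from n points until every point has been seen; return the draw count."""
    seen = set()
    draws = 0
    while len(seen) < n:
        seen.add(rng.randrange(n))
        draws += 1
    return draws


if __name__ == "__main__":
    n = 100
    print("sum of 1/p:      ", expected_trials(n))
    print("n * harmonic sum:", harmonic_form(n))
    print("fraction hit after 100 trials:", expected_fraction_hit(n, 100))
    rng = random.Random(0)  # fixed seed only so that reruns are repeatable
    runs = [simulate_trials(n, rng) for _ in range(2000)]
    print("simulated mean:  ", sum(runs) / len(runs))
```

For $N=100$ the two exact forms agree at roughly 518.7 trials, and under the stated assumption about 63.4% of the points are expected to have been hit after 100 trials. The simulated mean should land near the exact value but will vary with the seed and the number of runs.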

kcborys