Consider the following problem: every second we receive a number drawn uniformly at random from the set $ A = \{1, \ldots, n\} $. We stop once we have received all $ n $ numbers at least once. We want to know the expected time until that happens.
This can be modeled with the following algorithm:
ALGO(n)
1: cnt = 0
2: S = ∅
3: while |S| < n
4:     j = RANDOM(1, n)
5:     S = S ∪ {j}
6:     cnt = cnt + 1
7: return cnt
Here RANDOM(1, n) returns an integer between 1 and $ n $ inclusive, chosen uniformly at random. Let's define the following random variables and state the following lemma:
- $ X $: Number of executions of line 4.
- $ X_i $: Number of executions of line 4 while $ |S| = i $, for $ i = 0, \ldots, n - 1 $.
Lemma: Given an experiment with success probability $ p > 0 $, if we repeat the experiment independently, the expected number of attempts until the first success is $ 1/p $.
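The pseudocode above can also be run directly to get an empirical feel for $ X $. The following is a minimal Python sketch, not part of the original problem statement: the names `algo` and `estimate_expected_draws`, the number of trials, and the seed are illustrative choices, and Python's `randint` stands in for RANDOM(1, n).

```python
import random


def algo(n, rng=random):
    """Direct translation of ALGO(n): count draws until all n values are seen."""
    cnt = 0
    seen = set()                  # plays the role of S
    while len(seen) < n:
        j = rng.randint(1, n)     # RANDOM(1, n): uniform on 1..n inclusive
        seen.add(j)               # S = S union {j}
        cnt += 1
    return cnt


def estimate_expected_draws(n, trials=20_000, seed=0):
    """Monte Carlo estimate of E[X]: average cnt over independent runs."""
    rng = random.Random(seed)     # seeded generator so runs are repeatable
    return sum(algo(n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    for n in (5, 10, 50):
        print(n, estimate_expected_draws(n))
```

Averaging many runs gives an estimate of the expected value of $ X $ that any closed-form candidate can be compared against.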
How can we determine the expected value of $ X $, the total number of executions of line 4 until all $ n $ numbers have been received at least once?
The answer I got was $ n \ln(n) $, but I'm not sure if that's correct.
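Here is a sketch of how the lemma might be applied to check this, assuming successive calls to RANDOM are independent (the lemma needs this). The symbols $ p_i $ and $ H_n $ are introduced here and are not part of the original statement. While $ |S| = i $, a call to RANDOM returns a number not yet in $ S $ with probability $ p_i = (n - i)/n $, so the lemma gives $ E[X_i] = n/(n - i) $. Since $ X = \sum_{i=0}^{n-1} X_i $, linearity of expectation gives

$$
E[X] = \sum_{i=0}^{n-1} E[X_i] = \sum_{i=0}^{n-1} \frac{n}{n - i} = n \sum_{k=1}^{n} \frac{1}{k} = n H_n ,
$$

where $ H_n $ is the $ n $-th harmonic number. Because $ H_n = \ln n + \gamma + O(1/n) $ with $ \gamma \approx 0.5772 $, this would make $ n \ln(n) $ the leading term of $ E[X] $ rather than its exact value. For $ n = 10 $, for instance, $ n H_n \approx 29.29 $ while $ n \ln(n) \approx 23.03 $.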