
My solution is that, since the expected value of the discrete uniform distribution is $\frac{n+1}{2}$, the probability that the outcome of the second of any two trials is the smaller one is $\frac{1}{2}$. Then the probability that the outcome of the $(n+1)^\text{th}$ trial is less than each of the outcomes of all the previous $n$ trials is $\left(\frac{1}{2}\right)^n$.
But simulations of the experiment (in Python) suggest that the correct answer is different. I have simulated the experiment for $n = 2$ to $100$, $10000$ times each, and summarized the resulting probabilities for various $n$ in this graph (with $n$ on the x-axis and $p$ on the y-axis). The closest formula I could find for these results is $\frac{1}{2n}$, whose graph looks like this.
This is the Python code that I ran the simulations with:

import numpy as np

def experiment(trials, n):
    count_min = 0

    for _ in range(trials):
        # Generating n random observations from a discrete uniform distribution on 1 to n
        observations = np.random.randint(1, n+1, size=n)

        # Finding the minimum value among the observations
        min_observation = np.min(observations)

        # Generating the next observation from the same discrete uniform distribution
        next_observation = np.random.randint(1, n+1)

        # Checking if the next observation is less than the minimum of the previous ones
        if next_observation < min_observation:
            count_min += 1

    probability = count_min / trials
    return probability

# Setting the number of trials
trials = 10000
results = []
for n in range(3, 100):
    results.append(experiment(trials, n))
results = np.array(results)
print(results)

But how do you arrive at the right answer?

Ricky

1 Answer


We compute \begin{align} P \left[\min \limits_{1 \leq i \leq n} X_i > X_{n+1} \right] &= E[ \, P[ \min \limits_{1 \leq i \leq n} X_i > X_{n+1} \mid X_{n+1} ] ] = E_{\omega} [ P [ \min \limits_{1 \leq i \leq n} X_i > X_{n+1}(\omega) ]] \\ &= \sum \limits_{k=1}^n P[X_{n+1}=k] \, P[ \min \limits_{1 \leq i \leq n} X_i > k] \\ &= \sum \limits_{k=1}^{n} \frac{1}{n} \, P[ \forall 1 \leq i \leq n: X_i > k] \\ &= \frac{1}{n} \sum \limits_{k=1}^{n-1} \, \left( \frac{n-k}{n} \right)^n = \frac{1}{n} \sum \limits_{k=1}^{n-1} \, \left( \frac{k}{n} \right)^n \end{align} Here, the second and fifth equalities follow by independence. In the fifth step the upper limit drops to $n-1$ because the $k=n$ term vanishes ($P[X_i > n] = 0$), and the final equality just reverses the order of summation ($k \mapsto n-k$).
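For readers who want to sanity-check the exact expression, here is a small, hedged sketch (the function names are mine, not from the answer) that evaluates the sum directly and compares it with a fresh simulation:

import numpy as np

def exact_probability(n):
    # P[min(X_1, ..., X_n) > X_{n+1}] = (1/n) * sum_{k=1}^{n-1} (k/n)^n
    k = np.arange(1, n)
    return np.sum((k / n) ** n) / n

def simulated_probability(n, trials=100_000, seed=0):
    rng = np.random.default_rng(seed)
    # Each row holds n observations plus one extra, all uniform on {1, ..., n}
    draws = rng.integers(1, n + 1, size=(trials, n + 1))
    # Fraction of rows where the extra draw is strictly smaller than all previous ones
    return np.mean(draws[:, -1] < draws[:, :-1].min(axis=1))

for n in (5, 10, 50, 100):
    print(n, exact_probability(n), simulated_probability(n))

For growing $n$, both columns should settle near $0.582/n$, in line with the asymptotics below.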

Now this term does not have a closed form (see for example Faulhaber's formula), but it does admit an asymptotic expansion as $n \to \infty$: \begin{align} P \left[\min \limits_{1 \leq i \leq n} X_i > X_{n+1} \right] &= \frac{1}{n} \sum \limits_{k=1}^{n-1} \, \left( \frac{k}{n} \right)^n = \frac{1}{n} \left( \frac{1}{e-1}-\frac{1}{2n}\frac{e(e+1)}{(e-1)^3}+O\left(\frac{1}{n^2}\right) \right) \end{align}

For a numerical check, notice that indeed $\frac{1}{e-1} \approx 0.582$, so the leading-order approximation is roughly $\frac{0.582}{n}$ rather than the conjectured $\frac{1}{2n}$.
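As a rough check of how quickly the expansion becomes accurate, one can compare the exact sum with the two-term approximation (again just a sketch, with arbitrary test values of $n$):

import numpy as np

def exact_probability(n):
    # (1/n) * sum_{k=1}^{n-1} (k/n)^n
    k = np.arange(1, n)
    return np.sum((k / n) ** n) / n

e = np.e
for n in (10, 100, 1000):
    # Two-term expansion: (1/n) * (1/(e-1) - (1/(2n)) * e*(e+1)/(e-1)^3)
    approx = (1 / (e - 1) - e * (e + 1) / (2 * n * (e - 1) ** 3)) / n
    print(n, exact_probability(n), approx)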

Alex
  • I am still trying to understand your answer, but it definitely seems to be correct, as is evident from the RMSE of 0.0005637 between simulations with 50,000 trials for each $n$ and the predictions of the above answer. – Ricky Dec 13 '23 at 00:14
  • Does my formula, $(\frac{1}{2})^n$, mean anything? I mean, what would it be a solution for? – Ricky Dec 13 '23 at 07:53
  • @Ricky Is some part or notation unclear? I will gladly explain it.

    Also, how do you arrive at the $\left( \frac{1}{2} \right)^n$? Do you reason by the argument that for every $i$, $X_{n+1}$ has roughly a chance of $\frac{1}{2}$ of being smaller than it, and then multiply over the $n$ trials?

    This might seem reasonable, but it would mean that for every $i$, you had to draw a new $X_{n+1}$ which wins against $X_i$. This would be considerably more unlikely (as you said, $\left( \frac{1}{2} \right)^n$) than drawing one $X_{n+1}$ which has to be very big, but just once ...(1)

    – Alex Dec 13 '23 at 08:24
  • @Ricky (2) ...Think of what would happen if $X_1, \dots, X_{n+1}$ were drawn from $\{1, \dots, n+1\}$ with the restriction of being all different. Then $X_{n+1}$ would have a $\frac{1}{n+1}$ chance of being the largest of the $n+1$ values. – Alex Dec 13 '23 at 08:26
  • (3) Or as a different example: if $X_1, \dots, X_{n+1}$ were continuously uniformly distributed (it makes the computation easier) on the interval $[0,1]$, we can calculate, for $x \in [0,1]$, $$ P[\max_{1 \leq i \leq n} X_i \leq x] = P[\forall 1 \leq i \leq n: X_i \leq x] = x^n,$$ so the density is $n x^{n-1} 1_{[0,1]}(x)$ and so $$E[ \max_{1 \leq i \leq n} X_i ] = \int_0^1 x \, nx^{n-1} \, dx = \frac{n}{n+1}.$$ Thus, $X_{n+1}$ has to be roughly $\geq \frac{n}{n+1}$, which gives a chance of $\frac{1}{n+1}$ (a small numerical check of this is sketched after the comments). – Alex Dec 13 '23 at 08:36
  • would you please answer this question of mine? – Ricky Dec 15 '23 at 06:06
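A small Monte Carlo sketch of the order-statistics intuition in the last two comments, with arbitrary choices of $n$ and sample size (the variable names are illustrative): it estimates the probability that the last of $n+1$ i.i.d. uniform draws is the largest, and the mean of the maximum of the first $n$, against the predicted $\frac{1}{n+1}$ and $\frac{n}{n+1}$.

import numpy as np

rng = np.random.default_rng(1)
n, trials = 10, 200_000

# n + 1 i.i.d. Uniform(0, 1) draws per trial
u = rng.random(size=(trials, n + 1))

# How often the last draw is the largest of all n + 1; symmetry predicts 1/(n+1)
p_last_is_max = np.mean(u[:, -1] > u[:, :-1].max(axis=1))

# Average of max(X_1, ..., X_n); the comment's calculation predicts n/(n+1)
mean_max = u[:, :n].max(axis=1).mean()

print(p_last_is_max, 1 / (n + 1))
print(mean_max, n / (n + 1))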